A Coding Guide to High-Quality Image Generation, Control, and Editing Using HuggingFace Diffusers

February 21, 2026
in AI & Technology
Reading Time: 5 mins read

In this tutorial, we design a practical image-generation workflow using the Diffusers library. We start by stabilizing the environment, then generate high-quality images from text prompts using Stable Diffusion with an optimized scheduler. We accelerate inference with a LoRA-based latent-consistency approach, guide composition with ControlNet under edge conditioning, and finally perform localized edits via inpainting. Throughout, we focus on real-world techniques that balance image quality, speed, and controllability.

!pip -q uninstall -y pillow Pillow || true
!pip -q install --upgrade --force-reinstall "pillow<12.0"
!pip -q install --upgrade diffusers transformers accelerate safetensors huggingface_hub opencv-python


import os, math, random
import torch
import numpy as np
import cv2
from PIL import Image, ImageDraw, ImageFilter
from diffusers import (
   StableDiffusionPipeline,
   StableDiffusionInpaintPipeline,
   ControlNetModel,
   StableDiffusionControlNetPipeline,
   UniPCMultistepScheduler,
)

We prepare a clean and compatible runtime by resolving dependency conflicts and installing all required libraries. We ensure image processing works reliably by pinning the correct Pillow version and loading the Diffusers ecosystem. We also import all core modules needed for generation, control, and inpainting workflows.


def seed_everything(seed=42):
   random.seed(seed)
   np.random.seed(seed)
   torch.manual_seed(seed)
   torch.cuda.manual_seed_all(seed)


def to_grid(images, cols=2, bg=255):
   if isinstance(images, Image.Image):
       images = [images]
   w, h = images[0].size
   rows = math.ceil(len(images) / cols)
   grid = Image.new("RGB", (cols*w, rows*h), (bg, bg, bg))
   for i, im in enumerate(images):
       grid.paste(im, ((i % cols)*w, (i // cols)*h))
   return grid


device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32
print("device:", device, "| dtype:", dtype)

We define utility functions to ensure reproducibility and to organize visual outputs efficiently. We set global random seeds so our generations remain consistent across runs. We also detect the available hardware and configure precision to optimize performance on the GPU or CPU.
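To make the placement arithmetic inside `to_grid` concrete, the pure-Python sketch below (no PIL needed) computes the canvas size and per-image paste coordinates for a hypothetical batch of five 768x512 images in two columns, mirroring the `i % cols` / `i // cols` index math above:

```python
import math

def grid_positions(n_images, cols, w, h):
    """Return (canvas_size, paste_coords) for a left-to-right,
    top-to-bottom image grid, matching to_grid's index math."""
    rows = math.ceil(n_images / cols)
    canvas = (cols * w, rows * h)
    coords = [((i % cols) * w, (i // cols) * h) for i in range(n_images)]
    return canvas, coords

canvas, coords = grid_positions(5, cols=2, w=768, h=512)
print(canvas)     # 2 columns x 3 rows: (1536, 1536)
print(coords[4])  # fifth image starts row 2, column 0: (0, 1024)
```

The last row is simply left partially empty, which is why `to_grid` fills the canvas with a background color first.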

seed_everything(7)
BASE_MODEL = "runwayml/stable-diffusion-v1-5"


pipe = StableDiffusionPipeline.from_pretrained(
   BASE_MODEL,
   torch_dtype=dtype,
   safety_checker=None,
).to(device)


pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)


if device == "cuda":
   pipe.enable_attention_slicing()
   pipe.enable_vae_slicing()


prompt = "a cinematic photo of a futuristic street market at dusk, ultra-detailed, 35mm, volumetric lighting"
negative_prompt = "blurry, low quality, deformed, watermark, text"


img_text = pipe(
   prompt=prompt,
   negative_prompt=negative_prompt,
   num_inference_steps=25,
   guidance_scale=6.5,
   width=768,
   height=512,
).images[0]

We initialize the base Stable Diffusion pipeline and switch to a more efficient UniPC scheduler. We generate a high-quality image directly from a text prompt using carefully chosen guidance and resolution settings. This establishes a strong baseline for subsequent improvements in speed and control.
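The `guidance_scale=6.5` setting controls classifier-free guidance: at each denoising step the pipeline combines an unconditional and a prompt-conditioned noise prediction, roughly as ε = ε_uncond + g·(ε_cond − ε_uncond). The toy values below are only an illustrative sketch of that combination (real predictions are latent tensors, not short lists):

```python
def cfg(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: extrapolate away from the unconditional
    prediction, toward the prompt-conditioned one, by `scale`."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]

# Toy 3-element "noise predictions" for illustration only.
uncond = [0.10, 0.20, 0.30]
cond = [0.20, 0.10, 0.30]
guided = cfg(uncond, cond, scale=6.5)
print(guided)
```

With `scale=1.0` the conditioned prediction is returned unchanged; larger values follow the prompt more aggressively at the cost of diversity and, eventually, artifacts.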

LCM_LORA = "latent-consistency/lcm-lora-sdv1-5"
pipe.load_lora_weights(LCM_LORA)


try:
   pipe.fuse_lora()
   lora_fused = True
except Exception as e:
   lora_fused = False
   print("LoRA fuse skipped:", e)


fast_prompt = "a clean product photo of a minimal smartwatch on a reflective surface, studio lighting"
fast_images = []
for steps in [4, 6, 8]:
   fast_images.append(
       pipe(
           prompt=fast_prompt,
           negative_prompt=negative_prompt,
           num_inference_steps=steps,
           guidance_scale=1.5,
           width=768,
           height=512,
       ).images[0]
   )


grid_fast = to_grid(fast_images, cols=3)
print("LoRA fused:", lora_fused)


W, H = 768, 512
layout = Image.new("RGB", (W, H), "white")
draw = ImageDraw.Draw(layout)
draw.rectangle([40, 80, 340, 460], outline="black", width=6)
draw.ellipse([430, 110, 720, 400], outline="black", width=6)
draw.line([0, 420, W, 420], fill="black", width=5)


edges = cv2.Canny(np.array(layout), 80, 160)
edges = np.stack([edges]*3, axis=-1)
canny_image = Image.fromarray(edges)


CONTROLNET = "lllyasviel/sd-controlnet-canny"
controlnet = ControlNetModel.from_pretrained(
   CONTROLNET,
   torch_dtype=dtype,
).to(device)


cn_pipe = StableDiffusionControlNetPipeline.from_pretrained(
   BASE_MODEL,
   controlnet=controlnet,
   torch_dtype=dtype,
   safety_checker=None,
).to(device)


cn_pipe.scheduler = UniPCMultistepScheduler.from_config(cn_pipe.scheduler.config)


if device == "cuda":
   cn_pipe.enable_attention_slicing()
   cn_pipe.enable_vae_slicing()


cn_prompt = "a modern cafe interior, architectural render, soft daylight, high detail"
img_controlnet = cn_pipe(
   prompt=cn_prompt,
   negative_prompt=negative_prompt,
   image=canny_image,
   num_inference_steps=25,
   guidance_scale=6.5,
   controlnet_conditioning_scale=1.0,
).images[0]

We accelerate inference by loading and fusing a LoRA adapter and demonstrate fast sampling with very few diffusion steps. We then construct a structural conditioning image and apply ControlNet to guide the layout of the generated scene. This allows us to preserve composition while still benefiting from creative text guidance.
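`fuse_lora()` merges the adapter into the base weights so no extra matrix multiply is paid at inference time. Conceptually, each adapted layer's weight becomes W' = W + α·(B·A), where A and B are the low-rank LoRA factors. The sketch below illustrates that merge with tiny hypothetical matrices (nested lists, not real model weights):

```python
def matmul(A, B):
    """Naive matrix multiply for small nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def fuse_lora_weight(W, B, A, alpha=1.0):
    """Return W + alpha * (B @ A): the fused weight after merging a
    LoRA adapter, in the spirit of pipe.fuse_lora()."""
    delta = matmul(B, A)
    return [[w + alpha * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

# 2x2 base weight with a rank-1 adapter: B is 2x1, A is 1x2.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0], [2.0]]
A = [[0.5, 0.5]]
print(fuse_lora_weight(W, B, A, alpha=1.0))  # [[1.5, 0.5], [1.0, 2.0]]
```

Because the rank of B·A is far smaller than the layer dimension, the adapter stays cheap to train and ship while the fused weight runs at full base-model speed.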

mask = Image.new("L", img_controlnet.size, 0)
mask_draw = ImageDraw.Draw(mask)
mask_draw.rectangle([60, 90, 320, 170], fill=255)
mask = mask.filter(ImageFilter.GaussianBlur(2))


inpaint_pipe = StableDiffusionInpaintPipeline.from_pretrained(
   BASE_MODEL,
   torch_dtype=dtype,
   safety_checker=None,
).to(device)


inpaint_pipe.scheduler = UniPCMultistepScheduler.from_config(inpaint_pipe.scheduler.config)


if device == "cuda":
   inpaint_pipe.enable_attention_slicing()
   inpaint_pipe.enable_vae_slicing()


inpaint_prompt = "a glowing neon sign that says 'CAFÉ', cyberpunk style, realistic lighting"


img_inpaint = inpaint_pipe(
   prompt=inpaint_prompt,
   negative_prompt=negative_prompt,
   image=img_controlnet,
   mask_image=mask,
   num_inference_steps=30,
   guidance_scale=7.0,
).images[0]


os.makedirs("outputs", exist_ok=True)
img_text.save("outputs/text2img.png")
grid_fast.save("outputs/lora_fast_grid.png")
layout.save("outputs/layout.png")
canny_image.save("outputs/canny.png")
img_controlnet.save("outputs/controlnet.png")
mask.save("outputs/mask.png")
img_inpaint.save("outputs/inpaint.png")


print("Saved outputs:", sorted(os.listdir("outputs")))
print("Done.")

We create a mask to isolate a specific region and apply inpainting to modify only that part of the image. We refine the selected area using a targeted prompt while keeping the rest intact. Finally, we save all intermediate and final outputs to disk for inspection and reuse.
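The Gaussian blur on the mask matters because inpainting composites per pixel: out = m·generated + (1 − m)·original, where m ∈ [0, 1], so a soft mask edge blends the new content into the scene instead of leaving a hard seam. A minimal per-pixel sketch with toy RGB values (not real image data):

```python
def blend_pixel(original, generated, m):
    """Per-pixel inpainting composite with soft mask value m in [0, 1]:
    m=0 keeps the original pixel, m=1 takes the generated one."""
    return tuple(round(m * g + (1.0 - m) * o)
                 for o, g in zip(original, generated))

# Inside the fully masked region the generated pixel wins...
print(blend_pixel((10, 20, 30), (200, 100, 50), m=1.0))  # (200, 100, 50)
# ...while on the blurred mask edge (m=0.5) the two are mixed.
print(blend_pixel((10, 20, 30), (200, 100, 50), m=0.5))  # (105, 60, 40)
```

This is why the rest of `img_controlnet` survives untouched: everywhere the mask is black (m=0), the composite simply returns the source pixel.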

In conclusion, we demonstrated how a single Diffusers pipeline can evolve into a flexible, production-ready image generation system. We explained how to move from pure text-to-image generation to fast sampling, structural control, and targeted image editing without changing frameworks or tooling. This tutorial highlights how we can combine schedulers, LoRA adapters, ControlNet, and inpainting to create controllable and efficient generative pipelines that are easy to extend for more advanced creative or applied use cases.


