Python 实现 AI 图像生成:调用 Stable Diffusion API 完整教程

Python 实现 AI 图像生成:调用 Stable Diffusion API 完整教程

从零开始学习使用 Python 调用 Stable Diffusion API 生成图像,涵盖本地部署、API 调用、ControlNet、图生图等进阶技巧。

1. 技术架构

Python 客户端

Stable Diffusion API

本地部署
SD WebUI / ComfyUI

云端 API
Replicate / Stability AI

Stable Diffusion 模型

文生图
txt2img

图生图
img2img

局部重绘
inpainting

超分辨率
upscale

输出图像

后处理管道

存储
本地/OSS

2. 图像生成方式对比

50%25%15%10%各生成方式使用占比统计文生图 (txt2img)图生图 (img2img)局部重绘 (inpainting)超分辨率 (upscale)

3. 环境准备

3.1 本地部署 Stable Diffusion WebUI

# 克隆 Stable Diffusion WebUIgit clone https://github.com/AUTOMATIC1111/stable-diffusion-webui.git cd stable-diffusion-webui # 启动(开启 API 模式) ./webui.sh --api--listen# Windows 用户 webui.bat --api--listen

3.2 安装依赖

pip install requests Pillow io base64 

4. 核心代码实现

4.1 SD API 客户端封装

# sd_client.pyimport requests import base64 import io import json import time from pathlib import Path from PIL import Image from dataclasses import dataclass, field from typing import Optional @dataclassclassGenerationConfig:"""图像生成配置""" prompt:str="" negative_prompt:str="low quality, blurry, deformed" width:int=512 height:int=512 steps:int=30 cfg_scale:float=7.0 sampler_name:str="DPM++ 2M Karras" seed:int=-1# -1 表示随机 batch_size:int=1 n_iter:int=1# 迭代次数 model: Optional[str]=NoneclassStableDiffusionClient:"""Stable Diffusion API 客户端"""def__init__(self, base_url:str="http://127.0.0.1:7860"): self.base_url = base_url self.api_url =f"{base_url}/sdapi/v1"def_save_base64_image(self, b64_str:str, output_path:str)->str:"""将 base64 图片保存到文件""" img_data = base64.b64decode(b64_str) img = Image.open(io.BytesIO(img_data)) img.save(output_path)return output_path # ---- 文生图 ----deftxt2img(self, config: GenerationConfig, output_dir:str="./output")->list[str]:"""文生图:从文本描述生成图像""" payload ={"prompt": config.prompt,"negative_prompt": config.negative_prompt,"width": config.width,"height": config.height,"steps": config.steps,"cfg_scale": config.cfg_scale,"sampler_name": config.sampler_name,"seed": config.seed,"batch_size": config.batch_size,"n_iter": config.n_iter,}if config.model: self._switch_model(config.model) response = requests.post(f"{self.api_url}/txt2img", json=payload) response.raise_for_status() data = response.json() Path(output_dir).mkdir(exist_ok=True) saved_paths =[]for i, img_b64 inenumerate(data["images"]): path =f"{output_dir}/txt2img_{int(time.time())}_{i}.png" self._save_base64_image(img_b64, path) saved_paths.append(path)print(f"已保存: {path}")return saved_paths # ---- 图生图 ----defimg2img(self, init_image_path:str, prompt:str, denoising_strength:float=0.75, config: GenerationConfig =None, output_dir:str="./output")->list[str]:"""图生图:基于参考图 + 提示词生成新图""" config = config or GenerationConfig()# 读取初始图片并转 base64withopen(init_image_path,"rb")as f: init_images =[base64.b64encode(f.read()).decode()] payload ={"init_images": init_images,"prompt": prompt,"negative_prompt": config.negative_prompt,"width": config.width,"height": config.height,"steps": config.steps,"cfg_scale": config.cfg_scale,"sampler_name": config.sampler_name,"denoising_strength": denoising_strength,"seed": config.seed,} response = requests.post(f"{self.api_url}/img2img", json=payload) response.raise_for_status() data = response.json() Path(output_dir).mkdir(exist_ok=True) saved_paths =[]for i, img_b64 inenumerate(data["images"]): path =f"{output_dir}/img2img_{int(time.time())}_{i}.png" self._save_base64_image(img_b64, path) saved_paths.append(path)print(f"已保存: {path}")return saved_paths # ---- 局部重绘 ----definpaint(self, init_image_path:str, mask_image_path:str, prompt:str, denoising_strength:float=0.85, output_dir:str="./output")->list[str]:"""局部重绘:只修改 mask 区域"""withopen(init_image_path,"rb")as f: init_images =[base64.b64encode(f.read()).decode()]withopen(mask_image_path,"rb")as f: mask = base64.b64encode(f.read()).decode() payload ={"init_images": init_images,"mask": mask,"prompt": prompt,"negative_prompt":"low quality, blurry","denoising_strength": denoising_strength,"inpainting_fill":1,# 0=fill, 1=original, 2=latent noise"inpaint_full_res":True,"steps":30,"cfg_scale":7.0,"sampler_name":"DPM++ 2M Karras","width":512,"height":512,} response = requests.post(f"{self.api_url}/img2img", json=payload) response.raise_for_status() data = response.json() Path(output_dir).mkdir(exist_ok=True) saved_paths =[]for i, img_b64 inenumerate(data["images"]): path =f"{output_dir}/inpaint_{int(time.time())}_{i}.png" self._save_base64_image(img_b64, path) saved_paths.append(path)return saved_paths # ---- 超分辨率 ----defupscale(self, image_path:str, scale:int=2, output_dir:str="./output")->str:"""使用 ESRGAN 进行超分辨率放大"""withopen(image_path,"rb")as f: img_b64 = base64.b64encode(f.read()).decode() payload ={"image": img_b64,"upscaler_1":"R-ESRGAN 4x+","upscaling_resize": scale,} response = requests.post(f"{self.api_url}/extra-single-image", json=payload) response.raise_for_status() data = response.json() Path(output_dir).mkdir(exist_ok=True) path =f"{output_dir}/upscaled_{int(time.time())}.png" self._save_base64_image(data["image"], path)print(f"超分辨率完成: {path}")return path # ---- 模型管理 ----def_switch_model(self, model_name:str):"""切换模型""" response = requests.post(f"{self.api_url}/options", json={"sd_model_checkpoint": model_name},) response.raise_for_status() time.sleep(3)# 等待模型加载deflist_models(self)->list[str]:"""列出可用模型""" response = requests.get(f"{self.api_url}/sd-models")return[m["title"]for m in response.json()]deflist_samplers(self)->list[str]:"""列出可用采样器""" response = requests.get(f"{self.api_url}/samplers")return[s["name"]for s in response.json()]

4.2 批量生成示例

# batch_generate.pyfrom sd_client import StableDiffusionClient, GenerationConfig defbatch_generate_portraits():"""批量生成人物肖像""" sd = StableDiffusionClient()# 查看可用模型和采样器print("可用模型:", sd.list_models()[:5])print("可用采样器:", sd.list_samplers())# 风格列表 styles =["cyberpunk neon city","watercolor painting","oil painting renaissance","anime style","photorealistic 8k",] base_prompt =("portrait of a young woman, detailed face, beautiful eyes, ""dramatic lighting, masterpiece, best quality")for style in styles: config = GenerationConfig( prompt=f"{base_prompt}, {style}", negative_prompt="lowres, bad anatomy, bad hands, text, error", width=512, height=768, steps=30, cfg_scale=7.5,) paths = sd.txt2img(config, output_dir=f"./output/{style.replace(' ','_')}")print(f"风格 [{style}] -> {paths}")if __name__ =="__main__": batch_generate_portraits()

4.3 调用 Stability AI 云端 API

# stability_cloud.pyimport requests import base64 from pathlib import Path from PIL import Image from io import BytesIO classStabilityAIClient:"""Stability AI 官方云端 API"""def__init__(self, api_key:str): self.api_key = api_key self.base_url ="https://api.stability.ai/v2beta"defgenerate(self, prompt:str, aspect_ratio:str="1:1", style:str="photographic", output_path:str="output.png")->str:"""调用 Stable Diffusion 3 生成图像""" response = requests.post(f"{self.base_url}/stable-image/generate/sd3", headers={"Authorization":f"Bearer {self.api_key}","Accept":"image/*",}, files={"none":""}, data={"prompt": prompt,"aspect_ratio": aspect_ratio,"style_preset": style,"output_format":"png",},)if response.status_code !=200:raise Exception(f"API 错误: {response.status_code} - {response.text}")withopen(output_path,"wb")as f: f.write(response.content)print(f"已生成: {output_path}")return output_path # 使用示例if __name__ =="__main__": client = StabilityAIClient(api_key="sk-your-api-key") client.generate( prompt="A majestic dragon flying over a neon-lit cyberpunk city at night, ""highly detailed, cinematic lighting, 8k", aspect_ratio="16:9", style="cinematic", output_path="dragon_city.png",)

4.4 图像后处理管道

# postprocess.pyfrom PIL import Image, ImageEnhance, ImageFilter from pathlib import Path classImagePostProcessor:"""图像后处理:调整色彩、锐化、添加水印"""@staticmethoddefenhance(image_path:str, brightness:float=1.1, contrast:float=1.15, sharpness:float=1.3, output_path:str=None)->str:"""综合增强""" img = Image.open(image_path) img = ImageEnhance.Brightness(img).enhance(brightness) img = ImageEnhance.Contrast(img).enhance(contrast) img = ImageEnhance.Sharpness(img).enhance(sharpness) output_path = output_path or image_path.replace(".","_enhanced.") img.save(output_path, quality=95)return output_path @staticmethoddefadd_watermark(image_path:str, text:str="AI Generated", output_path:str=None)->str:"""添加水印"""from PIL import ImageDraw, ImageFont img = Image.open(image_path).convert("RGBA") overlay = Image.new("RGBA", img.size,(0,0,0,0)) draw = ImageDraw.Draw(overlay)# 半透明白色文字 draw.text((img.width -200, img.height -40), text, fill=(255,255,255,128),) img = Image.alpha_composite(img, overlay).convert("RGB") output_path = output_path or image_path.replace(".","_wm.") img.save(output_path, quality=95)return output_path @staticmethoddefcreate_grid(image_paths:list[str], cols:int=3, output_path:str="grid.png")->str:"""将多张图片拼成网格""" images =[Image.open(p)for p in image_paths] w, h = images[0].size rows =(len(images)+ cols -1)// cols grid = Image.new("RGB",(w * cols, h * rows),"white")for i, img inenumerate(images): row, col =divmod(i, cols) grid.paste(img,(col * w, row * h)) grid.save(output_path, quality=95)print(f"网格图已保存: {output_path}")return output_path 

5. Prompt 工程技巧

Prompt 结构

主体描述

风格关键词

质量修饰词

负面提示词

高质量 Prompt 模板

PROMPT_TEMPLATES ={"人物肖像":("{subject}, {style}, detailed face, expressive eyes, ""dramatic lighting, masterpiece, best quality, ultra detailed"),"风景":("{scene}, {mood}, volumetric lighting, god rays, ""landscape photography, 8k uhd, cinematic composition"),"产品设计":("{product}, minimalist design, studio lighting, ""white background, product photography, professional, 4k"),"动漫":("{character}, anime style, vibrant colors, ""detailed illustration, cel shading, masterpiece"),} NEGATIVE_PROMPTS ={"通用":"lowres, bad anatomy, bad hands, text, error, missing fingers, ""extra digit, cropped, worst quality, low quality, blurry","写实":"illustration, painting, drawing, art, sketch, anime, cartoon, ""CG, render, 3D, watermark, text, font, signature","动漫":"photo, realistic, 3d, western, ugly, duplicate, morbid, ""deformed, bad anatomy, blurry",}

6. 关键参数影响

35%20%15%15%10%5%不同参数对生成质量的影响权重Prompt 质量采样步数 (steps)CFG Scale采样器选择模型选择分辨率

参数推荐值说明
steps25-35步数越多细节越好,但边际递减且更慢
cfg_scale7-12越高越遵循 prompt,过高会过饱和
samplerDPM++ 2M Karras兼顾速度与质量
denoising_strength0.5-0.8图生图降噪强度,越高变化越大
seed-1随机种子,固定可复现

7. 完整使用流程

# complete_demo.pyfrom sd_client import StableDiffusionClient, GenerationConfig from stability_cloud import StabilityAIClient from postprocess import ImagePostProcessor defmain():# ===== 方式一:本地 SD WebUI ===== sd = StableDiffusionClient("http://127.0.0.1:7860")# 文生图 config = GenerationConfig( prompt="A serene Japanese garden with cherry blossoms, ""koi pond, stone bridge, golden hour, cinematic, 8k", negative_prompt="lowres, blurry, text, watermark", width=768, height=512, steps=30, cfg_scale=7.5,) paths = sd.txt2img(config)print(f"生成完成: {paths}")# 图生图if paths: new_paths = sd.img2img( init_image_path=paths[0], prompt="same scene but in autumn, orange and red leaves, snow", denoising_strength=0.6,)print(f"图生图完成: {new_paths}")# 超分辨率if paths: upscaled = sd.upscale(paths[0], scale=2)print(f"超分辨率完成: {upscaled}")# 后处理 pp = ImagePostProcessor()if paths: enhanced = pp.enhance(paths[0]) watermarked = pp.add_watermark(enhanced, text="AI Art")print(f"后处理完成: {watermarked}")# ===== 方式二:云端 API =====# cloud = StabilityAIClient("sk-xxx")# cloud.generate("A futuristic cityscape at sunset", "16:9", "cinematic")if __name__ =="__main__": main()

8. 总结

本文覆盖了 Stable Diffusion 图像生成的完整链路:

  1. 本地部署 SD WebUI 并开启 API 模式
  2. 封装 Python 客户端 支持文生图、图生图、局部重绘、超分辨率
  3. 云端 API 作为无 GPU 环境的替代方案
  4. Prompt 工程 模板化的提示词编写技巧
  5. 后处理管道 增强色彩、添加水印、拼图网格
生成速度参考:RTX 4090 生成 512x512 约 3-5 秒,512x768 约 5-8 秒。云端 API 约 10-20 秒。

Read more

Trae x 图片素描MCP一键将普通图片转换为多风格素描效果

Trae x 图片素描MCP一键将普通图片转换为多风格素描效果

目录 * 前言 * 一、核心工具与优势解析 * 二、操作步骤:从安装到生成素描效果 * 第一步:获取MCP配置代码 * 第二步:下载 * 第三步:在 Trae 中导入 MCP 配置并建立连接 * 第四步:核心功能调用 * 三、三大素描风格差异化应用 * 四.总结 前言 在设计创作、社交媒体分享、教育演示等场景中,素描风格的图片往往能以简洁的线条突出主体特征,带来独特的艺术质感。然而,传统素描效果制作需借助专业设计软件(如Photoshop、Procreate),不仅操作复杂,还需掌握一定的绘画技巧,难以满足普通用户快速生成素描的需求。 为解决这一痛点,本文将介绍蓝耘MCP广场提供的图片素描MCP工具(工具ID:3423)。该工具基于MCP(Model Context Protocol)协议开发,支持单张/批量图片转换、3种素描风格切换及自定义参数调节,兼容多种图片格式与中文路径,无需专业设计能力,

Node.js Web Streams API实战简化流处理

Node.js Web Streams API实战简化流处理

💓 博客主页:瑕疵的ZEEKLOG主页📝 Gitee主页:瑕疵的gitee主页⏩ 文章专栏:《热点资讯》 Node.js Web Streams API实战:简化流处理的革命性实践 目录 * Node.js Web Streams API实战:简化流处理的革命性实践 * 引言:流处理的困境与破局点 * 一、为什么Web Streams API是流处理的“破壁者”? * 传统流处理的三大痛点 * Web Streams API的核心优势 * 二、实战:从复杂到优雅的代码演进 * 场景:文件内容转换(CSV → JSON) * 传统方案(`stream`模块):15行+的“地狱代码” * Web Streams方案:5行代码的优雅实现 * 三、场景深化:Web Streams的跨界价值

Obsidian同步太折腾?试试坚果云官方插件:免WebDAV配置,支持Git级冲突合并

Obsidian同步太折腾?试试坚果云官方插件:免WebDAV配置,支持Git级冲突合并

Obsidian 作为本地 Markdown 笔记软件的王者,其“数据掌握在自己手中”的理念深受开发者喜爱。但作为一名多端用户,同步问题一直是最大的痛点。官方 Sync 服务太贵,WebDAV 配置繁琐且不仅容易断连,还经常遇到笔记冲突。 终于,大家催了无数遍的 Obsidian x 坚果云「官方同步插件 Nutstore Sync」 正式上架社区插件市场了! 这不仅仅是一个同步工具,更是一套完整的移动端解决方案。 为什么推荐这款官方插件? 1. 告别复杂的 WebDAV 配置(SSO单点登录) 以前配置 WebDAV,你需要生成应用密码、复制服务器地址、担心端口被封。 现在,安装 Nutstore Sync 后,直接点击“登录”,通过单点登录 授权,一键回调到 Obsidian,配置过程缩短到秒级,新手极其友好。

Windows软件安装报错?3分钟搞定Webview2和.NET4.8缺失问题(附C盘权限获取技巧)

Windows软件安装报错终极指南:从Webview2到.NET4.8的完整解决方案 每次安装新软件时遇到"缺少Webview2 Runtime"或".NET Framework 4.8未安装"的报错提示,是不是让你感到无比烦躁?这些看似复杂的系统组件缺失问题,其实都有简单直接的解决方法。本文将带你一步步彻底解决这些安装障碍,同时分享几个鲜为人知的C盘权限管理技巧,让你的软件安装过程从此畅通无阻。 1. 理解核心组件:Webview2和.NET4.8为何如此重要 现代Windows软件越来越依赖这些基础运行环境。Microsoft Edge WebView2是一个嵌入式浏览器组件,允许应用程序显示网页内容,而.NET Framework 4.8则是微软开发的软件开发平台,许多程序都基于它构建。当你的系统缺少这些组件时,就像试图在没有地基的房子上盖楼——注定会失败。 常见症状包括: * "Microsoft Edge WebView2 Runtime未安装"错误提示