stable_diffusion

package module
v0.1.0
Published: Jan 2, 2026 License: MIT Imports: 6 Imported by: 0

README

stable-diffusion-go

Simplified Chinese

A pure Go binding library for stable-diffusion.cpp, built on github.com/ebitengine/purego: no cgo dependency is required, and it runs cross-platform.

🌟 Project Features

  • Pure Go Implementation: Calls the C++ dynamic libraries through the purego library, with no cgo required
  • Cross-platform Support: Runs on Windows, Linux, macOS, and other mainstream operating systems
  • Complete Functionality: Implements the main APIs of stable-diffusion.cpp, including text-to-image, image-to-image, and video generation
  • Simple and Easy to Use: Provides a concise Go API that is easy to integrate into existing projects
  • High Performance: Supports optimizations such as FlashAttention and model quantization
  • Precompiled Libraries Included: Ships precompiled dynamic libraries for Windows, ready to use out of the box

๐Ÿ“ Project Structure

stable-diffusion-go/
├── examples/           # Example programs directory
│   ├── txt2img.go      # Text-to-image generation example
│   └── txt2vid.go      # Text-to-video generation example
├── lib/                # Dynamic library directory
│   ├── darwin/         # macOS platform dynamic library
│   │   └── libstable-diffusion.dylib
│   ├── linux/          # Linux platform dynamic library
│   │   └── libstable-diffusion.so
│   ├── windows/        # Windows platform dynamic libraries
│   │   ├── avx/        # AVX instruction set version
│   │   │   └── stable-diffusion.dll
│   │   ├── avx2/       # AVX2 instruction set version
│   │   │   └── stable-diffusion.dll
│   │   ├── avx512/     # AVX512 instruction set version
│   │   │   └── stable-diffusion.dll
│   │   ├── cuda12/     # CUDA 12 version
│   │   │   └── stable-diffusion.dll
│   │   ├── noavx/      # No-AVX version
│   │   │   └── stable-diffusion.dll
│   │   ├── rocm/       # ROCm version
│   │   │   └── stable-diffusion.dll
│   │   └── vulkan/     # Vulkan version
│   │       └── stable-diffusion.dll
│   ├── ggml.txt
│   ├── stable-diffusion.cpp.txt
│   └── version.txt
├── pkg/                # Go package directory
│   └── sd/             # Core binding library
│       ├── load_library_unix.go     # Unix dynamic library loading
│       ├── load_library_windows.go  # Windows dynamic library loading
│       ├── stable_diffusion.go      # Core functionality implementation
│       └── utils.go                 # Auxiliary utility functions
├── .gitignore          # Git ignore configuration
├── README.md           # Project documentation
├── go.mod              # Go module file
├── go.sum              # Go dependency checksum file
└── stable_diffusion.go # Root package entry file

Note: All dynamic library files in the lib directory must be downloaded from https://github.com/leejet/stable-diffusion.cpp/releases, matching the version recorded in lib/version.txt.

🚀 Quick Start

1. Install Dependencies
go get github.com/orangelang/stable-diffusion-go
2. Prepare Model Files

Model files must be prepared before use; multiple formats are supported:

  • Diffusion models: .gguf format (e.g., z_image_turbo-Q4_K_M.gguf)
  • LLM models: .gguf format (e.g., Qwen3-4B-Instruct-2507-Q4_K_M.gguf)
  • VAE models: .safetensors format (e.g., diffusion_pytorch_model.safetensors)
3. Dynamic Library Description

The project includes precompiled dynamic libraries for multiple platforms, located in the lib/ directory:

  • Windows: Multiple versions to suit different hardware
    • avx/: Supports AVX instruction set
    • avx2/: Supports AVX2 instruction set
    • avx512/: Supports AVX512 instruction set
    • cuda12/: Supports CUDA 12
    • noavx/: No AVX instruction set dependency
    • rocm/: Supports ROCm
    • vulkan/: Supports Vulkan
  • Linux: libstable-diffusion.so
  • macOS: libstable-diffusion.dylib

The program automatically selects the appropriate dynamic library for the current environment; no manual specification is required.
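
For illustration only, platform selection of this kind usually keys off runtime.GOOS. The following sketch is an assumption about the approach, not the package's actual loader (which lives in pkg/sd/load_library_unix.go and load_library_windows.go):

package main

import (
	"fmt"
	"path/filepath"
	"runtime"
)

// libraryPath sketches OS-based library selection. It is hypothetical:
// the real loader may also probe CPU features (AVX/AVX2/AVX512) and GPUs
// (CUDA, ROCm, Vulkan) to pick a Windows subdirectory.
func libraryPath(root string) string {
	switch runtime.GOOS {
	case "windows":
		return filepath.Join(root, "lib", "windows", "avx2", "stable-diffusion.dll")
	case "darwin":
		return filepath.Join(root, "lib", "darwin", "libstable-diffusion.dylib")
	default:
		return filepath.Join(root, "lib", "linux", "libstable-diffusion.so")
	}
}

func main() {
	fmt.Println(libraryPath("."))
}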

4. Run Examples
Text-to-Image Generation
# Enter the examples directory
cd examples

# Run text-to-image example
go run txt2img.go

Example code:

package main

import (
	"fmt"
	stablediffusion "github.com/orangelang/stable-diffusion-go"
)

func main() {
	fmt.Println("Stable Diffusion Go - Text to Image Example")
	fmt.Println("===============================================")

	// Create Stable Diffusion instance
	sd, err := stablediffusion.NewStableDiffusion(&stablediffusion.ContextParams{
		DiffusionModelPath: "path/to/diffusion_model.gguf",
		LLMPath:            "path/to/llm_model.gguf",
		VAEPath:            "path/to/vae_model.safetensors",
		DiffusionFlashAttn: true,
		OffloadParamsToCPU: true,
	})

	if err != nil {
		fmt.Println("Failed to create instance:", err)
		return
	}
	defer sd.Free()

	// Generate image
	err = sd.GenerateImage(&stablediffusion.ImgGenParams{
		Prompt:      "一位穿着明朝服饰的美女行走在花园中", // "A beauty in Ming-dynasty dress walking in a garden"
		Width:       512,
		Height:      512,
		SampleSteps: 10,
		CfgScale:    1.0,
	}, "output.png")

	if err != nil {
		fmt.Println("Failed to generate image:", err)
		return
	}

	fmt.Println("Image generated successfully!")
}

Text-to-Video Generation
# Run text-to-video example
go run txt2vid.go

📚 Core Features

1. Context Management
  • Create and destroy Stable Diffusion contexts
  • Support multiple model path configurations
  • Provide rich performance optimization parameters
2. Text-to-Image Generation (txt2img)
  • Generate high-quality images from text descriptions
  • Support Chinese and English prompts
  • Adjustable image dimensions, sampling steps, CFG scale, and other parameters
  • Support random seed generation
3. Text-to-Video Generation (txt2vid)
  • Generate videos from text prompts
  • Support custom frame count and resolution
  • Support Easycache optimization
  • Integrate FFmpeg for video encoding

๐Ÿ“ Usage Guide

Basic Usage
  1. Create Instance: Use NewStableDiffusion to create a Stable Diffusion instance
  2. Configure Parameters: Set context parameters and generation parameters
  3. Generate Content: Call GenerateImage or GenerateVideo to generate content
  4. Release Resources: Use defer sd.Free() to release resources
Context Parameters Description

Parameter Name      Type    Description
DiffusionModelPath  string  Diffusion model file path
LLMPath             string  LLM model file path
VAEPath             string  VAE model file path
NThreads            int32   Number of threads
DiffusionFlashAttn  bool    Whether to enable FlashAttention
OffloadParamsToCPU  bool    Whether to offload some parameters to CPU
WType               string  Model quantization (weight) type, e.g. "q4_k"

Image Generation Parameters Description

Parameter Name  Type     Description
Prompt          string   Prompt text
NegativePrompt  string   Negative prompt text
Width           int32    Image width (pixels)
Height          int32    Image height (pixels)
Seed            int64    Random seed (< 0 for a random seed)
SampleSteps     int32    Number of sampling steps
CfgScale        float32  CFG scale
Strength        float32  Initial image strength (img2img only; see the sketch below)
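
Strength only takes effect together with InitImagePath. A minimal image-to-image sketch (model and image paths are placeholders):

package main

import (
	"fmt"
	stablediffusion "github.com/orangelang/stable-diffusion-go"
)

func main() {
	sd, err := stablediffusion.NewStableDiffusion(&stablediffusion.ContextParams{
		DiffusionModelPath: "path/to/diffusion_model.gguf",
		VAEPath:            "path/to/vae_model.safetensors",
	})
	if err != nil {
		fmt.Println("Failed to create instance:", err)
		return
	}
	defer sd.Free()

	// Image-to-image: transform an existing picture under a new prompt.
	err = sd.GenerateImage(&stablediffusion.ImgGenParams{
		Prompt:        "the same scene at sunset",
		InitImagePath: "input.png", // initial image used for guidance
		Strength:      0.6,         // conventionally, 0.0 keeps the input and 1.0 fully re-noises it
		Width:         512,
		Height:        512,
		SampleSteps:   15,
		CfgScale:      2.0,
	}, "output_img2img.png")
	if err != nil {
		fmt.Println("Failed to generate image:", err)
	}
}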

🔧 Performance Optimization

1. Adjust Thread Count

Adjust the NThreads parameter according to the number of CPU cores:

ctxParams := &stablediffusion.ContextParams{
    // Other parameters...
    NThreads: 8, // Adjust according to CPU core count
}
2. Use Quantized Models

Using quantized models can improve performance and reduce memory usage:

ctxParams := &stablediffusion.ContextParams{
    // Other parameters...
    WType: "q4_k", // Q4_K quantized weights; valid names are the keys of SDTypeMap
}
3. Adjust Sampling Steps

Reducing the number of sampling steps can improve generation speed but may reduce image quality:

imgGenParams := &stablediffusion.ImgGenParams{
    // Other parameters...
    SampleSteps: 10, // Reduce sampling steps
}
4. Enable FlashAttention

Enabling FlashAttention can accelerate the diffusion process:

ctxParams := &stablediffusion.ContextParams{
    // Other parameters...
    DiffusionFlashAttn: true,
}

⚠️ Notes

  1. Dynamic Library Path: The program automatically selects the appropriate dynamic library from the lib/ directory based on the current environment
  2. Model Compatibility: Make sure the model formats you use are compatible with stable-diffusion.cpp
  3. Dependencies: Install dependencies such as CUDA or Vulkan as needed
  4. Video Generation: Requires FFmpeg for video encoding (see the preflight sketch after this list)
  5. Memory Usage: Large models may require more memory; quantized models are recommended
  6. AMD Graphics Cards (Windows): If you use an AMD graphics card (including AMD integrated graphics), download the ROCm library and place it in the project root directory: https://github.com/leejet/stable-diffusion.cpp/releases/download/master-453-4ff2c8c/sd-master-4ff2c8c-bin-win-rocm-x64.zip
  7. Vulkan: If you use a non-NVIDIA graphics card (such as an AMD or Intel card, including integrated graphics), you can install Vulkan to enable GPU acceleration
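
Because video generation depends on an external FFmpeg binary, a preflight check can fail fast with a clearer error. A minimal sketch, assuming ffmpeg is invoked from PATH:

package main

import (
	"fmt"
	"os/exec"
)

func main() {
	// Check that ffmpeg is reachable before loading multi-gigabyte models.
	if _, err := exec.LookPath("ffmpeg"); err != nil {
		fmt.Println("ffmpeg not found on PATH; video generation will fail:", err)
		return
	}
	fmt.Println("ffmpeg found")
}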

📦 Example Programs

Text-to-Image Example
package main

import (
	"fmt"
	stablediffusion "github.com/orangelang/stable-diffusion-go"
)

func main() {
	// Create instance
	sd, err := stablediffusion.NewStableDiffusion(&stablediffusion.ContextParams{
		DiffusionModelPath: "models/z_image_turbo-Q4_K_M.gguf",
		LLMPath:            "models/Qwen3-4B-Instruct-2507-Q4_K_M.gguf",
		VAEPath:            "models/diffusion_pytorch_model.safetensors",
		DiffusionFlashAttn: true,
	})
	if err != nil {
		fmt.Println("Failed to create instance:", err)
		return
	}
	defer sd.Free()

	// Generate image
	err = sd.GenerateImage(&stablediffusion.ImgGenParams{
		Prompt:      "A cute Corgi dog running on the grass",
		Width:       512,
		Height:      512,
		SampleSteps: 15,
		CfgScale:    2.0,
	}, "output_corgi.png")

	if err != nil {
		fmt.Println("Failed to generate image:", err)
		return
	}

	fmt.Println("Image generated successfully!")
}
Text-to-Video Example
package main

import (
	"fmt"
	stablediffusion "github.com/orangelang/stable-diffusion-go"
)

func main() {
	// Create instance
	sd, err := stablediffusion.NewStableDiffusion(&stablediffusion.ContextParams{
		DiffusionModelPath: "D:\\hf-mirror\\wan2.1\\wan2.1_t2v_1.3B_bf16.safetensors",
		T5XXLPath:          "D:\\hf-mirror\\wan2.1\\umt5-xxl-encoder-Q4_K_M.gguf",
		VAEPath:            "D:\\hf-mirror\\wan2.1\\wan_2.1_vae.safetensors",
		DiffusionFlashAttn: true,
		KeepClipOnCPU:      true,
		OffloadParamsToCPU: true,
		NThreads:           4,
		FlowShift:          3.0,
	})

	if err != nil {
		fmt.Println("Failed to create stable diffusion instance:", err)
		return
	}
	defer sd.Free()

	err = sd.GenerateVideo(&stablediffusion.VidGenParams{
		Prompt:      "一个在长满桃花树下拍照的美女", // "A beauty taking photos under blossoming peach trees"
		Width:       300,
		Height:      300,
		SampleSteps: 40,
		VideoFrames: 33,
		CfgScale:    6.0,
	}, "./output.mp4")

	if err != nil {
		fmt.Println("Failed to generate video:", err)
		return
	}

	fmt.Println("Video generated successfully!")
}

📄 License

MIT License

🤝 Contribution

Issues and pull requests are welcome!

📞 Support

If you encounter problems during use, please:

  1. Review the example code
  2. Verify the dynamic library path and model files
  3. Search existing project Issues
  4. Submit a new Issue

Thank you for using stable-diffusion-go! If this project has helped you, please give it a Star ⭐️

Documentation

Index

Constants

This section is empty.

Variables

var LoraApplyModeMap = map[string]sd.LoraApplyMode{
	"auto":                  sd.LoraApplyAuto,
	"immediately":           sd.LoraApplyImmediately,
	"at_runtime":            sd.LoraApplyAtRuntime,
	"lora_apply_mode_count": sd.LoraApplyModeCount,
}

LoraApplyModeMap maps LoRA apply mode names to sd.LoraApplyMode values.

var PredictionMap = map[string]sd.Prediction{
	"eps":        sd.EPSPred,
	"v":          sd.VPred,
	"edm_v":      sd.EDMVPred,
	"flow":       sd.FlowPred,
	"flux_flow":  sd.FluxFlowPred,
	"flux2_flow": sd.Flux2FlowPred,
	"default":    sd.PredictionCount,
}

PredictionMap maps prediction type names to sd.Prediction values.

var PreviewMap = map[string]sd.Preview{
	"none":          sd.PreviewNone,
	"proj":          sd.PreviewProj,
	"tae":           sd.PreviewTAE,
	"vae":           sd.PreviewVAE,
	"preview_count": sd.PreviewCount,
}

PreviewMap maps preview type names to sd.Preview values.

var RNGTypeMap = map[string]sd.RngType{
	"default":    sd.DefaultRNG,
	"cuda":       sd.CUDARNG,
	"cpu":        sd.CPURNG,
	"type_count": sd.RNGTypeCount,
}

RNGTypeMap maps RNG type names to sd.RngType values.

var SDTypeMap = map[string]sd.SDType{
	"f32":  sd.SDTypeF32,
	"f16":  sd.SDTypeF16,
	"q4_0": sd.SDTypeQ4_0,
	"q4_1": sd.SDTypeQ4_1,
	"q5_0": sd.SDTypeQ5_0,
	"q5_1": sd.SDTypeQ5_1,
	"q8_0": sd.SDTypeQ8_0,
	"q8_1": sd.SDTypeQ8_1,

	"q2_k":    sd.SDTypeQ2_K,
	"q3_k":    sd.SDTypeQ3_K,
	"q4_k":    sd.SDTypeQ4_K,
	"q5_k":    sd.SDTypeQ5_K,
	"q6_k":    sd.SDTypeQ6_K,
	"q8_k":    sd.SDTypeQ8_K,
	"iq2_xxs": sd.SDTypeIQ2_XXS,
	"iq2_xs":  sd.SDTypeIQ2_XS,
	"iq3_xxs": sd.SDTypeIQ3_XXS,
	"iq1_s":   sd.SDTypeIQ1_S,
	"iq4_nl":  sd.SDTypeIQ4_NL,
	"iq3_s":   sd.SDTypeIQ3_S,
	"iq2_s":   sd.SDTypeIQ2_S,
	"iq4_xs":  sd.SDTypeIQ4_XS,
	"i8":      sd.SDTypeI8,
	"i16":     sd.SDTypeI16,
	"i32":     sd.SDTypeI32,
	"i64":     sd.SDTypeI64,
	"f64":     sd.SDTypeF64,
	"iq1_m":   sd.SDTypeIQ1_M,
	"bf16":    sd.SDTypeBF16,

	"tq1_0": sd.SDTypeTQ1_0,
	"tq2_0": sd.SDTypeTQ2_0,

	"mxfp4":   sd.SDTypeMXFP4,
	"default": sd.SDTypeCount,
}

SDTypeMap maps weight type names to sd.SDType values.
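
These string keys are the values accepted by fields such as ContextParams.WType. A small sketch validating a user-supplied name before use (assuming the package resolves these strings through this map):

package main

import (
	"fmt"
	stablediffusion "github.com/orangelang/stable-diffusion-go"
)

func main() {
	// SDTypeMap doubles as a whitelist for ContextParams.WType, which
	// takes the string form of the weight type.
	wtype := "q4_k"
	if _, ok := stablediffusion.SDTypeMap[wtype]; !ok {
		fmt.Printf("unknown weight type %q\n", wtype)
		return
	}
	fmt.Println("weight type accepted:", wtype)
}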

var SampleMethodMap = map[string]sd.SampleMethod{
	"default":             -1,
	"euler":               sd.EulerSampleMethod,
	"euler_a":             sd.EulerASampleMethod,
	"heun":                sd.HeunSampleMethod,
	"dpm2":                sd.DPM2SampleMethod,
	"dpm++2s_a":           sd.DPMPP2SASampleMethod,
	"dpm++2m":             sd.DPMPP2MSampleMethod,
	"dpm++2mv2":           sd.DPMPP2Mv2SampleMethod,
	"ipndm":               sd.IPNDMSampleMethod,
	"ipndm_v":             sd.IPNDMSampleMethodV,
	"lcm":                 sd.LCMSampleMethod,
	"ddim_trailing":       sd.DDIMTrailingSampleMethod,
	"tcd":                 sd.TCDSampleMethod,
	"sample_method_count": sd.SampleMethodCount,
}

SampleMethodMap maps sampling method names to sd.SampleMethod values.

var SchedulerMap = map[string]sd.Scheduler{
	"default":         -1,
	"discrete":        sd.DiscreteScheduler,
	"karras":          sd.KarrasScheduler,
	"exponential":     sd.ExponentialScheduler,
	"ays":             sd.AYSScheduler,
	"gits":            sd.GITScheduler,
	"sgm_uniform":     sd.SGMUniformScheduler,
	"simple":          sd.SimpleScheduler,
	"smoothstep":      sd.SmoothstepScheduler,
	"kl_optimal":      sd.KLOptimalScheduler,
	"lcm":             sd.LCMScheduler,
	"scheduler_count": sd.SchedulerCount,
}

SchedulerMap maps scheduler names to sd.Scheduler values.
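
ImgGenParams.Scheduler and ImgGenParams.SampleMethod are plain strings, and the keys of SchedulerMap and SampleMethodMap appear to enumerate the accepted values. A sketch that lists them:

package main

import (
	"fmt"
	stablediffusion "github.com/orangelang/stable-diffusion-go"
)

func main() {
	// Print the scheduler and sampler names this build understands.
	for name := range stablediffusion.SchedulerMap {
		fmt.Println("scheduler:", name)
	}
	for name := range stablediffusion.SampleMethodMap {
		fmt.Println("sampler:", name)
	}
}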

Functions

func Convert

func Convert(inputPath, vaePath, outputPath, outputType, tensorTypeRules string, convertName bool) error

Convert converts a model to gguf format. inputPath: path to the input model. vaePath: path to the VAE. outputPath: path to save the converted model. outputType: the weight type (default: auto). tensorTypeRules: weight type per tensor pattern (example: "^vae\.=f16,model\.=q8_0").
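
A usage sketch (all paths are placeholders; the semantics of the final convertName argument are not documented here):

package main

import (
	"fmt"
	stablediffusion "github.com/orangelang/stable-diffusion-go"
)

func main() {
	// Convert a checkpoint to gguf with q8_0 weights.
	err := stablediffusion.Convert(
		"model.safetensors", // inputPath
		"",                  // vaePath (none)
		"model-q8_0.gguf",   // outputPath
		"q8_0",              // outputType
		"",                  // tensorTypeRules (none)
		false,               // convertName (semantics undocumented)
	)
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Println("conversion finished")
}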

Types

type ContextParams

type ContextParams struct {
	ModelPath                   string     // Full model path
	ClipLPath                   string     // CLIP-L text encoder path
	ClipGPath                   string     // CLIP-G text encoder path
	ClipVisionPath              string     // CLIP Vision encoder path
	T5XXLPath                   string     // T5-XXL text encoder path
	LLMPath                     string     // LLM text encoder path (e.g., qwenvl2.5 for qwen-image, mistral-small3.2 for flux2)
	LLMVisionPath               string     // LLM Vision encoder path
	DiffusionModelPath          string     // Standalone diffusion model path
	HighNoiseDiffusionModelPath string     // Standalone high noise diffusion model path
	VAEPath                     string     // VAE model path
	TAESDPath                   string     // TAE-SD model path, uses Tiny AutoEncoder for fast decoding (low quality)
	ControlNetPath              string     // ControlNet model path
	Embeddings                  *Embedding // Embedding information
	EmbeddingCount              uint32     // Number of embeddings
	PhotoMakerPath              string     // PhotoMaker model path
	TensorTypeRules             string     // Weight type rules per tensor pattern (e.g., "^vae\.=f16,model\.=q8_0")
	VAEDecodeOnly               bool       // Process VAE using only decode mode
	FreeParamsImmediately       bool       // Whether to free parameters immediately
	NThreads                    int32      // Number of threads to use for generation
	WType                       string     // Weight type (default: auto-detect from model file)
	RNGType                     string     // Random number generator type (default: "cuda")
	SamplerRNGType              string     // Sampler random number generator type (default: "cuda")
	Prediction                  string     // Prediction type override
	LoraApplyMode               string     // LoRA application mode (default: "auto")
	OffloadParamsToCPU          bool       // Keep weights in RAM to save VRAM, auto-load to VRAM when needed
	EnableMmap                  bool       // Whether to enable memory mapping
	KeepClipOnCPU               bool       // Keep CLIP on CPU (for low VRAM)
	KeepControlNetOnCPU         bool       // Keep ControlNet on CPU (for low VRAM)
	KeepVAEOnCPU                bool       // Keep VAE on CPU (for low VRAM)
	DiffusionFlashAttn          bool       // Use Flash attention in diffusion model (significantly reduces memory usage)
	TAEPreviewOnly              bool       // Prevent decoding final image with taesd (for preview="tae")
	DiffusionConvDirect         bool       // Use Conv2d direct in diffusion model
	VAEConvDirect               bool       // Use Conv2d direct in VAE model (should improve performance)
	CircularX                   bool       // Enable circular padding on X axis
	CircularY                   bool       // Enable circular padding on Y axis
	ForceSDXLVAConvScale        bool       // Force conv scale on SDXL VAE
	ChromaUseDitMask            bool       // Whether Chroma uses DiT mask
	ChromaUseT5Mask             bool       // Whether Chroma uses T5 mask
	ChromaT5MaskPad             int32      // Chroma T5 mask padding size
	QwenImageZeroCondT          bool       // Qwen-image zero condition T parameter
	FlowShift                   float32    // Shift value for Flow models (e.g., SD3.x or WAN)
}

ContextParams holds the context parameters used to initialize a Stable Diffusion context.

type Embedding

type Embedding struct {
	Name string // Embedding name
	Path string // Embedding file path
}

Embedding describes a model embedding by name and file path.

type ImgGenParams

type ImgGenParams struct {
	Loras              *Lora             // LoRA parameters
	LoraCount          uint32            // Number of LoRAs
	Prompt             string            // Prompt to render
	NegativePrompt     string            // Negative prompt
	ClipSkip           int32             // Skip last layers of CLIP network (1 = no skip, 2 = skip one layer, <=0 = not specified)
	InitImagePath      string            // Initial image path for guidance
	RefImagesPath      []string          // Array of reference image paths for Flux Kontext models
	RefImagesCount     int32             // Number of reference images
	AutoResizeRefImage bool              // Whether to auto-resize reference images
	IncreaseRefIndex   bool              // Whether to auto-increase index based on reference image list order (starting from 1)
	MaskImagePath      string            // Inpainting mask image path
	Width              int32             // Image width (pixels)
	Height             int32             // Image height (pixels)
	CfgScale           float32           // Unconditional guidance scale.
	ImageCfgScale      float32           // Image guidance scale for inpaint or instruct-pix2pix models (default: same as `CfgScale`).
	DistilledGuidance  float32           // Distilled guidance scale for models with guidance input.
	SkipLayers         []int32           // Layers to skip for SLG steps (SLG will be enabled at step int([STEPS]x[START]) and disabled at int([STEPS]x[END])).
	SkipLayerStart     float32           // SLG enabling point.
	SkipLayerEnd       float32           // SLG disabling point.
	SlgScale           float32           // Skip layer guidance (SLG) scale, only for DiT models.
	Scheduler          string            // Denoiser sigma scheduler (default: discrete).
	SampleMethod       string            // Sampling method (default: euler for Flux/SD3/Wan, euler_a otherwise).
	SampleSteps        int32             // Number of sample steps.
	Eta                float32           // Eta in DDIM, only for DDIM and TCD.
	ShiftedTimestep    int32             // Shift timestep for NitroFusion models (default: 0; recommended: ~250 for NitroSD-Realism, ~500 for NitroSD-Vibrant).
	CustomSigmas       []float32         // Custom sigma values for the sampler, comma-separated (e.g. "14.61,7.8,3.5,0.0").
	Strength           float32           // Noise/denoise strength (range [0.0, 1.0])
	Seed               int64             // RNG seed (< 0 for random seed)
	BatchCount         int32             // Number of images to generate
	ControlImagePath   string            // Control condition image path for ControlNet
	ControlStrength    float32           // Strength to apply ControlNet
	PMParams           *PMParams         // PhotoMaker parameters
	VAETilingParams    sd.SDTilingParams // VAE tiling parameters for reducing memory usage
	CacheParams        sd.SDCacheParams  // Cache parameters for DiT models
}

ImgGenParams holds the parameters for image generation.

type Lora

type Lora struct {
	IsHighNoise bool    // Whether it's a high noise LoRA
	Multiplier  float32 // LoRA multiplier
	Path        string  // LoRA file path
}

Lora describes a LoRA model and its multiplier.
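
ImgGenParams.Loras is a pointer plus a LoraCount, mirroring the underlying C API's array convention. Assuming the binding reads LoraCount consecutive elements (an assumption, not confirmed by the docs), a slice can be passed via its first element; ContextParams.Embeddings/EmbeddingCount follow the same shape:

package main

import (
	"fmt"
	stablediffusion "github.com/orangelang/stable-diffusion-go"
)

func main() {
	// Hypothetical LoRA setup: pass the address of the first element of a
	// contiguous slice, with LoraCount giving its length.
	loras := []stablediffusion.Lora{
		{Path: "style.safetensors", Multiplier: 0.8},
	}
	params := &stablediffusion.ImgGenParams{
		Prompt:    "a watercolor landscape",
		Loras:     &loras[0],
		LoraCount: uint32(len(loras)),
	}
	fmt.Println("configured", params.LoraCount, "LoRA(s)")
}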

type PMParams

type PMParams struct {
	IDImages      *sd.SDImage // ID images pointer
	IDImagesCount int32       // Number of ID images
	IDEmbedPath   string      // PhotoMaker v2 ID embedding path
	StyleStrength float32     // Strength to keep PhotoMaker input identity
}

PMParams holds PhotoMaker-related parameters.

type StableDiffusion

type StableDiffusion struct {
	// contains filtered or unexported fields
}

StableDiffusion wraps a Stable Diffusion context.

func NewStableDiffusion

func NewStableDiffusion(ctxParams *ContextParams) (*StableDiffusion, error)

NewStableDiffusion creates a Stable Diffusion instance.

func (*StableDiffusion) Free

func (sDiffusion *StableDiffusion) Free()

Free frees the stable diffusion context

func (*StableDiffusion) GenerateImage

func (sDiffusion *StableDiffusion) GenerateImage(imgGenParams *ImgGenParams, newImagePath string) error

GenerateImage generates an image from text, optionally guided by an initial image.

func (*StableDiffusion) GenerateVideo

func (sDiffusion *StableDiffusion) GenerateVideo(vidGenParams *VidGenParams, newVideoPath string) error

GenerateVideo generates a video.

type Upscaler

type Upscaler struct {
	// contains filtered or unexported fields
}

func NewUpscaler

func NewUpscaler(params *UpscalerParams) *Upscaler

NewUpscaler creates a new upscaler context

func (*Upscaler) Upscale

func (us *Upscaler) Upscale(inputImagePath string, upscaleFactor uint32, outputImagePath string) error

Upscale upscales the input image by the given factor and writes the result.
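
A usage sketch (the ESRGAN model path is a placeholder):

package main

import (
	"fmt"
	stablediffusion "github.com/orangelang/stable-diffusion-go"
)

func main() {
	// Upscale output.png 4x using an ESRGAN model.
	us := stablediffusion.NewUpscaler(&stablediffusion.UpscalerParams{
		EsrganPath: "models/esrgan_x4.pth", // placeholder path
		NThreads:   4,
	})
	if err := us.Upscale("output.png", 4, "output_4x.png"); err != nil {
		fmt.Println("upscale failed:", err)
		return
	}
	fmt.Println("upscaled image written")
}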

type UpscalerParams

type UpscalerParams struct {
	EsrganPath         string // ESRGAN model path
	OffloadParamsToCPU bool   // Whether to offload parameters to CPU
	Direct             bool   // Whether to use direct mode
	NThreads           int    // Number of threads to use
	TileSize           int    // Tile size
}

type VidGenParams

type VidGenParams struct {
	Loras             *Lora    // LoRA parameters
	LoraCount         uint32   // Number of LoRAs
	Prompt            string   // Prompt to render
	NegativePrompt    string   // Negative prompt
	ClipSkip          int32    // Skip last layers of CLIP network (1 = no skip, 2 = skip one layer, <=0 = not specified)
	InitImagePath     string   // Initial image path for starting generation
	EndImagePath      string   // End image path for ending generation (required for flf2v)
	ControlFramesPath []string // Array of control frame image paths for video
	ControlFramesSize int32    // Control frame size
	Width             int32    // Video width (pixels)
	Height            int32    // Video height (pixels)

	CfgScale          float32   // Unconditional guidance scale.
	ImageCfgScale     float32   // Image guidance scale for inpaint or instruct-pix2pix models (default: same as `CfgScale`).
	DistilledGuidance float32   // Distilled guidance scale for models with guidance input.
	SkipLayers        []int32   // Layers to skip for SLG steps (SLG will be enabled at step int([STEPS]x[START]) and disabled at int([STEPS]x[END])).
	SkipLayerStart    float32   // SLG enabling point.
	SkipLayerEnd      float32   // SLG disabling point.
	SlgScale          float32   // Skip layer guidance (SLG) scale, only for DiT models.
	Scheduler         string    // Denoiser sigma scheduler (default: discrete).
	SampleMethod      string    // Sampling method (default: euler for Flux/SD3/Wan, euler_a otherwise).
	SampleSteps       int32     // Number of sample steps.
	Eta               float32   // Eta in DDIM, only for DDIM and TCD.
	ShiftedTimestep   int32     // Shift timestep for NitroFusion models (default: 0; recommended: ~250 for NitroSD-Realism, ~500 for NitroSD-Vibrant).
	CustomSigmas      []float32 // Custom sigma values for the sampler, comma-separated (e.g. "14.61,7.8,3.5,0.0").

	HighNoiseCfgScale          float32   // High noise diffusion model equivalent of `cfg_scale`.
	HighNoiseImageCfgScale     float32   // High noise diffusion model equivalent of `image_cfg_scale`.
	HighNoiseDistilledGuidance float32   // High noise diffusion model equivalent of `guidance`.
	HighNoiseSkipLayers        []int32   // High noise diffusion model equivalent of `skip_layers`.
	HighNoiseSkipLayerStart    float32   // High noise diffusion model equivalent of `skip_layer_start`.
	HighNoiseSkipLayerEnd      float32   // High noise diffusion model equivalent of `skip_layer_end`.
	HighNoiseSlgScale          float32   // High noise diffusion model equivalent of `slg_scale`.
	HighNoiseScheduler         string    // High noise diffusion model equivalent of `scheduler`.
	HighNoiseSampleMethod      string    // High noise diffusion model equivalent of `sample_method`.
	HighNoiseSampleSteps       int32     // High noise diffusion model equivalent of `sample_steps` (default: -1 = auto).
	HighNoiseEta               float32   // High noise diffusion model equivalent of `eta`.
	HighNoiseShiftedTimestep   int32     // Shift timestep for NitroFusion models (default: 0; recommended: ~250 for NitroSD-Realism, ~500 for NitroSD-Vibrant).
	HighNoiseCustomSigmas      []float32 // Custom sigma values for the sampler, comma-separated (e.g. "14.61,7.8,3.5,0.0").

	MOEBoundary  float32          // Timestep boundary for Wan2.2 MoE models
	Strength     float32          // Noise/denoise strength (range [0.0, 1.0])
	Seed         int64            // RNG seed (< 0 for random seed)
	VideoFrames  int32            // Number of video frames to generate
	VaceStrength float32          // Wan VACE strength
	CacheParams  sd.SDCacheParams // Cache parameters for DiT models
}

VidGenParams holds the parameters for video generation.

Directories

Path Synopsis
pkg
sd
