Image Preprocessing Example
This example demonstrates the image preprocessing capabilities of the PromptKit SDK for optimizing images before sending them to vision models.
Features Demonstrated
- WithAutoResize: Simple option to automatically resize large images
- WithImagePreprocessing: Full control over preprocessing configuration
- Quality optimization: Configure JPEG quality for the best balance of size and clarity
- Streaming support: Preprocessing works with both Send() and Stream()
Why Image Preprocessing?
- Cost reduction: Smaller images typically consume fewer tokens, which lowers API costs
- Faster responses: Less data to transmit and process
- Consistent quality: Ensure images meet model requirements
- Automatic handling: No manual image manipulation needed
Usage
export GEMINI_API_KEY=your-key
go run .
Configuration Options
Simple: WithAutoResize
conv, err := sdk.Open(
	"./pack.json",
	"vision",
	sdk.WithAutoResize(1024, 1024), // Max width and height in pixels
)
Advanced: WithImagePreprocessing
conv, err := sdk.Open(
	"./pack.json",
	"vision",
	sdk.WithImagePreprocessing(&stage.ImagePreprocessConfig{
		Resize: stage.ImageResizeStageConfig{
			MaxWidth:  800,
			MaxHeight: 600,
			Quality:   90, // JPEG quality (1-100)
		},
		EnableResize: true,
	}),
)
How It Works
When you call Send() or Stream() with an image, the SDK:
- Downloads or reads the image data
- Checks if resizing is needed based on your configuration
- Resizes the image, preserving its aspect ratio, if it exceeds the configured limits
- Re-encodes as JPEG with the specified quality
- Sends the optimized image to the vision model
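The steps above can be illustrated with a minimal, standard-library-only sketch. This is not the SDK's implementation: the names fitWithin, resizeNearest, and preprocess are hypothetical, and nearest-neighbor sampling is used only to keep the example short (a real pipeline would likely use a higher-quality scaler). It assumes positive image dimensions and does not upscale images that are already within the limits.

```go
package main

import (
	"bytes"
	"fmt"
	"image"
	"image/jpeg"
	"log"
)

// fitWithin returns the largest size that fits inside maxW x maxH while
// preserving the w:h aspect ratio. Images already within the limits are
// returned unchanged. (Hypothetical helper, not part of the SDK.)
func fitWithin(w, h, maxW, maxH int) (int, int) {
	if w <= maxW && h <= maxH {
		return w, h
	}
	// Compare w/maxW against h/maxH by cross-multiplying, so the
	// arithmetic stays in integers.
	if w*maxH >= h*maxW {
		// Width is the binding constraint.
		newH := h * maxW / w
		if newH < 1 {
			newH = 1
		}
		return maxW, newH
	}
	// Height is the binding constraint.
	newW := w * maxH / h
	if newW < 1 {
		newW = 1
	}
	return newW, maxH
}

// resizeNearest scales src to newW x newH using nearest-neighbor sampling.
func resizeNearest(src image.Image, newW, newH int) *image.RGBA {
	b := src.Bounds()
	dst := image.NewRGBA(image.Rect(0, 0, newW, newH))
	for y := 0; y < newH; y++ {
		sy := b.Min.Y + y*b.Dy()/newH
		for x := 0; x < newW; x++ {
			sx := b.Min.X + x*b.Dx()/newW
			dst.Set(x, y, src.At(sx, sy))
		}
	}
	return dst
}

// preprocess resizes src only if it exceeds the limits, then re-encodes
// it as JPEG with the given quality (1-100).
func preprocess(src image.Image, maxW, maxH, quality int) ([]byte, error) {
	b := src.Bounds()
	newW, newH := fitWithin(b.Dx(), b.Dy(), maxW, maxH)
	img := src
	if newW != b.Dx() || newH != b.Dy() {
		img = resizeNearest(src, newW, newH)
	}
	var buf bytes.Buffer
	if err := jpeg.Encode(&buf, img, &jpeg.Options{Quality: quality}); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	// A blank 2000x1000 image stands in for a downloaded or local file.
	src := image.NewRGBA(image.Rect(0, 0, 2000, 1000))
	out, err := preprocess(src, 1024, 1024, 85)
	if err != nil {
		log.Fatal(err)
	}
	cfg, err := jpeg.DecodeConfig(bytes.NewReader(out))
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("resized to %dx%d, %d bytes\n", cfg.Width, cfg.Height, len(out))
}
```

With a 1024x1024 limit, a 2000x1000 input comes out at 1024x512: the longer side is clamped to the limit and the shorter side is scaled by the same factor.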
Best Practices
- 1024x1024 is a good default for most vision models
- Quality 85 provides a good balance of size and clarity
- For detailed analysis, use higher max dimensions (2048x2048)
- For quick classification tasks, smaller sizes (512x512) work well
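One way to keep these recommendations in a single place is a small preset table. The sketch below is purely illustrative: the resizePreset type, the presetFor function, and the task names are hypothetical and not part of the SDK. The returned values could be passed on to WithAutoResize or to the fields of ImageResizeStageConfig.

```go
package main

import "fmt"

// resizePreset bundles the limits discussed in the best practices above.
// (Hypothetical type for illustration; not an SDK type.)
type resizePreset struct {
	MaxWidth  int
	MaxHeight int
	Quality   int // JPEG quality (1-100)
}

// presetFor maps a task name to a preset. The task names are assumptions
// made for this example; unknown tasks fall back to the 1024x1024 default.
func presetFor(task string) resizePreset {
	switch task {
	case "classification":
		// Quick classification tasks work well with smaller images.
		return resizePreset{MaxWidth: 512, MaxHeight: 512, Quality: 85}
	case "detailed":
		// Detailed analysis benefits from higher max dimensions.
		return resizePreset{MaxWidth: 2048, MaxHeight: 2048, Quality: 85}
	default:
		return resizePreset{MaxWidth: 1024, MaxHeight: 1024, Quality: 85}
	}
}

func main() {
	for _, task := range []string{"classification", "detailed", "general"} {
		p := presetFor(task)
		fmt.Printf("%s: %dx%d at quality %d\n", task, p.MaxWidth, p.MaxHeight, p.Quality)
	}
}
```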