realtime-video

command
v1.2.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Feb 15, 2026 License: Apache-2.0 Imports: 13 Imported by: 0

README

Realtime Video Streaming Example

This example demonstrates how to stream video frames to an LLM for real-time vision analysis using PromptKit's duplex session capabilities.

Features Demonstrated

  • OpenDuplex with video: Opening a bidirectional streaming session for video
  • WithStreamingVideo: Configuring frame rate limiting and preprocessing
  • SendFrame(): Sending individual image frames to the session
  • Frame rate limiting: Automatic dropping of excess frames to match LLM processing speed

Use Cases

  • Webcam analysis: Real-time description of what a webcam sees
  • Screen sharing: Analyzing screen content as it changes
  • Security monitoring: Continuous analysis of camera feeds
  • Accessibility: Describing visual content for users with visual impairments

Usage

export GEMINI_API_KEY=your-key
go run .

How It Works

1. Open a Duplex Session with Video Config
conv, err := sdk.OpenDuplex(
    "./pack.json",
    "vision-stream",
    sdk.WithStreamingVideo(&sdk.VideoStreamConfig{
        TargetFPS:    1.0,   // Process 1 frame per second
        MaxWidth:     1024,  // Resize large frames
        MaxHeight:    1024,
        Quality:      85,    // JPEG quality
        EnableResize: true,
    }),
)
2. Send Frames
frame := &session.ImageFrame{
    Data:      jpegBytes,
    MIMEType:  "image/jpeg",
    Width:     640,
    Height:    480,
    FrameNum:  frameCount,
    Timestamp: time.Now(),
}
err = conv.SendFrame(ctx, frame)
3. Receive Responses
for chunk := range conv.Response() {
    if chunk.Content != "" {
        fmt.Print(chunk.Content)
    }
}

Frame Rate Limiting

The TargetFPS setting automatically drops excess frames:

  • Webcam at 30 FPS → With TargetFPS: 1.0, only ~1 frame/second reaches the LLM
  • No frame loss: The most recent frame is kept, older frames are dropped
  • Reduces costs: Fewer frames = fewer tokens = lower API costs

Video Chunk Streaming

For encoded video segments (H.264, VP8, etc.), use SendVideoChunk():

chunk := &session.VideoChunk{
    Data:       h264Data,
    MIMEType:   "video/h264",
    ChunkIndex: 0,
    IsKeyFrame: true,
    Timestamp:  time.Now(),
}
err = conv.SendVideoChunk(ctx, chunk)

Real-World Integration

In a real application, replace the simulated frames with actual capture:

Using gocv (OpenCV for Go)
import "gocv.io/x/gocv"

webcam, _ := gocv.VideoCaptureDevice(0)
defer webcam.Close()

mat := gocv.NewMat()
defer mat.Close()

for webcam.Read(&mat) {
    // Encode to JPEG
    buf, _ := gocv.IMEncode(".jpg", mat)

    frame := &session.ImageFrame{
        Data:      buf.GetBytes(),
        MIMEType:  "image/jpeg",
        Timestamp: time.Now(),
    }
    conv.SendFrame(ctx, frame)
}

Provider Support

Currently, realtime video streaming works best with providers that support bidirectional streaming:

  • Gemini Live API: Full support for real-time vision (when available)
  • OpenAI Realtime: Audio-focused, video support may be added later

The SDK is provider-agnostic - video frames flow through when the provider supports them.

Documentation

Overview

Package main demonstrates realtime video/image streaming with the PromptKit SDK.

This example shows:

  • Opening a duplex session with video streaming enabled
  • Sending image frames using SendFrame()
  • Frame rate limiting with WithStreamingVideo
  • Receiving streaming responses from vision models

This simulates a webcam or screen capture scenario where frames are continuously sent to an LLM for real-time analysis.

Run with:

export GEMINI_API_KEY=your-key
go run .

Note: This example simulates frame capture. In a real application, you would capture frames from a webcam, screen, or video file.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL