wsproxy

Version: v0.0.0-...-1f429af

This package is not in the latest version of its module.
Published: May 8, 2026 | License: Apache-2.0 | Imports: 20 | Imported by: 0

README

ws-proxy

An embedded WebSocket proxy for the OpenAI Realtime API, implemented entirely inside an Envoy dynamic module .so.

What it does

  • Intercepts WebSocket upgrade requests at /v1/responses
  • Proxies frames bidirectionally between the downstream client and wss://api.openai.com/v1/responses
  • Injects the Authorization: Bearer $OPENAI_API_KEY header on the upstream connection
  • Taps frames in both directions via SessionTap:
    • Client → upstream: extracts the model name from response.create messages
    • Upstream → client: extracts token usage from response.completed messages

Normal HTTP traffic (non-WebSocket) passes through to the upstream cluster unchanged.

Architecture

Client ──WS──► Envoy (port 10000, upgrade route)
                  │ ──► ws-proxy-local cluster (127.0.0.1:<random>)
                  │             │
                  │     WSProxy.ServeHTTP
                  │             │ ──► wss://api.openai.com/v1/responses
                  │             │     Authorization: Bearer $OPENAI_API_KEY
                  │
Normal HTTP ──► upstream cluster

Configuration

Set OPENAI_API_KEY in the environment before starting Envoy.

The .so binds the proxy on a loopback port at startup. By default it uses a random free port and logs it:

ws-proxy: listening on 127.0.0.1:XXXXX
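The random-port behaviour can be reproduced with the standard library alone. The following is a minimal sketch, not the module's actual startup code: listenLoopback is a hypothetical helper, and treating an empty address as "use the default" is an assumption of this example.

```go
package main

import (
	"fmt"
	"net"
)

// listenLoopback binds addr on TCP. Hypothetical helper: an empty addr is
// treated as "127.0.0.1:0", which is an assumption of this sketch.
func listenLoopback(addr string) (net.Listener, error) {
	if addr == "" {
		addr = "127.0.0.1:0" // port 0 asks the kernel for any free port
	}
	return net.Listen("tcp", addr)
}

func main() {
	ln, err := listenLoopback("")
	if err != nil {
		panic(err)
	}
	defer ln.Close()
	// ln.Addr() reports the concrete port the kernel picked.
	fmt.Printf("ws-proxy: listening on %s\n", ln.Addr())
}
```

Because the port is only known after the bind, a static Envoy cluster cannot point at it ahead of time, which is why a fixed listen_addr is the more practical choice for repeatable configs.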

For repeatable local configs or e2e tests, set listen_addr in the filter config and point Envoy's ws-proxy-local STATIC cluster at the same address:

filter_config:
  "@type": type.googleapis.com/google.protobuf.StringValue
  value: '{"listen_addr":"127.0.0.1:10001","upstream_url":"ws://127.0.0.1:18080","auth_value":"","otel_endpoint":"127.0.0.1:4317"}'

Build

make
# or:
CGO_ENABLED=1 go build -trimpath -buildmode=c-shared -o libws-proxy.so ./cmd

Run

# set your OpenAI API key
export OPENAI_API_KEY=sk-...

# start Envoy
ENVOY_DYNAMIC_MODULES_SEARCH_PATH=$(pwd) envoy -c envoy.yaml
# stderr: ws-proxy: listening on 127.0.0.1:XXXXX
# (update ws-proxy-local cluster port in envoy.yaml if needed)

# normal HTTP — passes through to upstream
curl http://localhost:10000/v1/models -H "authorization: Bearer $OPENAI_API_KEY"

# WebSocket upgrade — proxied through the embedded WS server to OpenAI
# (use a WebSocket client that supports the Realtime API protocol)

What this demonstrates

  • RegisterRaw for filters that need full HTTP upgrade control
  • jisr/server.Group for binding the proxy server to Envoy's filter lifecycle
  • Bidirectional WebSocket proxying with frame-level tapping via callbacks
  • Why jisr's HandlerFunc model cannot intercept WebSocket frames (after the 101 Switching Protocols handshake the connection carries an opaque byte stream rather than HTTP messages), and how RegisterRaw plus an embedded server is the escape hatch
  • Environment variable expansion for secret injection at runtime
  • Embedded actor observability: structured session logs from WSProxy.ServeHTTP, actor-side metrics through github.com/dio/logging, e2e coverage with a local mock upstream, and OpenTelemetry guidance in OBSERVABILITY.md

Documentation

Overview

Package wsproxy demonstrates the embedded WebSocket proxy pattern using jisr/server, with a complete implementation for the OpenAI Realtime API at /v1/responses.

Protocol: OpenAI Realtime WebSocket (/v1/responses)

The client sends one message to start a response:

{"type":"response.create","model":"gpt-4o-mini","input":[...],"max_output_tokens":N}

The server streams events back:

{"type":"response.created", ...}
{"type":"response.output_text.delta", "delta":"Hi"}   // text chunks
{"type":"response.output_text.done",  ...}
{"type":"response.completed", "response":{"usage":{"input_tokens":N,"output_tokens":M}}}

The WSProxy taps every frame:

  • Client → Upstream: extracts model name from response.create
  • Upstream → Client: extracts token usage from response.completed
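The tapping described above can be sketched with encoding/json. This is an illustration under stated assumptions, not the package's SessionTap: the tap type, its fields, and its methods are hypothetical, and the frame struct mirrors only the fields that appear in the example frames.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// frame mirrors only the fields shown in the example frames.
type frame struct {
	Type     string `json:"type"`
	Model    string `json:"model"`
	Response struct {
		Usage struct {
			InputTokens  uint32 `json:"input_tokens"`
			OutputTokens uint32 `json:"output_tokens"`
		} `json:"usage"`
	} `json:"response"`
}

// tap is a hypothetical stand-in for SessionTap. It is not safe for
// concurrent use; a real tap fed from two goroutines would need locking.
type tap struct {
	model   string
	in, out uint32
}

// feedClient records the model from the first response.create frame.
func (t *tap) feedClient(data []byte) {
	if t.model != "" {
		return // model already known
	}
	var f frame
	if json.Unmarshal(data, &f) != nil || f.Type != "response.create" {
		return
	}
	t.model = f.Model
}

// feedUpstream records token usage from a response.completed frame.
func (t *tap) feedUpstream(data []byte) {
	var f frame
	if json.Unmarshal(data, &f) != nil || f.Type != "response.completed" {
		return
	}
	t.in, t.out = f.Response.Usage.InputTokens, f.Response.Usage.OutputTokens
}

func main() {
	st := &tap{}
	st.feedClient([]byte(`{"type":"response.create","model":"gpt-4o-mini","input":[],"max_output_tokens":20}`))
	st.feedUpstream([]byte(`{"type":"response.output_text.delta","delta":"Hi"}`))
	st.feedUpstream([]byte(`{"type":"response.completed","response":{"usage":{"input_tokens":12,"output_tokens":3}}}`))
	fmt.Println(st.model, st.in, st.out)
}
```

Frames that fail to parse or carry a different type are skipped silently, so the tap never interferes with forwarding.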

Architecture

Client ──WS──► Envoy (port 10000, upgrade route)
                  │  ──► ws-proxy-local cluster (127.0.0.1:<random>)
                  │              │
                  │      WSProxy.ServeHTTP
                  │              │  ──► wss://api.openai.com/v1/responses
                  │              │      Authorization: Bearer $OPENAI_API_KEY
                  │
Normal HTTP ──► upstream cluster (TLS + ws-auth upstream filter)

Constants

const AuthExtensionName = "ws-auth"
const ExtensionName = "ws-proxy"

Variables

This section is empty.

Functions

func ResolveEnv

func ResolveEnv(v string) string

ResolveEnv expands ${ENV_VAR} references in v using os.Expand. It returns v unchanged if the referenced variable is unset.
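One way to get "unchanged if unset" out of os.Expand is to have the mapping function re-emit the reference when the lookup fails. The sketch below shows that approach; expandWith and resolveEnv are hypothetical names, WSPROXY_DEMO_KEY is an invented variable, and the real ResolveEnv may differ in detail.

```go
package main

import (
	"fmt"
	"os"
)

// expandWith expands ${NAME} references in v using lookup. When lookup
// reports the name as missing, the reference is written back as ${NAME}.
// Hypothetical helper, shown only to illustrate the os.Expand pattern.
func expandWith(v string, lookup func(string) (string, bool)) string {
	return os.Expand(v, func(name string) string {
		if val, ok := lookup(name); ok {
			return val
		}
		return "${" + name + "}"
	})
}

// resolveEnv applies expandWith to the process environment.
func resolveEnv(v string) string {
	return expandWith(v, os.LookupEnv)
}

func main() {
	// WSPROXY_DEMO_KEY is an invented variable name for this demo.
	os.Setenv("WSPROXY_DEMO_KEY", "sk-demo")
	fmt.Println(resolveEnv("Bearer ${WSPROXY_DEMO_KEY}"))
}
```

Note that os.Expand also recognises the brace-less $NAME form. In this sketch an unset $NAME comes back as ${NAME}, so only the braced form round-trips exactly.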

Types

type AuthConfig

type AuthConfig struct {
	// StripHeaders are request headers removed before forwarding upstream
	// (e.g. "authorization" — strip the client's key, inject the provider's).
	StripHeaders []string

	// InjectHeaders are headers added to the upstream request.
	// Values support ${ENV_VAR} expansion.
	InjectHeaders map[string]string
}

AuthConfig holds credentials for a single upstream cluster. Loaded once at filter config time from environment variables.
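The strip-then-inject order can be illustrated on a plain http.Header. This is a sketch under assumptions: apply and demoRequestHeaders are hypothetical helpers, the key values are placeholders, and the inject values are taken to be already expanded.

```go
package main

import (
	"fmt"
	"net/http"
)

// AuthConfig repeats the two exported fields documented above.
type AuthConfig struct {
	StripHeaders  []string
	InjectHeaders map[string]string
}

// apply removes the stripped headers first, then sets the injected ones,
// so a header listed in both ends up with the injected value.
// Hypothetical helper; values are assumed to be already expanded.
func apply(cfg AuthConfig, h http.Header) {
	for _, name := range cfg.StripHeaders {
		h.Del(name) // Del canonicalises the name, so "authorization" works
	}
	for name, value := range cfg.InjectHeaders {
		h.Set(name, value)
	}
}

// demoRequestHeaders builds headers as a client might send them.
func demoRequestHeaders() http.Header {
	h := http.Header{}
	h.Set("Authorization", "Bearer client-key")
	h.Set("X-Request-Id", "abc123")
	return h
}

func main() {
	h := demoRequestHeaders()
	apply(AuthConfig{
		StripHeaders:  []string{"authorization"},
		InjectHeaders: map[string]string{"authorization": "Bearer provider-key"},
	}, h)
	fmt.Println(h.Get("Authorization"), h.Get("X-Request-Id"))
}
```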

func DefaultAuthConfig

func DefaultAuthConfig() AuthConfig

DefaultAuthConfig returns a default config reading from well-known env vars.

type Config

type Config struct {
	// UpstreamURL is the upstream wss:// base URL.
	// Default: "wss://api.openai.com"
	UpstreamURL string `json:"upstream_url"`

	// AuthHeader is the HTTP header name to inject when dialing upstream.
	// Default: "authorization"
	AuthHeader string `json:"auth_header"`

	// AuthValue is the header value. Supports ${ENV_VAR} expansion.
	// Default: "Bearer ${OPENAI_API_KEY}"
	AuthValue string `json:"auth_value"`

	// ShutdownTimeout for graceful shutdown. Default: "5s".
	ShutdownTimeout string `json:"shutdown_timeout"`

	// ListenAddress is the loopback address for the embedded proxy server.
	// Default: "127.0.0.1:0" (random free port).
	ListenAddress string `json:"listen_addr"`

	// OTELEndpoint enables actor-side OTLP/gRPC metrics export when set.
	// Example: "127.0.0.1:4317".
	OTELEndpoint string `json:"otel_endpoint"`

	// OTELExportInterval controls actor-side metric export cadence.
	// Default: "1s".
	OTELExportInterval string `json:"otel_export_interval"`
}

Config is the JSON config for the ws-proxy filter.
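A possible way to decode this JSON while honouring the documented defaults is to pre-populate the struct and unmarshal on top of it. The sketch below assumes that approach; parseConfig is a hypothetical helper, and whether the real filter treats an explicit empty string (such as "auth_value":"") as "disabled" or "use default" is not stated in the source.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Config repeats the documented fields and JSON tags.
type Config struct {
	UpstreamURL        string `json:"upstream_url"`
	AuthHeader         string `json:"auth_header"`
	AuthValue          string `json:"auth_value"`
	ShutdownTimeout    string `json:"shutdown_timeout"`
	ListenAddress      string `json:"listen_addr"`
	OTELEndpoint       string `json:"otel_endpoint"`
	OTELExportInterval string `json:"otel_export_interval"`
}

// parseConfig starts from the documented defaults and overlays raw on top.
// Keys absent from raw keep their default; keys present (even as "")
// overwrite it. That second behaviour is an assumption of this sketch.
func parseConfig(raw string) (Config, time.Duration, error) {
	cfg := Config{
		UpstreamURL:        "wss://api.openai.com",
		AuthHeader:         "authorization",
		AuthValue:          "Bearer ${OPENAI_API_KEY}",
		ShutdownTimeout:    "5s",
		ListenAddress:      "127.0.0.1:0",
		OTELExportInterval: "1s",
	}
	if raw != "" {
		if err := json.Unmarshal([]byte(raw), &cfg); err != nil {
			return Config{}, 0, fmt.Errorf("ws-proxy config: %w", err)
		}
	}
	timeout, err := time.ParseDuration(cfg.ShutdownTimeout)
	if err != nil {
		return Config{}, 0, fmt.Errorf("shutdown_timeout: %w", err)
	}
	return cfg, timeout, nil
}

func main() {
	cfg, timeout, err := parseConfig(`{"listen_addr":"127.0.0.1:10001"}`)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.ListenAddress, cfg.AuthHeader, timeout)
}
```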

type SessionTap

type SessionTap struct {
	// contains filtered or unexported fields
}

SessionTap extracts the model name and token usage from OpenAI Realtime frames. It is created fresh for each WebSocket session. Exported for unit testing.

func NewSessionTap

func NewSessionTap() *SessionTap

NewSessionTap creates a new SessionTap.

func (*SessionTap) FeedClient

func (t *SessionTap) FeedClient(data []byte)

FeedClient taps the response.create frame (the first client message) to extract the model name. Once the model is known, all subsequent client frames are ignored.

OpenAI Realtime client frame format:

{"type":"response.create","model":"gpt-4o-mini","input":[...],"max_output_tokens":20}

func (*SessionTap) FeedUpstream

func (t *SessionTap) FeedUpstream(data []byte)

FeedUpstream taps the response.completed frame (final server event) to extract token usage. All other upstream frames are ignored.

OpenAI Realtime server frame format:

{"type":"response.completed","response":{"usage":{"input_tokens":N,"output_tokens":M}}}

func (*SessionTap) Model

func (t *SessionTap) Model() string

Model returns the model name extracted from response.create, or "" if not yet seen.

func (*SessionTap) Usage

func (t *SessionTap) Usage() TokenUsage

Usage returns the token usage extracted from response.completed.

type TokenUsage

type TokenUsage struct {
	InputTokens  uint32
	OutputTokens uint32
}

TokenUsage holds the token counts from the completed response.

type WSProxy

type WSProxy struct {

	// OnClientFrame is called for each text frame the client sends.
	// Runs in the pump goroutine — must be fast and non-blocking.
	OnClientFrame func(websocket.MessageType, []byte)

	// OnUpstreamFrame is called for each text frame received from upstream.
	// Runs in the pump goroutine — must be fast and non-blocking.
	OnUpstreamFrame func(websocket.MessageType, []byte)
	// contains filtered or unexported fields
}

WSProxy is an http.Handler that proxies WebSocket connections to an upstream provider, tapping frames for model extraction and token counting.

func NewProxy

func NewProxy(upstreamURL, authHeader, authValue string) *WSProxy

NewProxy creates a WSProxy for direct use in tests or without the Envoy factory.

func (*WSProxy) ServeHTTP

func (p *WSProxy) ServeHTTP(w http.ResponseWriter, r *http.Request)

ServeHTTP accepts a WebSocket upgrade from Envoy, dials the upstream, and runs the bidirectional frame pump with per-session tapping.
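The shape of a bidirectional pump with per-direction taps can be shown without any WebSocket library. In the sketch below, Go channels stand in for the two connections, which is purely an assumption of this example; pump and proxy are hypothetical names and do not reflect the internals of ServeHTTP.

```go
package main

import (
	"fmt"
	"sync"
)

// pump copies frames from src to dst, calling tap on each frame before
// forwarding it. It closes dst once src is closed and drained.
func pump(src <-chan []byte, dst chan<- []byte, tap func([]byte)) {
	defer close(dst)
	for frame := range src {
		if tap != nil {
			tap(frame) // runs in the pump goroutine, so it must be fast
		}
		dst <- frame
	}
}

// proxy runs one pump per direction and returns when both have finished.
func proxy(clientIn <-chan []byte, upstreamOut chan<- []byte,
	upstreamIn <-chan []byte, clientOut chan<- []byte,
	onClient, onUpstream func([]byte)) {
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { defer wg.Done(); pump(clientIn, upstreamOut, onClient) }()
	go func() { defer wg.Done(); pump(upstreamIn, clientOut, onUpstream) }()
	wg.Wait()
}

func main() {
	clientIn, upstreamOut := make(chan []byte, 1), make(chan []byte, 1)
	upstreamIn, clientOut := make(chan []byte, 2), make(chan []byte, 2)

	clientIn <- []byte(`{"type":"response.create"}`)
	close(clientIn)
	upstreamIn <- []byte(`{"type":"response.created"}`)
	upstreamIn <- []byte(`{"type":"response.completed"}`)
	close(upstreamIn)

	var fromClient, fromUpstream int
	proxy(clientIn, upstreamOut, upstreamIn, clientOut,
		func([]byte) { fromClient++ },
		func([]byte) { fromUpstream++ })
	fmt.Println(fromClient, fromUpstream)
}
```

Each counter is written by exactly one goroutine and read only after wg.Wait returns, so no extra locking is needed here. A real proxy would also tear down the opposite direction as soon as either side closes or errors, which this sketch leaves out.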

Directories

Path Synopsis
Package main builds the ws-proxy example as a standalone Envoy dynamic module.
