xiaozhi

Version: v1.4.0
Published: Jun 8, 2026 License: MIT Imports: 16 Imported by: 0

Documentation

Overview

Package xiaozhi implements the xiaozhi-esp32 WebSocket voice protocol.

Device / Browser ── xiaozhi WS ──► Server ── dialog WS ──► Dialog (LLM)

Constants

const (
	MsgHello         = "hello"
	MsgListen        = "listen"
	MsgAbort         = "abort"
	MsgPing          = "ping"
	RespHello        = "hello"
	RespPong         = "pong"
	RespSTT          = "stt"
	RespTTS          = "tts"
	RespError        = "error"
	RespAbortConfirm = "abort"
)
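
The constant names suggest which response type answers which incoming message type. The sketch below spells out that reading. The pairing is an assumption inferred from the names alone, not documented behaviour, and `immediateReply` is a hypothetical helper that is not part of the package.

```go
package main

import "fmt"

// Local copies of some of the message-type constants listed above.
const (
	MsgHello         = "hello"
	MsgListen        = "listen"
	MsgAbort         = "abort"
	MsgPing          = "ping"
	RespHello        = "hello"
	RespPong         = "pong"
	RespAbortConfirm = "abort"
)

// immediateReply returns the response type that would directly answer msgType.
// Assumption: hello is answered by hello, ping by pong, abort by an abort
// confirmation. The package itself may behave differently.
func immediateReply(msgType string) (string, bool) {
	switch msgType {
	case MsgHello:
		return RespHello, true
	case MsgPing:
		return RespPong, true
	case MsgAbort:
		return RespAbortConfirm, true
	default:
		// MsgListen and unknown types have no single immediate reply in this sketch.
		return "", false
	}
}

func main() {
	for _, t := range []string{MsgHello, MsgPing, MsgAbort, MsgListen} {
		reply, ok := immediateReply(t)
		fmt.Printf("%s -> %q (%v)\n", t, reply, ok)
	}
}
```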
const (
	ListenStart  = "start"
	ListenStop   = "stop"
	ListenDetect = "detect"
)
const (
	AudioFormatOpus = "opus"
	AudioFormatPCM  = "pcm"
)
const (
	// ModePipeline: ASR + dialog-plane WebSocket (LLM) + TTS.
	ModePipeline = "pipeline"
	// ModeRealtime: pkg/realtime multimodal agent (e.g. Qwen-Omni).
	ModeRealtime = "realtime"
)

Variables

This section is empty.

Functions

func MakeAbortConfirm

func MakeAbortConfirm(sessionID string) []byte

func MakeError

func MakeError(message string, fatal bool) []byte

func MakeLLMReply

func MakeLLMReply(text string) []byte

func MakePongReply

func MakePongReply(sessionID string) []byte

func MakeSTTReply

func MakeSTTReply(sessionID, text string) []byte

func MakeTTSStateReply

func MakeTTSStateReply(sessionID, state, codec string) []byte

func MakeTTSStateReplyFrames

func MakeTTSStateReplyFrames(sessionID, state, codec string, frameMs int) []byte

func MakeWelcomeReply

func MakeWelcomeReply(sessionID string, ap AudioParams) []byte

func MergeHelloAudio

func MergeHelloAudio(h *AudioParams)

func ParseTextFrame

func ParseTextFrame(raw []byte) (string, error)
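
ParseTextFrame has no doc comment. Its signature suggests that it returns the message type of a JSON text frame. The following is a minimal sketch under that assumption. `frameType` is a hypothetical stand-in, and its error handling is illustrative only.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// frameType is a hypothetical stand-in for ParseTextFrame. It assumes the
// returned string is the value of the frame's JSON "type" field.
func frameType(raw []byte) (string, error) {
	var env struct {
		Type string `json:"type"`
	}
	if err := json.Unmarshal(raw, &env); err != nil {
		return "", fmt.Errorf("decode text frame: %w", err)
	}
	if env.Type == "" {
		return "", errors.New("text frame has no type field")
	}
	return env.Type, nil
}

func main() {
	t, err := frameType([]byte(`{"type":"listen","state":"start"}`))
	fmt.Println(t, err) // prints: listen <nil>
}
```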

Types

type AudioParams

type AudioParams struct {
	Format        string `json:"format,omitempty"`
	Codec         string `json:"codec,omitempty"`
	SampleRate    int    `json:"sample_rate,omitempty"`
	Channels      int    `json:"channels,omitempty"`
	FrameDuration int    `json:"frame_duration,omitempty"`
	BitDepth      int    `json:"bit_depth,omitempty"`
}
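
Every field carries `omitempty`, so zero-valued fields are left out of the encoded JSON. The sketch below copies the struct locally to show the effect. The values (Opus, 16 kHz, mono, 60 ms frames) are illustrative and are not claimed to be what DefaultHelloAudio returns. `encodeAudioParams` is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AudioParams is copied from the declaration above so the example compiles on its own.
type AudioParams struct {
	Format        string `json:"format,omitempty"`
	Codec         string `json:"codec,omitempty"`
	SampleRate    int    `json:"sample_rate,omitempty"`
	Channels      int    `json:"channels,omitempty"`
	FrameDuration int    `json:"frame_duration,omitempty"`
	BitDepth      int    `json:"bit_depth,omitempty"`
}

// encodeAudioParams marshals ap to a JSON string (hypothetical helper).
func encodeAudioParams(ap AudioParams) (string, error) {
	b, err := json.Marshal(ap)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Codec and BitDepth are left at their zero values and therefore omitted.
	s, err := encodeAudioParams(AudioParams{
		Format:        "opus",
		SampleRate:    16000,
		Channels:      1,
		FrameDuration: 60,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
	// prints: {"format":"opus","sample_rate":16000,"channels":1,"frame_duration":60}
}
```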

func DefaultHelloAudio

func DefaultHelloAudio() AudioParams

type HelloMessage

type HelloMessage struct {
	Type        string                 `json:"type"`
	Version     int                    `json:"version,omitempty"`
	Transport   string                 `json:"transport,omitempty"`
	Features    map[string]interface{} `json:"features,omitempty"`
	AudioParams *AudioParams           `json:"audio_params,omitempty"`
	Mode        string                 `json:"mode,omitempty"`
}
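
As a rough illustration of the wire format this struct describes, the sketch below decodes a client hello into local copies of the types. The sample payload and the `decodeHello` helper are assumptions made for the example. Because AudioParams is a pointer, a hello that omits `audio_params` decodes to nil rather than to a zero struct.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the declarations above so the example compiles on its own.
type AudioParams struct {
	Format        string `json:"format,omitempty"`
	Codec         string `json:"codec,omitempty"`
	SampleRate    int    `json:"sample_rate,omitempty"`
	Channels      int    `json:"channels,omitempty"`
	FrameDuration int    `json:"frame_duration,omitempty"`
	BitDepth      int    `json:"bit_depth,omitempty"`
}

type HelloMessage struct {
	Type        string                 `json:"type"`
	Version     int                    `json:"version,omitempty"`
	Transport   string                 `json:"transport,omitempty"`
	Features    map[string]interface{} `json:"features,omitempty"`
	AudioParams *AudioParams           `json:"audio_params,omitempty"`
	Mode        string                 `json:"mode,omitempty"`
}

// decodeHello is a hypothetical helper: it decodes raw and rejects frames
// whose type is not "hello".
func decodeHello(raw []byte) (HelloMessage, error) {
	var h HelloMessage
	if err := json.Unmarshal(raw, &h); err != nil {
		return HelloMessage{}, fmt.Errorf("decode hello: %w", err)
	}
	if h.Type != "hello" {
		return HelloMessage{}, fmt.Errorf("unexpected message type %q", h.Type)
	}
	return h, nil
}

func main() {
	// Illustrative payload; field values are assumptions for the example.
	raw := []byte(`{"type":"hello","version":1,"transport":"websocket",
		"audio_params":{"format":"opus","sample_rate":16000,"channels":1,"frame_duration":60}}`)
	h, err := decodeHello(raw)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(h.Transport, h.AudioParams.Format, h.AudioParams.SampleRate)
}
```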

type ListenMessage

type ListenMessage struct {
	Type  string `json:"type"`
	State string `json:"state"`
	Mode  string `json:"mode,omitempty"`
}
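
The State field presumably takes one of the ListenStart, ListenStop, or ListenDetect values declared under Constants. The following sketch decodes a listen message and validates its state under that assumption. `decodeListen` is a hypothetical helper, and the `"auto"` mode value is illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the listen-state constants declared under Constants.
const (
	ListenStart  = "start"
	ListenStop   = "stop"
	ListenDetect = "detect"
)

// ListenMessage is copied from the declaration above.
type ListenMessage struct {
	Type  string `json:"type"`
	State string `json:"state"`
	Mode  string `json:"mode,omitempty"`
}

// decodeListen decodes raw and checks that the state is one of the three
// known values. Rejecting other states is an assumption of this sketch.
func decodeListen(raw []byte) (ListenMessage, error) {
	var m ListenMessage
	if err := json.Unmarshal(raw, &m); err != nil {
		return ListenMessage{}, fmt.Errorf("decode listen message: %w", err)
	}
	if m.Type != "listen" {
		return ListenMessage{}, fmt.Errorf("unexpected message type %q", m.Type)
	}
	switch m.State {
	case ListenStart, ListenStop, ListenDetect:
		return m, nil
	default:
		return ListenMessage{}, fmt.Errorf("unknown listen state %q", m.State)
	}
}

func main() {
	m, err := decodeListen([]byte(`{"type":"listen","state":"start","mode":"auto"}`))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.State, m.Mode)
}
```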

type MessageHandler added in v1.3.0

type MessageHandler func(ctx context.Context, session *wsSession, raw []byte) error

MessageHandler is a custom message handler for extensibility.

type RealtimeAgentFactory

type RealtimeAgentFactory interface {
	NewAgent(ctx context.Context, callID string, onEvent func(realtime.Event)) (realtime.Agent, int, int, error)
}

RealtimeAgentFactory builds a started realtime.Agent for one xiaozhi session.

type Server

type Server struct {
	// contains filtered or unexported fields
}

Server accepts xiaozhi WebSocket connections.

func NewServer

func NewServer(cfg ServerConfig) (*Server, error)

NewServer validates cfg and returns a Server; it returns an error if cfg is invalid.

func (*Server) Handle

func (s *Server) Handle(w http.ResponseWriter, r *http.Request)

Handle upgrades the HTTP connection to the xiaozhi protocol.
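
Handle has the same signature as an `http.HandlerFunc`, so it can be mounted on any standard mux. The sketch below shows the wiring with a stub in place of a real `*Server`, since constructing one needs a ServerConfig. The `/xiaozhi/v1/` path, `stubServer`, `newMux`, and `probe` are all hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// stubServer stands in for *Server. Only the method signature matches the
// real type; the body is a placeholder.
type stubServer struct{}

// Handle replies 426 Upgrade Required. The real method would perform the
// WebSocket upgrade here instead.
func (stubServer) Handle(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusUpgradeRequired)
}

// newMux mounts h under a hypothetical path.
func newMux(h func(http.ResponseWriter, *http.Request)) *http.ServeMux {
	mux := http.NewServeMux()
	mux.HandleFunc("/xiaozhi/v1/", h)
	return mux
}

// probe sends a GET for path through the mux and returns the status code.
func probe(path string) int {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, path, nil)
	newMux(stubServer{}.Handle).ServeHTTP(rec, req)
	return rec.Code
}

func main() {
	fmt.Println(probe("/xiaozhi/v1/")) // handled by the stub
	fmt.Println(probe("/other"))       // not routed to the stub
}
```

With a real server, `srv.Handle` from NewServer would be passed in place of `stubServer{}.Handle`.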

type ServerConfig

type ServerConfig struct {
	SessionFactory       transport.SessionFactory
	RealtimeFactory      RealtimeAgentFactory
	DialogWSURL          string
	CallIDPrefix         string
	ResolveDevicePayload func(ctx context.Context, deviceID string) ([]byte, error)
	ConfigureClient      func(*gateway.ClientConfig)
	OnSessionStart       func(ctx context.Context, callID, deviceID string)
	OnSessionEnd         func(ctx context.Context, callID, reason string)
	// ConfigureSession allows third-party extensions to register custom message handlers.
	ConfigureSession func(session *wsSession)
}

ServerConfig wires the xiaozhi adapter.
