Documentation ¶
Overview ¶
Package ovsession contains the native OpenVINO session/KV bridge used by the openvino modelrepo provider.
The default build is a pure-Go stub. To enable the cgo implementation, build with -tags openvino and supply the OpenVINO C++ SDK include and library flags. The GenAI session backend additionally requires the openvino_genai build tag.
Index ¶
- Constants
- Variables
- type ChatMessage
- type EmbedSession
- type GenAIConfig
- type GenAIResult
- type GenAISession
- func (s *GenAISession) ApplyChatTemplate(_ []ChatMessage, _ string) (string, error)
- func (s *GenAISession) Close() error
- func (s *GenAISession) Generate(_ context.Context, _ string, _ GenerateOptions) (GenAIResult, error)
- func (s *GenAISession) Stream(_ context.Context, _ string, _ GenerateOptions) (<-chan StreamChunk, error)
- func (s *GenAISession) Tokenize(_ context.Context, _ string, _ bool) ([]int, error)
- type GenerateOptions
- type PipelineMetrics
- type Session
- type StreamChunk
- type StructuredOutput
Constants ¶
const Available = false
Available reports whether the native OpenVINO backend was compiled in.
const GenAIAvailable = false
GenAIAvailable reports whether the OpenVINO GenAI session backend was built.
Variables ¶
ErrUnavailable is returned by the stub implementation.
Functions ¶
This section is empty.
Types ¶
type ChatMessage ¶
type ChatMessage struct {
Role string
Content string
ToolCalls string `json:",omitempty"`
ToolCallID string `json:",omitempty"`
}
ChatMessage is one role/content turn for chat-template rendering.
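The struct tags on ToolCalls and ToolCallID suggest that messages may be JSON-encoded at some point. The following is a minimal sketch of what those tags do under encoding/json. It uses a local copy of the struct (chatMessage) and a hypothetical helper (encode) rather than importing ovsession, and it does not claim to show how the package itself serializes messages.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chatMessage is a local copy of the ChatMessage declaration shown above,
// so this sketch compiles without importing ovsession.
type chatMessage struct {
	Role       string
	Content    string
	ToolCalls  string `json:",omitempty"`
	ToolCallID string `json:",omitempty"`
}

// encode is a hypothetical helper that marshals one message to JSON text.
func encode(m chatMessage) string {
	b, err := json.Marshal(m)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Empty ToolCalls and ToolCallID are dropped because of ",omitempty".
	fmt.Println(encode(chatMessage{Role: "user", Content: "hi"}))
	// A non-empty ToolCallID is kept under its Go field name.
	fmt.Println(encode(chatMessage{Role: "tool", Content: "42", ToolCallID: "call_1"}))
}
```

Because the tags carry no explicit name, the keys keep the Go field names (Role, Content, ToolCalls, ToolCallID).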
type EmbedSession ¶ added in v0.32.2
type EmbedSession struct{}
func NewEmbed ¶ added in v0.32.2
func NewEmbed(modelDir, device string) (*EmbedSession, error)
func (*EmbedSession) Close ¶ added in v0.32.2
func (s *EmbedSession) Close() error
type GenAIConfig ¶
type GenAIConfig struct {
Device string
KVCachePrecision string
CacheSize int
DynamicSplitFuse *bool
EnablePrefixCaching *bool
UseSparseAttention *bool
NumLastDenseTokensInPrefill int
XAttentionThreshold float32
XAttentionBlockSize int
XAttentionStride int
}
GenAIConfig controls construction of an OpenVINO GenAI session.
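Several GenAIConfig fields are *bool rather than bool, which allows three states: unset, explicitly true, and explicitly false. The sketch below shows one way a caller might populate and inspect such fields. It uses a trimmed local copy of the struct, and boolPtr and describe are hypothetical helpers. How the native backend treats a nil field is not documented here, so the "unset" label is only an assumption about intent, and "CPU" is an illustrative device string.

```go
package main

import "fmt"

// genAIConfig copies a subset of the GenAIConfig fields shown above.
type genAIConfig struct {
	Device              string
	DynamicSplitFuse    *bool
	EnablePrefixCaching *bool
}

// boolPtr is a hypothetical helper: Go cannot take the address of a literal,
// so a small function is a common way to build a *bool.
func boolPtr(v bool) *bool { return &v }

// describe renders the three possible states of an optional flag.
func describe(b *bool) string {
	switch {
	case b == nil:
		return "unset"
	case *b:
		return "true"
	default:
		return "false"
	}
}

func main() {
	cfg := genAIConfig{
		Device:              "CPU", // illustrative value
		EnablePrefixCaching: boolPtr(false),
	}
	fmt.Println("DynamicSplitFuse:", describe(cfg.DynamicSplitFuse))
	fmt.Println("EnablePrefixCaching:", describe(cfg.EnablePrefixCaching))
}
```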
type GenAIResult ¶
type GenAIResult struct {
Text string
ParsedJSON string
Metrics PipelineMetrics
}
GenAIResult is the generated text plus the pipeline metrics observed for the request.
type GenAISession ¶
type GenAISession struct{}
GenAISession is unavailable without the openvino and openvino_genai build tags.
func NewGenAI ¶
func NewGenAI(_ string, _ GenAIConfig) (*GenAISession, error)
NewGenAI reports that the native GenAI backend is not compiled in.
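In a stub build, callers will typically want to detect the "not compiled in" condition and fall back to another backend. The sketch below illustrates that pattern with errors.Is. It is self-contained: errUnavailable and newGenAI are local stand-ins, pickBackend and the backend names are hypothetical, and whether the real NewGenAI returns ErrUnavailable directly or wrapped is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnavailable stands in for ovsession.ErrUnavailable.
var errUnavailable = errors.New("native backend not compiled in")

type genAISession struct{}

// newGenAI mimics the stub constructor: it always fails.
// Wrapping the sentinel with %w is an assumption made for illustration.
func newGenAI(_ string) (*genAISession, error) {
	return nil, fmt.Errorf("new genai session: %w", errUnavailable)
}

// pickBackend is a hypothetical caller that falls back when the native
// backend is missing and propagates any other error unchanged.
func pickBackend(modelDir string) (string, error) {
	_, err := newGenAI(modelDir)
	switch {
	case err == nil:
		return "openvino-genai", nil
	case errors.Is(err, errUnavailable):
		return "fallback", nil
	default:
		return "", err
	}
}

func main() {
	name, err := pickBackend("/models/example") // illustrative path
	fmt.Println(name, err)
}
```

Using errors.Is instead of == keeps the check valid even if the error is wrapped somewhere along the call chain.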
func (*GenAISession) ApplyChatTemplate ¶
func (s *GenAISession) ApplyChatTemplate(_ []ChatMessage, _ string) (string, error)
ApplyChatTemplate reports that the native GenAI backend is not compiled in.
func (*GenAISession) Generate ¶
func (s *GenAISession) Generate(_ context.Context, _ string, _ GenerateOptions) (GenAIResult, error)
Generate reports that the native GenAI backend is not compiled in.
func (*GenAISession) Stream ¶
func (s *GenAISession) Stream(_ context.Context, _ string, _ GenerateOptions) (<-chan StreamChunk, error)
Stream reports that the native GenAI backend is not compiled in.
type GenerateOptions ¶
type GenerateOptions struct {
MaxNewTokens int
Temperature *float64
TopP *float64
StructuredOutput StructuredOutput
ParserProtocols []string
}
GenerateOptions controls a single GenAI generation call.
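Temperature and TopP are *float64, which lets a caller distinguish an explicit 0 from "not specified". The sketch below shows that distinction with a trimmed local copy of the struct. The ptr and orDefault helpers and the default values 0.7 and 0.95 are hypothetical; the defaults the native backend actually applies are not stated in this documentation.

```go
package main

import "fmt"

// generateOptions copies a subset of the GenerateOptions fields shown above.
type generateOptions struct {
	MaxNewTokens int
	Temperature  *float64
	TopP         *float64
}

// ptr is a hypothetical generic helper that returns a pointer to a copy of v.
func ptr[T any](v T) *T { return &v }

// orDefault returns *p when the caller set a value, and def otherwise.
func orDefault(p *float64, def float64) float64 {
	if p == nil {
		return def
	}
	return *p
}

func main() {
	// Temperature is explicitly 0; TopP is left unset.
	opts := generateOptions{MaxNewTokens: 64, Temperature: ptr(0.0)}

	fmt.Println("temperature:", orDefault(opts.Temperature, 0.7))
	fmt.Println("top_p:", orDefault(opts.TopP, 0.95))
}
```

With a plain float64 field, the explicit 0 above would be indistinguishable from the zero value of an unset field.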
type PipelineMetrics ¶
type PipelineMetrics struct {
Requests uint64
ScheduledRequests uint64
CacheUsage float32
MaxCacheUsage float32
AvgCacheUsage float32
InferenceDuration float32
CacheSizeInBytes uint64
}
PipelineMetrics mirrors the OpenVINO GenAI PipelineMetrics fields used by the local runtime.
type Session ¶
type Session struct{}
Session is a placeholder in default builds.
func (*Session) DecodeNext ¶
func (*Session) SnapshotRestore ¶
func (*Session) SnapshotSave ¶
type StreamChunk ¶
StreamChunk carries a decoded text delta or a terminal stream error.
type StructuredOutput ¶
StructuredOutput selects an OpenVINO structured-output primitive plus its payload for this generation call.