Documentation ¶
Overview ¶
Package openvino implements the runtime/transport.Service boundary for the OpenVINO (Intel) backend. It opens persistent, manifest-keyed sessions on the owned device (CPU, GPU, or NPU), which the runtime drives over the transport. It is not a modelprovider: the stateless modelprovider lives in the runtime, and this package is the daemon-side compute implementation.
session.go adapts OpenVINO GenAI to the EnsurePrefix/PrefillSuffix/Decode contract. GenAI is string-prompt based and holds the tokenizer, chat template, and prefix cache inside the ContinuousBatchingPipeline. The native backend lives in the isolated sub-package ./ovsession behind the 'openvino' and 'openvino_genai' build tags, so the default build and CI never require OpenVINO or a C++ toolchain. Without the tags, ovsession reports Available == false and OpenSession returns the not-compiled-in error.
Build and benchmark the native path with Makefile.openvino. The CGO flags are derived from the OpenVINO wheels, and CONTENOX_OPENVINO_DEVICE selects the device (CPU, GPU, or NPU).
Index ¶
- func HasAccelerator() bool
- func RuntimeInfo() transport.ModelInfo
- type EmbedSessionBackend
- type Service
- func NewService(opts ...ServiceOption) *Service
- func (s *Service) Describe(_ context.Context, req transport.OpenSessionRequest) (transport.ModelInfo, error)
- func (s *Service) Embed(ctx context.Context, req transport.EmbedRequest) (transport.EmbedResult, error)
- func (s *Service) OpenSession(_ context.Context, req transport.OpenSessionRequest) (transport.Session, error)
- type ServiceOption
- func WithCapacityPolicy(p capacity.Policy) ServiceOption
- func WithMemorySource(src capacity.MemorySource) ServiceOption
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func HasAccelerator ¶ added in v0.32.5
func HasAccelerator() bool
HasAccelerator reports whether OpenVINO enumerates a non-CPU device (GPU/NPU) on this host. modeld uses it to pick the backend on a universal build.
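A rough sketch of how a daemon might use such a probe to choose a backend. The probe is passed in as a function so the example stays self-contained; in real code it would be openvino.HasAccelerator. The function pickBackend, the backend names, and the fallback rule are hypothetical and are not taken from modeld.

```go
package main

import "fmt"

// pickBackend chooses a backend name from an accelerator probe.
// Hypothetical: the names "openvino" and "llama" and the rule
// "use OpenVINO only when a non-CPU device exists" are assumptions.
func pickBackend(hasAccelerator func() bool) string {
	if hasAccelerator() {
		return "openvino"
	}
	return "llama"
}

func main() {
	// Stand-in probe; a real caller would pass openvino.HasAccelerator.
	noAccel := func() bool { return false }
	fmt.Println("backend:", pickBackend(noAccel))
}
```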
func RuntimeInfo ¶ added in v0.32.5
func RuntimeInfo() transport.ModelInfo
RuntimeInfo reports the linked OpenVINO runtime identity and device inventory. In builds without the native OpenVINO GenAI tags, this returns a minimal record with RuntimeName set.
Types ¶
type EmbedSessionBackend ¶ added in v0.32.3
type Service ¶ added in v0.32.3
type Service struct {
// contains filtered or unexported fields
}
Service implements the runtime/transport.Service boundary for the OpenVINO GenAI backend. It opens persistent, manifest-keyed sessions on the owned device (CPU / GPU / NPU); the runtime reaches it as a client over the transport and never imports this package.
func NewService ¶ added in v0.32.5
func NewService(opts ...ServiceOption) *Service
func (*Service) Describe ¶ added in v0.32.3
func (s *Service) Describe(_ context.Context, req transport.OpenSessionRequest) (transport.ModelInfo, error)
Describe reports the model's trained context window, read from max_position_embeddings in the IR's config.json, without loading a pipeline. The runtime consumes this value as the model's capacity; it never reads the IR files itself.
func (*Service) Embed ¶ added in v0.32.5
func (s *Service) Embed(ctx context.Context, req transport.EmbedRequest) (transport.EmbedResult, error)
Embed runs a one-shot OpenVINO GenAI TextEmbeddingPipeline for req.Text. It is deliberately separate from OpenSession: embedding models do not use the chat session's prefix/suffix/Decode lifecycle.
func (*Service) OpenSession ¶ added in v0.32.3
func (s *Service) OpenSession(_ context.Context, req transport.OpenSessionRequest) (transport.Session, error)
OpenSession makes the model at req.Path (an OpenVINO IR directory, resolved by the runtime) resident and returns a session bound to it. It rejects a model typed for a different backend (ErrBackendMismatch) before loading, so a request for a llama model on an openvino-mode daemon fails at the boundary. In a build without the openvino + openvino_genai tags, ovsession.NewGenAI reports the backend is not compiled in and that error surfaces here unchanged.
type ServiceOption ¶ added in v0.32.5
type ServiceOption func(*Service)
func WithCapacityPolicy ¶ added in v0.32.5
func WithCapacityPolicy(p capacity.Policy) ServiceOption
func WithMemorySource ¶ added in v0.32.5
func WithMemorySource(src capacity.MemorySource) ServiceOption
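ServiceOption follows Go's functional-options pattern. The sketch below reproduces that pattern with local stand-in types so it compiles on its own: Policy and its MaxSessions field substitute for capacity.Policy and are hypothetical, as is the default value chosen in NewService.

```go
package main

import "fmt"

// Policy is a stand-in for capacity.Policy; its field is hypothetical.
type Policy struct{ MaxSessions int }

// Service is a local stand-in with one configurable field.
type Service struct{ policy Policy }

// ServiceOption mutates a Service during construction.
type ServiceOption func(*Service)

// WithCapacityPolicy returns an option that overrides the policy.
func WithCapacityPolicy(p Policy) ServiceOption {
	return func(s *Service) { s.policy = p }
}

// NewService applies the options in order on top of assumed defaults.
func NewService(opts ...ServiceOption) *Service {
	s := &Service{policy: Policy{MaxSessions: 1}} // assumed default
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	def := NewService()
	tuned := NewService(WithCapacityPolicy(Policy{MaxSessions: 4}))
	fmt.Println(def.policy.MaxSessions, tuned.policy.MaxSessions)
}
```

Because options are applied in order, a later option overrides an earlier one that sets the same field.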