Documentation ¶
Overview ¶
Package openvino implements the runtime/transport.Service boundary for the OpenVINO (Intel) backend. It opens persistent, manifest-keyed sessions on the owned device (CPU / GPU / NPU), which the runtime drives over the transport. It is not a modelprovider: the stateless modelprovider lives in the runtime, and this package is the daemon-side compute implementation.
session.go adapts OpenVINO GenAI to the EnsurePrefix/PrefillSuffix/Decode contract. GenAI is string-prompt based and keeps the tokenizer, chat template, and prefix cache inside the ContinuousBatchingPipeline. The native backend lives in the isolated sub-package ./ovsession behind the 'openvino' and 'openvino_genai' build tags, so the default build and CI never require OpenVINO or a C++ toolchain. Without the tags, ovsession reports Available == false and OpenSession returns the not-compiled-in error.
Build and benchmark the native path with Makefile.openvino (the CGO flags are derived from the OpenVINO wheels; CONTENOX_OPENVINO_DEVICE selects CPU/GPU/NPU).
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type EmbedSessionBackend ¶ added in v0.32.3
type Service ¶ added in v0.32.3
type Service struct{}
Service implements the runtime/transport.Service boundary for the OpenVINO GenAI backend. It opens persistent, manifest-keyed sessions on the owned device (CPU / GPU / NPU); the runtime reaches it as a client over the transport and never imports this package.
func (*Service) Describe ¶ added in v0.32.3
func (s *Service) Describe(_ context.Context, req transport.OpenSessionRequest) (transport.ModelInfo, error)
Describe reports the model's trained context window, read from max_position_embeddings in the IR's config.json, without loading a pipeline. The runtime consumes this as the model's capacity; it never reads the IR files itself.
func (*Service) OpenSession ¶ added in v0.32.3
func (s *Service) OpenSession(_ context.Context, req transport.OpenSessionRequest) (transport.Session, error)
OpenSession makes the model at req.Path (an OpenVINO IR directory, resolved by the runtime) resident and returns a session bound to it. It rejects a model typed for a different backend with ErrBackendMismatch before loading, so a request for a llama model on an openvino-mode daemon fails at the boundary. In a build without both the openvino and openvino_genai tags, ovsession.NewGenAI reports that the backend is not compiled in, and that error surfaces here unchanged.