package llama

Version: v0.32.5 (latest)
Published: Jun 19, 2026
License: Apache-2.0
Imports: 13
Imported by: 0

Repository

github.com/contenox/runtime

Documentation ¶

Overview ¶

Package llama is the graduated local coding-node runtime. It provides a persistent, workspace-scoped inference session that keeps the KV cache of a stable prefix hot and re-prefills only the changed suffix (the live warm-reuse hot path). It is distinct from the toy fixed-constant `local` provider.

This package defines the backend-neutral session contract. Backend adapters implement it: llama.cpp today (./llamasession), OpenVINO later. Product code talks to Session, never to llama.cpp or OpenVINO concepts. Snapshot/restore (durability, branching, crash recovery) is a separate, later concern; the hot coding loop is EnsurePrefix -> PrefillSuffix -> Decode on a live session.
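The warm-reuse idea behind that loop can be sketched with a toy model. Everything below is hypothetical: toySession, its method signatures, and the token-counting rule are invented for illustration and are not the package's Session contract. Decode is left out because it does not affect how much of the prompt is re-evaluated.

```go
package main

import "fmt"

// toySession is a hypothetical stand-in for a live session. It only tracks
// which tokens are resident; a real backend would hold KV cache state.
type toySession struct {
	prefix []int // stable prefix tokens currently resident
	suffix []int // per-turn suffix tokens currently resident
}

// commonPrefixLen returns how many leading tokens a and b share.
func commonPrefixLen(a, b []int) int {
	n := 0
	for n < len(a) && n < len(b) && a[n] == b[n] {
		n++
	}
	return n
}

// ensurePrefix makes tokens the resident prefix and returns how many tokens
// had to be evaluated. An unchanged prefix costs zero evaluations.
func (s *toySession) ensurePrefix(tokens []int) int {
	shared := commonPrefixLen(s.prefix, tokens)
	s.prefix = append(s.prefix[:shared], tokens[shared:]...)
	s.suffix = nil // the old suffix no longer follows the resident prefix
	return len(tokens) - shared
}

// prefillSuffix replaces the per-turn suffix and returns its evaluation cost.
func (s *toySession) prefillSuffix(tokens []int) int {
	s.suffix = append(s.suffix[:0], tokens...)
	return len(tokens)
}

func main() {
	s := &toySession{}
	system := []int{1, 2, 3, 4} // stands in for a stable system prompt

	// Turn 1: cold session, the whole prefix is evaluated.
	fmt.Println(s.ensurePrefix(system), s.prefillSuffix([]int{9, 9}))

	// Turn 2: same prefix, new suffix. Only the suffix is evaluated.
	fmt.Println(s.ensurePrefix(system), s.prefillSuffix([]int{7}))
}
```

In this toy, the second turn pays only for the changed suffix, which is the property the overview calls the hot path.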

Index ¶

Variables
func EmbedAvailable() bool
func NewContextOverflowError(stage string, resident, additional, numCtx int) error
func NewManifestMismatchError(reason string) error
func NewUnsupportedFeatureError(feature string) error
func RuntimeInfo() transport.ModelInfo
func SessionAvailable() bool
func SetEmbedFunc(f EmbedFunc)
func SetSessionFactory(f SessionFactory)
type Config
type ContextManifest
type ContextOverflowError
- func (e *ContextOverflowError) Error() string
- func (e *ContextOverflowError) Is(target error) bool
type ContextReport
type DecodeConfig
type EmbedFunc
type ManifestMismatchError
type ManifestSegment
type PrefixInput
type PrefixStatus
type Service
- func NewService(opts ...ServiceOption) *Service
- func (s *Service) Describe(_ context.Context, req transport.OpenSessionRequest) (transport.ModelInfo, error)
- func (s *Service) Embed(_ context.Context, req transport.EmbedRequest) (transport.EmbedResult, error)
- func (s *Service) OpenSession(ctx context.Context, req transport.OpenSessionRequest) (transport.Session, error)
type ServiceOption
- func WithCapacityPolicy(p capacity.Policy) ServiceOption
- func WithMemorySource(src capacity.MemorySource) ServiceOption
type Session
type SessionFactory
type SessionSnapshot
type StreamChunk
type SuffixInput
type SuffixStatus
type TokenizeFunc
type ToolCall
type UnsupportedFeatureError
- func (e *UnsupportedFeatureError) Error() string
- func (e *UnsupportedFeatureError) Is(target error) bool

Constants ¶

This section is empty.

Variables ¶

var (
	// ErrSessionUnavailable means no native llama backend was compiled into
	// this binary.
	ErrSessionUnavailable = errors.New("llama: session backend unavailable")
	// ErrSessionClosed means the caller used a closed persistent session.
	ErrSessionClosed = errors.New("llama: session closed")
	// ErrContextOverflow means a prefix, suffix, or decode would exceed NumCtx.
	ErrContextOverflow = errors.New("llama: context overflow")
	// ErrUnsupportedFeature marks explicit product-surface gaps such as tools.
	ErrUnsupportedFeature = errors.New("llama: unsupported feature")
	// ErrSessionFatal means the backend marked the session unusable and callers
	// must evict it instead of trying to reuse resident KV.
	ErrSessionFatal = errors.New("llama: session fatal")
)

var ErrManifestMismatch = contextasm.ErrManifestMismatch

ErrManifestMismatch is returned when a prefix/suffix cannot be safely paired with resident KV under the current manifest.

Functions ¶

func EmbedAvailable ¶

func EmbedAvailable() bool

EmbedAvailable reports whether an embedding backend is compiled into this build.

func NewContextOverflowError ¶

func NewContextOverflowError(stage string, resident, additional, numCtx int) error

func NewManifestMismatchError ¶

func NewManifestMismatchError(reason string) error

NewManifestMismatchError builds a manifest-mismatch error with a reason.

func NewUnsupportedFeatureError ¶

func NewUnsupportedFeatureError(feature string) error

func RuntimeInfo ¶ added in v0.32.5

func RuntimeInfo() transport.ModelInfo

RuntimeInfo reports the linked llama.cpp runtime identity and device inventory. In non-direct builds this returns an empty record.

func SessionAvailable ¶

func SessionAvailable() bool

SessionAvailable reports whether a session backend is compiled into this build.

func SetEmbedFunc ¶

func SetEmbedFunc(f EmbedFunc)

SetEmbedFunc registers the native embedding backend.

func SetSessionFactory ¶

func SetSessionFactory(f SessionFactory)

SetSessionFactory registers the backend that creates sessions. The llama.cpp adapter (./llamasession) calls this from its init when built with the 'llamanode' tag, so the provider never imports the CGo package directly (no import cycle, default build stays CGo-free).

Types ¶

type Config ¶

type Config = transport.Config

type ContextManifest ¶

type ContextManifest = contextasm.ContextManifest

The llama backend keys warm KV reuse on the backend-neutral context manifest owned by the runtime (runtime/contextasm, surfaced to the runtime as transport.ContextManifest). These aliases let the llama.cpp session adapter and its tests refer to those types through this package without importing contextasm directly. The manifest is assembled by the runtime and crosses the transport; modeld only fills the backend-resolved token data during prefill.

type ContextOverflowError ¶

type ContextOverflowError struct {
	Stage            string
	ResidentTokens   int
	AdditionalTokens int
	NumCtx           int
}

ContextOverflowError carries token counts for an overflow at a specific primitive boundary.

func (*ContextOverflowError) Error ¶

func (e *ContextOverflowError) Error() string

func (*ContextOverflowError) Is ¶

func (e *ContextOverflowError) Is(target error) bool

type ContextReport ¶

type ContextReport = transport.ContextReport

type DecodeConfig ¶

type DecodeConfig = transport.DecodeConfig

type EmbedFunc ¶

type EmbedFunc func(ctx context.Context, modelPath string, cfg Config, input string) ([]float64, error)

EmbedFunc computes a single embedding via the native backend. The llama.cpp adapter registers one from its init when built with the 'llamanode' tag.

type ManifestMismatchError ¶

type ManifestMismatchError = contextasm.ManifestMismatchError

type ManifestSegment ¶

type ManifestSegment = contextasm.ManifestSegment

type PrefixInput ¶

type PrefixInput = transport.PrefixInput

type PrefixStatus ¶

type PrefixStatus = transport.PrefixStatus

type Service ¶ added in v0.32.3

type Service struct {
	// contains filtered or unexported fields
}

Service implements the runtime/transport.Service boundary. It acts as the opener for native llama.cpp backend sessions.

func NewService ¶ added in v0.32.5

func NewService(opts ...ServiceOption) *Service

func (*Service) Describe ¶ added in v0.32.3

func (s *Service) Describe(_ context.Context, req transport.OpenSessionRequest) (transport.ModelInfo, error)

Describe reports the model's trained context window read from the GGUF header (no tensor load). The runtime consumes this as the model's capacity; it never reads the GGUF itself.

func (*Service) Embed ¶ added in v0.32.5

func (s *Service) Embed(_ context.Context, req transport.EmbedRequest) (transport.EmbedResult, error)

Embed is not served through the modeld transport for llama yet. The runtime's llama provider still owns its native one-shot embedding path separately.

func (*Service) OpenSession ¶ added in v0.32.3

func (s *Service) OpenSession(ctx context.Context, req transport.OpenSessionRequest) (transport.Session, error)

OpenSession binds a session to the requested model. It rejects a model typed for a different backend (ErrBackendMismatch) before loading, so a GGUF request sent to an openvino-mode daemon (or vice versa) fails at the boundary, not deep in the engine. The model is loaded from req.Path (resolved by the runtime); identity/caching uses req.Digest.

type ServiceOption ¶ added in v0.32.5

type ServiceOption func(*Service)

func WithCapacityPolicy ¶ added in v0.32.5

func WithCapacityPolicy(p capacity.Policy) ServiceOption

func WithMemorySource ¶ added in v0.32.5

func WithMemorySource(src capacity.MemorySource) ServiceOption
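NewService and the two With... functions follow the functional-options shape implied by `type ServiceOption func(*Service)`. The sketch below shows that shape with local stand-ins; policy, memorySource, their fields, and the default values are hypothetical, since capacity.Policy and capacity.MemorySource are not defined on this page.

```go
package main

import "fmt"

// policy and memorySource are hypothetical stand-ins for capacity.Policy and
// capacity.MemorySource.
type policy struct{ MaxResidentSessions int }

type memorySource interface{ FreeBytes() uint64 }

// fixedMemory is a test double that always reports the same free memory.
type fixedMemory uint64

func (m fixedMemory) FreeBytes() uint64 { return uint64(m) }

type service struct {
	policy policy
	memory memorySource
}

type serviceOption func(*service)

func withCapacityPolicy(p policy) serviceOption {
	return func(s *service) { s.policy = p }
}

func withMemorySource(src memorySource) serviceOption {
	return func(s *service) { s.memory = src }
}

// newService starts from assumed defaults and applies options in order, so a
// later option overrides an earlier one.
func newService(opts ...serviceOption) *service {
	s := &service{policy: policy{MaxResidentSessions: 1}, memory: fixedMemory(0)}
	for _, opt := range opts {
		opt(s)
	}
	return s
}

func main() {
	def := newService()
	fmt.Println(def.policy.MaxResidentSessions, def.memory.FreeBytes())

	// Injecting a fixed memory source keeps capacity decisions deterministic
	// in tests.
	svc := newService(
		withCapacityPolicy(policy{MaxResidentSessions: 4}),
		withMemorySource(fixedMemory(8<<30)),
	)
	fmt.Println(svc.policy.MaxResidentSessions, svc.memory.FreeBytes())
}
```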

type Session ¶

type Session = transport.Session

type SessionFactory ¶

type SessionFactory func(modelPath string, cfg Config) (Session, error)

SessionFactory creates a backend session for a model with explicit config.

type SessionSnapshot ¶ added in v0.32.5

type SessionSnapshot = transport.SessionSnapshot

type StreamChunk ¶

type StreamChunk = transport.StreamChunk

type SuffixInput ¶

type SuffixInput = transport.SuffixInput

type SuffixStatus ¶

type SuffixStatus = transport.SuffixStatus

type TokenizeFunc ¶

type TokenizeFunc = contextasm.TokenizeFunc

type ToolCall ¶ added in v0.32.5

type ToolCall = transport.ToolCall

type UnsupportedFeatureError ¶

type UnsupportedFeatureError struct {
	Feature string
}

UnsupportedFeatureError describes a deliberately unsupported surface.

func (*UnsupportedFeatureError) Error ¶

func (e *UnsupportedFeatureError) Error() string

func (*UnsupportedFeatureError) Is ¶

func (e *UnsupportedFeatureError) Is(target error) bool

Directories ¶

Path	Synopsis
llamacppshim	Package llamacppshim owns the direct llama.cpp C API boundary for modeld.
llamasession