Documentation ¶
Overview ¶
Package middleware provides reusable middleware for Genkit model generation, including retry with exponential backoff and model fallback.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Fallback ¶
type Fallback struct {
	// Models is the ordered list of fallback models to try.
	// These are tried in order after the primary model fails. Each ref's
	// Config is used verbatim for that model -- the original request's
	// Config is not inherited. Use [ai.NewModelRef] to attach config.
	Models []ai.ModelRef `json:"models,omitempty"`
	// Statuses is the set of status codes that trigger a fallback.
	// Only [core.GenkitError] errors with a matching status will trigger fallback;
	// non-GenkitError errors propagate immediately.
	// Defaults to [defaultFallbackStatuses].
	Statuses []core.StatusName `json:"statuses,omitempty"`
}
Fallback is a middleware that tries alternative models when the primary model fails with a retryable error status.
It only hooks the Model stage -- when a model API call fails with a matching status, the request is forwarded to the next model in the list.
Models are specified as ai.ModelRef values (created via ai.NewModelRef) and resolved via the genkit.Genkit instance at call time.
Usage:
resp, err := genkit.Generate(ctx, g,
	ai.WithModel(primary),
	ai.WithPrompt("hello"),
	ai.WithUse(&middleware.Fallback{Models: []ai.ModelRef{
		googlegenai.ModelRef("googleai/gemini-2.5-flash", ...),
		googlegenai.ModelRef("vertexai/gemini-2.5-flash", ...),
	}}),
)
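The status-matching rule described above can be illustrated without Genkit. The following is a minimal sketch, not the package's implementation: statusError, modelFunc, and generateWithFallback are hypothetical names, with statusError standing in for core.GenkitError and plain strings standing in for core.StatusName.

```go
package main

import (
	"errors"
	"fmt"
	"slices"
)

// statusError is a hypothetical stand-in for core.GenkitError: an error
// that carries a status name.
type statusError struct {
	Status string
	Msg    string
}

func (e *statusError) Error() string { return e.Status + ": " + e.Msg }

// modelFunc is a hypothetical stand-in for a single model API call.
type modelFunc func(prompt string) (string, error)

// generateWithFallback calls primary first. If it fails with a statusError
// whose status is in statuses, each fallback is tried in order. Errors
// without a status, or with a non-matching status, are returned immediately.
func generateWithFallback(primary modelFunc, fallbacks []modelFunc, statuses []string, prompt string) (string, error) {
	resp, err := primary(prompt)
	for _, next := range fallbacks {
		if err == nil {
			return resp, nil
		}
		var se *statusError
		if !errors.As(err, &se) || !slices.Contains(statuses, se.Status) {
			return "", err
		}
		resp, err = next(prompt)
	}
	return resp, err
}

func main() {
	failing := func(string) (string, error) {
		return "", &statusError{Status: "UNAVAILABLE", Msg: "primary is down"}
	}
	backup := func(p string) (string, error) { return "backup: " + p, nil }

	resp, err := generateWithFallback(failing, []modelFunc{backup}, []string{"UNAVAILABLE"}, "hello")
	fmt.Println(resp, err)
}
```

In this sketch a failure with a status outside the list stops the chain at once, which mirrors the documented rule that only matching statuses trigger a fallback.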
type Filesystem ¶
type Filesystem struct {
	// RootDir is the directory that all operations are confined to.
	RootDir string `json:"rootDirectory,omitempty"`
	// AllowWriteAccess adds write_file and edit_file.
	AllowWriteAccess bool `json:"allowWriteAccess,omitempty"`
	// ToolNamePrefix is prepended to each tool name. Use distinct prefixes
	// when attaching multiple Filesystem middlewares to one call so their
	// tool names don't collide.
	ToolNamePrefix string `json:"toolNamePrefix,omitempty"`
}
Filesystem is a middleware that grants the LLM scoped file access under a single root directory. It registers list_files and read_file, plus write_file and edit_file when AllowWriteAccess is true.
Path safety is enforced by os.Root (Go 1.25+), which rejects any path that resolves outside the root, including via "..", absolute paths, or symbolic links.
Usage:
resp, err := genkit.Generate(ctx, g,
	ai.WithModel(m),
	ai.WithPrompt("summarise docs/ and save the summary to out.md"),
	ai.WithUse(&middleware.Filesystem{
		RootDir:          "./workspace",
		AllowWriteAccess: true,
	}),
)
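To see the kind of confinement os.Root provides, the sketch below uses only the standard library. It is an illustration under assumptions, not the middleware's code: readConfined and demo are hypothetical helpers, and the temporary directory layout exists only for the demonstration.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// readConfined reads name relative to rootDir through an os.Root, so the
// lookup cannot leave rootDir.
func readConfined(rootDir, name string) (string, error) {
	root, err := os.OpenRoot(rootDir)
	if err != nil {
		return "", err
	}
	defer root.Close()

	f, err := root.Open(name)
	if err != nil {
		return "", err
	}
	defer f.Close()

	data, err := io.ReadAll(f)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

// demo builds a throwaway workspace next to a file outside it, then reads
// one path inside the root and one that tries to escape via "..".
func demo() (inside string, escapeErr error, err error) {
	base, err := os.MkdirTemp("", "fsdemo")
	if err != nil {
		return "", nil, err
	}
	defer os.RemoveAll(base)

	workspace := filepath.Join(base, "workspace")
	if err := os.Mkdir(workspace, 0o755); err != nil {
		return "", nil, err
	}
	if err := os.WriteFile(filepath.Join(workspace, "notes.txt"), []byte("hello"), 0o644); err != nil {
		return "", nil, err
	}
	if err := os.WriteFile(filepath.Join(base, "secret.txt"), []byte("secret"), 0o644); err != nil {
		return "", nil, err
	}

	inside, err = readConfined(workspace, "notes.txt")
	if err != nil {
		return "", nil, err
	}
	_, escapeErr = readConfined(workspace, "../secret.txt")
	return inside, escapeErr, nil
}

func main() {
	inside, escapeErr, err := demo()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("inside root:", inside)
	fmt.Println("escape attempt:", escapeErr)
}
```

The escape attempt returns a non-nil error even though secret.txt exists on disk, because the path resolves outside the opened root.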
func (*Filesystem) Name ¶
func (f *Filesystem) Name() string
type Middleware ¶
type Middleware struct{}
Middleware provides the built-in middleware (Retry, Fallback, ToolApproval) as a Genkit plugin. Register it with genkit.WithPlugins during genkit.Init.
func (*Middleware) Middlewares ¶
func (p *Middleware) Middlewares(ctx context.Context) ([]*ai.MiddlewareDesc, error)
func (*Middleware) Name ¶
func (p *Middleware) Name() string
type Retry ¶
type Retry struct {
	// MaxRetries is the maximum number of retry attempts. Defaults to 3.
	MaxRetries int `json:"maxRetries,omitempty"`
	// Statuses is the set of status codes that trigger a retry for [core.GenkitError] errors.
	// Non-GenkitError errors are always retried regardless of this setting.
	// Defaults to [defaultRetryStatuses].
	Statuses []core.StatusName `json:"statuses,omitempty"`
	// InitialDelayMs is the delay before the first retry, in milliseconds. Defaults to 1000.
	InitialDelayMs int `json:"initialDelayMs,omitempty"`
	// MaxDelayMs is the upper bound on retry delay, in milliseconds. Defaults to 60000.
	MaxDelayMs int `json:"maxDelayMs,omitempty"`
	// BackoffFactor is the multiplier applied to the delay after each retry. Defaults to 2.
	BackoffFactor float64 `json:"backoffFactor,omitempty"`
	// NoJitter disables random jitter on the delay. Jitter helps prevent
	// thundering-herd problems when many clients retry simultaneously.
	NoJitter bool `json:"noJitter,omitempty"`
}
Retry is a middleware that retries failed model calls with exponential backoff.
It only hooks the Model stage -- individual model API calls are retried, not the entire generate loop.
By default, retries occur for non-core.GenkitError errors (e.g. network failures) and for core.GenkitError errors whose status is one of UNAVAILABLE, DEADLINE_EXCEEDED, RESOURCE_EXHAUSTED, ABORTED, or INTERNAL_SERVER_ERROR.
Usage:
resp, err := ai.Generate(ctx, r,
	ai.WithModel(m),
	ai.WithPrompt("hello"),
	ai.WithUse(&middleware.Retry{MaxRetries: 3}),
)
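The delay schedule implied by InitialDelayMs, BackoffFactor, and MaxDelayMs can be worked through with a small sketch. It assumes the delay before retry n is the initial delay multiplied by the factor n times and then capped; backoffDelay is a hypothetical helper, and jitter is left out so the numbers are deterministic. The package's exact formula and jitter strategy are not stated in this documentation.

```go
package main

import (
	"fmt"
	"math"
	"time"
)

// backoffDelay returns the wait before retry number attempt (0-based),
// growing geometrically from initialMs by factor and capped at maxMs.
// Jitter is deliberately omitted (comparable to setting NoJitter).
func backoffDelay(attempt, initialMs, maxMs int, factor float64) time.Duration {
	ms := float64(initialMs) * math.Pow(factor, float64(attempt))
	if ms > float64(maxMs) {
		ms = float64(maxMs)
	}
	return time.Duration(ms) * time.Millisecond
}

func main() {
	// Documented defaults: 1000 ms initial delay, 60000 ms cap, factor 2.
	for attempt := 0; attempt < 8; attempt++ {
		fmt.Println(attempt, backoffDelay(attempt, 1000, 60000, 2))
	}
}
```

Under these assumptions and the documented defaults, the uncapped delay would reach 64 seconds on the seventh retry, so the cap takes effect from that point on.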
type Skills ¶
type Skills struct {
	// SkillPaths lists directories that are scanned for skills. Each direct
	// subdirectory containing a SKILL.md file is exposed as a skill.
	// Defaults to []string{"skills"}.
	SkillPaths []string `json:"skillPaths,omitempty"`
}
Skills is a middleware that makes a local library of "skills" available to the model. A skill is a directory containing a SKILL.md file whose contents become specialized instructions the model can load on demand.
When used, Skills:
- Injects a system prompt listing the available skill names and their (optional) descriptions.
- Registers a use_skill tool that the model can call to load a skill's full SKILL.md content into the conversation.
SKILL.md may start with a YAML frontmatter block with name and description fields; if absent, only the directory name is surfaced to the model.
Usage:
resp, err := genkit.Generate(ctx, g,
	ai.WithModel(m),
	ai.WithPrompt("use the python skill to compute ..."),
	ai.WithUse(&middleware.Skills{SkillPaths: []string{"skills"}}),
)
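The discovery rule (each direct subdirectory of a skill path that contains a SKILL.md file is a skill) can be sketched with the standard library alone. This is an illustration, not the package's scanner: findSkills and demo are hypothetical helpers, and silently skipping unreadable paths is an assumption made for brevity.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
)

// findSkills returns the names of the direct subdirectories of each path
// that contain a SKILL.md file. Paths that cannot be read are skipped
// (an assumption of this sketch).
func findSkills(paths []string) []string {
	var names []string
	for _, p := range paths {
		entries, err := os.ReadDir(p)
		if err != nil {
			continue
		}
		for _, e := range entries {
			if !e.IsDir() {
				continue
			}
			if _, err := os.Stat(filepath.Join(p, e.Name(), "SKILL.md")); err == nil {
				names = append(names, e.Name())
			}
		}
	}
	sort.Strings(names)
	return names
}

// demo creates a temporary skills directory with one valid skill
// ("python") and one subdirectory without a SKILL.md ("drafts").
func demo() ([]string, error) {
	base, err := os.MkdirTemp("", "skillsdemo")
	if err != nil {
		return nil, err
	}
	defer os.RemoveAll(base)

	skills := filepath.Join(base, "skills")
	for _, dir := range []string{"python", "drafts"} {
		if err := os.MkdirAll(filepath.Join(skills, dir), 0o755); err != nil {
			return nil, err
		}
	}
	skillFile := filepath.Join(skills, "python", "SKILL.md")
	if err := os.WriteFile(skillFile, []byte("Use Python 3."), 0o644); err != nil {
		return nil, err
	}
	return findSkills([]string{skills}), nil
}

func main() {
	names, err := demo()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println(names)
}
```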
func (*Skills) New ¶
New scans the configured skill paths and returns an ai.Hooks that injects the skills system prompt and exposes the use_skill tool. Scanning happens once per ai.Generate call; the result is captured in the returned hooks so WrapGenerate and the use_skill tool agree on the same skill set.
type ToolApproval ¶
type ToolApproval struct {
	// AllowedTools is the list of tool names pre-approved to run without
	// interruption. Tools not in this list trigger an interrupt. An empty
	// list interrupts all tools.
	AllowedTools []string `json:"allowedTools,omitempty"`
}
ToolApproval is a middleware that interrupts tool execution unless the tool is in [AllowedTools] or the call has been explicitly approved on resume.
To approve on resume, attach a "toolApproved" flag to the restart metadata:
restart := tool.Restart(interruptPart, &ai.RestartOptions{
	ResumedMetadata: map[string]any{"toolApproved": true},
})
The bare ai.IsToolResumed flag alone is NOT treated as approval; callers must opt in so that unrelated resume flows (e.g. respond-only turns) cannot bypass approval.
Usage:
resp, err := ai.Generate(ctx, r,
	ai.WithModel(m),
	ai.WithPrompt("do something"),
	ai.WithTools(toolA, toolB),
	ai.WithUse(&middleware.ToolApproval{AllowedTools: []string{"toolA"}}),
)
// toolA runs; toolB triggers an interrupt.
// Resume with ai.WithToolRestarts carrying {"toolApproved": true} to re-execute.
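The approval rule can be reduced to a single predicate. The sketch below is an assumption-laden illustration rather than the middleware's code: shouldRun is a hypothetical function, and a plain map stands in for the resumed metadata attached to a restart.

```go
package main

import (
	"fmt"
	"slices"
)

// shouldRun reports whether a tool call may execute: either the tool is
// pre-approved, or the resumed metadata carries "toolApproved": true.
// A nil map models a call that was not resumed with explicit approval.
func shouldRun(tool string, allowed []string, resumed map[string]any) bool {
	if slices.Contains(allowed, tool) {
		return true
	}
	approved, ok := resumed["toolApproved"].(bool)
	return ok && approved
}

func main() {
	allowed := []string{"toolA"}
	fmt.Println(shouldRun("toolA", allowed, nil))
	fmt.Println(shouldRun("toolB", allowed, nil))
	fmt.Println(shouldRun("toolB", allowed, map[string]any{"toolApproved": true}))
}
```

Requiring the explicit flag, rather than treating any resumed call as approved, is what keeps unrelated resume flows from running a tool that was never approved.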
func (*ToolApproval) Name ¶
func (t *ToolApproval) Name() string