vision

package
v0.0.0-beta Latest
Warning

This package is not in the latest version of its module.

Published: Mar 30, 2026 License: MIT Imports: 15 Imported by: 0

Documentation

Overview

Package vision provides a tool for LLM-based image analysis and text extraction.


Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Inputs

type Inputs struct {
	// Path is the file path to the image or PDF.
	Path string `json:"path" jsonschema:"required,description=Path to the image or PDF file" validate:"required"`
	// Instruction describes what to extract or analyze.
	Instruction string `json:"instruction,omitempty" jsonschema:"description=What to extract or analyze (default: describe or extract text)"`
}

Inputs defines the parameters for the Vision tool.
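
As a rough illustration of how a generic argument map could be turned into these parameters, the sketch below round-trips the map through JSON so the `json` tags drive the field mapping. The local `Inputs` type is a copy of the documented struct with the `jsonschema` and `validate` tags omitted, and `decodeInputs` is a hypothetical helper, not part of this package. The empty-path check stands in for the `validate:"required"` tag; how the package itself performs validation is not shown in this documentation.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// Inputs mirrors the documented struct (jsonschema/validate tags omitted).
type Inputs struct {
	Path        string `json:"path"`
	Instruction string `json:"instruction,omitempty"`
}

// decodeInputs is a hypothetical helper: it round-trips the argument map
// through JSON so the struct's json tags decide which key fills which field.
func decodeInputs(args map[string]any) (Inputs, error) {
	var in Inputs
	raw, err := json.Marshal(args)
	if err != nil {
		return in, fmt.Errorf("marshal args: %w", err)
	}
	if err := json.Unmarshal(raw, &in); err != nil {
		return in, fmt.Errorf("unmarshal args: %w", err)
	}
	// Stand-in for the `validate:"required"` tag on Path (assumption).
	if in.Path == "" {
		return in, errors.New("path is required")
	}
	return in, nil
}

func main() {
	in, err := decodeInputs(map[string]any{
		"path":        "scan.pdf",
		"instruction": "Extract the invoice total",
	})
	fmt.Println(in.Path, in.Instruction, err)
}
```

Because Instruction is optional, a map containing only `path` decodes successfully and leaves Instruction empty.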

type Tool

type Tool struct {
	tool.Base

	// Cfg is the full config — needed because this tool creates sub-agents
	// via agent.New(), which requires access to agents, providers, modes,
	// and the full BuildAgent pipeline.
	Cfg      config.Config
	CfgPaths config.Paths
	Rt       *config.Runtime
}

Tool implements vision-based image analysis via an LLM.

func New

func New(cfg config.Config, paths config.Paths, rt *config.Runtime) *Tool

New creates a Vision tool with the given configuration.

func (*Tool) Available

func (t *Tool) Available() bool

Available reports whether the vision agent is configured.

func (*Tool) Execute

func (t *Tool) Execute(ctx context.Context, args map[string]any) (string, error)

Execute reads an image/PDF, compresses it, sends it to a vision LLM, and returns the response.
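
The following is a minimal sketch of how a caller might combine Available and Execute. It does not import this package: `executor` is a local interface listing only the two documented method signatures, `stubTool` is a hypothetical stand-in for `*Tool`, and `analyze` and the 30-second timeout are illustrative choices, not documented behavior.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// executor lists only the two documented methods this sketch needs.
type executor interface {
	Available() bool
	Execute(ctx context.Context, args map[string]any) (string, error)
}

// stubTool is a hypothetical stand-in for *vision.Tool; it performs no
// file I/O and makes no LLM call.
type stubTool struct{ configured bool }

func (s stubTool) Available() bool { return s.configured }

func (s stubTool) Execute(ctx context.Context, args map[string]any) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err
	}
	path, _ := args["path"].(string)
	if path == "" {
		return "", errors.New("path is required")
	}
	return "analysis of " + path, nil
}

// analyze checks availability first, then calls Execute with a bounded
// context so a slow provider cannot block the caller indefinitely.
func analyze(t executor, path, instruction string) (string, error) {
	if !t.Available() {
		return "", errors.New("vision tool is not available")
	}
	ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
	defer cancel()
	return t.Execute(ctx, map[string]any{
		"path":        path,
		"instruction": instruction,
	})
}

func main() {
	out, err := analyze(stubTool{configured: true}, "diagram.png", "Describe the diagram")
	fmt.Println(out, err)
}
```

The argument keys `path` and `instruction` match the `json` tags on Inputs; a real `*Tool` would presumably satisfy the same interface, since its method signatures are identical.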

func (*Tool) Name

func (t *Tool) Name() string

Name returns the tool's identifier.

func (*Tool) Paths

func (t *Tool) Paths(ctx context.Context, args map[string]any) (read, write []string, err error)

Paths returns the filesystem paths this tool call will access.

func (*Tool) Sandboxable

func (t *Tool) Sandboxable() bool

Sandboxable returns false because the tool makes network calls to an LLM provider.

func (*Tool) Schema

func (t *Tool) Schema() tool.Schema

Schema returns the provider-agnostic tool definition.
