package runners

Version: v0.13.3 | Published: Dec 23, 2025 | License: MPL-2.0 | Imports: 18 | Imported by: 0

Repository

github.com/petmal/mindtrial

Documentation ¶

Overview ¶

Package runners provides interfaces and implementations for executing MindTrial tasks and collecting their results.

Index ¶

Variables
func NewEmittingLogger(logger zerolog.Logger, emitter eventEmitter) logging.Logger
type AnswerDetails
type AsyncResultSet
type Details
type EmittingLogger
type ErrorDetails
type ResultKind
type ResultSet
type Results
- func (r Results) ProviderResultsByRunAndKind(provider string) map[string]map[ResultKind][]RunResult
type RunResult
- func (r RunResult) GetID() (sanitizedTaskID string)
type Runner
- func NewDefaultRunner(ctx context.Context, cfg []config.ProviderConfig, judges []config.JudgeConfig, ...) (Runner, error)
type TokenUsage
type ToolUsage
type ValidationDetails

Constants ¶

This section is empty.

Variables ¶

var (
	// ErrToolNotFound is returned when a required tool is not found in the available tools.
	ErrToolNotFound = errors.New("required tool not found")
)
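
Because ErrToolNotFound is a sentinel error value, callers would normally test for it with errors.Is rather than comparing message strings. The sketch below assumes a local stand-in for the variable and a hypothetical lookupTool helper that wraps the sentinel; neither is part of the package API.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrToolNotFound is a local stand-in for runners.ErrToolNotFound so that
// this sketch compiles without importing the package.
var ErrToolNotFound = errors.New("required tool not found")

// lookupTool is a hypothetical helper: it wraps the sentinel with the tool
// name when the tool is missing from the available set.
func lookupTool(available map[string]bool, name string) error {
	if !available[name] {
		return fmt.Errorf("tool %q: %w", name, ErrToolNotFound)
	}
	return nil
}

// isMissingTool reports whether err is, or wraps, ErrToolNotFound.
func isMissingTool(err error) bool {
	return errors.Is(err, ErrToolNotFound)
}

func main() {
	err := lookupTool(map[string]bool{"calculator": true}, "browser")
	if isMissingTool(err) {
		fmt.Println("missing tool:", err)
	}
}
```

Wrapping with %w keeps the sentinel detectable while still letting the message carry the name of the missing tool.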

Functions ¶

func NewEmittingLogger ¶ added in v0.5.0

func NewEmittingLogger(logger zerolog.Logger, emitter eventEmitter) logging.Logger

NewEmittingLogger creates a new EmittingLogger that wraps the provided zerolog.Logger and emits log messages through the provided event emitter.

Types ¶

type AnswerDetails ¶ added in v0.6.0

type AnswerDetails struct {
	// Title is a descriptive header for the response produced by the target AI model.
	Title string
	// Explanation of the answer produced by the target AI model.
	Explanation []string
	// ActualAnswer is the raw answer from the target AI model split into lines.
	ActualAnswer []string
	// ExpectedAnswer is a set of all acceptable correct answers, each being an array of lines.
	ExpectedAnswer [][]string
	// Usage contains token usage statistics for generating the answer.
	Usage TokenUsage
	// ToolUsage contains aggregated statistics for any tools invoked while producing the answer.
	ToolUsage map[string]ToolUsage `json:"ToolUsage,omitempty"`
}

AnswerDetails defines structured information about the AI model's response to a task.

type AsyncResultSet ¶

type AsyncResultSet interface {
	// GetResults returns the task results for each provider.
	// The call will block until the run is finished.
	GetResults() Results
	// ProgressEvents returns a channel that emits run progress as a value between 0 and 1.
	// The channel will be closed when the run is finished.
	ProgressEvents() <-chan float32
	// MessageEvents returns a channel that emits run log messages.
	// The channel will be closed when the run is finished.
	MessageEvents() <-chan string
	// Cancel stops the ongoing run execution.
	Cancel()
}

AsyncResultSet extends the basic ResultSet interface to provide asynchronous operation capabilities. It offers channels for monitoring progress and receiving messages during execution, as well as the ability to cancel the ongoing run.
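
A consumer of this interface typically drains both event channels until they are closed and only then calls GetResults. The following is a minimal sketch of that pattern. The interface is copied locally, RunResult is reduced to one field, and fakeRun is a hypothetical in-memory implementation used only to drive the demo; none of this reflects how the package implements the interface.

```go
package main

import "fmt"

// Local stand-ins mirroring the documented shapes so the sketch is self-contained.
type RunResult struct{ Task string }
type Results map[string][]RunResult

type AsyncResultSet interface {
	GetResults() Results
	ProgressEvents() <-chan float32
	MessageEvents() <-chan string
	Cancel()
}

// fakeRun is a hypothetical implementation that emits a few events and finishes.
type fakeRun struct {
	progress chan float32
	messages chan string
	done     chan struct{}
	results  Results
}

func newFakeRun() *fakeRun {
	r := &fakeRun{
		progress: make(chan float32),
		messages: make(chan string),
		done:     make(chan struct{}),
	}
	go func() {
		// Deferred calls run in reverse order: progress, messages, then done.
		defer close(r.done)
		defer close(r.messages)
		defer close(r.progress)
		r.messages <- "starting"
		r.progress <- 0.5
		r.messages <- "finished"
		r.progress <- 1
		r.results = Results{"demo": {{Task: "t1"}}}
	}()
	return r
}

func (r *fakeRun) GetResults() Results            { <-r.done; return r.results }
func (r *fakeRun) ProgressEvents() <-chan float32 { return r.progress }
func (r *fakeRun) MessageEvents() <-chan string   { return r.messages }
func (r *fakeRun) Cancel()                        {}

// drain consumes both event channels until each is closed, then collects results.
func drain(rs AsyncResultSet) (last float32, msgs []string, res Results) {
	progress, messages := rs.ProgressEvents(), rs.MessageEvents()
	for progress != nil || messages != nil {
		select {
		case p, ok := <-progress:
			if !ok {
				progress = nil // a nil channel is never selected again
				continue
			}
			last = p
		case m, ok := <-messages:
			if !ok {
				messages = nil
				continue
			}
			msgs = append(msgs, m)
		}
	}
	return last, msgs, rs.GetResults()
}

func main() {
	last, msgs, res := drain(newFakeRun())
	fmt.Println(last, msgs, len(res["demo"]))
}
```

Reading both channels in one select loop avoids stalling the producer when it emits on one channel while the consumer is waiting on the other.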

type Details ¶ added in v0.6.0

type Details struct {
	// Answer contains details about the AI model's response and reasoning process.
	Answer AnswerDetails
	// Validation contains details about the answer verification and assessment.
	Validation ValidationDetails
	// Error contains details about any errors that occurred during task execution.
	Error ErrorDetails
}

Details encapsulates comprehensive information about task execution and validation.

type EmittingLogger ¶ added in v0.5.0

type EmittingLogger struct {
	// contains filtered or unexported fields
}

EmittingLogger implements the logging.Logger interface and additionally emits log messages as events through the provided event emitter. This allows log messages to be broadcast to UI components or other consumers.

func (*EmittingLogger) Error ¶ added in v0.5.0

func (l *EmittingLogger) Error(ctx context.Context, level slog.Level, err error, msg string, args ...any)

Error logs an error at the specified level with optional format arguments. The error and message are logged by the logger and emitted as an event.

func (*EmittingLogger) Message ¶ added in v0.5.0

func (l *EmittingLogger) Message(ctx context.Context, level slog.Level, msg string, args ...any)

Message logs a message at the specified level with optional format arguments. The message is logged by the logger and emitted as an event.

func (*EmittingLogger) WithContext ¶ added in v0.5.0

func (l *EmittingLogger) WithContext(context string) logging.Logger

WithContext returns a new Logger that appends the specified context to the existing prefix.

type ErrorDetails ¶ added in v0.6.0

type ErrorDetails struct {
	// Title provides a summary description of the error.
	Title string
	// Message contains the primary error message.
	Message string
	// Details contains any additional error information in a generic structure.
	Details map[string][]string
	// Usage contains token usage statistics if available even in error scenarios.
	// This is typically populated if the error occurs when parsing the generated response.
	Usage TokenUsage
	// ToolUsage contains aggregated statistics for any tools invoked prior to the error.
	ToolUsage map[string]ToolUsage `json:"ToolUsage,omitempty"`
}

ErrorDetails defines structured information about errors that occurred during execution.

type ResultKind ¶

type ResultKind int

ResultKind represents the task execution result status.

const (
	Success ResultKind = iota
	Failure
	Error
	NotSupported
)

Success indicates that the task finished successfully with a correct result.
Failure indicates that the task finished successfully but with an incorrect result.
Error indicates that the task failed to produce a result.
NotSupported indicates that the task could not finish because the provider does not support the required features.
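
To illustrate how the four kinds might be consumed, the sketch below redeclares the constants locally and adds two hypothetical helpers, describe and tally, which are not part of the package. The labels returned by describe are illustrative wording only.

```go
package main

import "fmt"

// ResultKind and its constants are redeclared locally to keep the sketch self-contained.
type ResultKind int

const (
	Success ResultKind = iota
	Failure
	Error
	NotSupported
)

// describe is a hypothetical helper mapping each kind to a short label.
func describe(k ResultKind) string {
	switch k {
	case Success:
		return "passed"
	case Failure:
		return "failed"
	case Error:
		return "error"
	case NotSupported:
		return "not supported"
	default:
		return "unknown"
	}
}

// tally is a hypothetical helper counting how often each kind occurs.
func tally(kinds []ResultKind) map[ResultKind]int {
	counts := make(map[ResultKind]int)
	for _, k := range kinds {
		counts[k]++
	}
	return counts
}

func main() {
	counts := tally([]ResultKind{Success, Failure, Success, NotSupported})
	for _, k := range []ResultKind{Success, Failure, Error, NotSupported} {
		fmt.Printf("%s: %d\n", describe(k), counts[k])
	}
}
```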

type ResultSet ¶

type ResultSet interface {
	// GetResults returns the task results for each provider.
	GetResults() Results
}

ResultSet represents the outcome of executing a set of tasks.

type Results ¶

type Results map[string][]RunResult

Results stores task results for each provider.

func (Results) ProviderResultsByRunAndKind ¶

func (r Results) ProviderResultsByRunAndKind(provider string) map[string]map[ResultKind][]RunResult

ProviderResultsByRunAndKind organizes results by run configuration and result kind.
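
The return type suggests a two-level grouping: run configuration name first, then result kind. The sketch below shows one way such a grouping could be built, using local stand-in types and a hypothetical free function groupByRunAndKind. It is meant to illustrate the shape of the returned map, not the package's actual implementation, which may differ (for example in how it treats kinds with no results).

```go
package main

import "fmt"

// Local stand-ins with only the fields needed for grouping.
type ResultKind int

const (
	Success ResultKind = iota
	Failure
	Error
	NotSupported
)

type RunResult struct {
	Kind     ResultKind
	Task     string
	Provider string
	Run      string
}

type Results map[string][]RunResult

// groupByRunAndKind is a hypothetical equivalent of the documented method:
// it groups one provider's results by run name, then by result kind.
func groupByRunAndKind(r Results, provider string) map[string]map[ResultKind][]RunResult {
	grouped := make(map[string]map[ResultKind][]RunResult)
	for _, res := range r[provider] {
		if grouped[res.Run] == nil {
			grouped[res.Run] = make(map[ResultKind][]RunResult)
		}
		grouped[res.Run][res.Kind] = append(grouped[res.Run][res.Kind], res)
	}
	return grouped
}

func main() {
	results := Results{
		"demo-provider": {
			{Kind: Success, Task: "t1", Provider: "demo-provider", Run: "fast"},
			{Kind: Failure, Task: "t2", Provider: "demo-provider", Run: "fast"},
			{Kind: Success, Task: "t1", Provider: "demo-provider", Run: "accurate"},
		},
	}
	grouped := groupByRunAndKind(results, "demo-provider")
	fmt.Println(len(grouped["fast"][Success]), len(grouped["fast"][Failure]), len(grouped["accurate"][Success]))
}
```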

type RunResult ¶

type RunResult struct {
	// TraceID is a globally unique identifier for this specific task result, used for tracing and correlation.
	TraceID string
	// Kind indicates the result status.
	Kind ResultKind
	// Task is the name of the executed task.
	Task string
	// Provider is the name of the AI provider that executed the task.
	Provider string
	// Run is the name of the provider's run configuration used.
	Run string
	// Got is the actual answer received from the AI model.
	// For plain text response format, this should be a string that follows the format instruction precisely.
	// For structured schema-based response format, this will be any object that conforms to the schema.
	Got interface{}
	// Want are the accepted valid answer(s) for the task.
	// For plain text response format: contains string values that should follow the format instruction precisely.
	// For structured schema-based response format: contains object values that conform to the schema.
	Want utils.ValueSet
	// Details contains comprehensive information about the generated response and validation assessment.
	Details Details
	// Duration represents the time taken to generate the response.
	Duration time.Duration
}

RunResult represents the outcome of executing a single task.

func (RunResult) GetID ¶

func (r RunResult) GetID() (sanitizedTaskID string)

GetID generates a unique, sanitized identifier for the RunResult. The ID must be non-empty, must not contain whitespace, must begin with a letter, and must only include letters, digits, dashes (-), and underscores (_).
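
The constraints above amount to the pattern "a letter followed by letters, digits, dashes, or underscores". The sketch below shows one way to sanitize an arbitrary string into that form and to check the result. Both sanitizeID and isValidID are hypothetical helpers restricted to ASCII letters; the sketch does not address uniqueness and is not the algorithm GetID uses.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// validID encodes the documented constraints, assuming ASCII letters only.
var validID = regexp.MustCompile(`^[A-Za-z][A-Za-z0-9_-]*$`)

func isLetter(r rune) bool {
	return (r >= 'a' && r <= 'z') || (r >= 'A' && r <= 'Z')
}

func isDigit(r rune) bool {
	return r >= '0' && r <= '9'
}

// sanitizeID is a hypothetical sanitizer: every disallowed character becomes
// a dash, and a prefix is added when the result does not start with a letter.
func sanitizeID(raw string) string {
	var b strings.Builder
	for _, r := range raw {
		if isLetter(r) || isDigit(r) || r == '-' || r == '_' {
			b.WriteRune(r)
		} else {
			b.WriteByte('-')
		}
	}
	id := b.String()
	if id == "" || !isLetter(rune(id[0])) {
		id = "id-" + id
	}
	return id
}

// isValidID reports whether id satisfies the documented constraints.
func isValidID(id string) bool {
	return validID.MatchString(id)
}

func main() {
	for _, raw := range []string{"OpenAI gpt-4o/math #1", "42 answers", ""} {
		id := sanitizeID(raw)
		fmt.Printf("%q -> %q (valid: %v)\n", raw, id, isValidID(id))
	}
}
```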

type Runner ¶

type Runner interface {
	// Run executes all given tasks against all run configurations and returns when done.
	Run(ctx context.Context, tasks []config.Task) (ResultSet, error)
	// Start executes all given tasks against all run configurations asynchronously.
	// It returns immediately and the execution continues in the background,
	// offering progress updates and messages through the returned result set.
	Start(ctx context.Context, tasks []config.Task) (AsyncResultSet, error)
	// Close releases resources when the runner is no longer needed.
	Close(ctx context.Context)
}

Runner executes tasks on configured AI providers.

func NewDefaultRunner ¶

func NewDefaultRunner(ctx context.Context, cfg []config.ProviderConfig, judges []config.JudgeConfig, tools []config.ToolConfig, logger zerolog.Logger) (Runner, error)

NewDefaultRunner creates a new Runner that executes tasks on all configured providers in parallel. The individual runs on a single provider are executed sequentially. It returns an error if any provider initialization fails.
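
The scheduling described here (providers in parallel, runs within a provider in sequence) can be pictured as one goroutine per provider, each looping over its own runs. The sketch below illustrates only that concurrency shape. The runAll function, the string-based provider and run names, and the exec callback are all assumptions made for the example and do not reflect the runner's internals.

```go
package main

import (
	"fmt"
	"sync"
)

// runAll is a hypothetical scheduler: it starts one goroutine per provider and
// executes that provider's runs one after another inside the goroutine.
func runAll(providers map[string][]string, exec func(provider, run string) string) map[string][]string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out = make(map[string][]string, len(providers))
	)
	for provider, runs := range providers {
		wg.Add(1)
		go func(provider string, runs []string) {
			defer wg.Done()
			results := make([]string, 0, len(runs))
			for _, run := range runs { // sequential within a single provider
				results = append(results, exec(provider, run))
			}
			mu.Lock() // the shared map is written from several goroutines
			out[provider] = results
			mu.Unlock()
		}(provider, runs)
	}
	wg.Wait()
	return out
}

func main() {
	providers := map[string][]string{
		"provider-a": {"run-1", "run-2"},
		"provider-b": {"run-1"},
	}
	out := runAll(providers, func(provider, run string) string {
		return provider + "/" + run
	})
	fmt.Println(out["provider-a"], out["provider-b"])
}
```

Running each provider's runs sequentially keeps per-provider ordering deterministic, while the provider-level goroutines still allow independent providers to make progress at the same time.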

type TokenUsage ¶ added in v0.6.1

type TokenUsage struct {
	// InputTokens is the number of tokens consumed by the prompt/input.
	InputTokens *int64 `json:"InputTokens,omitempty"`
	// OutputTokens is the number of tokens generated in the completion/output.
	OutputTokens *int64 `json:"OutputTokens,omitempty"`
}

TokenUsage represents the tokens consumed by an LLM request. Values are optional and may be nil if not available.
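
Since both fields are pointers, readers must check for nil before dereferencing, and nil fields are omitted from JSON output because of the omitempty tags. The sketch below copies the struct locally and adds three hypothetical helpers (int64Ptr, totalTokens, toJSON) to demonstrate both points.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// TokenUsage is copied locally so the sketch is self-contained.
type TokenUsage struct {
	InputTokens  *int64 `json:"InputTokens,omitempty"`
	OutputTokens *int64 `json:"OutputTokens,omitempty"`
}

// int64Ptr is a hypothetical convenience helper for building pointer fields.
func int64Ptr(v int64) *int64 { return &v }

// totalTokens is a hypothetical helper that sums whichever counts are present.
// known is false when neither count is available.
func totalTokens(u TokenUsage) (total int64, known bool) {
	if u.InputTokens != nil {
		total += *u.InputTokens
		known = true
	}
	if u.OutputTokens != nil {
		total += *u.OutputTokens
		known = true
	}
	return total, known
}

// toJSON marshals the usage value; marshaling this struct cannot fail.
func toJSON(u TokenUsage) string {
	b, _ := json.Marshal(u)
	return string(b)
}

func main() {
	partial := TokenUsage{InputTokens: int64Ptr(120)}
	total, known := totalTokens(partial)
	fmt.Println(total, known)         // 120 true
	fmt.Println(toJSON(partial))      // {"InputTokens":120}
	fmt.Println(toJSON(TokenUsage{})) // {}
}
```

Note that omitempty drops only nil pointers: a non-nil pointer to zero is still serialized as 0, which keeps "zero tokens" distinguishable from "unknown".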

type ToolUsage ¶ added in v0.11.0

type ToolUsage struct {
	// CallCount is the number of times the tool was invoked.
	CallCount *int64 `json:"CallCount,omitempty"`
	// TotalDuration is the cumulative execution time for the tool.
	TotalDuration *time.Duration `json:"TotalDuration,omitempty"`
}

ToolUsage represents aggregated tool invocation statistics captured during execution.
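
The other detail structs hold these statistics in a map keyed by tool name. As a rough illustration of how such a map could be accumulated, the sketch below copies the struct locally and adds two hypothetical helpers, record and totalMillis. How the package itself collects these statistics is not shown in this documentation.

```go
package main

import (
	"fmt"
	"time"
)

// ToolUsage is copied locally so the sketch is self-contained.
type ToolUsage struct {
	CallCount     *int64         `json:"CallCount,omitempty"`
	TotalDuration *time.Duration `json:"TotalDuration,omitempty"`
}

// record is a hypothetical helper that adds one invocation of the named tool
// to the aggregate, allocating the optional fields on first use.
func record(stats map[string]ToolUsage, tool string, elapsed time.Duration) {
	u := stats[tool]
	if u.CallCount == nil {
		u.CallCount = new(int64)
	}
	if u.TotalDuration == nil {
		u.TotalDuration = new(time.Duration)
	}
	*u.CallCount++
	*u.TotalDuration += elapsed
	stats[tool] = u
}

// totalMillis is a hypothetical helper returning the cumulative duration in
// milliseconds, or 0 when no duration has been recorded.
func totalMillis(u ToolUsage) int64 {
	if u.TotalDuration == nil {
		return 0
	}
	return u.TotalDuration.Milliseconds()
}

func main() {
	stats := make(map[string]ToolUsage)
	record(stats, "calculator", 150*time.Millisecond)
	record(stats, "calculator", 200*time.Millisecond)
	record(stats, "browser", 2*time.Second)
	for _, tool := range []string{"calculator", "browser"} {
		u := stats[tool]
		fmt.Printf("%s: %d call(s), %d ms\n", tool, *u.CallCount, totalMillis(u))
	}
}
```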

type ValidationDetails ¶ added in v0.6.0

type ValidationDetails struct {
	// Title identifies the type of validation assessment performed.
	Title string
	// Explanation contains detailed analysis of why the validation succeeded or failed.
	Explanation []string
	// Usage contains token usage statistics for the response validation step.
	// This is typically populated when using an LLM judge validator.
	Usage TokenUsage
	// ToolUsage contains aggregated statistics for any tools invoked during validation.
	ToolUsage map[string]ToolUsage `json:"ToolUsage,omitempty"`
}

ValidationDetails defines structured information about answer verification and correctness assessment.
