Documentation
Overview ¶
Package storage defines the ResultStore interface for persisting and retrieving evaluation outcomes. Implementations include a local filesystem adapter (LocalStore) and an Azure Blob Storage adapter (AzureBlobStore).
Index ¶
- Variables
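- type AzureBlobStore
- func NewAzureBlobStore(ctx context.Context, accountName, containerName string) (*AzureBlobStore, error)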
- func (abs *AzureBlobStore) Compare(ctx context.Context, runID1, runID2 string) (*ComparisonReport, error)
- func (abs *AzureBlobStore) Download(ctx context.Context, runID string) (*models.EvaluationOutcome, error)
- func (abs *AzureBlobStore) List(ctx context.Context, opts ListOptions) ([]ResultSummary, error)
- func (abs *AzureBlobStore) Upload(ctx context.Context, outcome *models.EvaluationOutcome) error
- type ComparisonReport
- type ListOptions
- type LocalStore
- func NewLocalStore(dir string) *LocalStore
- func (ls *LocalStore) Compare(ctx context.Context, runID1, runID2 string) (*ComparisonReport, error)
- func (ls *LocalStore) Download(_ context.Context, runID string) (*models.EvaluationOutcome, error)
- func (ls *LocalStore) List(_ context.Context, opts ListOptions) ([]ResultSummary, error)
- func (ls *LocalStore) Upload(_ context.Context, outcome *models.EvaluationOutcome) error
- type MetricDelta
- type ResultStore
- func NewStore(cfg *projectconfig.StorageConfig, localDir string) (ResultStore, error)
- type ResultSummary
Constants ¶
This section is empty.
Variables ¶
var ErrNotFound = errors.New("result not found")
ErrNotFound is returned when a requested run ID does not exist.
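Callers will usually want to tell a missing run apart from other failures. The following is a minimal sketch of matching the sentinel with errors.Is. The lookup and isMissing helpers are hypothetical stand-ins, and whether the real implementations wrap ErrNotFound or return it directly is an assumption; errors.Is handles both cases.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNotFound mirrors the sentinel error documented above.
var ErrNotFound = errors.New("result not found")

// lookup is a hypothetical stand-in for a store's Download method.
// It wraps ErrNotFound with the run ID, which errors.Is still matches.
func lookup(known map[string]string, runID string) (string, error) {
	v, ok := known[runID]
	if !ok {
		return "", fmt.Errorf("run %q: %w", runID, ErrNotFound)
	}
	return v, nil
}

// isMissing reports whether err indicates a missing run, even when wrapped.
func isMissing(err error) bool {
	return errors.Is(err, ErrNotFound)
}

func main() {
	known := map[string]string{"run-1": "stored"}
	if _, err := lookup(known, "run-2"); isMissing(err) {
		fmt.Println("run-2 has not been stored yet")
	}
}
```

Comparing with errors.Is rather than == keeps the check valid if an implementation adds context to the error.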
Functions ¶
This section is empty.
Types ¶
type AzureBlobStore ¶
type AzureBlobStore struct {
	// contains filtered or unexported fields
}
AzureBlobStore implements ResultStore using Azure Blob Storage. It authenticates with DefaultAzureCredential and falls back to running az login automatically when credentials are unavailable.
func NewAzureBlobStore ¶
func NewAzureBlobStore(ctx context.Context, accountName, containerName string) (*AzureBlobStore, error)
NewAzureBlobStore creates an Azure Blob Storage-backed ResultStore. It uses DefaultAzureCredential for authentication. If credentials are unavailable, it attempts to run 'az login' automatically and retries once.
func (*AzureBlobStore) Compare ¶
func (abs *AzureBlobStore) Compare(ctx context.Context, runID1, runID2 string) (*ComparisonReport, error)
Compare downloads two runs and produces a comparison report with deltas.
func (*AzureBlobStore) Download ¶
func (abs *AzureBlobStore) Download(ctx context.Context, runID string) (*models.EvaluationOutcome, error)
Download retrieves a single evaluation outcome by run ID. As an optimization, it first attempts a scoped list that targets the blob naming pattern */{runID}.json, which avoids scanning all blobs. If no match is found, it falls back to a full blob scan that matches on metadata. The scoped lookup is O(1) when the blob naming convention is followed; the fallback is O(N) in the number of blobs but handles legacy or misnamed blobs.
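The two-phase lookup can be sketched over an in-memory list of blobs. This is an illustration only, not the package's implementation: the blob type and findBlob function are hypothetical, and a slice scan does not reproduce the cost difference between a scoped list and a full scan against the storage service. The runid metadata key is the one Upload is documented to set.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

var errNotFound = errors.New("result not found")

// blob is a hypothetical in-memory stand-in for a stored blob.
type blob struct {
	Name     string
	Metadata map[string]string
}

// findBlob returns the name of the blob that holds runID.
func findBlob(blobs []blob, runID string) (string, error) {
	// Phase 1: match on the naming convention {skill-name}/{run-id}.json.
	suffix := "/" + runID + ".json"
	for _, b := range blobs {
		if strings.HasSuffix(b.Name, suffix) {
			return b.Name, nil
		}
	}
	// Phase 2: fall back to metadata for legacy or misnamed blobs.
	for _, b := range blobs {
		if b.Metadata["runid"] == runID {
			return b.Name, nil
		}
	}
	return "", errNotFound
}

func main() {
	blobs := []blob{
		{Name: "summarize/run-1.json"},
		{Name: "legacy-0042.json", Metadata: map[string]string{"runid": "run-2"}},
	}
	for _, id := range []string{"run-1", "run-2", "run-3"} {
		name, err := findBlob(blobs, id)
		fmt.Println(id, name, err)
	}
}
```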
func (*AzureBlobStore) List ¶
func (abs *AzureBlobStore) List(ctx context.Context, opts ListOptions) ([]ResultSummary, error)
List returns summaries of stored results matching the given options. It uses ListBlobsFlat with prefix filtering and reads blob metadata to build ResultSummary values without downloading blob contents.
func (*AzureBlobStore) Upload ¶
func (abs *AzureBlobStore) Upload(ctx context.Context, outcome *models.EvaluationOutcome) error
Upload persists an evaluation outcome to Azure Blob Storage.
Blob path: {skill-name}/{run-id}.json
Metadata: skill, model, passrate, timestamp, runid
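As a rough illustration of the layout above, the sketch below derives a blob path and metadata map from an outcome. The outcome struct is a hypothetical stand-in for models.EvaluationOutcome, whose fields are not shown here, and the string encodings chosen for passrate and timestamp are assumptions rather than the package's actual format.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// outcome is a hypothetical stand-in for the fields Upload would read
// from models.EvaluationOutcome.
type outcome struct {
	RunID     string
	Skill     string
	Model     string
	PassRate  float64
	Timestamp time.Time
}

// blobPath builds {skill-name}/{run-id}.json.
func blobPath(o outcome) string {
	return o.Skill + "/" + o.RunID + ".json"
}

// blobMetadata builds the five documented metadata keys.
// Blob metadata values are strings, so numbers and times are formatted
// (the exact formats here are assumed).
func blobMetadata(o outcome) map[string]string {
	return map[string]string{
		"skill":     o.Skill,
		"model":     o.Model,
		"passrate":  strconv.FormatFloat(o.PassRate, 'f', -1, 64),
		"timestamp": o.Timestamp.UTC().Format(time.RFC3339),
		"runid":     o.RunID,
	}
}

func main() {
	o := outcome{
		RunID:     "run-1",
		Skill:     "summarize",
		Model:     "example-model",
		PassRate:  0.75,
		Timestamp: time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC),
	}
	fmt.Println(blobPath(o))
	fmt.Println(blobMetadata(o))
}
```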
type ComparisonReport ¶
type ComparisonReport struct {
	Run1       ResultSummary
	Run2       ResultSummary
	PassDelta  float64
	ScoreDelta float64
	Metrics    map[string]MetricDelta
}
ComparisonReport holds the result of comparing two evaluation runs.
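To make the idea of per-metric deltas concrete, here is a small sketch. The metricDelta shape is hypothetical, since MetricDelta's fields are not listed in this document, and computing each delta as the second run minus the first is an assumed convention.

```go
package main

import "fmt"

// metricDelta is a hypothetical shape for a single metric's change.
type metricDelta struct {
	Before float64
	After  float64
	Delta  float64
}

// diffMetrics computes deltas for metrics present in both runs.
// Metrics that appear in only one run are skipped in this sketch.
func diffMetrics(run1, run2 map[string]float64) map[string]metricDelta {
	out := make(map[string]metricDelta)
	for name, before := range run1 {
		after, ok := run2[name]
		if !ok {
			continue
		}
		out[name] = metricDelta{Before: before, After: after, Delta: after - before}
	}
	return out
}

func main() {
	run1 := map[string]float64{"accuracy": 0.5, "latency": 2}
	run2 := map[string]float64{"accuracy": 0.75}
	for name, d := range diffMetrics(run1, run2) {
		fmt.Printf("%s: %v -> %v (delta %v)\n", name, d.Before, d.After, d.Delta)
	}
}
```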
type ListOptions ¶
ListOptions controls filtering and pagination for List.
type LocalStore ¶
type LocalStore struct {
	// contains filtered or unexported fields
}
LocalStore implements ResultStore using the local filesystem. JSON result files are stored in a flat directory structure.
func NewLocalStore ¶
func NewLocalStore(dir string) *LocalStore
NewLocalStore creates a LocalStore that reads and writes results in dir.
func (*LocalStore) Compare ¶
func (ls *LocalStore) Compare(ctx context.Context, runID1, runID2 string) (*ComparisonReport, error)
Compare downloads two runs and produces a comparison report with deltas.
func (*LocalStore) Download ¶
func (ls *LocalStore) Download(_ context.Context, runID string) (*models.EvaluationOutcome, error)
Download retrieves a single evaluation outcome by run ID.
func (*LocalStore) List ¶
func (ls *LocalStore) List(_ context.Context, opts ListOptions) ([]ResultSummary, error)
List returns summaries of stored results matching the given options.
func (*LocalStore) Upload ¶
func (ls *LocalStore) Upload(_ context.Context, outcome *models.EvaluationOutcome) error
Upload writes an evaluation outcome as a JSON file to the local results directory.
type MetricDelta ¶
MetricDelta captures the difference for a single metric between two runs.
type ResultStore ¶
type ResultStore interface {
	// Upload persists an evaluation outcome.
	Upload(ctx context.Context, outcome *models.EvaluationOutcome) error
	// List returns summaries matching the given options.
	List(ctx context.Context, opts ListOptions) ([]ResultSummary, error)
	// Download retrieves a single evaluation outcome by run ID.
	Download(ctx context.Context, runID string) (*models.EvaluationOutcome, error)
	// Compare downloads two runs and produces a comparison report.
	Compare(ctx context.Context, runID1, runID2 string) (*ComparisonReport, error)
}
ResultStore abstracts how evaluation outcomes are persisted and queried. All methods accept a context.Context for cancellation and deadline support, which is required for cloud-backed implementations.
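The sketch below shows why the context parameter matters for a slow backend. slowDownload is a hypothetical stand-in for a cloud-backed Download call that honors cancellation; it is not part of this package, and the real implementations may react to deadlines differently.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// slowDownload is a hypothetical stand-in for a network-bound store call.
// It returns early with ctx.Err() if the context ends before the work does.
func slowDownload(ctx context.Context, delay time.Duration) error {
	select {
	case <-time.After(delay):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// timedOut runs slowDownload under a deadline and reports whether the
// deadline fired first. Both arguments are in milliseconds.
func timedOut(timeoutMS, delayMS int) bool {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMS)*time.Millisecond)
	defer cancel()
	err := slowDownload(ctx, time.Duration(delayMS)*time.Millisecond)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("short deadline timed out:", timedOut(10, 1000))
	fmt.Println("generous deadline timed out:", timedOut(1000, 10))
}
```

A caller would pass a context built the same way to any ResultStore method to bound how long the operation may take.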
func NewStore ¶
func NewStore(cfg *projectconfig.StorageConfig, localDir string) (ResultStore, error)
NewStore creates a ResultStore based on project configuration. If storage is configured with provider "azure-blob" and enabled, it returns an AzureBlobStore using DefaultAzureCredential. Otherwise it returns a LocalStore backed by localDir.
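The selection rule can be sketched as a small decision function. The storageConfig struct and its Provider and Enabled fields are hypothetical, since the fields of projectconfig.StorageConfig are not shown in this document, and treating a nil config as local is also an assumption.

```go
package main

import "fmt"

// storageConfig is a hypothetical stand-in for projectconfig.StorageConfig.
type storageConfig struct {
	Provider string
	Enabled  bool
}

// chooseBackend returns "azure-blob" only when that provider is both
// configured and enabled; every other case falls back to "local".
func chooseBackend(cfg *storageConfig) string {
	if cfg != nil && cfg.Enabled && cfg.Provider == "azure-blob" {
		return "azure-blob"
	}
	return "local"
}

func main() {
	fmt.Println(chooseBackend(nil))
	fmt.Println(chooseBackend(&storageConfig{Provider: "azure-blob", Enabled: false}))
	fmt.Println(chooseBackend(&storageConfig{Provider: "azure-blob", Enabled: true}))
}
```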
type ResultSummary ¶
type ResultSummary struct {
	RunID     string    `json:"run_id"`
	Skill     string    `json:"skill"`
	Model     string    `json:"model"`
	Timestamp time.Time `json:"timestamp"`
	PassRate  float64   `json:"pass_rate"`
	BlobPath  string    `json:"blob_path"`
}
ResultSummary is a lightweight representation of a stored evaluation run, suitable for listing without loading the full outcome.
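The struct tags determine the JSON field names. The sketch below copies the ResultSummary definition into a standalone program and encodes one value with encoding/json; the encode helper and the sample values are illustrative only.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// ResultSummary is copied from the definition above so the example
// compiles on its own.
type ResultSummary struct {
	RunID     string    `json:"run_id"`
	Skill     string    `json:"skill"`
	Model     string    `json:"model"`
	Timestamp time.Time `json:"timestamp"`
	PassRate  float64   `json:"pass_rate"`
	BlobPath  string    `json:"blob_path"`
}

// encode marshals a summary to its JSON text.
func encode(s ResultSummary) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	s := ResultSummary{
		RunID:     "run-1",
		Skill:     "summarize",
		Model:     "example-model",
		Timestamp: time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC),
		PassRate:  0.75,
		BlobPath:  "summarize/run-1.json",
	}
	out, err := encode(s)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```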