Documentation ¶
Overview ¶
Package e2e provides end-to-end testing infrastructure for LLM-based operations.
Provider Configuration ¶
E2E tests use devlore's standard provider resolution chain:
- CLI flags (not applicable in tests)
- Environment variables: DEVLORE_MODEL_PROVIDER, DEVLORE_MODEL_API_KEY, etc.
- Config file: ~/.config/devlore/config.yaml
- Native keystore (API keys stored via 'lore config model')
- Auto-detect from common env vars: GROQ_API_KEY, GEMINI_API_KEY, etc.
- Ollama fallback if running locally
To run E2E tests:
# Using devlore config (recommended)
lore config model                      # one-time setup
E2E_TEST=1 go test ./internal/e2e/...

# Using environment variables
E2E_TEST=1 DEVLORE_MODEL_PROVIDER=groq DEVLORE_MODEL_API_KEY=... go test ./internal/e2e/...

# Auto-detect from existing API keys
E2E_TEST=1 go test ./internal/e2e/...  # picks up GROQ_API_KEY, ANTHROPIC_API_KEY, etc.
Package Contents ¶
This package includes:
- GetTestProvider: Returns a provider using devlore's standard resolution
- Metrics collection (latency, tokens, correctness)
- Test fixtures for migration and onboarding scenarios
- Result comparison and reporting
Index ¶
- func CreateProvider(cfg ProviderConfig) (model.Provider, error)
- func GetTestProvider(ctx context.Context) (prov model.Provider, errMsg string)
- func RunWithTimeout(ctx context.Context, timeout time.Duration, fn func(context.Context) error) error
- type CorrectnessMetrics
- type PerformanceMetrics
- type ProviderConfig
- type TestConfig
- type TestReport
- type TestResult
- type TestSuite
- type Timer
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func CreateProvider ¶
func CreateProvider(cfg ProviderConfig) (model.Provider, error)
CreateProvider creates a model.Provider from a ProviderConfig.
func GetTestProvider ¶
func GetTestProvider(ctx context.Context) (prov model.Provider, errMsg string)
GetTestProvider returns a provider using devlore's standard configuration. It uses the full resolution chain: CLI flags → env vars → config → keystore → auto-detect → Ollama. Returns nil and an error message if no provider is available (tests should skip).
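A typical call site can be sketched as follows. This is a stand-alone illustration: getTestProvider here is a local stub standing in for e2e.GetTestProvider (the real function takes a context and returns a model.Provider), and the stub always fails to resolve, as in a CI run with no keys configured.

```go
package main

import "fmt"

// getTestProvider is a stub for e2e.GetTestProvider in this sketch:
// it returns a provider (here just a name) or a non-empty error
// message when no provider can be resolved.
func getTestProvider() (prov string, errMsg string) {
	// Resolution chain: CLI flags → env vars → config → keystore →
	// auto-detect → Ollama. This stub models the "nothing found" case.
	return "", "no provider configured"
}

func main() {
	prov, errMsg := getTestProvider()
	if errMsg != "" {
		// In a real test this would be t.Skip(errMsg).
		fmt.Println("skipping E2E test:", errMsg)
		return
	}
	fmt.Println("using provider:", prov)
}
```

In an actual test, the skip branch keeps E2E tests green on machines with no provider configured rather than failing them.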
Types ¶
type CorrectnessMetrics ¶
type CorrectnessMetrics struct {
// Generic metrics
TotalExpected int `json:"total_expected" yaml:"total_expected"`
TotalFound int `json:"total_found" yaml:"total_found"`
TruePositives int `json:"true_positives" yaml:"true_positives"`
FalsePositives int `json:"false_positives" yaml:"false_positives"`
FalseNegatives int `json:"false_negatives" yaml:"false_negatives"`
Precision float64 `json:"precision" yaml:"precision"`
Recall float64 `json:"recall" yaml:"recall"`
F1Score float64 `json:"f1_score" yaml:"f1_score"`
// Domain-specific flags
SystemCorrect bool `json:"system_correct,omitempty" yaml:"system_correct,omitempty"`
ProductCorrect bool `json:"product_correct,omitempty" yaml:"product_correct,omitempty"`
PlatformsCorrect bool `json:"platforms_correct,omitempty" yaml:"platforms_correct,omitempty"`
}
CorrectnessMetrics captures correctness data for test validation.
func (*CorrectnessMetrics) ComputePrecisionRecall ¶
func (c *CorrectnessMetrics) ComputePrecisionRecall()
ComputePrecisionRecall calculates precision, recall, and F1 from TP/FP/FN.
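The computation follows the standard formulas (precision = TP/(TP+FP), recall = TP/(TP+FN), F1 = 2PR/(P+R)). A stand-alone sketch with a trimmed copy of the struct, guarding each division against zero denominators; the package's actual implementation may differ in details:

```go
package main

import "fmt"

// CorrectnessMetrics mirrors the package type, trimmed to the
// fields used by ComputePrecisionRecall.
type CorrectnessMetrics struct {
	TruePositives  int
	FalsePositives int
	FalseNegatives int
	Precision      float64
	Recall         float64
	F1Score        float64
}

// ComputePrecisionRecall fills Precision, Recall, and F1Score
// from the TP/FP/FN counts, leaving zero values when a
// denominator would be zero.
func (c *CorrectnessMetrics) ComputePrecisionRecall() {
	if tp := c.TruePositives; tp+c.FalsePositives > 0 {
		c.Precision = float64(tp) / float64(tp+c.FalsePositives)
	}
	if tp := c.TruePositives; tp+c.FalseNegatives > 0 {
		c.Recall = float64(tp) / float64(tp+c.FalseNegatives)
	}
	if c.Precision+c.Recall > 0 {
		c.F1Score = 2 * c.Precision * c.Recall / (c.Precision + c.Recall)
	}
}

func main() {
	m := CorrectnessMetrics{TruePositives: 8, FalsePositives: 2, FalseNegatives: 2}
	m.ComputePrecisionRecall()
	fmt.Printf("precision=%.2f recall=%.2f f1=%.2f\n", m.Precision, m.Recall, m.F1Score)
}
```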
type PerformanceMetrics ¶
type PerformanceMetrics struct {
LatencyMs int64 `json:"latency_ms" yaml:"latency_ms"`
InputTokens int `json:"input_tokens" yaml:"input_tokens"`
OutputTokens int `json:"output_tokens" yaml:"output_tokens"`
TotalTokens int `json:"total_tokens" yaml:"total_tokens"`
CostUSD float64 `json:"cost_usd,omitempty" yaml:"cost_usd,omitempty"`
Retries int `json:"retries" yaml:"retries"`
}
PerformanceMetrics captures performance data for an LLM operation.
type ProviderConfig ¶
type ProviderConfig struct {
Name string `yaml:"name" json:"name"`
Provider string `yaml:"provider" json:"provider"` // ollama, anthropic, openai, github, etc.
Model string `yaml:"model" json:"model"`
Endpoint string `yaml:"endpoint,omitempty" json:"endpoint,omitempty"`
EnvKey string `yaml:"env_key,omitempty" json:"env_key,omitempty"` // Environment variable for API key
}
ProviderConfig defines configuration for a test provider.
type TestConfig ¶
type TestConfig struct {
Providers []ProviderConfig `yaml:"providers" json:"providers"`
Timeout time.Duration `yaml:"timeout" json:"timeout"`
}
TestConfig holds configuration for E2E tests.
func DefaultTestConfig ¶
func DefaultTestConfig() TestConfig
DefaultTestConfig returns the default test configuration.
func LoadTestConfig ¶
func LoadTestConfig(path string) (TestConfig, error)
LoadTestConfig loads test configuration from a file.
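A config file for LoadTestConfig might look like the following. The keys come from the struct tags on TestConfig and ProviderConfig above; the provider names, models, and the "2m" duration string are illustrative values, not defaults shipped with the package:

providers:
  - name: groq-llama
    provider: groq
    model: llama-3.1-70b
    env_key: GROQ_API_KEY
  - name: local-ollama
    provider: ollama
    model: qwen2.5
    endpoint: http://localhost:11434
timeout: 2m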
type TestReport ¶
type TestReport struct {
GeneratedAt time.Time `json:"generated_at" yaml:"generated_at"`
Suites []TestSuite `json:"suites" yaml:"suites"`
}
TestReport aggregates results across all tests and providers.
func (*TestReport) GenerateSummary ¶
func (r *TestReport) GenerateSummary() string
GenerateSummary creates a markdown summary of test results.
func (*TestReport) WriteReport ¶
func (r *TestReport) WriteReport(outDir string) error
WriteReport writes the test report to a directory.
type TestResult ¶
type TestResult struct {
TestName string `json:"test_name" yaml:"test_name"`
Provider string `json:"provider" yaml:"provider"`
Model string `json:"model" yaml:"model"`
StartTime time.Time `json:"start_time" yaml:"start_time"`
EndTime time.Time `json:"end_time" yaml:"end_time"`
Success bool `json:"success" yaml:"success"`
Error string `json:"error,omitempty" yaml:"error,omitempty"`
Performance PerformanceMetrics `json:"performance" yaml:"performance"`
Correctness CorrectnessMetrics `json:"correctness" yaml:"correctness"`
Details map[string]any `json:"details,omitempty" yaml:"details,omitempty"`
}
TestResult captures the complete result of a single test run.
type TestSuite ¶
type TestSuite struct {
Name string `json:"name" yaml:"name"`
RunAt time.Time `json:"run_at" yaml:"run_at"`
Results []TestResult `json:"results" yaml:"results"`
}
TestSuite holds results for all providers on a single test.