benchmark

package
v1.5.0
Warning

This package is not in the latest version of its module.

Published: Apr 4, 2026 License: MIT Imports: 14 Imported by: 0

Documentation

Overview

Package benchmark provides an automated benchmark loop for evaluating and iteratively improving Indago's detection capabilities.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ExtractParsedEndpointsFromSpec

func ExtractParsedEndpointsFromSpec(scanResultPath string) []types.Endpoint

ExtractParsedEndpointsFromSpec reads the scan result JSON and returns the endpoints that Indago reported scanning.

func ExtractScannedEndpoints

func ExtractScannedEndpoints(logPath string, findings []types.Finding, endpoints []types.Endpoint) []string

ExtractScannedEndpoints determines which endpoints were fuzzed by combining data from the request log (if available) and from scan findings. This covers both standard and differential auth modes.
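The combining step can be pictured as a de-duplicated union of two endpoint lists. The following is a minimal sketch under that assumption, not the package's implementation: mergeEndpoints and the example paths are hypothetical, and a nil first argument stands in for a request log that is not available.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeEndpoints is a hypothetical helper: it returns the sorted,
// de-duplicated union of endpoints seen in the request log and in findings.
// Either slice may be nil (for example, when no request log was written).
func mergeEndpoints(fromLog, fromFindings []string) []string {
	seen := make(map[string]struct{})
	for _, src := range [][]string{fromLog, fromFindings} {
		for _, ep := range src {
			if ep != "" {
				seen[ep] = struct{}{}
			}
		}
	}
	out := make([]string, 0, len(seen))
	for ep := range seen {
		out = append(out, ep)
	}
	sort.Strings(out)
	return out
}

func main() {
	// Placeholder paths for illustration only.
	fromLog := []string{"/users", "/books"}
	fromFindings := []string{"/books", "/login"}
	fmt.Println(mergeEndpoints(fromLog, fromFindings))
}
```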

Types

type ConvergenceTracker

type ConvergenceTracker struct {
	FilePath string
	History  []IterationRecord
}

ConvergenceTracker tracks iteration-over-iteration metrics and detects convergence or stalls.

func NewConvergenceTracker

func NewConvergenceTracker(filePath string) *ConvergenceTracker

NewConvergenceTracker creates a tracker that appends records to the given JSONL file.

func (*ConvergenceTracker) Append

func (ct *ConvergenceTracker) Append(rec IterationRecord) error

Append adds an iteration record and writes it to the JSONL file.
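JSONL stores one JSON object per line, so appending a record means encoding it and writing a single newline-terminated line at the end of the file. The sketch below shows that pattern with the standard library only. The record type is a trimmed, hypothetical stand-in for IterationRecord, and appendJSONL is an illustrative name rather than the package's code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// record is a hypothetical, trimmed stand-in for IterationRecord.
type record struct {
	Iteration int     `json:"iteration"`
	Recall    float64 `json:"recall"`
}

// encodeLine renders one record as a single JSON line ending in '\n'.
func encodeLine(rec record) ([]byte, error) {
	b, err := json.Marshal(rec)
	if err != nil {
		return nil, err
	}
	return append(b, '\n'), nil
}

// appendJSONL opens path in append mode (creating it if needed) and
// writes rec as one line.
func appendJSONL(path string, rec record) error {
	line, err := encodeLine(rec)
	if err != nil {
		return err
	}
	f, err := os.OpenFile(path, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0o644)
	if err != nil {
		return err
	}
	if _, err := f.Write(line); err != nil {
		f.Close()
		return err
	}
	return f.Close()
}

func main() {
	dir, err := os.MkdirTemp("", "jsonl-demo")
	if err != nil {
		fmt.Println("temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "history.jsonl")
	for i := 1; i <= 2; i++ {
		if err := appendJSONL(path, record{Iteration: i, Recall: 0.5 * float64(i)}); err != nil {
			fmt.Println("append:", err)
			return
		}
	}
	data, _ := os.ReadFile(path)
	fmt.Print(string(data))
}
```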

func (*ConvergenceTracker) IsConverged

func (ct *ConvergenceTracker) IsConverged(rec IterationRecord) bool

IsConverged returns true if recall is 1.0, there are no false positives, and average confidence is at least 0.8.
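The three conditions translate directly into a boolean expression. This is a sketch of the stated criteria only: iterationMetrics is a hypothetical reduced form of IterationRecord, and isConverged is a free function written for illustration.

```go
package main

import "fmt"

// iterationMetrics is a hypothetical subset of IterationRecord holding
// only the fields the convergence check needs.
type iterationMetrics struct {
	Recall         float64
	FalsePositives int
	AvgConfidence  float64
}

// isConverged applies the documented criteria: perfect recall,
// no false positives, and average confidence of at least 0.8.
func isConverged(m iterationMetrics) bool {
	return m.Recall == 1.0 && m.FalsePositives == 0 && m.AvgConfidence >= 0.8
}

func main() {
	done := iterationMetrics{Recall: 1.0, FalsePositives: 0, AvgConfidence: 0.85}
	noisy := iterationMetrics{Recall: 1.0, FalsePositives: 2, AvgConfidence: 0.9}
	fmt.Println(isConverged(done), isConverged(noisy))
}
```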

func (*ConvergenceTracker) IsStalled

func (ct *ConvergenceTracker) IsStalled(n int) bool

IsStalled returns true if recall has not improved for the last n iterations.
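The documentation does not spell out how "not improved" is measured, so the sketch below picks one plausible reading and labels it as an assumption: the run is stalled when none of the last n recall values exceeds the best recall seen before them. The isStalled function here works on a plain slice of recall values and is not the package's method.

```go
package main

import "fmt"

// isStalled reports whether none of the last n recall values beats the
// best recall recorded before that window. Assumption: with n or fewer
// records there is not enough history, so the result is false.
func isStalled(recalls []float64, n int) bool {
	if n <= 0 || len(recalls) <= n {
		return false
	}
	split := len(recalls) - n

	// Best recall achieved before the last n iterations.
	best := recalls[0]
	for _, r := range recalls[1:split] {
		if r > best {
			best = r
		}
	}

	// Any improvement inside the window means the loop is still progressing.
	for _, r := range recalls[split:] {
		if r > best {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isStalled([]float64{0.5, 0.6, 0.6, 0.6}, 2))
	fmt.Println(isStalled([]float64{0.5, 0.6, 0.7}, 2))
}
```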

type EvaluationResult

type EvaluationResult struct {
	TruePositives  []MatchResult   `json:"true_positives"`
	FalseNegatives []MatchResult   `json:"false_negatives"`
	FalsePositives []types.Finding `json:"false_positives"`

	Precision       float64 `json:"precision"`
	Recall          float64 `json:"recall"`
	F1              float64 `json:"f1"`
	AvgConfidence   float64 `json:"avg_confidence"`
	TotalFindings   int     `json:"total_findings"`
	TotalGroundTrue int     `json:"total_ground_truth"`
}

EvaluationResult holds the outcome of comparing scan findings to ground truth.

func Evaluate

func Evaluate(gt *GroundTruth, findings []types.Finding) *EvaluationResult

Evaluate compares scan results against ground truth and returns a full evaluation with precision, recall, F1 score, and per-vulnerability detail.
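For reference, the standard definitions behind these metrics are precision = TP / (TP + FP), recall = TP / (TP + FN), and F1 = 2PR / (P + R). The sketch below computes them from raw counts. It assumes that an empty denominator yields 0, which may differ from how Evaluate handles that case, and computeMetrics is a hypothetical helper.

```go
package main

import "fmt"

// metrics holds the three summary scores derived from match counts.
type metrics struct {
	Precision float64
	Recall    float64
	F1        float64
}

// ratio divides num by den, returning 0 when den is 0 (an assumption
// made here to avoid NaN; the package may treat this case differently).
func ratio(num, den int) float64 {
	if den == 0 {
		return 0
	}
	return float64(num) / float64(den)
}

// computeMetrics applies the textbook definitions to counts of true
// positives (tp), false positives (fp), and false negatives (fn).
func computeMetrics(tp, fp, fn int) metrics {
	p := ratio(tp, tp+fp)
	r := ratio(tp, tp+fn)
	f1 := 0.0
	if p+r > 0 {
		f1 = 2 * p * r / (p + r)
	}
	return metrics{Precision: p, Recall: r, F1: f1}
}

func main() {
	m := computeMetrics(3, 1, 2)
	fmt.Printf("precision=%.2f recall=%.2f f1=%.2f\n", m.Precision, m.Recall, m.F1)
}
```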

type FPSummary

type FPSummary struct {
	Type      string `json:"type"`
	Endpoint  string `json:"endpoint"`
	Method    string `json:"method"`
	Parameter string `json:"parameter,omitempty"`
	Severity  string `json:"severity"`
	Title     string `json:"title,omitempty"`
}

FPSummary captures key details of a false positive finding for convergence tracking.

type GapAnalysis

type GapAnalysis struct {
	VulnID       string  `json:"vuln_id"`
	VulnName     string  `json:"vuln_name"`
	VulnClass    string  `json:"vuln_class"`
	Endpoint     string  `json:"endpoint"`
	Gap          GapType `json:"gap_type"`
	Notes        string  `json:"notes"`
	PayloadsSent int     `json:"payloads_sent"`
	ResponseCode int     `json:"response_code,omitempty"`
	ResponseBody string  `json:"response_body,omitempty"`
}

GapAnalysis describes why a specific ground truth vulnerability was missed.

func AnalyzeGaps

func AnalyzeGaps(
	falseNegatives []MatchResult,
	allFindings []types.Finding,
	requestLogPath string,
	scannedEndpoints []string,
) []GapAnalysis

AnalyzeGaps determines why each false negative occurred by examining the request log, findings, and known attack types.

type GapType

type GapType string

GapType classifies why a vulnerability was missed.

const (
	GapEndpointNotScanned GapType = "GAP_ENDPOINT_NOT_SCANNED"
	GapNoPayloads         GapType = "GAP_NO_PAYLOADS"
	GapPayloadIneffective GapType = "GAP_PAYLOAD_INEFFECTIVE"
	GapDetectionMissed    GapType = "GAP_DETECTION_MISSED"
	GapFilteredOut        GapType = "GAP_FILTERED_OUT"
	GapNewVulnClass       GapType = "GAP_NEW_VULN_CLASS"
)

type GroundTruth

type GroundTruth struct {
	Vulnerabilities []Vulnerability `yaml:"vulnerabilities"`
}

GroundTruth holds all known vulnerabilities for a target.

func LoadGroundTruth

func LoadGroundTruth(path string) (*GroundTruth, error)

LoadGroundTruth reads a ground truth YAML file.

type ImprovementCategory

type ImprovementCategory string

ImprovementCategory classifies the type of improvement.

const (
	CatPayloadGenerator   ImprovementCategory = "payload_generator"
	CatDetectionHeuristic ImprovementCategory = "detection_heuristic"
	CatLLMPrompt          ImprovementCategory = "llm_prompt"
	CatFilterTuning       ImprovementCategory = "filter_tuning"
)

type ImprovementProposal

type ImprovementProposal struct {
	Category    ImprovementCategory `json:"category"`
	FilePath    string              `json:"file_path"`
	Action      string              `json:"action"` // "modify" or "add_code"
	CurrentCode string              `json:"current_code,omitempty"`
	NewCode     string              `json:"new_code"`
	Rationale   string              `json:"rationale"`
	Applied     bool                `json:"applied"`
	Error       string              `json:"error,omitempty"`
}

ImprovementProposal represents a single code change proposed by the LLM.

type Improver

type Improver struct {
	// contains filtered or unexported fields
}

Improver uses one or two LLM providers to propose and apply code improvements.

func NewImprover

func NewImprover(primary llm.Provider, local llm.Provider, projectRoot string) *Improver

NewImprover creates an improver with one or two LLM providers.

func (*Improver) Apply

func (imp *Improver) Apply(proposals []ImprovementProposal) []ImprovementProposal

Apply applies proposals to the codebase, rolling back on build/test failures.

func (*Improver) Propose

func (imp *Improver) Propose(ctx context.Context, eval *EvaluationResult, gaps []GapAnalysis) ([]ImprovementProposal, error)

Propose generates improvement proposals from gap analysis.

type IterationRecord

type IterationRecord struct {
	Iteration            int         `json:"iteration"`
	Recall               float64     `json:"recall"`
	Precision            float64     `json:"precision"`
	F1                   float64     `json:"f1"`
	FalsePositives       int         `json:"false_positives"`
	FalseNegatives       int         `json:"false_negatives"`
	TruePositives        int         `json:"true_positives"`
	AvgConfidence        float64     `json:"avg_confidence"`
	TotalFindings        int         `json:"total_findings"`
	ImprovementsUsed     []string    `json:"improvements_applied"`
	Converged            bool        `json:"converged"`
	FalsePositiveDetails []FPSummary `json:"false_positive_details,omitempty"`
}

IterationRecord stores metrics for a single benchmark iteration.

type MatchResult

type MatchResult struct {
	Vuln     Vulnerability
	Matched  bool
	Matches  []types.Finding // findings that matched
	GapType  GapType         // populated if not matched
	GapNotes string          // human-readable gap explanation
}

MatchResult records whether a ground truth entry was matched.

func MatchFindings

func MatchFindings(gt *GroundTruth, findings []types.Finding) (results []MatchResult, falsePositives []types.Finding)

MatchFindings compares scan findings against ground truth and produces per-vulnerability match results along with a list of false positives.

type MatchRules

type MatchRules struct {
	FindingTypes    []string `yaml:"finding_types"`
	MinSeverity     string   `yaml:"min_severity,omitempty"`
	MinConfidence   string   `yaml:"min_confidence,omitempty"`
	EndpointPattern string   `yaml:"endpoint_pattern,omitempty"`
	Method          string   `yaml:"method,omitempty"`
}

MatchRules defines how to match findings to a ground truth vulnerability.
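To make the role of these rules concrete, here is a simplified sketch of rule-based matching. It covers only FindingTypes, Method, and the MinMatches threshold from Vulnerability. Severity, confidence, and EndpointPattern are left out because their semantics are not documented here. The finding and rules types are hypothetical stand-ins, the finding type strings are placeholders, and treating a non-positive MinMatches as 1 is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// finding is a hypothetical stand-in for types.Finding.
type finding struct {
	Type   string
	Method string
}

// rules is a hypothetical subset of MatchRules.
type rules struct {
	FindingTypes []string
	Method       string
}

// matches reports whether f has an accepted type and, when the rule
// names a method, the same HTTP method (compared case-insensitively).
func matches(r rules, f finding) bool {
	if r.Method != "" && !strings.EqualFold(r.Method, f.Method) {
		return false
	}
	for _, t := range r.FindingTypes {
		if t == f.Type {
			return true
		}
	}
	return false
}

// isDetected counts matching findings and compares the count against
// minMatches. Assumption: values below 1 are treated as 1.
func isDetected(r rules, minMatches int, fs []finding) bool {
	if minMatches < 1 {
		minMatches = 1
	}
	n := 0
	for _, f := range fs {
		if matches(r, f) {
			n++
		}
	}
	return n >= minMatches
}

func main() {
	r := rules{FindingTypes: []string{"sqli"}, Method: "GET"}
	fs := []finding{{Type: "sqli", Method: "get"}, {Type: "xss", Method: "GET"}}
	fmt.Println(isDetected(r, 1, fs), isDetected(r, 2, fs))
}
```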

type RequestLogEntry

type RequestLogEntry struct {
	Endpoint    string            `json:"endpoint"`
	Method      string            `json:"method"`
	PayloadType string            `json:"payload_type"`
	URL         string            `json:"url"`
	Response    *logEntryResponse `json:"response,omitempty"`
	// Computed fields (populated after parsing)
	StatusCode   int    `json:"-"`
	ResponseBody string `json:"-"`
}

RequestLogEntry mirrors the structure of Indago's --log-requests output. The log is a JSON array of objects, each with nested request and response objects.

type VAmPISetup

type VAmPISetup struct {
	ContainerName string
	HostPort      int
	BaseURL       string
}

VAmPISetup manages the VAmPI Docker container lifecycle and user setup.

func NewVAmPISetup

func NewVAmPISetup(hostPort int) *VAmPISetup

NewVAmPISetup creates a VAmPI setup manager.

func (*VAmPISetup) BuildIndagoArgs

func (v *VAmPISetup) BuildIndagoArgs(tokens *VAmPITokens, specPath, outputPath, logPath, provider string) []string

BuildIndagoArgs returns the CLI arguments for running Indago against VAmPI.

func (*VAmPISetup) Start

func (v *VAmPISetup) Start() (*VAmPITokens, error)

Start launches a fresh VAmPI container, initializes the DB, creates test users, and returns JWT tokens for both users.

func (*VAmPISetup) StartWatchdog

func (v *VAmPISetup) StartWatchdog(interval time.Duration) (cancel func())

StartWatchdog launches a goroutine that periodically health-checks VAmPI and restarts the container if it becomes unresponsive. Returns a cancel function to stop the watchdog.
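The general shape of such a watchdog is a goroutine driven by a time.Ticker and stopped through a cancel function. The sketch below illustrates that pattern under stated assumptions: the check and restart callbacks are hypothetical placeholders for the real health check and container restart, and the interval must be positive because time.NewTicker panics otherwise.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"time"
)

// startWatchdog calls check every interval and invokes restart whenever
// check returns false. The returned cancel function stops the goroutine
// and waits for it to exit.
func startWatchdog(interval time.Duration, check func() bool, restart func()) (cancel func()) {
	ctx, stop := context.WithCancel(context.Background())
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				if !check() {
					restart()
				}
			}
		}
	}()
	return func() {
		stop()
		wg.Wait()
	}
}

// runDemo simulates a permanently unresponsive target and returns how
// many restarts happened before the watchdog was cancelled.
func runDemo() int {
	var mu sync.Mutex
	restarts := 0
	restarted := make(chan struct{}, 1)

	cancel := startWatchdog(5*time.Millisecond,
		func() bool { return false }, // health check always fails
		func() {
			mu.Lock()
			restarts++
			mu.Unlock()
			select {
			case restarted <- struct{}{}: // signal the first restart
			default:
			}
		})

	<-restarted // wait until at least one restart has run
	cancel()

	mu.Lock()
	defer mu.Unlock()
	return restarts
}

func main() {
	fmt.Println("restarts:", runDemo())
}
```

Waiting on the WaitGroup inside the cancel function means the caller knows no further restart can fire once cancel returns, which matters if the container is about to be removed.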

func (*VAmPISetup) Stop

func (v *VAmPISetup) Stop()

Stop removes the VAmPI container.

type VAmPITokens

type VAmPITokens struct {
	User1Token    string
	User2Token    string
	User1Name     string
	User2Name     string
	User1Password string
	User2Password string
}

VAmPITokens holds the JWT tokens, usernames, and passwords for the two test users.

type Vulnerability

type Vulnerability struct {
	ID          string     `yaml:"id"`
	Name        string     `yaml:"name"`
	Class       string     `yaml:"class"`
	Endpoint    string     `yaml:"endpoint"`
	Method      string     `yaml:"method"`
	Parameter   string     `yaml:"parameter,omitempty"`
	Description string     `yaml:"description"`
	MatchRules  MatchRules `yaml:"match_rules"`
	MinMatches  int        `yaml:"min_matches"`
}

Vulnerability describes a known vulnerability in the ground truth.
