Documentation
Overview
ABOUTME: `tracker-swebench analyze <results-dir>` — bulk-triage a prior run's artifacts. ABOUTME: Reads predictions.jsonl, per-instance logs, and optional empty-patch diagnostics; emits a structured report.
ABOUTME: SWE-bench Lite JSONL dataset parser for the tracker-swebench harness. ABOUTME: Provides LoadDataset to read instances from a JSONL file and Instance methods for prompt generation.
ABOUTME: Docker container lifecycle management for the swebench benchmarking harness. ABOUTME: Shells out to the docker CLI to create, start, exec, stop, and remove containers per instance.
ABOUTME: CLI entry point for the tracker-swebench SWE-bench benchmarking harness. ABOUTME: Runs tracker's code agent against SWE-bench Lite instances and records predictions.
ABOUTME: Results writer for SWE-bench predictions — appends JSONL predictions and tracks completed instances. ABOUTME: Supports resumability by reading existing predictions on open, plus run stats and run metadata helpers.
Directories
| Path | Synopsis |
|---|---|
| | ABOUTME: In-container agent binary for SWE-bench evaluation harness. |