tracker-swebench

command
v0.39.1 Latest
Warning: This package is not in the latest version of its module.
Published: Jun 12, 2026 | License: MIT | Imports: 20 | Imported by: 0

Documentation

Overview

`tracker-swebench analyze <results-dir>` bulk-triages a prior run's artifacts. It reads predictions.jsonl, per-instance logs, and optional empty-patch diagnostics, and emits a structured report.

SWE-bench Lite JSONL dataset parser for the tracker-swebench harness. Provides LoadDataset to read instances from a JSONL file and Instance methods for prompt generation.
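To make the dataset parser concrete, here is a minimal sketch of reading SWE-bench instances from a JSONL file. It is not the package's actual LoadDataset: the function names `loadInstances` and `parseLine`, the `Prompt` method, and the struct layout are all assumptions for illustration. Only the JSON keys (`instance_id`, `repo`, `base_commit`, `problem_statement`) come from the published SWE-bench dataset format.

```go
package main

import (
	"bufio"
	"bytes"
	"encoding/json"
	"fmt"
	"os"
)

// Instance holds a subset of the published SWE-bench fields.
// The struct layout is an assumption, not the harness's actual type.
type Instance struct {
	InstanceID       string `json:"instance_id"`
	Repo             string `json:"repo"`
	BaseCommit       string `json:"base_commit"`
	ProblemStatement string `json:"problem_statement"`
}

// parseLine decodes one JSONL record and rejects records without an ID.
func parseLine(line []byte) (Instance, error) {
	var inst Instance
	if err := json.Unmarshal(line, &inst); err != nil {
		return Instance{}, err
	}
	if inst.InstanceID == "" {
		return Instance{}, fmt.Errorf("record has no instance_id")
	}
	return inst, nil
}

// Prompt is a hypothetical prompt builder; the real prompt wording is unknown.
func (i Instance) Prompt() string {
	return fmt.Sprintf("Repository: %s (commit %s)\n\n%s", i.Repo, i.BaseCommit, i.ProblemStatement)
}

// loadInstances reads a JSONL file line by line, skipping blank lines.
func loadInstances(path string) ([]Instance, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()

	sc := bufio.NewScanner(f)
	// Problem statements can be long, so raise the default 64 KiB line limit.
	sc.Buffer(make([]byte, 0, 64*1024), 16*1024*1024)

	var out []Instance
	for n := 1; sc.Scan(); n++ {
		line := bytes.TrimSpace(sc.Bytes())
		if len(line) == 0 {
			continue
		}
		inst, err := parseLine(line)
		if err != nil {
			return nil, fmt.Errorf("%s:%d: %w", path, n, err)
		}
		out = append(out, inst)
	}
	return out, sc.Err()
}

func main() {
	// With a path argument, load that file; otherwise parse an inline sample.
	if len(os.Args) > 1 {
		insts, err := loadInstances(os.Args[1])
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			os.Exit(1)
		}
		fmt.Printf("loaded %d instances\n", len(insts))
		return
	}
	sample := []byte(`{"instance_id":"demo__demo-1","repo":"demo/demo","base_commit":"abc123","problem_statement":"Fix the bug."}`)
	inst, err := parseLine(sample)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(inst.Prompt())
}
```

Reporting the file name and line number on a parse failure makes a malformed record easy to locate in a large dataset file.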

Docker container lifecycle management for the SWE-bench benchmarking harness. Shells out to the docker CLI to create, start, exec, stop, and remove containers per instance.
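The create, start, exec, stop, remove sequence could be sketched as below. This is an illustration under assumptions, not the harness's code: the `Container` type, its methods, `runInstance`, the container name, and the `python:3.11` image are hypothetical, and keeping the container alive with `sleep infinity` assumes an image whose `sleep` accepts that argument. The docker subcommands themselves (`create`, `start`, `exec`, `stop`, `rm`) are standard. `main` only prints the planned commands, so the sketch runs without a Docker daemon.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"strings"
)

// Container identifies one per-instance container (hypothetical type).
type Container struct {
	Name  string
	Image string
}

// The *Args methods build docker CLI argument lists without running anything.
func (c Container) createArgs() []string {
	// "sleep infinity" keeps the container alive so later exec calls have a target.
	return []string{"create", "--name", c.Name, c.Image, "sleep", "infinity"}
}

func (c Container) startArgs() []string { return []string{"start", c.Name} }

func (c Container) execArgs(cmd ...string) []string {
	return append([]string{"exec", c.Name}, cmd...)
}

func (c Container) stopArgs() []string { return []string{"stop", c.Name} }

func (c Container) removeArgs() []string { return []string{"rm", c.Name} }

// docker shells out to the docker CLI and returns its combined output.
func docker(ctx context.Context, args ...string) (string, error) {
	out, err := exec.CommandContext(ctx, "docker", args...).CombinedOutput()
	if err != nil {
		return string(out), fmt.Errorf("docker %s: %w", strings.Join(args, " "), err)
	}
	return string(out), nil
}

// runInstance walks the full lifecycle for one command inside the container.
// Deferred calls run in reverse order, so stop happens before rm.
func runInstance(ctx context.Context, c Container, cmd ...string) (string, error) {
	if _, err := docker(ctx, c.createArgs()...); err != nil {
		return "", err
	}
	// Cleanup uses a fresh context so it still runs if ctx was cancelled.
	defer docker(context.Background(), c.removeArgs()...)

	if _, err := docker(ctx, c.startArgs()...); err != nil {
		return "", err
	}
	defer docker(context.Background(), c.stopArgs()...)

	return docker(ctx, c.execArgs(cmd...)...)
}

func main() {
	// Dry run: print the commands instead of executing them.
	c := Container{Name: "swebench-demo", Image: "python:3.11"}
	plan := [][]string{
		c.createArgs(),
		c.startArgs(),
		c.execArgs("git", "status"),
		c.stopArgs(),
		c.removeArgs(),
	}
	for _, args := range plan {
		fmt.Println("docker", strings.Join(args, " "))
	}
}
```

Separating argument construction from execution keeps the command lines testable without Docker installed.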

CLI entry point for the tracker-swebench SWE-bench benchmarking harness. Runs tracker's code agent against SWE-bench Lite instances and records predictions.

Results writer for SWE-bench predictions: appends JSONL predictions and tracks completed instances. Supports resumability by reading existing predictions on open, plus run stats and run metadata helpers.
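One way the resumable results writer might work is sketched below, assuming the standard SWE-bench prediction keys (`instance_id`, `model_name_or_path`, `model_patch`). The `Writer` type and the `OpenWriter`, `Done`, and `Append` names are hypothetical and are not taken from the package. Run stats and metadata helpers are left out.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// Prediction uses the three keys of the standard SWE-bench predictions format.
type Prediction struct {
	InstanceID string `json:"instance_id"`
	Model      string `json:"model_name_or_path"`
	Patch      string `json:"model_patch"`
}

// Writer appends predictions and remembers which instances are already recorded.
type Writer struct {
	f    *os.File
	done map[string]bool
}

// OpenWriter opens (or creates) the file and loads existing instance IDs.
func OpenWriter(path string) (*Writer, error) {
	// O_APPEND sends every write to the end of the file; O_RDWR lets us read it first.
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	w := &Writer{f: f, done: map[string]bool{}}
	sc := bufio.NewScanner(f)
	sc.Buffer(make([]byte, 0, 64*1024), 16*1024*1024) // patches can be large
	for sc.Scan() {
		var p Prediction
		// Lines that do not parse are ignored rather than treated as fatal.
		if json.Unmarshal(sc.Bytes(), &p) == nil && p.InstanceID != "" {
			w.done[p.InstanceID] = true
		}
	}
	if err := sc.Err(); err != nil {
		f.Close()
		return nil, err
	}
	return w, nil
}

// Done reports whether a prediction for id is already recorded.
func (w *Writer) Done(id string) bool { return w.done[id] }

// Append writes one prediction as a single JSON line.
func (w *Writer) Append(p Prediction) error {
	line, err := json.Marshal(p)
	if err != nil {
		return err
	}
	if _, err := w.f.Write(append(line, '\n')); err != nil {
		return err
	}
	w.done[p.InstanceID] = true
	return nil
}

// Close closes the underlying file.
func (w *Writer) Close() error { return w.f.Close() }

// run simulates two passes over the same instances; the second pass skips them.
func run() error {
	dir, err := os.MkdirTemp("", "swebench-sketch-")
	if err != nil {
		return err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "predictions.jsonl")
	for pass := 1; pass <= 2; pass++ {
		w, err := OpenWriter(path)
		if err != nil {
			return err
		}
		for _, id := range []string{"demo-1", "demo-2"} {
			if w.Done(id) {
				fmt.Printf("pass %d: skipping %s\n", pass, id)
				continue
			}
			if err := w.Append(Prediction{InstanceID: id, Model: "sketch"}); err != nil {
				w.Close()
				return err
			}
			fmt.Printf("pass %d: recorded %s\n", pass, id)
		}
		if err := w.Close(); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Because each prediction is written as one complete line, an interrupted run loses at most the instance in flight, and the next run picks up from the recorded IDs.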

Directories

Path Synopsis
In-container agent binary for SWE-bench evaluation harness.
