Documentation
Overview
Package indexer clones and indexes GitHub repositories using the Synapses CLI, caching results to disk so re-runs skip already-indexed repos.
Designed for RepoBench-R: clone the repos referenced in the dataset, index each with `synapses index --path <dir>`, then let the benchmark binary call tools with `?project=<dir>` per sample.
Usage:
	indexer.Run(indexer.Options{
		ReposDir:    "/tmp/repobench_repos",
		CacheFile:   "/tmp/repobench_index_cache.json",
		Repos:       []string{"sissaschool/elementpath", ...},
		Workers:     8,
		SynapsesBin: "/Users/itachi/.synapses/bin/synapses",
	})
Functions
Types
type Cache
	type Cache struct {
		// contains filtered or unexported fields
	}
Cache persists indexed repo paths across runs.
type Options
	type Options struct {
		// ReposDir is the directory where repos are cloned.
		ReposDir string
		// CacheFile is the JSON file tracking which repos have been indexed.
		CacheFile string
		// Repos is the list of "owner/repo" strings to clone and index.
		Repos []string
		// Workers is the number of parallel clone+index workers (default 8).
		Workers int
		// SynapsesBin is the path to the synapses binary (default: auto-detect).
		SynapsesBin string
		// SkipIndex skips the `synapses index` step (clone only).
		SkipIndex bool
		// TimeoutPerRepo is the max time per clone+index operation (default 3 min).
		TimeoutPerRepo time.Duration
		// Verbose prints per-repo progress.
		Verbose bool
	}
Options controls the indexer.
type Result
	type Result struct {
		Repo      string  `json:"repo"`
		LocalPath string  `json:"local_path"`
		Indexed   bool    `json:"indexed"`
		Skipped   bool    `json:"skipped"` // already cached
		Error     string  `json:"error,omitempty"`
		DurationS float64 `json:"duration_s"`
	}
Result holds the outcome of indexing a single repo.
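The struct tags determine how a Result serializes. As a small illustration, the sketch below copies the documented type and encodes two values with `encoding/json`. The helper `encode` and the sample field values, including the local path and the error text, are made up for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Result is copied from the documented type so the example is self-contained.
type Result struct {
	Repo      string  `json:"repo"`
	LocalPath string  `json:"local_path"`
	Indexed   bool    `json:"indexed"`
	Skipped   bool    `json:"skipped"` // already cached
	Error     string  `json:"error,omitempty"`
	DurationS float64 `json:"duration_s"`
}

// encode is a hypothetical helper that returns the compact JSON form of r.
func encode(r Result) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	ok := Result{
		Repo:      "sissaschool/elementpath",
		LocalPath: "/tmp/example/elementpath", // illustrative path only
		Indexed:   true,
		DurationS: 12.5,
	}
	failed := Result{Repo: "sissaschool/elementpath", Error: "clone timed out"}

	for _, r := range []Result{ok, failed} {
		s, err := encode(r)
		if err != nil {
			panic(err)
		}
		fmt.Println(s)
	}
}
```

Because of `omitempty`, the `error` key appears only in the second line of output. The boolean and numeric fields are always emitted, even when they are false or zero.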