measure-injection-classifier

command
v1.0.0-beta.73 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 15, 2026 License: MIT Imports: 11 Imported by: 0

Documentation

Overview

Command measure-injection-classifier runs the Phase 2 ADR-043 measurement harness: load one or more JSONL corpora, classify each record against the loaded corpus, and emit a markdown table of precision / recall per signal bucket plus latency stats.

Two modes:

  • --self-eval: classify the corpus against itself. This is the smoke mode (`task adr043:measure`) and produces the noise floor for the vendored seed. Useful as a regression target for the loader and classifier.

  • --eval-corpus FILE: classify FILE against the configured --corpus. This is the production mode operators use with their own holdout eval set and benign-OSINT slice. The classifier corpus and the eval corpus are independent.

The harness does not call any LLM. Phase 2's T2-vs-T3-only comparison protocol is documented in docs/operations/18-injection-classifier-measurements.md and run by operators with their own LLM endpoint configured.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL