eval

command

Version: v0.2.0 (latest) | Published: Feb 18, 2026 | License: Apache-2.0 | Imports: 17 | Imported by: 0

Repository

github.com/bbiangul/go-reason

Documentation

Overview

Command eval runs evaluation suites against a GoReason engine.

ALTAVision usage:

go run -tags sqlite_fts5 ./cmd/eval \
  --pdf ./docs/ALTAVision.pdf \
  --chat-provider groq \
  --chat-model openai/gpt-oss-120b \
  --difficulty easy

LegalBench-RAG usage:

go run -tags sqlite_fts5 ./cmd/eval \
  --dataset-type legalbench \
  --corpus-dir ./data/legalbench-rag-mini/corpus \
  --benchmark-file ./data/legalbench-rag-mini/benchmarks/cuad.json \
  --benchmark-file ./data/legalbench-rag-mini/benchmarks/contractnli.json \
  --chat-provider groq \
  --chat-model openai/gpt-oss-120b

GDPR usage (Graph RAG):

go run -tags sqlite_fts5 ./cmd/eval \
  --dataset-type gdpr \
  --pdf ~/Downloads/CELEX_32016R0679_EN_TXT.pdf \
  --chat-provider ollama --chat-model llama3.1:8b \
  --embed-provider openai --embed-model text-embedding-3-small \
  --difficulty all

GDPR full-context baseline (Gemini):

go run -tags sqlite_fts5 ./cmd/eval \
  --dataset-type gdpr \
  --pdf ~/Downloads/CELEX_32016R0679_EN_TXT.pdf \
  --full-context \
  --fc-provider gemini --fc-model gemini-2.0-flash \
  --difficulty all

Source Files

main.go