Documentation
Overview
Command eval runs evaluation suites against a GoReason engine.
ALTAVision usage:
    go run -tags sqlite_fts5 ./cmd/eval \
        --pdf ./docs/ALTAVision.pdf \
        --chat-provider groq \
        --chat-model openai/gpt-oss-120b \
        --difficulty easy
LegalBench-RAG usage:
    go run -tags sqlite_fts5 ./cmd/eval \
        --dataset-type legalbench \
        --corpus-dir ./data/legalbench-rag-mini/corpus \
        --benchmark-file ./data/legalbench-rag-mini/benchmarks/cuad.json \
        --benchmark-file ./data/legalbench-rag-mini/benchmarks/contractnli.json \
        --chat-provider groq \
        --chat-model openai/gpt-oss-120b
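The LegalBench-RAG invocation passes --benchmark-file twice, once per benchmark. How cmd/eval parses that flag is not shown on this page; the following is only a minimal sketch, assuming the standard library flag package, of one way a repeatable flag can be collected. The stringList type and the parseBenchmarkFiles helper are hypothetical names used for illustration and are not part of cmd/eval.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// stringList collects every occurrence of a repeated flag.
// It satisfies flag.Value through its String and Set methods.
// Hypothetical type for illustration; not taken from cmd/eval.
type stringList []string

// String joins the collected values for display.
func (s *stringList) String() string { return strings.Join(*s, ",") }

// Set is called once per occurrence of the flag and appends the value.
func (s *stringList) Set(v string) error {
	*s = append(*s, v)
	return nil
}

// parseBenchmarkFiles is a hypothetical helper that parses args and
// returns every --benchmark-file value in the order given.
func parseBenchmarkFiles(args []string) ([]string, error) {
	fs := flag.NewFlagSet("eval", flag.ContinueOnError)
	var files stringList
	fs.Var(&files, "benchmark-file", "benchmark JSON file (may be repeated)")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return files, nil
}

func main() {
	files, err := parseBenchmarkFiles([]string{
		"--benchmark-file", "./data/legalbench-rag-mini/benchmarks/cuad.json",
		"--benchmark-file", "./data/legalbench-rag-mini/benchmarks/contractnli.json",
	})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```

Implementing flag.Value keeps each occurrence instead of letting the last one overwrite the earlier ones, which is what a plain string flag would do.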
GDPR usage (Graph RAG):
    go run -tags sqlite_fts5 ./cmd/eval \
        --dataset-type gdpr \
        --pdf ~/Downloads/CELEX_32016R0679_EN_TXT.pdf \
        --chat-provider ollama --chat-model llama3.1:8b \
        --embed-provider openai --embed-model text-embedding-3-small \
        --difficulty all
GDPR full-context baseline (Gemini):
    go run -tags sqlite_fts5 ./cmd/eval \
        --dataset-type gdpr \
        --pdf ~/Downloads/CELEX_32016R0679_EN_TXT.pdf \
        --full-context \
        --fc-provider gemini --fc-model gemini-2.0-flash \
        --difficulty all