gorag

v1.1.5 (latest) | Published: Mar 31, 2026 | License: MIT | Imports: 23 | Imported by: 0

Repository

github.com/DotNetAge/gorag

README ¶

🦖 GoRAG

The Expert-Grade, High-Performance Modular RAG Framework for Go

GoRAG is a production-ready Retrieval-Augmented Generation (RAG) framework built for high-scale AI engineering. Unlike complex "black-box" frameworks, GoRAG provides a transparent, pipeline-based architecture that combines Go's native concurrency with advanced RAG patterns.

From GraphRAG with automated triple extraction to Agentic RAG with self-correction, GoRAG is designed to move your AI applications from "prototype" to "production" with zero friction.

✨ Why GoRAG?

🚀 Performance First: Built-in concurrent workers and streaming parsers that process files incrementally, so memory use stays constant (O(1)) with respect to file size. Suited to indexing TB-scale knowledge bases.
🏗️ Pipeline-Based Architecture: Powered by gochat/pkg/pipeline. Every retrieval step is explicit, traceable, and pluggable. No more "hidden magic" or deep inheritance hell.
🧠 Smart Intent Routing: Automatically dispatches queries to the most suitable retrieval strategy (Vector, Graph, or Global) based on user intent.
🕸️ Advanced GraphRAG: Native support for Neo4j, SQLite (Zero-CGO), and BoltDB. Includes automated LLM-driven knowledge graph construction.
🔭 Built-in Observability: Comprehensive distributed tracing across all core retrievers and steps. See exactly where your time and tokens go.
📊 Enterprise-Grade Evaluation: Built-in benchmarking protocol for Faithfulness, Answer Relevance, and Context Precision (RAGAS-style).
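
To make the "concurrent workers" point concrete, here is a minimal worker-pool sketch using only the standard library. It is not GoRAG's implementation: `indexAll`, `errUnsupported`, and the `process` callback are hypothetical names standing in for whatever parsing and embedding work an indexer performs per file.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errUnsupported is a hypothetical error used only by this example.
var errUnsupported = errors.New("unsupported format")

// indexAll fans file paths out to a fixed number of workers and collects
// one wrapped error per failed file. process stands in for parse + embed.
func indexAll(paths []string, workers int, process func(string) error) []error {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	var (
		mu   sync.Mutex
		errs []error
		wg   sync.WaitGroup
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for p := range jobs {
				if err := process(p); err != nil {
					mu.Lock()
					errs = append(errs, fmt.Errorf("%s: %w", p, err))
					mu.Unlock()
				}
			}
		}()
	}
	for _, p := range paths {
		jobs <- p
	}
	close(jobs)
	wg.Wait()
	return errs
}

func main() {
	paths := []string{"a.md", "b.pdf", "bad.bin", "c.txt"}
	errs := indexAll(paths, 2, func(p string) error {
		if p == "bad.bin" {
			return errUnsupported
		}
		return nil
	})
	fmt.Println(len(errs), "file(s) failed")
}
```

The unbuffered `jobs` channel provides natural backpressure: the producer only advances as fast as workers pick up files.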

🧰 The RAG "Expert" Ecosystem

GoRAG doesn't just give you tools; it gives you pre-optimized strategies as first-class citizens:

Strategy	When to use	Key Features
Native RAG	Standard semantic search	Vector-only, fast, low cost
Graph RAG	Complex relationship reasoning	Entities, Triples, Multi-hop reasoning
Self-RAG	High accuracy requirements	Self-reflection, Hallucination detection
CRAG	Handling ambiguous queries	Quality evaluation, fallback to Web Search
Fusion RAG	Multi-faceted queries	Query rewriting, RRF fusion
Smart Router	Dynamic workloads	Intent-based automatic dispatching

🚀 Quick Start: Build Industrial RAG in 1 Minute

GoRAG provides a unified RAG application interface that bundles Indexing and Retrieval into one seamless entity. Choose your preset and start building.

1. NativeRAG (Perfect for AI Agents & Local Knowledge Bases)

Pure Go with no external services to run (embedded SQLite + GoVector). One-line setup.

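import "github.com/DotNetAge/gorag"

// ctx is a context.Context, e.g. context.Background().
// 1. One line to create a complete local RAG app
app, err := gorag.DefaultNativeRAG(gorag.WithWorkDir("./my_kb"))
if err != nil {
    log.Fatal(err)
}
defer app.Close()

// 2. Feed it documents (true also indexes subdirectories)
if err := app.IndexDirectory(ctx, "./docs", true); err != nil {
    log.Fatal(err)
}

// 3. Ask questions!
res, err := app.Search(ctx, "What is GoRAG?", 5)
if err != nil {
    log.Fatal(err)
}
fmt.Println(res.Answer)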

2. AdvancedRAG (Enterprise-Grade / High Recall)

Designed for distributed scale. Bundles RAG-Fusion + RRF best practices.

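// Connect to enterprise Milvus and start an Advanced RAG app.
// Note: WithMilvus and WithOpenAI do not appear in the v1.1.5 option list documented below.
app, err := gorag.DefaultAdvancedRAG(
    gorag.WithMilvus("milvus:19530", "kb_collection"),
    gorag.WithOpenAI("sk-xxxx"),
)
if err != nil {
    log.Fatal(err)
}

if err := app.IndexDirectory(ctx, "./enterprise_docs", true); err != nil {
    log.Fatal(err)
}
res, err := app.Search(ctx, "Compare architecture A vs B", 10)
// Check err before using res.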
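
The RRF mentioned above is Reciprocal Rank Fusion, which merges several ranked result lists by summing 1 / (k + rank) for each document across lists. The following is a standalone sketch of that formula, not GoRAG's own code; `rrfFuse` is a hypothetical name, and k = 60 is a commonly used constant rather than a value taken from this package.

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges ranked lists of document IDs with Reciprocal Rank Fusion:
// score(d) = sum over lists of 1 / (k + rank), with rank starting at 1.
func rrfFuse(rankings [][]string, k float64) []string {
	scores := make(map[string]float64)
	for _, list := range rankings {
		for i, id := range list {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b] // deterministic tie-break
	})
	return ids
}

func main() {
	// Two rankings, e.g. one per rewritten query.
	fused := rrfFuse([][]string{
		{"doc1", "doc2", "doc3"},
		{"doc2", "doc4", "doc1"},
	}, 60)
	fmt.Println(fused)
}
```

Because RRF uses only ranks, it can combine lists whose raw scores are not comparable, such as vector similarity and BM25.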

3. GraphRAG (Deep Reasoning / Relational)

Automated Knowledge Graph construction with hybrid vector-graph search.

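// Bundles Neo4j relationship reasoning with vector search.
// Note: WithMilvus and WithNeo4j do not appear in the v1.1.5 option list documented below.
app, err := gorag.DefaultGraphRAG(
    gorag.WithMilvus("milvus:19530", "kb_collection"),
    gorag.WithNeo4j("neo4j://localhost", "user", "pass"),
)
if err != nil {
    log.Fatal(err)
}

if err := app.IndexFile(ctx, "corporate_report.pdf"); err != nil {
    log.Fatal(err)
}
res, err := app.Search(ctx, "How are entity X and Y related?", 5)
// Check err before using res.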

🔭 Built-in Industrial Observability

Stop flying blind. GoRAG natively supports Prometheus and OpenTelemetry to monitor your RAG pipelines in production.

idx, err := indexer.DefaultAdvancedIndexer(vStore, dStore,
    indexer.WithZapLogger("./logs/rag.log", 100, 30, 7, true),   // Industrial Logging
    indexer.WithPrometheusMetrics(":8080"),                      // Metrics
    indexer.WithOpenTelemetryTracer(ctx, "jaeger:4317", "RAG"),  // Distributed Tracing
)
if err != nil {
    log.Fatal(err)
}

⚡ Technical Integrity & Standards

Go 1.24+: Leveraging the latest language features.
Zero-CGO SQLite: Using modernc.org/sqlite for painless cross-compilation.
Clean Architecture: Strict separation of interfaces (pkg/core) and implementations.
Modular Steps: Reuse hyde, rerank, fuse, or prune steps in any custom pipeline.
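
The idea of reusable steps composed into a pipeline can be sketched in a few lines. The types below (`State`, `Step`, `run`) are hypothetical and chosen for illustration; the actual gochat/pkg/pipeline API may look different.

```go
package main

import (
	"fmt"
	"strings"
)

// State is a hypothetical value passed from step to step.
type State struct {
	Query string
	Docs  []string
}

// Step is a hypothetical step signature, not the gochat/pkg/pipeline type.
type Step func(*State) error

// run executes steps in order and stops at the first error.
func run(s *State, steps ...Step) error {
	for i, step := range steps {
		if err := step(s); err != nil {
			return fmt.Errorf("step %d: %w", i, err)
		}
	}
	return nil
}

// normalize trims and lowercases the query.
func normalize(s *State) error {
	s.Query = strings.ToLower(strings.TrimSpace(s.Query))
	return nil
}

// prune returns a step that keeps at most n documents.
func prune(n int) Step {
	return func(s *State) error {
		if len(s.Docs) > n {
			s.Docs = s.Docs[:n]
		}
		return nil
	}
}

func main() {
	s := &State{Query: "  What is GoRAG? ", Docs: []string{"a", "b", "c"}}
	if err := run(s, normalize, prune(2)); err != nil {
		fmt.Println("pipeline failed:", err)
		return
	}
	fmt.Printf("%q %v\n", s.Query, s.Docs)
}
```

Keeping every step behind one small signature is what makes each stage explicit and individually replaceable.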

🤝 Contributing

We aim to build the most robust AI infrastructure for the Go ecosystem. Whether it's a new VectorStore driver or an improved Parser, your PRs are welcome!

Check our Contributing Guidelines.

📄 License

GoRAG is licensed under the MIT License.

Documentation ¶

Overview ¶

Package gorag provides a high-level API for building Retrieval-Augmented Generation (RAG) applications.

This package offers pre-configured RAG implementations with support for multiple retrieval strategies, including Native, Advanced, Agentic, Graph-based, Self-RAG, and Corrective RAG (CRAG) patterns. It uses dependency injection for flexible component customization and supports various vector stores, document stores, and LLM providers.

Quick Start:

rag, err := gorag.DefaultNativeRAG(
    gorag.WithWorkDir("./data"),
    gorag.WithTopK(5),
)
if err != nil {
    log.Fatal(err)
}
defer rag.Close()

// Index documents
if err := rag.IndexFile(ctx, "path/to/document.pdf"); err != nil {
    log.Fatal(err)
}

// Search
result, err := rag.Search(ctx, "your query", 5)
if err != nil {
    log.Fatal(err)
}

Index ¶

func CheckEnvironment(ctx context.Context) (bool, []string)
func PrepareEnvironment(ctx context.Context, ...) error
type RAG
type RAGConfig
type RAGOption

Functions ¶

func CheckEnvironment ¶ added in v1.1.4

func CheckEnvironment(ctx context.Context) (bool, []string)

CheckEnvironment verifies if the GoRAG environment is ready for operation.

func PrepareEnvironment ¶ added in v1.1.4

func PrepareEnvironment(ctx context.Context, progress func(modelName, fileName string, downloaded, total int64)) error

PrepareEnvironment downloads all required models for GoRAG.
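
A progress callback for PrepareEnvironment only needs to match the function signature shown above. The sketch below shows one way to format updates; `formatProgress` and the file names are hypothetical, and treating `total <= 0` as "size unknown" is an assumption, not documented behavior.

```go
package main

import "fmt"

// formatProgress renders one progress update. It guards against total <= 0,
// which this sketch assumes can mean the size is not known in advance.
func formatProgress(modelName, fileName string, downloaded, total int64) string {
	if total <= 0 {
		return fmt.Sprintf("%s/%s: %d bytes", modelName, fileName, downloaded)
	}
	pct := float64(downloaded) / float64(total) * 100
	return fmt.Sprintf("%s/%s: %.1f%%", modelName, fileName, pct)
}

func main() {
	// progress has the same shape as the callback PrepareEnvironment accepts.
	progress := func(modelName, fileName string, downloaded, total int64) {
		fmt.Println(formatProgress(modelName, fileName, downloaded, total))
	}
	progress("bge-small-zh-v1.5", "model.onnx", 512, 2048) // hypothetical file
	progress("bge-small-zh-v1.5", "vocab.txt", 100, 0)     // hypothetical file
}
```

In an application, the `progress` closure would be passed as the second argument to `gorag.PrepareEnvironment`, typically after `gorag.CheckEnvironment` reports that the environment is not ready.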

Types ¶

type RAG ¶

type RAG interface {
	// IndexFile processes a single file and adds it to the knowledge base.
	IndexFile(ctx context.Context, filePath string) error
	// IndexDirectory processes all files in a directory and adds them to the knowledge base.
	// If recursive is true, it will also process subdirectories.
	IndexDirectory(ctx context.Context, dirPath string, recursive bool) error
	// Search performs a retrieval query and returns the top K results.
	Search(ctx context.Context, query string, topK int) (*core.RetrievalResult, error)
	// Container returns the underlying dependency injection container for advanced usage.
	Container() *di.Container
	// Close releases all resources held by the RAG instance.
	Close() error
}

RAG is the application interface and the main entry point for RAG operations.

func DefaultAdvancedRAG ¶

func DefaultAdvancedRAG(opts ...RAGOption) (RAG, error)

DefaultAdvancedRAG creates a high-performance RAG instance optimized for production use. It supports advanced features like query rewriting, fusion, and reranking.

func DefaultAgenticRAG ¶

func DefaultAgenticRAG(opts ...RAGOption) (RAG, error)

DefaultAgenticRAG creates an agentic RAG instance with smart routing capabilities. It automatically selects the best retrieval strategy based on query intent classification.
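
Intent-based routing can be pictured as "classify, then dispatch". The sketch below is an illustration only: it replaces the intent classifier with a keyword heuristic, and `Strategy`, `classify`, and `route` are hypothetical names, not part of the gorag API.

```go
package main

import (
	"fmt"
	"strings"
)

// Strategy names follow the README's "Vector, Graph, or Global" wording.
type Strategy string

const (
	Vector Strategy = "vector"
	Graph  Strategy = "graph"
	Global Strategy = "global"
)

// classify is a keyword heuristic standing in for a real intent classifier.
func classify(query string) Strategy {
	q := strings.ToLower(query)
	switch {
	case strings.Contains(q, "related") || strings.Contains(q, "relationship"):
		return Graph
	case strings.Contains(q, "summarize") || strings.Contains(q, "overall"):
		return Global
	default:
		return Vector
	}
}

// route dispatches the query to the retriever registered for its strategy.
func route(query string, retrievers map[Strategy]func(string) []string) ([]string, error) {
	s := classify(query)
	r, ok := retrievers[s]
	if !ok {
		return nil, fmt.Errorf("no retriever for strategy %q", s)
	}
	return r(query), nil
}

func main() {
	retrievers := map[Strategy]func(string) []string{
		Vector: func(q string) []string { return []string{"vector hit for: " + q} },
		Graph:  func(q string) []string { return []string{"graph hit for: " + q} },
	}
	for _, q := range []string{"What is GoRAG?", "How are X and Y related?", "Summarize the corpus"} {
		hits, err := route(q, retrievers)
		fmt.Println(classify(q), hits, err)
	}
}
```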

func DefaultCRAG ¶ added in v1.1.4

func DefaultCRAG(opts ...RAGOption) (RAG, error)

DefaultCRAG creates a corrective RAG instance with external search fallback. It automatically triggers web search when internal knowledge is insufficient or incorrect.
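
The corrective fallback can be sketched as a simple decision: use internal results when they look good enough, otherwise call an external searcher. Everything below is hypothetical (`Doc`, `correctiveRetrieve`, the "best score" rule); the package's actual quality evaluation is not described on this page.

```go
package main

import "fmt"

// Doc is a hypothetical retrieved document with an evaluator-assigned score.
type Doc struct {
	Text  string
	Score float64 // relevance in [0, 1]
}

// correctiveRetrieve returns internal results when the best one reaches
// minScore, and otherwise falls back to the external searcher.
// The second return value names the source that was used.
func correctiveRetrieve(query string, minScore float64,
	internal, web func(string) ([]Doc, error)) ([]Doc, string, error) {
	docs, err := internal(query)
	if err != nil {
		return nil, "", err
	}
	best := 0.0
	for _, d := range docs {
		if d.Score > best {
			best = d.Score
		}
	}
	if best >= minScore {
		return docs, "internal", nil
	}
	docs, err = web(query)
	return docs, "web", err
}

func main() {
	internal := func(string) ([]Doc, error) { return []Doc{{Text: "stale note", Score: 0.2}}, nil }
	web := func(string) ([]Doc, error) { return []Doc{{Text: "fresh page", Score: 0.9}}, nil }
	docs, source, err := correctiveRetrieve("latest release?", 0.5, internal, web)
	fmt.Println(source, len(docs), err)
}
```

In gorag, the external searcher is supplied through WithWebSearcher.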

func DefaultGraphRAG ¶

func DefaultGraphRAG(opts ...RAGOption) (RAG, error)

DefaultGraphRAG creates a knowledge graph-enhanced RAG instance. It leverages entity relationships and structured knowledge for complex queries.
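
The GraphDepth and GraphLimit settings (see RAGConfig, WithDepth, and WithLimit) suggest a bounded graph walk: a maximum number of hops and a cap on neighbors per entity. The sketch below shows one plausible reading as a breadth-first expansion over an in-memory adjacency map; `expand` is a hypothetical function, and the real traversal over Neo4j or other stores may differ.

```go
package main

import "fmt"

// expand walks the graph breadth-first from start, following at most limit
// neighbors per node and going at most depth hops away from start.
func expand(adj map[string][]string, start string, depth, limit int) []string {
	if limit < 0 {
		limit = 0
	}
	visited := map[string]bool{start: true}
	frontier := []string{start}
	var out []string
	for hop := 0; hop < depth && len(frontier) > 0; hop++ {
		var next []string
		for _, node := range frontier {
			neighbors := adj[node]
			if len(neighbors) > limit {
				neighbors = neighbors[:limit]
			}
			for _, n := range neighbors {
				if !visited[n] {
					visited[n] = true
					out = append(out, n)
					next = append(next, n)
				}
			}
		}
		frontier = next
	}
	return out
}

func main() {
	adj := map[string][]string{
		"Acme":  {"Alice", "Bob", "Carol"},
		"Alice": {"Initech"},
		"Bob":   {"Globex"},
	}
	// Depth 2, limit 2: Carol is skipped because Acme already has two neighbors.
	fmt.Println(expand(adj, "Acme", 2, 2))
}
```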

func DefaultNativeRAG ¶

func DefaultNativeRAG(opts ...RAGOption) (RAG, error)

DefaultNativeRAG creates a lightweight, local-first RAG instance. It uses the default TokenChunker with local SQLite and GoVector stores, which makes it suitable for quick prototyping.

Example:

rag, err := gorag.DefaultNativeRAG(
    gorag.WithWorkDir("./data"),
    gorag.WithTopK(5),
)

func DefaultSelfRAG ¶ added in v1.1.4

func DefaultSelfRAG(opts ...RAGOption) (RAG, error)

DefaultSelfRAG creates a self-correcting RAG instance with reflection capabilities. It evaluates retrieval relevance and generation quality for high-precision answers.
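
How a quality threshold and a retry budget interact (see WithThreshold and WithMaxRetries) can be sketched as a bounded refinement loop. This is an illustration under assumptions, not the package's algorithm: `refine`, `generate`, and `score` are hypothetical, and "maxRetries" is read here as the number of extra attempts after the first.

```go
package main

import "fmt"

// refine calls generate until score reaches threshold or maxRetries extra
// attempts are used up. It returns the best answer seen and its score.
func refine(threshold float32, maxRetries int,
	generate func(attempt int) string, score func(string) float32) (string, float32) {
	var best string
	var bestScore float32 = -1
	for attempt := 0; attempt <= maxRetries; attempt++ {
		answer := generate(attempt)
		s := score(answer)
		if s > bestScore {
			best, bestScore = answer, s
		}
		if s >= threshold {
			break
		}
	}
	return best, bestScore
}

func main() {
	// Stand-ins for an LLM call and a reflection-based quality score.
	generate := func(attempt int) string {
		switch attempt {
		case 0:
			return "draft"
		case 1:
			return "better"
		default:
			return "best"
		}
	}
	score := func(a string) float32 {
		return map[string]float32{"draft": 0.3, "better": 0.8, "best": 0.9}[a]
	}
	answer, s := refine(0.7, 3, generate, score)
	fmt.Println(answer, s)
}
```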

type RAGConfig ¶

type RAGConfig struct {
	// 0. Identity
	// Name specifies a unique name for the RAG instance, used for resource isolation.
	Name string

	// 1. Persistence
	// WorkDir specifies the working directory for storing persistent data (vector DB, document DB, etc.).
	WorkDir string
	// VectorDBType specifies the type of vector store to use (e.g., "govector", "milvus", "pinecone").
	VectorDBType string
	// Dimension specifies the dimension of embedding vectors.
	Dimension int

	// 2. Model "Grounding" (Variable Name Driven)
	// EmbedderType specifies the embedding model provider (e.g., "openai", "ollama", "local-onnx").
	EmbedderType string
	// LLMType specifies the large language model provider (e.g., "openai", "claude", "ollama").
	LLMType string
	// ModelPath specifies the local model file path (typically under .test/models/...).
	ModelPath string
	// ModelName specifies the model name identifier (e.g., "qwen-turbo", "bge-small-zh-v1.5").
	ModelName string
	// APIKeyEnv specifies the environment variable name for the API key (e.g., "DASHSCOPE_API_KEY").
	APIKeyEnv string
	// BaseURL specifies the base URL for API-compatible model providers (e.g., "https://dashscope.aliyuncs.com/compatible-mode/").
	BaseURL string

	// 3. Retrieval Tuning
	// TopK specifies the default number of top results to retrieve during search.
	TopK int

	// 4. Semantic Cache
	// EnableSemanticCache enables semantic caching for improved performance.
	EnableSemanticCache bool
	// SemanticCacheType specifies the cache backend type: "memory" or "bolt".
	SemanticCacheType string

	// 5. Advanced Tuning (Self-RAG & CRAG)
	// SelfRAGThreshold specifies the quality threshold for Self-RAG reflection.
	SelfRAGThreshold float32
	// MaxRetries specifies the maximum number of refinement attempts in Self-RAG.
	MaxRetries int

	// 6. Graph RAG Tuning
	// GraphDepth specifies the search depth in the knowledge graph.
	GraphDepth int
	// GraphLimit specifies the maximum number of related nodes to retrieve per entity.
	GraphLimit int
	// contains filtered or unexported fields
}

RAGConfig is the single source of truth for all RAG modes. It holds configuration parameters for RAG system initialization.

type RAGOption ¶

type RAGOption func(*RAGConfig)

RAGOption is a function type for configuring RAG instances using the functional options pattern.
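
For readers new to the functional options pattern, the sketch below reproduces its shape with a small stand-in type. `Config`, `Option`, and `newConfig` are hypothetical, and the default values are invented for illustration; only the overall pattern mirrors `RAGOption func(*RAGConfig)`.

```go
package main

import "fmt"

// Config is a small stand-in for RAGConfig with two of its documented fields.
type Config struct {
	WorkDir string
	TopK    int
}

// Option mutates a Config, mirroring the shape of RAGOption.
type Option func(*Config)

// WithWorkDir sets the working directory.
func WithWorkDir(path string) Option { return func(c *Config) { c.WorkDir = path } }

// WithTopK sets the default number of results.
func WithTopK(k int) Option { return func(c *Config) { c.TopK = k } }

// newConfig starts from defaults (hypothetical values) and applies options
// in order, so later options override earlier ones.
func newConfig(opts ...Option) Config {
	cfg := Config{WorkDir: "./data", TopK: 5}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	fmt.Printf("%+v\n", newConfig())
	fmt.Printf("%+v\n", newConfig(WithWorkDir("./my_kb"), WithTopK(10)))
}
```

The pattern keeps constructors such as `DefaultNativeRAG(opts ...RAGOption)` stable as new settings are added: callers pass only what they want to change.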

func WithAPIKeyEnv ¶

func WithAPIKeyEnv(name string) RAGOption

WithAPIKeyEnv sets the environment variable name for API key.
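
Passing the name of an environment variable, rather than the key itself, keeps secrets out of source code and configuration files. A minimal sketch of the lookup side, using only the standard library; `resolveAPIKey` is a hypothetical helper and not part of gorag.

```go
package main

import (
	"fmt"
	"os"
)

// resolveAPIKey reads the API key from the named environment variable and
// reports an error when the variable is unset or empty.
func resolveAPIKey(envName string) (string, error) {
	key, ok := os.LookupEnv(envName)
	if !ok || key == "" {
		return "", fmt.Errorf("environment variable %s is not set", envName)
	}
	return key, nil
}

func main() {
	key, err := resolveAPIKey("DASHSCOPE_API_KEY")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("key length:", len(key)) // never print the key itself
}
```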

func WithBaseURL ¶

func WithBaseURL(url string) RAGOption

WithBaseURL sets the base URL for API-compatible providers.

func WithContainer ¶

func WithContainer(ctr *di.Container) RAGOption

WithContainer injects a custom dependency injection container.

func WithDepth ¶ added in v1.1.4

func WithDepth(d int) RAGOption

WithDepth sets the search depth for Graph RAG.

func WithDimension ¶

func WithDimension(dim int) RAGOption

WithDimension sets the embedding vector dimension.

func WithEmbedder ¶

func WithEmbedder(e embedding.Provider) RAGOption

WithEmbedder injects a custom embedding provider (expert injection).

func WithLLM ¶

func WithLLM(l gochat.Client) RAGOption

WithLLM injects a custom LLM client.

func WithLimit ¶ added in v1.1.4

func WithLimit(l int) RAGOption

WithLimit sets the neighbor limit for Graph RAG.

func WithMaxRetries ¶ added in v1.1.4

func WithMaxRetries(r int) RAGOption

WithMaxRetries sets the maximum refinement attempts for Self-RAG.

func WithMetrics ¶ added in v1.1.4

func WithMetrics(m core.Metrics) RAGOption

WithMetrics injects a custom metrics collector.

func WithModelName ¶

func WithModelName(name string) RAGOption

WithModelName sets the model name identifier.

func WithModelPath ¶

func WithModelPath(path string) RAGOption

WithModelPath sets the local model file path.

func WithName ¶ added in v1.1.4

func WithName(name string) RAGOption

WithName sets a unique name for the RAG instance for resource isolation.

func WithParsers ¶

func WithParsers(p ...core.Parser) RAGOption

WithParsers injects custom document parsers.

func WithSemanticCache ¶ added in v1.1.4

func WithSemanticCache(enable bool, cacheType ...string) RAGOption

WithSemanticCache enables semantic caching with specified backend type. cacheType can be "memory" (default, in-memory) or "bolt" (persistent).
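
The variadic `cacheType ...string` parameter is a common Go idiom for an optional argument. The sketch below shows how such an option could default to "memory" when the argument is omitted; `cacheConfig` and `withSemanticCache` are hypothetical stand-ins, and the handling of an empty string is an assumption.

```go
package main

import "fmt"

// cacheConfig is a stand-in for the two cache-related RAGConfig fields.
type cacheConfig struct {
	Enabled bool
	Type    string
}

// withSemanticCache mimics the documented signature: cacheType is optional
// and falls back to "memory" when omitted (or empty, in this sketch).
func withSemanticCache(enable bool, cacheType ...string) func(*cacheConfig) {
	return func(c *cacheConfig) {
		c.Enabled = enable
		c.Type = "memory"
		if len(cacheType) > 0 && cacheType[0] != "" {
			c.Type = cacheType[0]
		}
	}
}

func main() {
	var a, b cacheConfig
	withSemanticCache(true)(&a)
	withSemanticCache(true, "bolt")(&b)
	fmt.Printf("%+v %+v\n", a, b)
}
```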

func WithThreshold ¶ added in v1.1.4

func WithThreshold(t float32) RAGOption

WithThreshold sets the quality threshold for Self-RAG reflection.

func WithTopK ¶

func WithTopK(k int) RAGOption

WithTopK sets the default number of top results to retrieve.

func WithTracer ¶ added in v1.1.4

func WithTracer(t observability.Tracer) RAGOption

WithTracer injects a custom distributed tracer.

func WithWebSearcher ¶ added in v1.1.4

func WithWebSearcher(s core.WebSearcher) RAGOption

WithWebSearcher sets the external search engine for CRAG.

func WithWorkDir ¶

func WithWorkDir(path string) RAGOption

WithWorkDir sets the working directory for persistent storage.

Source Files ¶

gorag.go

Directories ¶

Path	Synopsis
cmd/gorag	command
pkg/core	Package core defines the fundamental entities, interfaces, and types for the goRAG framework.
pkg/core/agent
pkg/core/env
pkg/core/store
pkg/di	Package di provides a lightweight dependency injection container for managing component lifecycle.
pkg/generation/citation
pkg/generation/evaluator
pkg/indexer	Package indexer provides high-level indexers for building RAG pipelines.
pkg/indexing	Package indexing provides the core indexing pipeline for offline data preparation.
pkg/indexing/chunker
pkg/indexing/parser/base
pkg/indexing/parser/config
pkg/indexing/parser/config/types
pkg/indexing/parser/csv
pkg/indexing/parser/dbschema
pkg/indexing/parser/docx
pkg/indexing/parser/email
pkg/indexing/parser/excel
pkg/indexing/parser/gocode
pkg/indexing/parser/html
pkg/indexing/parser/image
pkg/indexing/parser/javacode
pkg/indexing/parser/jscode
pkg/indexing/parser/json
pkg/indexing/parser/log
pkg/indexing/parser/markdown
pkg/indexing/parser/pdf
pkg/indexing/parser/ppt
pkg/indexing/parser/pycode
pkg/indexing/parser/text
pkg/indexing/parser/tscode
pkg/indexing/parser/xml
pkg/indexing/parser/yaml
pkg/indexing/store/bolt
pkg/indexing/store/gograph
pkg/indexing/store/memgraph
pkg/indexing/store/neo4j
pkg/indexing/store/sqlite
pkg/indexing/vectorstore/govector
pkg/indexing/vectorstore/memory
pkg/indexing/vectorstore/milvus
pkg/indexing/vectorstore/pinecone
pkg/indexing/vectorstore/qdrant
pkg/indexing/vectorstore/weaviate
pkg/logging
pkg/observability
pkg/resilience
pkg/retrieval/answer	Package answer provides answer generation utilities for RAG systems.
pkg/retrieval/cache
pkg/retrieval/enhancement	Package enhancement provides query and document enhancement utilities for RAG systems.
pkg/retrieval/expand
pkg/retrieval/fusion
pkg/retrieval/graph	Package graph provides graph-related utilities for RAG systems.
pkg/retrieval/query
pkg/retrieval/rerank
pkg/retrieval/service
pkg/retriever/advanced
pkg/retriever/agentic
pkg/retriever/crag
pkg/retriever/graph
pkg/retriever/native
pkg/retriever/selfrag
pkg/steps/cache
pkg/steps/crag	Package crag provides evaluation steps for RAG retrieval quality assessment.
pkg/steps/decompose	Package decompose provides query decomposition steps for RAG retrieval pipelines.
pkg/steps/dedup
pkg/steps/enrich
pkg/steps/filter	Package filter provides query preprocessing steps for RAG pipelines.
pkg/steps/fuse	Package fuse provides result fusion steps for RAG retrieval pipelines.
pkg/steps/generate	Package generate provides answer generation steps for RAG pipelines.
pkg/steps/hyde
pkg/steps/image	Package image provides image retrieval steps for multimodal RAG pipelines.
pkg/steps/indexing	Package indexing provides document indexing pipeline steps for RAG data preparation.
pkg/steps/prune
pkg/steps/rerank	Package rerank provides reranking steps for RAG retrieval pipelines.
pkg/steps/rewrite	Package rewrite provides query rewriting steps for RAG retrieval pipelines.
pkg/steps/sparse	Package sparse provides sparse retrieval steps using the BM25 algorithm.
pkg/steps/stepback	Package stepback provides query abstraction steps for RAG pipelines.
pkg/steps/vector
