gorag

v1.0.4
Published: Mar 17, 2026 License: MIT Imports: 0 Imported by: 0

README

🦖 GoRAG

A Production-Ready, High-Performance Modular RAG Framework for Go


English | 中文文档


GoRAG is an enterprise-grade Retrieval-Augmented Generation (RAG) framework written entirely in Go. Designed for developers who are tired of Python dependency hell and slow async loops, GoRAG brings high concurrency, memory efficiency, and static type safety to the AI engineering world.

Whether you are building a simple document Q&A bot or a complex Agentic RAG system with multi-hop reasoning, GoRAG provides the foundational building blocks you need with zero bloat.

✨ Why GoRAG?

  • 🚀 Blazing Fast: Built-in concurrent workers (10+ goroutines by default) and streaming parsers with O(1) memory footprint. Effortlessly index 100M+ scale document repositories.
  • 🧩 Lego-like Modularity: Strictly follows Clean Architecture. Swap out LLMs, Vector Stores, or Document Parsers with a single line of code.
  • 🧠 Advanced RAG Patterns Built-in: Out-of-the-box support for HyDE, RAG-Fusion, Semantic Chunking, Cross-Encoder Reranking, and Context Pruning.
  • ☁️ Cloud-Native & Production-Ready: Compiles to a single binary. Features built-in circuit breakers, rate limiters, graceful degradation, and observability metrics.
  • 📦 Zero-Dependency Quickstart: Deeply integrated with govector (a pure-Go embedded vector database) and gochat (a unified LLM SDK). Run a 100% local, privacy-first RAG pipeline without deploying external databases like Milvus or Qdrant.

🧰 Ecosystem & Integrations

🤖 LLM Providers (Powered by gochat)

  • Global: OpenAI, Anthropic (Claude 3), Azure OpenAI.
  • Local/Open-Source: Ollama (Llama 3, Qwen, Mistral, etc.).
  • Chinese AI: Kimi, DeepSeek, GLM-4, Minimax, Baichuan, etc.

🗄️ Vector Databases

  • govector 🌟 (Pure Go embedded vector store - Zero external dependencies!)
  • Milvus / Zilliz (Enterprise standard)
  • Qdrant (High-performance Rust engine)
  • Weaviate (Leading semantic search)
  • Pinecone (Fully managed cloud DB)

📄 Universal Parsers

Native streaming support for 16+ formats including: Text, PDF, DOCX, Markdown, HTML, CSV, JSON, and source code (Go, Python, Java, TS/JS).


🚀 Quick Start

Installation

go get github.com/DotNetAge/gorag

10 Lines to Your Private Knowledge Base

Using Ollama and our built-in govector engine, you can build a 100% local, privacy-first RAG system without any API keys or external database deployments:

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/DotNetAge/gochat/pkg/client/base"
    "github.com/DotNetAge/gochat/pkg/client/ollama"
    "github.com/DotNetAge/gorag/rag"
    "github.com/DotNetAge/gorag/vectorstore/govector"
)

func main() {
    ctx := context.Background()

    // 1. Init LLM Client (via gochat)
    llmClient, err := ollama.New(ollama.Config{
        Config: base.Config{Model: "qwen:0.5b"},
    })
    if err != nil {
        log.Fatal(err)
    }

    // 2. Init Pure-Go Vector Store (Zero dependencies)
    vectorStore, err := govector.NewStore(govector.Config{
        Dimension:  1536,
        Collection: "my_knowledge",
    })
    if err != nil {
        log.Fatal(err)
    }

    // 3. Build RAG Engine
    engine, err := rag.New(
        rag.WithLLM(llmClient),
        rag.WithVectorStore(vectorStore),
    )
    if err != nil {
        log.Fatal(err)
    }

    // 4. Index your private data (Auto-chunking & vectorization)
    engine.Index(ctx, rag.Source{
        Type:    "text",
        Content: "GoRAG is a high-performance RAG framework written in pure Go.",
    })

    // 5. Query
    resp, err := engine.Query(ctx, "What is GoRAG?", rag.QueryOptions{TopK: 3})
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println("Answer:", resp.Answer)
}
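
The `rag.New(rag.WithLLM(...), rag.WithVectorStore(...))` call above has the shape of Go's functional options pattern, which is one common way to let callers swap components with a single line. The sketch below illustrates the pattern in isolation; the `Engine`, `Option`, `LLM`, and `VectorStore` types here are simplified stand-ins defined for this example and are not GoRAG's actual definitions.

```go
package main

import (
	"errors"
	"fmt"
)

// LLM and VectorStore are hypothetical stand-ins for pluggable components.
type LLM interface{ Name() string }
type VectorStore interface{ Name() string }

// Engine holds whichever components the caller plugged in.
type Engine struct {
	llm   LLM
	store VectorStore
}

// Option mutates an Engine during construction.
type Option func(*Engine)

func WithLLM(l LLM) Option                 { return func(e *Engine) { e.llm = l } }
func WithVectorStore(s VectorStore) Option { return func(e *Engine) { e.store = s } }

// New applies the options in order and validates the result.
func New(opts ...Option) (*Engine, error) {
	e := &Engine{}
	for _, opt := range opts {
		opt(e)
	}
	if e.llm == nil || e.store == nil {
		return nil, errors.New("engine needs both an LLM and a vector store")
	}
	return e, nil
}

// named is a trivial implementation used only for the demo.
type named string

func (n named) Name() string { return string(n) }

func main() {
	e, err := New(WithLLM(named("ollama")), WithVectorStore(named("govector")))
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(e.llm.Name(), "+", e.store.Name())
}
```

Switching backends then means changing one argument, for example passing a different value to `WithVectorStore`, while the rest of the program stays untouched.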

High-Concurrency Directory Indexing

Need to process a massive codebase or 50GB of company documents? GoRAG handles it concurrently:

// 🚀 Index an entire directory in one call.
// Auto-detects .pdf, .go, .md, .docx, etc., and routes each file to the correct parser.
if err := engine.IndexDirectory(ctx, "./my-company-docs"); err != nil {
    log.Fatal(err)
}

// Stream the response back (typewriter effect for frontend UX)
ch, err := engine.QueryStream(ctx, "Summarize the Q3 financial report from the docs", rag.QueryOptions{
    Stream: true,
})
if err != nil {
    log.Fatal(err)
}

for resp := range ch {
    fmt.Print(resp.Chunk)
}
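
Routing a file to a parser by its type can be as simple as a lookup on the file extension. The sketch below shows one way this might work; the `parserByExt` table, the parser names in it, and `parserFor` are hypothetical and are not taken from GoRAG's source.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// parserByExt is a hypothetical routing table; the parser names are placeholders.
var parserByExt = map[string]string{
	".pdf":  "pdf",
	".docx": "docx",
	".md":   "markdown",
	".html": "html",
	".csv":  "csv",
	".json": "json",
	".go":   "code",
	".py":   "code",
}

// parserFor returns the parser name for a path, matching the extension
// case-insensitively. The boolean is false for unsupported file types.
func parserFor(path string) (string, bool) {
	ext := strings.ToLower(filepath.Ext(path))
	name, ok := parserByExt[ext]
	return name, ok
}

func main() {
	for _, p := range []string{"docs/Report.PDF", "cmd/main.go", "notes.xyz"} {
		if name, ok := parserFor(p); ok {
			fmt.Printf("%s -> %s parser\n", p, name)
		} else {
			fmt.Printf("%s -> skipped (unsupported)\n", p)
		}
	}
}
```

A directory indexer could combine this lookup with `filepath.WalkDir` and skip files for which no parser is registered.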

⚡ Advanced Capabilities

GoRAG is not just a glue framework; it implements cutting-edge retrieval paradigms natively:

  • Agentic RAG / CRAG: Intelligent routing, self-reflection, and fallback mechanisms for complex queries.
  • RAG-Fusion & Multi-Query: Rewrites the user query from multiple perspectives, retrieves results for each rewrite, and merges them with Reciprocal Rank Fusion (RRF) for higher accuracy.
  • Context Pruning & Cross-Encoder: Extracts only the most relevant sentences from chunks and reranks them, saving LLM tokens and reducing hallucinations.
  • Graph RAG: Native support for Neo4j and ArangoDB for cross-node multi-hop reasoning.

🛠️ CLI Tool

GoRAG comes with a powerful built-in CLI for rapid testing and administration:

# Install the CLI
go install github.com/DotNetAge/gorag/cmd/gorag@latest

# Index a file directly from the terminal
gorag index --api-key $OPENAI_API_KEY --file ./docs/architecture.md

# Query your knowledge base
gorag query --api-key $OPENAI_API_KEY "How does the circuit breaker work?"

📈 Roadmap

  • Core architecture and pluggable interfaces
  • Advanced Enhancers (HyDE, Context Pruning, Reranking)
  • Native Graph RAG integration (Neo4j, ArangoDB)
  • Multi-modal RAG (Image & Video indexing)
  • Enterprise Dashboard and API server

🀝 Contributing

We welcome contributions! Whether it's adding a new vector store driver, improving the documentation, or fixing a bug, please check out our Contributing Guidelines.

Give us a ⭐️ if this project helped you build faster and safer AI applications!

📄 License

GoRAG is licensed under the MIT License.

Documentation

Overview

Package gorag provides a Retrieval-Augmented Generation (RAG) framework for Go.

GoRAG is a comprehensive framework for building RAG applications that combine large language models (LLMs) with vector databases for efficient information retrieval.

Key features include:

  • Circuit breaker pattern for service resilience
  • Graceful degradation for unreliable services
  • Lazy loading for efficient memory usage
  • Observability with metrics, logging, and tracing
  • Plugin system for extensibility
  • Connection pooling for efficient resource management
  • Support for multiple vector stores (Memory, Milvus, Pinecone, Qdrant, Weaviate)
  • Support for multiple embedding providers (Cohere, Ollama, OpenAI, Voyage)
  • Support for multiple LLM clients (Anthropic, Azure OpenAI, Ollama, OpenAI)
  • Support for multiple document parsers (CSV, JSON, Markdown, PDF, etc.)

To get started, see the examples in the cmd/gorag directory or refer to the documentation in the docs directory.

Directories

Path                        Synopsis
examples
  01_native_rag             Package nativerag demonstrates a basic Native RAG pipeline using GoRAG Steps.
  02_hybrid_rag (command)   Package hybridrag demonstrates how to compose Hybrid RAG pipeline with multiple retrieval strategies.
infra
  chunker/semantic          Package semantic provides semantic chunking utilities for RAG systems.
  enhancer                  Package enhancer provides query and document enhancement utilities for RAG systems.
  graph                     Package graph provides graph-related utilities for RAG systems.
  searcher/agentic          Package agentic provides the Agentic RAG searcher:
  searcher/core             Package core provides shared default factory functions used by all searcher sub-packages (native, rerank, hybrid, multimodal, agentic, graphlocal, graphglobal, multiagent).
  searcher/crag             Package crag provides CRAG (Corrective RAG) implementation.
  searcher/graphglobal      Package graphglobal provides the Graph-Global RAG searcher (community-summary-based macro pipeline):
  searcher/graphlocal       Package graphlocal provides the Graph-Local RAG searcher (entity-centric N-Hop pipeline):
  searcher/hybrid           Package hybrid provides the Hybrid RAG searcher:
  searcher/hyde             Package hyde provides HyDE (Hypothetical Document Embeddings) RAG implementation.
  searcher/multiagent       Package multiagent provides the Multi-Agent RAG searcher:
  searcher/native           Package native provides the NativeRAG searcher:
  searcher/selfquery        Package selfquery provides Self-Query RAG implementation.
  searcher/stepback         Package stepback provides StepBack RAG implementation.
  steps/agentic             Package agentic provides agentic orchestration steps for autonomous RAG workflows.
  steps/post_retrieval      Package post_retrieval provides steps that process and optimize retrieval results.
  steps/pre_retrieval       Package pre_retrieval provides query enhancement steps that occur before retrieval.
  steps/retrieval           Package retrieval provides retrieval strategy steps that execute different search algorithms.
pkg
  di
  domain/abstraction        Package abstraction defines the storage abstraction interfaces for the goRAG framework.
  domain/entity             Package entity defines the core entities for the goRAG framework.
  usecase/dataprep          Package dataprep provides data preparation utilities for RAG systems.
