DeepSeek RAG Example
This example demonstrates how to build a Retrieval-Augmented Generation (RAG) system using the DeepSeek LLM and the GoAgent framework, from basic setup through production considerations.
Overview
This example showcases:
- Basic RAG Setup: Initialize DeepSeek LLM and Qdrant vector store
- Document Management: Add and manage knowledge base documents
- Semantic Search: Retrieve relevant documents using vector similarity
- RAG Chain: Combine retrieval with generation for contextual answers
- Advanced Features:
- TopK configuration for controlling result count
- Score threshold filtering for quality control
- Multi-query retrieval for improved recall
- Document reranking strategies (MMR, Cross-Encoder, Rank Fusion)
- Custom prompt templates
Prerequisites
1. DeepSeek API Key
Get your API key from DeepSeek:
export DEEPSEEK_API_KEY="your-api-key-here"
2. Qdrant Vector Database
You have several options to run Qdrant:
Option A: Docker (Recommended)
docker run -p 6333:6333 -p 6334:6334 \
-v $(pwd)/qdrant_storage:/qdrant/storage:z \
qdrant/qdrant
Option B: Qdrant Cloud
- Sign up at Qdrant Cloud
- Create a cluster
- Get your connection URL and API key
- Set environment variables:
export QDRANT_URL="https://your-cluster.qdrant.io:6334"
export QDRANT_API_KEY="your-api-key"
Option C: Local Installation
# macOS
brew install qdrant
# Linux
wget https://github.com/qdrant/qdrant/releases/download/v1.7.0/qdrant-x86_64-unknown-linux-gnu.tar.gz
tar -xvf qdrant-x86_64-unknown-linux-gnu.tar.gz
./qdrant
3. Go Dependencies
Ensure you have Go 1.25.0 or later, then fetch the module dependencies:
go mod download
Running the Example
Basic Execution
cd examples/rag
go run deepseek_rag_example.go
With Custom Qdrant URL
QDRANT_URL="localhost:6334" go run deepseek_rag_example.go
With Qdrant Cloud
QDRANT_URL="https://your-cluster.qdrant.io:6334" \
QDRANT_API_KEY="your-api-key" \
go run deepseek_rag_example.go
Example Output
=== Step 1: Setting up DeepSeek LLM Client ===
✓ DeepSeek client initialized successfully
=== Step 2: Setting up Qdrant Vector Store ===
✓ Qdrant vector store initialized successfully
=== Step 3: Adding Sample Documents ===
✓ Sample documents added successfully
=== Step 4: Creating RAG Retriever ===
✓ RAG retriever created successfully
=== Step 5: Creating RAG Chain ===
✓ RAG chain created successfully
=== Step 6: Basic RAG Query ===
Query: What is machine learning and how does it work?
Retrieving relevant documents and generating answer...
Answer:
Machine Learning is a subset of artificial intelligence that enables systems
to learn and improve from experience without being explicitly programmed...
[Detailed answer generated by DeepSeek based on retrieved documents]
Query completed in: 2.3s
=== Step 7: TopK Configuration ===
--- TopK = 2 ---
Retrieved 2 documents:
1. [Score: 0.8542] Machine Learning is a subset of artificial intelligence...
2. [Score: 0.7834] Deep Learning is a specialized subset of machine learning...
=== Step 8: Score Threshold Filtering ===
--- Score Threshold = 0.50 ---
Retrieved 3 documents (filtered by threshold):
1. [Score: 0.8542] Topic: machine_learning
2. [Score: 0.7834] Topic: deep_learning
3. [Score: 0.6123] Topic: neural_networks
=== Step 9: Multi-Query Retrieval ===
Original Query: How do neural networks learn?
Generating query variations and retrieving documents...
Retrieved 5 unique documents (merged from multiple queries)...
=== Step 10: Document Reranking ===
Query: What are the applications of artificial intelligence?
--- Original Ranking (by similarity score) ---
--- MMR Reranking (lambda=0.7) ---
--- Cross-Encoder Reranking ---
--- Rank Fusion (RRF) ---
=== Step 11: Advanced RAG with Custom Prompts ===
Query: Explain transformers in simple terms
Using custom educational prompt template...
Custom Prompt Response: [Educational explanation tailored for beginners]
=== RAG Demo Completed Successfully ===
Configuration Options
RAG Retriever Configuration
config := retrieval.RAGRetrieverConfig{
VectorStore: store, // Vector store instance
TopK: 4, // Number of documents to retrieve
ScoreThreshold: 0.3, // Minimum similarity score (0-1)
IncludeMetadata: true, // Include document metadata
MaxContentLength: 500, // Max characters per document
}
DeepSeek LLM Configuration
config := &llm.Config{
Provider: llm.ProviderDeepSeek,
APIKey: apiKey,
Model: "deepseek-chat", // or "deepseek-coder"
Temperature: 0.7, // 0.0 = deterministic, 1.0 = creative
MaxTokens: 2000, // Max response length
Timeout: 60, // Request timeout in seconds
}
Qdrant Configuration
config := retrieval.QdrantConfig{
URL: "localhost:6334", // Qdrant server URL
APIKey: "", // Optional API key
CollectionName: "my_knowledge_base", // Collection name
VectorSize: 384, // Embedding dimension
Distance: "cosine", // cosine, euclidean, or dot
Embedder: embedder, // Embedding model
}
Features Demonstrated
1. Basic RAG Query
Retrieves relevant documents and generates contextual answers:
ragChain := retrieval.NewRAGChain(ragRetriever, llmClient)
answer, err := ragChain.Run(ctx, "What is machine learning?")
2. TopK Configuration
Control the number of retrieved documents:
retriever.SetTopK(4) // Retrieve top 4 documents
docs, err := retriever.Retrieve(ctx, query)
3. Score Threshold Filtering
Filter low-quality results:
retriever.SetScoreThreshold(0.5) // Only keep scores >= 0.5
docs, err := retriever.Retrieve(ctx, query)
4. Multi-Query Retrieval
Generate query variations for improved recall:
multiQueryRetriever := retrieval.NewRAGMultiQueryRetriever(
baseRetriever,
3, // Generate 3 variations
llmClient, // Use LLM to generate variations
)
docs, err := multiQueryRetriever.Retrieve(ctx, query)
5. Document Reranking
MMR Reranking (Maximal Marginal Relevance)
Balances relevance and diversity:
mmrReranker := retrieval.NewMMRReranker(
0.7, // Lambda: 0.0 = diversity, 1.0 = relevance
4, // TopN results
)
reranked, err := mmrReranker.Rerank(ctx, query, docs)
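To make the lambda trade-off concrete, here is a minimal, self-contained MMR selection over precomputed scores (this illustrates the algorithm, not the internals of NewMMRReranker):

```go
package main

import "fmt"

// mmrSelect greedily picks topN document indices, scoring each
// candidate as lambda*rel[d] - (1-lambda)*max(sim[d][s]) over the
// already-selected set s. lambda=1 is pure relevance, lambda=0 pure
// diversity. rel holds query similarity; sim is pairwise similarity.
func mmrSelect(rel []float64, sim [][]float64, lambda float64, topN int) []int {
	selected := []int{}
	remaining := map[int]bool{}
	for i := range rel {
		remaining[i] = true
	}
	for len(selected) < topN && len(remaining) > 0 {
		best, bestScore := -1, 0.0
		for d := range remaining {
			maxSim := 0.0
			for _, s := range selected {
				if sim[d][s] > maxSim {
					maxSim = sim[d][s]
				}
			}
			score := lambda*rel[d] - (1-lambda)*maxSim
			if best == -1 || score > bestScore {
				best, bestScore = d, score
			}
		}
		selected = append(selected, best)
		delete(remaining, best)
	}
	return selected
}

func main() {
	rel := []float64{0.9, 0.85, 0.5} // similarity of each doc to the query
	sim := [][]float64{              // pairwise document similarity
		{1.0, 0.95, 0.1},
		{0.95, 1.0, 0.1},
		{0.1, 0.1, 1.0},
	}
	// Doc 1 is nearly a duplicate of doc 0, so with lambda=0.7 MMR
	// picks the less relevant but more diverse doc 2 second.
	fmt.Println(mmrSelect(rel, sim, 0.7, 2)) // [0 2]
}
```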
Cross-Encoder Reranking
Uses a cross-encoder model for precise relevance scoring:
ceReranker := retrieval.NewCrossEncoderReranker(
"cross-encoder/ms-marco-MiniLM-L-6-v2",
4, // TopN results
)
reranked, err := ceReranker.Rerank(ctx, query, docs)
Rank Fusion
Combines multiple ranking strategies:
rankFusion := retrieval.NewRankFusion("rrf") // Reciprocal Rank Fusion
fusedDocs := rankFusion.Fuse([][]*retrieval.Document{
ranking1,
ranking2,
ranking3,
})
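Reciprocal Rank Fusion scores each document as the sum of 1/(k + rank) over every ranking it appears in (1-based rank; k=60 is the common default), so documents that rank well in several lists rise to the top. A self-contained sketch of the technique, independent of the NewRankFusion API:

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse merges several rankings of document IDs with Reciprocal
// Rank Fusion: score(id) = sum over rankings of 1/(k + rank).
func rrfFuse(rankings [][]string, k float64) []string {
	scores := map[string]float64{}
	for _, ranking := range rankings {
		for i, id := range ranking {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]string, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(a, b int) bool {
		if scores[ids[a]] != scores[ids[b]] {
			return scores[ids[a]] > scores[ids[b]]
		}
		return ids[a] < ids[b] // deterministic tie-break
	})
	return ids
}

func main() {
	fused := rrfFuse([][]string{
		{"ml", "dl", "nn"},  // e.g. vector-search ranking
		{"dl", "rag", "ml"}, // e.g. keyword-search ranking
	}, 60)
	// "dl" ranks high in both lists, so it fuses to the top.
	fmt.Println(fused)
}
```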
6. Custom Prompt Templates
Create domain-specific prompt templates:
customTemplate := `You are an AI tutor helping students.
Reference Materials:
{documents}
Question: {query}
Provide a beginner-friendly explanation.`
formattedPrompt, err := retriever.RetrieveAndFormat(ctx, query, customTemplate)
Knowledge Base Documents
The example includes sample documents about:
- Machine Learning fundamentals
- Deep Learning concepts
- Natural Language Processing
- Computer Vision
- Reinforcement Learning
- Neural Networks architecture
- Transformers architecture
- RAG techniques
Production Considerations
1. Embedding Models
For production use, replace the SimpleEmbedder with a real embedding model:
// Using OpenAI embeddings
embedder := openai.NewEmbedder(openaiClient, "text-embedding-3-small")
// Or using other models via LangChain
embedder := langchain.NewEmbedder("sentence-transformers/all-MiniLM-L6-v2")
2. Scalability
- Use Qdrant Cloud for production workloads
- Enable horizontal scaling with multiple Qdrant nodes
- Implement caching for frequently accessed documents
- Use batch operations for bulk document uploads
3. Error Handling
// Implement retry logic with simple linear backoff
var docs []*retrieval.Document
var err error
maxRetries := 3
for i := 0; i < maxRetries; i++ {
    docs, err = retriever.Retrieve(ctx, query) // assign, don't shadow
    if err == nil {
        break
    }
    time.Sleep(time.Second * time.Duration(i+1))
}
// Handle context cancellation
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
4. Monitoring
// Add observability
import "github.com/kart-io/goagent/observability"
// Enable tracing
tracer := observability.NewTracer("rag-service")
ctx = tracer.StartSpan(ctx, "rag_query")
defer tracer.EndSpan(ctx)
// Track metrics
metrics.RecordRetrievalLatency(elapsed)
metrics.RecordDocumentCount(len(docs))
5. Document Chunking
For large documents, implement chunking:
// Split documents into chunks
chunker := document.NewRecursiveTextSplitter(
1000, // Chunk size
200, // Overlap
)
chunks := chunker.Split(largeDocument)
// Add all chunks to the vector store in a single batch call
store.AddDocuments(ctx, chunks)
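The chunking idea can be shown with a minimal fixed-size splitter. This sketches the mechanics only (a real recursive splitter prefers paragraph/sentence boundaries, and this version indexes bytes, so it assumes ASCII-ish text):

```go
package main

import "fmt"

// splitText cuts text into chunks of at most chunkSize bytes, with
// each chunk repeating the last overlap bytes of the previous one so
// context is not lost at chunk boundaries.
func splitText(text string, chunkSize, overlap int) []string {
	if chunkSize <= 0 || overlap >= chunkSize {
		return nil // invalid configuration
	}
	var chunks []string
	step := chunkSize - overlap
	for start := 0; start < len(text); start += step {
		end := start + chunkSize
		if end > len(text) {
			end = len(text)
		}
		chunks = append(chunks, text[start:end])
		if end == len(text) {
			break
		}
	}
	return chunks
}

func main() {
	// 4-byte chunks, each overlapping the previous by 1 byte.
	fmt.Println(splitText("abcdefghij", 4, 1)) // [abcd defg ghij]
}
```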
Troubleshooting
"DEEPSEEK_API_KEY not set"
Make sure you've exported your API key:
export DEEPSEEK_API_KEY="your-key"
"Failed to connect to Qdrant"
- Check if Qdrant is running:
  curl http://localhost:6333
- Verify the URL environment variable:
  echo $QDRANT_URL
- Check the Docker logs:
  docker logs <container-id>
"Collection already exists"
The example creates a collection named goagent_rag_demo. If it exists from a previous run:
# Delete the collection via Qdrant API
curl -X DELETE http://localhost:6333/collections/goagent_rag_demo
# Or use a different collection name
export QDRANT_COLLECTION="my_new_collection"
Low Retrieval Quality
- Increase TopK: Retrieve more documents
- Adjust threshold: Lower the score threshold
- Use multi-query: Enable query variations
- Apply reranking: Use MMR or cross-encoder
- Improve embeddings: Use better embedding models
Performance Tuning
Query Optimization
// Use smaller TopK for faster retrieval
retriever.SetTopK(3)
// Set timeout for LLM calls
ctx, cancel := context.WithTimeout(ctx, 10*time.Second)
defer cancel()
// Use streaming for large responses
stream, err := llmClient.Stream(ctx, prompt)
Batch Operations
// Add documents in batches
batchSize := 100
for i := 0; i < len(allDocs); i += batchSize {
end := min(i+batchSize, len(allDocs))
batch := allDocs[i:end]
store.AddDocuments(ctx, batch)
}
Advanced Use Cases
1. Conversational RAG
Maintain conversation history:
conversationHistory := []llm.Message{}
for {
    // Add the user message to the stored history
    conversationHistory = append(conversationHistory,
        llm.UserMessage(userQuery))
    // Retrieve fresh context for this turn
    retrievedContext := retriever.RetrieveWithContext(ctx, userQuery)
    // Prepend the context as a system message for this call only,
    // so stale context does not accumulate in the stored history
    messages := append([]llm.Message{
        llm.SystemMessage(retrievedContext),
    }, conversationHistory...)
    // Generate a response
    response, _ := llmClient.Chat(ctx, messages)
    // Add the assistant response to the stored history
    conversationHistory = append(conversationHistory,
        llm.AssistantMessage(response.Content))
}
2. Multi-Document RAG
Retrieve from multiple collections:
// Create multiple stores for different domains
techStore := setupQdrantStore(ctx, "tech_docs")
legalStore := setupQdrantStore(ctx, "legal_docs")
// Retrieve from both
techDocs, _ := techStore.Search(ctx, query, 3)
legalDocs, _ := legalStore.Search(ctx, query, 3)
// Merge and rerank
allDocs := append(techDocs, legalDocs...)
reranked, _ := reranker.Rerank(ctx, query, allDocs)
3. Hybrid Search
Combine vector search with keyword search:
// Vector search
vectorDocs, _ := vectorStore.Search(ctx, query, 5)
// Keyword search (BM25)
keywordDocs, _ := keywordIndex.Search(query, 5)
// Fuse results
fusion := retrieval.NewRankFusion("rrf")
finalDocs := fusion.Fuse([][]*retrieval.Document{
vectorDocs,
keywordDocs,
})
License
This example is part of the GoAgent project and is licensed under the same terms.
Support
For issues or questions:
- Open an issue on GitHub
- Check the documentation
- Review other examples