# RAG with LangChain Embeddings Example

This example demonstrates how to integrate LangChain Go's embeddings with LangGraphGo's RAG system.
## Overview

LangChain Go provides embedding models from a variety of providers (OpenAI, Cohere, HuggingFace, etc.). This example shows how to use them seamlessly with our RAG pipeline through a simple adapter.
## Key Features

- **Direct Integration**: Use LangChain embeddings with a minimal wrapper
- **Multiple Providers**: Support for OpenAI, Cohere, and other providers
- **Type Conversion**: Automatic `float32` ↔ `float64` conversion
- **Complete RAG Pipeline**: End-to-end example with real embeddings
- **Similarity Comparison**: Demonstrates embedding quality
## Architecture

### Adapter Class

The `LangChainEmbedder` adapter lives in `prebuilt/rag_langchain_adapter.go`:
```go
type LangChainEmbedder struct {
	embedder embeddings.Embedder
}
```
**Key Features:**

- Wraps any LangChain embedder
- Implements our `Embedder` interface
- Converts `float32` (LangChain) ↔ `float64` (our type)
- Zero overhead, simple pass-through
## Usage

### Basic Embedding
```go
import (
	"github.com/tmc/langchaingo/embeddings"
	"github.com/tmc/langchaingo/llms/openai"

	"github.com/smallnest/langgraphgo/prebuilt"
)

// Create a LangChain embedder (openai.New returns (*openai.LLM, error))
llm, _ := openai.New()
lcEmbedder, _ := embeddings.NewEmbedder(llm)

// Wrap it with the adapter
embedder := prebuilt.NewLangChainEmbedder(lcEmbedder)

// Use it through our interface
queryEmb, _ := embedder.EmbedQuery(ctx, "What is AI?")
docsEmb, _ := embedder.EmbedDocuments(ctx, texts)
```
### In RAG Pipeline
```go
// Create the embedder
embLLM, _ := openai.New()
lcEmbedder, _ := embeddings.NewEmbedder(embLLM)
embedder := prebuilt.NewLangChainEmbedder(lcEmbedder)

// Create a vector store backed by LangChain embeddings
vectorStore := prebuilt.NewInMemoryVectorStore(embedder)

// Generate embeddings and index the documents
embeds, _ := embedder.EmbedDocuments(ctx, texts)
vectorStore.AddDocuments(ctx, documents, embeds)

// Build the RAG pipeline (llm is your chat model, created elsewhere)
retriever := prebuilt.NewVectorStoreRetriever(vectorStore, 3)
config := prebuilt.DefaultRAGConfig()
config.Retriever = retriever
config.LLM = llm
pipeline := prebuilt.NewRAGPipeline(config)
pipeline.BuildBasicRAG()
```
## Running the Example

### Prerequisites

For OpenAI embeddings:

```bash
export OPENAI_API_KEY=your_api_key_here
```

For DeepSeek LLM:

```bash
export DEEPSEEK_API_KEY=your_api_key_here
```

### Run

```bash
cd examples/rag_with_embeddings
go run main.go
```
## Examples Included

### 1. OpenAI Embeddings

Test OpenAI's `text-embedding-ada-002` model:
```go
llm, _ := openai.New()
openaiEmbedder, _ := embeddings.NewEmbedder(llm)
embedder := prebuilt.NewLangChainEmbedder(openaiEmbedder)
queryEmb, _ := embedder.EmbedQuery(ctx, "What is machine learning?")
// Returns a 1536-dimensional embedding
```
### 2. Complete RAG Pipeline

Build a full RAG system with real embeddings:
```go
// Use OpenAI embeddings if available, otherwise fall back to a mock
vectorStore := prebuilt.NewInMemoryVectorStore(embedder)
embeds, _ := embedder.EmbedDocuments(ctx, texts)
vectorStore.AddDocuments(ctx, documents, embeds)

// Query with semantic search (runnable is the graph built by the pipeline)
result, _ := runnable.Invoke(ctx, prebuilt.RAGState{
	Query: "What is LangGraph?",
})
```
### 3. Similarity Comparison

Compare embeddings to understand semantic similarity:
```go
testTexts := []string{
	"Machine learning and AI",
	"Deep learning neural networks",
	"The weather is sunny",
}
embeds, _ := embedder.EmbedDocuments(ctx, testTexts)
similarity := cosineSimilarity(embeds[0], embeds[1])
// High similarity for related texts, low for unrelated
```
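The `cosineSimilarity` helper is called but not shown above. A straightforward implementation (the name matches the call above; the body here is a generic sketch, not the example's exact code) computes the dot product normalized by the two vector magnitudes:

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between two
// equal-length vectors: dot(a, b) / (||a|| * ||b||).
// It returns 0 when either vector has zero magnitude.
func cosineSimilarity(a, b []float64) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	fmt.Println(cosineSimilarity([]float64{1, 0}, []float64{1, 0})) // 1: identical direction
	fmt.Println(cosineSimilarity([]float64{1, 0}, []float64{0, 1})) // 0: orthogonal
}
```

Values close to 1 indicate semantically similar texts; values near 0 indicate unrelated texts.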
## Supported Embedding Providers

The adapter works with all LangChain embedding providers:
### OpenAI

```go
import "github.com/tmc/langchaingo/llms/openai"

llm, _ := openai.New()
lcEmbedder, _ := embeddings.NewEmbedder(llm)
embedder := prebuilt.NewLangChainEmbedder(lcEmbedder)
```

Models:

- `text-embedding-ada-002` (1536 dimensions) - Default
- `text-embedding-3-small` (1536 dimensions)
- `text-embedding-3-large` (3072 dimensions)
### Cohere

```go
import "github.com/tmc/langchaingo/llms/cohere"

llm, _ := cohere.New()
lcEmbedder, _ := embeddings.NewEmbedder(llm)
embedder := prebuilt.NewLangChainEmbedder(lcEmbedder)
```
### HuggingFace

```go
import "github.com/tmc/langchaingo/llms/huggingface"

llm, _ := huggingface.New()
lcEmbedder, _ := embeddings.NewEmbedder(llm)
embedder := prebuilt.NewLangChainEmbedder(lcEmbedder)
```
### Vertex AI

```go
import "github.com/tmc/langchaingo/llms/vertexai"

llm, _ := vertexai.New()
lcEmbedder, _ := embeddings.NewEmbedder(llm)
embedder := prebuilt.NewLangChainEmbedder(lcEmbedder)
```
## Type Conversion

The adapter handles type conversion automatically:

### LangChain → Our Type

```go
// LangChain returns [][]float32
lcEmbeds := [][]float32{{0.1, 0.2, 0.3}}
// The adapter converts to [][]float64
ourEmbeds := [][]float64{{0.1, 0.2, 0.3}}
```
### Performance
- Conversion is O(n) where n is total embedding values
- Minimal overhead for typical embedding sizes
- No memory allocation optimization needed for most use cases
## Embedding Dimensions

Different models produce embeddings of different dimensions:
| Model | Dimension | Provider |
|---|---|---|
| `text-embedding-ada-002` | 1536 | OpenAI |
| `text-embedding-3-small` | 1536 | OpenAI |
| `text-embedding-3-large` | 3072 | OpenAI |
| `embed-english-v3.0` | 1024 | Cohere |
| `embed-multilingual-v3.0` | 1024 | Cohere |
Ensure your vector store is configured for the correct dimension.
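One cheap way to catch a mismatch early is to validate dimensions before indexing. The `checkDimensions` helper below is illustrative, not part of the `prebuilt` package:

```go
package main

import "fmt"

// checkDimensions verifies every embedding has the dimension the
// vector store expects, failing fast before documents are added.
func checkDimensions(embeds [][]float64, want int) error {
	for i, e := range embeds {
		if len(e) != want {
			return fmt.Errorf("embedding %d has dimension %d, want %d", i, len(e), want)
		}
	}
	return nil
}

func main() {
	embeds := [][]float64{{0.1, 0.2, 0.3}}
	fmt.Println(checkDimensions(embeds, 3)) // <nil>
	// A mismatch (e.g. a 3-dim vector against a 1536-dim store) returns an error:
	fmt.Println(checkDimensions(embeds, 1536))
}
```

Running a check like this right after `EmbedDocuments` turns a confusing retrieval failure into an immediate, descriptive error.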
## Best Practices

### 1. Choose the Right Model

- **OpenAI ada-002**: Good balance of quality and cost
- **OpenAI 3-large**: Highest quality, higher cost
- **Cohere**: Good for multilingual content
### 2. Batch Processing

```go
// Process documents in batches for efficiency
texts := []string{...} // Many documents
embeds, _ := embedder.EmbedDocuments(ctx, texts)
// LangChain handles batching internally
```
### 3. Caching

```go
// Cache embeddings for frequently used texts
// (package-level map; not safe for concurrent use)
var cache = make(map[string][]float64)

func getEmbedding(ctx context.Context, text string) ([]float64, error) {
	if emb, ok := cache[text]; ok {
		return emb, nil
	}
	emb, err := embedder.EmbedQuery(ctx, text)
	if err != nil {
		return nil, err
	}
	cache[text] = emb
	return emb, nil
}
```
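A plain map like the one above is not safe for concurrent use. If multiple goroutines embed queries, a mutex-guarded variant works; the names below are illustrative, not part of `prebuilt`:

```go
package main

import (
	"fmt"
	"sync"
)

// embeddingCache is a concurrency-safe embedding cache keyed by input text.
type embeddingCache struct {
	mu    sync.RWMutex
	store map[string][]float64
}

func newEmbeddingCache() *embeddingCache {
	return &embeddingCache{store: make(map[string][]float64)}
}

// getOrCompute returns the cached embedding for text, or computes and
// stores it via compute (e.g. a closure over embedder.EmbedQuery).
func (c *embeddingCache) getOrCompute(text string, compute func(string) ([]float64, error)) ([]float64, error) {
	c.mu.RLock()
	emb, ok := c.store[text]
	c.mu.RUnlock()
	if ok {
		return emb, nil
	}
	emb, err := compute(text)
	if err != nil {
		return nil, err
	}
	c.mu.Lock()
	c.store[text] = emb
	c.mu.Unlock()
	return emb, nil
}

func main() {
	cache := newEmbeddingCache()
	calls := 0
	compute := func(string) ([]float64, error) {
		calls++ // stand-in for an embedder API call
		return []float64{0.1, 0.2}, nil
	}
	cache.getOrCompute("hello", compute)
	cache.getOrCompute("hello", compute)
	fmt.Println(calls) // 1: the second lookup hit the cache
}
```

Note this sketch may compute the same text twice under a race (two goroutines miss simultaneously); that wastes one API call but stays correct.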
### 4. Error Handling

```go
embeds, err := embedder.EmbedDocuments(ctx, texts)
if err != nil {
	// Handle rate limits, network errors, etc.
	log.Printf("Embedding failed: %v", err)
	// Consider retry logic
}
```
## Comparison: Mock vs Real Embeddings

### Mock Embeddings (Development)

```go
embedder := prebuilt.NewMockEmbedder(1536)
```
- ✅ Fast, no API calls
- ✅ Deterministic
- ✅ Free
- ❌ Not semantically meaningful
### Real Embeddings (Production)

```go
llm, _ := openai.New()
lcEmbedder, _ := embeddings.NewEmbedder(llm)
embedder := prebuilt.NewLangChainEmbedder(lcEmbedder)
```
- ✅ Semantically meaningful
- ✅ High quality retrieval
- ✅ Production-ready
- ❌ Requires API key and costs money
## Troubleshooting

### API Key Not Set

```
Error: missing API key
```

Solution: set the appropriate environment variable:

```bash
export OPENAI_API_KEY=your_key
```
### Dimension Mismatch

```
Error: embedding dimension mismatch
```

Solution: ensure the vector store dimension matches the model:

```go
// For OpenAI ada-002
vectorStore := prebuilt.NewInMemoryVectorStore(embedder)
// The embedder will return 1536-dimensional vectors
```
### Rate Limits

```
Error: rate limit exceeded
```

Solution: use batching and retry logic:

```go
// LangChain embedders support batching
llm, _ := openai.New()
lcEmbedder, _ := embeddings.NewEmbedder(
	llm,
	embeddings.WithBatchSize(100),
)
```
## Performance Tips

- **Batch Documents**: Process multiple documents at once
- **Cache Results**: Store embeddings for reuse
- **Use an Appropriate Model**: Balance quality vs. cost
- **Monitor Usage**: Track API calls and costs
- **Implement Retry**: Handle transient failures
## Next Steps
- Try different embedding providers (Cohere, HuggingFace)
- Experiment with different models and dimensions
- Build a production RAG system with real embeddings
- Implement caching and optimization strategies
- Compare embedding quality across providers