goframe

module
v0.33.0 Latest
Published: Mar 6, 2026 License: MIT

README

GoFrame

A modular Go framework for building production-ready LLM and RAG applications. GoFrame provides a clean, extensible architecture with a powerful set of tools for document processing, embedding, and vector storage.

Overview

GoFrame is designed to simplify the development of applications that leverage Large Language Models, with a strong focus on Retrieval-Augmented Generation (RAG). It provides a set of decoupled components that can be composed to build sophisticated data pipelines.

The framework is built around a set of core interfaces for LLMs, Embedders, and Vector Stores, allowing you to easily swap implementations (e.g., switch from Ollama to another provider) without changing your core application logic.

Core Features

  • Scalable Agentic Infrastructure: Built for high-performance agentic workflows.
    • Streaming Ingestion: Process massive repositories with flat memory usage.
    • Binary Quantization: 30x memory reduction for vector storage.
  • Graph-Like Retrieval: Go beyond simple similarity search.
    • Impact Analysis: Find downstream dependents ("who uses this code?").
    • Dependency Verification: Trace upstream dependencies ("what does this use?").
    • Multi-Language Support: Automatic metadata extraction for Go and TypeScript/TSX.
  • Agent Framework: Programmatic control of AI agents through the OpenCode SDK.
    • MCP Server Management: Configure local (stdio) and remote (HTTP/SSE) MCP servers.
    • Session Management: Create, manage, and interact with agent sessions.
    • Permission System: Fine-grained control over agent capabilities.
    • Event Streaming: Real-time response handling.
  • Pluggable Architecture:
    • LLMs: Clean interfaces for Ollama (local) and cloud providers.
    • Vector Stores: Robust Qdrant implementation with metadata filtering.
    • Embeddings: Decoupled embedding generation.
  • Advanced Document Processing:
    • GitLoader: Smart loading with automatic metadata extraction (imports, packages).
    • RSSLoader: Ingest RSS/Atom/JSON feeds with HTML sanitization and content normalization.
    • Code-Aware Splitter: Semantically chunks code while preserving context.
    • Parsers: Plugins for Go, TypeScript, Markdown, JSON, YAML, PDF, and RSS.
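
As a rough illustration of where the memory savings of binary quantization come from, the sketch below keeps only the sign of each vector component (1 bit instead of a 32-bit float) and compares codes with Hamming distance. This is a generic sign-based scheme written for this example, not GoFrame's or Qdrant's actual implementation; the function names are made up. Going from 32 bits to 1 bit per dimension gives a theoretical 32x reduction for the raw vectors, in the same ballpark as the figure quoted above.

```go
package main

import (
	"fmt"
	"math/bits"
)

// quantize packs the sign of each component into one bit:
// bit i is set when v[i] > 0.
func quantize(v []float32) []uint64 {
	out := make([]uint64, (len(v)+63)/64)
	for i, x := range v {
		if x > 0 {
			out[i/64] |= uint64(1) << (uint(i) % 64)
		}
	}
	return out
}

// hamming counts differing bits between two codes of equal length.
func hamming(a, b []uint64) int {
	d := 0
	for i := range a {
		d += bits.OnesCount64(a[i] ^ b[i])
	}
	return d
}

func main() {
	a := quantize([]float32{0.9, -0.1, 0.3, -0.7})
	b := quantize([]float32{0.8, 0.2, -0.4, -0.6})
	fmt.Println("hamming distance:", hamming(a, b))

	// Storage for one 1024-dimensional vector in each representation.
	const dims = 1024
	fmt.Println("float32 bytes:", dims*4, "binary bytes:", dims/8)
}
```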

Quick Start

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/sevigo/goframe/chains"
    "github.com/sevigo/goframe/embeddings"
    "github.com/sevigo/goframe/llms/ollama"
    "github.com/sevigo/goframe/schema"
    "github.com/sevigo/goframe/vectorstores"
    "github.com/sevigo/goframe/vectorstores/qdrant"
)

func main() {
    ctx := context.Background()

    // 1. Create LLM client
    llm, err := ollama.New(ollama.WithModel("llama3.2"))
    if err != nil {
        log.Fatalf("create LLM client: %v", err)
    }

    // 2. Create embedder
    embedder, err := embeddings.NewEmbedder(llm)
    if err != nil {
        log.Fatalf("create embedder: %v", err)
    }

    // 3. Create vector store
    store, err := qdrant.New(
        qdrant.WithCollectionName("my-docs"),
        qdrant.WithEmbedder(embedder),
    )
    if err != nil {
        log.Fatalf("create vector store: %v", err)
    }

    // 4. Add documents (return values are ignored here for brevity)
    docs := []schema.Document{
        schema.NewDocument("Go is a programming language created at Google.", nil),
        schema.NewDocument("Rust focuses on memory safety without garbage collection.", nil),
    }
    store.AddDocuments(ctx, docs)

    // 5. Create RAG chain
    retriever := vectorstores.ToRetriever(store, 3)
    ragChain, err := chains.NewRetrievalQA(retriever, llm)
    if err != nil {
        log.Fatalf("create RAG chain: %v", err)
    }

    // 6. Query
    answer, err := ragChain.Call(ctx, "What is Go?")
    if err != nil {
        log.Fatalf("query: %v", err)
    }
    fmt.Println(answer)
}

Architecture

GoFrame follows a modular pipeline:

[Source Code]  -> [GitLoader]         -> [Parser Plugin] -> [CodeAwareSplitter]   -> [Embedder] -> [VectorStore]
(Go, TS, etc.)    (Extracts Metadata)    (AST Analysis)     (Propagates Metadata)    (Ollama)      (Qdrant)
  1. Load & Analyze: GitLoader reads files and uses language parsers to extract file-level metadata (imports, package name).
  2. Split & Propagate: CodeAwareTextSplitter chunks the code, propagating the file-level metadata to every chunk.
  3. Embed & Index: Each chunk is embedded and stored in Qdrant together with its enriched metadata.
  4. Graph Retrieval: The DependencyRetriever uses this metadata to traverse the dependency graph.
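
The "split and propagate" step above can be sketched with the standard library alone. The `Chunk` type and `splitWithMetadata` function are hypothetical, and the splitting rule (blank lines) is a deliberate simplification: the real CodeAwareTextSplitter is described as using parser plugins, which this example does not attempt to reproduce. The point is only that every chunk receives its own copy of the file-level metadata.

```go
package main

import (
	"fmt"
	"strings"
)

// Chunk is a hypothetical type for this sketch.
type Chunk struct {
	Content  string
	Metadata map[string]any
}

// splitWithMetadata splits src on blank lines and copies the file-level
// metadata into every chunk, so each chunk stays self-describing.
func splitWithMetadata(src string, fileMeta map[string]any) []Chunk {
	var chunks []Chunk
	for _, part := range strings.Split(src, "\n\n") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		// Copy the map so later edits to one chunk do not leak into others.
		meta := make(map[string]any, len(fileMeta)+1)
		for k, v := range fileMeta {
			meta[k] = v
		}
		meta["chunk_index"] = len(chunks)
		chunks = append(chunks, Chunk{Content: part, Metadata: meta})
	}
	return chunks
}

func main() {
	src := "package main\n\nimport \"fmt\"\n\nfunc main() { fmt.Println(\"hi\") }"
	fileMeta := map[string]any{"package": "main", "imports": []string{"fmt"}}
	for _, c := range splitWithMetadata(src, fileMeta) {
		fmt.Printf("chunk %v (package %v): %q\n",
			c.Metadata["chunk_index"], c.Metadata["package"], c.Content)
	}
}
```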

Prerequisites

  • Go 1.21 or later
  • Ollama (for embeddings & local LLMs)
  • Docker (for Qdrant)

Installation

go get github.com/sevigo/goframe@latest

API Reference

Full API documentation is available at pkg.go.dev.

Usage Examples

1. Basic RAG

// Initialize components...
store, _ := qdrant.New(qdrant.WithCollectionName("my-docs"))

// Add documents
docs := []schema.Document{
    schema.NewDocument("Paris is the capital of France.", map[string]any{"continent": "Europe"}),
}
store.AddDocuments(ctx, docs)

// Search
results, _ := store.SimilaritySearch(ctx, "Europe capital", 1)
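
For intuition about what a similarity search does conceptually, here is a small in-memory sketch that ranks documents by cosine similarity to a query vector. The `entry` type, `cosine`, and `topK` are names invented for this example, and the vectors are hand-written rather than produced by an embedding model. A real vector store such as Qdrant uses approximate indexes rather than this brute-force scan.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// entry pairs a text with its (hand-written) embedding vector.
type entry struct {
	Text   string
	Vector []float64
}

// cosine returns the cosine similarity of a and b.
// It assumes both vectors have the same length.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// topK returns up to k entries, most similar to query first.
func topK(query []float64, docs []entry, k int) []entry {
	ranked := append([]entry(nil), docs...) // sort a copy, not the input
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(query, ranked[i].Vector) > cosine(query, ranked[j].Vector)
	})
	if k < 0 {
		k = 0
	}
	if k > len(ranked) {
		k = len(ranked)
	}
	return ranked[:k]
}

func main() {
	docs := []entry{
		{"Go is a programming language.", []float64{0.1, 0.9}},
		{"Paris is the capital of France.", []float64{0.9, 0.1}},
	}
	for _, hit := range topK([]float64{1, 0}, docs, 1) {
		fmt.Println(hit.Text)
	}
}
```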

2. RSS Feed Ingestion

Ingest RSS/Atom/JSON feeds into your RAG pipeline:

import (
    "github.com/sevigo/goframe/documentloaders"
    "github.com/sevigo/goframe/parsers"
)

// Initialize parser registry
registry := parsers.NewRegistry(logger)
registry.RegisterParser(parsers.NewRSSParser())

// Create RSS loader with normalization
loader, _ := documentloaders.NewRSS(
    []string{
        "https://news.ycombinator.com/rss",
        "https://feeds.bbci.co.uk/news/technology/rss.xml",
    },
    registry,
    documentloaders.WithRSSNormalization(documentloaders.NormalizationConfig{
        StripHTML:        true,    // Strip HTML tags entirely instead of sanitizing them
        RemoveTracking:   true,    // Remove UTM/fbclid parameters
        MaxContentLength: 10000,   // Truncate long content
        MinContentLength: 100,     // Skip short items
        NormalizeURLs:    true,    // Remove fragments
        MinTitleLength:   5,       // Minimum title length
        FallbackToURL:    true,    // Use URL as title if missing
        NormalizeAuthors: true,    // Clean author names
    }),
    documentloaders.WithRSSMaxItems(100),           // Max items per feed
    documentloaders.WithRSSSkipDuplicates(true),     // Skip duplicate items
    documentloaders.WithRSSBatchSize(50),            // Batch size
)

// Load documents
docs, _ := loader.Load(ctx)

// Or stream to vector store
loader.LoadAndProcessStream(ctx, func(ctx context.Context, batch []schema.Document) error {
    vectorStore.AddDocuments(ctx, batch)
    return nil
})

Features:

  • HTML sanitization with OWASP-compliant XSS protection
  • URL normalization (remove tracking parameters)
  • Date parsing (RFC1123, RFC3339, ISO8601)
  • Content deduplication by GUID/link
  • Parallel feed fetching with worker pools
  • Batch processing with backpressure
  • Retry logic with exponential backoff
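
As an illustration of the URL normalization listed above, the sketch below removes `utm_*` and `fbclid` query parameters and drops the fragment using only `net/url`. The function name `normalizeURL` and the exact set of parameters treated as tracking are assumptions for this example; the loader's real rules may be broader.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// normalizeURL strips common tracking parameters and the fragment.
func normalizeURL(raw string) (string, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", raw, err)
	}
	q := u.Query()
	for key := range q {
		lower := strings.ToLower(key)
		if strings.HasPrefix(lower, "utm_") || lower == "fbclid" {
			q.Del(key)
		}
	}
	u.RawQuery = q.Encode() // Encode also sorts the remaining keys
	u.Fragment, u.RawFragment = "", ""
	return u.String(), nil
}

func main() {
	out, err := normalizeURL("https://example.com/post?id=7&utm_source=rss&fbclid=abc#comments")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Normalizing links this way also helps deduplication, since two items that differ only in tracking parameters end up with the same URL.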

3. Graph / Dependency Analysis

Perform sophisticated code navigation using the DependencyRetriever.

import "github.com/sevigo/goframe/vectorstores"

// Initialize retriever
retriever, err := vectorstores.NewDependencyRetriever(store)
if err != nil {
    log.Fatal(err)
}

// 1. Impact Analysis: Who imports "my/package"?
network, _ := retriever.GetContextNetwork(ctx, "github.com/my/project/pkg", nil)
for _, dependent := range network.Dependents {
    fmt.Printf("File identifying impact: %s\n", dependent.Metadata["source"])
}

// 2. Upstream Verification: What does "my/package" depend on?
// (Pass known imports to verify their existence in the graph)
network, _ = retriever.GetContextNetwork(ctx, "github.com/my/project/pkg", []string{"fmt", "os"})
for _, dep := range network.Dependencies {
    fmt.Printf("Verified dependency: %s\n", dep.Metadata["source"])
}
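
To show how import metadata alone can answer both questions, here is a self-contained sketch over an in-memory list of files. The `fileMeta` type and the `dependents` and `dependencies` functions are hypothetical and much simpler than a retriever backed by a vector store; they assume each file records its package path and its imports, as the pipeline description suggests.

```go
package main

import (
	"fmt"
	"sort"
)

// fileMeta is a hypothetical record of the metadata extracted per file.
type fileMeta struct {
	Source  string
	Package string
	Imports []string
}

// dependents lists the files that import pkg (impact analysis).
func dependents(files []fileMeta, pkg string) []string {
	var out []string
	for _, f := range files {
		for _, imp := range f.Imports {
			if imp == pkg {
				out = append(out, f.Source)
				break
			}
		}
	}
	sort.Strings(out)
	return out
}

// dependencies lists the distinct imports of all files in pkg.
func dependencies(files []fileMeta, pkg string) []string {
	seen := map[string]bool{}
	var out []string
	for _, f := range files {
		if f.Package != pkg {
			continue
		}
		for _, imp := range f.Imports {
			if !seen[imp] {
				seen[imp] = true
				out = append(out, imp)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	files := []fileMeta{
		{"cmd/app/main.go", "example.com/app/cmd", []string{"example.com/app/pkg", "fmt"}},
		{"pkg/store.go", "example.com/app/pkg", []string{"os", "fmt"}},
		{"pkg/util.go", "example.com/app/pkg", []string{"fmt"}},
	}
	fmt.Println("dependents:", dependents(files, "example.com/app/pkg"))
	fmt.Println("dependencies:", dependencies(files, "example.com/app/pkg"))
}
```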

4. Hybrid Search (Dense + Sparse)

Combine semantic understanding with exact keyword matching using sparse vectors.

import (
    "github.com/sevigo/goframe/embeddings/sparse"
    "github.com/sevigo/goframe/vectorstores/qdrant"
    "github.com/sevigo/goframe/vectorstores"
)

// 1. Configure Store with Named Sparse Vector
store, _ := qdrant.New(
    qdrant.WithCollectionName("hybrid-docs"),
    qdrant.WithSparseVector("bow_sparse"), // Enable sparse vector support
)

// 2. Add Document with Sparse Vector
docContent := "func CalculateTax(income float64) float64 { ... }"
sparseVec, _ := sparse.GenerateSparseVector(ctx, docContent)
doc := schema.NewDocument(docContent, nil)
doc.Sparse = sparseVec
store.AddDocuments(ctx, []schema.Document{doc})

// 3. Perform Hybrid Search
query := "CalculateTax"
sparseQuery, _ := sparse.GenerateSparseVector(ctx, query)

results, _ := store.SimilaritySearch(ctx, query, 5,
    vectorstores.WithSparseQuery(sparseQuery), // Pass sparse query for hybrid retrieval
)
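
To make the "sparse" half of hybrid search concrete, the sketch below builds a hashed bag-of-words vector and scores a query against a document with a sparse dot product. This is a generic technique chosen for illustration; it is an assumption that it resembles what `sparse.GenerateSparseVector` produces, and the names `sparseVector`, `bagOfWords`, and `dot` are invented here.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
	"unicode"
)

// sparseVector maps a hashed token index to its term count.
type sparseVector map[uint32]float32

// tokenize lowercases text and splits it on non-alphanumeric characters.
func tokenize(text string) []string {
	return strings.FieldsFunc(strings.ToLower(text), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// bagOfWords counts tokens, using an FNV-1a hash of each token as its index.
func bagOfWords(text string) sparseVector {
	vec := sparseVector{}
	for _, tok := range tokenize(text) {
		h := fnv.New32a()
		h.Write([]byte(tok)) // Write on a hash.Hash never returns an error
		vec[h.Sum32()]++
	}
	return vec
}

// dot computes the sparse dot product, iterating over the smaller vector.
func dot(a, b sparseVector) float32 {
	if len(b) < len(a) {
		a, b = b, a
	}
	var sum float32
	for idx, w := range a {
		sum += w * b[idx]
	}
	return sum
}

func main() {
	doc := bagOfWords("func CalculateTax(income float64) float64 { ... }")
	query := bagOfWords("CalculateTax")
	fmt.Println("keyword score:", dot(doc, query))
}
```

An exact identifier such as `CalculateTax` produces a non-zero score only for documents that contain that token, which complements dense embeddings that capture meaning but can blur rare identifiers.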

5. Agent Framework with MCP Servers

Create AI agents with MCP (Model Context Protocol) server configuration for tool access.

import "github.com/sevigo/goframe/agent"

// Configure MCP servers
mcpRegistry := agent.NewMCPRegistry(
    // Local MCP server (stdio transport)
    agent.LocalMCPServer("filesystem",
        []string{"mcp-filesystem", "/path/to/repo"},
        agent.WithEnv(map[string]string{"LOG_LEVEL": "debug"}),
        agent.WithEnabled(true),
    ),
    // Remote MCP server (HTTP/SSE transport)
    agent.RemoteMCPServer("brave-search",
        "https://mcp.brave.com/search",
        agent.WithHeaders(map[string]string{"Authorization": "Bearer token"}),
        agent.WithEnabled(true),
    ),
)

// Configure permissions
permissions := agent.NewPermissions().
    AllowBash("go test", "go build").
    AllowEdit().
    DenyWebfetch().
    Build()

// Create agent
ag, _ := agent.New(
    agent.WithModel("anthropic/claude-3-5-sonnet"),
    agent.WithMCPRegistry(mcpRegistry),
    agent.WithPermissions(permissions),
)

// Create session and interact
session, _ := ag.NewSession(ctx, agent.WithTitle("Code Review"))
response, _ := session.Prompt(ctx, "Explain this code")

// Stream responses
events, _ := ag.Stream(ctx, "Write a haiku")
for event := range events {
    if event.Type == agent.EventTypeComplete {
        // Use the comma-ok form so an unexpected payload type cannot panic.
        if resp, ok := event.Data.(agent.Response); ok {
            fmt.Println(resp.Content)
        }
    }
}
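
The event loop above can be mimicked with plain channels to show the consumption pattern in isolation. All types here (`Event`, `EventType`, `Response`) and the `collect` helper are hypothetical stand-ins defined for this sketch, not the `agent` package's definitions. The sketch checks the payload type with the comma-ok form and reports a stream that closes without a completion event.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for the event types used in the snippet above.
type EventType string

const (
	EventTypeDelta    EventType = "delta"
	EventTypeComplete EventType = "complete"
)

type Response struct{ Content string }

type Event struct {
	Type EventType
	Data any
}

// collect reads events until the first completion event and returns its
// content. It returns an error on a malformed payload or an early close.
func collect(events <-chan Event) (string, error) {
	for ev := range events {
		if ev.Type != EventTypeComplete {
			continue
		}
		resp, ok := ev.Data.(Response)
		if !ok {
			return "", fmt.Errorf("complete event carried %T, want Response", ev.Data)
		}
		return resp.Content, nil
	}
	return "", errors.New("stream closed before a complete event")
}

func main() {
	// A buffered, pre-filled channel keeps the example free of goroutines.
	events := make(chan Event, 2)
	events <- Event{Type: EventTypeDelta, Data: "partial"}
	events <- Event{Type: EventTypeComplete, Data: Response{Content: "done"}}
	close(events)

	content, err := collect(events)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(content)
}
```

Because `collect` returns as soon as it sees a completion event, a real producer would need a way to stop sending (for example via context cancellation) so that it does not block on an unread channel.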

Running the Ultimate RAG Demo

The example in examples/qdrant-ultimate-rag is a production-grade demonstration featuring:

  • Full repository ingestion (Go & TypeScript).
  • Streaming processing pipeline.
  • Graph Retrieval verification.

# Set up environment
export OLLAMA_API_KEY=your_key_if_using_cloud

# Run the full integration test
go run ./examples/qdrant-ultimate-rag/main.go

Core Components

  • /schema: Defines the core data structures used throughout the framework, such as Document, ChatMessage, and ParserPlugin.
  • /llms: Contains interfaces and implementations for LLM clients. The ollama package provides a full-featured client.
  • /embeddings: Provides the Embedder interface and a default implementation that wraps an LLM client to perform embedding tasks.
  • /vectorstores: Contains interfaces and implementations for vector stores. The qdrant package provides a robust client.
  • /agent: Provides the Agent framework for programmatic control of AI agents through the OpenCode SDK. Features MCP server management, session handling, permissions, and event streaming.
  • /voice: Provides Text-to-Speech synthesis with OpenAI-compatible API support. Supports both buffered and streaming audio generation, compatible with OpenAI cloud and local servers like Kokoro-FastAPI.
  • /parsers: Home to the language parser plugin system. Each sub-directory (/golang, /markdown, etc.) contains a plugin for a specific file type. See Plugins.md for more details.
  • /textsplitter: Provides the CodeAwareTextSplitter, which uses the parser plugins to perform intelligent, semantic chunking of documents.

How to Contribute

Contributions are welcome! Whether it's a bug fix, a new feature, or documentation improvements, we appreciate your help.

  1. Fork the repository.
  2. Create a new branch for your feature (git checkout -b feature/my-new-feature).
  3. Make your changes and add/update tests.
  4. Run tests to ensure everything is working (go test ./...).
  5. Submit a pull request with a clear description of your changes.

Areas for Contribution

  • New LLM Clients: Add support for providers like OpenAI, Anthropic, or Hugging Face.
  • New Vector Stores: Implement the VectorStore interface for ChromaDB, Pinecone, Weaviate, etc.
  • New Parser Plugins: Add support for more languages like Python, Java, C++, or Rust.
  • Enhance Agent Framework: Add new MCP server types, improve permission handling, or add more event types.
  • Enhance RAG Components: Implement advanced retrieval strategies like re-rankers or query transformers.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Directories

Path Synopsis
Package agent provides an abstraction layer for managing communication with OpenCode in agent mode, with a focus on MCP (Model Context Protocol) server configuration.
examples/basic command
Example demonstrates basic usage of the agent package with MCP servers
Package chains provides composable chains for LLM workflows.
Package documentloaders provides document loading utilities for RAG applications.
Package embeddings provides interfaces and utilities for text embedding.
examples
hybrid-search command
kokoro-captioned-dialogue command
Package main demonstrates captioned dialogue synthesis with timestamp-based timing.
kokoro-dialogue command
kokoro-tts command
openai-dialogue command
openai-tts command
Package main demonstrates OpenAI Text-to-Speech API usage.
qdrant-rerank command
rss-ingestion command
Package httpclient provides a shared HTTP client with sensible defaults for connection pooling, timeouts, and retry logic.
Package llms provides interfaces and utilities for LLM providers.
Package parsers provides a registry for language-specific parser plugins.
csv
html
Package html provides an HTML parser plugin for transforming HTML content into clean Markdown suitable for LLM consumption and RAG applications.
markdown
core.go - Main plugin file with goldmark integration
pdf
yaml
extractor - Fixed version
Package prompts provides prompt templates for LLM interactions.
Package schema defines core data structures and interfaces used throughout the goframe library.
Package textsplitter provides text splitting utilities for chunking documents.
Package vectorstores provides interfaces and implementations for vector databases.
Package voice provides interfaces and types for Text-to-Speech synthesis.
openai
Package openai provides an OpenAI-compatible Text-to-Speech implementation.
