ragkit

Version: v0.3.4
Published: May 8, 2025 License: MIT Imports: 5 Imported by: 0

README

ragkit: A Go package for document indexing and retrieval in RAG systems

ragkit is a Go package designed to simplify the implementation of Retrieval-Augmented Generation (RAG) systems. It defines a VectorStore interface for document indexing and retrieval, ships implementations of that interface, and provides tools for vectorization and semantic search.

Installation

go get github.com/suapapa/go_ragkit

Quick Start

package main

import (
    "context"
    "fmt"

    // ...
    ragkit "github.com/suapapa/go_ragkit"
    vstore_helper "github.com/suapapa/go_ragkit/vector_store/weaviate/helper"
)

func main() {
    // Initialize vector store (Weaviate + Ollama)
    vstore, err := vstore_helper.NewWeaviateOllamaVectorStore(
        "DoolyFamily", // vector DB class name
        vstore_helper.DefaultOllamaEmbedModel, 
    )
    if err != nil {
        panic(err)
    }

    // Create documents from text
    docs := ragkit.MakeDocsFromTexts(
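        []string{
            // "Dooly, Douner, Ddochi, Heedong, Cheolsu, and Younghee live in Go Gil-dong's house."
            "고길동의 집에는 둘리, 도우너, 또치, 희동이, 철수, 영희가 살고 있다.",
            // "Heedong is Go Gil-dong's nephew."
            "희동이는 고길동의 조카이다.",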
            // ...
        },
        nil,
    )

    // Index documents
    _, err = vstore.Index(context.Background(), docs...)
    if err != nil {
        panic(err)
    }

    // Perform semantic search
    query := "고길동과 희동이의 관계?" // "How are Go Gil-dong and Heedong related?"
    results, err := vstore.RetrieveText(context.Background(), query, 1)
    if err != nil {
        panic(err)
    }

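    // Print the top result, guarding against an empty result set
    if len(results) > 0 {
        fmt.Println(results[0].Text) // <- 희동이는 고길동의 조카이다. ("Heedong is Go Gil-dong's nephew.")
    }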
}

Examples

Prerequisite: launch Weaviate as a local vector DB:

docker run -it --rm \
  -p 8080:8080 -p 50051:50051 \
  cr.weaviate.io/semitechnologies/weaviate:1.30.2

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Documentation

Overview

Package ragkit provides utility functions for RAG (Retrieval-Augmented Generation) applications.

Functions

func GenerateID

func GenerateID(text string, metadata map[string]any) string

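GenerateID creates a deterministic UUID v5 from the input text and metadata. The same inputs always produce the same ID. Because a UUID v5 is derived from a SHA-1 hash, different inputs yield different IDs except in the extremely unlikely event of a hash collision.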
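To make "deterministic UUID v5" concrete, the following is a minimal sketch of how such an ID can be derived using only the standard library. It is not ragkit's implementation: the namespace value, the `sketchID` and `uuidV5` names, and the way metadata is folded into the hashed name (sorted keys, `key=value` pairs) are all assumptions made for illustration.

```go
package main

import (
	"crypto/sha1"
	"fmt"
	"sort"
)

// namespace is an arbitrary choice for this sketch (the RFC 4122 DNS
// namespace UUID). ragkit may well use a different namespace.
var namespace = [16]byte{
	0x6b, 0xa7, 0xb8, 0x10, 0x9d, 0xad, 0x11, 0xd1,
	0x80, 0xb4, 0x00, 0xc0, 0x4f, 0xd4, 0x30, 0xc8,
}

// uuidV5 hashes namespace+name with SHA-1, keeps the first 16 bytes, and
// stamps the version and variant bits as RFC 4122 describes for version 5.
func uuidV5(ns [16]byte, name string) string {
	h := sha1.New()
	h.Write(ns[:])
	h.Write([]byte(name))
	sum := h.Sum(nil)

	var u [16]byte
	copy(u[:], sum[:16])
	u[6] = (u[6] & 0x0f) | 0x50 // version 5
	u[8] = (u[8] & 0x3f) | 0x80 // RFC 4122 variant
	return fmt.Sprintf("%x-%x-%x-%x-%x", u[0:4], u[4:6], u[6:8], u[8:10], u[10:16])
}

// sketchID is a hypothetical stand-in for GenerateID. Metadata keys are
// sorted so that Go's randomized map iteration order cannot change the ID.
func sketchID(text string, metadata map[string]any) string {
	keys := make([]string, 0, len(metadata))
	for k := range metadata {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	name := text
	for _, k := range keys {
		name += fmt.Sprintf("\x00%s=%v", k, metadata[k])
	}
	return uuidV5(namespace, name)
}

func main() {
	a := sketchID("hello", map[string]any{"lang": "en", "page": 1})
	b := sketchID("hello", map[string]any{"page": 1, "lang": "en"})
	fmt.Println(a == b)                       // same inputs, same ID
	fmt.Println(a != sketchID("hello!", nil)) // different inputs, different ID
}
```

Sorting the metadata keys matters in any design like this one: without it, two calls with equal maps could hash different byte sequences and the ID would no longer be deterministic.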

func ToCamelCase added in v0.1.4

func ToCamelCase(input string) string

Types

type Document

type Document struct {
	ID       string         // Unique ID
	Text     string         // Original text
	Metadata map[string]any // Optional: Additional metadata
	Vector   []float32      // Optional: Embedding vector (generated by Embedder if not provided)
}

Document represents a text to be indexed, together with its ID, optional metadata, and optional embedding vector.

func MakeDocsFromTexts

func MakeDocsFromTexts(texts []string, metadata map[string]any) []Document

MakeDocsFromTexts creates a slice of Document from a slice of texts with optional metadata. Each document will have a unique ID generated from its text content.

type Embedder

type Embedder interface {
	// EmbedText: Convert a single text to an embedding vector
	EmbedText(ctx context.Context, text string) ([]float32, error)

	// EmbedTexts: Convert texts to embedding vectors
	EmbedTexts(ctx context.Context, texts ...string) ([][]float32, error)

	fmt.Stringer
}

Embedder converts texts into embedding vectors.
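As an illustration of what satisfying this interface involves, here is a toy implementation that needs no model server. The interface is redeclared locally so the sketch compiles on its own. `hashEmbedder` and `embedAll` are hypothetical names, and the word-hashing scheme is only a stand-in for a real embedding model such as the Ollama-backed one used in the Quick Start.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"hash/fnv"
	"math"
	"strings"
)

// Embedder mirrors the interface documented above.
type Embedder interface {
	EmbedText(ctx context.Context, text string) ([]float32, error)
	EmbedTexts(ctx context.Context, texts ...string) ([][]float32, error)
	fmt.Stringer
}

// hashEmbedder is a hypothetical toy: every word is hashed into one of dim
// buckets and the resulting count vector is L2-normalized.
type hashEmbedder struct{ dim int }

func (e hashEmbedder) EmbedText(ctx context.Context, text string) ([]float32, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	if e.dim <= 0 {
		return nil, errors.New("dim must be positive")
	}
	vec := make([]float32, e.dim)
	for _, w := range strings.Fields(strings.ToLower(text)) {
		h := fnv.New32a()
		h.Write([]byte(w))
		vec[h.Sum32()%uint32(e.dim)]++
	}
	var sum float64
	for _, x := range vec {
		sum += float64(x) * float64(x)
	}
	if sum > 0 {
		norm := float32(math.Sqrt(sum))
		for i := range vec {
			vec[i] /= norm
		}
	}
	return vec, nil
}

// EmbedTexts embeds each text in order and stops at the first error.
func (e hashEmbedder) EmbedTexts(ctx context.Context, texts ...string) ([][]float32, error) {
	out := make([][]float32, 0, len(texts))
	for _, t := range texts {
		v, err := e.EmbedText(ctx, t)
		if err != nil {
			return nil, err
		}
		out = append(out, v)
	}
	return out, nil
}

func (e hashEmbedder) String() string { return fmt.Sprintf("hashEmbedder(dim=%d)", e.dim) }

// Compile-time check that hashEmbedder satisfies Embedder.
var _ Embedder = hashEmbedder{}

// embedAll is a convenience wrapper used by the demo below.
func embedAll(dim int, texts ...string) ([][]float32, error) {
	return hashEmbedder{dim: dim}.EmbedTexts(context.Background(), texts...)
}

func main() {
	vecs, err := embedAll(8, "hello world", "Hello   WORLD")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(vecs), len(vecs[0]))
}
```

The `var _ Embedder = hashEmbedder{}` line is a common Go idiom: it costs nothing at run time and makes the build fail if a method is missing or mistyped.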

type Indexer

type Indexer interface {
	// Index: Index multiple documents
	// Returns: IDs of indexed documents
	Index(ctx context.Context, docs ...Document) ([]string, error)

	// Delete: Delete a document by ID
	Delete(ctx context.Context, id string) error

	// Exists: Check if a document with the given ID exists
	Exists(ctx context.Context, id string) (bool, error)
}

Indexer indexes documents into a vector database.
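The contract can be sketched with an in-memory map in place of a real vector database. The types are redeclared locally so the example is self-contained. `memIndexer`, `newMemIndexer`, and `roundTrip` are hypothetical names, and details such as rejecting empty IDs or overwriting on a repeated ID are assumptions of this sketch, not documented behavior of ragkit's backends.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// Document and Indexer mirror the definitions documented on this page.
type Document struct {
	ID       string
	Text     string
	Metadata map[string]any
	Vector   []float32
}

type Indexer interface {
	Index(ctx context.Context, docs ...Document) ([]string, error)
	Delete(ctx context.Context, id string) error
	Exists(ctx context.Context, id string) (bool, error)
}

// memIndexer is a hypothetical in-memory backend guarded by a mutex.
type memIndexer struct {
	mu   sync.RWMutex
	docs map[string]Document
}

func newMemIndexer() *memIndexer { return &memIndexer{docs: make(map[string]Document)} }

func (m *memIndexer) Index(ctx context.Context, docs ...Document) ([]string, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	// Validate the whole batch first so a bad document cannot leave it half stored.
	for _, d := range docs {
		if d.ID == "" {
			return nil, errors.New("document has an empty ID")
		}
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	ids := make([]string, 0, len(docs))
	for _, d := range docs {
		m.docs[d.ID] = d // a repeated ID overwrites the earlier document
		ids = append(ids, d.ID)
	}
	return ids, nil
}

func (m *memIndexer) Delete(ctx context.Context, id string) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	delete(m.docs, id)
	return nil
}

func (m *memIndexer) Exists(ctx context.Context, id string) (bool, error) {
	if err := ctx.Err(); err != nil {
		return false, err
	}
	m.mu.RLock()
	defer m.mu.RUnlock()
	_, ok := m.docs[id]
	return ok, nil
}

// Compile-time check that *memIndexer satisfies Indexer.
var _ Indexer = (*memIndexer)(nil)

// roundTrip indexes two documents, deletes the first, and reports what
// Exists returned for it before and after the deletion.
func roundTrip() (ids []string, before, after bool, err error) {
	ctx := context.Background()
	idx := newMemIndexer()
	ids, err = idx.Index(ctx, Document{ID: "a", Text: "first"}, Document{ID: "b", Text: "second"})
	if err != nil {
		return ids, before, after, err
	}
	if before, err = idx.Exists(ctx, "a"); err != nil {
		return ids, before, after, err
	}
	if err = idx.Delete(ctx, "a"); err != nil {
		return ids, before, after, err
	}
	after, err = idx.Exists(ctx, "a")
	return ids, before, after, err
}

func main() {
	fmt.Println(roundTrip()) // [a b] true false <nil>
}
```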

type RetrievedDoc

type RetrievedDoc struct {
	// ID       string         // Document ID
	// Score    float32        // Similarity score
	Vector   []float32      // Retrieved vector
	Text     string         // Retrieved text
	Metadata map[string]any // Optional metadata
}

RetrievedDoc represents a document retrieved from a vector database.

type Retriever

type Retriever interface {
	// Retrieve: Return top-K documents based on query vector
	Retrieve(ctx context.Context, query []float32, topK int, metadataFieldNames ...string) ([]RetrievedDoc, error)

	// RetrieveText: Return top-K documents based on text query
	RetrieveText(ctx context.Context, text string, topK int, metadataFieldNames ...string) ([]RetrievedDoc, error)
}

Retriever retrieves documents from a vector database.
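A brute-force version of top-K retrieval can be sketched by ranking every stored vector by cosine similarity to the query. The types are redeclared locally, and `memRetriever`, `cosine`, and `topTexts` are hypothetical names. Two points are assumptions of this sketch rather than documented behavior: that similarity is cosine, and that metadataFieldNames selects which metadata fields are returned with each hit.

```go
package main

import (
	"context"
	"fmt"
	"math"
	"sort"
)

// RetrievedDoc and Retriever mirror the definitions documented on this page.
type RetrievedDoc struct {
	Vector   []float32
	Text     string
	Metadata map[string]any
}

type Retriever interface {
	Retrieve(ctx context.Context, query []float32, topK int, metadataFieldNames ...string) ([]RetrievedDoc, error)
	RetrieveText(ctx context.Context, text string, topK int, metadataFieldNames ...string) ([]RetrievedDoc, error)
}

// cosine returns the cosine similarity of a and b, or 0 when the lengths
// differ or either vector is all zeros.
func cosine(a, b []float32) float64 {
	if len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// memRetriever is a hypothetical brute-force retriever over a slice.
// embed stands in for an Embedder and is only needed by RetrieveText.
type memRetriever struct {
	docs  []RetrievedDoc
	embed func(text string) []float32
}

func (m memRetriever) Retrieve(ctx context.Context, query []float32, topK int, metadataFieldNames ...string) ([]RetrievedDoc, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	ranked := append([]RetrievedDoc(nil), m.docs...) // copy so m.docs keeps its order
	sort.SliceStable(ranked, func(i, j int) bool {
		return cosine(query, ranked[i].Vector) > cosine(query, ranked[j].Vector)
	})
	if topK < 0 {
		topK = 0
	}
	if topK < len(ranked) {
		ranked = ranked[:topK]
	}
	for i := range ranked {
		// Assumption: only the requested metadata fields are returned.
		kept := make(map[string]any, len(metadataFieldNames))
		for _, name := range metadataFieldNames {
			if v, ok := ranked[i].Metadata[name]; ok {
				kept[name] = v
			}
		}
		ranked[i].Metadata = kept
	}
	return ranked, nil
}

func (m memRetriever) RetrieveText(ctx context.Context, text string, topK int, metadataFieldNames ...string) ([]RetrievedDoc, error) {
	return m.Retrieve(ctx, m.embed(text), topK, metadataFieldNames...)
}

// Compile-time check that memRetriever satisfies Retriever.
var _ Retriever = memRetriever{}

// topTexts ranks three fixed sample documents against query.
func topTexts(query []float32, topK int) []string {
	r := memRetriever{docs: []RetrievedDoc{
		{Vector: []float32{1, 0}, Text: "cats"},
		{Vector: []float32{0, 1}, Text: "dogs"},
		{Vector: []float32{0.7, 0.7}, Text: "pets"},
	}}
	hits, err := r.Retrieve(context.Background(), query, topK)
	if err != nil {
		return nil
	}
	texts := make([]string, len(hits))
	for i, h := range hits {
		texts[i] = h.Text
	}
	return texts
}

func main() {
	fmt.Println(topTexts([]float32{1, 0.1}, 2)) // [cats pets]
}
```

A scan like this is linear in the number of stored documents, which is fine for a sketch or a test double. Vector databases such as Weaviate use approximate nearest-neighbor indexes to avoid that cost.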

type VectorStore added in v0.2.1

type VectorStore interface {
	Indexer
	Retriever

	fmt.Stringer
}

VectorStore combines Indexer, Retriever, and fmt.Stringer.
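Because VectorStore is built purely from embedded interfaces, one way to satisfy it is to embed an Indexer and a Retriever in a struct and add a String method. The sketch below shows that composition with no-op backends. The types are trimmed local copies, and `composedStore`, `stubIndexer`, `stubRetriever`, and `newStore` are hypothetical names that do not reflect how ragkit's own stores are built.

```go
package main

import (
	"context"
	"fmt"
)

// Trimmed local copies of the types documented on this page.
type Document struct{ ID, Text string }
type RetrievedDoc struct{ Text string }

type Indexer interface {
	Index(ctx context.Context, docs ...Document) ([]string, error)
	Delete(ctx context.Context, id string) error
	Exists(ctx context.Context, id string) (bool, error)
}

type Retriever interface {
	Retrieve(ctx context.Context, query []float32, topK int, metadataFieldNames ...string) ([]RetrievedDoc, error)
	RetrieveText(ctx context.Context, text string, topK int, metadataFieldNames ...string) ([]RetrievedDoc, error)
}

type VectorStore interface {
	Indexer
	Retriever

	fmt.Stringer
}

// stubIndexer and stubRetriever are hypothetical no-op backends.
type stubIndexer struct{}

func (stubIndexer) Index(context.Context, ...Document) ([]string, error) { return nil, nil }
func (stubIndexer) Delete(context.Context, string) error                 { return nil }
func (stubIndexer) Exists(context.Context, string) (bool, error)         { return false, nil }

type stubRetriever struct{}

func (stubRetriever) Retrieve(context.Context, []float32, int, ...string) ([]RetrievedDoc, error) {
	return nil, nil
}
func (stubRetriever) RetrieveText(context.Context, string, int, ...string) ([]RetrievedDoc, error) {
	return nil, nil
}

// composedStore gets the Indexer and Retriever methods promoted from its
// embedded fields, so it only has to supply String itself.
type composedStore struct {
	Indexer
	Retriever
	name string
}

func (s composedStore) String() string { return s.name }

func newStore(name string) VectorStore {
	return composedStore{Indexer: stubIndexer{}, Retriever: stubRetriever{}, name: name}
}

func main() {
	vs := newStore("stub-store")
	fmt.Println(vs) // stub-store
}
```

Code that only writes can accept an Indexer and code that only reads can accept a Retriever, and a single VectorStore value can be passed to both.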

Directories

Path                        Synopsis
embedder
examples
    weaviate-ollama         command
    weaviate-openai         command
vector_store
