rag

package
v0.1.1
Published: Sep 1, 2025 License: MIT Imports: 7 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func ChunkText

func ChunkText(text string, chunkSize, overlap int) []string

ChunkText takes a text string and divides it into chunks of a specified size with a given overlap. It returns a slice of strings, where each string represents a chunk of the original text.

Parameters:

  • text: The input text to be chunked.
  • chunkSize: The size of each chunk.
  • overlap: The amount of overlap between consecutive chunks.

Returns:

  • []string: A slice of strings representing the chunks of the original text.
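The overlapping-window behaviour described above can be sketched as follows. This is a minimal illustration, not the package's source: the helper name chunkText is hypothetical, and counting sizes in runes and returning nil for invalid parameters are assumptions.

```go
package main

import "fmt"

// chunkText is a hypothetical stand-in for ChunkText. It slices text into
// windows of chunkSize runes; each window starts chunkSize-overlap runes
// after the previous one. Measuring in runes is an assumption.
func chunkText(text string, chunkSize, overlap int) []string {
	if chunkSize <= 0 || overlap < 0 || overlap >= chunkSize {
		return nil // assumed handling of invalid parameters
	}
	runes := []rune(text)
	step := chunkSize - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + chunkSize
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the last window reached the end of the text
		}
	}
	return chunks
}

func main() {
	// With chunkSize 4 and overlap 1, consecutive chunks share one rune.
	for i, c := range chunkText("abcdefghij", 4, 1) {
		fmt.Printf("%d: %q\n", i, c)
	}
}
```

In this sketch the overlap must be strictly smaller than the chunk size, otherwise the window would never advance.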

func ChunkWithMarkdownHierarchy

func ChunkWithMarkdownHierarchy(content string) []string

ChunkWithMarkdownHierarchy processes Markdown content into formatted chunks that carry their hierarchical context, returned as a slice of strings.

func SplitMarkdownBySections

func SplitMarkdownBySections(markdown string) []string

SplitMarkdownBySections splits Markdown content into sections at header boundaries.
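One plausible way to split at header boundaries is sketched below. The helper names isHeader and splitBySections are hypothetical, and the rules used here (ATX headers only, empty sections dropped, no special handling of fenced code) are assumptions rather than the package's documented behaviour.

```go
package main

import (
	"fmt"
	"strings"
)

// isHeader reports whether line looks like an ATX header:
// one to six '#' characters followed by a space.
func isHeader(line string) bool {
	trimmed := strings.TrimLeft(line, "#")
	level := len(line) - len(trimmed)
	return level >= 1 && level <= 6 && strings.HasPrefix(trimmed, " ")
}

// splitBySections is a hypothetical stand-in for SplitMarkdownBySections.
// Every header line starts a new section; text before the first header
// becomes its own section. Blank-only sections are dropped (an assumption).
func splitBySections(markdown string) []string {
	var sections, current []string
	flush := func() {
		if s := strings.TrimSpace(strings.Join(current, "\n")); s != "" {
			sections = append(sections, s)
		}
		current = nil
	}
	for _, line := range strings.Split(markdown, "\n") {
		if isHeader(line) {
			flush()
		}
		current = append(current, line)
	}
	flush()
	return sections
}

func main() {
	doc := "intro\n# A\ntext a\n## B\ntext b"
	for i, s := range splitBySections(doc) {
		fmt.Printf("section %d: %q\n", i, s)
	}
}
```

A line such as "# comment" inside a fenced code block would also be treated as a header by this sketch.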

func SplitTextWithDelimiter

func SplitTextWithDelimiter(text string, delimiter string) []string

SplitTextWithDelimiter splits the given text using the specified delimiter and returns a slice of strings.

Parameters:

  • text: The text to be split.
  • delimiter: The delimiter used to split the text.

Returns:

  • []string: A slice of strings containing the split parts of the text.
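The description above matches what the standard library's strings.Split does. The sketch below assumes that equivalence, which the documentation does not confirm; the wrapper name splitTextWithDelimiter is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// splitTextWithDelimiter is a hypothetical stand-in that simply delegates to
// strings.Split. Whether the real function trims parts or drops empty ones
// is not documented, so this sketch does neither.
func splitTextWithDelimiter(text, delimiter string) []string {
	return strings.Split(text, delimiter)
}

func main() {
	// Adjacent delimiters yield an empty string between them.
	parts := splitTextWithDelimiter("a,,b", ",")
	fmt.Printf("%d parts: %q\n", len(parts), parts)
}
```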

Types

type MarkdownChunk

type MarkdownChunk struct {
	Header         string
	Content        string
	Level          int
	Prefix         string
	ParentLevel    int
	ParentHeader   string
	ParentPrefix   string
	Hierarchy      string
	SimpleMetaData string                 // Additional metadata if needed
	Metadata       map[string]interface{} // additional metadata
	KeyWords       []string               // Keywords that could be extracted from the content
}

MarkdownChunk represents a parsed Markdown section with hierarchical context.

func ParseMarkdownHierarchy

func ParseMarkdownHierarchy(content string) []MarkdownChunk

ParseMarkdownHierarchy parses the given Markdown content and returns a slice of MarkdownChunk structs that preserve the hierarchical context.
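To illustrate what preserving hierarchical context can mean, the sketch below tracks open headers on a stack and records each section's parent and full path. It uses a reduced, hypothetical chunk type with a subset of the MarkdownChunk fields; the " > " path separator and the parsing rules are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// chunk is a reduced, hypothetical version of MarkdownChunk.
type chunk struct {
	Header       string
	Content      string
	Level        int
	ParentHeader string
	Hierarchy    string
}

// parseHierarchy is a hypothetical stand-in for ParseMarkdownHierarchy.
func parseHierarchy(content string) []chunk {
	var chunks []chunk
	var stack []chunk // currently open headers, outermost first
	for _, line := range strings.Split(content, "\n") {
		trimmed := strings.TrimLeft(line, "#")
		level := len(line) - len(trimmed)
		if level >= 1 && level <= 6 && strings.HasPrefix(trimmed, " ") {
			// Close headers at the same or a deeper level.
			for len(stack) > 0 && stack[len(stack)-1].Level >= level {
				stack = stack[:len(stack)-1]
			}
			c := chunk{Header: strings.TrimSpace(trimmed), Level: level}
			path := make([]string, 0, len(stack)+1)
			for _, p := range stack {
				path = append(path, p.Header)
			}
			if len(stack) > 0 {
				c.ParentHeader = stack[len(stack)-1].Header
			}
			c.Hierarchy = strings.Join(append(path, c.Header), " > ")
			stack = append(stack, c)
			chunks = append(chunks, c)
			continue
		}
		if len(chunks) > 0 { // body text belongs to the latest header
			chunks[len(chunks)-1].Content += line + "\n"
		}
	}
	for i := range chunks {
		chunks[i].Content = strings.TrimSpace(chunks[i].Content)
	}
	return chunks
}

func main() {
	doc := "# Guide\nintro\n## Install\nsteps\n### Linux\napt\n## Usage\nrun"
	for _, c := range parseHierarchy(doc) {
		fmt.Printf("L%d %-30s parent=%q\n", c.Level, c.Hierarchy, c.ParentHeader)
	}
}
```

Text that appears before the first header is ignored in this sketch; the real function may treat it differently.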

type MemoryVectorStore

type MemoryVectorStore struct {
	Records map[string]VectorRecord
}

MemoryVectorStore implements VectorStore using in-memory storage.

func (*MemoryVectorStore) GetAll

func (mvs *MemoryVectorStore) GetAll() ([]VectorRecord, error)

GetAll returns all vector records stored in the MemoryVectorStore.

func (*MemoryVectorStore) Load

func (mvs *MemoryVectorStore) Load(storeFilePath string) error

Load reads vector records from a JSON file and populates the MemoryVectorStore.

func (*MemoryVectorStore) Persist

func (mvs *MemoryVectorStore) Persist(storeFilePath string) error

Persist saves the MemoryVectorStore to a JSON file.
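A JSON round trip of the kind Persist and Load describe might look like the sketch below. The store type and its persist and load methods are hypothetical stand-ins; the on-disk layout (the whole struct marshalled with encoding/json) and the file permissions are assumptions, not the package's documented format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
	"path/filepath"
)

// VectorRecord mirrors the struct documented on this page.
type VectorRecord struct {
	Id               string    `json:"id"`
	Prompt           string    `json:"prompt"`
	Embedding        []float64 `json:"embedding"`
	CosineSimilarity float64
}

// store is a hypothetical stand-in for MemoryVectorStore.
type store struct {
	Records map[string]VectorRecord
}

// persist writes the store as indented JSON (assumed file layout).
func (s *store) persist(path string) error {
	data, err := json.MarshalIndent(s, "", "  ")
	if err != nil {
		return err
	}
	return os.WriteFile(path, data, 0o644)
}

// load replaces the store's contents with what is found in the file.
func (s *store) load(path string) error {
	data, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, s)
}

// roundTrip persists one record to a temporary file and loads it back.
func roundTrip() (store, error) {
	dir, err := os.MkdirTemp("", "ragstore")
	if err != nil {
		return store{}, err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "store.json")

	original := store{Records: map[string]VectorRecord{
		"doc-1": {Id: "doc-1", Prompt: "hello", Embedding: []float64{0.1, 0.2}},
	}}
	if err := original.persist(path); err != nil {
		return store{}, err
	}
	var restored store
	if err := restored.load(path); err != nil {
		return store{}, err
	}
	return restored, nil
}

func main() {
	restored, err := roundTrip()
	if err != nil {
		fmt.Println("round trip failed:", err)
		return
	}
	fmt.Println(restored.Records["doc-1"].Prompt)
}
```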

func (*MemoryVectorStore) ResetMemory

func (mvs *MemoryVectorStore) ResetMemory() error

ResetMemory clears all vector records from the MemoryVectorStore.

func (*MemoryVectorStore) Save

func (mvs *MemoryVectorStore) Save(vectorRecord VectorRecord) (VectorRecord, error)

Save saves a vector record to the MemoryVectorStore. If the record does not have an ID, it generates a new UUID for it. It returns the saved vector record and an error if any occurred during the save operation. If the record already exists, it will be overwritten.
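The save-or-overwrite behaviour described above can be sketched as follows. The types and method names are hypothetical, and the identifier is a random 128-bit hex string from crypto/rand, used here as a standard-library substitute for the UUID the documentation mentions.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// record is a reduced, hypothetical version of VectorRecord.
type record struct {
	Id        string
	Prompt    string
	Embedding []float64
}

// store is a hypothetical stand-in for MemoryVectorStore.
type store struct {
	Records map[string]record
}

// newID returns 16 random bytes as hex. This is NOT a UUID; it only stands
// in for one so the sketch needs no third-party dependency.
func newID() (string, error) {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		return "", err
	}
	return hex.EncodeToString(b), nil
}

// save assigns an ID when the record has none, then stores the record under
// that ID, overwriting any existing entry with the same key.
func (s *store) save(r record) (record, error) {
	if r.Id == "" {
		id, err := newID()
		if err != nil {
			return record{}, err
		}
		r.Id = id
	}
	if s.Records == nil {
		s.Records = make(map[string]record)
	}
	s.Records[r.Id] = r
	return r, nil
}

func main() {
	s := &store{}
	saved, _ := s.save(record{Prompt: "first"})
	fmt.Println("generated id length:", len(saved.Id))

	// Saving again with the same ID overwrites the earlier record.
	s.save(record{Id: saved.Id, Prompt: "second"})
	fmt.Println(len(s.Records), s.Records[saved.Id].Prompt)
}
```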

func (*MemoryVectorStore) SearchSimilarities

func (mvs *MemoryVectorStore) SearchSimilarities(embeddingFromQuestion VectorRecord, limit float64) ([]VectorRecord, error)

SearchSimilarities searches the MemoryVectorStore for vector records whose cosine similarity to the given record is greater than or equal to the given limit.

Parameters:

  • embeddingFromQuestion: the vector record to compare similarities with.
  • limit: the minimum cosine similarity threshold.

Returns:

  • []VectorRecord: a slice of vector records whose cosine similarity is greater than or equal to the limit.
  • error: an error if any occurred during the search.
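The threshold search described above amounts to scoring every stored embedding against the query and keeping those at or above the limit. The sketch below uses the usual cosine similarity formula; the names cosine and searchSimilarities are hypothetical, records are held in a slice for determinism, and returning 0 for mismatched or zero-length vectors is an assumption.

```go
package main

import (
	"fmt"
	"math"
)

// record is a reduced, hypothetical version of VectorRecord.
type record struct {
	Id               string
	Embedding        []float64
	CosineSimilarity float64
}

// cosine returns dot(a, b) / (|a| * |b|). It returns 0 for mismatched
// lengths or zero vectors (an assumption about edge-case handling).
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// searchSimilarities keeps the records whose similarity to query is >= limit
// and stores the score on each returned copy.
func searchSimilarities(records []record, query record, limit float64) []record {
	var out []record
	for _, r := range records {
		r.CosineSimilarity = cosine(query.Embedding, r.Embedding)
		if r.CosineSimilarity >= limit {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	records := []record{
		{Id: "a", Embedding: []float64{1, 0}},
		{Id: "b", Embedding: []float64{0, 1}},
		{Id: "c", Embedding: []float64{1, 1}},
	}
	query := record{Embedding: []float64{1, 0}}
	for _, r := range searchSimilarities(records, query, 0.7) {
		fmt.Printf("%s %.3f\n", r.Id, r.CosineSimilarity)
	}
}
```

Here record "b" is orthogonal to the query and is filtered out, while "a" and "c" pass the 0.7 threshold.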

func (*MemoryVectorStore) SearchTopNSimilarities

func (mvs *MemoryVectorStore) SearchTopNSimilarities(embeddingFromQuestion VectorRecord, limit float64, max int) ([]VectorRecord, error)

SearchTopNSimilarities searches for the top N similar vector records based on the given embedding from a question. It returns a slice of vector records and an error if any. The limit parameter specifies the minimum similarity score for a record to be considered similar. The max parameter specifies the maximum number of vector records to return.
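A top-N search can be built from three steps: filter by the minimum score, sort by score in descending order, and truncate. The sketch below assumes the records have already been scored; the name topN, the parameter name maxResults, and the descending order are assumptions, since the documentation does not state the ordering of the results.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a reduced, hypothetical version of VectorRecord whose
// CosineSimilarity has already been computed against the query.
type record struct {
	Id               string
	CosineSimilarity float64
}

// topN keeps records scoring at least limit, orders them from most to least
// similar, and returns at most maxResults of them.
func topN(scored []record, limit float64, maxResults int) []record {
	kept := make([]record, 0, len(scored))
	for _, r := range scored {
		if r.CosineSimilarity >= limit {
			kept = append(kept, r)
		}
	}
	sort.Slice(kept, func(i, j int) bool {
		return kept[i].CosineSimilarity > kept[j].CosineSimilarity
	})
	if maxResults >= 0 && len(kept) > maxResults {
		kept = kept[:maxResults]
	}
	return kept
}

func main() {
	scored := []record{
		{Id: "a", CosineSimilarity: 0.9},
		{Id: "b", CosineSimilarity: 0.2},
		{Id: "c", CosineSimilarity: 0.7},
		{Id: "d", CosineSimilarity: 0.8},
	}
	for _, r := range topN(scored, 0.5, 2) {
		fmt.Println(r.Id, r.CosineSimilarity)
	}
}
```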

type VectorRecord

type VectorRecord struct {
	Id               string    `json:"id"`
	Prompt           string    `json:"prompt"`
	Embedding        []float64 `json:"embedding"`
	CosineSimilarity float64
}

VectorRecord represents a stored vector with its metadata and similarity score.

type VectorStore

type VectorStore interface {
	GetAll() ([]VectorRecord, error)
	Save(vectorRecord VectorRecord) (VectorRecord, error)
	SearchSimilarities(embeddingFromQuestion VectorRecord, limit float64) ([]VectorRecord, error)
	SearchTopNSimilarities(embeddingFromQuestion VectorRecord, limit float64, max int) ([]VectorRecord, error)
}

VectorStore defines the interface for storing and searching vector embeddings.
