kodit

package module
v1.1.3
Published: Mar 6, 2026 License: Apache-2.0 Imports: 33 Imported by: 0

README

Kodit: A Code Indexing MCP Server

Kodit connects your AI coding assistant to external codebases to provide accurate and up-to-date snippets of code.

⭐ Help us reach more developers and grow the Helix community. Star this repo!

Helix Kodit is an MCP server that connects your AI coding assistant to external codebases. It can:

  • Improve your AI-assisted code by providing canonical examples direct from the source
  • Index local and public codebases
  • Integrate with any AI coding assistant via MCP
  • Search using keyword and semantic search
  • Integrate with any OpenAI-compatible or custom API/model

If you're an engineer working with AI-powered coding assistants, Kodit helps by providing relevant and up-to-date examples for your task so that LLMs make fewer mistakes and produce fewer hallucinations.

Features

Codebase Indexing

Kodit connects to a variety of local and remote codebases to build an index of your code. This index is used to build a snippet library, ready for ingestion into an LLM.

  • Index local directories and public Git repositories
  • Build comprehensive snippet libraries for LLM ingestion
  • Support for 20+ programming languages including Python, JavaScript/TypeScript, Java, Go, Rust, C/C++, C#, HTML/CSS, and more
  • Advanced code analysis with dependency tracking and call graph generation
  • Intelligent snippet extraction with context-aware dependencies
  • Efficient indexing with selective reindexing (only processes modified files)
  • Privacy first: respects .gitignore and .noindex files
  • Auto-indexing configuration for shared server deployments
  • Enhanced Git provider support including Azure DevOps
  • Index private repositories via a PAT
  • Improved progress monitoring and reporting during indexing
  • Advanced code slicing infrastructure with Tree-sitter parsing
  • Automatic periodic sync to keep indexes up-to-date
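Selective reindexing depends on knowing which files changed since the last run. How Kodit detects this is not described here. One common approach, shown below as a minimal sketch, is to compare a content hash per file against the hash recorded previously. The `contentHash` and `changedFiles` helpers and their map-based inputs are hypothetical and not part of Kodit.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// contentHash returns the hex-encoded SHA-256 digest of a file's content.
func contentHash(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// changedFiles compares current file contents against the hashes recorded
// during the previous run and returns the paths that are new or modified,
// sorted for stable output. Hypothetical helper, not a Kodit API.
func changedFiles(previous map[string]string, current map[string][]byte) []string {
	var changed []string
	for path, content := range current {
		if previous[path] != contentHash(content) {
			changed = append(changed, path)
		}
	}
	sort.Strings(changed)
	return changed
}

func main() {
	previous := map[string]string{
		"main.go": contentHash([]byte("package main")),
		"util.go": contentHash([]byte("package util")),
	}
	current := map[string][]byte{
		"main.go": []byte("package main"),           // unchanged
		"util.go": []byte("package util // edited"), // modified
		"new.go":  []byte("package new"),            // added
	}
	// Only the added and modified paths need to be reprocessed.
	fmt.Println(changedFiles(previous, current))
}
```

Deleted files are not handled in this sketch; a real indexer would also need to drop entries whose paths no longer exist.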

MCP Server

Relevant snippets are exposed to an AI coding assistant via an MCP server. This allows the assistant to request relevant snippets by providing keywords, code, and semantic intent. The server provides:

  • Seamless integration with popular AI coding assistants
  • Tested and verified with:
  • Please contribute more instructions! ... any other assistant is likely to work ...
  • Advanced search filters by source, language, author, date range, and file path
  • Hybrid search combining BM25 keyword search with semantic search
  • Enhanced MCP tools with rich context parameters and metadata

Hosted MCP Server

Try Kodit instantly with our hosted MCP server at https://kodit.helix.ml/mcp! No installation required - just add it to your AI coding assistant and start searching popular codebases immediately.

The hosted server provides:

  • Pre-indexed popular open source repositories
  • Zero configuration - works out of the box
  • Same powerful search capabilities as self-hosted Kodit
  • Perfect for trying Kodit before setting up your own instance

Find out more in the hosted Kodit documentation.

Enterprise Ready

Out of the box, Kodit works with a local SQLite database and very small, local models. But enterprises can scale out with performant databases and dedicated models. Everything can even run securely, privately, with on-premise LLM platforms like Helix.

Supported databases:

Supported providers:

  • Local (which uses tiny CPU-only open-source models)
  • OpenAI
  • Secure, private LLM enclave with Helix.
  • Any other OpenAI compatible API

Enhanced deployment options:

  • Docker Compose configurations with VectorChord
  • Kubernetes manifests for production deployments

Quick Start

  1. Install Kodit
  2. Index codebases
  3. Integrate with your coding assistant

Documentation

Roadmap

The roadmap is currently maintained as a GitHub Project.

💬 Support

For commercial support, please contact Helix.ML. To ask a question, please open a discussion.

License

Apache 2.0 © 2026 HelixML, Inc.

Documentation

Overview

Package kodit provides a library for code understanding, indexing, and search.

Kodit indexes Git repositories, extracts semantic code snippets using AST parsing, and provides hybrid search (BM25 + vector embeddings) with LLM-powered enrichments.

Basic usage:

client, err := kodit.New(
    kodit.WithSQLite(".kodit/data.db"),
    kodit.WithOpenAI(os.Getenv("OPENAI_API_KEY")),
)
if err != nil {
    log.Fatal(err)
}
defer client.Close()

// Index a repository. The returned repository value is discarded here.
_, err = client.Repositories.Add(ctx, &service.RepositoryAddParams{
    URL: "https://github.com/kubernetes/kubernetes",
})
if err != nil {
    log.Fatal(err)
}

// Hybrid search
results, err := client.Search.Query(ctx, "create a deployment",
    service.WithSemanticWeight(0.7),
    service.WithLimit(10),
)
if err != nil {
    log.Fatal(err)
}

// Iterate results
for _, snippet := range results.Snippets() {
    fmt.Println(snippet.Path(), snippet.Name())
}
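
The `WithSemanticWeight(0.7)` option in the usage example suggests that keyword and semantic scores are combined with a tunable weight. How Kodit fuses the two rankings is not stated here, so the following is only a sketch of one simple possibility: a linear blend of scores that are assumed to be normalized to [0, 1]. The `scored` type and `blend` function are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// scored pairs a document ID with its blended score.
type scored struct {
	ID    string
	Score float64
}

// blend mixes keyword and semantic scores per document. A document missing
// from one map contributes 0 for that side. Results are ordered by
// descending score, with ties broken by ID for determinism.
// Assumption: both inputs are already normalized to [0, 1].
func blend(keyword, semantic map[string]float64, semanticWeight float64) []scored {
	ids := map[string]struct{}{}
	for id := range keyword {
		ids[id] = struct{}{}
	}
	for id := range semantic {
		ids[id] = struct{}{}
	}
	out := make([]scored, 0, len(ids))
	for id := range ids {
		s := semanticWeight*semantic[id] + (1-semanticWeight)*keyword[id]
		out = append(out, scored{ID: id, Score: s})
	}
	sort.Slice(out, func(i, j int) bool {
		if out[i].Score != out[j].Score {
			return out[i].Score > out[j].Score
		}
		return out[i].ID < out[j].ID
	})
	return out
}

func main() {
	keyword := map[string]float64{"deploy.go": 0.9, "readme.md": 0.4}
	semantic := map[string]float64{"deploy.go": 0.5, "client.go": 0.8}
	for _, r := range blend(keyword, semantic, 0.7) {
		fmt.Printf("%s %.2f\n", r.ID, r.Score)
	}
}
```

Under this assumption a weight near 1 favors embedding similarity and a weight near 0 favors BM25 keyword matches. Rank-based fusion is a common alternative when the two score scales are not comparable.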

Constants

This section is empty.

Variables

var (
	// ErrEmptySource indicates a source with no content to process.
	ErrEmptySource = errors.New("kodit: source is empty")

	// ErrNotFound indicates a requested resource was not found.
	ErrNotFound = errors.New("kodit: not found")

	// ErrValidation indicates a validation error.
	ErrValidation = errors.New("kodit: validation error")

	// ErrConflict indicates a conflict with existing data.
	ErrConflict = errors.New("kodit: conflict")

	// ErrNoDatabase indicates no database was configured.
	ErrNoDatabase = errors.New("kodit: no database configured")

	// ErrNoProvider indicates no AI provider was configured.
	ErrNoProvider = errors.New("kodit: no AI provider configured")

	// ErrProviderNotCapable indicates the provider lacks required capability.
	ErrProviderNotCapable = errors.New("kodit: provider does not support required capability")

	// ErrClientClosed is the canonical error for a closed client.
	// It references the service-level error so errors.Is works across packages.
	ErrClientClosed = service.ErrClientClosed
)

Exported errors for library consumers.
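
Sentinel errors like these are normally matched with `errors.Is`, which also sees through wrapping added with `%w`. The sketch below is self-contained, so it declares its own `errNotFound` as a stand-in for a sentinel such as `ErrNotFound`; `findRepository` and `isNotFound` are hypothetical helpers.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a package-level sentinel such as ErrNotFound.
var errNotFound = errors.New("kodit: not found")

// findRepository is a hypothetical lookup that always fails, wrapping the
// sentinel with extra context via %w.
func findRepository(id int) error {
	return fmt.Errorf("repository %d: %w", id, errNotFound)
}

// isNotFound reports whether err is, or wraps, the not-found sentinel.
func isNotFound(err error) bool {
	return errors.Is(err, errNotFound)
}

func main() {
	err := findRepository(42)
	if isNotFound(err) {
		fmt.Println("handle missing repository:", err)
	}
}
```

Comparing with `==` would fail here because the returned error is a wrapper. `errors.Is` walks the wrap chain, which is the behavior the `ErrClientClosed` comment relies on when it says the check works across packages.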

Functions

This section is empty.

Types

type Client

type Client struct {
	// Public resource fields (direct service access)
	Repositories *service.Repository
	Commits      *service.Commit
	Tags         *service.Tag
	Files        *service.File
	Blobs        *service.Blob
	Enrichments  *service.Enrichment
	Tasks        *service.Queue
	Tracking     *service.Tracking
	Search       *service.Search
	Grep         *service.Grep

	// MCPServer describes the MCP server's tools and instructions.
	MCPServer MCPServer
	// contains filtered or unexported fields
}

Client is the main entry point for the kodit library. The background worker starts automatically on creation.

Access resources via struct fields:

client.Repositories.Find(ctx)
client.Commits.Find(ctx, repository.WithRepoID(id))
client.Search.Query(ctx, "query")

func New

func New(opts ...Option) (*Client, error)

New creates a new Client with the given options. The background worker is started automatically.

func (*Client) Close

func (c *Client) Close() error

Close releases all resources and stops the background worker.

func (*Client) Logger

func (c *Client) Logger() zerolog.Logger

Logger returns the client's logger.

func (*Client) WorkerIdle

func (c *Client) WorkerIdle() bool

WorkerIdle reports whether the background worker has no in-flight tasks.
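
Because indexing runs in the background, callers such as tests may want to block until the worker has drained. A minimal sketch of that, polling anything with a `WorkerIdle() bool` method until it reports idle or a context expires, could look like this. The `idler` interface, `waitIdle` helper, and `fakeWorker` type are hypothetical. Whether a freshly queued task already counts as in flight is not stated here, so treat the result as a hint.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// idler matches the documented WorkerIdle method signature.
type idler interface {
	WorkerIdle() bool
}

// waitIdle polls w at the given interval until it reports idle, or returns
// the context's error if ctx is done first. Hypothetical helper.
func waitIdle(ctx context.Context, w idler, interval time.Duration) error {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		if w.WorkerIdle() {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-ticker.C:
		}
	}
}

// fakeWorker reports busy for a fixed number of polls, then idle.
type fakeWorker struct{ remaining int }

func (f *fakeWorker) WorkerIdle() bool {
	if f.remaining == 0 {
		return true
	}
	f.remaining--
	return false
}

// runDemo waits for a fake worker that becomes idle after three polls.
func runDemo() error {
	ctx, cancel := context.WithTimeout(context.Background(), time.Second)
	defer cancel()
	return waitIdle(ctx, &fakeWorker{remaining: 3}, time.Millisecond)
}

func main() {
	if err := runDemo(); err != nil {
		fmt.Println("worker did not become idle:", err)
		return
	}
	fmt.Println("worker is idle")
}
```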

type MCPServer (added in v1.1.3)

type MCPServer struct {
	// contains filtered or unexported fields
}

MCPServer describes the metadata of a kodit MCP server: its usage instructions and the tools it provides.

func NewMCPServer (added in v1.1.3)

func NewMCPServer(instructions string, tools []Tool) MCPServer

NewMCPServer creates an MCPServer.

func (MCPServer) Instructions (added in v1.1.3)

func (s MCPServer) Instructions() string

Instructions returns the server's usage instructions.

func (MCPServer) Tools (added in v1.1.3)

func (s MCPServer) Tools() []Tool

Tools returns a copy of the server's tools.
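
Returning a copy means callers cannot change the server's tool list by writing to the slice they receive. The general pattern is sketched below with a hypothetical `registry` type. Copying on construction as well is an assumption of this sketch, not something documented for `NewMCPServer`.

```go
package main

import "fmt"

// registry is a hypothetical value type that protects its internal slice.
type registry struct {
	names []string
}

// newRegistry copies the input so that later changes made by the caller to
// its own slice do not leak into the registry.
func newRegistry(names []string) registry {
	return registry{names: append([]string(nil), names...)}
}

// Names returns a fresh copy, so callers cannot mutate the registry's state.
func (r registry) Names() []string {
	return append([]string(nil), r.names...)
}

func main() {
	r := newRegistry([]string{"search", "grep"})

	got := r.Names()
	got[0] = "mutated" // only changes the caller's copy

	fmt.Println(r.Names()[0])
}
```

The copy is shallow: it protects the slice itself, which is sufficient when the elements are immutable values such as strings or structs with only unexported fields and accessor methods.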

type Option

type Option func(*clientConfig)

Option configures the Client.
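
`Option` follows the functional options pattern: each `With...` function returns a closure that mutates an unexported configuration struct, and the constructor applies the closures in order over a set of defaults. The sketch below reproduces the pattern in isolation. The `config` struct, its fields, the `"data"` default, and the lowercase option names are all hypothetical stand-ins, since `clientConfig` is unexported.

```go
package main

import "fmt"

// config is a hypothetical stand-in for an unexported configuration struct.
type config struct {
	dataDir     string
	parallelism int
}

// option mutates a config, mirroring the shape of `func(*clientConfig)`.
type option func(*config)

// withDataDir overrides the data directory.
func withDataDir(dir string) option {
	return func(c *config) { c.dataDir = dir }
}

// withParallelism overrides the parallelism. Non-positive values are
// ignored, leaving the default in place.
func withParallelism(n int) option {
	return func(c *config) {
		if n > 0 {
			c.parallelism = n
		}
	}
}

// newConfig starts from defaults and applies the options in order, so a
// later option wins over an earlier one that sets the same field.
func newConfig(opts ...option) config {
	cfg := config{dataDir: "data", parallelism: 1} // illustrative defaults
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	cfg := newConfig(withDataDir("/var/lib/example"), withParallelism(0))
	fmt.Printf("dataDir=%s parallelism=%d\n", cfg.dataDir, cfg.parallelism)
}
```

The pattern keeps the constructor signature stable as options are added, and lets each option validate its own input.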

func WithAPIKeys

func WithAPIKeys(keys ...string) Option

WithAPIKeys sets the API keys for HTTP API authentication.

func WithAnthropic

func WithAnthropic(apiKey string) Option

WithAnthropic sets Anthropic Claude as the text generation provider. Requires a separate embedding provider since Anthropic doesn't provide embeddings.

func WithAnthropicConfig

func WithAnthropicConfig(cfg provider.AnthropicConfig) Option

WithAnthropicConfig sets Anthropic Claude with custom configuration.

func WithChunkParams

func WithChunkParams(params chunking.ChunkParams) Option

WithChunkParams sets the chunk parameters for simple chunking.

func WithCloneDir

func WithCloneDir(dir string) Option

WithCloneDir sets the directory where repositories are cloned. If not specified, defaults to {dataDir}/repos.

func WithCloser

func WithCloser(c io.Closer) Option

WithCloser registers a resource to be closed when the Client shuts down.

func WithDataDir

func WithDataDir(dir string) Option

WithDataDir sets the data directory for cloned repositories and database storage.

func WithEmbeddingBudget

func WithEmbeddingBudget(b search.TokenBudget) Option

WithEmbeddingBudget sets the token budget for code embedding batches.
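
A token budget caps how much text goes into a single embedding request. The fields of `search.TokenBudget` are not shown here, so the sketch below only illustrates the general idea: greedily pack texts into batches whose estimated token total stays within a limit. The `estimateTokens` heuristic (roughly four bytes per token) and the `batchByBudget` helper are assumptions made for illustration.

```go
package main

import "fmt"

// estimateTokens is a rough heuristic of about four bytes per token,
// rounded up. Real tokenizers will give different counts.
func estimateTokens(text string) int {
	return (len(text) + 3) / 4
}

// batchByBudget greedily packs texts, in order, into batches whose estimated
// token total does not exceed budget. A single text that alone exceeds the
// budget is placed in a batch of its own rather than dropped.
func batchByBudget(texts []string, budget int) [][]string {
	var batches [][]string
	var current []string
	used := 0
	for _, t := range texts {
		n := estimateTokens(t)
		if len(current) > 0 && used+n > budget {
			batches = append(batches, current)
			current, used = nil, 0
		}
		current = append(current, t)
		used += n
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

func main() {
	texts := []string{"func a() {}", "func b() { return }", "x"}
	for i, b := range batchByBudget(texts, 6) {
		fmt.Printf("batch %d: %d item(s)\n", i, len(b))
	}
}
```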

func WithEmbeddingParallelism

func WithEmbeddingParallelism(n int) Option

WithEmbeddingParallelism sets how many embedding batches are dispatched concurrently. Defaults to 1. Values <= 0 are ignored.
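
Dispatching batches concurrently with an upper bound is commonly done with a buffered channel used as a semaphore. The sketch below shows that technique in isolation and is not Kodit's implementation. `processBatches` is a hypothetical helper that falls back to a parallelism of 1 for non-positive values, mirroring the documented default.

```go
package main

import (
	"fmt"
	"sync"
)

// processBatches calls fn once per batch with at most `parallelism` calls
// running at the same time, and returns the results in input order.
func processBatches(batches [][]string, parallelism int, fn func([]string) int) []int {
	if parallelism <= 0 {
		parallelism = 1
	}
	results := make([]int, len(batches))
	sem := make(chan struct{}, parallelism) // counting semaphore
	var wg sync.WaitGroup
	for i, b := range batches {
		wg.Add(1)
		sem <- struct{}{} // blocks while `parallelism` calls are in flight
		go func(i int, b []string) {
			defer wg.Done()
			defer func() { <-sem }()
			results[i] = fn(b) // each goroutine writes its own index
		}(i, b)
	}
	wg.Wait()
	return results
}

func main() {
	batches := [][]string{{"a", "b"}, {"c"}, {"d", "e", "f"}}
	sizes := processBatches(batches, 2, func(b []string) int { return len(b) })
	fmt.Println(sizes)
}
```

Higher parallelism shortens wall-clock time only while the embedding provider can absorb the extra concurrent requests; rate limits on the provider side are a typical reason to keep the value low.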

func WithEmbeddingProvider

func WithEmbeddingProvider(p provider.Embedder) Option

WithEmbeddingProvider sets a custom embedding provider.

func WithEnricherParallelism

func WithEnricherParallelism(n int) Option

WithEnricherParallelism sets how many enrichment LLM requests are dispatched concurrently. Defaults to 1. Values <= 0 are ignored.

func WithEnrichmentBudget

func WithEnrichmentBudget(b search.TokenBudget) Option

WithEnrichmentBudget sets the token budget for enrichment embedding batches.

func WithEnrichmentParallelism

func WithEnrichmentParallelism(n int) Option

WithEnrichmentParallelism sets how many enrichment embedding batches are dispatched concurrently. Defaults to 1. Values <= 0 are ignored.

func WithLogger

func WithLogger(l zerolog.Logger) Option

WithLogger sets a custom logger.

func WithModelDir

func WithModelDir(dir string) Option

WithModelDir sets the directory where built-in model files are stored. Defaults to {dataDir}/models if not specified.

func WithOpenAI

func WithOpenAI(apiKey string) Option

WithOpenAI sets OpenAI as the AI provider (text + embeddings).

func WithOpenAIConfig

func WithOpenAIConfig(cfg provider.OpenAIConfig) Option

WithOpenAIConfig sets OpenAI with custom configuration.

func WithPeriodicSyncConfig

func WithPeriodicSyncConfig(cfg config.PeriodicSyncConfig) Option

WithPeriodicSyncConfig sets the periodic sync configuration.

func WithPostgresVectorchord

func WithPostgresVectorchord(dsn string) Option

WithPostgresVectorchord configures PostgreSQL with the VectorChord extension. VectorChord provides both BM25 and vector search.

func WithSQLite

func WithSQLite(path string) Option

WithSQLite configures SQLite as the database. BM25 uses FTS5; vector search uses the configured embedding provider.

func WithSimpleChunking

func WithSimpleChunking() Option

WithSimpleChunking enables fixed-size text chunking instead of AST-based snippet extraction.
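
The chunking package synopsis describes this mode as fixed-size chunking with overlap. A minimal sketch of that technique follows. The `chunk` function, its `size` and `overlap` parameters, and the choice to measure in runes rather than bytes or tokens are assumptions made for illustration; the actual `chunking.ChunkParams` fields are not shown here.

```go
package main

import "fmt"

// chunk splits text into windows of at most `size` runes, where each window
// after the first starts `overlap` runes before the end of the previous one.
// It returns nil for invalid parameters (size <= 0, or overlap outside
// the range [0, size)).
func chunk(text string, size, overlap int) []string {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil
	}
	runes := []rune(text)
	step := size - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + size
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the final window reached the end of the text
		}
	}
	return chunks
}

func main() {
	for _, c := range chunk("abcdefghij", 4, 1) {
		fmt.Println(c)
	}
}
```

The overlap means text that straddles a chunk boundary still appears intact in at least one chunk, at the cost of indexing some content twice.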

func WithSkipProviderValidation

func WithSkipProviderValidation() Option

WithSkipProviderValidation skips the provider configuration validation. This is intended for testing only. In production, embedding and text providers are required for full functionality.

func WithTextProvider

func WithTextProvider(p provider.TextGenerator) Option

WithTextProvider sets a custom text generation provider.

func WithWorkerCount

func WithWorkerCount(n int) Option

WithWorkerCount sets the number of background worker goroutines. Defaults to 1 if not specified.

func WithWorkerPollPeriod

func WithWorkerPollPeriod(d time.Duration) Option

WithWorkerPollPeriod sets how often the background worker checks for new tasks. Defaults to 1 second. Lower values speed up task processing at the cost of more frequent polling, which is useful in tests.

type Parameter (added in v1.1.3)

type Parameter struct {
	// contains filtered or unexported fields
}

Parameter describes a single parameter accepted by an MCP tool.

func NewParameter (added in v1.1.3)

func NewParameter(name, description, typ string, required bool) Parameter

NewParameter creates a Parameter.

func (Parameter) Description (added in v1.1.3)

func (p Parameter) Description() string

Description returns the parameter description.

func (Parameter) Name (added in v1.1.3)

func (p Parameter) Name() string

Name returns the parameter name.

func (Parameter) Required (added in v1.1.3)

func (p Parameter) Required() bool

Required reports whether the parameter is required.

func (Parameter) Type (added in v1.1.3)

func (p Parameter) Type() string

Type returns the parameter type (e.g. "string", "number").

type Tool (added in v1.1.3)

type Tool struct {
	// contains filtered or unexported fields
}

Tool describes an MCP tool with its parameters.

func NewTool (added in v1.1.3)

func NewTool(name, description string, parameters []Parameter) Tool

NewTool creates a Tool.

func (Tool) Description (added in v1.1.3)

func (t Tool) Description() string

Description returns the tool description.

func (Tool) Name (added in v1.1.3)

func (t Tool) Name() string

Name returns the tool name.

func (Tool) Parameters (added in v1.1.3)

func (t Tool) Parameters() []Parameter

Parameters returns a copy of the tool's parameters.

Directories

Path                        Synopsis
application
  handler                   Package handler provides task handlers for processing queued operations.
  handler/enrichment        Package enrichment provides task handlers for enrichment operations.
  service                   Package service provides application layer services that orchestrate domain operations.
clients
  go                        Package kodit provides primitives to interact with the openapi HTTP API.
cmd
  download-model command    Standalone tool that converts the st-codesearch-distilroberta-base model to ONNX format for hugot embedding.
  kodit command             Package main is the entry point for the kodit CLI.
docs
  swagger                   Package swagger Code generated by swaggo/swag.
domain
  chunk                     Package chunk provides domain types for chunk-level metadata.
  enrichment                Package enrichment provides domain types for AI-generated semantic metadata.
  repository                Package repository provides Git repository domain types.
  search                    Package search provides search domain types for hybrid code retrieval.
  service                   Package service provides domain service interfaces.
  snippet                   Package snippet provides snippet domain types for content-addressed code fragments.
  task                      Package task provides task queue domain types for async work processing.
  tracking                  Package tracking provides progress tracking and reporting types for long-running tasks.
infrastructure
  api                       Package api provides HTTP server and API documentation.
  api/jsonapi               Package jsonapi provides JSON:API specification compliant types for API responses.
  api/middleware            Package middleware provides HTTP middleware for the API server.
  api/v1                    Package v1 provides the v1 API routes.
  api/v1/dto                Package dto provides data transfer objects for the API layer.
  chunking                  Package chunking provides fixed-size text chunking with overlap for RAG indexing.
  enricher                  Package enricher provides AI-powered enrichment generation.
  enricher/example          Package example provides extraction of code examples from documentation.
  git                       Package git provides Git repository infrastructure implementations.
  persistence               Package persistence provides database storage implementations.
  provider                  Package provider provides AI provider implementations for text generation and embedding generation.
  slicing                   Package slicing provides AST-based code snippet extraction using tree-sitter.
  slicing/language          Package language provides language-specific AST analyzers.
internal
  config                    Package config provides application configuration.
  database                  Package database provides database connection and session management using GORM.
  log                       Package log provides structured logging with correlation IDs.
  mcp                       Package mcp provides Model Context Protocol server functionality.
  testdb                    Package testdb provides a shared test database helper for fast, realistic testing against an in-memory SQLite database.
tools
  download-ort command      Build-time tool that downloads the ONNX Runtime shared library and the HuggingFace tokenizers static library for the current platform.
