Published: Mar 26, 2026 License: MIT


GoRa - A self-hosted AI companion with long-term memory, knowledge graphs, and emotional awareness


GoRa (Go-based Retrieval-Augmented Generation) is a self-hosted AI system that goes beyond simple document Q&A. It combines Ollama for local LLM inference, Redis for vector search, and Neo4j for structured knowledge graphs - and can operate in two modes:

  • Documentation mode: Chat with your local docs. Shared vector store, factual answers, no hallucinations.
  • Companion mode: A persistent AI companion that remembers you across sessions, builds a knowledge graph about your life, tracks emotional context, adapts its personality over time, and reaches out when you've been away.

Architecture

GoRa uses a hybrid retrieval strategy combining three complementary memory systems:

  • Vector Search (Redis): Semantic similarity search over document embeddings for context-aware retrieval.
  • Knowledge Graph (Neo4j): Structured facts and relationships extracted from documents and conversations - with confidence scoring, conflict detection, and hard-fact protection.
  • Rolling Memory: Session summaries that compress long conversations into concise context, preventing token overflow. Episodic memories are archived to Neo4j for long-term recall.

Requirements

To run GoRa, you need the following components:

  • Go (1.25.3 or higher)
  • Docker & Docker Compose (for Redis Stack & Neo4j)
  • Ollama with the following models (models can be changed in config.yml):
    • mxbai-embed-large: For high-performance embeddings.
    • mistral-nemo: For synthetic question generation, knowledge extraction, summarization, and safety classification.
    • gpt-oss:20b: For generating precise, context-aware answers.
    • qwen2.5:14b: Used for knowledge graph extraction.

Getting Started

We provide a Makefile to simplify all common tasks.

1. Spin up the infrastructure

Start Redis Stack and Neo4j:

make up

2. Prepare your data

Place your documentation (Markdown or Text files) into the /data directory. The system will automatically parse, chunk, and generate synthetic questions for these files to improve search accuracy.

3. Populate the databases

Convert your text into vectors and extract knowledge graphs:

make import

You can also import a specific file or directory:

go run ./cmd/database/import.go -path /path/to/file_or_dir

4. Start the conversation

Option A: Interactive CLI

make run

Option B: HTTP Server with Web UI

make http

Then open http://localhost:8080 in your browser.

Configuration

GoRa uses a config.yml file for all settings. If an env.yml file exists, it takes priority (useful for local overrides - env.yml is gitignored).
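The dotted keys in the tables below map onto nested YAML. A minimal config.yml fragment using the documented defaults (the nesting is inferred from the key names; see the shipped config.yml for the authoritative layout):

```yaml
settings:
  debug: false
  max_history_messages: 10

database:
  redis:
    uri: redis://localhost:6379
    ollama_model: mxbai-embed-large

ai:
  ollama_model: gpt-oss:20b
  temperature: 0.0
```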

General Settings
Key Description Default
settings.debug Enables verbose logging (similarity scores, prompts, graph queries). false
settings.max_history_messages Number of recent chat messages included in the LLM prompt. 10
settings.chat_history_ttl Time-to-live for chat history in hours. 0 = persist forever. 0
Logging
Key Description Default
settings.log_format Log output format: json for machine-readable, text for human-readable. text
settings.log_level Minimum log level: debug, info, warn, error. info
settings.log_path File path for log output. Empty = per-binary default (see below). logs/gora.log

Environment variable overrides: GORA_LOG_FORMAT, GORA_LOG_LEVEL, GORA_LOG_PATH.

When log_path is not set or empty, each binary writes to its own default file:

  • gora (CLI): ./logs/cli.log
  • gora-server (HTTP): ./logs/http.log
  • gora-import (Import): ./logs/import.log

Log files are automatically rotated: max 10 MB per file, 5 backups retained, 30 days max age, gzip-compressed.

Database - Redis (Vector Store)
Key Description Default
database.redis.ollama_model Model used for creating vector embeddings. mxbai-embed-large
database.redis.ollama_url Ollama URL for embedding generation. http://127.0.0.1:11434
database.redis.uri Redis connection string. Override: GORA_REDIS_URI. redis://localhost:6379
database.redis.index_name Base name for the search index in Redis. gora-doc
database.redis.append_model_name_to_index Appends the model name to the index (e.g., gora-doc-mxbai-embed-large). Prevents index pollution when switching models. true
database.redis.embed_dimension Dimension of the embedding vectors. Must match the model. 1024
database.redis.top_k_results Number of documents returned by similarity search for context. 5
database.redis.hnsw_m HNSW max edges per node. Higher = better recall, more memory. 16
database.redis.hnsw_ef_construction HNSW exploration factor at build time. Higher = better index quality, slower builds. 200
database.redis.hnsw_ef_runtime HNSW exploration factor at query time. Higher = better recall, slower queries. 10
Database - Neo4j (Knowledge Graph)
Key Description Default
database.neo4j.ollama_model Model used for knowledge graph extraction. mistral-nemo
database.neo4j.ollama_url Ollama URL for extraction. http://127.0.0.1:11434
database.neo4j.uri Neo4j Bolt connection URI. Override: GORA_NEO4J_URI. bolt://localhost:7687
database.neo4j.username Neo4j username. Override: GORA_NEO4J_USERNAME. neo4j
database.neo4j.password Neo4j password. Override: GORA_NEO4J_PASSWORD. password
database.neo4j.max_connection_lifetime Max lifetime of a connection in minutes. 30
database.neo4j.max_connection_pool_size Max number of connections in the pool. 50
database.neo4j.graph_search_limit Max results from graph knowledge searches. 100
Setup (Document Import)
Key Description Default
setup.ollama_model_synthetic_questions Fast model for generating synthetic questions during ingestion. mistral-nemo
setup.ollama_synthetic_questions_url Ollama URL for synthetic question generation. http://127.0.0.1:11434
setup.data_root_path Local directory containing your documentation files. data
setup.redis_chunk_size Character count per document chunk. 500
setup.redis_chunk_overlap Character overlap between chunks to maintain context. 100
setup.synthetic_question_worker_count Number of parallel workers for question generation. 8
Summarizer (Rolling Memory)
Key Description Default
summarizer.ollama_model Model used for generating conversation summaries. mistral-nemo
summarizer.ollama_url Ollama URL for summarization. http://127.0.0.1:11434
summarizer.rolling_chunk_size Number of oldest messages summarized per rolling cycle. 10
summarizer.session_timeout_minutes Idle time in minutes before a session is archived and a fresh one starts. 0 = disabled. 15
summarizer.min_starter_gap_hours Minimum hours since last activity before a conversation starter is generated. 2
Safety
Key Description Default
safety.ollama_model Model used for safety classification and crisis detection. mistral-nemo
safety.ollama_url Ollama URL for safety analysis. http://127.0.0.1:11434
safety.crisis_resources Newline-separated helpline information injected during crisis events. (see config.yml)
HTTP Server
Key Description Default
http.port Port for the HTTP server. 8080
http.read_timeout Read timeout in seconds. 120
http.write_timeout Write timeout in seconds. 120
http.sse_flush_tokens Tokens to buffer before flushing SSE. 1 = flush every token. Higher values reduce syscalls. 1
http.trusted_proxies List of IP/CIDR ranges allowed to set X-Forwarded-For. Empty = never trust. []
http.cors_allowed_origins Allowed CORS origins. ["*"]
http.rate_limit.enabled Enable rate limiting. true
http.rate_limit.requests_per_second Rate limit for API requests. 2.0
http.rate_limit.burst Maximum burst size for rate limiting. 5
http.rate_limit.cleanup_interval_s Stale rate limit entry cleanup interval in seconds. 300
http.timing.enabled Enable variable response delays simulating human typing (companion mode). true
http.timing.min_delay_ms Minimum delay before streaming starts in milliseconds. 500
http.timing.max_delay_ms Maximum delay before streaming starts in milliseconds. 3000
AI - Main LLM
Key Description Default
ai.ollama_url Ollama URL for the main LLM. Override: GORA_OLLAMA_URL. http://127.0.0.1:11434
ai.ollama_model Main model for generating final answers. gpt-oss:20b
ai.temperature Creativity of the main model. 0.0 = deterministic, 2.0 = max. 0.0
ai.ollama_draft_url Ollama URL for the draft model. http://127.0.0.1:11434
ai.ollama_draft_model Fast model for drafts, starters, and lightweight tasks. Empty = disabled. mistral-nemo
ai.draft_temperature Temperature for the draft model. 0.0
ai.draft_timeout Timeout for draft generation in seconds. 10
ai.num_ctx Context window size for the main model in tokens. 16384
ai.draft_num_ctx Context window size for the draft model in tokens. 4096
ai.ollama_vision_model Multimodal model for image descriptions (e.g., LLaVA). Empty = disabled. (empty)
ai.ollama_vision_url Ollama URL for the vision model. http://127.0.0.1:11434
ai.vision_timeout Timeout for vision model calls in seconds. 60
AI - Persona
Key Description Default
ai.name The name of your AI assistant. GoRa
ai.gender Gender identity (used in some persona prompts). diverse
ai.persona.style Instruction for the AI's communication style. (see config.yml)
ai.persona.tone Instruction for the AI's tone. (see config.yml)
ai.persona.background Background context the AI is given about itself. Supports {{ .name }} template. (see config.yml)
ai.persona.type_of_relationship Relationship type: friendship, romantic, mentor, or empty. friendship
ai.persona.key_traits Comma-separated personality traits. loyal,sassy,extroverted,proactive
AI - Prompts
Key Description Default
ai.prompts.system_path Path to the main system prompt template. ./prompts/system.txt
ai.prompts.extraction_path Path to the knowledge extraction prompt (user messages). ./prompts/internal/extract.txt
ai.prompts.extraction_ai_path Path to the knowledge extraction prompt (AI responses). ./prompts/internal/extract_ai.txt
ai.prompts.rolling_update_path Path to the rolling summary update prompt. ./prompts/internal/rolling_update.txt
ai.prompts.safety_path Path to the safety classification prompt. ./prompts/internal/safety.txt
ai.prompts.summary_safety_path Path to the summary safety analysis prompt. ./prompts/internal/summary_safety.txt
ai.prompts.emotional_residue_path Path to the emotional residue extraction prompt. ./prompts/internal/emotional_residue.txt
ai.prompts.crisis_template_path Path to the crisis response template. ./prompts/internal/crisis.txt
ai.prompts.starter_path Path to the conversation starter prompt (companion mode). ./prompts/internal/starter.txt
ai.prompts.onboarding_paths List of prompt paths for each onboarding phase. Count must match onboarding_sessions. [onboarding_1.txt, onboarding_2.txt, onboarding_3.txt]
ai.prompts.onboarding_starter_path Path to the onboarding-specific conversation starter prompt. ./prompts/internal/onboarding_starter.txt
ai.prompts.outreach_path Path to the LLM-generated outreach message prompt. ./prompts/internal/outreach.txt
AI - Season Context
Key Description Default
ai.season_context Map of date-range keys to seasonal descriptions injected into the system prompt. Keys: jan_early, jan_late, feb, mar, apr, may, jun, jul, aug, sep, oct, nov, dec_advent, dec_holiday. (see config.yml for defaults - Central European climate)
AI - Graph
Key Description Default
ai.graph.allowed_relationships Whitelist of relationship types the extraction model is allowed to create. (see config.yml for full list)
Features
Key Description Default
features.active_learning Enables companion mode: per-user isolation, knowledge extraction, rolling memory. false
features.onboarding_enabled Enables guided onboarding for new users. Requires active_learning: true. true
features.onboarding_sessions Number of onboarding sessions before switching to the normal system prompt (1-10). 3
features.consent_before_storing Enables per-category opt-in/opt-out before storing personal facts. false
features.life_simulation Enables GoRa's life simulation layer - mood, activity, thoughts at session start. false
features.life_simulation_probability Probability (0.0-1.0) of generating a life state per session. 0.6
features.self_disclosure Enables reciprocal self-disclosure - GoRa occasionally shares about itself. false
features.self_disclosure_max_per_session Max self-disclosures per session. 1
features.self_disclosure_balance_threshold User:AI disclosure ratio above which GoRa is nudged to share. 1.5
features.spontaneous_outreach Enables non-event-based proactive messages after user inactivity. false
features.spontaneous_min_gap_hours Minimum hours since last interaction before spontaneous outreach. 6
features.backup.enabled Enables automatic per-user data backups. false
features.backup.interval_hours Time between backup runs in hours. 24
features.backup.dir Directory for backup files. ./backups/users
features.backup.max_per_user Max backup files per user. Oldest are pruned. 5
features.escalation_enabled Enables emotional escalation in outreach messages (casual → self-doubt → missing). false
features.escalation_min_gap_hours Minimum hours between escalation level increases. 6

Asymmetric Model Strategy

GoRa allows you to use different models for different tasks to optimize performance:

  • Final Generation: Use a large model (e.g., gpt-oss:20b) for high-quality reasoning.
  • Knowledge Extraction / Synthetic Questions / Summarization: Use medium/fast models (e.g., mistral-nemo, qwen2.5:14b) for speed.
  • Embeddings: Specialized models like mxbai-embed-large for state-of-the-art retrieval.

Parallel Ingestion

The synthetic_question_worker_count setting controls how many document chunks are processed simultaneously.

Tip: If you have a high-end GPU with plenty of VRAM (48GB+), increase this value and set OLLAMA_NUM_PARALLEL in your environment to match for true hardware parallelism.

Index Management

By setting append_model_name_to_index: true, GoRa automatically separates your data when you switch embedding models. Since embeddings from different models are not compatible, this prevents polluting your search results.

Prompt Customization

GoRa ships with multiple prompt templates in the /prompts directory:

File Purpose
system.txt Default technical documentation assistant. Strict, factual, no hallucinations.
friend.txt Personal companion with time-awareness, memory usage, and persona adaptation.
onboarding_1.txt Onboarding phase 1: First meeting - introduction, no prior context.
onboarding_2.txt Onboarding phase 2: Reconnecting - getting to know interests, building familiarity.
onboarding_3.txt Onboarding phase 3: Deepening the connection - relationship becomes comfortable.
internal/extract.txt Knowledge graph extraction from user messages (entities & relationships).
internal/extract_ai.txt Knowledge graph extraction from AI responses.
internal/extract_system.txt System extraction prompt.
internal/rolling_update.txt Rolling summary updates during long conversations.
internal/safety.txt Safety classification prompt (sentiment, crisis level).
internal/summary_safety.txt Safety analysis on rolling summaries.
internal/emotional_residue.txt Emotional state extraction at session end.
internal/crisis.txt Crisis response template with helpline resources.
internal/starter.txt Conversation starter generation for returning users (companion mode).
internal/onboarding_starter.txt Phase-aware conversation starter for onboarding sessions.
internal/outreach.txt LLM-generated outreach message template (escalation levels).

Switch the active prompt by changing ai.prompts.system_path in your config.

Makefile Commands

Infrastructure
Command Description
make up Start Docker stack (Redis & Neo4j).
make down Stop Docker stack.
make restart Restart Docker stack (down + up).
make status Show Docker container status.
Application
Command Description
make build Compile all binaries into /bin.
make build-release-doku Build a .deb package for server deployment (VERSION=x.y.z DEB_ARCH=amd64|arm64).
make run Start GoRa interactive CLI.
make http Start GoRa HTTP server.
make import Chunk documents and populate Redis + Neo4j.
make create-user Create a new user for companion mode (active_learning).
make deps Update all Go dependencies and tidy go.mod.
make clean Remove compiled binaries.
Testing
Command Description
make test Run unit + integration tests.
make test-unit Run unit tests only.
make test-integration Run integration tests (requires Redis, Neo4j, and Ollama).
make test-integration-llm Run integration tests including actual LLM calls (5 min timeout).
make test-coverage Run tests with coverage and open HTML report.
Database Management
Command Description
make wipe Wipe both Redis and Neo4j.
make wipe-redis Wipe Redis only.
make wipe-graph Wipe Neo4j only.
Ollama
Command Description
make ollama Pull all required models and list the locally available ones.
make ollama-pull Download models from the Ollama library.
make ollama-list List all locally available Ollama models.
Backup & Restore
Command Description
make backup Backup both databases.
make backup-redis Backup Redis only (snapshot to /backups).
make backup-graph Backup Neo4j only (archive to /backups).
make restore Restore both databases from latest backups.
make restore-redis Restore Redis from latest backup.
make restore-graph Restore Neo4j from latest backup.

API Endpoints

Core (always available)
Method Endpoint Description
POST /api/chat Send a message, receive full response as JSON.
POST /api/chat/stream Send a message, receive response as SSE stream.
GET /api/history?session_id=... Retrieve recent chat history for a session.
GET /health Health check - returns Redis, Neo4j, and Ollama connectivity status.
Companion Mode (active_learning: true)

These endpoints are only available when companion mode is enabled.

Conversation
Method Endpoint Description
GET /api/starter?session_id=... Generate a conversation starter based on past context.
GET /api/notifications Get pending proactive notifications and follow-ups.
Memory Management
Method Endpoint Description
GET /api/memory Retrieve all stored entities and relationships for the authenticated user.
DELETE /api/memory/entity?name=... Delete an entity and all its relationships.
DELETE /api/memory/relationship?source=...&target=...&type=... Delete a specific relationship between two entities.
DELETE /api/memory/forget?keyword=... Bulk delete entities and relationships matching a keyword.
PUT /api/memory/sensitivity?source=...&target=...&type=...&level=... Set sensitivity level (low, medium, high) on a relationship.
PUT /api/memory/confirm?source=...&target=...&type=... Mark a fact as confirmed (prevents overwriting and decay).
PUT /api/memory/approve?source=...&target=...&type=... Approve a pending consent fact.
DELETE /api/memory/reject?source=...&target=...&type=... Reject a pending consent fact.
Preferences
Method Endpoint Description
GET /api/consent Get user consent preferences (per-category opt-in/opt-out).
PUT /api/consent Update user consent preferences.
GET /api/outreach/preference Get spontaneous outreach preference.
PUT /api/outreach/preference Set spontaneous outreach preference.
Data Management
Method Endpoint Description
GET /api/export Export all user data as JSON (graph, history, episodic memories, profile).
POST /api/import Import user data from JSON backup.
DELETE /api/delete-account Delete all user data permanently (right to be forgotten).
Administration
Method Endpoint Description
POST /api/admin/users Create a new user.
GET /api/admin/metrics Get aggregated conversation quality metrics.
Request Body (/api/chat and /api/chat/stream)
{
  "message": "How does GoRa handle embeddings?",
  "session_id": "my-session-id",
  "image": "<base64-encoded image data (optional)>",
  "mime_type": "image/png (required when image is set)"
}
Request Body (/api/admin/users)
{
  "name": "Alice",
  "email": "alice@example.com"
}
Health Check Response (/health)
{
  "status": "ok",
  "redis": "ok",
  "neo4j": "ok",
  "ollama": "ok"
}

Returns 200 OK when all services are healthy, 503 Service Unavailable when degraded.

Security

API Authentication

GoRa supports two authentication modes depending on the operating mode:

Documentation mode (active_learning: false): Optional shared API key via the GORA_API_KEY environment variable. If no key is configured, the API is accessible without authentication - this is acceptable for local development but should never be used in a publicly reachable environment.

Companion mode (active_learning: true): Per-user API key authentication. Each user receives a unique API key when created via make create-user or the /api/admin/users endpoint. The first user must be created via CLI.

Setting up shared API key (documentation mode)

Generate a cryptographically secure key and export it:

export GORA_API_KEY="$(openssl rand -hex 32)"

Then start the server as usual:

make http

Alternatively, create a .env file containing GORA_API_KEY=1234567890abcdef and run:

source .env && make http

All API requests must include the key as a Bearer token in the Authorization header:

curl -X POST http://localhost:8080/api/chat \
 -H "Authorization: Bearer <your-key>" \
 -H "Content-Type: application/json" \
 -d '{"message": "What is GoRa?", "session_id": "my-session"}'

Requests without a valid key will receive a 401 Unauthorized response. If you are running GoRa behind a reverse proxy (e.g. nginx or Traefik), configure http.trusted_proxies with your proxy IPs so that rate limiting operates on the real client IP rather than the proxy address.

Security Headers

GoRa sets the following security headers on all responses:

  • Content-Security-Policy
  • X-Content-Type-Options: nosniff
  • X-Frame-Options: DENY
  • Referrer-Policy
CORS

CORS is configured globally via http.cors_allowed_origins. Defaults to ["*"] - restrict this in production.

Environment Variables

Variable Description
GORA_API_KEY Shared API key (documentation mode) or admin key (companion mode).
GORA_REDIS_URI Override Redis connection URI.
GORA_NEO4J_URI Override Neo4j connection URI.
GORA_NEO4J_USERNAME Override Neo4j username.
GORA_NEO4J_PASSWORD Override Neo4j password.
GORA_OLLAMA_URL Override main Ollama URL.
GORA_OLLAMA_SUMMARIZER_URL Override Ollama URL for summarization.
GORA_OLLAMA_EMBED_URL Override Ollama URL for embeddings.
GORA_OLLAMA_GRAPH_URL Override Ollama URL for graph extraction.
GORA_OLLAMA_SYNTHETIC_QUESTIONS_URL Override Ollama URL for synthetic question generation.
GORA_LOG_FORMAT Override log format (json or text).
GORA_LOG_LEVEL Override log level (debug, info, warn, error).
GORA_LOG_PATH Override log file path.

License

This project is licensed under the MIT License.

Directories

Path Synopsis
cmd
cli command
database command
http command
user command
pkg
backup
Package backup provides automatic per-user data backups (7.3).
config
Package config handles loading and validation of GoRa's YAML configuration.
engine
Package engine orchestrates the RAG conversation loop.
extractor
Package extractor uses an LLM to extract structured knowledge graph data (entities and typed relationships) from natural language text.
graph
Package graph implements all Neo4j operations for the knowledge graph.
memory
Package memory provides rolling session summarization and long-term episodic archival.
metrics
Package metrics tracks conversation quality and usage statistics in Redis.
middleware
Package middleware provides HTTP middleware for authentication, rate limiting, security headers, and request tracing.
resilience
Package resilience provides error recovery primitives: retry with exponential backoff and circuit breaker pattern for external service calls (R.2).
safety
Package safety provides asynchronous content classification for user messages in companion mode.
slogger
Package slogger provides structured logging utilities built on log/slog (R.1).
store
Package store provides the unified data access layer for GoRa.
user
Package user manages user accounts for companion mode.
util
Package util provides shared helper functions for sanitizing LLM output and protecting template rendering.
validate
Package validate provides input validation at system boundaries.
