# mnemonic
Attention-based MCP memory controller for LLM coding agents.
mnemonic turns project guidance, lessons learned, and architectural decisions into a shared MCP memory layer.
Instead of stuffing a growing pile of AGENTS.md, CLAUDE.md, .cursorrules, and editor-specific instructions into
every prompt, agents can retrieve only the memories that matter for the task at hand.
It's built for people who use multiple agents, multiple IDEs, or multiple repositories and want one memory system that stays searchable, versionable, and reusable.
The design mirrors the transformer attention mechanism:
| Transformer | mnemonic |
|---|---|
| Query (Q) | The agent's current task |
| Key (K) | Entry tags and embeddings |
| Value (V) | Memory content injected into context |
| Attention heads | Memory categories |
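As a toy illustration of the analogy (not mnemonic's actual implementation), retrieval can be read as one attention step: score each entry's key against the query, then return the values of the strongest matches. The vectors and entries below are invented:

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// entry pairs a key vector (tags/embedding) with a value (memory content).
type entry struct {
	key   []float64
	value string
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// attend scores every entry's key against the query and returns the
// top-k values, i.e. the memory content worth injecting into context.
// Note: sorting reorders the caller's slice; fine for a sketch.
func attend(query []float64, entries []entry, k int) []string {
	sort.SliceStable(entries, func(i, j int) bool {
		return cosine(query, entries[i].key) > cosine(query, entries[j].key)
	})
	if k > len(entries) {
		k = len(entries)
	}
	out := make([]string, k)
	for i := 0; i < k; i++ {
		out[i] = entries[i].value
	}
	return out
}

func main() {
	entries := []entry{
		{key: []float64{1, 0}, value: "avoid pwn-requests in workflows"},
		{key: []float64{0, 1}, value: "wrap errors with %w"},
	}
	fmt.Println(attend([]float64{0.9, 0.1}, entries, 1))
}
```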
## Motivation
mnemonic was inspired by a mix of:
### The problem
Agent memory is fragmented and expensive to maintain. Different tools expect different instruction files, so shared guidance gets copy-pasted across repos and editors. Static files waste context window space regardless of relevance, and when the context window fills up something gets dropped — often the detail that mattered. High-value lessons learned during a session disappear into chat history instead of becoming reusable knowledge.
### The solution
mnemonic exposes memory as MCP tools backed by local YAML files and optional semantic search.
Memory is organized into categories and scopes — global, project, or team — so agents query only what's relevant to the task at hand rather than ingesting a monolithic instruction file every session. Entries are version-controlled plain YAML, scored by hit count and recency, and decay naturally over time so high-signal memories stay visible without manual curation.
Unlike Karpathy's wiki (which is agent-controlled), mnemonic is built for collaboration between
agents and humans. Agents query for context and store lessons learned according to your system prompt,
but humans can also add entries, reinforce confirmed patterns, or demote approaches that didn't work.
Optional semantic retrieval via embeddings and a local HNSW index upgrades query quality significantly
once your memory store grows beyond a few dozen entries.
## Quick start

### Install
```sh
go install github.com/jimschubert/mnemonic/cmd/mnemonic@latest
```
Or download a binary from the releases page.
### Configure your client
For clients that support stdio transports, use `mnemonic stdio`:

```json
{
  "mcpServers": {
    "mnemonic": {
      "command": "mnemonic",
      "args": ["stdio"]
    }
  }
}
```
If your client only supports HTTP, run `mnemonic server` and connect to
`http://localhost:20001/mcp`.
### Configure your agent
This is a good starting instruction block for most coding agents:
```markdown
## Memory

Before starting any task, call `mnemonic_query` with a description of the work.
Always query the `avoidance` and `security` categories first.

Available categories:

- avoidance — mistakes, failed approaches, things that do not work
- security — security constraints and risks
- architecture — design decisions and rationale
- syntax — patterns and code conventions that worked well
- domain — project-specific knowledge

Do not create new categories unless a human explicitly asks for one.
Default scope should be `project`.
Default source should be `agent:YYYY-MM-DD`.
If the user says "remember this" or "add this to memory", call `mnemonic_add`.
Use `mnemonic_reinforce` with `+0.1` for confirmed patterns and `-0.2` for rejected ones.
```
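For the reinforcement step, a plausible `mnemonic_reinforce` input looks like this. The field names are an assumption inferred from the CLI's `mnemonic store reinforce --id ... --delta ...` flags; check the tool's schema via your MCP client for the authoritative shape:

```json
{
  "id": "go-error-wrapping",
  "delta": 0.1
}
```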
### Enable semantic search (optional)

mnemonic works without embeddings by using keyword and category search, but semantic search returns noticeably more relevant results. Configure an embedding endpoint and build the local HNSW index to enable semantic querying and deduplication.
```yaml
# ~/.mnemonic/config.yaml
embeddings:
  endpoint: http://127.0.0.1:1234/v1/embeddings
  model: nomic-ai/nomic-embed-text-v1.5
```

Then build the index:

```sh
mnemonic embed
```
If embeddings are unavailable, mnemonic falls back to category and keyword search.
### Keeping the store tidy

Use `mnemonic lint` to scan your memory store for near-duplicate entries that should be merged or deleted (requires embeddings):
```sh
# Analyze with default 90% similarity threshold
mnemonic lint

# Use a lower threshold to catch more potential duplicates
mnemonic lint --threshold 0.85
```
The command is interactive, letting you preview entries and then merge or delete them.
> [!NOTE]
> The index uses approximate nearest neighbor (ANN) search, so it may not return every similar entry every time. That is, you might run `mnemonic lint` ten times and see fewer results on 2-3 of those runs.
## MCP Tools
mnemonic exposes four MCP tools:
| Tool | Purpose |
|---|---|
| `mnemonic_query` | Retrieve relevant memories for a task, optionally filtered by category and scope |
| `mnemonic_add` | Store a new memory entry |
| `mnemonic_reinforce` | Increase or decrease a memory's score |
| `mnemonic_list_heads` | List available categories and entry counts |
Typical flow:

1. Query `avoidance` and `security` first, either separately or together.
2. Query another category or a broader task description.
3. Use the returned memories while doing the work.
4. Store or reinforce anything worth keeping.
Example `mnemonic_query` input:
```json
{
  "query": "update GitHub workflows for Go 1.26 and verify pwn-request safety",
  "categories": ["avoidance", "security"],
  "top_k": 5,
  "scopes": ["project", "global"]
}
```
`category` is accepted for a single category, but `categories` is the preferred field for multi-category queries.
`top_k` is an overall limit across the returned result set, so `top_k: 5` may return 3 `avoidance` and 2 `security` entries, for example.
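A minimal sketch of how an overall `top_k` cutoff can behave across categories (illustrative only, not mnemonic's implementation):

```go
package main

import (
	"fmt"
	"sort"
)

// hit is a scored result from one category.
type hit struct {
	category string
	id       string
	score    float64
}

// mergeTopK flattens per-category results and keeps the top-k overall,
// so one strong category may fill more slots than another.
func mergeTopK(byCategory map[string][]hit, k int) []hit {
	var all []hit
	for _, hits := range byCategory {
		all = append(all, hits...)
	}
	sort.SliceStable(all, func(i, j int) bool { return all[i].score > all[j].score })
	if k > len(all) {
		k = len(all)
	}
	return all[:k]
}

func main() {
	// With top_k=5, three avoidance and two security entries survive the cut.
	results := map[string][]hit{
		"avoidance": {{"avoidance", "a1", 0.92}, {"avoidance", "a2", 0.81}, {"avoidance", "a3", 0.77}},
		"security":  {{"security", "s1", 0.88}, {"security", "s2", 0.60}},
	}
	for _, h := range mergeTopK(results, 5) {
		fmt.Println(h.category, h.id)
	}
}
```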
Example `mnemonic_add` input:
```json
{
  "content": "When an MCP stdio client receives `session not found`, invalidate the session and reconnect.",
  "category": "architecture",
  "tags": ["mcp", "stdio", "session-management"],
  "scope": "project",
  "source": "agent:2026-04-19"
}
```
## How it works

### Runtime model
- `mnemonic stdio` is the default path for editor integrations.
  - It auto-starts the daemon if needed.
  - It proxies MCP calls over stdio to the daemon, which handles storage and embedding.
- `mnemonic server` starts the HTTP MCP server and daemon-backed storage directly.
  - You can start `mnemonic stdio` separately; it knows what to do.
- `mnemonic stop` asks the running daemon to shut down cleanly.
  - To avoid stale sessions and errors, any open `stdio` processes will detect the shutdown and exit.
The MCP server is hosted over a Unix socket by default, with an optional HTTP server. The Unix socket speaks streaming JSON-RPC
and is easy to reach with the MCP SDK. See `socketSend` in `store.go` for an
example of how to interact with the server programmatically.
### Storage model
Memory is stored as YAML on disk, grouped by scope and category.
| Scope | Description |
|---|---|
| `global` | User-wide memory shared across repositories |
| `project` | Repository-local memory |
| `team:<name>` | 0..N shared team directories you opt into |
Example of the default directory layout:

```text
~/.mnemonic/
├── config.yaml
├── global/
│   ├── avoidance.yaml
│   ├── security.yaml
│   └── syntax.yaml
└── index.hnsw

.mnemonic/
└── project/
    ├── architecture.yaml
    └── domain.yaml
```
Each category file contains versioned entries such as:
```yaml
version: 1
entries:
  - id: go-error-wrapping
    content: >-
      Wrap errors with context using fmt.Errorf("doing X: %w", err).
    tags: [go, errors, style, fmt]
    category: syntax
    scope: global
    score: 0.9
    hit_count: 12
    last_hit: 2026-04-08T00:00:00Z
    created: 2026-03-20T00:00:00Z
    source: manual
```
### Retrieval model
When embeddings are configured and indexed, mnemonic_query attempts semantic search first. If
embeddings are not configured or semantic lookup fails, it falls back to keyword and category-based
search.
Ranking is influenced by score, hit count, and recency of use.
That means important memories stay visible, but stale memories naturally decay over time.
## Commands

```sh
mnemonic --help
```
| Command | Description |
|---|---|
| `mnemonic stdio` | Serve MCP over stdio and auto-start the daemon if needed |
| `mnemonic server` | Start the HTTP MCP server and backing daemon |
| `mnemonic embed` | Fetch embeddings and build or refresh the HNSW index |
| `mnemonic lint` | Analyze the memory store for redundancy and resolve issues interactively (requires embeddings) |
| `mnemonic store` | Interact with the memory store directly (daemon must be running) |
| `mnemonic stop` | Request shutdown of the running daemon |
| `mnemonic compact` | Compact the text of all memories in the store to reduce token usage |
Run `mnemonic <command> --help` for options or subcommands.
> [!TIP]
> `mnemonic compact` requires an OpenAI-compatible `/chat/completions` endpoint. If you are on an Apple Silicon Mac, you can expose the on-device Apple 3B-parameter LLM (macOS Tahoe and later). It's local, won't use your token quota, and may be faster than other locally hosted models for this task. See https://apfel.franzai.com/ for details and installation instructions.
>
> Once installed, run `apfel --serve`, then run `mnemonic compact` with these options:
>
> ```sh
> mnemonic compact --base-url http://127.0.0.1:11434/v1 \
>   --api-key abcd123 \
>   --model apple-foundationmodel
> ```
## Useful examples

```sh
# Start the HTTP MCP server, or serve MCP over stdio
mnemonic server --server-addr localhost:9999
mnemonic stdio

# Start the MCP server with additional team scopes
mnemonic server --team /shared/acme --team /shared/platform

# Manage embeddings and index
mnemonic embed

# Or, if your index gets corrupted (e.g. changing embedding model and/or dimensions)
mnemonic embed --force

# Clean up your memory store (merge/delete)
mnemonic lint --threshold 0.85

# Interact with the store outside of an agent (daemon must be running)
mnemonic store query --query "Go error handling" --category syntax
mnemonic store query --query "workflow safety" --category avoidance,security
mnemonic store add --content "Example pattern" --category syntax --tags go,error
mnemonic store list-heads
mnemonic store reinforce --id go-error-wrapping --delta 0.1

# Stop the server and daemon
mnemonic stop
```
## Configuration

Configuration is resolved in this order, highest precedence first:

1. CLI flags
2. Environment variables
3. `.mnemonic/config.yaml`
4. `~/.mnemonic/config.yaml`
5. Built-in defaults
Example global config:
```yaml
log_level: info
server_addr: localhost:20001
socket_path: ~/.mnemonic/mnemonic.sock
client_timeout_sec: 5
logging:
  store: debug
  server: warn
embeddings:
  endpoint: http://127.0.0.1:1234/v1/embeddings
  model: nomic-ai/nomic-embed-text-v1.5
index:
  # NOTE: This must match the length of the vectors returned by the embedding endpoint
  # and is validated during `mnemonic embed` preflight.
  # For OpenAI's text-embedding-3-small, use 1536. For LM Studio's nomic-embed-text-v1.5, use 768.
  # A mismatch with an existing index requires a force rebuild with `mnemonic embed --force`.
  dimensions: 768
  # The number of bi-directional links created for each new entry.
  # A good default for OpenAI embeddings is 16.
  connections: 16
  # The level generation factor.
  # For 0.25, each layer is 1/4 the size of the previous layer.
  level_factor: 0.25
  # The size of the candidate list examined during search.
  # Higher values improve search accuracy at the expense of speed and memory.
  # 20-50 is a reasonable default.
  ef_search: 50
```
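The `dimensions` comments boil down to one invariant that preflight can enforce: the configured value must equal the length of the vectors the endpoint actually returns. A sketch of that check (illustrative, not mnemonic's code):

```go
package main

import "fmt"

// checkDimensions mirrors the preflight idea described in the config comments:
// the index.dimensions setting must equal the length of vectors the embedding
// endpoint returns, otherwise the HNSW index cannot be used as-is.
func checkDimensions(configured int, vector []float32) error {
	if len(vector) != configured {
		return fmt.Errorf("index.dimensions is %d but endpoint returned %d-dimensional vectors; fix the config and rebuild with `mnemonic embed --force`", configured, len(vector))
	}
	return nil
}

func main() {
	// e.g. a text-embedding-3-small vector against a 768-dimension index
	vec := make([]float32, 1536)
	if err := checkDimensions(768, vec); err != nil {
		fmt.Println(err)
	}
}
```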
Example project config:
```yaml
log_level: debug
server_addr: localhost:9999
```
Key options:
| Option | Purpose |
|---|---|
| `log_level` | Default log level |
| `logging` | Per-scope log levels, such as `store` or `server` |
| `server_addr` | HTTP MCP listen address |
| `socket_path` | Unix socket path used by the daemon |
| `client_timeout_sec` | Timeout for embedding and daemon HTTP clients |
| `embeddings.*` | Embedding endpoint, model, auth token, and preflight behavior |
| `index.*` | HNSW index parameters |
For the full configuration surface, see `internal/config/config.go`.
## Team scopes

Pass one or more `--team` directories to load additional shared scopes. Each team directory becomes
`team:<basename>`, so `/shared/acme` becomes `team:acme`.
```sh
mnemonic server --team /shared/acme --team /shared/platform --server-addr localhost:9999
```
This makes it easy to layer memory:

- `global`: your personal reusable patterns
- `team:acme`: shared team conventions
- `project`: repo-specific context
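The `team:<basename>` rule is simple enough to pin down in a few lines (a sketch of the documented mapping, not mnemonic's code):

```go
package main

import (
	"fmt"
	"path/filepath"
)

// teamScope derives the scope name from a --team directory, mirroring the
// documented rule that /shared/acme becomes team:acme.
func teamScope(dir string) string {
	return "team:" + filepath.Base(dir)
}

func main() {
	fmt.Println(teamScope("/shared/acme"))     // team:acme
	fmt.Println(teamScope("/shared/platform")) // team:platform
}
```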
## Embeddings and semantic search
Semantic search is optional, but it is one of the biggest quality-of-life upgrades once you have more than a few dozen memories.
`mnemonic embed`:

- validates the embedding endpoint unless you disable preflight
- embeds stored entries
- builds or refreshes the HNSW index
- enables semantic retrieval for `mnemonic_query`
The default embedding settings are aimed at a local LM Studio-compatible endpoint, but any compatible embeddings API should work if it returns vectors with the configured dimensions.
## License
Apache 2.0, see LICENSE