texvec

Version: v1.0.0 · Published: Mar 22, 2026 · License: MIT

Repository

github.com/arcnem-ai/texvec

Local-first text similarity search.

Japanese · Install · Quick Start · Models · Development

texvec is an open-source CLI for text similarity search. It summarizes documents locally with ONNX models, embeds both summaries and overlapping document chunks, stores the vectors in a local libsql database, and ranks matches with cosine distance.

Built by Arcnem AI, texvec reflects how we like to ship applied AI tools: local-first, inspectable, and useful without a cloud control plane.

Why texvec

- Local summaries, local embeddings, local storage. Your documents stay on your machine.
- Simple CLI workflow. init, summarize, embed, search, and list are enough to get useful results quickly.
- No external vector database. Similarity search runs from a local libsql database.
- Practical text indexing. texvec stores both a document summary embedding and chunk embeddings so searches can match the gist or a specific section.

Install

Download a release asset from GitHub Releases, or install with Go:

go install github.com/arcnem-ai/texvec@latest

To build from source:

git clone https://github.com/arcnem-ai/texvec.git
cd texvec
go build -o texvec

Release archives are expected to cover the same primary targets as picvec: macOS (arm64) and Linux (amd64).

Quick Start

texvec init
texvec summarize test_texts/galaxies.md
texvec embed test_texts/galaxies.md
texvec search --text "dark matter in spiral galaxies"
texvec list

texvec init downloads ONNX Runtime, creates ~/.texvec/, initializes the local database, and fetches the default summary and embedding models.

Commands

| Command | What it does |
| --- | --- |
| `init` | Download ONNX Runtime and the default models |
| `summarize [document]` | Generate and print a summary without writing to the database |
| `embed [document]` | Summarize, chunk, embed, and store a document |
| `search [document]` | Find similar indexed documents |
| `search --text "..."` | Search from raw text |
| `list` | List indexed documents |
| `set-embedding-model [name]` | Set the default embedding model |
| `set-summary-model [name]` | Set the default summary model |
| `config` | Show current configuration |
| `clean` | Remove all `texvec` data |

Global flag:

-v, --verbose enables extra runtime output

Common Examples

Preview a summary:

texvec summarize notes.md
texvec summarize notes.md --summary-model flan-t5-small

texvec summarize is preview-only. It does not write to the database.

Index a document:

texvec embed notes.md
texvec embed notes.md -m bge-small-en-v1.5
texvec embed notes.md --summary-model flan-t5-small

If the document content hash is unchanged, texvec reuses existing summary and chunk data where possible.
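The reuse check described above can be sketched as follows. This is a minimal illustration, not texvec's actual code: SHA-256 is an assumption (the README does not name the hash function), and `contentHash` and `needsReindex` are hypothetical helper names.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// contentHash returns a hex-encoded SHA-256 digest of the document bytes.
// Assumption: the README does not say which hash function texvec uses.
func contentHash(content []byte) string {
	sum := sha256.Sum256(content)
	return hex.EncodeToString(sum[:])
}

// needsReindex reports whether the current content differs from the content
// that produced storedHash. Hypothetical helper for illustration only.
func needsReindex(storedHash string, content []byte) bool {
	return storedHash != contentHash(content)
}

func main() {
	doc := []byte("Spiral galaxies rotate faster than visible matter predicts.")
	stored := contentHash(doc)

	// Same bytes: the stored summary and chunks could be reused.
	fmt.Println("unchanged needs reindex:", needsReindex(stored, doc))

	// Different bytes: the document would be summarized and embedded again.
	edited := append([]byte{}, doc...)
	edited = append(edited, " Edited."...)
	fmt.Println("edited needs reindex:", needsReindex(stored, edited))
}
```

Hashing the bytes rather than comparing modification times means a file that is touched or copied without changes does not trigger new model runs.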

Search for similar documents:

texvec search notes.md
texvec search --text "barred spiral galaxy dark matter"
texvec search --text "barred spiral galaxy dark matter" -k 10
texvec search notes.md -m bge-small-en-v1.5

Results are sorted by cosine distance. When searching with an already indexed document path, texvec excludes that same path from the results.
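As a rough sketch of that ranking step, the code below computes cosine distance in memory, skips the query's own path, and sorts the rest. It makes several assumptions: texvec runs its vector search inside libsql rather than in Go slices, smallest distance first is assumed as the sort order, and the `result` type and `rank` function are hypothetical names.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// result is a hypothetical document-level match.
type result struct {
	Path     string
	Distance float64
}

// cosineDistance returns 1 minus the cosine similarity of a and b.
// It returns 1 when the vectors differ in length, are empty, or have zero norm.
func cosineDistance(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 1
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 1
	}
	return 1 - dot/(math.Sqrt(normA)*math.Sqrt(normB))
}

// rank scores every indexed vector against the query, skips excludePath,
// sorts by ascending distance (assumed order), and keeps at most k results.
func rank(query []float64, index map[string][]float64, excludePath string, k int) []result {
	results := make([]result, 0, len(index))
	for path, vec := range index {
		if path == excludePath {
			continue
		}
		results = append(results, result{Path: path, Distance: cosineDistance(query, vec)})
	}
	sort.Slice(results, func(i, j int) bool {
		return results[i].Distance < results[j].Distance
	})
	if k > 0 && k < len(results) {
		results = results[:k]
	}
	return results
}

func main() {
	// Toy 3-dimensional vectors; the real models produce 384 dimensions.
	index := map[string][]float64{
		"notes.md":    {0.9, 0.1, 0.0},
		"galaxies.md": {0.8, 0.2, 0.1},
		"recipes.md":  {0.0, 0.1, 0.9},
	}
	for _, r := range rank(index["notes.md"], index, "notes.md", 5) {
		fmt.Printf("%-12s %.4f\n", r.Path, r.Distance)
	}
}
```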

| Flag | Description | Default |
| --- | --- | --- |
| `-k, --limit` | Number of results | 5 |
| `-m, --model` | Embedding model to use | Config default |
| `--summary-model` | Summary model to use for long-query reduction | Config default |

List indexed documents:

texvec list
texvec list -m all-minilm-l6-v2
texvec list -k 20

| Flag | Description | Default |
| --- | --- | --- |
| `-k, --limit` | Max documents to show | All |
| `-m, --model` | Filter by embedding model | All |

Change defaults:

texvec set-embedding-model bge-small-en-v1.5
texvec set-summary-model flan-t5-small

Models

Embedding Models

| Name | Embedding Dim | Notes |
| --- | --- | --- |
| `all-minilm-l6-v2` | 384 | Default. Fast and good for general-purpose retrieval. |
| `bge-small-en-v1.5` | 384 | Retrieval-focused model with a query prefix for search. |
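The "query prefix" note for `bge-small-en-v1.5` can be illustrated with a small sketch. The prefix string is the query instruction published on the BGE v1.5 English model card; whether texvec uses this exact string is an assumption, and `queryPrefixes` and `prepareQuery` are hypothetical names.

```go
package main

import "fmt"

// queryPrefixes maps a model name to the text prepended to search queries.
// Hypothetical table: the BGE string comes from the upstream model card,
// and texvec's own registry may differ.
var queryPrefixes = map[string]string{
	"bge-small-en-v1.5": "Represent this sentence for searching relevant passages: ",
}

// prepareQuery prepends the model's query prefix when one is registered.
// Models without an entry get the text back unchanged.
func prepareQuery(model, text string) string {
	return queryPrefixes[model] + text
}

func main() {
	query := "dark matter in spiral galaxies"
	fmt.Printf("%q\n", prepareQuery("all-minilm-l6-v2", query))
	fmt.Printf("%q\n", prepareQuery("bge-small-en-v1.5", query))
}
```

In this sketch the prefix is applied only to queries. Indexed documents would be embedded without it, which is how the upstream model is typically used for retrieval.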

Summary Models

| Name | Notes |
| --- | --- |
| `flan-t5-small` | Default summary model for `1.0.0`. Small, local, and easy to ship in a plain Go CLI. |

Models are downloaded from Hugging Face on first use and stored locally under ~/.texvec/models/.

How It Works

1. A supported text document is loaded from .txt, .md, or .markdown.
2. texvec computes a content hash to determine whether indexing work needs to be refreshed.
3. The selected summary model generates a document summary.
4. The selected embedding model embeds both the summary and overlapping chunks from the original document.
5. Search compares the query embedding against stored summary embeddings and chunk embeddings.
6. texvec merges those scores and returns document-level results ordered by cosine distance.
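The "overlapping chunks" in the steps above can be pictured with the sketch below. It is only one possible scheme: splitting on whitespace-delimited words, and the `size` and `overlap` values, are assumptions, since the README does not state how texvec sizes or splits chunks. `chunkWords` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkWords splits text into chunks of at most size words, where each chunk
// repeats the last overlap words of the previous one. It returns nil for
// invalid parameters or empty text.
// Assumption: word-based windows; texvec may chunk by tokens or characters.
func chunkWords(text string, size, overlap int) []string {
	if size <= 0 || overlap < 0 || overlap >= size {
		return nil
	}
	words := strings.Fields(text)
	step := size - overlap

	var chunks []string
	for start := 0; start < len(words); start += step {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, strings.Join(words[start:end], " "))
		if end == len(words) {
			break // the final window already reached the end of the text
		}
	}
	return chunks
}

func main() {
	text := "spiral galaxies contain gas dust stars and dark matter halos"
	for i, c := range chunkWords(text, 4, 2) {
		fmt.Printf("chunk %d: %s\n", i, c)
	}
}
```

The overlap keeps a sentence that straddles a boundary fully visible in at least one chunk, so a query about that sentence can still match a single chunk embedding.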

Data Storage

All runtime data lives in ~/.texvec/:

~/.texvec/
  config.json       # Configuration such as the default models
  texvec.db         # libsql database
  models/           # Downloaded ONNX model files and tokenizer assets
  lib/              # ONNX Runtime shared library

Use texvec clean to remove everything.
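For scripts or tools that need to locate these files, the layout above can be resolved as in the sketch below. The file and directory names come from the listing; the `paths` type and `pathsUnder` function are hypothetical and are not part of texvec's `config` package.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// paths collects the locations shown in the layout above.
// Hypothetical type for illustration only.
type paths struct {
	Root   string // ~/.texvec
	Config string // config.json
	DB     string // texvec.db
	Models string // models/
	Lib    string // lib/
}

// pathsUnder builds the texvec data paths relative to a home directory.
func pathsUnder(home string) paths {
	root := filepath.Join(home, ".texvec")
	return paths{
		Root:   root,
		Config: filepath.Join(root, "config.json"),
		DB:     filepath.Join(root, "texvec.db"),
		Models: filepath.Join(root, "models"),
		Lib:    filepath.Join(root, "lib"),
	}
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot resolve home directory:", err)
		os.Exit(1)
	}
	p := pathsUnder(home)
	fmt.Println("config:", p.Config)
	fmt.Println("db:    ", p.DB)
	fmt.Println("models:", p.Models)
	fmt.Println("lib:   ", p.Lib)
}
```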

Repository Layout

- `cmd/`: Cobra commands and user-facing output
- `core/`: Model registry, runtime setup, downloads, text chunking, summarization, and embedding pipeline
- `store/`: Schema migration, inserts, listing, and vector search queries
- `config/`: ~/.texvec path helpers and config bootstrapping
- `test_texts/`: Sample documents for manual testing

Platforms

| OS | Published Release | Hardware Acceleration |
| --- | --- | --- |
| macOS | `arm64` | CPU |
| Linux | `amd64` | CPU |

texvec currently defaults to CPU execution for predictable CLI behavior and cleaner output across machines.

ONNX Runtime 1.24.3 is downloaded automatically on first run.

Development

go test ./...
go build -o texvec

See CONTRIBUTING.md for contribution workflow and AGENTS.md for repo-specific agent instructions.

Built by Arcnem AI.

Documentation

Overview

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
