constellation

package
v0.0.0-...-dddc2fe Latest
Warning

This package is not in the latest version of its module.

Published: May 4, 2026 License: MIT Imports: 19 Imported by: 0

Documentation

Overview

Package constellation implements a knowledge graph over cogdocs.

This package indexes all *.cog.md files in the workspace and provides fast FTS5-powered search for cog-chat Tier 4 context retrieval.


Constants

const Schema = `` /* 4964-byte string literal not displayed */

Schema defines the SQLite database schema for the constellation graph. It is based on the Database Architect's design from the constellation-design council.


Functions

func BytesToFloat32

func BytesToFloat32(b []byte) []float32

BytesToFloat32 deserializes a little-endian byte slice back to float32s.

func CosineSimilarity

func CosineSimilarity(a, b []float32) float64

CosineSimilarity computes cosine similarity between two float32 vectors. Vectors are assumed to be L2-normalized (so this is just a dot product). Returns 0.0 if either vector is nil or lengths don't match.
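
The documented behavior can be illustrated with a minimal sketch. This is not the package's source: the lowercase name cosineSimilarity is a stand-in, and the code simply assumes, as the description says, that both inputs are already L2-normalized, so the dot product is the cosine.

```go
package main

import "fmt"

// cosineSimilarity is a hypothetical stand-in for CosineSimilarity.
// Assumption: both vectors are already L2-normalized, so the dot
// product equals the cosine of the angle between them.
func cosineSimilarity(a, b []float32) float64 {
	// Mirror the documented guard: nil input or mismatched lengths yield 0.0.
	if a == nil || b == nil || len(a) != len(b) {
		return 0.0
	}
	var dot float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
	}
	return dot
}

func main() {
	x := []float32{1, 0}
	y := []float32{0, 1}
	fmt.Println(cosineSimilarity(x, x))            // identical unit vectors
	fmt.Println(cosineSimilarity(x, y))            // orthogonal unit vectors
	fmt.Println(cosineSimilarity(x, []float32{1})) // length mismatch
}
```

Note that if the inputs are not normalized, a plain dot product like this is no longer bounded to [-1, 1], which is why the normalization assumption matters to callers.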

func Float32ToBytes

func Float32ToBytes(v []float32) []byte

Float32ToBytes serializes a float32 slice to a little-endian byte slice. Used for storing embeddings as BLOBs in SQLite.
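
As a rough illustration of the BLOB round trip that Float32ToBytes and BytesToFloat32 describe, the sketch below packs each value as 4 little-endian bytes. The lowercase function names are stand-ins, and the 4-bytes-per-value IEEE 754 layout is an assumption rather than something the documentation states.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// float32ToBytes is a hypothetical stand-in for Float32ToBytes.
// Assumption: each value is stored as its IEEE 754 bit pattern,
// 4 bytes, little-endian.
func float32ToBytes(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, f := range v {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(f))
	}
	return buf
}

// bytesToFloat32 is a hypothetical stand-in for BytesToFloat32.
// Trailing bytes that do not fill a whole float32 are ignored here.
func bytesToFloat32(b []byte) []float32 {
	out := make([]float32, len(b)/4)
	for i := range out {
		out[i] = math.Float32frombits(binary.LittleEndian.Uint32(b[4*i:]))
	}
	return out
}

func main() {
	blob := float32ToBytes([]float32{1.5, -2.25})
	fmt.Println(len(blob), bytesToFloat32(blob)) // prints "8 [1.5 -2.25]"
}
```

Under this layout a 128-dimension embedding occupies 512 bytes and a 768-dimension one 3072 bytes.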

func SortNodesByScore

func SortNodesByScore(candidates []NodeWithScore)

SortNodesByScore sorts candidates by combined score (descending).
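
A descending sort of this kind might look like the following sketch. The scored type is a trimmed, hypothetical stand-in for NodeWithScore, and the use of a stable sort is an assumption; the documentation only says the order is descending by combined score.

```go
package main

import (
	"fmt"
	"sort"
)

// scored is a trimmed, hypothetical stand-in for NodeWithScore.
type scored struct {
	ID            string
	CombinedScore float64
}

// sortByScore orders candidates in place, highest CombinedScore first.
// SliceStable keeps the original order of equal scores (an assumption).
func sortByScore(candidates []scored) {
	sort.SliceStable(candidates, func(i, j int) bool {
		return candidates[i].CombinedScore > candidates[j].CombinedScore
	})
}

func main() {
	c := []scored{{"a", 0.2}, {"b", 0.9}, {"c", 0.5}}
	sortByScore(c)
	for _, n := range c {
		fmt.Println(n.ID, n.CombinedScore)
	}
}
```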

Types

type BusEvent

type BusEvent struct {
	BusID     string // Bus identifier (e.g., "bus_chat_cog-discord-100000000000000001")
	Seq       int    // Sequence number within the bus
	Timestamp string // RFC3339Nano timestamp
	From      string // Sender (e.g., "http:user", "kernel:cogos")
	Type      string // Event type ("chat.request" or "chat.response")
	Content   string // Message content
	Hash      string // Content-addressed hash of the CogBlock
	Origin    string // Origin platform (e.g., "discord", "http", "claude-code")
	Agent     string // Agent name (optional)
	UserID    string // User identifier (optional)
	UserName  string // User display name (optional)
}

BusEvent represents a bus event to be indexed in the constellation. Only chat.request and chat.response events with non-empty content should be indexed — system events, tool invocations, etc. are skipped.
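
The indexing rule above can be expressed as a small predicate. This is a sketch only: busEvent is a trimmed copy of the struct, shouldIndex is a hypothetical helper that the package may not expose, and "non-empty" is taken literally as a non-empty string.

```go
package main

import "fmt"

// busEvent is a trimmed, hypothetical copy of BusEvent with only
// the fields the filter needs.
type busEvent struct {
	Type    string
	Content string
}

// shouldIndex reports whether an event qualifies for indexing:
// only chat.request and chat.response events with non-empty content.
func shouldIndex(evt busEvent) bool {
	if evt.Type != "chat.request" && evt.Type != "chat.response" {
		return false
	}
	return evt.Content != ""
}

func main() {
	events := []busEvent{
		{Type: "chat.request", Content: "what changed in the schema?"},
		{Type: "chat.response", Content: ""},
		{Type: "tool.invoke", Content: "ls"}, // hypothetical non-chat event type
	}
	for _, e := range events {
		fmt.Println(e.Type, shouldIndex(e))
	}
}
```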

type Cogdoc

type Cogdoc struct {
	ID               string
	Path             string
	Type             string
	Title            string
	Created          string
	Updated          string
	Sector           string
	Status           string
	Salience         string
	Confidence       string
	Ingested         string
	Tags             []string
	Refs             []Reference
	Content          string
	FrontmatterBytes int // Size of YAML frontmatter in bytes
}

Cogdoc represents a parsed cogdoc with frontmatter and content.

type Constellation

type Constellation struct {
	// contains filtered or unexported fields
}

Constellation provides access to the cogdoc knowledge graph.

func Open

func Open(workspaceRoot string) (*Constellation, error)

Open opens the constellation database at the workspace root. It creates the database and schema if they don't exist.

func (*Constellation) Close

func (c *Constellation) Close() error

Close closes the database connection.

func (*Constellation) DB

func (c *Constellation) DB() *sql.DB

DB returns the underlying sql.DB for direct queries.

func (*Constellation) FindLeafNodes

func (c *Constellation) FindLeafNodes(substanceThreshold float64, maxRefs int) ([]SubstanceMetrics, error)

FindLeafNodes returns documents with high substance and few references. These are "leaf" knowledge documents with actual content.

func (*Constellation) FindRoutingLayers

func (c *Constellation) FindRoutingLayers(substanceThreshold float64, minRefs int) ([]SubstanceMetrics, error)

FindRoutingLayers returns documents that are mostly metadata (low substance, high refs). These are potential "over-abstracted" documents that may be pure wiring.
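
The distinction between leaf nodes and routing layers can be sketched as two predicates over per-document metrics. The metrics type, the helper names, the example paths, and the inclusive/exclusive comparison choices are all assumptions; the package presumably evaluates these conditions in SQL.

```go
package main

import "fmt"

// metrics is a trimmed, hypothetical stand-in for SubstanceMetrics.
type metrics struct {
	Path           string
	SubstanceRatio float64
	RefCount       int
}

// isLeaf: high substance and few outgoing references.
func isLeaf(m metrics, substanceThreshold float64, maxRefs int) bool {
	return m.SubstanceRatio >= substanceThreshold && m.RefCount <= maxRefs
}

// isRoutingLayer: low substance and many outgoing references.
func isRoutingLayer(m metrics, substanceThreshold float64, minRefs int) bool {
	return m.SubstanceRatio < substanceThreshold && m.RefCount >= minRefs
}

func main() {
	docs := []metrics{
		{Path: "notes/fts5-tuning.cog.md", SubstanceRatio: 0.92, RefCount: 1},
		{Path: "maps/index.cog.md", SubstanceRatio: 0.15, RefCount: 12},
	}
	for _, d := range docs {
		fmt.Println(d.Path, isLeaf(d, 0.8, 2), isRoutingLayer(d, 0.3, 5))
	}
}
```

Because the two calls take independent thresholds, a document can fall into neither group, for example one with moderate substance and a moderate number of references.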

func (*Constellation) GetBacklinks

func (c *Constellation) GetBacklinks(docID string) ([]Node, error)

GetBacklinks finds documents that reference the given document ID.

func (*Constellation) GetRecentBySector

func (c *Constellation) GetRecentBySector(sector string, limit int) ([]Node, error)

GetRecentBySector returns recently updated documents in a sector.

func (*Constellation) GetRelated

func (c *Constellation) GetRelated(docID string, relType string) ([]Node, error)

GetRelated finds documents related to the given document ID by following edges.

func (*Constellation) Health

func (c *Constellation) Health() (map[string]interface{}, error)

Health returns database health information.

func (*Constellation) IndexBusEvent

func (c *Constellation) IndexBusEvent(evt BusEvent) error

IndexBusEvent indexes a single bus event into the constellation for full-text search. Inserts into both the documents table and documents_fts for immediate searchability. Idempotent: uses INSERT OR REPLACE keyed on the deterministic document ID.

func (*Constellation) IndexFile

func (c *Constellation) IndexFile(path string) error

IndexFile indexes a single cogdoc file into the constellation. This is the public entry point for incremental indexing (e.g., after a decomposition stores a new CogDoc). It handles its own transaction, FTS rebuild for the affected document, and optional async embedding.

func (*Constellation) IndexWorkspace

func (c *Constellation) IndexWorkspace() error

IndexWorkspace scans the workspace and indexes all cogdocs.

func (*Constellation) QueryRelevant

func (c *Constellation) QueryRelevant(anchor, goal string, limit int) ([]Node, error)

QueryRelevant searches for documents relevant to anchor and goal. Extracts keywords from anchor/goal and performs FTS search.

func (*Constellation) QueryRelevantWithEmbedding

func (c *Constellation) QueryRelevantWithEmbedding(anchor, goal string, maxCandidates, maxResults int, filter SubstanceFilterConfig, queryEmb128 []float32) ([]NodeWithScore, error)

QueryRelevantWithEmbedding is like QueryRelevantWithSubstance but also computes embedding similarity for each candidate (shadow scoring for Phase C). The heuristic score still controls final ranking — embedding scores are recorded for shadow logging and eventual blend-in.
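
Shadow scoring, as described here, means the embedding similarity is computed and stored but does not influence the order. A minimal sketch under that reading follows; scoredNode is a trimmed stand-in for NodeWithScore, shadowScore and dot are hypothetical helpers, and 2-dimensional vectors replace the 128-dimension ones for brevity.

```go
package main

import (
	"fmt"
	"sort"
)

// scoredNode is a trimmed, hypothetical stand-in for NodeWithScore.
type scoredNode struct {
	ID                  string
	CombinedScore       float64 // heuristic score that controls ranking
	EmbeddingSimilarity float64 // recorded for shadow logging only
	Embedding128        []float32
}

// dot assumes L2-normalized inputs; it returns 0 on empty or mismatched vectors.
func dot(a, b []float32) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	var s float64
	for i := range a {
		s += float64(a[i]) * float64(b[i])
	}
	return s
}

// shadowScore records the similarity on every candidate, then sorts by
// the heuristic CombinedScore alone.
func shadowScore(cands []scoredNode, queryEmb []float32) {
	for i := range cands {
		cands[i].EmbeddingSimilarity = dot(cands[i].Embedding128, queryEmb)
	}
	sort.SliceStable(cands, func(i, j int) bool {
		return cands[i].CombinedScore > cands[j].CombinedScore
	})
}

func main() {
	cands := []scoredNode{
		{ID: "close-in-embedding", CombinedScore: 0.4, Embedding128: []float32{1, 0}},
		{ID: "strong-heuristic", CombinedScore: 0.8, Embedding128: []float32{0, 1}},
	}
	shadowScore(cands, []float32{1, 0})
	for _, c := range cands {
		fmt.Println(c.ID, c.CombinedScore, c.EmbeddingSimilarity)
	}
}
```

In this example the candidate closest to the query in embedding space still ranks second, because only the heuristic score is used for ordering.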

func (*Constellation) QueryRelevantWithSubstance

func (c *Constellation) QueryRelevantWithSubstance(anchor, goal string, maxCandidates, maxResults int, filter SubstanceFilterConfig) ([]Node, error)

QueryRelevantWithSubstance searches for documents with substance-aware ranking. It fetches more candidates than needed, filters by substance ratio, and returns results ranked by a combined score of BM25 relevance and substance metrics.
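
One plausible shape for the filter-then-rank step is sketched below. The documentation does not give the scoring formula, so the weighted linear blend, the assumption that BM25 scores are normalized to 0.0-1.0 with higher being better, and the names candidate, filterConfig, and rank are all hypothetical. The leaf-node boost is omitted.

```go
package main

import (
	"fmt"
	"sort"
)

// filterConfig is a trimmed, hypothetical stand-in for SubstanceFilterConfig.
type filterConfig struct {
	MinSubstanceRatio float64
	BM25Weight        float64
	SubstanceWeight   float64
}

// candidate is a hypothetical over-fetched search hit.
type candidate struct {
	ID             string
	BM25Score      float64 // assumed normalized to 0.0-1.0, higher is better
	SubstanceRatio float64
	CombinedScore  float64
}

// rank drops low-substance candidates, scores the rest with an assumed
// weighted sum, and returns at most maxResults in descending order.
func rank(cands []candidate, cfg filterConfig, maxResults int) []candidate {
	kept := make([]candidate, 0, len(cands))
	for _, c := range cands {
		if c.SubstanceRatio < cfg.MinSubstanceRatio {
			continue
		}
		c.CombinedScore = cfg.BM25Weight*c.BM25Score + cfg.SubstanceWeight*c.SubstanceRatio
		kept = append(kept, c)
	}
	sort.SliceStable(kept, func(i, j int) bool {
		return kept[i].CombinedScore > kept[j].CombinedScore
	})
	if maxResults >= 0 && len(kept) > maxResults {
		kept = kept[:maxResults]
	}
	return kept
}

func main() {
	cfg := filterConfig{MinSubstanceRatio: 0.3, BM25Weight: 0.7, SubstanceWeight: 0.3}
	cands := []candidate{
		{ID: "index-page", BM25Score: 0.9, SubstanceRatio: 0.1},
		{ID: "design-note", BM25Score: 0.6, SubstanceRatio: 0.9},
		{ID: "howto", BM25Score: 0.8, SubstanceRatio: 0.5},
	}
	for _, c := range rank(cands, cfg, 2) {
		fmt.Printf("%s %.2f\n", c.ID, c.CombinedScore)
	}
}
```

Here "index-page" has the best BM25 score but is filtered out for low substance, which is the behavior the over-fetch (maxCandidates larger than maxResults) is meant to allow for.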

func (*Constellation) Search

func (c *Constellation) Search(query string, limit int) ([]Node, error)

Search performs full-text search across all cogdocs.

func (*Constellation) SearchWithFilters

func (c *Constellation) SearchWithFilters(query string, types []string, sector string, limit int) ([]Node, error)

SearchWithFilters performs filtered full-text search.

func (*Constellation) SetEmbedClient

func (c *Constellation) SetEmbedClient(client *EmbedClient)

SetEmbedClient sets the embedding client for async embedding on document writes. When set, newly indexed documents will be embedded asynchronously.

func (*Constellation) SubstanceReport

func (c *Constellation) SubstanceReport() ([]SubstanceMetrics, error)

SubstanceReport returns substance metrics aggregated by sector.

func (*Constellation) SubstanceReportByType

func (c *Constellation) SubstanceReportByType() ([]SubstanceMetrics, error)

SubstanceReportByType returns substance metrics aggregated by document type.

func (*Constellation) SubstanceSummary

func (c *Constellation) SubstanceSummary() (*SubstanceMetrics, error)

SubstanceSummary returns overall workspace substance statistics.

type EmbedClient

type EmbedClient struct {
	// contains filtered or unexported fields
}

EmbedClient calls the CogOS embedding server to generate vectors.

func NewEmbedClient

func NewEmbedClient(cfg EmbedConfig) *EmbedClient

NewEmbedClient creates a client for the embedding server. Prefers Unix socket if socketPath is set and the socket exists.

func (*EmbedClient) Embed

func (ec *EmbedClient) Embed(texts []string, prefix string) ([]EmbedResult, error)

Embed sends texts to the embedding server and returns vectors. prefix should be "search_document" for indexing or "search_query" for queries.

func (*EmbedClient) EmbedOne

func (ec *EmbedClient) EmbedOne(text string, prefix string) (*EmbedResult, error)

EmbedOne is a convenience wrapper for embedding a single text.

func (*EmbedClient) Healthy

func (ec *EmbedClient) Healthy() bool

Healthy checks if the embedding server is responding.

type EmbedConfig

type EmbedConfig struct {
	Enabled        bool   `yaml:"enabled"`
	ServerSocket   string `yaml:"server_socket"`
	ServerHTTP     string `yaml:"server_http"`
	DimsFull       int    `yaml:"dims_full"`
	DimsCompressed int    `yaml:"dims_compressed"`
	TimeoutMs      int    `yaml:"timeout_ms"`
}

EmbedConfig holds embedding server connection parameters.

func DefaultEmbedConfig

func DefaultEmbedConfig() EmbedConfig

DefaultEmbedConfig returns sensible defaults for the embedding server.

type EmbedIndexer

type EmbedIndexer struct {
	// contains filtered or unexported fields
}

EmbedIndexer handles embedding generation for documents in the constellation.

func NewEmbedIndexer

func NewEmbedIndexer(c *Constellation, client *EmbedClient) *EmbedIndexer

NewEmbedIndexer creates an indexer that generates embeddings for documents.

func (*EmbedIndexer) BackfillAll

func (ei *EmbedIndexer) BackfillAll(batchSize int) (int, error)

BackfillAll generates embeddings for all documents that don't have them yet, or whose content has changed since last embedding. Processes in batches. Returns the number of documents embedded.

func (*EmbedIndexer) CheckFreshness

func (ei *EmbedIndexer) CheckFreshness() (*EmbedStatus, error)

CheckFreshness reports how many documents need re-embedding.

func (*EmbedIndexer) EmbedSingleDoc

func (ei *EmbedIndexer) EmbedSingleDoc(docID string) error

EmbedSingleDoc generates and stores an embedding for a single document by ID. Used by the write hook for incremental updates.

type EmbedResult

type EmbedResult struct {
	Embedding768 []float32
	Embedding128 []float32
}

EmbedResult holds the computed embeddings for a single text.

type EmbedStatus

type EmbedStatus struct {
	TotalDocs  int      // Total documents in constellation
	Embedded   int      // Documents with embeddings
	Stale      int      // Documents where content changed since embedding
	Missing    int      // Documents without any embedding
	StalePaths []string // Paths of stale documents (for diagnostics)
}

EmbedStatus reports on the state of embeddings in the constellation.
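
How the counters relate to one another can be shown with a small sketch. The mechanism for detecting changed content is not documented; comparing a stored hash against the current content hash is an assumption here, and docState, its fields, and checkFreshness are hypothetical names.

```go
package main

import "fmt"

// docState is a hypothetical per-document record.
type docState struct {
	Path         string
	ContentHash  string // hash of the current content (assumed mechanism)
	EmbeddedHash string // hash recorded at embedding time; "" if never embedded
}

// embedStatus mirrors the fields of EmbedStatus.
type embedStatus struct {
	TotalDocs  int
	Embedded   int
	Stale      int
	Missing    int
	StalePaths []string
}

// checkFreshness classifies each document as missing, stale, or fresh.
// Stale documents still count as Embedded because an embedding exists.
func checkFreshness(docs []docState) embedStatus {
	var s embedStatus
	for _, d := range docs {
		s.TotalDocs++
		switch {
		case d.EmbeddedHash == "":
			s.Missing++
		case d.EmbeddedHash != d.ContentHash:
			s.Embedded++
			s.Stale++
			s.StalePaths = append(s.StalePaths, d.Path)
		default:
			s.Embedded++
		}
	}
	return s
}

func main() {
	s := checkFreshness([]docState{
		{Path: "a.cog.md", ContentHash: "h1", EmbeddedHash: "h1"},
		{Path: "b.cog.md", ContentHash: "h2", EmbeddedHash: "h0"},
		{Path: "c.cog.md", ContentHash: "h3"},
	})
	fmt.Printf("%+v\n", s)
}
```

Under this reading, the work a backfill has to do is Stale plus Missing, and TotalDocs equals Embedded plus Missing.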

func (*EmbedStatus) FormatStatus

func (s *EmbedStatus) FormatStatus() string

FormatStatus returns a human-readable summary of embedding freshness.

type Node

type Node struct {
	ID      string
	URI     string
	Type    string
	Title   string
	Path    string
	Content string
	Sector  string
	Status  string
	Rank    float64 // BM25 rank for FTS queries
}

Node represents a document node in the constellation graph.

type NodeWithScore

type NodeWithScore struct {
	Node                Node
	BM25Score           float64
	SubstanceScore      float64
	EmbeddingSimilarity float64 // Cosine similarity between query and doc embeddings
	ProbeScore          float64 // Trained probe relevance probability (Phase E)
	CombinedScore       float64
	IsLeaf              bool
	Embedding128        []float32 // Cached 128-dim embedding (avoids re-query for probe scoring)
}

NodeWithScore wraps a Node with its combined ranking score.

type Reference

type Reference struct {
	URI string
	Rel string
}

Reference represents a document reference from frontmatter.

type SubstanceFilterConfig

type SubstanceFilterConfig struct {
	MinSubstanceRatio float64 // Minimum substance ratio to include (0.0-1.0)
	PreferLeafNodes   bool    // Boost high-substance, low-ref documents
	LeafThreshold     float64 // Substance ratio to consider a "leaf node"
	LeafMaxRefs       int     // Max refs for a document to be a "leaf node"
	BM25Weight        float64 // Weight for BM25 relevance (0.0-1.0)
	SubstanceWeight   float64 // Weight for substance ratio (0.0-1.0)
}

SubstanceFilterConfig holds parameters for substance-aware filtering.

func DefaultSubstanceFilter

func DefaultSubstanceFilter() SubstanceFilterConfig

DefaultSubstanceFilter returns sensible defaults for substance filtering.

type SubstanceMetrics

type SubstanceMetrics struct {
	Path             string  // Document path (empty for aggregates)
	Sector           string  // Sector name
	Type             string  // Document type
	DocCount         int     // Number of documents (for aggregates)
	FrontmatterBytes int     // Total frontmatter bytes
	ContentBytes     int     // Total content bytes
	SubstanceRatio   float64 // Content / (Content + Frontmatter)
	RefCount         int     // Number of outgoing references
	RefDensity       float64 // Refs per KB of content
}

SubstanceMetrics represents substance analysis for a document or aggregate.
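
The two derived fields follow directly from the field comments. The sketch below computes them; the helper names are hypothetical, treating 1 KB as 1024 bytes is an assumption, and returning 0 for empty inputs is a choice made here to avoid division by zero.

```go
package main

import "fmt"

// substanceRatio computes Content / (Content + Frontmatter).
// Returns 0 for an empty document (a choice made for this sketch).
func substanceRatio(frontmatterBytes, contentBytes int) float64 {
	total := frontmatterBytes + contentBytes
	if total == 0 {
		return 0
	}
	return float64(contentBytes) / float64(total)
}

// refDensity computes outgoing references per KB of content.
// Assumption: 1 KB = 1024 bytes.
func refDensity(refCount, contentBytes int) float64 {
	if contentBytes == 0 {
		return 0
	}
	return float64(refCount) / (float64(contentBytes) / 1024.0)
}

func main() {
	// A document with 256 bytes of frontmatter, 768 bytes of content.
	fmt.Println(substanceRatio(256, 768)) // prints 0.75
	// Four references across 2048 bytes of content.
	fmt.Println(refDensity(4, 2048)) // prints 2
}
```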
