Documentation ¶
Index ¶
- Constants
- func ClassifyPath(path string) string
- type Categorizer
- type GitStore
- type Index
- func (idx *Index) CategoryCounts(domain string) (map[string]int, error)
- func (idx *Index) Close() error
- func (idx *Index) DeleteSite(domain string) error
- func (idx *Index) GetCacheHeaders(domain, urlPath string) (etag, lastModified string, err error)
- func (idx *Index) GetCategory(domain, urlPath string) (string, error)
- func (idx *Index) GetContentType(domain, urlPath string) (string, error)
- func (idx *Index) GetSummary(domain, urlPath string) (summary, summaryAt string, err error)
- func (idx *Index) IndexFile(domain, urlPath, contentType, body string, category ...string) error
- func (idx *Index) Rebuild(store *Store) error
- func (idx *Index) Search(query string, opts SearchOpts) ([]SearchHit, error)
- func (idx *Index) SetCategory(domain, urlPath, category string) error
- func (idx *Index) SetSummary(domain, urlPath, summary string) error
- func (idx *Index) UpdateCacheHeaders(domain, urlPath, etag, lastModified string) error
- type Indexer
- type LogEntry
- type RuleCategorizer
- type SearchHit
- type SearchOpts
- type Store
- func (s *Store) EnsureSiteDir(domain string) error
- func (s *Store) ListSites() ([]string, error)
- func (s *Store) MetaDir(domain string) string
- func (s *Store) ReadContent(domain, urlPath string) ([]byte, error)
- func (s *Store) ReadMeta(domain, name string, v any) error
- func (s *Store) SiteDir(domain string) string
- func (s *Store) SiteFileCount(domain string) (int, error)
- func (s *Store) SitesDir() string
- func (s *Store) WriteContent(domain, urlPath string, data []byte) (string, error)
- func (s *Store) WriteMeta(domain, name string, v any) error
- type VersionStore
Constants ¶
const (
	CatAPIReference = "api-reference"
	CatTutorial     = "tutorial"
	CatGuide        = "guide"
	CatSpec         = "spec"
	CatChangelog    = "changelog"
	CatMarketing    = "marketing"
	CatLegal        = "legal"
	CatCommunity    = "community"
	CatContext7     = "context7"
	CatIndex        = "index"
	CatOther        = "other"
)
Content categories classify pages by their purpose, enabling agents to filter search results by task (e.g. coding agents want api-reference, not marketing).
Variables ¶
This section is empty.
Functions ¶
func ClassifyPath ¶
ClassifyPath returns a content type string for a URL path.
Types ¶
type Categorizer ¶
Categorizer assigns semantic categories to indexed pages. Implementations can use path patterns, body analysis, ML models, etc.
type GitStore ¶
type GitStore struct {
	// contains filtered or unexported fields
}
GitStore wraps git operations on the workspace repository.
func InitGit ¶
InitGit initializes or opens the git repository in the workspace root. On a fresh workspace it creates a seed commit so HEAD is valid immediately.
func (*GitStore) Commit ¶
Commit stages all changes and creates a commit. Returns true if a commit was created, false if there was nothing to commit.
func (*GitStore) Diff ¶
Diff returns the diff between two refs (e.g., "HEAD~1", "HEAD"). If from is empty, it defaults to the parent of to.
func (*GitStore) HasChanges ¶
HasChanges returns true if the worktree has uncommitted changes.
type Index ¶
type Index struct {
	// contains filtered or unexported fields
}
Index manages the SQLite FTS5 search index.
func (*Index) CategoryCounts ¶
CategoryCounts returns category distribution for a domain.
func (*Index) DeleteSite ¶
DeleteSite removes all index entries for a domain.
func (*Index) GetCacheHeaders ¶
GetCacheHeaders returns the stored ETag and Last-Modified for a file.
func (*Index) GetCategory ¶
GetCategory returns the category for a specific file.
func (*Index) GetContentType ¶
GetContentType returns the content type for a specific file from the index.
func (*Index) GetSummary ¶
GetSummary returns the agent-submitted summary for a file, if any.
func (*Index) IndexFile ¶
IndexFile adds or updates a file in the search index. Category is derived automatically from path, content type, and body.
func (*Index) Search ¶
func (idx *Index) Search(query string, opts SearchOpts) ([]SearchHit, error)
Search performs a full-text search across all indexed content. Path matches are boosted: if a query term also appears in the file path, that result ranks higher. This helps agents find, for example, "getting-started" docs when searching for "getting started".
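The path boost described above can be illustrated with a small, self-contained sketch. It is not the package's implementation: the hit type, the boostPathMatches function, the additive bonus, and the "higher score is better" convention are all assumptions made for illustration, and the real index may compute ranks differently.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// hit is a hypothetical, trimmed-down search result.
type hit struct {
	Path  string
	Score float64 // higher is better in this sketch (an assumption)
}

// boostPathMatches adds bonus to a hit's score once for every query term
// that occurs in its path, then sorts best-first. The input is not mutated.
func boostPathMatches(query string, hits []hit, bonus float64) []hit {
	terms := strings.Fields(strings.ToLower(query))
	out := make([]hit, len(hits))
	copy(out, hits)
	for i := range out {
		path := strings.ToLower(out[i].Path)
		for _, term := range terms {
			if strings.Contains(path, term) {
				out[i].Score += bonus
			}
		}
	}
	sort.SliceStable(out, func(i, j int) bool { return out[i].Score > out[j].Score })
	return out
}

func main() {
	hits := []hit{
		{Path: "/docs/faq", Score: 1.0},
		{Path: "/docs/getting-started", Score: 0.8},
	}
	// Both "getting" and "started" occur in the second path, so it
	// receives the bonus twice and moves ahead of the FAQ page.
	for _, h := range boostPathMatches("getting started", hits, 0.5) {
		fmt.Printf("%.1f %s\n", h.Score, h.Path)
	}
}
```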
func (*Index) SetCategory ¶
SetCategory overrides the category for a specific file (agent feedback). The override is marked as user-set so re-indexing preserves it.
func (*Index) SetSummary ¶
SetSummary stores an agent-submitted summary for a file and re-indexes the FTS entry so the summary is searchable.
func (*Index) UpdateCacheHeaders ¶
UpdateCacheHeaders stores ETag and Last-Modified for a file.
type Indexer ¶
type Indexer interface {
	IndexFile(domain, path, contentType, body string, category ...string) error
	Search(query string, opts SearchOpts) ([]SearchHit, error)
	DeleteSite(domain string) error
	Rebuild(store *Store) error
	GetCacheHeaders(domain, path string) (etag, lastModified string, err error)
	UpdateCacheHeaders(domain, path, etag, lastModified string) error
	GetCategory(domain, path string) (string, error)
	GetContentType(domain, path string) (string, error)
	SetCategory(domain, path, category string) error
	GetSummary(domain, path string) (summary, summaryAt string, err error)
	SetSummary(domain, path, summary string) error
	CategoryCounts(domain string) (map[string]int, error)
	Close() error
}
Indexer is the interface for full-text search backends. The default implementation uses SQLite FTS5. Alternative implementations could provide semantic/vector search or different storage engines.
type LogEntry ¶
type LogEntry struct {
	Hash    string    `json:"hash"`
	Message string    `json:"message"`
	When    time.Time `json:"when"`
}
LogEntry describes a single commit. Log returns recent commit entries, optionally filtered to a site's path.
type RuleCategorizer ¶
type RuleCategorizer struct{}
RuleCategorizer is the default implementation using path patterns and body heuristics.
func (*RuleCategorizer) Categorize ¶
func (r *RuleCategorizer) Categorize(domain, path, contentType, body string) string
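A custom categorizer could look roughly like the sketch below. The definition of the Categorizer interface is not shown on this page, so the sketch assumes it consists of a single method with the same signature as RuleCategorizer.Categorize and declares it locally. The pathCategorizer type, its rules, and newPathCategorizer are hypothetical; the category strings are copied from the package constants.

```go
package main

import (
	"fmt"
	"strings"
)

// Categorizer is declared locally. That it consists of exactly this method
// is an assumption based on RuleCategorizer.Categorize.
type Categorizer interface {
	Categorize(domain, path, contentType, body string) string
}

// A subset of the package's category constants, copied for self-containment.
const (
	CatAPIReference = "api-reference"
	CatTutorial     = "tutorial"
	CatChangelog    = "changelog"
	CatOther        = "other"
)

// pathRule maps a lowercase path fragment to a category.
type pathRule struct {
	fragment string
	category string
}

// pathCategorizer is a hypothetical Categorizer that inspects only the path.
type pathCategorizer struct {
	rules []pathRule
}

// Categorize returns the category of the first rule whose fragment occurs in
// the lowercased path, or CatOther if no rule matches.
func (c pathCategorizer) Categorize(domain, path, contentType, body string) string {
	lower := strings.ToLower(path)
	for _, r := range c.rules {
		if strings.Contains(lower, r.fragment) {
			return r.category
		}
	}
	return CatOther
}

// newPathCategorizer returns a categorizer with a few illustrative rules.
func newPathCategorizer() Categorizer {
	return pathCategorizer{rules: []pathRule{
		{"/api", CatAPIReference},
		{"/tutorial", CatTutorial},
		{"changelog", CatChangelog},
	}}
}

func main() {
	c := newPathCategorizer()
	for _, p := range []string{"/docs/api/index.md", "/CHANGELOG.md", "/about"} {
		fmt.Printf("%-20s -> %s\n", p, c.Categorize("example.com", p, "text/markdown", ""))
	}
}
```

Rule order matters in this sketch because the first matching fragment wins; a body-aware or model-based implementation would replace the loop while keeping the same method signature.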
type SearchHit ¶
type SearchHit struct {
	Domain      string  `json:"domain"`
	Path        string  `json:"path"`
	ContentType string  `json:"content_type"`
	Category    string  `json:"category"`
	Snippet     string  `json:"snippet"`
	Summary     string  `json:"summary,omitempty"`
	Rank        float64 `json:"rank"`
}
SearchHit represents a single search result.
type SearchOpts ¶
type SearchOpts struct {
	Site        string // filter to a specific domain
	ContentType string // filter to a content type
	Category    string // filter to a category (e.g. "api-reference")
	Path        string // filter to paths containing this substring
	Limit       int
	Offset      int
}
SearchOpts controls search behavior.
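To make the filter semantics in the field comments concrete, here is a minimal in-memory sketch that applies the same options to a slice of hits. The real index presumably filters inside its SQL query; applyOpts is a hypothetical helper, the structs are trimmed local copies, and treating an empty string as "no filter" and a Limit of 0 as "no limit" are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// SearchHit is a trimmed local copy of the documented struct.
type SearchHit struct {
	Domain      string
	Path        string
	ContentType string
	Category    string
}

// SearchOpts is a local copy of the documented struct.
type SearchOpts struct {
	Site        string
	ContentType string
	Category    string
	Path        string
	Limit       int
	Offset      int
}

// applyOpts keeps the hits that pass every non-empty filter, then applies
// Offset and Limit. A Limit of 0 means "no limit" here (an assumption).
func applyOpts(hits []SearchHit, opts SearchOpts) []SearchHit {
	var out []SearchHit
	for _, h := range hits {
		if opts.Site != "" && h.Domain != opts.Site {
			continue
		}
		if opts.ContentType != "" && h.ContentType != opts.ContentType {
			continue
		}
		if opts.Category != "" && h.Category != opts.Category {
			continue
		}
		if opts.Path != "" && !strings.Contains(h.Path, opts.Path) {
			continue
		}
		out = append(out, h)
	}
	if opts.Offset >= len(out) {
		return nil
	}
	if opts.Offset > 0 {
		out = out[opts.Offset:]
	}
	if opts.Limit > 0 && opts.Limit < len(out) {
		out = out[:opts.Limit]
	}
	return out
}

func main() {
	hits := []SearchHit{
		{Domain: "example.com", Path: "/api/users", Category: "api-reference"},
		{Domain: "example.com", Path: "/blog/launch", Category: "marketing"},
		{Domain: "example.com", Path: "/api/orders", Category: "api-reference"},
	}
	opts := SearchOpts{Site: "example.com", Category: "api-reference", Path: "/api", Limit: 1, Offset: 1}
	for _, h := range applyOpts(hits, opts) {
		fmt.Println(h.Path)
	}
}
```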
type Store ¶
type Store struct {
	Root string // Root directory of the doctrove workspace
}
Store manages the filesystem layout for mirrored content.
func (*Store) EnsureSiteDir ¶
EnsureSiteDir creates the directory structure for a site.
func (*Store) ReadContent ¶
ReadContent reads a file from a site's directory by its URL path. If the direct path is a directory (promoted via conflict resolution), it falls back to reading <path>/_index.
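The fallback can be sketched as follows. This is an illustration under stated assumptions, not the Store's actual code: readWithIndexFallback and demoRead are hypothetical names, the site directory is passed in directly, and path sanitization (for example rejecting "..") is left out.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// readWithIndexFallback reads urlPath below siteDir. If the path resolves to
// a directory, it reads "<path>/_index" instead.
// Path sanitization is omitted in this sketch.
func readWithIndexFallback(siteDir, urlPath string) ([]byte, error) {
	p := filepath.Join(siteDir, filepath.FromSlash(strings.TrimPrefix(urlPath, "/")))
	info, err := os.Stat(p)
	if err != nil {
		return nil, err
	}
	if info.IsDir() {
		p = filepath.Join(p, "_index")
	}
	return os.ReadFile(p)
}

// demoRead builds a throwaway site directory and reads both a promoted
// directory ("/deploy") and an ordinary file below it.
func demoRead() (string, string, error) {
	siteDir, err := os.MkdirTemp("", "site-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(siteDir)

	deploy := filepath.Join(siteDir, "deploy")
	if err := os.MkdirAll(deploy, 0o755); err != nil {
		return "", "", err
	}
	if err := os.WriteFile(filepath.Join(deploy, "_index"), []byte("deploy overview"), 0o644); err != nil {
		return "", "", err
	}
	if err := os.WriteFile(filepath.Join(deploy, "getting_started"), []byte("first steps"), 0o644); err != nil {
		return "", "", err
	}

	index, err := readWithIndexFallback(siteDir, "/deploy")
	if err != nil {
		return "", "", err
	}
	child, err := readWithIndexFallback(siteDir, "/deploy/getting_started")
	if err != nil {
		return "", "", err
	}
	return string(index), string(child), nil
}

func main() {
	index, child, err := demoRead()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(index)
	fmt.Println(child)
}
```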
func (*Store) SiteFileCount ¶
SiteFileCount returns the number of content files (excluding _meta) for a site.
func (*Store) WriteContent ¶
WriteContent writes content to the appropriate path under a site's directory. urlPath is the path portion of the URL (e.g., "/llms.txt", "/docs/api.html.md").
If a path component already exists as a regular file (e.g., writing "/deploy/getting_started" when "/deploy" is a file), the conflicting file is promoted into a directory by renaming it to "<dir>/_index". This matches how web servers treat directory-index pages.
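The promotion step can be sketched like this. It is a simplified illustration rather than the Store's implementation: writeWithPromotion and demoWrite are hypothetical names, the ".promoting" temporary name is invented, the returned string is assumed to be the file path written, and the sequence of renames is not atomic. Writing to a path that is already a directory is not handled.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// writeWithPromotion writes data to urlPath below siteDir and returns the
// file path written. A regular file found where a parent directory is
// needed is moved to "<dir>/_index" first.
func writeWithPromotion(siteDir, urlPath string, data []byte) (string, error) {
	rel := strings.Trim(urlPath, "/")
	parts := strings.Split(rel, "/")
	dir := siteDir
	// Walk every parent component (all parts except the final file name).
	for _, part := range parts[:len(parts)-1] {
		dir = filepath.Join(dir, part)
		info, err := os.Stat(dir)
		if os.IsNotExist(err) {
			break // nothing below here exists yet; MkdirAll creates it
		}
		if err != nil {
			return "", err
		}
		if info.IsDir() {
			continue
		}
		// A regular file is in the way: turn it into "<dir>/_index".
		tmp := dir + ".promoting"
		if err := os.Rename(dir, tmp); err != nil {
			return "", err
		}
		if err := os.Mkdir(dir, 0o755); err != nil {
			return "", err
		}
		if err := os.Rename(tmp, filepath.Join(dir, "_index")); err != nil {
			return "", err
		}
	}
	target := filepath.Join(siteDir, filepath.FromSlash(rel))
	if err := os.MkdirAll(filepath.Dir(target), 0o755); err != nil {
		return "", err
	}
	if err := os.WriteFile(target, data, 0o644); err != nil {
		return "", err
	}
	return target, nil
}

// demoWrite writes "/deploy" as a file, then writes below it, and returns
// the contents of the promoted index and of the new child file.
func demoWrite() (string, string, error) {
	siteDir, err := os.MkdirTemp("", "site-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(siteDir)

	if _, err := writeWithPromotion(siteDir, "/deploy", []byte("deploy overview")); err != nil {
		return "", "", err
	}
	if _, err := writeWithPromotion(siteDir, "/deploy/getting_started", []byte("first steps")); err != nil {
		return "", "", err
	}
	index, err := os.ReadFile(filepath.Join(siteDir, "deploy", "_index"))
	if err != nil {
		return "", "", err
	}
	child, err := os.ReadFile(filepath.Join(siteDir, "deploy", "getting_started"))
	if err != nil {
		return "", "", err
	}
	return string(index), string(child), nil
}

func main() {
	index, child, err := demoWrite()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(index)
	fmt.Println(child)
}
```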
type VersionStore ¶
type VersionStore interface {
	Commit(message string) (bool, error)
	Log(site string, limit int) ([]LogEntry, error)
	Diff(from, to string) (string, error)
	HasChanges() (bool, error)
}
VersionStore provides git-based change tracking for mirrored content. The default implementation uses go-git. Alternative implementations could use shell git, a different VCS, or a no-op for environments without git.
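A no-op implementation for environments without git might look like the sketch below. The interface and LogEntry are copied locally from the declarations above so the example compiles on its own; noopVersionStore is a hypothetical name, and returning empty results with nil errors is one possible design choice, not documented behavior.

```go
package main

import (
	"fmt"
	"time"
)

// LogEntry and VersionStore are local copies of the documented declarations.
type LogEntry struct {
	Hash    string    `json:"hash"`
	Message string    `json:"message"`
	When    time.Time `json:"when"`
}

type VersionStore interface {
	Commit(message string) (bool, error)
	Log(site string, limit int) ([]LogEntry, error)
	Diff(from, to string) (string, error)
	HasChanges() (bool, error)
}

// noopVersionStore is a hypothetical VersionStore that tracks nothing.
type noopVersionStore struct{}

// Commit never creates a commit, so it always reports false.
func (noopVersionStore) Commit(message string) (bool, error) { return false, nil }

// Log has no history to report.
func (noopVersionStore) Log(site string, limit int) ([]LogEntry, error) { return nil, nil }

// Diff has nothing to compare and returns an empty diff.
func (noopVersionStore) Diff(from, to string) (string, error) { return "", nil }

// HasChanges always reports a clean state.
func (noopVersionStore) HasChanges() (bool, error) { return false, nil }

// Compile-time check that the no-op type satisfies the interface.
var _ VersionStore = noopVersionStore{}

func main() {
	var vs VersionStore = noopVersionStore{}
	created, err := vs.Commit("sync example.com")
	fmt.Println(created, err)
}
```

Callers that depend only on the interface can swap this in without changes, at the cost of losing history and diffs.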