Documentation ¶
Overview ¶
Package context provides building blocks for context construction in agentic systems.
This package implements the "embed context, don't make agents discover it" pattern. Consumers use these utilities to build ConstructedContext before dispatching agents, enabling precise token budget management and eliminating runtime context discovery.
The key insight is that SemStreams provides the HOW (building blocks), while consumers decide the WHAT (what's relevant for their domain). This separation allows domain-specific context construction while providing reusable utilities for token management, batch queries, and LLM-friendly formatting.
Core Types ¶
ConstructedContext wraps formatted context with token count and source tracking. It contains everything needed to embed context in an agent task:
- Content: The formatted string ready for LLM consumption
- TokenCount: Estimated token count for budget management
- Entities: Entity IDs included in the context
- Sources: Provenance tracking (where context came from)
- ConstructedAt: Timestamp for cache management
Source tracks where context originated. Source types include:
- graph_entity: Context from a knowledge graph entity
- graph_relationship: Context from graph relationships
- document: Context from a document or chunk
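As a rough picture of how these fields fit together, the sketch below declares local stand-ins for the two types. The Go field types, the Source fields (Type, ID), and the newContext helper are assumptions made for illustration; the canonical definitions live in pkg/types/context.go and may differ.

```go
package main

import (
	"fmt"
	"time"
)

// Source is a hypothetical stand-in for types.ContextSource.
// The field names Type and ID are assumptions, not the real definition.
type Source struct {
	Type string // "graph_entity", "graph_relationship", or "document"
	ID   string
}

// ConstructedContext mirrors the documented field names; the Go types are assumed.
type ConstructedContext struct {
	Content       string
	TokenCount    int
	Entities      []string
	Sources       []Source
	ConstructedAt time.Time
}

// newContext is an illustrative constructor, not the package's NewConstructedContext.
func newContext(content string, entities []string, sources []Source) *ConstructedContext {
	return &ConstructedContext{
		Content:       content,
		TokenCount:    (len(content) + 3) / 4, // assumed 4-chars-per-token estimate, rounded up
		Entities:      entities,
		Sources:       sources,
		ConstructedAt: time.Now(),
	}
}

func main() {
	c := newContext("order-42 ships from depot-7", []string{"order-42"},
		[]Source{{Type: "graph_entity", ID: "order-42"}})
	fmt.Println(c.TokenCount, len(c.Sources), c.ConstructedAt.IsZero())
}
```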
Building Block Functions ¶
Token estimation functions help manage context budgets:
- EstimateTokens: Estimates tokens using a 4-characters-per-token heuristic
- EstimateTokensForModel: Model-specific token estimation
- FitsInBudget: Checks if content fits within token budget
- TruncateToBudget: Truncates at word boundaries to fit budget
- BudgetAllocation: Tracks token budget allocation across sections
Batch graph query functions fetch entities efficiently:
- BatchQueryEntities: Batch entity lookup with default options
- BatchQueryEntitiesWithOptions: Configurable batch queries with relationships
- ExpandWithNeighbors: Expands entity IDs to include N-hop neighbors
- CollectEntityIDs: Extracts unique entity IDs from relationships
Context formatting functions prepare content for LLMs:
- FormatEntitiesForContext: Formats entities for LLM consumption
- FormatRelationshipsForContext: Formats relationships for LLM consumption
- FormatBatchResultForContext: Formats complete batch results
- BuildContextFromBatch: Creates ConstructedContext from batch result
Example Usage ¶
Building context for an agent task:
// Query entities from the graph
result, err := context.BatchQueryEntitiesWithOptions(ctx, client, entityIDs,
	context.BatchQueryOptions{
		IncludeRelationships: true,
		Depth:                1,
	})
if err != nil {
	return err
}

// Build constructed context with token tracking
opts := context.FormatOptions{
	MaxTokens:      8000,
	PrettyPrint:    true,
	SectionHeaders: true,
}
constructed, err := context.BuildContextFromBatch(result, opts)
if err != nil {
	return err
}

// Embed in TaskMessage; the token count is already computed
task.Context = constructed
Token budget management:
budget := context.NewBudgetAllocation(10000)
budget.Allocate("system_prompt", 500)
budget.Allocate("entities", 4000)
remaining := budget.Remaining() // 5500 for other content
Integration with Workflows ¶
When using ConstructedContext with the workflow processor's publish_agent action, the context is embedded directly in the TaskMessage. This enables the pattern:
- Consumer builds context using domain-specific logic
- Token count is known before agent dispatch
- Agent loop receives pre-built context (no discovery needed)
- Fresh context per task (no pollution from prior work)
Design Rationale ¶
This package follows the principle that "what's relevant" is domain knowledge. A code review system has different relevance criteria than a logistics system. Rather than embedding domain-specific heuristics, SemStreams provides utilities that any domain can use:
- Token counting and budget management
- Efficient batch graph queries
- LLM-friendly formatting
- Source tracking for provenance
The consumer (e.g., SemSpec) implements the relevance logic and uses these building blocks to construct the final context.
Index ¶
- Constants
- func CollectEntityIDs(relationships []Relationship) []string
- func CountWords(s string) int
- func EstimateTokens(s string) int
- func EstimateTokensForModel(s string, model string) int
- func ExpandWithNeighbors(ctx context.Context, client GraphClient, entityIDs []string, depth int) ([]string, error)
- func FitsInBudget(content string, budget int) bool
- func FormatBatchResultForContext(result *BatchQueryResult, opts FormatOptions) (string, int, error)
- func FormatEntitiesForContext(entities map[string]json.RawMessage, opts FormatOptions) (string, int, error)
- func FormatRelationshipsForContext(relationships []Relationship, opts FormatOptions) (string, int, error)
- func TokensFromWords(wordCount int) int
- func TruncateToBudget(content string, budget int) string
- type BatchQueryOptions
- type BatchQueryResult
- type BudgetAllocation
- type ConstructedContext
- type FormatOptions
- type GraphClient
- type Relationship
- type Source
Constants ¶
const DefaultCharsPerToken = 4
DefaultCharsPerToken is the average number of characters per token assumed for most LLMs. Claude uses roughly 4 characters per token for English text.
Variables ¶
This section is empty.
Functions ¶
func CollectEntityIDs ¶
func CollectEntityIDs(relationships []Relationship) []string
CollectEntityIDs extracts unique entity IDs from relationships.
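One way to picture the deduplication is a seen-set that preserves first-seen order. This is only a sketch: the Relationship struct here is a hypothetical stand-in for types.Relationship, and its From and To fields are assumptions.

```go
package main

import "fmt"

// Relationship is a hypothetical stand-in for types.Relationship.
// The From/To/Type field names are assumed for this example.
type Relationship struct {
	From, To, Type string
}

// collectEntityIDs returns each entity ID once, in first-seen order.
func collectEntityIDs(rels []Relationship) []string {
	seen := make(map[string]struct{})
	var ids []string
	for _, r := range rels {
		for _, id := range []string{r.From, r.To} {
			if id == "" {
				continue // skip empty endpoints
			}
			if _, ok := seen[id]; ok {
				continue // already collected
			}
			seen[id] = struct{}{}
			ids = append(ids, id)
		}
	}
	return ids
}

func main() {
	rels := []Relationship{
		{From: "order-42", To: "depot-7", Type: "ships_from"},
		{From: "depot-7", To: "region-north", Type: "located_in"},
	}
	fmt.Println(collectEntityIDs(rels)) // [order-42 depot-7 region-north]
}
```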
func CountWords ¶
func CountWords(s string) int
CountWords counts words in a string (useful for rough estimates).
func EstimateTokens ¶
func EstimateTokens(s string) int
EstimateTokens estimates the token count for a string. It uses a heuristic of ~4 characters per token, which is a reasonable approximation for English text with Claude models.
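The heuristic amounts to a single division. The sketch below is an approximation of the idea, not the package's code: rounding up and counting bytes rather than runes are both assumptions, and fitsInBudget is included only to show how a budget check can build on the estimate.

```go
package main

import "fmt"

// charsPerToken mirrors DefaultCharsPerToken.
const charsPerToken = 4

// estimateTokens divides the byte length by charsPerToken, rounding up.
// Rounding up and using bytes (not runes) are assumptions of this sketch.
func estimateTokens(s string) int {
	if s == "" {
		return 0
	}
	return (len(s) + charsPerToken - 1) / charsPerToken
}

// fitsInBudget reports whether the estimate is within the token budget.
func fitsInBudget(content string, budget int) bool {
	return estimateTokens(content) <= budget
}

func main() {
	text := "hello world" // 11 bytes
	fmt.Println(estimateTokens(text), fitsInBudget(text, 2)) // 3 false
}
```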
func EstimateTokensForModel ¶
func EstimateTokensForModel(s string, model string) int
EstimateTokensForModel estimates tokens for a specific model. Currently all models use the same estimate, but the model parameter leaves room for model-specific adjustments in the future.
func ExpandWithNeighbors ¶
func ExpandWithNeighbors(ctx context.Context, client GraphClient, entityIDs []string, depth int) ([]string, error)
ExpandWithNeighbors expands entity IDs to include their neighbors, following relationships up to depth hops.
func FitsInBudget ¶
func FitsInBudget(content string, budget int) bool
FitsInBudget checks if content fits within a token budget.
func FormatBatchResultForContext ¶
func FormatBatchResultForContext(result *BatchQueryResult, opts FormatOptions) (string, int, error)
FormatBatchResultForContext formats a BatchQueryResult for LLM context.
func FormatEntitiesForContext ¶
func FormatEntitiesForContext(entities map[string]json.RawMessage, opts FormatOptions) (string, int, error)
FormatEntitiesForContext formats entity data for LLM context. Returns the formatted string, token count, and any error.
func FormatRelationshipsForContext ¶
func FormatRelationshipsForContext(relationships []Relationship, opts FormatOptions) (string, int, error)
FormatRelationshipsForContext formats relationships for LLM context.
func TokensFromWords ¶
func TokensFromWords(wordCount int) int
TokensFromWords estimates tokens from word count. Roughly 1.3 tokens per word for English.
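The word-based estimate can be sketched with integer arithmetic, which avoids floating-point rounding surprises. Splitting on whitespace and rounding up are assumptions here; the package's own definitions of a word and of rounding are not documented.

```go
package main

import (
	"fmt"
	"strings"
)

// countWords treats any run of whitespace as a separator (an assumption).
func countWords(s string) int {
	return len(strings.Fields(s))
}

// tokensFromWords applies 1.3 tokens per word as ceil(n*13/10).
// Rounding up is an assumption of this sketch.
func tokensFromWords(n int) int {
	if n <= 0 {
		return 0
	}
	return (n*13 + 9) / 10
}

func main() {
	text := "the quick brown fox jumps over the lazy dog today"
	words := countWords(text)
	fmt.Println(words, tokensFromWords(words)) // 10 13
}
```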
func TruncateToBudget ¶
func TruncateToBudget(content string, budget int) string
TruncateToBudget truncates content to fit within a token budget. Attempts to truncate at word boundaries.
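Word-boundary truncation might look like the sketch below: cut at the character limit implied by the budget, then back up to the last space if the cut landed mid-word. This is an assumed approach, not the package's code, and it slices bytes, so it is only safe as written for ASCII content.

```go
package main

import (
	"fmt"
	"strings"
)

// charsPerToken mirrors DefaultCharsPerToken.
const charsPerToken = 4

// truncateToBudget keeps at most budget*charsPerToken bytes, preferring
// to end on a word boundary. Byte slicing assumes ASCII input.
func truncateToBudget(content string, budget int) string {
	if budget <= 0 {
		return ""
	}
	maxChars := budget * charsPerToken
	if len(content) <= maxChars {
		return content
	}
	cut := content[:maxChars]
	// If the cut landed inside a word, back up to the previous space.
	if content[maxChars] != ' ' {
		if i := strings.LastIndexByte(cut, ' '); i > 0 {
			cut = cut[:i]
		}
	}
	return strings.TrimRight(cut, " ")
}

func main() {
	fmt.Printf("%q\n", truncateToBudget("alpha beta gamma delta", 3)) // "alpha beta"
}
```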
Types ¶
type BatchQueryOptions ¶
type BatchQueryOptions struct {
	IncludeRelationships bool
	Depth                int
	MaxConcurrent        int // Max concurrent queries (default: 10)
}
BatchQueryOptions configures batch query behavior.
type BatchQueryResult ¶
type BatchQueryResult struct {
	Entities      map[string]json.RawMessage
	Relationships []Relationship
	NotFound      []string
	Errors        map[string]error
}
BatchQueryResult contains results from a batch query.
func BatchQueryEntities ¶
func BatchQueryEntities(ctx context.Context, client GraphClient, entityIDs []string) (*BatchQueryResult, error)
BatchQueryEntities performs batch entity lookups efficiently. Returns all found entities and tracks which were not found.
func BatchQueryEntitiesWithOptions ¶
func BatchQueryEntitiesWithOptions(ctx context.Context, client GraphClient, entityIDs []string, opts BatchQueryOptions) (*BatchQueryResult, error)
BatchQueryEntitiesWithOptions performs batch entity lookups with options.
type BudgetAllocation ¶
BudgetAllocation helps allocate token budget across multiple content sections.
func NewBudgetAllocation ¶
func NewBudgetAllocation(totalBudget int) *BudgetAllocation
NewBudgetAllocation creates a new budget allocation tracker.
func (*BudgetAllocation) Allocate ¶
func (b *BudgetAllocation) Allocate(section string, requested int) int
Allocate allocates budget for a section. Returns the actual allocation (may be less than requested if budget is exhausted).
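The clamping behavior described above can be sketched as follows. The budget struct and its fields are hypothetical; only the Allocate and Remaining semantics are taken from the documentation, and treating negative requests as zero is an assumption.

```go
package main

import "fmt"

// budget is a hypothetical stand-in for BudgetAllocation.
type budget struct {
	total    int
	used     int
	sections map[string]int
}

func newBudget(total int) *budget {
	return &budget{total: total, sections: make(map[string]int)}
}

// Remaining returns the unallocated portion of the budget.
func (b *budget) Remaining() int { return b.total - b.used }

// Allocate grants min(requested, remaining) and records it for the section.
func (b *budget) Allocate(section string, requested int) int {
	grant := requested
	if grant < 0 {
		grant = 0 // assumption: negative requests are treated as zero
	}
	if r := b.Remaining(); grant > r {
		grant = r
	}
	b.sections[section] += grant
	b.used += grant
	return grant
}

func main() {
	b := newBudget(100)
	fmt.Println(b.Allocate("system_prompt", 60)) // 60
	fmt.Println(b.Allocate("entities", 60))      // 40, clamped to what is left
	fmt.Println(b.Remaining())                   // 0
}
```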
func (*BudgetAllocation) AllocateProportionally ¶
func (b *BudgetAllocation) AllocateProportionally(sections []string, weights []float64) map[string]int
AllocateProportionally allocates remaining budget proportionally across sections.
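A proportional split divides the remaining budget by each section's share of the total weight. The standalone function below is a sketch of that arithmetic under assumptions: shares are rounded down (so a few tokens may stay unallocated), non-positive weights receive nothing, and mismatched slice lengths yield an empty map. The package's handling of these cases is not documented.

```go
package main

import "fmt"

// splitProportionally divides remaining across sections by weight, rounding down.
func splitProportionally(remaining int, sections []string, weights []float64) map[string]int {
	out := make(map[string]int, len(sections))
	if len(sections) != len(weights) || remaining <= 0 {
		return out
	}
	var sum float64
	for _, w := range weights {
		if w > 0 {
			sum += w
		}
	}
	if sum == 0 {
		return out
	}
	for i, s := range sections {
		if weights[i] <= 0 {
			out[s] = 0
			continue
		}
		out[s] = int(float64(remaining) * weights[i] / sum)
	}
	return out
}

func main() {
	got := splitProportionally(1000, []string{"entities", "relationships"}, []float64{3, 1})
	fmt.Println(got["entities"], got["relationships"]) // 750 250
}
```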
func (*BudgetAllocation) Remaining ¶
func (b *BudgetAllocation) Remaining() int
Remaining returns the remaining budget.
type ConstructedContext ¶
type ConstructedContext = types.ConstructedContext
ConstructedContext is an alias for types.ConstructedContext. The canonical type is defined in pkg/types/context.go.
func BuildContextFromBatch ¶
func BuildContextFromBatch(result *BatchQueryResult, opts FormatOptions) (*ConstructedContext, error)
BuildContextFromBatch creates a ConstructedContext from a BatchQueryResult.
func NewConstructedContext ¶
func NewConstructedContext(content string, entities []string, sources []Source) *ConstructedContext
NewConstructedContext creates a new ConstructedContext from parts.
type FormatOptions ¶
type FormatOptions struct {
	MaxTokens       int      // Max tokens for output
	PrettyPrint     bool     // Pretty print JSON
	IncludeMetadata bool     // Include entity metadata
	EntityOrder     []string // Explicit order for entities (if empty, uses map iteration order, which Go does not guarantee)
	SectionHeaders  bool     // Add section headers
}
FormatOptions configures context formatting.
func DefaultFormatOptions ¶
func DefaultFormatOptions() FormatOptions
DefaultFormatOptions returns sensible defaults for formatting.
type GraphClient ¶
type GraphClient interface {
	// QueryEntities fetches multiple entities by their IDs
	QueryEntities(ctx context.Context, entityIDs []string) (map[string]json.RawMessage, error)
	// QueryRelationships fetches relationships for an entity with depth
	QueryRelationships(ctx context.Context, entityID string, depth int) ([]Relationship, error)
}
GraphClient defines the interface for batch graph query operations. This mirrors the interface in workflow actions but is defined here for use by context construction utilities.
type Relationship ¶
type Relationship = types.Relationship
Relationship is an alias for the shared type types.Relationship.
type Source ¶
type Source = types.ContextSource
Source is an alias for types.ContextSource, tracking where context came from. The canonical type is defined in pkg/types/context.go.
func DocumentSource ¶
DocumentSource creates a Source for a document.
func EntitySource ¶
EntitySource creates a Source for a graph entity.
func RelationshipSource ¶
RelationshipSource creates a Source for a graph relationship.