context

package

Version: v1.0.0-alpha.23 (latest) | Published: Mar 10, 2026 | License: MIT | Imports: 9 | Imported by: 0

Repository

github.com/c360studio/semstreams

README ¶

context

Building blocks for context construction in agentic systems.

Overview

The context package implements the "embed context, don't make agents discover it" pattern. It provides utilities for:

Token estimation - Manage context budgets precisely
Batch graph queries - Efficiently fetch entities and relationships
Context formatting - Prepare content for LLM consumption
Source tracking - Track provenance of context

SemStreams provides the HOW (building blocks), consumers decide the WHAT (relevance).

Quick Start

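// Note: this package shares its name with the standard library's context
// package; alias one of the two imports if a file needs both.
import (
    "fmt"

    "github.com/c360studio/semstreams/pkg/context"
)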

// Query entities from the graph
result, err := context.BatchQueryEntitiesWithOptions(ctx, graphClient, entityIDs,
    context.BatchQueryOptions{
        IncludeRelationships: true,
        Depth:                1,
    })
if err != nil {
    return err
}

// Build constructed context with token tracking
opts := context.FormatOptions{
    MaxTokens:      8000,
    PrettyPrint:    true,
    SectionHeaders: true,
}
constructed, err := context.BuildContextFromBatch(result, opts)
if err != nil {
    return err
}

// Embed in TaskMessage - token count is exact
task.Context = constructed
fmt.Printf("Context uses %d tokens\n", constructed.TokenCount)

API Reference

Core Types

Type	Description
`ConstructedContext`	Formatted context with token count and source tracking
`Source`	Tracks where context originated (entity, relationship, document)
`BatchQueryResult`	Results from batch entity queries
`BudgetAllocation`	Tracks token budget allocation across sections
`FormatOptions`	Configures context formatting

Token Estimation

Function	Description
`EstimateTokens(s string) int`	Estimate tokens (~4 chars/token)
`EstimateTokensForModel(s, model string) int`	Model-specific estimation
`FitsInBudget(content string, budget int) bool`	Check if content fits budget
`TruncateToBudget(content string, budget int) string`	Truncate at word boundaries
`CountWords(s string) int`	Count words in string
`TokensFromWords(wordCount int) int`	Estimate tokens from word count

Batch Graph Queries

Function	Description
`BatchQueryEntities(ctx, client, entityIDs)`	Batch lookup with defaults
`BatchQueryEntitiesWithOptions(ctx, client, entityIDs, opts)`	Configurable batch lookup
`ExpandWithNeighbors(ctx, client, entityIDs, depth)`	Expand to include N-hop neighbors
`CollectEntityIDs(relationships)`	Extract unique entity IDs

Context Formatting

Function	Description
`FormatEntitiesForContext(entities, opts)`	Format entities for LLM
`FormatRelationshipsForContext(relationships, opts)`	Format relationships for LLM
`FormatBatchResultForContext(result, opts)`	Format complete batch result
`BuildContextFromBatch(result, opts)`	Create ConstructedContext

Helper Functions

Function	Description
`NewConstructedContext(content, entities, sources)`	Create ConstructedContext
`EntitySource(entityID)`	Create entity source
`RelationshipSource(relationshipID)`	Create relationship source
`DocumentSource(docID)`	Create document source
`NewBudgetAllocation(totalBudget)`	Create budget tracker

Token Budget Management

The package provides tools for managing token budgets across context sections:

// Allocate budget across sections
budget := context.NewBudgetAllocation(10000)
budget.Allocate("system_prompt", 500)
budget.Allocate("entities", 4000)
budget.Allocate("relationships", 2000)
remaining := budget.Remaining() // 3500 for conversation

// Or allocate proportionally (budget is reassigned here because it is already declared above)
budget = context.NewBudgetAllocation(8000)
budget.Allocate("system_prompt", 500)
allocations := budget.AllocateProportionally(
    []string{"entities", "relationships", "history"},
    []float64{0.5, 0.2, 0.3},
)

BatchQueryOptions

Configure batch queries with these options:

Option	Type	Default	Description
`IncludeRelationships`	bool	false	Fetch relationships for each entity
`Depth`	int	0	Relationship traversal depth
`MaxConcurrent`	int	10	Max concurrent relationship queries
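
The MaxConcurrent option caps how many relationship queries run at once. The package's internals are not shown on this page, so the following is only a minimal sketch of one common way to bound concurrency in Go, using a buffered channel as a semaphore. The helper name forEachBounded and the fallback to 10 for non-positive values are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// defaultMaxConcurrent mirrors the documented default for MaxConcurrent.
const defaultMaxConcurrent = 10

// forEachBounded is a hypothetical helper (not part of the package API).
// It calls fn once per ID with at most maxConcurrent calls in flight.
func forEachBounded(ids []string, maxConcurrent int, fn func(id string) int) map[string]int {
	if maxConcurrent <= 0 {
		maxConcurrent = defaultMaxConcurrent // assumption: non-positive means "use default"
	}
	sem := make(chan struct{}, maxConcurrent) // one slot per allowed worker
	results := make(map[string]int, len(ids))
	var mu sync.Mutex
	var wg sync.WaitGroup

	for _, id := range ids {
		sem <- struct{}{} // blocks while maxConcurrent workers are running
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot
			n := fn(id)
			mu.Lock()
			results[id] = n
			mu.Unlock()
		}(id)
	}
	wg.Wait()
	return results
}

func main() {
	ids := []string{"svc-a", "svc-b", "svc-c"}
	// The callback stands in for a per-entity relationship query.
	counts := forEachBounded(ids, 2, func(id string) int { return len(id) })
	fmt.Println(len(counts), "entities processed")
}
```

A lower limit trades latency for less load on the graph backend, which is presumably why the option is exposed.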

FormatOptions

Configure formatting with these options:

Option	Type	Default	Description
`MaxTokens`	int	4000	Maximum tokens for output
`PrettyPrint`	bool	true	Pretty print JSON
`IncludeMetadata`	bool	false	Include entity metadata
`EntityOrder`	[]string	nil	Explicit entity ordering
`SectionHeaders`	bool	true	Add section headers

Integration with Workflows

When using ConstructedContext with the workflow processor's publish_agent action:

{
  "name": "review",
  "action": {
    "type": "publish_agent",
    "role": "reviewer",
    "prompt": "Review the following code",
    "context": "${steps.build_context.output}"
  }
}

The context construction step produces a ConstructedContext that is embedded directly in the agent task. This enables:

Exact token budgets - Know context size before dispatch
Fresh context per task - No pollution from prior agent work
Source tracking - Trace which entities contributed to decisions

Design Philosophy

This package follows the principle that "what's relevant" is domain knowledge:

A code review system has different relevance criteria than a logistics system
Rather than embedding domain-specific heuristics, SemStreams provides utilities
The consumer (e.g., SemSpec) implements the relevance logic

Pattern:

Consumer:
1. Analyze task to determine relevant entities (domain logic)
2. Use pkg/context to query and format entities (building blocks)
3. Embed ConstructedContext in TaskMessage (integration)

SemStreams:
4. Agent loop receives pre-built context
5. No runtime discovery needed
6. Token budget is known precisely

Related Documentation

Agentic Systems - Overview of agentic loop
Workflow Configuration - Workflow processor
Context Construction - Concept guide

Documentation ¶

Overview ¶

Package context provides building blocks for context construction in agentic systems.

Overview ¶

This package implements the "embed context, don't make agents discover it" pattern. Consumers use these utilities to build ConstructedContext before dispatching agents, enabling precise token budget management and eliminating runtime context discovery.

The key insight is that SemStreams provides the HOW (building blocks), while consumers decide the WHAT (what's relevant for their domain). This separation allows domain-specific context construction while providing reusable utilities for token management, batch queries, and LLM-friendly formatting.

Core Types ¶

ConstructedContext wraps formatted context with token count and source tracking. It contains everything needed to embed context in an agent task:

Content: The formatted string ready for LLM consumption
TokenCount: Exact token count for budget management
Entities: Entity IDs included in the context
Sources: Provenance tracking (where context came from)
ConstructedAt: Timestamp for cache management

Source tracks where context originated. Source types include:

graph_entity: Context from a knowledge graph entity
graph_relationship: Context from graph relationships
document: Context from a document or chunk

Building Block Functions ¶

Token estimation functions help manage context budgets:

EstimateTokens: Estimates tokens using 4-char-per-token heuristic
EstimateTokensForModel: Model-specific token estimation
FitsInBudget: Checks if content fits within token budget
TruncateToBudget: Truncates at word boundaries to fit budget
BudgetAllocation: Tracks token budget allocation across sections

Batch graph query functions fetch entities efficiently:

BatchQueryEntities: Batch entity lookup with default options
BatchQueryEntitiesWithOptions: Configurable batch queries with relationships
ExpandWithNeighbors: Expands entity IDs to include N-hop neighbors
CollectEntityIDs: Extracts unique entity IDs from relationships

Context formatting functions prepare content for LLMs:

FormatEntitiesForContext: Formats entities for LLM consumption
FormatRelationshipsForContext: Formats relationships for LLM
FormatBatchResultForContext: Formats complete batch results
BuildContextFromBatch: Creates ConstructedContext from batch result

Example Usage ¶

Building context for an agent task:

// Query entities from the graph
result, err := context.BatchQueryEntitiesWithOptions(ctx, client, entityIDs,
    context.BatchQueryOptions{
        IncludeRelationships: true,
        Depth:                1,
    })
if err != nil {
    return err
}

// Build constructed context with token tracking
opts := context.FormatOptions{
    MaxTokens:      8000,
    PrettyPrint:    true,
    SectionHeaders: true,
}
constructed, err := context.BuildContextFromBatch(result, opts)
if err != nil {
    return err
}

// Embed in TaskMessage - token count is exact
task.Context = constructed

Token budget management:

budget := context.NewBudgetAllocation(10000)
budget.Allocate("system_prompt", 500)
budget.Allocate("entities", 4000)
remaining := budget.Remaining() // 5500 for other content

Integration with Workflows ¶

When using ConstructedContext with the workflow processor's publish_agent action, the context is embedded directly in the TaskMessage. This enables the pattern:

Consumer builds context using domain-specific logic
Exact token count known before agent dispatch
Agent loop receives pre-built context (no discovery needed)
Fresh context per task (no pollution from prior work)

Design Rationale ¶

This package follows the principle that "what's relevant" is domain knowledge. A code review system has different relevance criteria than a logistics system. Rather than embedding domain-specific heuristics, SemStreams provides utilities that any domain can use:

Token counting and budget management
Efficient batch graph queries
LLM-friendly formatting
Source tracking for provenance

The consumer (e.g., SemSpec) implements the relevance logic and uses these building blocks to construct the final context.

Index ¶

Constants
func CollectEntityIDs(relationships []Relationship) []string
func CountWords(s string) int
func EstimateTokens(s string) int
func EstimateTokensForModel(s string, model string) int
func ExpandWithNeighbors(ctx context.Context, client GraphClient, entityIDs []string, depth int) ([]string, error)
func FitsInBudget(content string, budget int) bool
func FormatBatchResultForContext(result *BatchQueryResult, opts FormatOptions) (string, int, error)
func FormatEntitiesForContext(entities map[string]json.RawMessage, opts FormatOptions) (string, int, error)
func FormatRelationshipsForContext(relationships []Relationship, opts FormatOptions) (string, int, error)
func TokensFromWords(wordCount int) int
func TruncateToBudget(content string, budget int) string
type BatchQueryOptions
type BatchQueryResult
- func BatchQueryEntities(ctx context.Context, client GraphClient, entityIDs []string) (*BatchQueryResult, error)
- func BatchQueryEntitiesWithOptions(ctx context.Context, client GraphClient, entityIDs []string, ...) (*BatchQueryResult, error)
type BudgetAllocation
- func NewBudgetAllocation(totalBudget int) *BudgetAllocation
- func (b *BudgetAllocation) Allocate(section string, requested int) int
- func (b *BudgetAllocation) AllocateProportionally(sections []string, weights []float64) map[string]int
- func (b *BudgetAllocation) Remaining() int
type ConstructedContext
- func BuildContextFromBatch(result *BatchQueryResult, opts FormatOptions) (*ConstructedContext, error)
- func NewConstructedContext(content string, entities []string, sources []Source) *ConstructedContext
type FormatOptions
- func DefaultFormatOptions() FormatOptions
type GraphClient
type Relationship
type Source
- func DocumentSource(docID string) Source
- func EntitySource(entityID string) Source
- func RelationshipSource(relationshipID string) Source

Constants ¶

const DefaultCharsPerToken = 4

DefaultCharsPerToken is the average characters per token for most LLMs. Claude uses roughly 4 characters per token for English text.

Variables ¶

This section is empty.

Functions ¶

func CollectEntityIDs ¶

func CollectEntityIDs(relationships []Relationship) []string

CollectEntityIDs extracts unique entity IDs from relationships
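
As a rough illustration of what extracting unique IDs involves, the sketch below deduplicates endpoint IDs while preserving first-seen order. The fields of types.Relationship are not shown on this page, so the relationship struct with From and To fields is a hypothetical stand-in, as is the ordering guarantee.

```go
package main

import "fmt"

// relationship is a hypothetical stand-in for types.Relationship;
// the From and To field names are assumptions.
type relationship struct {
	From string
	To   string
}

// collectEntityIDs returns each endpoint ID once, in first-seen order.
func collectEntityIDs(rels []relationship) []string {
	seen := make(map[string]struct{}, len(rels)*2)
	ids := make([]string, 0, len(rels)*2)
	add := func(id string) {
		if id == "" {
			return // skip empty endpoints
		}
		if _, ok := seen[id]; ok {
			return // already collected
		}
		seen[id] = struct{}{}
		ids = append(ids, id)
	}
	for _, r := range rels {
		add(r.From)
		add(r.To)
	}
	return ids
}

func main() {
	rels := []relationship{
		{From: "svc-a", To: "svc-b"},
		{From: "svc-b", To: "svc-c"},
	}
	fmt.Println(collectEntityIDs(rels))
}
```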

func CountWords ¶

func CountWords(s string) int

CountWords counts words in a string (useful for rough estimates)

func EstimateTokens ¶

func EstimateTokens(s string) int

EstimateTokens estimates token count for a string. Uses a heuristic of ~4 characters per token, which is a reasonable approximation for English text with Claude models.
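
To make the heuristic concrete, here is a minimal sketch of a 4-characters-per-token estimator together with a budget check built on it. The lower-case function names are hypothetical stand-ins, and both the use of byte length and the choice to round up are assumptions; the package may count or round differently.

```go
package main

import "fmt"

// charsPerToken mirrors the documented DefaultCharsPerToken constant.
const charsPerToken = 4

// estimateTokens is a hypothetical stand-in for EstimateTokens.
// It divides the byte length by charsPerToken and rounds up.
func estimateTokens(s string) int {
	if s == "" {
		return 0
	}
	return (len(s) + charsPerToken - 1) / charsPerToken
}

// fitsInBudget reports whether the estimate is within the budget.
func fitsInBudget(content string, budget int) bool {
	return estimateTokens(content) <= budget
}

func main() {
	prompt := "Review the following code"
	fmt.Println(estimateTokens(prompt), fitsInBudget(prompt, 8))
}
```

Because this is an estimate rather than a tokenizer, leaving some headroom in the budget is a sensible precaution.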

func EstimateTokensForModel ¶

func EstimateTokensForModel(s string, model string) int

EstimateTokensForModel estimates tokens for a specific model. Currently all models use the same estimate, but this allows for model-specific adjustments in the future.

func ExpandWithNeighbors ¶

func ExpandWithNeighbors(ctx context.Context, client GraphClient, entityIDs []string, depth int) ([]string, error)

ExpandWithNeighbors expands entity IDs to include their neighbors, up to depth hops away.
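
N-hop expansion is naturally expressed as a breadth-first traversal with a hop limit. The sketch below assumes a hypothetical neighbors callback in place of a GraphClient and is not the package's implementation; it only shows the shape of the technique.

```go
package main

import "fmt"

// expandWithNeighbors is a hypothetical stand-in: it returns the seed IDs
// plus every ID reachable within depth hops, each once, in discovery order.
func expandWithNeighbors(seeds []string, depth int, neighbors func(id string) []string) []string {
	seen := make(map[string]bool, len(seeds))
	var out, frontier []string
	for _, id := range seeds {
		if !seen[id] {
			seen[id] = true
			out = append(out, id)
			frontier = append(frontier, id)
		}
	}
	// Each loop iteration advances the frontier by one hop.
	for hop := 0; hop < depth && len(frontier) > 0; hop++ {
		var next []string
		for _, id := range frontier {
			for _, n := range neighbors(id) {
				if !seen[n] {
					seen[n] = true
					out = append(out, n)
					next = append(next, n)
				}
			}
		}
		frontier = next
	}
	return out
}

func main() {
	adj := map[string][]string{"a": {"b"}, "b": {"c"}, "c": {"d"}}
	lookup := func(id string) []string { return adj[id] }
	fmt.Println(expandWithNeighbors([]string{"a"}, 2, lookup))
}
```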

func FitsInBudget ¶

func FitsInBudget(content string, budget int) bool

FitsInBudget checks if content fits within a token budget.

func FormatBatchResultForContext ¶

func FormatBatchResultForContext(result *BatchQueryResult, opts FormatOptions) (string, int, error)

FormatBatchResultForContext formats a BatchQueryResult for LLM context.

func FormatEntitiesForContext ¶

func FormatEntitiesForContext(entities map[string]json.RawMessage, opts FormatOptions) (string, int, error)

FormatEntitiesForContext formats entity data for LLM context. Returns the formatted string, token count, and any error.
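
The sketch below shows one plausible shape for a formatter with this signature: honor an explicit entity order, optionally emit a header per entity, optionally indent the JSON, and return the text with an estimated token count. Everything here is an assumption for illustration (the formatOptions subset, the "## id" header style, the rawEntities helper). It also sorts any unlisted IDs for determinism, whereas the documented behavior is to fall back to map order.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// formatOptions is a hypothetical subset of FormatOptions.
type formatOptions struct {
	PrettyPrint    bool
	SectionHeaders bool
	EntityOrder    []string
}

// rawEntities converts JSON strings into the map shape used by the package.
func rawEntities(m map[string]string) map[string]json.RawMessage {
	out := make(map[string]json.RawMessage, len(m))
	for id, s := range m {
		out[id] = json.RawMessage(s)
	}
	return out
}

// formatEntities returns the formatted text and an estimated token count.
func formatEntities(entities map[string]json.RawMessage, opts formatOptions) (string, int, error) {
	// Explicitly ordered IDs come first; the rest are sorted (an assumption).
	seen := make(map[string]bool, len(entities))
	ordered := make([]string, 0, len(entities))
	for _, id := range opts.EntityOrder {
		if _, ok := entities[id]; ok && !seen[id] {
			seen[id] = true
			ordered = append(ordered, id)
		}
	}
	rest := make([]string, 0, len(entities))
	for id := range entities {
		if !seen[id] {
			rest = append(rest, id)
		}
	}
	sort.Strings(rest)
	ordered = append(ordered, rest...)

	var sb strings.Builder
	for _, id := range ordered {
		if opts.SectionHeaders {
			fmt.Fprintf(&sb, "## %s\n", id)
		}
		if opts.PrettyPrint {
			var buf bytes.Buffer
			if err := json.Indent(&buf, entities[id], "", "  "); err != nil {
				return "", 0, fmt.Errorf("entity %s: %w", id, err)
			}
			sb.Write(buf.Bytes())
		} else {
			sb.Write(entities[id])
		}
		sb.WriteByte('\n')
	}
	out := sb.String()
	return out, (len(out) + 3) / 4, nil // ~4 chars per token, rounded up
}

func main() {
	ents := rawEntities(map[string]string{"svc-a": `{"kind":"service"}`})
	text, tokens, err := formatEntities(ents, formatOptions{PrettyPrint: true, SectionHeaders: true})
	fmt.Println(text, tokens, err)
}
```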

func FormatRelationshipsForContext ¶

func FormatRelationshipsForContext(relationships []Relationship, opts FormatOptions) (string, int, error)

FormatRelationshipsForContext formats relationships for LLM context.

func TokensFromWords ¶

func TokensFromWords(wordCount int) int

TokensFromWords estimates tokens from word count. Roughly 1.3 tokens per word for English.
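
A word-based estimate at this ratio can be sketched with integer arithmetic, which avoids floating-point rounding surprises. The function names are hypothetical stand-ins, and both the whitespace-based word split and rounding up are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// countWords splits on any run of whitespace.
func countWords(s string) int {
	return len(strings.Fields(s))
}

// tokensFromWords applies the documented ~1.3 tokens-per-word ratio,
// computed as ceil(wordCount * 13 / 10) in integer arithmetic.
func tokensFromWords(wordCount int) int {
	if wordCount <= 0 {
		return 0
	}
	return (wordCount*13 + 9) / 10
}

func main() {
	text := "embed context, don't make agents discover it"
	words := countWords(text)
	fmt.Println(words, tokensFromWords(words))
}
```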

func TruncateToBudget ¶

func TruncateToBudget(content string, budget int) string

TruncateToBudget truncates content to fit within a token budget. Attempts to truncate at word boundaries.
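
One way to truncate at word boundaries is to cut at the character limit implied by the budget and then back up to the last whitespace if the cut landed inside a word. The following is a sketch under that assumption, with a hypothetical function name; how the package treats trailing whitespace or text without spaces is not documented here.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

const charsPerToken = 4 // mirrors DefaultCharsPerToken

func isSpace(b byte) bool { return b == ' ' || b == '\t' || b == '\n' }

// truncateToBudget is a hypothetical stand-in for TruncateToBudget.
func truncateToBudget(content string, budget int) string {
	if budget <= 0 {
		return ""
	}
	limit := budget * charsPerToken
	if len(content) <= limit {
		return content // already fits
	}
	// Never split a multi-byte UTF-8 character.
	for limit > 0 && !utf8.RuneStart(content[limit]) {
		limit--
	}
	cut := content[:limit]
	// If the next byte is not whitespace, the cut is mid-word:
	// back up to the last whitespace when there is one.
	if !isSpace(content[limit]) {
		if i := strings.LastIndexAny(cut, " \t\n"); i > 0 {
			cut = cut[:i]
		}
	}
	return strings.TrimRight(cut, " \t\n")
}

func main() {
	fmt.Printf("%q\n", truncateToBudget("alpha beta gamma delta", 3))
}
```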

Types ¶

type BatchQueryOptions ¶

type BatchQueryOptions struct {
	IncludeRelationships bool
	Depth                int
	MaxConcurrent        int // Max concurrent queries (default: 10)
}

BatchQueryOptions configures batch query behavior

type BatchQueryResult ¶

type BatchQueryResult struct {
	Entities      map[string]json.RawMessage
	Relationships []Relationship
	NotFound      []string
	Errors        map[string]error
}

BatchQueryResult contains results from a batch query

func BatchQueryEntities ¶

func BatchQueryEntities(ctx context.Context, client GraphClient, entityIDs []string) (*BatchQueryResult, error)

BatchQueryEntities performs batch entity lookups efficiently. Returns all found entities and tracks which were not found.

func BatchQueryEntitiesWithOptions ¶

func BatchQueryEntitiesWithOptions(ctx context.Context, client GraphClient, entityIDs []string, opts BatchQueryOptions) (*BatchQueryResult, error)

BatchQueryEntitiesWithOptions performs batch entity lookups with options.

type BudgetAllocation ¶

type BudgetAllocation struct {
	TotalBudget int
	Allocated   int
	Sections    map[string]int
}

BudgetAllocation helps allocate token budget across multiple content sections.

func NewBudgetAllocation ¶

func NewBudgetAllocation(totalBudget int) *BudgetAllocation

NewBudgetAllocation creates a new budget allocation tracker.

func (*BudgetAllocation) Allocate ¶

func (b *BudgetAllocation) Allocate(section string, requested int) int

Allocate allocates budget for a section. Returns the actual allocation (may be less than requested if budget is exhausted).
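
The clamping behavior described above can be sketched as follows. The struct fields match the documented BudgetAllocation type, but the lower-case type is a local stand-in and the handling of negative or repeated requests is an assumption.

```go
package main

import "fmt"

// budgetAllocation is a local stand-in with the documented fields.
type budgetAllocation struct {
	TotalBudget int
	Allocated   int
	Sections    map[string]int
}

func newBudgetAllocation(total int) *budgetAllocation {
	return &budgetAllocation{TotalBudget: total, Sections: make(map[string]int)}
}

// Remaining returns the budget not yet allocated.
func (b *budgetAllocation) Remaining() int {
	return b.TotalBudget - b.Allocated
}

// Allocate grants min(requested, remaining) and records it for the section.
func (b *budgetAllocation) Allocate(section string, requested int) int {
	granted := requested
	if rem := b.Remaining(); granted > rem {
		granted = rem // clamp to what is left
	}
	if granted < 0 {
		granted = 0 // assumption: negative requests grant nothing
	}
	b.Allocated += granted
	b.Sections[section] += granted
	return granted
}

func main() {
	b := newBudgetAllocation(1000)
	fmt.Println(b.Allocate("entities", 600))      // full request granted
	fmt.Println(b.Allocate("relationships", 600)) // clamped to what remains
	fmt.Println(b.Remaining())
}
```

Checking the returned value lets a caller decide whether to truncate a section or drop it when the grant is smaller than requested.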

func (*BudgetAllocation) AllocateProportionally ¶

func (b *BudgetAllocation) AllocateProportionally(sections []string, weights []float64) map[string]int

AllocateProportionally allocates remaining budget proportionally across sections.
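
A proportional split can be sketched by normalizing the weights and giving each section its share of the remaining budget. This is an illustration only: the free function below takes the remaining budget directly instead of operating on a BudgetAllocation, and the normalization, flooring, and handling of mismatched inputs are all assumptions.

```go
package main

import "fmt"

// allocateProportionally is a hypothetical stand-in that splits remaining
// across sections according to weights normalized by their sum.
func allocateProportionally(remaining int, sections []string, weights []float64) map[string]int {
	out := make(map[string]int, len(sections))
	if remaining <= 0 || len(sections) != len(weights) {
		return out // nothing to split, or inputs do not line up
	}
	var sum float64
	for _, w := range weights {
		if w > 0 {
			sum += w
		}
	}
	if sum == 0 {
		return out
	}
	for i, name := range sections {
		if weights[i] <= 0 {
			out[name] = 0
			continue
		}
		// Floor each share so the total never exceeds remaining.
		out[name] = int(float64(remaining) * weights[i] / sum)
	}
	return out
}

func main() {
	shares := allocateProportionally(7500,
		[]string{"entities", "relationships", "history"},
		[]float64{2, 1, 1},
	)
	fmt.Println(shares)
}
```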

func (*BudgetAllocation) Remaining ¶

func (b *BudgetAllocation) Remaining() int

Remaining returns the remaining budget.

type ConstructedContext ¶

type ConstructedContext = types.ConstructedContext

ConstructedContext is an alias for types.ConstructedContext. The canonical type is defined in pkg/types/context.go.

func BuildContextFromBatch ¶

func BuildContextFromBatch(result *BatchQueryResult, opts FormatOptions) (*ConstructedContext, error)

BuildContextFromBatch creates a ConstructedContext from a BatchQueryResult.

func NewConstructedContext ¶

func NewConstructedContext(content string, entities []string, sources []Source) *ConstructedContext

NewConstructedContext creates a new ConstructedContext from parts.

type FormatOptions ¶

type FormatOptions struct {
	MaxTokens       int      // Max tokens for output
	PrettyPrint     bool     // Pretty print JSON
	IncludeMetadata bool     // Include entity metadata
	EntityOrder     []string // Explicit order for entities (if empty, uses map order)
	SectionHeaders  bool     // Add section headers
}

FormatOptions configures context formatting

func DefaultFormatOptions ¶

func DefaultFormatOptions() FormatOptions

DefaultFormatOptions returns sensible defaults for formatting

type GraphClient ¶

type GraphClient interface {
	// QueryEntities fetches multiple entities by their IDs
	QueryEntities(ctx context.Context, entityIDs []string) (map[string]json.RawMessage, error)

	// QueryRelationships fetches relationships for an entity with depth
	QueryRelationships(ctx context.Context, entityID string, depth int) ([]Relationship, error)
}

GraphClient defines the interface for batch graph query operations. This mirrors the interface in workflow actions but is defined here for use by context construction utilities.
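
Because GraphClient is a small interface, an in-memory fake is enough to exercise context construction in tests. The sketch below copies the interface locally so it is self-contained. The Relationship struct fields, the memGraph type, and the partition helper are hypothetical, and the fake only returns direct edges regardless of depth.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
)

// Relationship is a hypothetical stand-in for types.Relationship.
type Relationship struct{ From, To, Type string }

// GraphClient is a local copy of the documented interface.
type GraphClient interface {
	QueryEntities(ctx context.Context, entityIDs []string) (map[string]json.RawMessage, error)
	QueryRelationships(ctx context.Context, entityID string, depth int) ([]Relationship, error)
}

// memGraph is an in-memory fake for tests.
type memGraph struct {
	entities map[string]json.RawMessage
	edges    map[string][]Relationship // outgoing edges keyed by source ID
}

func newMemGraph() *memGraph {
	return &memGraph{
		entities: map[string]json.RawMessage{
			"svc-a": json.RawMessage(`{"kind":"service"}`),
			"svc-b": json.RawMessage(`{"kind":"service"}`),
		},
		edges: map[string][]Relationship{
			"svc-a": {{From: "svc-a", To: "svc-b", Type: "calls"}},
		},
	}
}

func (g *memGraph) QueryEntities(ctx context.Context, entityIDs []string) (map[string]json.RawMessage, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	out := make(map[string]json.RawMessage, len(entityIDs))
	for _, id := range entityIDs {
		if raw, ok := g.entities[id]; ok {
			out[id] = raw // unknown IDs are simply absent from the result
		}
	}
	return out, nil
}

func (g *memGraph) QueryRelationships(ctx context.Context, entityID string, depth int) ([]Relationship, error) {
	if err := ctx.Err(); err != nil {
		return nil, err
	}
	if depth < 1 {
		return nil, nil
	}
	return g.edges[entityID], nil // simplification: direct edges only
}

// partition splits requested IDs into found entities and missing IDs,
// similar in spirit to the Entities and NotFound fields of BatchQueryResult.
func partition(client GraphClient, ids []string) (map[string]json.RawMessage, []string, error) {
	found, err := client.QueryEntities(context.Background(), ids)
	if err != nil {
		return nil, nil, err
	}
	var notFound []string
	for _, id := range ids {
		if _, ok := found[id]; !ok {
			notFound = append(notFound, id)
		}
	}
	return found, notFound, nil
}

func main() {
	found, missing, err := partition(newMemGraph(), []string{"svc-a", "svc-z"})
	fmt.Println(len(found), missing, err)
}
```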

type Relationship ¶

type Relationship = types.Relationship

Relationship is an alias for the shared type

type Source ¶

type Source = types.ContextSource

Source is an alias for types.ContextSource, tracking where context came from. The canonical type is defined in pkg/types/context.go.

func DocumentSource ¶

func DocumentSource(docID string) Source

DocumentSource creates a Source for a document

func EntitySource ¶

func EntitySource(entityID string) Source

EntitySource creates a Source for a graph entity

func RelationshipSource ¶

func RelationshipSource(relationshipID string) Source

RelationshipSource creates a Source for a graph relationship
