Documentation
Overview ¶
Package stats provides workspace metrics and statistics tracking.
tmux_sampler.go samples CPU and memory for tmux-backed agents.
Docker-backed agents are sampled via `docker stats`. Tmux-backed agents live as plain processes under a tmux session, so we walk the PID tree from the tmux pane PID down through its descendants (shell → claude → any subprocesses) and sum %CPU + RSS.
Network bytes are not populated here: per-process network accounting requires either DTrace (darwin) or cgroup access (linux container namespace only). The collector leaves those fields zero and the UI renders "Network tracking requires container runtime".
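The PID-tree walk described above can be sketched as follows. This is a minimal illustration, not the package's implementation: `procStat`, `walkTree`, and `sumStats` are hypothetical names, and the child and stat tables are hard-coded maps standing in for what `pgrep` and `ps` would report.

```go
package main

import "fmt"

// procStat is a hypothetical per-process reading: %CPU and resident set size.
type procStat struct {
	CPUPercent float64
	RSSBytes   int64
}

// walkTree returns root plus all of its descendants, breadth-first.
// children maps a PID to its direct child PIDs.
func walkTree(root int, children map[int][]int) []int {
	seen := map[int]bool{root: true}
	queue := []int{root}
	var out []int
	for len(queue) > 0 {
		pid := queue[0]
		queue = queue[1:]
		out = append(out, pid)
		for _, c := range children[pid] {
			if !seen[c] { // never visit a PID twice, even if the table is inconsistent
				seen[c] = true
				queue = append(queue, c)
			}
		}
	}
	return out
}

// sumStats adds up %CPU and RSS, skipping PIDs that have no reading
// (for example, a process that exited during the walk).
func sumStats(pids []int, stats map[int]procStat) (float64, int64) {
	var cpu float64
	var rss int64
	for _, pid := range pids {
		if st, ok := stats[pid]; ok {
			cpu += st.CPUPercent
			rss += st.RSSBytes
		}
	}
	return cpu, rss
}

func main() {
	// pane shell (100) -> claude (101) -> two subprocesses (102, 103)
	children := map[int][]int{100: {101}, 101: {102, 103}}
	stats := map[int]procStat{
		100: {0.5, 4 << 20},
		101: {12.0, 300 << 20},
		102: {1.5, 20 << 20},
		// 103 exited before it could be measured
	}
	pids := walkTree(100, children)
	cpu, rss := sumStats(pids, stats)
	fmt.Printf("pids=%d cpu=%.1f rss=%d\n", len(pids), cpu, rss)
}
```

Summing over the whole tree means a short-lived subprocess contributes only to the samples during which it is alive, which is usually what a per-agent resource chart wants.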
Index ¶
- Constants
- func StatsDSN() string
- type AgentFilter
- type AgentMetric
- type AgentMetrics
- type AgentStat
- type AgentSummary
- type CPUSummary
- type ChannelFilter
- type ChannelMetric
- type CostSummary
- type DefaultTmuxProcRunner
- func (DefaultTmuxProcRunner) Children(ctx context.Context, pid int) ([]int, error)
- func (DefaultTmuxProcRunner) ListSessions(ctx context.Context) ([]string, error)
- func (DefaultTmuxProcRunner) PSStats(ctx context.Context, pids []int) (float64, int64, error)
- func (DefaultTmuxProcRunner) PanePIDs(ctx context.Context, session string) ([]int, error)
- type DiskSummary
- type MemorySummary
- type ModelCostBreakdown
- type NetSummary
- type Stats
- type Store
- func (s *Store) Close() error
- func (s *Store) DB() *sql.DB
- func (s *Store) QueryAgentCPU(ctx context.Context, f AgentFilter, tr TimeRange) ([]AgentMetric, error)
- func (s *Store) QueryAgentCost(ctx context.Context, f AgentFilter, tr TimeRange) ([]TokenMetric, error)
- func (s *Store) QueryAgentDisk(ctx context.Context, f AgentFilter, tr TimeRange) ([]AgentMetric, error)
- func (s *Store) QueryAgentMem(ctx context.Context, f AgentFilter, tr TimeRange) ([]AgentMetric, error)
- func (s *Store) QueryAgentNet(ctx context.Context, f AgentFilter, tr TimeRange) ([]AgentMetric, error)
- func (s *Store) QueryAgentSummary(ctx context.Context, agentName string, tr TimeRange) (*AgentSummary, error)
- func (s *Store) QueryAgentTokens(ctx context.Context, f AgentFilter, tr TimeRange) ([]TokenMetric, error)
- func (s *Store) QueryChannelMembers(ctx context.Context, f ChannelFilter, tr TimeRange) ([]ChannelMetric, error)
- func (s *Store) QueryChannelMessages(ctx context.Context, f ChannelFilter, tr TimeRange) ([]ChannelMetric, error)
- func (s *Store) QueryChannelReactions(ctx context.Context, f ChannelFilter, tr TimeRange) ([]ChannelMetric, error)
- func (s *Store) QueryLatestAgentMetrics(ctx context.Context) ([]AgentMetric, error)
- func (s *Store) QuerySystemCPU(ctx context.Context, systems []string, tr TimeRange) ([]SystemMetric, error)
- func (s *Store) QuerySystemDisk(ctx context.Context, systems []string, tr TimeRange) ([]SystemMetric, error)
- func (s *Store) QuerySystemMem(ctx context.Context, systems []string, tr TimeRange) ([]SystemMetric, error)
- func (s *Store) QuerySystemNet(ctx context.Context, systems []string, tr TimeRange) ([]SystemMetric, error)
- func (s *Store) RecordAgent(ctx context.Context, m AgentMetric) error
- func (s *Store) RecordChannel(ctx context.Context, m ChannelMetric) error
- func (s *Store) RecordSystem(ctx context.Context, m SystemMetric) error
- func (s *Store) RecordToken(ctx context.Context, m TokenMetric) error
- type SystemMetric
- type TimeRange
- type TmuxProcRunner
- type TmuxSample
- type TmuxSampler
- type TokenMetric
- type TokenSummary
Constants ¶
const DefaultStatsDSN = "postgres://bc:bc@localhost:5432/bc" //nolint:gosec // not a credential, it's a default DSN
DefaultStatsDSN is the connection string for the unified bc-db TimescaleDB container.
Functions ¶
Types ¶
type AgentFilter ¶
AgentFilter specifies which agents to query.
type AgentMetric ¶
type AgentMetric struct {
Time time.Time `json:"time"`
AgentName string `json:"agent_name"`
Role string `json:"role"`
Tool string `json:"tool"`
Runtime string `json:"runtime"`
State string `json:"state"`
CPUPercent float64 `json:"cpu_percent"`
MemUsedBytes int64 `json:"mem_used_bytes"`
MemLimitBytes int64 `json:"mem_limit_bytes"`
MemPercent float64 `json:"mem_percent"`
NetRxBytes int64 `json:"net_rx_bytes"`
NetTxBytes int64 `json:"net_tx_bytes"`
DiskReadBytes int64 `json:"disk_read_bytes"`
DiskWriteBytes int64 `json:"disk_write_bytes"`
}
AgentMetric represents an agent container resource sample.
type AgentMetrics ¶
type AgentMetrics struct {
// Per-agent stats
AgentStats []AgentStat `json:"agent_stats"`
// Counts
TotalAgents int `json:"total_agents"`
ActiveAgents int `json:"active_agents"`
Coordinators int `json:"coordinators"`
Workers int `json:"workers"`
// By state
Idle int `json:"idle"`
Working int `json:"working"`
Done int `json:"done"`
Stuck int `json:"stuck"`
Error int `json:"error"`
Stopped int `json:"stopped"`
}
AgentMetrics tracks agent statistics.
type AgentStat ¶
type AgentStat struct {
Name string `json:"name"`
Role string `json:"role"`
State string `json:"state"`
Uptime time.Duration `json:"uptime"`
}
AgentStat holds stats for a single agent.
type AgentSummary ¶
type AgentSummary struct {
// Token and cost totals (over period)
Models []ModelCostBreakdown `json:"models,omitempty"`
AgentName string `json:"agent_name"`
Role string `json:"role"`
Tool string `json:"tool"`
Runtime string `json:"runtime"`
State string `json:"state"`
// Resource metrics (latest sample)
CPU CPUSummary `json:"cpu"`
Memory MemorySummary `json:"memory"`
Disk DiskSummary `json:"disk"`
Net NetSummary `json:"network"`
Tokens TokenSummary `json:"tokens"`
Cost CostSummary `json:"cost"`
}
AgentSummary combines resource metrics and token/cost totals for a single agent.
type CPUSummary ¶
type CPUSummary struct {
AvgPercent float64 `json:"avg_percent"`
MaxPercent float64 `json:"max_percent"`
}
CPUSummary holds aggregated CPU metrics.
type ChannelFilter ¶
type ChannelFilter struct {
Channel []string
}
ChannelFilter specifies which channels to query.
type ChannelMetric ¶
type ChannelMetric struct {
Time time.Time `json:"time"`
ChannelName string `json:"channel_name"`
MessageCount int64 `json:"message_count"`
MemberCount int `json:"member_count"`
ReactionCount int64 `json:"reaction_count"`
}
ChannelMetric represents channel activity at a point in time.
type CostSummary ¶
type CostSummary struct {
TotalUSD float64 `json:"total_usd"`
}
CostSummary holds aggregated cost data.
type DefaultTmuxProcRunner ¶ added in v0.2.0
type DefaultTmuxProcRunner struct{}
DefaultTmuxProcRunner is the production runner that shells out to `tmux`, `pgrep`, and `ps`. Works on darwin + linux.
func (DefaultTmuxProcRunner) Children ¶ added in v0.2.0
Children shells out to `pgrep -P <pid>`, which behaves the same on darwin and linux. On linux we could read /proc/<pid>/task/<tid>/children for fewer forks, but pgrep is consistent and the 30s sample interval makes the cost irrelevant.
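A rough sketch of what a `pgrep -P` based child lookup might look like is shown below. The helper names `parsePIDs` and `childPIDs` are hypothetical and the package's real code may differ. One detail to keep in mind is that `pgrep` exits with status 1 when no process matches, so this sketch treats that case as "no children" rather than as an error.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"os/exec"
	"strconv"
	"strings"
)

// parsePIDs converts whitespace-separated pgrep output into PIDs,
// ignoring anything that is not a positive integer.
func parsePIDs(out string) []int {
	var pids []int
	for _, field := range strings.Fields(out) {
		if pid, err := strconv.Atoi(field); err == nil && pid > 0 {
			pids = append(pids, pid)
		}
	}
	return pids
}

// childPIDs is a hypothetical stand-in for Children.
// Exit status 1 from pgrep means "nothing matched" and is not an error here.
func childPIDs(ctx context.Context, pid int) ([]int, error) {
	out, err := exec.CommandContext(ctx, "pgrep", "-P", strconv.Itoa(pid)).Output()
	if err != nil {
		var exitErr *exec.ExitError
		if errors.As(err, &exitErr) && exitErr.ExitCode() == 1 {
			return nil, nil
		}
		return nil, err
	}
	return parsePIDs(string(out)), nil
}

func main() {
	fmt.Println(parsePIDs("4211\n4212\n\n"))
	// childPIDs needs pgrep on PATH; any error is printed rather than hidden.
	kids, err := childPIDs(context.Background(), 1)
	fmt.Println(kids, err)
}
```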
func (DefaultTmuxProcRunner) ListSessions ¶ added in v0.2.0
func (DefaultTmuxProcRunner) ListSessions(ctx context.Context) ([]string, error)
ListSessions shells out to `tmux list-sessions -F '#{session_name}'`.
type DiskSummary ¶
type DiskSummary struct {
ReadBytes int64 `json:"read_bytes"`
WriteBytes int64 `json:"write_bytes"`
}
DiskSummary holds aggregated disk I/O metrics.
type MemorySummary ¶
type MemorySummary struct {
AvgBytes int64 `json:"avg_bytes"`
MaxBytes int64 `json:"max_bytes"`
AvgPercent float64 `json:"avg_percent"`
}
MemorySummary holds aggregated memory metrics.
type ModelCostBreakdown ¶
type ModelCostBreakdown struct {
Model string `json:"model"`
CostUSD float64 `json:"cost_usd"`
InputTokens int64 `json:"input_tokens"`
OutputTokens int64 `json:"output_tokens"`
}
ModelCostBreakdown shows cost per model.
type NetSummary ¶
NetSummary holds aggregated network metrics.
type Stats ¶
type Stats struct {
CollectedAt time.Time `json:"collected_at"`
WorkspacePath string `json:"workspace_path"`
Agents AgentMetrics `json:"agents"`
// contains filtered or unexported fields
}
Stats holds all workspace statistics.
func (*Stats) Utilization ¶
Utilization returns current agent utilization (working/active).
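As a worked illustration of the working/active ratio, the sketch below computes it from two counts. The function name `utilization`, the choice of a fraction rather than a percentage, and the zero result when no agents are active are all assumptions of this sketch, not documented behavior.

```go
package main

import "fmt"

// utilization returns working/active as a fraction.
// Returning 0 when there are no active agents is an assumption of this
// sketch; it avoids a division by zero.
func utilization(working, active int) float64 {
	if active <= 0 {
		return 0
	}
	return float64(working) / float64(active)
}

func main() {
	fmt.Println(utilization(3, 4))
	fmt.Println(utilization(0, 0))
}
```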
type Store ¶
type Store struct {
// contains filtered or unexported fields
}
Store provides time-series metrics storage backed by TimescaleDB.
func (*Store) QueryAgentCPU ¶
func (s *Store) QueryAgentCPU(ctx context.Context, f AgentFilter, tr TimeRange) ([]AgentMetric, error)
QueryAgentCPU returns CPU metrics for agents.
func (*Store) QueryAgentCost ¶
func (s *Store) QueryAgentCost(ctx context.Context, f AgentFilter, tr TimeRange) ([]TokenMetric, error)
QueryAgentCost returns cost metrics for agents.
func (*Store) QueryAgentDisk ¶
func (s *Store) QueryAgentDisk(ctx context.Context, f AgentFilter, tr TimeRange) ([]AgentMetric, error)
QueryAgentDisk returns disk metrics for agents.
func (*Store) QueryAgentMem ¶
func (s *Store) QueryAgentMem(ctx context.Context, f AgentFilter, tr TimeRange) ([]AgentMetric, error)
QueryAgentMem returns memory metrics for agents.
func (*Store) QueryAgentNet ¶
func (s *Store) QueryAgentNet(ctx context.Context, f AgentFilter, tr TimeRange) ([]AgentMetric, error)
QueryAgentNet returns network metrics for agents.
func (*Store) QueryAgentSummary ¶
func (s *Store) QueryAgentSummary(ctx context.Context, agentName string, tr TimeRange) (*AgentSummary, error)
QueryAgentSummary returns a combined resource + token + cost summary for a single agent. It runs two queries: one for resource metrics (avg/max over period) and one for token totals.
func (*Store) QueryAgentTokens ¶
func (s *Store) QueryAgentTokens(ctx context.Context, f AgentFilter, tr TimeRange) ([]TokenMetric, error)
QueryAgentTokens returns token usage metrics for agents.
func (*Store) QueryChannelMembers ¶
func (s *Store) QueryChannelMembers(ctx context.Context, f ChannelFilter, tr TimeRange) ([]ChannelMetric, error)
QueryChannelMembers returns member count metrics.
func (*Store) QueryChannelMessages ¶
func (s *Store) QueryChannelMessages(ctx context.Context, f ChannelFilter, tr TimeRange) ([]ChannelMetric, error)
QueryChannelMessages returns message count metrics.
func (*Store) QueryChannelReactions ¶
func (s *Store) QueryChannelReactions(ctx context.Context, f ChannelFilter, tr TimeRange) ([]ChannelMetric, error)
QueryChannelReactions returns reaction count metrics.
func (*Store) QueryLatestAgentMetrics ¶
func (s *Store) QueryLatestAgentMetrics(ctx context.Context) ([]AgentMetric, error)
QueryLatestAgentMetrics returns the most recent metric sample for each agent. Used by the agents list table to show current CPU/Mem without N+1 queries.
func (*Store) QuerySystemCPU ¶
func (s *Store) QuerySystemCPU(ctx context.Context, systems []string, tr TimeRange) ([]SystemMetric, error)
QuerySystemCPU returns CPU metrics for system containers.
func (*Store) QuerySystemDisk ¶
func (s *Store) QuerySystemDisk(ctx context.Context, systems []string, tr TimeRange) ([]SystemMetric, error)
QuerySystemDisk returns disk metrics for system containers.
func (*Store) QuerySystemMem ¶
func (s *Store) QuerySystemMem(ctx context.Context, systems []string, tr TimeRange) ([]SystemMetric, error)
QuerySystemMem returns memory metrics for system containers.
func (*Store) QuerySystemNet ¶
func (s *Store) QuerySystemNet(ctx context.Context, systems []string, tr TimeRange) ([]SystemMetric, error)
QuerySystemNet returns network metrics for system containers.
func (*Store) RecordAgent ¶
func (s *Store) RecordAgent(ctx context.Context, m AgentMetric) error
RecordAgent inserts an agent container metric sample.
func (*Store) RecordChannel ¶
func (s *Store) RecordChannel(ctx context.Context, m ChannelMetric) error
RecordChannel inserts a channel activity sample.
func (*Store) RecordSystem ¶
func (s *Store) RecordSystem(ctx context.Context, m SystemMetric) error
RecordSystem inserts a system container metric sample.
func (*Store) RecordToken ¶
func (s *Store) RecordToken(ctx context.Context, m TokenMetric) error
RecordToken inserts a token usage sample. Duplicate entries (same time, agent, model) are silently skipped via ON CONFLICT, making this idempotent across bcd restarts.
type SystemMetric ¶
type SystemMetric struct {
Time time.Time `json:"time"`
SystemName string `json:"system_name"`
CPUPercent float64 `json:"cpu_percent"`
MemUsedBytes int64 `json:"mem_used_bytes"`
MemLimitBytes int64 `json:"mem_limit_bytes"`
MemPercent float64 `json:"mem_percent"`
NetRxBytes int64 `json:"net_rx_bytes"`
NetTxBytes int64 `json:"net_tx_bytes"`
DiskReadBytes int64 `json:"disk_read_bytes"`
DiskWriteBytes int64 `json:"disk_write_bytes"`
}
SystemMetric represents a system container resource sample.
type TimeRange ¶
type TimeRange struct {
From time.Time
To time.Time
Interval string // e.g. "5m", "1h" — converted to Postgres interval via PGInterval()
}
TimeRange specifies a query window and aggregation interval.
func (TimeRange) PGInterval ¶
PGInterval converts short notation to Postgres interval format. Uses an allowlist to prevent SQL injection via the interval query parameter.
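An allowlist conversion of this kind might look like the sketch below. The specific entries in `allowedIntervals` and the fallback value are assumptions made for illustration; the package's actual list and fallback behavior are not shown in this documentation.

```go
package main

import "fmt"

// allowedIntervals is a hypothetical allowlist mapping short notation
// to Postgres interval literals. Only these exact keys are accepted.
var allowedIntervals = map[string]string{
	"1m":  "1 minute",
	"5m":  "5 minutes",
	"15m": "15 minutes",
	"1h":  "1 hour",
	"1d":  "1 day",
}

// fallbackInterval is used for any input not on the allowlist (an assumption).
const fallbackInterval = "5 minutes"

// pgInterval never passes caller-supplied text through to SQL: the result
// is always one of the fixed strings above.
func pgInterval(short string) string {
	if v, ok := allowedIntervals[short]; ok {
		return v
	}
	return fallbackInterval
}

func main() {
	fmt.Println(pgInterval("1h"))
	fmt.Println(pgInterval("1h'; DROP TABLE metrics; --"))
}
```

The point of the lookup is that the returned string is safe to interpolate into a query even where a bind parameter cannot be used, because it can only ever be one of a small set of constants.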
type TmuxProcRunner ¶ added in v0.2.0
type TmuxProcRunner interface {
// PanePIDs returns the pane PIDs for the given tmux session. The
// session name is resolved by the caller (e.g. `bc-<hash>-<agent>`).
// Returns an empty slice (and nil error) when the session does not
// exist — callers treat that as "agent not running, record 0".
PanePIDs(ctx context.Context, session string) ([]int, error)
// ListSessions returns the list of live tmux session names. Used to
// resolve the workspace-hashed session name from a bare agent name.
ListSessions(ctx context.Context) ([]string, error)
// Children returns the direct child PIDs of the given PID.
Children(ctx context.Context, pid int) ([]int, error)
// PSStats returns the summed (%cpu, rssBytes) totals across the supplied
// PIDs. Missing PIDs are silently skipped (they may have exited between
// the walk and the ps call).
PSStats(ctx context.Context, pids []int) (cpuPercent float64, rssBytes int64, err error)
}
TmuxProcRunner abstracts the subprocess calls used by TmuxSampler so tests can inject a fake without needing a real tmux server.
type TmuxSample ¶ added in v0.2.0
type TmuxSample struct {
CPUPercent float64
MemBytes int64
// PIDsWalked is the count of descendant PIDs included in the sum;
// useful for debug logging and tests.
PIDsWalked int
}
TmuxSample is the result of sampling a tmux agent.
type TmuxSampler ¶ added in v0.2.0
type TmuxSampler struct {
// contains filtered or unexported fields
}
TmuxSampler samples CPU/memory for a tmux-backed agent via its PID tree.
The zero value is not usable; use NewTmuxSampler.
func NewTmuxSampler ¶ added in v0.2.0
func NewTmuxSampler(runner TmuxProcRunner) *TmuxSampler
NewTmuxSampler constructs a sampler. Pass DefaultTmuxProcRunner for production or a mock for tests.
func (*TmuxSampler) Sample ¶ added in v0.2.0
func (s *TmuxSampler) Sample(ctx context.Context, session, agentName string) (TmuxSample, error)
Sample walks the PID tree rooted at the tmux panes for `session`. It tries the literal session name first and, if that misses, falls back to any live session whose name contains `agentName`. It returns a zero sample (no error) when the session has no panes: stopped agents should read 0, not error.
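The name-resolution step can be illustrated with a small helper. `resolveSession` is a hypothetical function, and picking the first session that contains the agent name is an assumption of this sketch; the package may break ties differently.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveSession prefers an exact match on session, then falls back to the
// first live session whose name contains agentName. The boolean reports
// whether any session was found.
func resolveSession(session, agentName string, live []string) (string, bool) {
	for _, name := range live {
		if name == session {
			return name, true
		}
	}
	if agentName == "" {
		return "", false // an empty substring would match every session
	}
	for _, name := range live {
		if strings.Contains(name, agentName) {
			return name, true
		}
	}
	return "", false
}

func main() {
	live := []string{"bc-abc123-alice", "bc-abc123-bob"}
	fmt.Println(resolveSession("alice", "alice", live))
	fmt.Println(resolveSession("bc-abc123-bob", "bob", live))
	fmt.Println(resolveSession("carol", "carol", live))
}
```

When nothing matches, a caller following the contract above would return a zero sample rather than an error.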
func (*TmuxSampler) WarnNoNetworkOnce ¶ added in v0.2.0
func (s *TmuxSampler) WarnNoNetworkOnce(log func())
WarnNoNetworkOnce invokes `log` exactly once per sampler instance. The server wires this to pkg/log so we avoid an import cycle.
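A once-per-instance callback like this is commonly built on `sync.Once`. The sketch below shows that pattern; `onceWarner` is a hypothetical type, and whether the package actually uses `sync.Once` internally is an assumption.

```go
package main

import (
	"fmt"
	"sync"
)

// onceWarner is a hypothetical stand-in for the sampler's warning state.
type onceWarner struct {
	once sync.Once
}

// warnOnce runs log the first time it is called on this instance and does
// nothing afterwards. Taking a plain func() keeps the type free of any
// logging package dependency.
func (w *onceWarner) warnOnce(log func()) {
	if log == nil {
		return
	}
	w.once.Do(log)
}

func main() {
	var w onceWarner
	calls := 0
	for i := 0; i < 3; i++ {
		w.warnOnce(func() { calls++ })
	}
	fmt.Println("log invoked", calls, "time(s)")
}
```

Accepting the logger as a closure is what lets the caller wire in its own logging package without this package importing it.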
type TokenMetric ¶
type TokenMetric struct {
Time time.Time `json:"time"`
AgentName string `json:"agent_name"`
Model string `json:"model"`
InputTokens int64 `json:"input_tokens"`
OutputTokens int64 `json:"output_tokens"`
CacheRead int64 `json:"cache_read"`
CacheCreate int64 `json:"cache_create"`
CostUSD float64 `json:"cost_usd"`
}
TokenMetric represents token consumption at a point in time.