shell

package
v0.2.6-alpha.1
Published: May 22, 2026 License: MIT Imports: 18 Imported by: 0

Documentation

Overview

Package shell hosts shell-side tools: Bash, Ls, Grep, Tree.

Bash is synchronous shell execution; Ls/Tree are filesystem listing helpers (cheaper than a Bash round-trip when you just want to look at a directory); Grep is a regex search across files. Long-running process monitoring is intentionally elsewhere (the monitor package).

Constants

const BgTaskDomain = "bg_tasks"

BgTaskDomain is the observable.Change.Domain value the store emits. Subscribers route renders by matching this on KindStoreUpdate.

Variables

var Grep tools.Tool = &GrepTool{}

Grep is the singleton GrepTool. Delegates to system grep via Bash.

var Tree tools.Tool = &TreeTool{}

Tree is the singleton TreeTool. Stateless.

Functions

func GenerateID

func GenerateID() string

GenerateID returns a wire-stable "b" + 8 random base-36 characters, mirroring ref's generateTaskId for type=local_bash (b-prefixed IDs are recognisable in transcripts and don't collide with monitor IDs which use "m").
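
The ID shape described above can be sketched as follows. This is an illustration only: newTaskID is a hypothetical name, and the use of crypto/rand and a lowercase alphabet are assumptions, since the text only says the suffix is 8 random base-36 characters.

```go
package main

import (
	"crypto/rand"
	"fmt"
	"math/big"
)

// base36 is the assumed alphabet (digits, then lowercase letters).
const base36 = "0123456789abcdefghijklmnopqrstuvwxyz"

// newTaskID is a hypothetical stand-in for GenerateID: the prefix "b"
// followed by 8 characters drawn uniformly from the base-36 alphabet.
func newTaskID() (string, error) {
	buf := make([]byte, 0, 9)
	buf = append(buf, 'b')
	limit := big.NewInt(int64(len(base36)))
	for i := 0; i < 8; i++ {
		n, err := rand.Int(rand.Reader, limit)
		if err != nil {
			return "", err
		}
		buf = append(buf, base36[n.Int64()])
	}
	return string(buf), nil
}

func main() {
	id, err := newTaskID()
	if err != nil {
		panic(err)
	}
	fmt.Println(id)
}
```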

func Names

func Names() []tools.ToolName

Names lists every tool name this package contributes, in canonical order.

func TaskNames

func TaskNames() []tools.ToolName

TaskNames lists the companion task tools this package contributes. Composed into a profile's DeferredTools (the Main profile in internal/agent/profiles.go) — the model discovers them through tool_search after spawning its first background task.

Types

type BashTool

type BashTool struct {
	// contains filtered or unexported fields
}

BashTool runs `/bin/sh -c <command>` with cmd.Dir set to the workdir captured at construction. One BashTool instance per agent — the toolset factory in internal/toolset/builtins.go calls NewBash(s.Workdir()) so each agent (including subagents spawned with isolation: "worktree") gets a tool that runs in its own directory. The bash process is fresh per call — shell env state does NOT persist between invocations.

When run_in_background=true and a BgTaskHost is installed on the agent's ToolState, Execute returns immediately with a task id and the command runs in a detached goroutine. Completion routes back to the agent's signal pump for idle-wake + drain-at-iter-start delivery (see Phase 16 design).

func NewBash

func NewBash(workdir string) *BashTool

NewBash constructs a BashTool bound to workdir. An empty workdir means "use the process's current directory" (cmd.Dir = "" — exec defaults). Use this for tests / narrow callers; production tooling always passes the agent's workdir.

NewBash takes no host argument. Without a host, run_in_background falls back to the historical "not supported" error path; production callers (the toolset builtins factory) supply a non-nil host through NewBashWithHost so the Phase 16 detached path works.

func NewBashWithHost

func NewBashWithHost(workdir string, host BgTaskHost) *BashTool

NewBashWithHost is the production constructor used by the toolset builtins factory. The host supplies the BgTaskStore + signal sender run_in_background needs; without it the flag is rejected with a clear message.

func (*BashTool) Description

func (t *BashTool) Description() string

func (*BashTool) Execute

func (t *BashTool) Execute(ctx context.Context, logger *slog.Logger, input json.RawMessage) (tools.Result, error)

func (*BashTool) Name

func (t *BashTool) Name() string

func (*BashTool) Schema

func (t *BashTool) Schema() json.RawMessage

type BgTask

type BgTask struct {
	BgTaskSnapshot
	// Cancel terminates the underlying process. nil for tasks that have
	// already exited.
	Cancel context.CancelFunc
}

BgTask is the live record the store mutates. Cancel is the func that kills the underlying process (set by Bash before it spawns the goroutine); calling it from task_stop transitions the snapshot to BgKilled when the process exits.

type BgTaskHost

type BgTaskHost interface {
	// BgTaskStore returns the agent's background-task catalog. Never nil
	// for the host that the agent installs.
	BgTaskStore() *BgTaskStore
	// RootCtx returns the agent-lifetime context. Bg goroutines bind to
	// this rather than the per-call ctx so they survive the LLM call
	// that spawned them.
	RootCtx() context.Context
	// AgentID returns the spawning agent's id; copied into the snapshot
	// so the TUI can label rows by owner.
	AgentID() string
	// NotifyBgResult fires the agent's signal pump with this terminal
	// snapshot. Non-blocking; drops the signal if the chan buffer is
	// full (drain on next iter is the fallback).
	NotifyBgResult(snap BgTaskSnapshot)
}

BgTaskHost is the narrow surface every bg-tasks-aware tool (Bash with run_in_background, task_list, task_output, task_stop) reads. The host is implemented by *toolset.ToolState; the tool type-asserts on it at the start of Execute, same pattern the SKILL tool uses for its registry.
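
The non-blocking contract documented on NotifyBgResult is commonly written as a select with a default branch. The sketch below shows that pattern under assumed names (snapshot, notifier, signals); it is not the ToolState implementation, and the bool return is added here only to make the drop visible.

```go
package main

import "fmt"

// snapshot is a trimmed, hypothetical stand-in for BgTaskSnapshot.
type snapshot struct {
	ID     string
	Status string
}

// notifier is a hypothetical host with a buffered signal channel.
type notifier struct {
	signals chan snapshot
}

// notify attempts a non-blocking send and reports whether the signal
// was queued. A full buffer drops the signal instead of stalling the
// background goroutine; the result stays in the store for a later drain.
func (n *notifier) notify(s snapshot) bool {
	select {
	case n.signals <- s:
		return true
	default:
		return false
	}
}

func main() {
	n := &notifier{signals: make(chan snapshot, 1)}
	fmt.Println(n.notify(snapshot{ID: "b00000001", Status: "completed"})) // true: buffer had room
	fmt.Println(n.notify(snapshot{ID: "b00000002", Status: "failed"}))    // false: buffer full, dropped
}
```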

type BgTaskSnapshot

type BgTaskSnapshot struct {
	ID          string
	Command     string
	Description string
	Status      BgTaskStatus
	ExitCode    int
	Output      string
	StartedAt   time.Time
	CompletedAt time.Time
	// AgentID is the agent that spawned this task — used by the TUI to
	// prefix subagent rows with their owner.
	AgentID string
}

BgTaskSnapshot is the public shape of one background task. The store hands out snapshots by value so observers don't race the goroutine holding the live struct.

type BgTaskStatus

type BgTaskStatus string

BgTaskStatus is the lifecycle state of one detached Bash command.

Transitions:

BgRunning → BgCompleted   (process exited with code 0)
BgRunning → BgFailed      (process exited with non-zero code)
BgRunning → BgKilled      (task_stop or root ctx cancelled)

Terminal states are drained out of the store via DrainCompleted; their snapshots survive in the loop's iteration history as <system-reminder> blocks but are removed from the live store the moment they're folded in.

const (
	BgRunning   BgTaskStatus = "running"
	BgCompleted BgTaskStatus = "completed"
	BgFailed    BgTaskStatus = "failed"
	BgKilled    BgTaskStatus = "killed"
)
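
The transition table can be read as a small pure function. The sketch below uses hypothetical lowercase names and assumes that cancellation takes precedence over the exit code, which the text implies but does not state outright.

```go
package main

import "fmt"

type status string

const (
	running   status = "running"
	completed status = "completed"
	failed    status = "failed"
	killed    status = "killed"
)

// terminalStatus maps how a process ended to a terminal status.
// cancelled covers both task_stop and root-context cancellation and is
// checked first (an assumption of this sketch).
func terminalStatus(exitCode int, cancelled bool) status {
	switch {
	case cancelled:
		return killed
	case exitCode == 0:
		return completed
	default:
		return failed
	}
}

func main() {
	fmt.Println(terminalStatus(0, false))  // completed
	fmt.Println(terminalStatus(2, false))  // failed
	fmt.Println(terminalStatus(-1, true))  // killed
}
```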

type BgTaskStore

type BgTaskStore struct {
	*observable.Observable
	// contains filtered or unexported fields
}

BgTaskStore is the agent-owned catalog of background tasks. Embedded Observable fans every Change to subscribers (the TUI strip + the agent's KindStoreUpdate bridge). The store is safe for concurrent use — every mutator takes mu.

Lifecycle:

  • Bash creates a snapshot via Add when spawning a bg task.
  • The bg goroutine calls Complete with the terminal status when the process exits.
  • task_stop calls Stop to flip Cancel and let the goroutine close out.
  • The agent loop calls DrainCompleted at iter start to pull terminal entries into the conversation; drained entries are removed.

func NewBgTaskStore

func NewBgTaskStore() *BgTaskStore

NewBgTaskStore returns an empty store. Construction is cheap; the embedded Observable allocates its observer slice lazily.

func (*BgTaskStore) Add

func (s *BgTaskStore) Add(snap BgTaskSnapshot, cancel context.CancelFunc)

Add registers a freshly-spawned task. Status MUST be BgRunning. Cancel is the func that kills the bg process (called by Stop). Add emits a "started" observable.Change.

func (*BgTaskStore) Complete

func (s *BgTaskStore) Complete(id string, status BgTaskStatus, exitCode int, output string)

Complete transitions the task to a terminal state (BgCompleted / BgFailed / BgKilled) with the captured output, exit code, and finish time. The Cancel func is cleared so task_stop on a finished task is a clean no-op. Emits an Op matching the terminal status so subscribers can render distinct outcomes.

func (*BgTaskStore) Domain

func (s *BgTaskStore) Domain() string

Domain returns the observable store domain. Implements observable.Store.

func (*BgTaskStore) DrainCompleted

func (s *BgTaskStore) DrainCompleted() []BgTaskSnapshot

DrainCompleted pulls every terminal task out of the store, returning snapshots in completion order. The store emits a "removed" Change for each drained task so the TUI strip can render the "task-xxx completed" transcript line before the chip disappears.

Running tasks stay untouched — the next drain picks them up when they finish.

func (*BgTaskStore) Get

func (s *BgTaskStore) Get(id string) (BgTaskSnapshot, bool)

Get returns a snapshot of one task. ok=false when unknown.

func (*BgTaskStore) HasPending

func (s *BgTaskStore) HasPending() bool

HasPending reports whether any task is in a terminal state but has not been drained yet. The agent loop calls this before returning from a terminal turn — pending entries force one more iteration so the model sees the result before the loop releases the run flag.

func (*BgTaskStore) Snapshot

func (s *BgTaskStore) Snapshot() []BgTaskSnapshot

Snapshot returns every task in started-at order. Used by the TUI strip; safe to call from any goroutine.

func (*BgTaskStore) Stop

func (s *BgTaskStore) Stop(id string) (BgTaskSnapshot, bool)

Stop signals the named task to terminate. Returns ok=true when the task was running and a cancel was invoked, ok=false when the task is unknown or already terminal (task_stop surfaces this as a no-op).

type Classification

type Classification struct {
	Risk       Risk
	IsCommonFS bool
	Matched    string
	Reason     string
}

Classification is the structured result of Classify.

Matched is the rule entry that triggered the verdict — for ReadOnly it's the binary name; for Dangerous it's the prefix that matched. Surfaced in the approval UI so the user knows *why* a prompt is showing.

IsCommonFS is an orthogonal flag (NOT a Risk level) set when the binary is one of {mkdir, touch, mv, cp, rmdir, ln, chmod, chown}. The gate uses it to auto-allow these in accept_edits mode while leaving them as regular RiskMutate calls in default mode (which still asks).

func Classify

func Classify(command string) Classification

Classify inspects a shell command string and returns a structured Risk assessment. The classifier:

  1. Splits on safe operators (`|`, `&&`) and recurses on each segment. The combined Risk is the max — a chain is only ReadOnly if every segment is ReadOnly, and is Dangerous if any segment is Dangerous.
  2. Blocks unsafe operators (`;`, `||`, `>`, `>>`, `<`) — they collapse the command to RiskUnknown so the gate asks.
  3. Strips leading `VAR=value` env assignments (POSIX prefix; not the same as `env VAR=value cmd`).
  4. Checks the first non-env token against the dangerous-prefix list.
  5. Checks the first token against the read-only allowlist.

Empty input is RiskUnknown — defensive.

type GrepTool

type GrepTool struct{}

func NewGrep

func NewGrep() *GrepTool

func (*GrepTool) Description

func (t *GrepTool) Description() string

func (*GrepTool) Execute

func (t *GrepTool) Execute(ctx context.Context, logger *slog.Logger, input json.RawMessage) (tools.Result, error)

func (*GrepTool) Name

func (t *GrepTool) Name() string

func (*GrepTool) Schema

func (t *GrepTool) Schema() json.RawMessage

type Risk

type Risk int

Risk classifies a shell command's safety from the gate's perspective.

The gate uses Risk to drive auto mode (RiskReadOnly → allow, RiskDangerous → ask with a hint, RiskMutate/RiskUnknown → ask without a hint). Default mode treats every risk level as "ask," so a misclassification can't bypass the user — at worst it shows a prompt that didn't need to appear.

Conservative bias: RiskUnknown is the catch-all. Anything we can't confidently rate as ReadOnly stays Unknown, which forces ask.

const (
	RiskUnknown   Risk = iota // safe fallback: forces ask in auto mode
	RiskReadOnly              // read-only allowlist binary with safe arguments
	RiskMutate                // a non-allowlisted, non-dangerous command
	RiskDangerous             // matches a known code-execution prefix
)
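
How a gate might consume these levels is sketched below. decide, decision, and the mode strings "auto" and "default" are hypothetical; only accept_edits appears verbatim in the text. The sketch also assumes accept_edits asks for everything except common filesystem commands, which the docs do not spell out.

```go
package main

import "fmt"

type risk int

const (
	riskUnknown risk = iota
	riskReadOnly
	riskMutate
	riskDangerous
)

// decision is a hypothetical gate verdict.
type decision struct {
	Allow bool   // false means "ask the user"
	Hint  string // extra context shown with the prompt, if any
}

// decide sketches the mode handling described in the docs:
// auto allows read-only commands and adds a hint for dangerous ones,
// accept_edits allows common filesystem commands, and everything else
// (including all of default mode) asks without a hint.
func decide(mode string, r risk, isCommonFS bool) decision {
	switch mode {
	case "auto":
		switch r {
		case riskReadOnly:
			return decision{Allow: true}
		case riskDangerous:
			return decision{Hint: "matches a known code-execution prefix"}
		}
	case "accept_edits":
		if isCommonFS {
			return decision{Allow: true}
		}
	}
	return decision{}
}

func main() {
	fmt.Printf("%+v\n", decide("auto", riskReadOnly, false))
	fmt.Printf("%+v\n", decide("auto", riskDangerous, false))
	fmt.Printf("%+v\n", decide("default", riskReadOnly, false))
	fmt.Printf("%+v\n", decide("accept_edits", riskMutate, true))
}
```

Because the zero value of decision means "ask, no hint", any mode or risk level the function does not recognise falls back to prompting, which matches the conservative bias described above.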

func (Risk) String

func (r Risk) String() string

String returns a stable, human-readable name for the risk level.

type TaskListTool

type TaskListTool struct {
	// contains filtered or unexported fields
}

TaskListTool enumerates every background task in the agent's BgTaskStore. Pure read; safe in any permission mode. Mirrors ref's TaskListTool.

func NewTaskList

func NewTaskList(host BgTaskHost) *TaskListTool

NewTaskList constructs the tool. host may be nil — Execute reports a clear error in that case so the model gets a useful message instead of a nil panic.

func (*TaskListTool) Description

func (t *TaskListTool) Description() string

func (*TaskListTool) Execute

func (t *TaskListTool) Execute(_ context.Context, logger *slog.Logger, _ json.RawMessage) (tools.Result, error)

func (*TaskListTool) Name

func (t *TaskListTool) Name() string

func (*TaskListTool) Schema

func (t *TaskListTool) Schema() json.RawMessage

type TaskOutputTool

type TaskOutputTool struct {
	// contains filtered or unexported fields
}

TaskOutputTool returns the captured stdout+stderr of one task. It accepts both running and terminal tasks, but Phase 16's store captures output only at Complete time, so a running task currently reads empty.

func NewTaskOutput

func NewTaskOutput(host BgTaskHost) *TaskOutputTool

func (*TaskOutputTool) Description

func (t *TaskOutputTool) Description() string

func (*TaskOutputTool) Execute

func (t *TaskOutputTool) Execute(_ context.Context, logger *slog.Logger, raw json.RawMessage) (tools.Result, error)

func (*TaskOutputTool) Name

func (t *TaskOutputTool) Name() string

func (*TaskOutputTool) Schema

func (t *TaskOutputTool) Schema() json.RawMessage

type TaskStopTool

type TaskStopTool struct {
	// contains filtered or unexported fields
}

TaskStopTool terminates a running task. Idempotent on tasks that have already finished (returns a no-op message). The store's Stop method cancels the task's per-process ctx; the bg goroutine then calls Complete with Status=BgKilled and the agent's signal pump delivers the killed snapshot like any other terminal result.

func NewTaskStop

func NewTaskStop(host BgTaskHost) *TaskStopTool

func (*TaskStopTool) Description

func (t *TaskStopTool) Description() string

func (*TaskStopTool) Execute

func (t *TaskStopTool) Execute(_ context.Context, logger *slog.Logger, raw json.RawMessage) (tools.Result, error)

func (*TaskStopTool) Name

func (t *TaskStopTool) Name() string

func (*TaskStopTool) Schema

func (t *TaskStopTool) Schema() json.RawMessage

type TreeTool

type TreeTool struct{}

func (*TreeTool) Description

func (t *TreeTool) Description() string

func (*TreeTool) Execute

func (t *TreeTool) Execute(_ context.Context, logger *slog.Logger, input json.RawMessage) (tools.Result, error)

func (*TreeTool) Name

func (t *TreeTool) Name() string

func (*TreeTool) Schema

func (t *TreeTool) Schema() json.RawMessage
