awf

Version: v0.0.0-...-89e6720
Published: Jun 15, 2026 License: Apache-2.0

AWF

AWF is a single-binary runtime for agentic workflows that need a real acceptance gate: run the agent, check the result independently, repair from the critique, and resume safely after crashes without redoing committed work.

It is built for workflows where a model saying "done" is not good enough: coding agents, research agents, data-migration agents, support-reply agents, security triage agents, or any pipeline where each stage should advance only after an external check passes.

CLI reference | Workflow format

Why AWF Exists

Agentic workflows are useful because they let models do open-ended work. They are risky for the same reason: a step can report success without actually meeting the requirement.

AWF makes the acceptance check part of the runtime.

flowchart LR
    generate["generate<br/>agent or command"]
    evaluate["evaluate<br/>fresh judge or deterministic check"]
    pass{"passes?"}
    next["next stage"]
    repair["repair with critique"]

    generate --> evaluate --> pass
    pass -->|"yes"| next
    pass -->|"no"| repair --> generate

The central primitive is the gate: a generate block, an independent evaluate block, an until condition over the evaluator's typed output, and a bounded repair loop. The evaluator runs in a fresh context, so the generator never marks its own homework. A crash is not a verdict; only a real evaluation with a false until consumes a repair attempt.

What You Get

  • Independent gates: engine-enforced generate -> evaluate -> repair loops, with the prior verdict automatically fed into the next generate attempt.
  • Black-box agents: wrap existing CLIs such as Claude Code, Factory droid, Block Goose, OpenAI Codex, or use awf/llm for direct OpenAI-compatible HTTP.
  • Typed outputs: downstream steps bind to validated output_schema fields, not fragile free text.
  • Checkpoint/resume: step outputs and declared files commit to a content-addressed artifact store before the journal pointer moves.
  • Pinned replay: resume hard-fails if the workflow definition, imported files, assets, container digests, or resolved runtime versions drift.
  • Real workspaces: run against long-lived native processes, digest-pinned containers, or Compose labs; Docker handles networking, healthchecks, and multi-service wiring.
  • Traceable runs: inspect runs, fold status from the log, and export traces without putting observability in the execution path.
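The pinned-replay item above amounts to comparing a digest recorded at run start with one recomputed at resume time. A minimal sketch of that idea follows, assuming SHA-256 over a sorted set of named inputs; `DefinitionDigest`, `CheckPinned`, and the input names are hypothetical and do not describe AWF's actual digest format.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sort"
)

// DefinitionDigest hashes named inputs (for example the workflow file,
// imports, assets, and runtime versions) in a stable, sorted order.
func DefinitionDigest(inputs map[string][]byte) string {
	names := make([]string, 0, len(inputs))
	for name := range inputs {
		names = append(names, name)
	}
	sort.Strings(names)

	h := sha256.New()
	for _, name := range names {
		// Length-prefix each field so ("ab","c") and ("a","bc") differ.
		fmt.Fprintf(h, "%d:%s%d:", len(name), name, len(inputs[name]))
		h.Write(inputs[name])
	}
	return hex.EncodeToString(h.Sum(nil))
}

// CheckPinned fails when the current inputs no longer match the digest
// that was recorded when the run started.
func CheckPinned(pinned string, current map[string][]byte) error {
	if got := DefinitionDigest(current); got != pinned {
		return fmt.Errorf("definition drift: pinned %s, got %s", pinned, got)
	}
	return nil
}

func main() {
	inputs := map[string][]byte{
		"workflow.yaml": []byte("workflow: demo\nversion: 1\n"),
		"runtime":       []byte("example-runtime 1.0.0"),
	}
	pinned := DefinitionDigest(inputs)
	fmt.Println("unchanged:", CheckPinned(pinned, inputs))

	inputs["workflow.yaml"] = []byte("workflow: demo\nversion: 2\n")
	fmt.Println("edited:", CheckPinned(pinned, inputs))
}
```

Treating any mismatch as an error, rather than adopting the newer definition, is what makes a resumed run reproduce the run that was started.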

A First Workflow

This workflow asks a model to write a release note, then has an independent judge approve it or send feedback into the next repair attempt.

workflow: gated-release-note
version: 1

graph:
  - gate:
      generate:
        - id: draft
          uses: awf/llm
          with:
            base_url: http://localhost:11434/v1
            model: llama3.1
            api_key_env: OPENAI_API_KEY
            system_prompt: "You write concise release notes."
            prompt: |
              Write a three-sentence release note for a new `awf run --json` flag.
              Prior review feedback, if any: {{ evaluate.feedback }}
          output_schema:
            type: object
            additionalProperties: false
            required: [release_note]
            properties:
              release_note: { type: string }

      evaluate:
        - id: judge
          uses: awf/llm
          with:
            base_url: http://localhost:11434/v1
            model: llama3.1
            api_key_env: OPENAI_API_KEY
            system_prompt: "You are a strict release-note reviewer."
            prompt: |
              Review this release note:

              {{ step.draft.release_note }}

              Approve it only if it is accurate, specific, and exactly three sentences.
          output_schema:
            type: object
            additionalProperties: false
            required: [approved, feedback]
            properties:
              approved: { type: boolean }
              feedback: { type: string }

      until: "{{ evaluate.approved }}"
      max_attempts: 3

That same shape works for higher-stakes tasks: generate a patch and run tests, draft a customer reply and judge it against account data, triage a CVE and check the exploitability claim, or migrate data and verify the target state.
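To make the typed-output contract in the judge step concrete, here is a hedged Go sketch of what strict validation of the judge's JSON could look like. It hand-rolls the two constraints from the example schema (`required` and `additionalProperties: false`) with the standard library; `JudgeOutput` and `ParseJudgeOutput` are hypothetical names, and AWF's own validator may work differently.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
)

// JudgeOutput mirrors the judge step's output_schema. Pointer fields let
// the parser tell "absent" apart from a zero value such as false or "".
type JudgeOutput struct {
	Approved *bool   `json:"approved"`
	Feedback *string `json:"feedback"`
}

// ParseJudgeOutput rejects unknown fields, wrong types, and missing
// required fields before anything downstream reads the verdict.
func ParseJudgeOutput(raw []byte) (bool, string, error) {
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.DisallowUnknownFields() // mirrors additionalProperties: false

	var out JudgeOutput
	if err := dec.Decode(&out); err != nil {
		return false, "", fmt.Errorf("judge output is not valid: %w", err)
	}
	if out.Approved == nil || out.Feedback == nil {
		return false, "", errors.New("judge output is missing a required field")
	}
	return *out.Approved, *out.Feedback, nil
}

func main() {
	for _, raw := range []string{
		`{"approved": false, "feedback": "Only two sentences."}`,
		`{"approved": true}`,
		`{"approved": true, "feedback": "", "score": 9}`,
	} {
		approved, feedback, err := ParseJudgeOutput([]byte(raw))
		fmt.Println(approved, feedback, err)
	}
}
```

Only the first payload parses cleanly; the other two fail on a missing field and an undeclared field, which is the behavior that lets an `until` expression rely on `approved` being a real boolean.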

Quickstart

AWF is a Go 1.26 CLI. Build the binary:

git clone https://github.com/valbaudo/awf.git
cd awf
make build

Validate a workflow without running agents, containers, or network I/O:

bin/awf validate examples/awf-llm-ollama/workflow.yaml

Run the local Ollama example after starting an OpenAI-compatible Ollama server and forwarding an API-key env var. Ollama can ignore the bearer value; AWF still requires the named env var to be present because adapters never inline secrets into workflow files.


export OPENAI_API_KEY=ollama
bin/awf run examples/awf-llm-ollama/workflow.yaml

Use Docker when you want resumable runs or workflows that need isolated containers/Compose labs:

bin/awf run --backend docker path/to/workflow.yaml
bin/awf resume <run-id> path/to/workflow.yaml

Useful development checks:

make lint test        # pre-commit bar
make build            # build ./bin/awf
make integ            # Docker/native integration suite; no live API spend

How AWF Is Different

| Concern | Common agent framework shape | AWF shape |
| --- | --- | --- |
| Model execution | Framework calls the model in-process | Runtime wraps external CLIs or one HTTP LLM call as black boxes |
| Quality check | Author wires evaluator nodes or post-hoc evals | Engine owns the independent gate and repair loop |
| State | Mutable snapshots or app-managed memory | Append-only journal plus content-addressed artifacts |
| Resume | Recompute, restore app state, or apply latest definition | Replay committed steps, rerun only the uncommitted frontier |
| Drift | Often latest-wins | Definition digest and runtime versions are pinned; drift is a hard error |
| Infra | Usually in-process or platform-owned workers | Single-host native/Docker/Compose workspaces, rebuilt from pinned recipes |

AWF is not trying to be a distributed scheduler, a durable-execution platform, or a general agent-team framework. It is a narrow runtime for single-host, checkpointed agentic pipelines where every meaningful stage has to pass an independent check before the workflow can move on.

Core Concepts

  • Workflow: a YAML document with input schema, optional assets/imports, execution infrastructure, and a graph of steps/control flow. The resolved document is content-addressed at run start.
  • Step: a run: command, uses: agent invocation, await signal, or imported workflow call. Steps can produce typed JSON outputs and named output files.
  • Gate: a control node that runs generate steps, evaluates them independently, checks until, and repairs with feedback until the condition passes or max_attempts is exhausted.
  • Commit: the durable unit of progress. AWF writes typed outputs, declared files, and optional workspace snapshots to the blob store first; only then does it append the completed journal event.
  • Resume: a fold over the journal. Completed steps are replayed from committed artifacts; only the uncommitted frontier re-executes. Changed definitions or runtime versions stop the resume instead of silently adapting.
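The Commit and Resume concepts above can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not AWF's state package: `Store`, `Event`, `Commit`, and `Resume` are hypothetical, the blob store is a map, and the step names in `main` are made up.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Event is a hypothetical journal record: a step completed and its
// output is stored at the content address Blob.
type Event struct {
	StepID string
	Blob   string
}

// Store pairs a content-addressed blob map with an append-only journal.
type Store struct {
	blobs   map[string][]byte
	journal []Event
}

func NewStore() *Store { return &Store{blobs: map[string][]byte{}} }

// Commit writes the blob first and appends the journal event second, so
// a crash between the two leaves an orphan blob, never a journal entry
// that points at missing data.
func (s *Store) Commit(stepID string, output []byte) string {
	sum := sha256.Sum256(output)
	addr := hex.EncodeToString(sum[:])
	s.blobs[addr] = append([]byte(nil), output...)
	s.journal = append(s.journal, Event{StepID: stepID, Blob: addr})
	return addr
}

// Resume folds the journal into the committed outputs and returns the
// steps of the plan that still have to run (the uncommitted frontier).
func (s *Store) Resume(plan []string) (map[string][]byte, []string) {
	done := map[string][]byte{}
	for _, ev := range s.journal {
		done[ev.StepID] = s.blobs[ev.Blob]
	}
	var frontier []string
	for _, id := range plan {
		if _, ok := done[id]; !ok {
			frontier = append(frontier, id)
		}
	}
	return done, frontier
}

func main() {
	s := NewStore()
	s.Commit("draft", []byte(`{"release_note":"..."}`))
	done, frontier := s.Resume([]string{"draft", "judge", "publish"})
	fmt.Println(len(done), frontier)
}
```

Because the journal is the only source of truth for progress, resuming needs no mutable snapshot: replaying the events yields both the committed outputs and the list of steps left to execute.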

Adapters and Backends

Agent steps name a runtime with uses:. The step's with: map is opaque to the engine; only the named adapter validates and interprets it.
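One way to picture the opaque `with:` map is a registry that resolves `uses:` and hands the map to the adapter untouched. The interface below is a hypothetical stand-in, not the project's real `agent.Adapter`, and the set of required keys is an assumption taken from the example workflow rather than from the adapter's documentation.

```go
package main

import "fmt"

// Adapter is a hypothetical interface used only for this sketch.
type Adapter interface {
	Validate(with map[string]any) error
}

// llmAdapter stands in for an awf/llm-style adapter.
type llmAdapter struct{}

// Validate checks the keys this adapter cares about. Which keys are
// actually required is an assumption based on the example workflow.
func (llmAdapter) Validate(with map[string]any) error {
	for _, key := range []string{"base_url", "model", "api_key_env"} {
		if s, ok := with[key].(string); !ok || s == "" {
			return fmt.Errorf("awf/llm: %q must be a non-empty string", key)
		}
	}
	return nil
}

// Registry maps a uses: reference to the adapter that owns it.
type Registry map[string]Adapter

// ValidateStep resolves uses and passes the with map through without
// inspecting it; only the adapter interprets its contents.
func (r Registry) ValidateStep(uses string, with map[string]any) error {
	a, ok := r[uses]
	if !ok {
		return fmt.Errorf("unknown adapter %q", uses)
	}
	return a.Validate(with)
}

func main() {
	reg := Registry{"awf/llm": llmAdapter{}}
	with := map[string]any{
		"base_url":    "http://localhost:11434/v1",
		"model":       "llama3.1",
		"api_key_env": "OPENAI_API_KEY",
	}
	fmt.Println(reg.ValidateStep("awf/llm", with))
	fmt.Println(reg.ValidateStep("example/unknown", with))
}
```

Keeping the engine ignorant of adapter-specific keys means a new adapter can be added without changing the workflow format.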

Built-in adapters:

  • anthropic/claude-code
  • factory/droid
  • block/goose
  • openai/codex
  • awf/llm for OpenAI-compatible Chat Completions endpoints, including local Ollama, vLLM, llama.cpp, LM Studio, LiteLLM, and Bifrost-style gateways
  • openai/codex-live for Codex app-server-backed live sessions

The openai/codex-live adapter uses the same uses: resolution, runtime pinning, live event stream, trace, and UI surfaces as other agent steps. It stores provider session metadata under the live home and keeps raw live transcripts provider-owned. block/goose-live remains a reserved implementation track ref, and anthropic/claude-code-live remains deferred behind a PTY proof spike.

Execution backends:

  • native: host processes, fastest path, no isolation; not resumable (awf resume hard-errors — use --backend docker for resumable runs)
  • docker: digest-pinned images and Compose projects, resumable
  • fake: in-memory backend for conformance tests

See awf(1) for adapter environment variables, streaming notes, security caveats, and CLI flags.

Documentation

  • awf(1): command reference, flags, exit status, environment, tracing, and examples.
  • awf-workflow(5): the workflow-format reference and stable contract for fields, control flow, templating, typed outputs, and checkpoint/resume.
  • examples/: runnable examples for awf/llm, droid BYOK, and engine-owned conversation threads.

Further Reading

These are the sources AWF actually draws from. The list is intentionally narrow.

Runtime foundations

Directories

Path Synopsis
Package agent provides the Adapter interface — the seam between the engine's dispatcher and external agent CLIs (Claude Code first, per awf-workflow(5) (Agent step) and runtime-design.md §8).
awfllm
Package awfllm implements agent.Adapter as a single, streaming LLM call against any OpenAI-compatible Chat Completions endpoint (OpenAI, Ollama, vLLM, llama.cpp, LM Studio, LiteLLM/Bifrost gateways).
claude
Package claude implements agent.Adapter against the Claude Code CLI.
codex
Package codex implements agent.Adapter against OpenAI's `codex` CLI (the `codex exec` non-interactive subcommand).
droid
Package droid implements agent.Adapter against Factory AI's `droid` CLI (the `droid exec` non-interactive subcommand).
fake
Package fake provides an in-memory scripted agent.Adapter implementation.
goose
Package goose implements agent.Adapter against Block's `goose` CLI (the `goose run` non-interactive subcommand).
Package cli assembles the command-line surface.
Package clock provides the Clock and IDGen interfaces, injected wherever time/ids are needed.
cmd
awf command
Command awf is the AWF runtime CLI entry point.
genrates command
Command genrates regenerates pricing/rates.json from the models.dev pricing database (with a LiteLLM fallback), filtered to a curated allowlist of the models AWF prices.
Package conformance is the Backend-parameterized test suite the design spec §H calls "the definition of done" for Phase 2 onward.
Package container provides the Backend seam — the interface the engine's Dispatcher uses to run commands inside long-lived containers (a single digest-pinned image or a compose project, per awf-workflow(5), CONTAINERS).
backendtest
Package backendtest is the parameterized interface-conformance test for container.Backend.
docker
Package docker implements container.Backend against the Docker Engine SDK.
native
Package native implements container.Backend by running commands directly on the host via os/exec.
engine/local_dispatcher_agent.go — the AgentStep dispatch path.
frontend
yaml
Package yaml is the YAML frontend: parses YAML workflows into the ir.Workflow IR.
Package graph projects a workflow into a node/edge graph for visualization — the JSON contract behind a future visual graph tool (`awf graph --json`).
Package ir provides stable IR types (the contract), structural validation, and the definition digest.
Package loader reads workflow YAML files and their referenced compose files/assets into an ir.LoadedDefinition.
Package obs is the read-only OpenTelemetry projection of the AWF event log.
Package pricing converts normalized token counts into a per-model USD cost.
Package retry provides retry policy, backoff math, and per-policy exit-code classification — the data primitive that engine.RunWithRetry composes with a Dispatcher + Clock into the retry loop.
Package runlock is the sidecar run-liveness lock: an exclusive BSD flock(2) held by `awf run` / `awf resume` for a run's lifetime, plus a non-blocking shared probe (Held) that distinguishes a live run (lock held) from a crashed one (lock free).
Package signal is the cross-process control-surface for `awf signal` / `awf pause` / `awf cancel`.
Package state provides the durability core — an append-only Log and content-addressed Blobs — that the engine (Phase 2) sits on for commit/resume and that obs (Phase 6) reads to project OTel spans.
Package template implements the §7 mini-language: lexer + recursive-descent parser for the bounded expression grammar from the Phase 1 design spec §B, plus reference extraction over the resulting AST and a `{{ … }}` slot scanner for substitution-bearing host strings.
Package ui serves the awf visual graph tool: a localhost HTTP server that renders a workflow's graph (via the Slice-1 graph package) and overlays run state.