llm-compiler

llm-compiler is a Go library and CLI that compiles LLM workflow definitions into standalone binaries with embedded local inference. Use the llmc CLI directly, or import the library to build command-line tools, local agents, and offline edge deployments.

Focus: Local inference, modular backends, and production-oriented design.

Who is this for?

  • Go developers working with local or small LLMs
  • CLI/service builders embedding LLMs into command-line tools or backend services
  • Performance-focused developers who prefer native performance over Python stacks

Supported platforms

This project is tested on macOS, Linux (Ubuntu), and Windows. CI builds and tests run on all three platforms.

Key features

  • Modular LLM backends – llama.cpp included via submodule; designed for extensibility
  • Go-first architecture – native performance, single-binary deployment
  • CLI and library API – use the llmc CLI or import pkg/llmc as a Go library
  • Workflow compilation – compile YAML workflows into standalone Go binaries
  • Cross-workflow synchronization – wait_for with optional timeouts and fail-fast error propagation
  • Shell steps – template substitution using workflow outputs (see the sketch after this list)
  • Subprocess worker mode – optional LLMC_SUBPROCESS=1 switch for true concurrent model execution
  • Integration test harness – compiles example workflows and persists outputs for debugging in CI
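
As a taste of the workflow model, here is a minimal sketch built from the library calls documented under "Go Library API" below. It assumes (not confirmed by this README) that AddStep can be chained and that a value stored via WithOutput can be referenced from a later step's prompt through {{...}} template substitution:

package main

import "github.com/LiboWorks/llm-compiler/pkg/llmc"

func main() {
    // Two chained steps: the second prompt reuses the first step's
    // output ("analysis") via {{...}} template substitution.
    wf := llmc.NewWorkflow("report").
        AddStep(llmc.LLMStep("analyze", "Analyze these items: {{items}}").
            WithOutput("analysis").
            Build()).
        AddStep(llmc.LLMStep("summarize", "Summarize in one paragraph: {{analysis}}").
            WithOutput("summary").
            Build())
    _ = wf // hand wf to the compiler API shown under "Go Library API"
}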

Quickstart

  1. Clone the repo:
git clone --recurse-submodules https://github.com/LiboWorks/llm-compiler.git
cd llm-compiler

  2. Build llama.cpp (required for local_llm steps):
./scripts/build-llama.sh

The script auto-detects your OS and configures the appropriate backend:

  • macOS: Metal + Apple BLAS (GPU acceleration)
  • Linux: CPU backend (use --cuda or --vulkan for GPU)
  • Windows: CPU backend via MinGW (use --cuda or --vulkan for GPU)

Options:

./scripts/build-llama.sh --clean    # Clean build
./scripts/build-llama.sh --cuda     # Enable CUDA (Linux/Windows)
./scripts/build-llama.sh --vulkan   # Enable Vulkan (Linux/Windows)

Manual build instructions (if the script doesn't work)

macOS (Metal backend):

cd third_party/llama.cpp
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=OFF \
  -DGGML_METAL=ON \
  -DGGML_BLAS=ON \
  -DGGML_BLAS_VENDOR=Apple
cmake --build . --config Release -j$(sysctl -n hw.ncpu)
cd ../../..

Linux (Ubuntu, CPU backend):

cd third_party/llama.cpp
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release \
  -DBUILD_SHARED_LIBS=OFF \
  -DGGML_METAL=OFF \
  -DGGML_BLAS=OFF \
  -DLLAMA_CURL=OFF
cmake --build . --config Release -j$(nproc)
cd ../../..

Windows (CPU backend with MinGW):

cd third_party/llama.cpp
mkdir -p build; cd build
cmake .. -G "MinGW Makefiles" -DCMAKE_BUILD_TYPE=Release `
  -DBUILD_SHARED_LIBS=OFF `
  -DGGML_METAL=OFF `
  -DGGML_BLAS=OFF `
  -DGGML_OPENMP=OFF `
  -DLLAMA_CURL=OFF `
  -DLLAMA_BUILD_COMMON=OFF
cmake --build . --config Release -j $env:NUMBER_OF_PROCESSORS
cd ../../..

  3. Build the CLI:
go build ./cmd/llmc

  4. Compile your workflows (example):
./llmc compile -i example.yaml -o ./build

  5. Run the generated program:
# Run in-process (LLM calls are serialized):
./build/workflows
# Or run with subprocess workers for true concurrency:
LLMC_SUBPROCESS=1 ./build/workflows

Go Library API

Use llm-compiler programmatically by importing the public API:

import "github.com/LiboWorks/llm-compiler/pkg/llmc"

// Compile a workflow file to a binary
result, err := llmc.CompileFile("workflow.yaml", &llmc.CompileOptions{
    OutputDir: "./build",
})

// Or load and inspect workflows first
workflows, err := llmc.LoadWorkflows("workflow.yaml")

// Build workflows programmatically
wf := llmc.NewWorkflow("my-workflow").
    AddStep(llmc.LLMStep("analyze", "Analyze these items and summarize: {{items}}").
        WithModel("gpt-4").
        WithMaxTokens(1024).
        WithOutput("analysis").
        Build())

See pkg/llmc for the full API surface.
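
Put together, a minimal end-to-end program using only the calls shown above might look like this (a sketch; the concrete fields of the result value are not documented here, so it is printed generically):

package main

import (
    "fmt"
    "log"

    "github.com/LiboWorks/llm-compiler/pkg/llmc"
)

func main() {
    // Compile workflow.yaml into a standalone binary under ./build.
    result, err := llmc.CompileFile("workflow.yaml", &llmc.CompileOptions{
        OutputDir: "./build",
    })
    if err != nil {
        log.Fatalf("compile failed: %v", err)
    }
    fmt.Printf("compiled: %+v\n", result)
}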

Public API Stability

Only packages under pkg/ are considered public API.

  • Packages under internal/ are private implementation details
  • CLI behavior may change between minor versions
  • Public APIs may change during v0.x, but breaking changes will be documented

Do not depend on non-pkg/ packages.

Notes about concurrency

  • By default the local LLM runtime serializes C-level Predict calls to avoid concurrency issues with the ggml/llama C binding. This means multiple local_llm steps will be queued when running in-process.
  • Use LLMC_SUBPROCESS=1 to enable subprocess workers; each worker is an isolated process that can load models independently and run in parallel (a launch sketch follows).
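
If you drive the generated binary from Go instead of a shell, the same environment switch applies. A small sketch using only the standard library (the ./build/workflows path comes from the Quickstart above):

package main

import (
    "log"
    "os"
    "os/exec"
)

func main() {
    // Launch the compiled workflow binary with subprocess workers enabled,
    // so local_llm steps can run in parallel instead of being queued.
    cmd := exec.Command("./build/workflows")
    cmd.Env = append(os.Environ(), "LLMC_SUBPROCESS=1")
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr
    if err := cmd.Run(); err != nil {
        log.Fatalf("workflow run failed: %v", err)
    }
}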

Building with Pro features

This repo supports an optional, private pro module. To build with Pro features locally, use a go.work file or a replace directive to make the private module available, then build with -tags pro.
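
For example, assuming the private module lives in a sibling checkout (the ../llm-compiler-pro path is hypothetical; substitute the real location), a local go.work could look like this, after which go build -tags pro ./cmd/llmc picks up the Pro code:

go 1.21

use (
    .
    ../llm-compiler-pro
)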

Third-Party Dependencies

This project integrates the following open-source software:

  • llama.cpp – included as a git submodule; remains under its original license

License

This project is licensed under the Apache 2.0 License. See LICENSE for details.

How to contribute

See CONTRIBUTING.md for guidelines on opening issues and submitting pull requests.

Roadmap

  • Stable public API
  • Additional backend support
  • Example projects

Directories

Path                Synopsis
cmd/llmc            The llmc command.
internal/backend    Package backend defines interfaces for workflow step execution backends.
internal/compiler   Package compiler provides the core compilation logic for llm-compiler.
internal/config     Package config provides centralized configuration management for llm-compiler.
internal/runtime    Package runtime provides runtime helpers for generated llm-compiler programs.
internal/testing    Package testing provides test utilities and helpers for llm-compiler tests.
internal/worker     Package worker provides subprocess worker management for llm-compiler.
pkg/llmc            Package llmc provides a public API for the llm-compiler.
