Porter
A streaming-first Arrow Flight SQL server for DuckDB: simple, sharp, and built for motion.
Overview
Porter is a DuckDB-backed Arrow Flight SQL server designed around one idea:
SQL goes in. Arrow streams out. Everything else is detail.
It sits directly on top of Apache Arrow Flight SQL and exposes a clean execution surface for both raw SQL and prepared statements.
No orchestration layer. No distributed query engine. No abstraction sprawl.
Just a tight execution loop between Flight and DuckDB.
Key Characteristics
- Streaming-first execution model (Arrow RecordBatch streams)
- Native DuckDB execution via ADBC
- Full prepared statement lifecycle with parameter binding
- TTL-based handle management with background GC
- Minimal, explicit Flight SQL surface area
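To make "TTL-based handle management with background GC" concrete, here is a minimal sketch using only the Go standard library. It is not Porter's actual code: `HandleManager`, `Put`, `Get`, `Sweep`, and `StartGC` are hypothetical names chosen for illustration, and the stored value is just the SQL text.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry pairs a stored query with the time at which its handle expires.
type entry struct {
	query     string
	expiresAt time.Time
}

// HandleManager is a hypothetical TTL-based handle registry (not Porter's real type).
type HandleManager struct {
	mu      sync.Mutex
	ttl     time.Duration
	next    int
	entries map[string]entry
}

func NewHandleManager(ttl time.Duration) *HandleManager {
	return &HandleManager{ttl: ttl, entries: make(map[string]entry)}
}

// Put registers a query and returns an opaque handle valid for one TTL.
func (m *HandleManager) Put(query string) string {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.next++
	h := fmt.Sprintf("h-%d", m.next)
	m.entries[h] = entry{query: query, expiresAt: time.Now().Add(m.ttl)}
	return h
}

// Get resolves a handle, treating expired handles as missing.
func (m *HandleManager) Get(h string) (string, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	e, ok := m.entries[h]
	if !ok || !time.Now().Before(e.expiresAt) {
		return "", false
	}
	return e.query, true
}

// Sweep deletes expired handles and reports how many were removed.
func (m *HandleManager) Sweep() int {
	m.mu.Lock()
	defer m.mu.Unlock()
	now := time.Now()
	removed := 0
	for h, e := range m.entries {
		if !now.Before(e.expiresAt) {
			delete(m.entries, h)
			removed++
		}
	}
	return removed
}

// StartGC runs Sweep on a ticker in the background until stop is closed.
func (m *HandleManager) StartGC(interval time.Duration, stop <-chan struct{}) {
	go func() {
		t := time.NewTicker(interval)
		defer t.Stop()
		for {
			select {
			case <-t.C:
				m.Sweep()
			case <-stop:
				return
			}
		}
	}()
}

func main() {
	m := NewHandleManager(50 * time.Millisecond)
	stop := make(chan struct{})
	defer close(stop)
	m.StartGC(10*time.Millisecond, stop)

	h := m.Put("SELECT 1 AS value")
	_, ok := m.Get(h)
	fmt.Println("fresh handle found:", ok)

	time.Sleep(100 * time.Millisecond)
	_, ok = m.Get(h)
	fmt.Println("expired handle found:", ok)
}
```

Checking expiry in `Get` as well as in `Sweep` means a stale handle is never served, even if the background sweep has not yet run.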
Architecture
Porter keeps the control flow linear:
+---------------------+
|    Flight Client    |
+---------------------+
           |
     gRPC / Flight
           |
+---------------------+
|    Porter Server    |
|---------------------|
| Flight SQL Layer    |
| Handle Manager      |
| Prepared Stmts      |
| Stream Engine       |
+---------------------+
           |
+---------------------+
|       DuckDB        |
|     (via ADBC)      |
+---------------------+
           |
+---------------------+
| Arrow RecordBatches |
+---------------------+
The server is intentionally thin: routing, lifecycle, and streaming glue only.
DuckDB does the heavy lifting.
Getting Started
You have three ways to run Porter depending on how you like to work:
- Docker (fastest path)
- go install (clean local toolchain)
- Build from source (full control)
Option 1: Run with Docker (fastest)
docker build -t porter .
docker run -p 32010:32010 porter
Run with a persistent database:
docker run -p 32010:32010 -v "$(pwd)/data:/data" porter --db /data/porter.duckdb
Defaults:
- Address: 0.0.0.0:32010
- Database: in-memory (:memory:)
Option 2: Install via go install
1. Install Porter
go install github.com/TFMV/porter/cmd/porter@latest
This installs the porter binary into $GOBIN, or into $GOPATH/bin (by default $HOME/go/bin) when GOBIN is unset.
2. Install ADBC CLI (dbc)
curl -LsSf https://dbc.columnar.tech/install.sh | sh
3. Install DuckDB ADBC driver
dbc install duckdb
Verify installation:
dbc list
You should see duckdb listed.
Option 3: Build from Source
1. Clone
git clone https://github.com/TFMV/porter.git
cd porter
2. Install DuckDB ADBC driver
./install_duckdb.sh
3. Run
go run ./cmd/porter serve
CLI Usage
Porter exposes a composable CLI:
porter --help
Run the server
porter serve --db :memory: --port 32010
Execute a query
porter query "SELECT 1 AS value"
REPL
porter repl
Load Parquet
porter load data.parquet
Inspect schema
porter schema table_name
Environment variables
Execution Model
Porter supports two execution paths:
1. One-shot SQL
GetFlightInfoStatement → plan + handle
DoGetStatement → stream results
2. Prepared Statements
CreatePreparedStatement
DoPutPreparedStatementQuery
DoGetPreparedStatement
ClosePreparedStatement
Parameter batches are real Arrow RecordBatches with explicit ownership.
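The prepared statement lifecycle can be pictured with a small standard-library sketch. Everything here is an assumption made for illustration: `Prepared`, `ParamBatch`, and their methods are hypothetical stand-ins, and the plain integer reference count only imitates the Retain/Release style of ownership that Arrow's Go library uses. It is not thread-safe and is not Porter's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// ParamBatch stands in for an Arrow RecordBatch of bound parameters.
// refs imitates Retain/Release reference counting (simplified, not atomic).
type ParamBatch struct {
	Values []int64
	refs   int
}

// NewParamBatch returns a batch owned by the caller (one reference).
func NewParamBatch(v ...int64) *ParamBatch { return &ParamBatch{Values: v, refs: 1} }

func (b *ParamBatch) Retain()   { b.refs++ }
func (b *ParamBatch) Release()  { b.refs-- }
func (b *ParamBatch) Refs() int { return b.refs }

// ErrClosed is returned when a closed statement is used.
var ErrClosed = errors.New("prepared statement is closed")

// Prepared is a hypothetical server-side prepared statement.
type Prepared struct {
	query  string
	params *ParamBatch
	closed bool
}

// Create corresponds to the CreatePreparedStatement step.
func Create(query string) *Prepared { return &Prepared{query: query} }

// Bind corresponds to DoPutPreparedStatementQuery: the statement takes its
// own reference to the new batch and drops its reference to the old one.
func (p *Prepared) Bind(b *ParamBatch) error {
	if p.closed {
		return ErrClosed
	}
	b.Retain()
	if p.params != nil {
		p.params.Release()
	}
	p.params = b
	return nil
}

// Execute corresponds to DoGetPreparedStatement. This sketch only describes
// what would run instead of streaming real results.
func (p *Prepared) Execute() (string, error) {
	if p.closed {
		return "", ErrClosed
	}
	if p.params == nil {
		return p.query, nil
	}
	return fmt.Sprintf("%s with %v", p.query, p.params.Values), nil
}

// Close corresponds to ClosePreparedStatement and releases any bound batch.
func (p *Prepared) Close() {
	if p.closed {
		return
	}
	if p.params != nil {
		p.params.Release()
		p.params = nil
	}
	p.closed = true
}

func main() {
	batch := NewParamBatch(41)
	stmt := Create("SELECT ? + 1")
	if err := stmt.Bind(batch); err != nil {
		panic(err)
	}
	out, _ := stmt.Execute()
	fmt.Println(out)
	stmt.Close()
	batch.Release() // the caller drops its own reference last
	fmt.Println("references left:", batch.Refs())
}
```

The point of the sketch is the ownership rule: whoever holds a batch holds a reference, and each holder releases exactly once.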
Streaming Core
DuckDB → Arrow RecordReader → Channel → Flight StreamChunks
Backpressure is enforced naturally via the channel boundary.
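A rough illustration of channel-based backpressure, using only the standard library: the producer blocks on send once the bounded channel is full, so it can never run more than the buffer size ahead of the consumer. `Batch`, `produce`, `stream`, `Run`, and `ErrClientGone` are hypothetical names, and the fixed 1024-row batches are placeholders for real Arrow data.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Batch is a stand-in for an Arrow RecordBatch (hypothetical type).
type Batch struct{ Rows int }

// ErrClientGone simulates a client that stopped reading mid-stream.
var ErrClientGone = errors.New("client gone")

// produce emits n batches into a bounded channel, then closes it.
// Each send blocks while the buffer is full, which is the backpressure.
func produce(ctx context.Context, n int, out chan<- Batch) {
	defer close(out)
	for i := 0; i < n; i++ {
		select {
		case out <- Batch{Rows: 1024}:
		case <-ctx.Done():
			return // consumer gave up; stop producing
		}
	}
}

// stream drains the channel, passing each batch to send, and returns the
// number of rows delivered successfully.
func stream(in <-chan Batch, send func(Batch) error) (int, error) {
	total := 0
	for b := range in {
		if err := send(b); err != nil {
			return total, err
		}
		total += b.Rows
	}
	return total, nil
}

// Run wires producer and consumer together through a channel of the given
// buffer size and cancels the producer if the consumer returns early.
func Run(n, buffer int, send func(Batch) error) (int, error) {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	ch := make(chan Batch, buffer)
	go produce(ctx, n, ch)
	return stream(ch, send)
}

func main() {
	total, err := Run(8, 2, func(b Batch) error { return nil })
	fmt.Println("rows streamed:", total, "err:", err)
}
```

The context cancellation matters as much as the buffer: without it, a consumer that fails mid-stream would leave the producer goroutine blocked on a send forever.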
Wire Contract
| Operation | Behavior |
|---|---|
| SQL Query | Raw SQL → FlightInfo → DoGet stream |
| Prepared Statements | Handle-based execution with binding |
| Schema Introspection | Lightweight probe execution |
Roadmap
- Streaming Flight SQL execution
- Prepared statements
- TTL-based lifecycle
- Background GC
- WebSocket transport
- Session context
- Improved schema probing
- Benchmark suite
Contributing
If you've ever looked at a data system and thought:
"Why is this so complicated?"
You're in the right place.
Build it smaller. Make it clearer. Keep it moving.