porter

module
v0.14.0 Latest
Published: Mar 27, 2026 License: MIT

README

Porter

A streaming-first Arrow Flight SQL server for DuckDB: simple, sharp, and built for motion.


🧭 Overview

Porter is a DuckDB-backed Arrow Flight SQL server designed around one idea:

SQL goes in. Arrow streams out. Everything else is detail.

It sits directly on top of Apache Arrow Flight SQL and exposes a clean execution surface for both raw SQL and prepared statements.

No orchestration layer. No distributed query engine. No abstraction sprawl.

Just a tight execution loop between Flight and DuckDB.


⚡ Key Characteristics

  • Streaming-first execution model (Arrow RecordBatch streams)
  • Native DuckDB execution via ADBC
  • Full prepared statement lifecycle with parameter binding
  • TTL-based handle management with background GC
  • Minimal, explicit Flight SQL surface area

🧱 Architecture

Porter keeps the control flow linear:

           +-------------------+
           |   Flight Client   |
           +-------------------+
                     |
               gRPC / Flight
                     |
           +-------------------+
           |   Porter Server   |
           |-------------------|
           | Flight SQL Layer  |
           | Handle Manager    |
           | Prepared Stmts    |
           | Stream Engine     |
           +-------------------+
                     |
           +-------------------+
           |     DuckDB        |
           |   (via ADBC)      |
           +-------------------+
                     |
           +-------------------+
           |Arrow RecordBatches|
           +-------------------+

The server is intentionally thin: routing, lifecycle, and streaming glue only. DuckDB does the heavy lifting.


🚀 Getting Started

0. Install DuckDB driver

Before anything else:

./install_duckdb.sh

This sets up the required DuckDB ADBC driver environment.


1. Run the Server

go run ./cmd/server

Defaults:

  • Address: localhost:32010
  • Database: in-memory DuckDB (:memory:)

2. Run a Client

You have two ways to exercise the system:

Native client

go run ./cmd/client

Example harness

go run ./example

Both issue queries and stream Arrow record batches back over Flight.


💻 CLI Usage

Porter also exposes a developer-facing CLI under cmd/porter. The built CLI is a small, composable tool for local workflows.

Build and use the Porter CLI

go build -o porter ./cmd/porter
./porter --help

Run the server

The default action is serve, so ./porter behaves the same as ./porter serve.

./porter serve --db :memory: --port 32010
# or simply
./porter --db :memory: --port 32010
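
The default-action behavior described above can be sketched as a small dispatch step. This is an illustration only, not Porter's actual code: `splitCommand` and `knownCommands` are hypothetical names, and the subcommand list is taken from the commands shown in this README.

```go
package main

import (
	"fmt"
	"os"
)

// knownCommands lists the subcommands named in this README (assumed complete).
var knownCommands = map[string]bool{
	"serve": true, "query": true, "repl": true, "load": true, "schema": true,
}

// splitCommand is a hypothetical helper. It returns the subcommand and the
// remaining arguments, falling back to "serve" when the first argument is
// not a known subcommand.
func splitCommand(args []string) (string, []string) {
	if len(args) > 0 && knownCommands[args[0]] {
		return args[0], args[1:]
	}
	return "serve", args
}

func main() {
	cmd, rest := splitCommand(os.Args[1:])
	fmt.Printf("command=%s args=%v\n", cmd, rest)
}
```

With this shape, `./porter --db :memory:` and `./porter serve --db :memory:` resolve to the same command and the same flag list.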

Execute a single query

./porter query "SELECT 1 AS value"

Start an interactive REPL

./porter repl

Load Parquet data

./porter load data.parquet

Inspect a table schema

./porter schema table_name

Environment variables

PORTER_DB and PORTER_PORT are supported as alternate configuration sources.
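
One common way to combine flags with environment variables is to use the environment value as the flag default. The sketch below assumes a precedence of flag, then environment, then built-in default; the README does not state Porter's actual precedence, and `parseConfig` and `envOr` are hypothetical helpers.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// envOr returns getenv(key), or def when the variable is unset or empty.
func envOr(getenv func(string) string, key, def string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return def
}

// parseConfig is a hypothetical helper. Assumed precedence:
// command-line flag > environment variable > built-in default.
// getenv is injected so the lookup can be tested without touching the
// real environment.
func parseConfig(args []string, getenv func(string) string) (db, port string, err error) {
	fs := flag.NewFlagSet("porter", flag.ContinueOnError)
	fs.StringVar(&db, "db", envOr(getenv, "PORTER_DB", ":memory:"), "DuckDB database path")
	fs.StringVar(&port, "port", envOr(getenv, "PORTER_PORT", "32010"), "listen port")
	err = fs.Parse(args)
	return db, port, err
}

func main() {
	db, port, err := parseConfig(os.Args[1:], os.Getenv)
	if err != nil {
		os.Exit(2)
	}
	fmt.Printf("db=%s port=%s\n", db, port)
}
```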


🧠 Execution Model

Porter supports two execution paths:

1. One-shot SQL

  • GetFlightInfoStatement → plan + handle
  • DoGetStatement → stream results

Handles on this path are ephemeral and expire automatically once their TTL elapses.
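
The two-call flow above can be sketched without any Flight dependencies. Everything here is a stand-in: `server`, `getFlightInfo`, `doGet`, and the 16-byte random handle format are assumptions for illustration, and TTL expiry is left out to keep the example short.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"errors"
	"fmt"
)

var errUnknownHandle = errors.New("unknown or expired handle")

// server is a hypothetical stand-in that maps opaque handles to pending SQL.
// It is not safe for concurrent use; a real server would guard the map.
type server struct {
	pending map[string]string
}

// getFlightInfo mirrors the first call: register the query and return a handle.
func (s *server) getFlightInfo(sql string) (string, error) {
	raw := make([]byte, 16)
	if _, err := rand.Read(raw); err != nil {
		return "", err
	}
	h := hex.EncodeToString(raw)
	s.pending[h] = sql
	return h, nil
}

// doGet mirrors the second call: redeem the handle and run the query via exec.
func (s *server) doGet(handle string, exec func(sql string) ([]string, error)) ([]string, error) {
	sql, ok := s.pending[handle]
	if !ok {
		return nil, errUnknownHandle
	}
	return exec(sql)
}

func main() {
	s := &server{pending: map[string]string{}}
	h, err := s.getFlightInfo("SELECT 1 AS value")
	if err != nil {
		panic(err)
	}
	// The executor here just echoes the SQL; a real one would stream batches.
	rows, err := s.doGet(h, func(sql string) ([]string, error) {
		return []string{"ran: " + sql}, nil
	})
	fmt.Println(rows, err)
}
```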


2. Prepared Statements

  • CreatePreparedStatement → persistent handle
  • DoPutPreparedStatementQuery → bind parameters
  • DoGetPreparedStatement → execute + stream
  • ClosePreparedStatement → cleanup

Parameter batches are real Arrow RecordBatches, reference-counted and safely transferred across execution boundaries.
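
The ownership rules for bound parameters can be illustrated with a toy reference count. The `batch` type below only imitates the Retain/Release convention used by Arrow records in Go; it is not an Arrow type, and `preparedStmt` is a hypothetical structure rather than Porter's own.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// batch stands in for an Arrow RecordBatch: a payload plus a reference count.
type batch struct {
	refs atomic.Int64
	rows []int64
}

func newBatch(rows ...int64) *batch {
	b := &batch{rows: rows}
	b.refs.Store(1) // the creator holds the first reference
	return b
}

func (b *batch) Retain()     { b.refs.Add(1) }
func (b *batch) Release()    { b.refs.Add(-1) }
func (b *batch) Refs() int64 { return b.refs.Load() }

// preparedStmt is hypothetical: SQL text plus the most recently bound parameters.
type preparedStmt struct {
	query  string
	params *batch
}

// Bind takes its own reference to p and drops the previously bound batch.
func (s *preparedStmt) Bind(p *batch) {
	p.Retain()
	if s.params != nil {
		s.params.Release()
	}
	s.params = p
}

// Close releases whatever parameters are still bound.
func (s *preparedStmt) Close() {
	if s.params != nil {
		s.params.Release()
		s.params = nil
	}
}

func main() {
	stmt := &preparedStmt{query: "SELECT ? + 1"}
	p := newBatch(41)
	stmt.Bind(p)
	p.Release() // the caller is done; the statement still holds a reference
	fmt.Println("refs while bound:", p.Refs())
	stmt.Close()
	fmt.Println("refs after close:", p.Refs())
}
```

Because the statement retains the batch on bind, the caller can release its own reference immediately without invalidating the bound parameters.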


🧬 Design Rules

Porter is built on strict invariants:

  • Flight SQL owns protocol routing (via fsql.NewFlightServer)
  • Porter only implements execution semantics
  • Handles are in-memory and TTL-bound
  • GC runs in the background (no inline eviction logic)
  • Arrow memory is explicitly retained/released

Nothing implicit. Nothing magical.
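
As a rough illustration of the "TTL-bound handles, background GC" rules, the sketch below keeps lookups read-only and leaves all eviction to a ticker-driven sweep. `handleStore` and its methods are hypothetical names, and the intervals in `main` are arbitrary.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

type entry struct {
	query   string
	expires time.Time
}

// handleStore is a hypothetical in-memory handle table with a fixed TTL.
type handleStore struct {
	mu      sync.Mutex
	ttl     time.Duration
	entries map[string]entry
}

func newHandleStore(ttl time.Duration) *handleStore {
	return &handleStore{ttl: ttl, entries: make(map[string]entry)}
}

// Put registers a handle and returns the time at which it expires.
func (s *handleStore) Put(handle, query string) time.Time {
	s.mu.Lock()
	defer s.mu.Unlock()
	deadline := time.Now().Add(s.ttl)
	s.entries[handle] = entry{query: query, expires: deadline}
	return deadline
}

// Get only reads. It never evicts, so in this sketch an expired handle
// stays visible until the next sweep.
func (s *handleStore) Get(handle string) (string, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	e, ok := s.entries[handle]
	return e.query, ok
}

// Sweep removes every entry whose deadline is at or before now.
func (s *handleStore) Sweep(now time.Time) int {
	s.mu.Lock()
	defer s.mu.Unlock()
	removed := 0
	for h, e := range s.entries {
		if !now.Before(e.expires) {
			delete(s.entries, h)
			removed++
		}
	}
	return removed
}

// RunGC calls Sweep on every tick until stop is closed.
func (s *handleStore) RunGC(interval time.Duration, stop <-chan struct{}) {
	t := time.NewTicker(interval)
	defer t.Stop()
	for {
		select {
		case now := <-t.C:
			s.Sweep(now)
		case <-stop:
			return
		}
	}
}

func main() {
	store := newHandleStore(50 * time.Millisecond)
	stop := make(chan struct{})
	go store.RunGC(10*time.Millisecond, stop)

	store.Put("h1", "SELECT 1")
	time.Sleep(150 * time.Millisecond)
	_, ok := store.Get("h1")
	fmt.Println("handle still present:", ok)
	close(stop)
}
```

Keeping eviction out of `Get` trades a short window of stale handles for a simpler lookup path. Whether Porter makes the same trade is not stated in this README.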


🌊 Streaming Core

All query results flow through a single pattern:

DuckDB → Arrow RecordReader → Channel → Flight StreamChunks

Records are retained per batch and released after network write completion. This keeps backpressure and memory usage predictable.
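
The retain-then-release pattern can be sketched with a bounded channel between a reader goroutine and a writer loop. The `batch` type is a stand-in for an Arrow record, `stream` is a hypothetical function, and the buffer size of 1 is an arbitrary choice to make the backpressure visible.

```go
package main

import (
	"errors"
	"fmt"
)

var errClientGone = errors.New("client went away")

// batch stands in for an Arrow RecordBatch with a reference count.
type batch struct {
	id   int
	refs int
}

func (b *batch) Retain()  { b.refs++ }
func (b *batch) Release() { b.refs-- }

// stream forwards batches to write over a bounded channel. Each batch is
// retained before it is queued and released after write has returned.
func stream(batches []*batch, write func(*batch) error) error {
	ch := make(chan *batch, 1) // small buffer: a slow writer blocks the reader
	done := make(chan struct{})

	go func() {
		defer close(ch)
		for _, b := range batches {
			b.Retain() // keep the batch alive while it is in flight
			select {
			case ch <- b:
			case <-done:
				b.Release() // never queued, so drop the reference here
				return
			}
		}
	}()

	var writeErr error
	for b := range ch {
		if writeErr == nil {
			if writeErr = write(b); writeErr != nil {
				close(done) // tell the reader to stop producing
			}
		}
		// After a failure the loop keeps draining so queued batches are released.
		b.Release()
	}
	return writeErr
}

func main() {
	batches := []*batch{{id: 1, refs: 1}, {id: 2, refs: 1}, {id: 3, refs: 1}}
	err := stream(batches, func(b *batch) error {
		if b.id == 2 {
			return errClientGone // simulate a failed network write
		}
		fmt.Println("wrote batch", b.id)
		return nil
	})
	fmt.Println("result:", err)
	for _, b := range batches {
		fmt.Println("batch", b.id, "refs", b.refs)
	}
}
```

Every batch ends with the same reference count it started with, on both the success and the failure path, which is the property that keeps memory usage predictable.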


🌐 Wire Contract

Porter supports both raw and Flight SQL-native flows:

Operation              Behavior
SQL Query              Raw SQL → FlightInfo → DoGet stream
Prepared Statements    Handle-based execution with binding
Schema Introspection   Lightweight probe execution

Both converge on the same execution engine.


🔌 WebSockets (Coming Soon)

A WebSocket transport layer is in progress.

Planned capabilities:

  • Bi-directional streaming query sessions
  • Low-latency Arrow batch push over WS frames
  • Browser-native Flight-like client
  • Session-based prepared statement lifecycle

Think of it as Flight SQL without the gRPC boundary.


πŸ›£οΈ Roadmap

  • Streaming Flight SQL execution
  • Prepared statements with parameter binding
  • TTL-based handle lifecycle
  • Background garbage collection
  • WebSocket transport layer
  • Session-aware execution context
  • Improved schema introspection (reduce probe execution)
  • Performance benchmarking suite

🧪 Philosophy

Porter is intentionally narrow:

No distributed illusions. No unnecessary abstraction layers. Just a fast path from query to stream.

It is a system designed for hacking, embedding, and evolving.


🤝 Contributing

If you've ever looked at a data system and thought:

"Why is this so complicated?"

you already understand what Porter is trying to fix.

Build it smaller. Make it clearer. Keep it moving.

Directories

Path                          Synopsis
cmd/client                    command
cmd/porter                    command
execution/adapter/flightsql   Package server provides a DuckDB-backed Arrow Flight SQL server built on top of the upstream flightsql routing layer (fsql.NewFlightServer).