# jp - JSON Stream Processor

Stream, filter, and transform JSON with Unix pipeline efficiency.

```sh
# Process gigabyte files efficiently
cat huge.json | jp '$.items[?@.price > 100]' | less

# Monitor live logs in real-time
tail -f app.log | jp '$..[?@.level == "error"]' '$.message'

# Compose with Unix tools using JPV format
jp -out jpv < data.json | grep -v password | sed 's/test/prod/' | jp
```
## What is jp?
jp is a command-line JSON processor designed for Unix pipelines. It processes JSON as a stream with immediate output, making it ideal for large files, infinite streams, and real-time data processing.
Unlike tools that load entire files into memory, jp starts outputting results immediately and works seamlessly with other Unix tools like less, grep, head, and tail. Many operations use constant memory, especially in default streaming mode.
## Why jp?

### Pipeline-First Design
jp follows the Unix philosophy: do one thing well and compose with other tools.
- **Immediate output**: Start seeing results right away, don't wait for the full file
- **Memory-efficient streaming**: Many operations use constant memory, especially simple queries and transforms
- **Streaming architecture**: Perfect for `| less`, `| head`, `| grep` and other pipelines
- **Composable**: Chain multiple transforms in a single `jp` call, or mix with standard Unix tools
### Powerful Features

- **Full JSONPath support**: Complete RFC 9535 implementation
- **Unix-like operations**: `head`, `tail`, `grep` patterns for JSON data
- **Format conversion**: JSON, JSON Lines, CSV, and JPV (JSON Path-Value)
- **JPV format**: Makes JSON grep-able and sed-able like plain text
## Table of Contents
- Installation
- Quick Start
- Key Features
- Common Use Cases
- Pipeline Examples
- The JPV Format
- Advanced Features
- Input/Output Formats
- Command Reference
- JSONPath Compliance
- The jsonstream Package
- Development Status
## Installation

### Homebrew (macOS/Linux)

Recommended for macOS users - automatically handles code signing and removes quarantine attributes:

```sh
brew install arnodel/tap/jp
```

### Nix (NixOS/Linux/macOS)

```sh
nix-env -iA nur.repos.arnodel.jp
```

Or with flakes:

```sh
nix profile install github:arnodel/nur-packages#jp
```
### Pre-built Binaries
Download pre-built binaries for Linux, macOS, or Windows from the releases page.
Available platforms:
- Linux (amd64, arm64)
- macOS (amd64, arm64)
- Windows (amd64)
**Note for macOS users**: If you download a pre-built binary directly, macOS Gatekeeper may block it because the binary is not code-signed. You can either:

- Use Homebrew installation (recommended - handles this automatically)
- Remove the quarantine attribute: `xattr -d com.apple.quarantine jp`
- Allow it in System Preferences → Security & Privacy
### Using Go

```sh
go install github.com/arnodel/jsonstream/cmd/jp@latest
```

**Note**: This requires Go 1.23 or later. The jp binary will be installed to `$GOPATH/bin` (or `$HOME/go/bin` by default). Make sure this directory is in your `$PATH`:

```sh
# Add to ~/.bashrc, ~/.zshrc, or equivalent
export PATH="$PATH:$(go env GOPATH)/bin"
```
## Quick Start

### Pretty-Print JSON

```sh
# Format JSON with colors (when outputting to terminal)
cat data.json | jp

# Browse large files with immediate output
cat huge.json | jp | less
```
### Extract Data with JSONPath

```sh
# Get all names
echo '{"users":[{"name":"Alice"},{"name":"Bob"}]}' | jp '$.users[*].name'
# Output:
# "Alice"
# "Bob"

# Filter by condition
echo '[{"price":50},{"price":150}]' | jp '$[?@.price > 100]'
# Output:
# {"price": 150}
```
### Use in Pipelines

```sh
# Get first 10 items (immediate output, stops reading after 10)
jp '$.items[:10]' < data.json

# Get last 5 items (memory-efficient sliding window)
jp '$.items[-5:]' < data.json

# Monitor live log stream
tail -f app.log | jp '$..[?@.level == "error"]'
```
### Convert Formats

```sh
# JSON array to JSON Lines (one item per line)
echo '[1,2,3]' | jp -json-lines split
# Output:
# 1
# 2
# 3

# CSV to JSON
cat data.csv | jp -in csv-with-header
```
## Key Features

### JSONPath Queries
Full RFC 9535 compliant JSONPath implementation with streaming optimization.
```sh
# Basic selectors
jp '$.name'        # Get field
jp '$.items[0]'    # First item
jp '$.items[-1]'   # Last item
jp '$.items[2:5]'  # Slice (items 2, 3, 4)
jp '$.items[*]'    # All items

# Recursive descent
jp '$..email'      # All email fields at any depth

# Filters
jp '$[?@.price < 100]'            # Price less than 100
jp '$.users[?@.active == true]'   # Active users
jp '$.logs[?@.level == "error"]'  # Error logs
```
**Streaming optimization**: By default, jp optimizes for streaming performance. Use `-jsonpath-strict` for exact RFC 9535 ordering (less streaming-friendly).
### Unix-Like Operations

Use jp like `head`, `tail`, and `grep` but for JSON:
```sh
# HEAD: Get first N items
jp '$.items[:10]' < data.json     # JSONPath approach
jp split < array.json | head -10  # Unix pipe approach

# TAIL: Get last N items (memory-efficient sliding window)
jp '$.items[-10:]' < data.json

# GREP: Filter data
jp '$.users[?@.email]' < users.json        # JSONPath filter
jp -out jpv < data.json | grep email | jp  # grep approach (see JPV below)
```
Most common operations stream efficiently with constant or limited memory.
### Format Conversion
Convert between JSON, JSON Lines, CSV, and JPV formats:
```sh
# JSON array to JSON Lines
cat array.json | jp -json-lines split

# CSV to JSON
cat data.csv | jp -in csv-with-header

# Extract values, then emit as JSON Lines (one record per line)
jp '$.users[*]' < data.json | jp -json-lines
```
Supported formats:

- **JSON**: Standard JSON with pretty-printing
- **JSON Lines**: Newline-delimited JSON (one value per line)
- **CSV**: Comma-separated values with header support
- **JPV**: JSON Path-Value format (see dedicated section below)
## Common Use Cases

### Process Large Files
```sh
# Browse huge file with immediate output (works with any size file)
cat 10GB-file.json | jp | less

# Get first 100 items from huge file (reads only what's needed)
jp '$.items[:100]' < 10GB-file.json

# Extract specific data from large file
jp '$.transactions[?@.amount > 1000].id' < huge-file.json | less
```
### Monitor Real-Time Streams
```sh
# Watch logs in real-time
tail -f application.log | jp '$..[?@.level == "error"]'

# Pretty-print streaming API
curl -N https://stream-api.example.com/events | jp

# Process infinite stream (Ctrl+C to stop)
yes '{"timestamp": "2024-01-01", "value": 42}' | jp | head -20
```
### Extract and Transform Data
```sh
# Extract all email addresses and collect them
jp '$..email' join < data.json

# Get names from nested structure
jp '$.departments[*].employees[*].name' < org.json

# Filter and extract
jp '$.products[?@.inStock == true].name' < inventory.json
```
### Build Complex Pipelines
```sh
# Multi-stage transformation (chain transforms in one jp call)
jp '$.users[*]' split '$.address.city' < data.json | sort | uniq

# Combine with Unix tools
jp -json-lines '$.items[*]' split < data.json | grep -i 'important' | wc -l

# Chain multiple filters
jp '$.logs[*]' split '$[?@.level == "error"]' '$[?@.timestamp > "2024-01-01"]' < logs.json
```
**Why chain transforms in a single jp call?** Using `jp T1 T2` instead of `jp T1 | jp T2` is more efficient: it avoids serialization/deserialization overhead and pipe I/O. The transforms process the token stream directly in memory.
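As a concrete comparison, the two commands below should produce the same output, but the chained form runs a single process (`data.json` is a placeholder file name):

```sh
# Chained: one process, tokens flow directly between transforms
jp '$.users[*]' split '$.email' < data.json

# Piped: same result, but the intermediate stream is serialized,
# pushed through a pipe, and re-parsed by the second jp
jp '$.users[*]' split < data.json | jp '$.email'
```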
## Pipeline Examples
jp shines in Unix pipelines thanks to its streaming architecture.
### Why Streaming Matters
Traditional JSON tools often load entire files before producing output. jp's streaming approach means:
- Immediate feedback: See results right away, don't wait for the full file
- Works with infinite streams: Process data that never ends (logs, APIs)
- Memory efficiency: Many operations use constant or limited memory
- Pipeline efficiency: Each tool in the chain starts working immediately
- Early termination: `| head -10` stops reading after 10 items
### Complex Multi-Stage Examples
```sh
# Extract, transform, and analyze
jp '$.transactions[?@.amount > 100]' split '$.user.email' < data.json | \
  sort | uniq -c | sort -nr | head -10

# Sanitize config with JPV
jp -out jpv < config.json | grep -v 'password' | grep -v 'secret' | jp > sanitized.json
```
## The JPV Format
JPV (JSON Path-Value) is a line-based format that makes JSON compatible with classic Unix text processing tools like grep, sed, and awk.
### What is JPV?
JPV represents JSON as lines of "path = value" pairs using JSONPath syntax:
```sh
# Original JSON
echo '{"name":"Alice","email":"alice@example.com","age":30}' | jp -out jpv

# JPV output
$.name = "Alice"
$.email = "alice@example.com"
$.age = 30
```
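Nested documents get one line per value, addressed with ordinary JSONPath segments. A small sketch (the exact rendering of nested paths is inferred from the flat example above; `jp -help-output` has the authoritative description):

```sh
echo '{"users":[{"name":"Alice","tags":["admin"]}]}' | jp -out jpv
# Illustrative output:
# $.users[0].name = "Alice"
# $.users[0].tags[0] = "admin"
```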
### Why JPV?
JPV makes JSON line-oriented, unlocking the full power of Unix text tools:
- grep-friendly: Search for fields or values like plain text
- sed-able: Edit values with familiar text tools
- awk-compatible: Process with awk scripts
- Reversible: Convert JSON → JPV → JSON without data loss (see the round-trip check below)
- Subset property: Any subset of JPV lines is still valid JPV
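The reversibility is easy to check at the shell, since a full JPV stream converts back to the original document:

```sh
# JSON -> JPV -> JSON: the output is jp's normal pretty-printing of the input
echo '{"name":"Alice","age":30}' | jp -out jpv | jp
```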
### JPV vs GRON
JPV is similar to gron but uses standard JSONPath syntax instead of a custom format.
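Side by side on the same input (the gron lines reflect gron's documented `json.path = value;` style; verify against your gron version):

```sh
echo '{"name":"Alice"}' | gron
# json = {};
# json.name = "Alice";

echo '{"name":"Alice"}' | jp -out jpv
# $.name = "Alice"
```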
### Basic JPV Workflow
```sh
# 1. Convert JSON to JPV
jp -out jpv < data.json

# 2. Process with Unix tools
jp -out jpv < data.json | grep email

# 3. Convert back to JSON
jp -out jpv < data.json | grep email | jp
```
### JPV Pipeline Examples

#### Find All Email Fields
```sh
# Search for email fields anywhere in the JSON
jp -out jpv < data.json | grep '\.email'

# Find specific email domains
jp -out jpv < users.json | grep 'email.*gmail.com' | jp
```
Output:

```json
{
  "users": [
    {
      "email": "alice@gmail.com"
    },
    {
      "email": "bob@gmail.com"
    }
  ]
}
```
#### Remove Sensitive Fields
```sh
# Strip passwords and secrets
jp -out jpv < config.json | grep -v 'password' | grep -v 'secret' | jp

# Remove entire sections
jp -out jpv < data.json | grep -v '\.credentials' | jp
```
Before:

```json
{
  "user": "alice",
  "password": "secret123",
  "api_key": "xyz"
}
```

After:

```json
{
  "user": "alice",
  "api_key": "xyz"
}
```
#### Edit Values with sed
```sh
# Replace domain in all email addresses
jp -out jpv < users.json | sed 's/@oldcorp\.com/@newcorp.com/g' | jp

# Update environment in config
jp -out jpv < config.json | sed 's/environment.*=.*"dev"/environment = "prod"/' | jp

# Update all port numbers
jp -out jpv < services.json | sed 's/port = 8080/port = 9090/g' | jp
```
#### Complex Multi-Tool Pipelines
```sh
# Find, filter, and transform: keep only user-related fields, drop
# passwords, update status, then convert back to JSON
jp -out jpv < data.json |
  grep 'users' |
  grep -v 'password' |
  sed 's/status = "active"/status = "verified"/' |
  jp

# Extract and analyze with awk: sum all count fields
jp -out jpv < metrics.json |
  grep 'count' |
  awk -F' = ' '{sum += $2} END {print sum}'

# Conditional editing: set all timeouts to 5000
jp -out jpv < config.json |
  awk '/timeout = / {gsub(/[0-9]+/, "5000")} {print}' |
  jp
```
#### Diff JSON Files
```sh
# Compare two JSON files as text
diff <(jp -out jpv < old.json | sort) <(jp -out jpv < new.json | sort)

# Find what changed
jp -out jpv < old.json | sort > old.jpv
jp -out jpv < new.json | sort > new.jpv
diff -u old.jpv new.jpv
```
### The Subset Property
JPV has a unique property: any subset of JPV lines (preserving order) is still valid JPV that can be converted back to JSON.
This means you can:
- Delete any lines (removes those fields from JSON)
- Extract matching lines (creates JSON with only those fields)
Note: You must preserve the original order of lines. Reordering JPV lines may produce invalid JSON structure.
```sh
# Original JSON
# {"name":"Alice","age":30,"city":"Boston","country":"USA"}

# Convert to JPV and remove some lines
$ echo '{"name":"Alice","age":30,"city":"Boston","country":"USA"}' | \
    jp -out jpv | grep -v age | grep -v country | jp
{
  "name": "Alice",
  "city": "Boston"
}
```
This is what makes JPV so powerful for Unix-style text processing of JSON data.
## Advanced Features

### Transform Chaining
Chain multiple transforms in a single jp call for efficiency:
```sh
# Each transform processes the output of the previous one
jp split '$[?@.age > 18]' '$.name' join < data.json

# Limit depth, split, and extract
jp depth=1 split '$.summary' < data.json
```
Available transforms:

- `split`: Split array into stream of values
- `join`: Join stream of values into array
- `depth=N`: Truncate output at depth N
- `trace`: Log stream to stderr (debugging)
**Note**: Chaining multiple transforms in one `jp` call is more efficient than piping through multiple `jp` processes, as it avoids serialization overhead.
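For example, `depth=N` gives a quick structural overview of a deep document, and `trace` can be dropped between two transforms to inspect the intermediate stream (a sketch; the exact truncated rendering is described by `jp -help-transforms`):

```sh
# Outline a deeply nested file by truncating below depth 1
jp depth=1 < deeply-nested.json

# Log the stream between split and the filter to stderr for debugging
jp split trace '$[?@.active == true]' < data.json
```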
### Streaming Behavior
jp processes JSON as a stream of tokens, not as complete in-memory objects. This enables efficient processing of large files and infinite streams.
Streaming examples:

```sh
# Get first 100 items from huge file (stops reading after 100)
jp '$.items[:100]' < 10GB-file.json

# Process infinite input (Ctrl+C to stop)
yes '{"x": 1}' | jp | head -20

# Start outputting immediately, no waiting
cat huge.json | jp | less
```
Memory-efficient operations:

- `$.items[*]`: Streams one item at a time (constant memory)
- `$.items[:N]`: Stops after N items (constant memory)
- `$.items[-N:]`: Sliding window (N items in memory)
- `split`: Converts array to stream (constant memory)
- `join`: Wraps stream in brackets (no buffering)
- Simple filters like `$[?@.price > 100]`: Process one item at a time
Operations that may use more memory:

- Strict mode (`-jsonpath-strict`) with descendant queries: buffers collections for RFC ordering
- Complex filters with multiple paths or function calls: may need to buffer values
- Queries with negative indices without lookahead optimization
- Very deeply nested descendant queries (`$..`)
For maximum memory efficiency, use default (non-strict) mode and prefer simple queries.
### Strict Mode vs Default Mode

By default, jp optimizes for streaming. Use `-jsonpath-strict` for exact RFC 9535 ordering.
**Default mode** (streaming-optimized):

- Results emitted in document order as encountered
- Descendant queries (`$..`) process items immediately
- Constant memory for most operations
- Best for: large files, infinite streams, real-time data
**Strict mode** (`-jsonpath-strict`):
- Results emitted in RFC 9535 order
- Descendant queries process level-by-level
- May buffer collections
- Best for: RFC compliance, reproducible ordering, small files
Example with `$..[*]` on `{"a": {"b": 1}, "c": 2}`:

- Default: `{"b": 1}`, `1`, `2` (document order)
- Strict: `{"b": 1}`, `2`, `1` (per-level order)
Use `-jsonpath-strict` when:
- You need RFC 9535 compliance for interoperability
- Output order matters for your use case
- Processing small files where memory isn't a concern
Use the default (no `-jsonpath-strict`) when:
- You want maximum streaming performance
- Processing large files or infinite streams
- Document order is acceptable
## Input/Output Formats

### Input Formats (`-in FORMAT`)
| Format | Description | Example |
|---|---|---|
| `auto` (default) | Auto-detect format | `jp < data.json` |
| `json` | Standard JSON, supports JSON Lines | `jp -in json < data.json` |
| `jpv` | JSON Path-Value format | `jp -in jpv < data.jpv` |
| `csv` | CSV records as JSON arrays | `jp -in csv < data.csv` |
| `csvh` or `csv-with-header` | CSV with header row | `jp -in csvh < data.csv` |
CSV options:

- `-csv-header NAMES`: Specify field names for CSV (comma-separated)
- CSV values: numbers, booleans (`true`/`false`), and `null` are parsed
- Empty CSV fields become `null`
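A sketch of `-csv-header` on headerless input (whether the supplied names turn records into objects or simply label the `csv` arrays is an assumption here; `jp -help-input` is authoritative):

```sh
# Supply field names for a headerless CSV stream
printf 'Alice,30\nBob,25\n' | jp -in csv -csv-header name,age
```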
### Output Formats (`-out FORMAT`)
| Format | Description | Example |
|---|---|---|
| `json` (default) | Pretty-printed JSON | `jp < data.json` |
| `jpv` | JSON Path-Value format | `jp -out jpv < data.json` |
JSON output options:

- `-json-lines` or `-json-compact`: Output JSON Lines (one value per line)
- `-json-indent N`: Indentation level (default: 2)
- `-json-compact-width N`: Max width for inline arrays/objects (default: 60)
- `-color MODE`: Color output (`auto`, `always`, `never`)
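A couple of combinations of these flags (the file name and widths are arbitrary):

```sh
# 4-space indentation; keep arrays/objects up to 100 characters inline
jp -json-indent 4 -json-compact-width 100 < data.json

# Compact output, one value per line (equivalent to -json-lines)
jp -json-compact '$.items[*]' < data.json
```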
JPV output options:

- `-jpv-quote-keys`: Always quote keys in brackets
## Command Reference
jp has comprehensive built-in help:
```sh
jp -h                 # Main help
jp -help-input        # Detailed input format help
jp -help-output       # Detailed output format help
jp -help-transforms   # Transform help with examples
jp -help-cookbook     # Unix-like usage patterns (head/tail/grep)
```
Common flags:

```
-in FORMAT         # Input format: json, jpv, csv, csvh, auto (default: auto)
-out FORMAT        # Output format: json, jpv (default: json)
-json-lines        # Output JSON Lines format
-json-indent N     # Indentation level (default: 2)
-jsonpath-strict   # Strict RFC 9535 mode
-color MODE        # Color output: auto, always, never
```
Example usage:

```sh
jp [options] [transforms...] < input.json

# Multiple transforms
jp '$.users[*]' split '$.email' < data.json

# With options
jp -json-lines -color never '$.items[*]' < data.json
```
## JSONPath Compliance
jp implements the full RFC 9535 JSONPath specification. The implementation passes all tests from the official JSONPath compliance test suite (as of 2025-11-23).
By default, jp optimizes for streaming performance while maintaining compliance with query semantics. Use `-jsonpath-strict` for exact RFC 9535 ordering requirements.
## The jsonstream Package
jp is built on the jsonstream Go package, which provides:
- Token-based streaming: Process JSON without loading full documents
- JSONPath implementation: Full RFC 9535 support
- Format encoders/decoders: JSON, JPV, CSV
- Transform pipeline: Composable stream transformations
If you're building Go applications that need streaming JSON processing, check out the package documentation.
## Development Status
jp is in active development. The core features are functional and the JSONPath implementation is complete, but I work on it in my spare time so progress is incremental.
Current status:
- ✅ Full RFC 9535 JSONPath implementation
- ✅ Streaming architecture (many operations use constant memory)
- ✅ JSON, JSON Lines, CSV, and JPV formats
- ✅ Unix-like operations (head/tail/grep patterns)
- ✅ Transform chaining
- ⏳ Additional transforms and features planned
Feedback and contributions welcome! Please open an issue if you find bugs or have feature requests.
## License
See LICENSE file for details.