pgcopy

command module

v0.4.0 Latest Latest Go to latest Published: Aug 8, 2025 License: MIT Imports: 3 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/koltyakov/pgcopy

Links

Open Source Insights

README ¶

pgcopy

pgcopy is a high-performance CLI tool for efficiently copying data between PostgreSQL databases with identical schemas. It's designed for data-only migrations and provides parallel processing capabilities for optimal performance.

pgcopy in action

Safety & Overwrite Semantics: pgcopy truncates destination tables and replaces their data. Review the short guide before running: see Safety & Overwrite Semantics.

When to Use pgcopy

Use Cases

pgcopy is specifically designed for casual data synchronization scenarios where you need to copy data between PostgreSQL databases with identical schemas. It's particularly valuable when:

pg_restore is too slow or doesn't work for your use case
You don't have superuser permissions but need to handle foreign key constraints
You're dealing with complex foreign key relationships including circular dependencies
You need parallel processing for better performance than traditional dump/restore
You want data-only synchronization without schema changes
You're working in development/staging environments and need frequent data refreshes

What pgcopy is NOT

Incremental sync or CDC (it overwrites destination data)
Real-time replication (use PostgreSQL replication for production)
Schema migration tool (schemas must already match)
Backup tool (have proper backups before using)

Why Choose pgcopy Over Alternatives

vs. pg_dump/pg_restore

Performance: Parallel processing often outperforms serial dump/restore operations
Foreign Key Handling: Automatically manages complex FK constraints without manual intervention
Flexibility: Table-level filtering capabilities
Progress Tracking: Real-time progress monitoring with detailed logging

vs. ETL Tools

Simplicity: No complex configuration or learning curve
PostgreSQL-Optimized: Built specifically for PostgreSQL-to-PostgreSQL transfers
Constraint-Aware: Understands and preserves PostgreSQL foreign key relationships

vs. Manual SQL Scripts

Automated FK Management: No need to manually drop/recreate constraints
Error Recovery: Built-in error handling and constraint restoration
Parallel Execution: Concurrent table processing for better performance

Quick Start

# 1) Preview
pgcopy copy --source "postgres://user:pass@src:5432/db" --dest "postgres://user:pass@dst:5432/db" --dry-run
# 2) Run with prompt
pgcopy copy --source "postgres://user:pass@src:5432/db" --dest "postgres://user:pass@dst:5432/db"
# 3) High-throughput
pgcopy copy --source "postgres://..." --dest "postgres://..." --parallel 8 --copy-pipe --compress
# 4) Progress UI
# (progress UI example removed)
# 5) Web dashboard
pgcopy copy --source "postgres://..." --dest "postgres://..." --output web

Common Scenarios

Scenario	Command
Refresh staging from prod	`pgcopy copy --source "postgres://user:pass@prod:5432/db" --dest "postgres://user:pass@staging:5432/db"`
Migrate hosts (data-only)	`pgcopy copy --source "postgres://user:pass@old:5432/db" --dest "postgres://user:pass@new:5432/db"`
Copy a subset of tables	`pgcopy copy --source "..." --dest "..." --include "public.users,public.orders,public.products"`
Speed up over WAN	`pgcopy copy --source "..." --dest "..." --parallel 8 --copy-pipe --compress`
Web dashboard	`pgcopy copy --source "..." --dest "..." --output web`

Features

High Performance: Parallel table copying with configurable workers
Batch Processing: Configurable batch sizes for optimal memory usage
Progress Tracking: Real-time progress monitoring with visual progress bar enabled by default (stays fixed at top while logs scroll below)
Flexible Configuration: Support for connection strings, config files, and command-line options
Table Filtering: Include/exclude specific tables from the copy operation
Data Safety: Comprehensive validation and confirmation dialogs
Dry Run Mode: Preview what will be copied without actually copying data
Transaction Safety: Uses transactions to ensure data consistency
Advanced Foreign Key Handling: Automatically detects and manages foreign key constraints, including circular dependencies
Streaming COPY Pipeline: Direct source→destination streaming via PostgreSQL COPY for maximum throughput (--copy-pipe)
In-flight Compression: Optional gzip compression of the COPY stream to reduce network I/O (--compress, requires --copy-pipe)

Installation

Download prebuilt binaries

Grab the latest release for your OS from GitHub Releases:

https://github.com/koltyakov/pgcopy/releases

After download:

macOS/Linux: chmod +x pgcopy && move to a directory on your PATH (e.g., /usr/local/bin)
Windows: use pgcopy.exe or add its folder to PATH

From Source

git clone https://github.com/koltyakov/pgcopy.git
cd pgcopy
make build

The binary will be available in the bin/ directory.

Install to System

make install

This will install pgcopy to /usr/local/bin/.

Usage

⚠️ Data Safety First

Before using pgcopy, understand that it will completely overwrite data in the destination database:

Backup your destination database before running pgcopy
Verify your connection strings - double-check source and destination
Use --dry-run first to preview what will be copied
Test with non-production data before running on production systems

Basic Usage

Copy data from one database to another:

pgcopy copy \
  --source "postgres://user:password@source-host:5432/source_db" \
  --dest "postgres://user:password@dest-host:5432/dest_db"

Using Configuration Files

For convenience, you can save connection strings to files and reference them:

echo "postgres://user:password@source-host:5432/source_db" > source.conn
echo "postgres://user:password@dest-host:5432/dest_db" > dest.conn

# Use the connection strings directly
pgcopy copy --source "$(cat source.conn)" --dest "$(cat dest.conn)"

Advanced Options

pgcopy copy \
  --source "postgres://user:pass@localhost:5432/sourcedb" \
  --dest "postgres://user:pass@localhost:5433/destdb" \
  --parallel 8 \
  --batch-size 5000 \
  --copy-pipe \
  --compress \
  --exclude "logs,temp_*,*_cache" \
  --include "users,orders,products"

For network-bound transfers, combine --copy-pipe with --compress to significantly boost throughput (commonly multiple times faster on high-latency links).

Table Filtering

See Table filtering with wildcards for patterns and examples.

Output Modes

A visual progress bar can be enabled during copy operations. By default, plain output mode is used (suitable for CI/headless environments):

# Plain output mode (default)
pgcopy copy \
  --source "postgres://user:pass@localhost:5432/sourcedb" \
  --dest "postgres://user:pass@localhost:5433/destdb"

# Interactive mode with live table progress
pgcopy copy \
  --source "postgres://user:pass@localhost:5432/sourcedb" \
  --dest "postgres://user:pass@localhost:5433/destdb" \
  --output interactive

# Web dashboard
pgcopy copy \
  --source "postgres://user:pass@localhost:5432/sourcedb" \
  --dest "postgres://user:pass@localhost:5433/destdb" \
  --output web

Web UI dashboard _{Web mode launches a local dashboard for real-time monitoring.}

The progress bar stays fixed at the top of the terminal while operational log messages scroll underneath. In web mode, a local dashboard is launched showing real-time status.

Streaming COPY pipeline and compression

--copy-pipe streams table data directly from the source to the destination using PostgreSQL COPY in binary mode. Adding --compress gzip-compresses the stream on the fly to cut network bandwidth. This combo can deliver dramatic speedups in network-bound scenarios (often several times faster, up to ~10x on slow links), with minimal memory footprint.

Key points:

Use --copy-pipe for maximum throughput without intermediate row buffering
Add --compress to reduce bandwidth usage (requires --copy-pipe)
Works with parallel workers across multiple tables
Per-row progress granularity is limited during streaming; progress is reported at table completion
Foreign key handling and safety checks still apply as usual

Examples:

# Stream with compression (recommended over WAN)
pgcopy copy \
  --source "postgres://user:pass@source:5432/db" \
  --dest "postgres://user:pass@dest:5432/db" \
  --parallel 8 \
  --copy-pipe \
  --compress

# Stream without compression (good on fast local networks)
pgcopy copy \
  --source "postgres://user:pass@source:5432/db" \
  --dest "postgres://user:pass@dest:5432/db" \
  --parallel 8 \
  --copy-pipe

Note: --compress can only be used together with --copy-pipe.

List Tables

Before copying, you can list all tables in a database:

pgcopy list --source "postgres://user:password@host:5432/database"

Dry Run

Preview what will be copied without actually copying:

pgcopy copy \
  --source "postgres://user:pass@localhost:5432/sourcedb" \
  --dest "postgres://user:pass@localhost:5433/destdb" \
  --dry-run

Safety & Overwrite Semantics

For the confirmation dialog, skip flags, recovery notes, and best practices, see Safety & Overwrite Semantics.

See the automation notes in Safety & Overwrite Semantics for --skip-backup usage.

Commands

`copy`

Copy data from source to destination database.

Flags:

--source, -s: Source database connection string
--dest, -d: Destination database connection string
--parallel, -p: Number of parallel workers (default: 4)
--batch-size: Batch size for data copying (default: 1000)
--exclude: Tables to exclude from copying (comma-separated, supports wildcards: temp_*,*_logs)
--include: Tables to include in copying (comma-separated, supports wildcards: user_*,*_data)
--dry-run: Show what would be copied without actually copying
--skip-backup: Skip confirmation dialog for data overwrite (use with caution)
--output, -o: Output mode: 'plain' (minimal output, default), 'progress' (progress bar), 'interactive' (live table progress), 'web' (web dashboard)
--copy-pipe: Use streaming COPY pipeline (source→dest) instead of row fetch + insert
--compress: Gzip-compress streaming COPY pipeline (requires --copy-pipe)

`list`

List tables in a database with row counts and sizes.

Flags:

--source, -s: Source database connection string
--schema: Specific schema to list (optional)

`version`

Display version information.

Configuration

Connection Strings

PostgreSQL connection strings follow the standard format:

postgres://username:password@hostname:port/database_name?sslmode=disable

Environment Variables

You can use environment variables in your connection strings:

export PGPASSWORD=mypassword
pgcopy copy \
  --source "postgres://user@localhost:5432/sourcedb" \
  --dest "postgres://user@localhost:5433/destdb"

Foreign Key Management

pgcopy's Superpower: Non-Superuser FK Handling

One of pgcopy's key advantages is its ability to handle complex foreign key constraints without requiring superuser privileges. This addresses a common pain point where:

Cloud databases often don't provide superuser access (AWS RDS, Google Cloud SQL, Azure Database)
Managed environments restrict administrative privileges
pg_dump/pg_restore fails with permission errors on constraint operations
Manual FK management becomes error-prone with circular dependencies

The Foreign Key Challenge

When copying data between PostgreSQL databases with foreign key constraints, you typically face these issues:

Constraint Violations: Inserting data in wrong order causes FK violations
Circular Dependencies: Tables that reference each other create chicken-and-egg problems
Permission Requirements: Standard solutions often require superuser access
Manual Complexity: Hand-managing constraints is tedious and error-prone

pgcopy's Solution

pgcopy solves these problems with intelligent constraint management:

┌─────────────────────────────────────────────────────────────┐
│ 1. Scan all tables for foreign key constraints              │
│ 2. Try replica mode (if superuser available)                │
│ 3. Fall back to smart FK dropping (non-superuser)           │
│ 4. Copy data in parallel (constraints disabled)             │
│ 5. Restore all constraints with original definitions        │
└─────────────────────────────────────────────────────────────┘

Automatic Detection and Handling

pgcopy automatically detects and handles foreign key constraints in your database, including complex scenarios with circular dependencies.

See Foreign key handling for modes, caveats, and recovery.

More details in Foreign key handling.

Real-World Example

# This works even without superuser privileges!
pgcopy copy \
  --source "postgres://user:pass@prod-rds.amazonaws.com:5432/myapp" \
  --dest "postgres://user:pass@staging-rds.amazonaws.com:5432/myapp"

Output:

Starting data copy operation...
20:53:26 INFO Detecting foreign key constraints...
20:53:26 DONE Found 5 foreign key constraints
20:53:26 DONE Using replica mode for foreign key handling
20:53:26 INFO Copying table public.ContainerHistory (84.8K rows)
...
20:53:46 DONE Completed copying public.ContainerHistory (84.8K rows) in 20s
20:53:48 DONE Completed copying public.ContainerBarcodes (112.9K rows) in 22s

╔══════════════════════════════════════════════════════════════╗
║                        📊 COPY STATISTICS                    ║
╠══════════════════════════════════════════════════════════════╣
║  📋 Tables Processed:                                     9  ║
║  📊 Rows Copied:                                     439025  ║
║  ⏱️  Duration:                                          22s  ║
║  🚀 Average Speed:                             19322 rows/s  ║
╚══════════════════════════════════════════════════════════════╝

Performance Considerations

When pgcopy Outperforms Traditional Methods

pgcopy is specifically optimized for scenarios where traditional PostgreSQL tools fall short:

vs. pg_dump/pg_restore Performance

Scenario	pg_dump/pg_restore	pgcopy	Advantage
Large tables (>1M rows)	Serial processing	Parallel batching	3-5x faster
Many small tables	Overhead per table	Parallel workers	2-4x faster
Complex FK constraints	Manual intervention	Automatic handling	Much easier
Cloud/managed databases	Permission issues	Works without superuser	Actually works

Performance Characteristics

Parallel Processing: Multiple tables copied simultaneously
Batch Optimization: Configurable batch sizes for memory efficiency
Connection Pooling: Optimized database connections
Progress Tracking: Real-time feedback without performance impact
Streaming Pipeline: COPY-based streaming eliminates per-row overhead and minimizes round-trips
In-flight Compression: Gzip can greatly reduce network transfer time on bandwidth-limited links

Output feature matrix

Feature	raw (plain)	interactive	web
Minimal logs	✓	✓	✓
Overall progress bar		✓	✓
Per-table live status		✓	✓
Real-time dashboard			✓
Auto-open UI			✓
Good for CI	✓

Advanced docs

Optimal Settings

Parallel Workers: Start with 4-8 workers, adjust based on your system resources
Batch Size: 1000-5000 rows per batch usually provides good performance
Connection Pooling: The tool automatically configures connection pools

Large Databases

For very large databases:

Use higher parallel worker counts (8-16)
Increase batch size (5000-10000)
Ensure both source and destination databases have adequate resources
Consider network bandwidth between source and destination

Memory Usage

Memory usage scales with:

Number of parallel workers
Batch size
Number of columns in tables

Monitor memory usage and adjust settings accordingly.

Examples

Safe Development Workflow

# 1. Always start with dry-run to verify what will be copied
pgcopy copy \
  --source "postgres://user:pass@prod-db:5432/myapp" \
  --dest "postgres://user:pass@staging-db:5432/myapp" \
  --dry-run

# 2. Run with confirmation dialog (default)
pgcopy copy \
  --source "postgres://user:pass@prod-db:5432/myapp" \
  --dest "postgres://user:pass@staging-db:5432/myapp"

# 3. For automation/CI, skip confirmation
pgcopy copy \
  --source "postgres://user:pass@prod-db:5432/myapp" \
  --dest "postgres://user:pass@staging-db:5432/myapp" \
  --skip-backup

Basic Copy

pgcopy copy \
  --source "postgres://user:pass@prod-db:5432/myapp" \
  --dest "postgres://user:pass@staging-db:5432/myapp"

High-Performance Copy

pgcopy copy \
  --source "postgres://user:pass@source:5432/db" \
  --dest "postgres://user:pass@dest:5432/db" \
  --parallel 16 \
  --batch-size 10000 \
  --copy-pipe \
  --compress

Selective Copy

pgcopy copy \
  --source "postgres://user:pass@source:5432/db" \
  --dest "postgres://user:pass@dest:5432/db" \
  --include "users,orders,products,inventory"

Exclude System Tables

pgcopy copy \
  --source "postgres://user:pass@source:5432/db" \
  --dest "postgres://user:pass@dest:5432/db" \
  --exclude "logs,sessions,temp_cache"

Wildcard-based Filtering

# Exclude all temporary and log tables
pgcopy copy \
  --source "postgres://user:pass@source:5432/db" \
  --dest "postgres://user:pass@dest:5432/db" \
  --exclude "temp_*,*_logs,*_cache"

# Include only user-related tables
pgcopy copy \
  --source "postgres://user:pass@source:5432/db" \
  --dest "postgres://user:pass@dest:5432/db" \
  --include "user_*,*_users,customer_*"

Building from Source

Prerequisites

Go 1.21 or later
PostgreSQL client libraries

Build

# Clone the repository
git clone https://github.com/koltyakov/pgcopy.git
cd pgcopy

# Install dependencies
make deps

# Build
make build

# Run tests
make test

# Build for all platforms
make build-all

Contributing

Fork the repository
Create a feature branch
Make your changes
Add tests
Run make test and make lint
Submit a pull request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

Create an issue on GitHub for bug reports or feature requests
Check existing issues before creating new ones

Acknowledgments

Built with Cobra for CLI functionality
Uses lib/pq PostgreSQL driver
Inspired by the need for efficient PostgreSQL data migration tools

Documentation ¶

Overview ¶

Package main provides the entry point for the pgcopy application

Source Files ¶

View all Source files

main.go

Directories ¶

Path	Synopsis
cmd Package cmd provides command-line interface for pgcopy	Package cmd provides command-line interface for pgcopy
internal
copier Package copier provides functionality for copying data between PostgreSQL databases.	Package copier provides functionality for copying data between PostgreSQL databases.
server Package server provides a web server for real-time pgcopy state visualization	Package server provides a web server for real-time pgcopy state visualization
state Package state provides React-like hooks for state management	Package state provides React-like hooks for state management
utils Package utils provides utility functions for pgcopy	Package utils provides utility functions for pgcopy

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL