dbcopy

package
v0.6.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Jun 3, 2026 License: Apache-2.0 Imports: 17 Imported by: 0

Documentation

Overview

Package dbcopy implements the `datatug db copy` cross-engine database copy primitive. See spec/features/cli/db/copy/ for the contract.

Package dbcopy implements the cross-engine database-copy primitive for `datatug db copy`. This file holds the type-mapping table between the MVP backends (SQLite via dalgo2sqlite and inGitDB via dalgo2ingitdb).

Coverage bar: every column type appearing in the canonical Chinook fixture MUST map cleanly in both directions. Types outside that closed set return a *UnsupportedTypeError naming the source type and target backend.

Mapping policy for the MVP:

  • Both backends speak the engine-neutral dbschema.Type vocabulary, so the translation is identity for every type currently produced by either driver's SchemaReader.
  • Chinook's *.db SQLite fixture contains columns of SQLite affinity INTEGER, NVARCHAR/TEXT, NUMERIC(10,2), and DATETIME. As of dalgo2sqlite v0.0.0-20260513182736-6886f34af097 the driver's DescribeCollection rejects NUMERIC(p,s) and DATETIME columns (driver gap upstream of dbcopy). For the columns that DO describe successfully (Album, Artist, Customer, Genre, MediaType, Playlist, PlaylistTrack), only dbschema.Int and dbschema.String are observed. The MVP type-map declares the full dbschema vocabulary supported (Bool, Int, Float, String, Bytes, Time, Decimal) so the coverage holds the day dalgo2sqlite ships DATETIME / NUMERIC support and Track/Invoice/InvoiceLine/Employee describe successfully.
  • dbschema.Null is intentionally rejected: it is the zero value used for "unset" FieldDef.Type and never a meaningful column type for cross-engine copy.

Package dbcopy implements `datatug db copy --from <url> --to <url>`.

This file covers the URL scheme dispatcher: parsing --from/--to arguments into a typed BackendRef and opening the underlying DALgo dal.DB.

Scheme support per spec/features/cli/db/copy/README.md (REQ:supported-schemes):

  • sqlite:// fully wired via dalgo2sqlite
  • ingitdb:// fully wired via dalgo2ingitdb; local-paths-only (REQ:ingitdb-url-local-only)
  • postgres:// parses; Open returns ErrPostgresNotWired until a PostgreSQL DALgo driver implements the three capability interfaces (dbschema.SchemaReader, ddl.SchemaModifier, dal.ConcurrencyAware)

Index

Constants

View Source
const (
	BackendSQLite  = "sqlite"
	BackendInGitDB = "ingitdb"
)

Supported backend identifiers accepted by MapType.

Variables

View Source
var ErrNoPrimaryKey = errors.New("source collection has no primary key declared")

ErrNoPrimaryKey is returned when a source collection has no PK declared. The MVP row-copy path requires a declared PK to construct target record keys.

View Source
var ErrPostgresNotWired = errors.New("PostgreSQL backend not yet wired")

ErrPostgresNotWired is returned by BackendRef.Open for postgres:// URLs until a PostgreSQL DALgo driver implements the three capability interfaces (dbschema.SchemaReader, ddl.SchemaModifier, dal.ConcurrencyAware).

View Source
var ErrSourceHasNoTables = errors.New("source has no tables; nothing to copy")

ErrSourceHasNoTables signals that the source introspected cleanly but has zero collections. Callers should exit 0 with a stderr note per REQ:source-introspection-failure.

Functions

func MapType

func MapType(t dbschema.Type, sourceBackend, targetBackend string) (dbschema.Type, error)

MapType translates a column type from sourceBackend to targetBackend. The returned Type is ready for use in a target ddl.CreateCollection call. For the MVP, sourceBackend and targetBackend MUST each be one of "sqlite" or "ingitdb". Types outside the Chinook coverage set (dbschema.Null and any unrecognized Type value) return a *UnsupportedTypeError naming the source type and target backend.

Both backends accept the same engine-neutral dbschema.Type vocabulary, so this function is the identity for every supported type today. The function exists as the seam where engine-specific widening / narrowing rules will land if and when the type-mapping matrix needs them.

Types

type BackendRef

type BackendRef struct {
	Scheme string
	// Path holds the scheme-specific resource locator.
	// - sqlite:    filesystem path to the .db file (e.g. "/tmp/foo.db" or "./rel.db")
	// - ingitdb:   filesystem path to the project directory
	// - postgres:  full original URL (passed verbatim to the future driver)
	Path string
	// Raw is the original input string, preserved for error messages.
	Raw string
}

BackendRef is a parsed --from/--to URL.

func Parse

func Parse(rawURL string) (BackendRef, error)

Parse parses a CLI URL argument into a BackendRef. It returns an error for unknown schemes (REQ:unknown-scheme-rejected), malformed URLs, and remote ingitdb:// URLs (REQ:ingitdb-url-local-only).

The unknown-scheme error message names BOTH the unsupported scheme AND the supported list, as required by REQ:unknown-scheme-rejected.

func (BackendRef) Open

func (r BackendRef) Open(ctx context.Context) (dal.DB, error)

Open opens the underlying DALgo dal.DB for this BackendRef.

Dispatch:

  • sqlite: opens via dalgo2sqlite.NewDatabase.
  • ingitdb: opens via dalgo2ingitdb.NewDatabase with the default validator-backed CollectionsReader.
  • postgres: returns ErrPostgresNotWired (no DALgo Postgres driver yet exposes the three capability interfaces).

The context is reserved for future use; today's driver constructors are synchronous and do not honor cancellation. That's acceptable for the MVP CLI verb.

type CopyOpts

type CopyOpts struct {
	// Overwrite is "" (require empty target), "recreate" (drop source-named
	// tables, then create from source), or "reload" (reserved — row-level
	// semantics; behaves like "recreate" for schema-only copies until row
	// CRUD lands).
	Overwrite string

	// Stderr receives per-table skip / row-error notes and the final
	// summary note. Defaults to discard if nil.
	Stderr io.Writer

	// Progress, if non-nil, receives per-table progress lines.
	Progress *ProgressWriter

	// SchemaOnly, if true, skips row streaming entirely and only replicates
	// the target schema. Useful for E2E-test scaffolding and for backends
	// where row streaming isn't supported in this direction.
	SchemaOnly bool

	// ParallelStreams is the requested max number of source tables copied
	// concurrently. 0 means "use the default": runtime.NumCPU()-1 (with a
	// floor of 1). Negative values are normalized to 1. Capped to 1 when
	// either source or target advertises SupportsConcurrentConnections()==false
	// (REQ:concurrency-cap).
	ParallelStreams int

	// Filters carries resolved filtering directives (table include/exclude,
	// row WHERE predicates, row limits). nil or empty means "no filtering
	// — copy whole DB per parent Feature ACs". Subsequent tasks consume
	// this at two seams (pre-worker table filter; engine_rows query builder).
	// Spec: spec/features/cli/db/copy/filtering/README.md
	Filters *filter.Directives
}

CopyOpts controls a Copy call.

type NonEmptyTargetError

type NonEmptyTargetError struct {
	Table string
	Rows  int64
}

NonEmptyTargetError is returned by checkEmptyTarget when at least one source-named table exists on the target with >=1 row. The user should rerun with --overwrite=recreate or --overwrite=reload.

func (*NonEmptyTargetError) Error

func (e *NonEmptyTargetError) Error() string

type ProgressWriter

type ProgressWriter struct {
	// contains filtered or unexported fields
}

ProgressWriter emits per-table start/finish lines to a writer (intended to be os.Stderr in production). When enabled is false, all methods are no-ops. Safe for concurrent use across multiple worker goroutines.

Format is pinned by REQ:progress-reporting in spec/features/cli/db/copy/README.md.

func NewProgressWriter

func NewProgressWriter(w io.Writer, enabled bool) *ProgressWriter

NewProgressWriter returns a ProgressWriter writing to w when enabled is true. If enabled is false (or w is nil), all methods are no-ops.

func (*ProgressWriter) FinishTable

func (p *ProgressWriter) FinishTable(table string, rows int64, d time.Duration)

FinishTable announces that a copy of `table` has completed: `rows` inserted in `d`. Duration is rounded to millisecond per AC.

func (*ProgressWriter) StartTable

func (p *ProgressWriter) StartTable(table string, estRows int64)

StartTable announces that a copy of `table` is starting. If estRows is negative, the start line prints `est. ? rows` instead of a number.

type ReloadSchemaMismatchError

type ReloadSchemaMismatchError struct {
	Table       string
	Column      string
	SourceValue string
	TargetValue string
	Reason      string // short reason: "missing column", "type mismatch", "primary key mismatch"
}

ReloadSchemaMismatchError signals that the target's schema for a given table is not a superset of the source's, per REQ:reload-schema-match.

Column is either a column name (when a source column is missing or has an incompatible target type) or the literal "<primary key>" when the primary-key column sets differ.

SourceValue and TargetValue describe what differed (e.g. type names or PK column lists) in human-readable form, for the stderr diff.

func (*ReloadSchemaMismatchError) Error

func (e *ReloadSchemaMismatchError) Error() string

type SourceSummary

type SourceSummary struct {
	Tables        int
	Created       int
	CreatedNames  []string // names of collections created on the target, in source-iteration order
	Skipped       []string // tables skipped at DescribeCollection time
	RowsCopied    int64    // total rows inserted into the target across all tables
	RowsByTable   map[string]int64
	RowSkips      map[string]string // table → reason (e.g. composite PK, no PK)
	TargetBackend string            // adapter name of target, for error messages
}

SourceSummary is what Copy reports back to the caller.

func Copy

func Copy(ctx context.Context, source, target dal.DB, opts CopyOpts) (SourceSummary, error)

Copy replicates the source database into the target — schema first, then row data per table.

Schema replication uses DALgo dbschema/ddl. Row streaming uses ExecuteQueryToRecordsReader on the source and RunReadwriteTransaction → InsertMulti on the target (see engine_rows.go).

If opts.SchemaOnly is true, only schema is replicated.

Tables the source can't describe (e.g. dalgo2sqlite rejecting DATETIME / NUMERIC) are appended to Skipped and processing continues. Tables with no PK get schema replicated but row copy is skipped with the reason recorded in RowSkips. Composite-PK tables are now copied (key encoded as `__`-joined PK values; see encodeRecordID in engine_rows.go).

Concurrency: opts.ParallelStreams governs how many tables are copied in parallel. The effective value is capped to 1 if either source or target advertises SupportsConcurrentConnections()==false. When the cap reduces an explicitly-requested value >1, one warning line is emitted on stderr.

Errors:

  • ErrSourceHasNoTables — source introspects cleanly but has zero collections.
  • any other error — wrapped with the failing operation and table name.

type UnsupportedTypeError

type UnsupportedTypeError struct {
	// SourceType is the dbschema.Type that could not be mapped.
	SourceType dbschema.Type
	// TargetBackend is the backend name supplied to MapType.
	TargetBackend string
}

UnsupportedTypeError is returned by MapType when the source column type is outside the MVP coverage set.

func (*UnsupportedTypeError) Error

func (e *UnsupportedTypeError) Error() string

Error implements the error interface. The message names both the source type and the target backend so the operator can act.

Directories

Path Synopsis
Package filter — CLI mini-syntax parsers.
Package filter — CLI mini-syntax parsers.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL