query

package v0.0.0-...-706b979
Published: May 4, 2026 License: AGPL-3.0 Imports: 7 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type MergedRowIterator

type MergedRowIterator struct {
	// contains filtered or unexported fields
}

MergedRowIterator provides a unified interface over multiple partition results. It iterates through all partition results sequentially, presenting them as a single result set.

func NewMergedRowIterator

func NewMergedRowIterator(results []*PartitionResult, logger zerolog.Logger) (*MergedRowIterator, error)

NewMergedRowIterator creates an iterator over multiple partition results. It validates that all successful partitions have the same schema.

func (*MergedRowIterator) Close

func (m *MergedRowIterator) Close() error

Close closes all underlying result sets.

func (*MergedRowIterator) Columns

func (m *MergedRowIterator) Columns() []string

Columns returns the column names.

func (*MergedRowIterator) Err

func (m *MergedRowIterator) Err() error

Err returns any error from the underlying iterators.

func (*MergedRowIterator) Next

func (m *MergedRowIterator) Next() bool

Next advances to the next row, returning false when exhausted.

func (*MergedRowIterator) Scan

func (m *MergedRowIterator) Scan(dest ...interface{}) error

Scan copies the current row values into the provided destinations.

func (*MergedRowIterator) ScanBuffer

func (m *MergedRowIterator) ScanBuffer() ([]interface{}, error)

ScanBuffer scans into the internal buffer and returns the values.

type ParallelExecutor

type ParallelExecutor struct {
	// contains filtered or unexported fields
}

ParallelExecutor executes queries across multiple partitions in parallel. Each partition query runs in its own goroutine with semaphore-based concurrency control to prevent overwhelming the DuckDB connection pool.

func NewParallelExecutor

func NewParallelExecutor(db *sql.DB, config *ParallelExecutorConfig, logger zerolog.Logger) *ParallelExecutor

NewParallelExecutor creates a new parallel executor.

func (*ParallelExecutor) ExecutePartitioned

func (e *ParallelExecutor) ExecutePartitioned(
	ctx context.Context,
	paths []string,
	queryTemplate string,
	readParquetOptions string,
) ([]*PartitionResult, error)

ExecutePartitioned executes a query template across multiple partitions in parallel. The queryTemplate must contain a {PARTITION_PATH} placeholder, which is replaced with each partition's read_parquet expression.

Returns one PartitionResult per partition. The caller is responsible for closing all returned rows.

func (*ParallelExecutor) ShouldUseParallel

func (e *ParallelExecutor) ShouldUseParallel(partitionCount int) bool

ShouldUseParallel returns true if parallel execution should be used for the given number of partitions.

type ParallelExecutorConfig

type ParallelExecutorConfig struct {
	// MaxConcurrentPartitions limits concurrent partition queries.
	// Set to 0 for no limit (uses all available connections).
	MaxConcurrentPartitions int

	// MinPartitionsForParallel is the minimum number of partitions
	// required to trigger parallel execution. Below this threshold,
	// we use standard DuckDB array syntax which may be more efficient.
	MinPartitionsForParallel int
}

ParallelExecutorConfig configures the parallel partition executor.

func DefaultParallelConfig

func DefaultParallelConfig() *ParallelExecutorConfig

DefaultParallelConfig returns sensible defaults for parallel execution.

type PartitionResult

type PartitionResult struct {
	Rows    *sql.Rows
	Columns []string
	Error   error
	Path    string
}

PartitionResult holds the result of a single partition query.
