Documentation
Types
type MergedRowIterator
type MergedRowIterator struct {
// contains filtered or unexported fields
}
MergedRowIterator provides a unified interface over multiple partition results. It iterates through all partition results sequentially, presenting them as a single result set.
func NewMergedRowIterator
func NewMergedRowIterator(results []*PartitionResult, logger zerolog.Logger) (*MergedRowIterator, error)
NewMergedRowIterator creates an iterator over multiple partition results. It validates that all successful partitions have the same schema.
func (*MergedRowIterator) Close
func (m *MergedRowIterator) Close() error
Close closes all underlying result sets.
func (*MergedRowIterator) Columns
func (m *MergedRowIterator) Columns() []string
Columns returns the column names.
func (*MergedRowIterator) Err
func (m *MergedRowIterator) Err() error
Err returns any error from the underlying iterators.
func (*MergedRowIterator) Next
func (m *MergedRowIterator) Next() bool
Next advances to the next row, returning false when exhausted.
func (*MergedRowIterator) Scan
func (m *MergedRowIterator) Scan(dest ...interface{}) error
Scan copies the current row values into the provided destinations.
func (*MergedRowIterator) ScanBuffer
func (m *MergedRowIterator) ScanBuffer() ([]interface{}, error)
ScanBuffer scans into the internal buffer and returns the values.
type ParallelExecutor
type ParallelExecutor struct {
// contains filtered or unexported fields
}
ParallelExecutor executes queries across multiple partitions in parallel. Each partition query runs in its own goroutine with semaphore-based concurrency control to prevent overwhelming the DuckDB connection pool.
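Semaphore-based concurrency control is commonly implemented in Go as a buffered channel. The sketch below illustrates the pattern the description refers to; the function name, the stand-in worker body, and the concrete limit are assumptions for the example, not this package's code.

```go
package main

import (
	"fmt"
	"sync"
)

// runPartitions launches one goroutine per partition path, but the
// buffered channel caps how many run at once — a counting semaphore.
// It returns the peak concurrency observed.
func runPartitions(paths []string, maxConcurrent int) int {
	sem := make(chan struct{}, maxConcurrent)
	var (
		wg             sync.WaitGroup
		mu             sync.Mutex
		inFlight, peak int
	)
	for range paths {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot; blocks when the limit is reached
			defer func() { <-sem }() // release the slot when done
			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()
			// ... the partition query would run here ...
			mu.Lock()
			inFlight--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	paths := []string{"p0", "p1", "p2", "p3", "p4", "p5"}
	fmt.Println(runPartitions(paths, 2) <= 2) // true: never more than 2 in flight
}
```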
func NewParallelExecutor
func NewParallelExecutor(db *sql.DB, config *ParallelExecutorConfig, logger zerolog.Logger) *ParallelExecutor
NewParallelExecutor creates a new parallel executor.
func (*ParallelExecutor) ExecutePartitioned
func (e *ParallelExecutor) ExecutePartitioned(
	ctx context.Context,
	paths []string,
	queryTemplate string,
	readParquetOptions string,
) ([]*PartitionResult, error)
ExecutePartitioned executes a query template across multiple partitions in parallel. The queryTemplate must contain a {PARTITION_PATH} placeholder, which is replaced with each partition's read_parquet expression.
Returns a PartitionResult for each partition. The caller is responsible for closing the rows in every returned result.
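The placeholder substitution can be illustrated with a plain string replacement. How the real package quotes paths and splices in the options is an assumption here; the query text and path are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// expandTemplate substitutes the {PARTITION_PATH} placeholder with a
// read_parquet expression for one partition. The quoting and options
// placement below are illustrative, not this package's exact logic.
func expandTemplate(queryTemplate, path, readParquetOptions string) string {
	expr := fmt.Sprintf("read_parquet('%s'%s)", path, readParquetOptions)
	return strings.ReplaceAll(queryTemplate, "{PARTITION_PATH}", expr)
}

func main() {
	q := expandTemplate(
		"SELECT count(*) FROM {PARTITION_PATH}",
		"data/day=2024-01-01/part-0.parquet",
		", hive_partitioning=true",
	)
	fmt.Println(q)
}
```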
func (*ParallelExecutor) ShouldUseParallel
func (e *ParallelExecutor) ShouldUseParallel(partitionCount int) bool
ShouldUseParallel returns true if parallel execution should be used for the given number of partitions.
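The check presumably compares the partition count against the MinPartitionsForParallel field documented below; a hedged sketch of that likely logic (the real method may apply additional conditions):

```go
package main

import "fmt"

// shouldUseParallel mirrors the documented behavior: parallel execution
// only kicks in once partitionCount reaches the configured minimum.
// This is an assumed reconstruction, not the package's actual method.
func shouldUseParallel(partitionCount, minPartitionsForParallel int) bool {
	return partitionCount >= minPartitionsForParallel
}

func main() {
	fmt.Println(shouldUseParallel(2, 4)) // false: below threshold, array syntax instead
	fmt.Println(shouldUseParallel(8, 4)) // true
}
```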
type ParallelExecutorConfig
type ParallelExecutorConfig struct {
// MaxConcurrentPartitions limits concurrent partition queries.
// Set to 0 for no limit (uses all available connections).
MaxConcurrentPartitions int
// MinPartitionsForParallel is the minimum number of partitions
// required to trigger parallel execution. Below this threshold,
// we use standard DuckDB array syntax which may be more efficient.
MinPartitionsForParallel int
}
ParallelExecutorConfig configures the parallel partition executor.
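Putting the two fields together, a caller might configure the executor like this. The concrete values are illustrative choices, not the package defaults; `db` and `logger` come from the caller.

```go
// Illustrative values only — not the package's defaults.
cfg := &ParallelExecutorConfig{
	MaxConcurrentPartitions:  8, // at most 8 partition queries in flight
	MinPartitionsForParallel: 4, // below 4 partitions, fall back to array syntax
}
executor := NewParallelExecutor(db, cfg, logger)
```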
func DefaultParallelConfig
func DefaultParallelConfig() *ParallelExecutorConfig
DefaultParallelConfig returns sensible defaults for parallel execution.