Documentation
Types
type MergedRowIterator
type MergedRowIterator struct {
// contains filtered or unexported fields
}
MergedRowIterator provides a unified interface over multiple partition results. It iterates through all partition results sequentially, presenting them as a single result set.
func NewMergedRowIterator
func NewMergedRowIterator(results []*PartitionResult, logger zerolog.Logger) (*MergedRowIterator, error)
NewMergedRowIterator creates an iterator over multiple partition results. It validates that all successful partitions have the same schema.
func (*MergedRowIterator) Close
func (m *MergedRowIterator) Close() error
Close closes all underlying result sets.
func (*MergedRowIterator) Columns
func (m *MergedRowIterator) Columns() []string
Columns returns the column names.
func (*MergedRowIterator) Err
func (m *MergedRowIterator) Err() error
Err returns any error from the underlying iterators.
func (*MergedRowIterator) Next
func (m *MergedRowIterator) Next() bool
Next advances to the next row, returning false when exhausted.
func (*MergedRowIterator) Scan
func (m *MergedRowIterator) Scan(dest ...interface{}) error
Scan copies the current row values into the provided destinations.
func (*MergedRowIterator) ScanBuffer
func (m *MergedRowIterator) ScanBuffer() ([]interface{}, error)
ScanBuffer scans into the internal buffer and returns the values.
type ParallelExecutor
type ParallelExecutor struct {
// contains filtered or unexported fields
}
ParallelExecutor executes queries across multiple partitions in parallel. Each partition query runs in its own goroutine with semaphore-based concurrency control to prevent overwhelming the DuckDB connection pool.
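Semaphore-based concurrency control is commonly implemented in Go as a buffered channel. The sketch below illustrates the pattern the description refers to; the function name, the stand-in worker body, and the concrete limit are assumptions for the example, not this package's code.

```go
package main

import (
	"fmt"
	"sync"
)

// runPartitions launches one goroutine per partition path, but the
// buffered channel caps how many run at once — a counting semaphore.
// It returns the peak concurrency observed.
func runPartitions(paths []string, maxConcurrent int) int {
	sem := make(chan struct{}, maxConcurrent)
	var (
		wg             sync.WaitGroup
		mu             sync.Mutex
		inFlight, peak int
	)
	for range paths {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot; blocks when the limit is reached
			defer func() { <-sem }() // release the slot when done
			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()
			// ... the partition query would run here ...
			mu.Lock()
			inFlight--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	paths := []string{"p0", "p1", "p2", "p3", "p4", "p5"}
	fmt.Println(runPartitions(paths, 2) <= 2) // true: never more than 2 in flight
}
```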
func NewParallelExecutor
func NewParallelExecutor(db *sql.DB, config *ParallelExecutorConfig, logger zerolog.Logger) *ParallelExecutor
NewParallelExecutor creates a new parallel executor.
func (*ParallelExecutor) ExecutePartitioned
func (e *ParallelExecutor) ExecutePartitioned(
	ctx context.Context,
	paths []string,
	queryTemplate string,
	readParquetOptions string,
) ([]*PartitionResult, error)
ExecutePartitioned executes a query template across multiple partitions in parallel. The queryTemplate must contain a {PARTITION_PATH} placeholder, which is replaced with each partition's read_parquet expression.
Returns a PartitionResult for each partition. The caller is responsible for closing the rows in every returned result.
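The placeholder substitution can be illustrated with a plain string replacement. How the real package quotes paths and splices in the options is an assumption here; the query text and path are made up for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// expandTemplate substitutes the {PARTITION_PATH} placeholder with a
// read_parquet expression for one partition. The quoting and options
// placement below are illustrative, not this package's exact logic.
func expandTemplate(queryTemplate, path, readParquetOptions string) string {
	expr := fmt.Sprintf("read_parquet('%s'%s)", path, readParquetOptions)
	return strings.ReplaceAll(queryTemplate, "{PARTITION_PATH}", expr)
}

func main() {
	q := expandTemplate(
		"SELECT count(*) FROM {PARTITION_PATH}",
		"data/day=2024-01-01/part-0.parquet",
		", hive_partitioning=true",
	)
	fmt.Println(q)
}
```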
func (*ParallelExecutor) ShouldUseParallel
func (e *ParallelExecutor) ShouldUseParallel(partitionCount int) bool
ShouldUseParallel returns true if parallel execution should be used for the given number of partitions.
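The check presumably compares the partition count against the MinPartitionsForParallel field documented below; a hedged sketch of that likely logic (the real method may apply additional conditions):

```go
package main

import "fmt"

// shouldUseParallel mirrors the documented behavior: parallel execution
// only kicks in once partitionCount reaches the configured minimum.
// This is an assumed reconstruction, not the package's actual method.
func shouldUseParallel(partitionCount, minPartitionsForParallel int) bool {
	return partitionCount >= minPartitionsForParallel
}

func main() {
	fmt.Println(shouldUseParallel(2, 4)) // false: below threshold, array syntax instead
	fmt.Println(shouldUseParallel(8, 4)) // true
}
```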
type ParallelExecutorConfig
type ParallelExecutorConfig struct {
// MaxConcurrentPartitions limits concurrent partition queries.
// Set to 0 for no limit (uses all available connections).
MaxConcurrentPartitions int
// MinPartitionsForParallel is the minimum number of partitions
// required to trigger parallel execution. Below this threshold,
// we use standard DuckDB array syntax which may be more efficient.
MinPartitionsForParallel int
}
ParallelExecutorConfig configures the parallel partition executor.
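Putting the two fields together, a caller might configure the executor like this. The concrete values are illustrative choices, not the package defaults; `db` and `logger` come from the caller.

```go
// Illustrative values only — not the package's defaults.
cfg := &ParallelExecutorConfig{
	MaxConcurrentPartitions:  8, // at most 8 partition queries in flight
	MinPartitionsForParallel: 4, // below 4 partitions, fall back to array syntax
}
executor := NewParallelExecutor(db, cfg, logger)
```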
func DefaultParallelConfig
func DefaultParallelConfig() *ParallelExecutorConfig
DefaultParallelConfig returns sensible defaults for parallel execution.