Documentation ¶
Index ¶
- Constants
- func EvaluatePushDown[T promql.Timestamped](ctx context.Context, w *CacheManager, request queryapi.PushDownRequest, ...) (<-chan T, error)
- func EvaluatePushDownWithAggSplit[T promql.Timestamped](ctx context.Context, w *CacheManager, request queryapi.PushDownRequest, ...) (<-chan T, error)
- type CacheKey
- type CacheManager
- type DiskUsageFunc
- type DownloadBatchFunc
- type ParquetFileCache
- func (pfc *ParquetFileCache) Close()
- func (pfc *ParquetFileCache) FileCount() int64
- func (pfc *ParquetFileCache) GetOrPrepare(region, bucket, objectID string) (localPath string, exists bool, err error)
- func (pfc *ParquetFileCache) MarkForDeletion(region, bucket, objectID string)
- func (pfc *ParquetFileCache) MarkForDeletionWithCompanion(region, bucket, objectID string)
- func (pfc *ParquetFileCache) RegisterMetrics() error
- func (pfc *ParquetFileCache) ScanExistingFiles() error
- func (pfc *ParquetFileCache) TotalBytes() int64
- func (pfc *ParquetFileCache) TrackFile(region, bucket, objectID string) error
- type RowMapper
- type WorkerService
Constants ¶
const (
	// DefaultCleanupInterval is how often the cleanup goroutine runs.
	DefaultCleanupInterval = 5 * time.Minute

	// DefaultDeletionDelay is how long to wait before actually deleting a file
	// after it's been marked for deletion. This allows concurrent queries to
	// finish using the file before it's removed.
	DefaultDeletionDelay = 1 * time.Minute

	// DiskUsageHighWatermark is the disk utilization level that triggers eviction.
	// When disk usage exceeds this fraction, the cache will start evicting files.
	DiskUsageHighWatermark = 0.80

	// DiskUsageLowWatermark is the target disk utilization after eviction.
	// Eviction continues until disk usage drops below this level.
	DiskUsageLowWatermark = 0.70
)
const (
ChannelBufferSize = 4096
)
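The two watermarks form a hysteresis band: eviction starts above 80% disk usage and runs until usage falls below 70%. A minimal sketch of that decision, assuming the byte arithmetic shown here (the package's actual eviction loop is unexported):

```go
package main

import "fmt"

const (
	DiskUsageHighWatermark = 0.80
	DiskUsageLowWatermark  = 0.70
)

// evictionTarget reports whether eviction should start and, if so, how many
// bytes must be freed to bring usage back under the low watermark.
func evictionTarget(usedBytes, totalBytes int64) (evict bool, freeBytes int64) {
	usage := float64(usedBytes) / float64(totalBytes)
	if usage <= DiskUsageHighWatermark {
		return false, 0
	}
	target := int64(DiskUsageLowWatermark * float64(totalBytes))
	return true, usedBytes - target
}

func main() {
	// 85 GiB used on a 100 GiB disk: above the 80% high watermark,
	// so evict down to 70 GiB, i.e. free 15 GiB.
	evict, free := evictionTarget(85, 100)
	fmt.Println(evict, free) // true 15
}
```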
Variables ¶
This section is empty.
Functions ¶
func EvaluatePushDown ¶ added in v1.3.0
func EvaluatePushDown[T promql.Timestamped](
	ctx context.Context,
	w *CacheManager,
	request queryapi.PushDownRequest,
	userSQL string,
	s3GlobSize int,
	mapper RowMapper[T],
) (<-chan T, error)
func EvaluatePushDownWithAggSplit ¶ added in v1.8.0
func EvaluatePushDownWithAggSplit[T promql.Timestamped](
	ctx context.Context,
	w *CacheManager,
	request queryapi.PushDownRequest,
	tblSQL string,
	aggSQL string,
	groupBy []string,
	matcherFields []string,
	s3GlobSize int,
	mapper RowMapper[T],
) (<-chan T, error)
EvaluatePushDownWithAggSplit evaluates a pushdown query, routing each segment between agg_ and tbl_ files based on eligibility. For segments where an agg_ file exists and supports the query's GROUP BY and matchers, aggSQL runs against the agg_ file; otherwise tblSQL runs against the tbl_ file.
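The per-segment routing can be pictured as follows. The segment type, its metadata fields, and the containsAll helper are hypothetical stand-ins for illustration, not the package's actual (unexported) types:

```go
package main

import "fmt"

// segment is a hypothetical stand-in for the per-segment metadata
// the router consults.
type segment struct {
	hasAggFile   bool
	aggGroupBy   []string // dimensions the agg_ file was rolled up by
	aggMatchable []string // fields the agg_ file can filter on
}

// containsAll reports whether every element of want appears in have.
func containsAll(have, want []string) bool {
	set := make(map[string]struct{}, len(have))
	for _, h := range have {
		set[h] = struct{}{}
	}
	for _, w := range want {
		if _, ok := set[w]; !ok {
			return false
		}
	}
	return true
}

// chooseSQL routes a segment to aggSQL when its agg_ file covers the
// query's GROUP BY and matcher fields, and falls back to tblSQL otherwise.
func chooseSQL(s segment, groupBy, matcherFields []string, aggSQL, tblSQL string) string {
	if s.hasAggFile &&
		containsAll(s.aggGroupBy, groupBy) &&
		containsAll(s.aggMatchable, matcherFields) {
		return aggSQL
	}
	return tblSQL
}

func main() {
	s := segment{
		hasAggFile:   true,
		aggGroupBy:   []string{"service", "region"},
		aggMatchable: []string{"service"},
	}
	fmt.Println(chooseSQL(s, []string{"service"}, []string{"service"}, "agg", "tbl")) // covered: agg
	fmt.Println(chooseSQL(s, []string{"pod"}, nil, "agg", "tbl"))                    // pod not rolled up: tbl
}
```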
Types ¶
type CacheKey ¶ added in v1.7.1
type CacheKey struct {
Region string // cloud region (e.g., "us-east-1"), can be empty
Bucket string // bucket name
ObjectID string // object key/path within the bucket
}
CacheKey uniquely identifies a cached file by its cloud storage location.
type CacheManager ¶ added in v1.3.0
type CacheManager struct {
// contains filtered or unexported fields
}
CacheManager coordinates downloads and queries over local parquet files.
func NewCacheManager ¶ added in v1.3.0
func NewCacheManager(dl DownloadBatchFunc, dataset string, storageProfileProvider storageprofile.StorageProfileProvider, pool *duckdbx.DB, parquetCache *ParquetFileCache) *CacheManager
func (*CacheManager) Close ¶ added in v1.3.0
func (w *CacheManager) Close()
type DiskUsageFunc ¶ added in v1.8.0
DiskUsageFunc is a function that returns disk usage statistics.
type DownloadBatchFunc ¶ added in v1.3.0
type DownloadBatchFunc func(ctx context.Context, storageProfile storageprofile.StorageProfile, keys []string) error
DownloadBatchFunc downloads ALL given paths to their target local paths.
type ParquetFileCache ¶ added in v1.7.0
type ParquetFileCache struct {
// contains filtered or unexported fields
}
ParquetFileCache manages downloaded parquet files with disk pressure-based cleanup. It tracks file sizes and access times for metrics and LRU eviction purposes. The cache owns its storage directory and provides a domain-aware API for callers to work with (bucket, region, objectId) rather than file paths.
func NewParquetFileCache ¶ added in v1.7.0
func NewParquetFileCache(cleanupInterval time.Duration) (*ParquetFileCache, error)
NewParquetFileCache creates a new parquet file cache with disk pressure-based cleanup. The cache directory is created under os.TempDir(), which respects the TMPDIR environment variable (typically set by helpers.SetupTempDir()).
func NewParquetFileCacheWithBaseDir ¶ added in v1.7.1
func NewParquetFileCacheWithBaseDir(baseDir string, cleanupInterval time.Duration) (*ParquetFileCache, error)
NewParquetFileCacheWithBaseDir creates a new parquet file cache with a custom base directory. This is useful for testing where each test needs an isolated cache directory.
func (*ParquetFileCache) Close ¶ added in v1.7.0
func (pfc *ParquetFileCache) Close()
Close stops the cleanup goroutine and removes all cached files.
func (*ParquetFileCache) FileCount ¶ added in v1.7.0
func (pfc *ParquetFileCache) FileCount() int64
FileCount returns the current number of tracked files.
func (*ParquetFileCache) GetOrPrepare ¶ added in v1.7.1
func (pfc *ParquetFileCache) GetOrPrepare(region, bucket, objectID string) (localPath string, exists bool, err error)
GetOrPrepare checks if a file is cached and returns its local path. If cached and valid, returns (localPath, true, nil). If not cached, prepares the directory structure and returns (localPath, false, nil). The caller should download to the returned path and then call TrackFile.
func (*ParquetFileCache) MarkForDeletion ¶ added in v1.7.0
func (pfc *ParquetFileCache) MarkForDeletion(region, bucket, objectID string)
MarkForDeletion marks a file for delayed deletion. The file will be deleted after DefaultDeletionDelay has passed. Until then, the file remains on disk but GetOrPrepare will return exists=false, causing re-downloads if needed.
func (*ParquetFileCache) MarkForDeletionWithCompanion ¶ added in v1.8.0
func (pfc *ParquetFileCache) MarkForDeletionWithCompanion(region, bucket, objectID string)
MarkForDeletionWithCompanion marks a tbl_ file and its corresponding agg_ file for delayed deletion. This ensures tbl_ and agg_ files are treated as a unit.
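The tbl_/agg_ pairing can be sketched as a prefix swap on the object's base name. This naming rule is an assumption inferred from the prefixes mentioned above, not a documented part of the API:

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// companionObjectID maps a tbl_ object to its agg_ companion by swapping
// the base-name prefix. Hypothetical helper, not part of the package API.
func companionObjectID(objectID string) (string, bool) {
	dir, base := path.Split(objectID)
	if !strings.HasPrefix(base, "tbl_") {
		return "", false
	}
	return dir + "agg_" + strings.TrimPrefix(base, "tbl_"), true
}

func main() {
	agg, ok := companionObjectID("db/2024/tbl_seg_001.parquet")
	fmt.Println(agg, ok) // db/2024/agg_seg_001.parquet true
}
```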
func (*ParquetFileCache) RegisterMetrics ¶ added in v1.7.0
func (pfc *ParquetFileCache) RegisterMetrics() error
RegisterMetrics registers OTEL metrics for the parquet file cache.
func (*ParquetFileCache) ScanExistingFiles ¶ added in v1.7.0
func (pfc *ParquetFileCache) ScanExistingFiles() error
ScanExistingFiles scans the cache directory and tracks any existing files. This is useful on startup to recover state from previous runs. Files with parseable paths (containing "db/" marker) are fully tracked with CacheKey reconstruction. Other files are tracked for cleanup only.
func (*ParquetFileCache) TotalBytes ¶ added in v1.7.0
func (pfc *ParquetFileCache) TotalBytes() int64
TotalBytes returns the total size of tracked files.
func (*ParquetFileCache) TrackFile ¶ added in v1.7.0
func (pfc *ParquetFileCache) TrackFile(region, bucket, objectID string) error
TrackFile marks a file as successfully downloaded and starts tracking it. Should be called after a file is downloaded to the path returned by GetOrPrepare.
type RowMapper ¶ added in v1.3.0
type RowMapper[T promql.Timestamped] func(queryapi.PushDownRequest, []string, *sql.Rows) (T, error)
RowMapper turns the current row into a T.
type WorkerService ¶ added in v1.3.0
type WorkerService struct {
MetricsCM *CacheManager
LogsCM *CacheManager
TracesCM *CacheManager
StorageProfilePoller storageprofile.StorageProfileProvider
MetricsGlobSize int
LogsGlobSize int
TracesGlobSize int
// contains filtered or unexported fields
}
WorkerService wires HTTP → CacheManager → SSE.
func NewWorkerService ¶ added in v1.3.0
func NewWorkerService(
	metricsGlobSize int,
	logsGlobSize int,
	tracesGlobSize int,
	maxParallelDownloads int,
	sp storageprofile.StorageProfileProvider,
	cloudManagers cloudstorage.ClientProvider,
	duckdbSettings duckdbx.DuckDBSettings,
) (*WorkerService, error)
func (*WorkerService) Close ¶ added in v1.4.4
func (ws *WorkerService) Close()
func (*WorkerService) Run ¶ added in v1.3.0
func (ws *WorkerService) Run(doneCtx context.Context) error
func (*WorkerService) ServeHTTP ¶ added in v1.3.0
func (ws *WorkerService) ServeHTTP(w http.ResponseWriter, r *http.Request)
ServeHTTP serves SSE with merged, sorted points from the cache and S3.