Documentation
¶
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type BinPackResult ¶
BinPackResult represents a bin containing groups and their total size.
func BinPack ¶
func BinPack[G Sizer](groups []G) []BinPackResult[G]
BinPack performs best-fit decreasing bin packing with overflow and merging.
Algorithm:
- Sort items by size (largest first) for better packing
- For each item, find the best-fit bin (smallest remaining capacity that fits)
- If no bin fits within targetUncompressedSize, create a new bin if we are below the estimated bin count
- If no bin fits within targetUncompressedSize and we have reached the estimated bin count, try overflowing bins with upto targetUncompressedSize * maxOutputMultiple
- If still no fit, create a new bin
- Merge under-filled bins (below minFillPercent) in a post-processing step
type BucketIndexStreamReader ¶
type BucketIndexStreamReader struct {
// contains filtered or unexported fields
}
BucketIndexStreamReader is the production implementation that reads from object storage.
func (*BucketIndexStreamReader) ReadStreams ¶
func (r *BucketIndexStreamReader) ReadStreams(ctx context.Context, indexPath, tenant string, windowStart, windowEnd time.Time) (*IndexStreamResult, error)
ReadStreams reads stream metadata from an index object in the bucket.
type CompactionPlan ¶
type CompactionPlan struct {
// OutputObjects contains the planned output objects (one per bin).
OutputObjects []*compactionpb.SingleTenantObjectSource
// TotalUncompressedSize is the total uncompressed size across all output objects (for reporting).
TotalUncompressedSize int64
// LeftoverBeforeStreams contains streams with data before the compaction window.
LeftoverBeforeStreams []*LeftoverStreamGroup
// LeftoverAfterStreams contains streams with data after the compaction window.
LeftoverAfterStreams []*LeftoverStreamGroup
}
CompactionPlan represents the complete plan for compacting data objects for a single tenant.
type IndexStreamReader ¶
type IndexStreamReader interface {
ReadStreams(ctx context.Context, indexPath, tenant string, windowStart, windowEnd time.Time) (*IndexStreamResult, error)
}
IndexStreamReader reads stream metadata from index objects. This interface allows for mocking in tests.
type IndexStreamResult ¶
type IndexStreamResult struct {
Streams []StreamInfo
LeftoverBeforeStreams []LeftoverStreamInfo
LeftoverAfterStreams []LeftoverStreamInfo
}
IndexStreamResult holds stream infos from reading an index.
type LeftoverPlan ¶
type LeftoverPlan struct {
// BeforeWindow contains planned output objects for data BEFORE the compaction window.
BeforeWindow []*compactionpb.MultiTenantObjectSource
// BeforeWindowSize is the total uncompressed size of BeforeWindow (for reporting).
BeforeWindowSize int64
// AfterWindow contains planned output objects for data AFTER the compaction window.
AfterWindow []*compactionpb.MultiTenantObjectSource
// AfterWindowSize is the total uncompressed size of AfterWindow (for reporting).
AfterWindowSize int64
}
LeftoverPlan represents the plan for collecting leftover data outside the compaction window. Bin-packing is done separately for data before and after the window using stream-level granularity.
type LeftoverStreamGroup ¶
type LeftoverStreamGroup struct {
Streams []*compactionpb.TenantStream
TotalUncompressedSize int64
}
LeftoverStreamGroup represents streams with the same labels that have leftover data.
func (*LeftoverStreamGroup) GetSize ¶
func (g *LeftoverStreamGroup) GetSize() int64
GetSize implements Sizer interface for bin-packing.
type LeftoverStreamInfo ¶
type LeftoverStreamInfo struct {
compactionpb.TenantStream
LabelsHash uint64
UncompressedSize int64
}
LeftoverStreamInfo represents a stream's leftover data outside the compaction window.
type Planner ¶
type Planner struct {
// contains filtered or unexported fields
}
Planner creates compaction plans by reading stream metadata from indexes and grouping streams into output objects using bin-packing algorithms.
func NewPlanner ¶
NewPlanner creates a new Planner with the given bucket.
type Sizer ¶
type Sizer interface {
GetSize() int64
}
Sizer is an interface for items that have a size. Used by generic bin-packing algorithm.
type StreamCollectionResult ¶
type StreamCollectionResult struct {
// StreamGroups contains streams grouped by labels hash.
StreamGroups []*StreamGroup
// TotalUncompressedSize is the sum of uncompressed sizes across all stream groups.
TotalUncompressedSize int64
// LeftoverBeforeStreams contains streams with data before the compaction window.
LeftoverBeforeStreams []*LeftoverStreamGroup
// LeftoverAfterStreams contains streams with data after the compaction window.
LeftoverAfterStreams []*LeftoverStreamGroup
}
StreamCollectionResult holds the result of collecting streams from indexes.
type StreamGroup ¶
type StreamGroup struct {
// Streams contains all the stream entries for this stream (from different indexes).
Streams []*compactionpb.Stream
// TotalUncompressedSize is the sum of uncompressed sizes across all streams.
TotalUncompressedSize int64
}
StreamGroup represents a group of stream entries that belong to the same stream (identified by labels hash) across multiple index objects.
func (*StreamGroup) GetSize ¶
func (g *StreamGroup) GetSize() int64
GetSize implements Sizer interface for bin-packing.
type StreamInfo ¶
type StreamInfo struct {
compactionpb.Stream
// LabelsHash is the stable hash of the stream labels.
LabelsHash uint64
// UncompressedSize is the total uncompressed size of this stream.
UncompressedSize int64
}
StreamInfo represents aggregated stream information from an index object.