Documentation ¶
Index ¶
- Constants
- func ApproxLen(pack *pb.UidPack) int
- func CopyUidPack(pack *pb.UidPack) *pb.UidPack
- func Decode(pack *pb.UidPack, seek uint64) []uint64
- func DecodeToBuffer(buf *z.Buffer, pack *pb.UidPack)
- func Encode(uids []uint64, blockSize int) *pb.UidPack
- func EncodeFromBuffer(buf []byte, blockSize int) *pb.UidPack
- func ExactLen(pack *pb.UidPack) int
- func FreePack(pack *pb.UidPack)
- type Decoder
- func (d *Decoder) ApproxLen() int
- func (d *Decoder) BlockIdx() int
- func (d *Decoder) LinearSeek(seek uint64) []uint64
- func (d *Decoder) Next() []uint64
- func (d *Decoder) PeekNextBase() uint64
- func (d *Decoder) Seek(uid uint64, whence seekPos) []uint64
- func (d *Decoder) SeekToBlock(uid uint64, whence seekPos) []uint64
- func (d *Decoder) Uids() []uint64
- func (d *Decoder) UnpackBlock() []uint64
- func (d *Decoder) Valid() bool
- type Encoder
Constants ¶
const (
	// SeekStart is used with Seek() to search relative to the Uid, returning it in the results.
	SeekStart seekPos = iota
	// SeekCurrent is used with Seek() to search using the Uid as an offset, not as part of the results.
	SeekCurrent
)
Functions ¶
func ApproxLen ¶
ApproxLen returns an approximate count of the UIDs in the pack. It can be used to size slice allocations.
func CopyUidPack ¶
CopyUidPack creates a copy of the given UidPack.
func Decode ¶
Decode decodes the UidPack back into the list of uids. This is a stop-gap function; Decode would need to do more specific things than just return the list.
func DecodeToBuffer ¶
DecodeToBuffer is the same as Decode but it returns a z.Buffer which is calloc'ed and SHOULD be freed by calling buffer.Release().
func Encode ¶
Encode takes a list of uids and a block size. It packs these uids into blocks of the given size; the last block may hold fewer uids. Within each block, it stores the first uid as the base. For each subsequent uid, it stores a delta = uids[i] - uids[i-1]. Protobuf uses Varint encoding, as described here: https://developers.google.com/protocol-buffers/docs/encoding . Because the deltas are considerably smaller than the original uids, they pack into fewer bytes. Our benchmarks on artificial data show the compressed size to be 13% of the original. This mechanism is also a lot simpler to understand and, if needed, to debug.
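The base-plus-delta scheme described above can be sketched in a few lines. This is a minimal illustration, not the package's implementation: the block type and the encodeBlocks, decodeBlocks and varintSize helpers are hypothetical names, the input is assumed to be sorted, and blockSize is assumed to be positive. It only uses encoding/binary from the standard library to show how many bytes a value takes as a varint.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// block is a hypothetical stand-in for one block of a pack:
// the first uid (base) plus the deltas between consecutive uids.
type block struct {
	base   uint64
	deltas []uint64
}

// encodeBlocks splits sorted uids into blocks of at most blockSize
// entries and stores each block as base + deltas.
// Assumes blockSize > 0 and uids sorted in ascending order.
func encodeBlocks(uids []uint64, blockSize int) []block {
	var out []block
	for start := 0; start < len(uids); start += blockSize {
		end := start + blockSize
		if end > len(uids) {
			end = len(uids) // the last block may be shorter
		}
		b := block{base: uids[start]}
		for i := start + 1; i < end; i++ {
			b.deltas = append(b.deltas, uids[i]-uids[i-1])
		}
		out = append(out, b)
	}
	return out
}

// decodeBlocks reverses encodeBlocks by accumulating deltas onto each base.
func decodeBlocks(blocks []block) []uint64 {
	var uids []uint64
	for _, b := range blocks {
		cur := b.base
		uids = append(uids, cur)
		for _, d := range b.deltas {
			cur += d
			uids = append(uids, cur)
		}
	}
	return uids
}

// varintSize reports how many bytes v occupies as an unsigned varint.
func varintSize(v uint64) int {
	var buf [binary.MaxVarintLen64]byte
	return binary.PutUvarint(buf[:], v)
}

func main() {
	uids := []uint64{1000000, 1000003, 1000010, 1000011, 1000500}
	blocks := encodeBlocks(uids, 2)
	fmt.Println("blocks:", len(blocks))
	fmt.Println("round trip:", decodeBlocks(blocks))
	// A small delta needs fewer varint bytes than the full uid.
	fmt.Println("bytes for uid:", varintSize(1000003), "bytes for delta:", varintSize(3))
}
```

The size difference between varintSize of a full uid and of a delta is the source of the compression; how much is saved in practice depends on how densely the uids are distributed.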
func EncodeFromBuffer ¶
EncodeFromBuffer is the same as Encode but it accepts a byte slice instead of a uint64 slice.
Types ¶
type Decoder ¶
Decoder is used to read a pb.UidPack object back into a list of UIDs.
func NewDecoder ¶
NewDecoder returns a decoder for the given UidPack and properly initializes it.
func (*Decoder) ApproxLen ¶
ApproxLen returns the approximate number of UIDs in the pb.UidPack object.
func (*Decoder) LinearSeek ¶
LinearSeek returns the uids of the last block whose base is less than seek. If there is no such block, i.e. seek < base of the first block, it returns the uids of the first block. LinearSeek is used to get the closest uids which are >= seek.
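The block selection rule for LinearSeek can be illustrated on a plain slice of block bases. This is only a sketch under assumptions: linearSeekIdx is a hypothetical helper, the bases are assumed to be sorted in ascending order, and the real method works on the decoder's internal state rather than on a slice.

```go
package main

import "fmt"

// linearSeekIdx returns the index of the last block whose base is
// strictly less than seek. If no base qualifies, it returns 0 so the
// caller falls back to the first block. Assumes bases is sorted.
func linearSeekIdx(bases []uint64, seek uint64) int {
	idx := 0
	for i, b := range bases {
		if b >= seek {
			break // every later base is also >= seek
		}
		idx = i
	}
	return idx
}

func main() {
	bases := []uint64{10, 50, 90}
	for _, seek := range []uint64{5, 50, 60, 100} {
		fmt.Println("seek", seek, "-> block", linearSeekIdx(bases, seek))
	}
}
```

The scan is linear in the number of blocks, which is typically cheap when the caller is already walking the pack in order.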
func (*Decoder) PeekNextBase ¶
PeekNextBase returns the base of the next block without advancing the decoder.
func (*Decoder) Seek ¶
Seek searches for uid in a packed block using the specified whence position. The value of whence must be one of the predefined values SeekStart or SeekCurrent. SeekStart searches for uid and includes it in the results. SeekCurrent searches for uid but uses it only as an offset; it is not included in the results.
Returns a slice of all uids from the position selected by whence, or an empty slice if none are found.
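The difference between the two whence values can be shown on a flat sorted slice. This is a sketch of the semantics only, under assumptions: seekSorted and the lower-case constants are hypothetical stand-ins, and the real Seek operates on packed blocks through a Decoder rather than on a plain slice.

```go
package main

import (
	"fmt"
	"sort"
)

// seekPos mirrors the idea of the package's whence type (hypothetical copy).
type seekPos int

const (
	seekStart   seekPos = iota // include uid itself in the results
	seekCurrent                // use uid only as an offset, exclude it
)

// seekSorted returns the tail of uids starting at uid (seekStart) or
// just after uid (seekCurrent). Assumes uids is sorted ascending.
func seekSorted(uids []uint64, uid uint64, whence seekPos) []uint64 {
	// Index of the first element >= uid.
	i := sort.Search(len(uids), func(i int) bool { return uids[i] >= uid })
	if whence == seekCurrent && i < len(uids) && uids[i] == uid {
		i++ // skip uid itself
	}
	return uids[i:]
}

func main() {
	uids := []uint64{1, 3, 5, 7}
	fmt.Println(seekSorted(uids, 3, seekStart))
	fmt.Println(seekSorted(uids, 3, seekCurrent))
	fmt.Println(seekSorted(uids, 9, seekStart))
}
```

When uid is absent from the list, both modes behave the same in this sketch, since there is nothing to exclude.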
func (*Decoder) SeekToBlock ¶
SeekToBlock finds the block containing the uid and unpacks it. This function is useful when the list is going to be intersected later. Because it skips the search within the block and returns the entire block, it is faster than Seek. Unlike Seek, it does not truncate the uids returned; the intersect function would do that anyway.
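As a rough illustration of returning a whole block instead of a truncated tail, the sketch below picks a block by binary search over block bases. It is an assumption-laden model: blockFor is a hypothetical helper, blocks are plain slices, the whence argument is ignored, and the source does not state how the real method locates the block.

```go
package main

import (
	"fmt"
	"sort"
)

// blockFor returns the index of the block that would contain uid:
// the last block whose base is <= uid, or 0 if uid precedes every base.
// Assumes bases is sorted ascending and non-empty.
func blockFor(bases []uint64, uid uint64) int {
	// Index of the first base strictly greater than uid.
	i := sort.Search(len(bases), func(i int) bool { return bases[i] > uid })
	if i == 0 {
		return 0
	}
	return i - 1
}

func main() {
	blocks := [][]uint64{{10, 12, 15}, {50, 51}, {90, 99}}
	bases := []uint64{10, 50, 90}

	// The whole block is returned, including uids smaller than 12;
	// a later intersection step is expected to discard what it does not need.
	fmt.Println(blocks[blockFor(bases, 12)])
	fmt.Println(blocks[blockFor(bases, 51)])
}
```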
func (*Decoder) Uids ¶
Uids returns all the uids in the pb.UidPack object as a slice of uint64 values. The uids are owned by the Decoder, and the slice contents will change on the next call. They should be copied if passed around.
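The ownership note above matters because a returned slice may share its backing array with the decoder's internal buffer. The sketch below reproduces that hazard with a hypothetical reader type that reuses one buffer between calls; it is not the package's Decoder, only a model of the same pattern.

```go
package main

import "fmt"

// reader is a hypothetical type that, like a decoder owning its output,
// reuses one internal buffer for every block it returns.
type reader struct {
	buf    []uint64
	blocks [][]uint64
	idx    int
}

func newReader(blocks [][]uint64) *reader {
	return &reader{buf: make([]uint64, 0, 4), blocks: blocks}
}

// next overwrites the shared buffer with the next block and returns it.
// Assumes each block fits in the preallocated capacity.
func (r *reader) next() []uint64 {
	r.buf = append(r.buf[:0], r.blocks[r.idx]...)
	r.idx++
	return r.buf
}

func main() {
	r := newReader([][]uint64{{1, 2}, {7, 8}})

	first := r.next()
	kept := append([]uint64(nil), first...) // independent copy

	r.next() // overwrites the memory that first still points to

	fmt.Println("aliased:", first)
	fmt.Println("copied: ", kept)
}
```

Copying with append onto a nil slice is one common idiom; the copy builtin into a preallocated slice works equally well.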