guid

package module
v2.1.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Nov 3, 2024 License: Apache-2.0 Imports: 10 Imported by: 20

README ΒΆ

GUID

K-ordered unique identifiers in lock-free and decentralized manner for Golang applications


Package guid implements interface to generate k-ordered unique identifiers in lock-free and decentralized manner for Golang applications. We says that sequence A is k-ordered if it consists of strictly ordered subsequences of length k:

  𝑨[π’Š βˆ’ π’Œ] ≀ 𝑨[π’Š] ≀ 𝑨[π’Š + π’Œ] for all π’Š such that π’Œ < π’Š ≀ π’βˆ’π’Œ.

Key features

This library aims important objectives:

  • IDs allocation does not require centralized authority or coordination with other nodes.
  • IDs are suitable for partial event ordering in distributed environment and helps on detection of causality violation.
  • IDs are roughly sortable by allocation order ("time").
  • IDs reduce indexes footprints and optimize lookup latency.

Inspiration

The event ordering in distributed computing is resolved using various techniques, e.g. Lamport timestamps, Universal Unique Identifiers, Twitter Snowflake and many other techniques are offered by open source libraries. guid is a Golang port of Erlang's uid library.

All these solution made a common conclusion, globally unique ID is a triple βŸ¨π’•, 𝒍, π’”βŸ©: βŸ¨π’•βŸ© monotonically increasing clock or timestamp is a primary dimension to roughly sort events, βŸ¨π’βŸ© is spatially unique identifier of ID allocator so called node location, βŸ¨π’”βŸ© sequence is a monotonic integer, which prevents clock collisions. The guid library addresses few issues observed in other solutions.

Every byte counts when application is processing or storing large volume of events. This library implements fixed size 96-bit identity schema, which is castable to 64-bit under certain occasion. It is about 25% improvement to compare with UUID or similar 128-bit identity schemas (only Twitters Snowflake is 64-bit).

Most of identity schemas uses monotonically increasing clock (timestamp) to roughly order events. The resolution of clock varies from nanoseconds to milliseconds. We found that usage of timestamp is not perfectly aligned with the goal of decentralized ID allocations. Usage of time synchronization protocol becomes necessary at distributed systems. Strictly speaking, NTP server becomes an authority to coordinate clock synchronization. This happens because schemas uses time fraction βŸ¨π’•βŸ© as a primary sorting key. In contrast with other libraries, guid do not give priority to single fraction of identity triple βŸ¨π’•βŸ© or βŸ¨π’βŸ©. It uses dynamic schema where the location fraction has higher priority than time only at particular precision. It allows to keep ordering consistent even if clocks on other node is skewed.

Identity Schema

A fixed size of 96-bit is used to implement identity schema

  3bit  47 bit - 𝒅 bit         32 bit      𝒅 bit  14 bit
   |-|-------------------|----------------|-----|-------|
   βŸ¨π’…βŸ©        βŸ¨π’•βŸ©                βŸ¨π’βŸ©         βŸ¨π’•βŸ©     βŸ¨π’”βŸ©

↣ βŸ¨π’•βŸ© is 47-bit UTC timestamp with millisecond precision. It is derived from nanosecond UNIX timestamp by shifting it by 17 bits (time.Now().UnixNano() << 17). The library is able to change the base timestamp to any value in-order to address Year 2038 problem.

↣ βŸ¨π’βŸ© is 32-bits node/allocator identifier. It is allocated randomly to each node using cryptographic random generator or application provided value. The node identity has higher sorting priority than seconds faction of timestamp. This allows to order events if clock drifts on nodes. The random allocation give an application ability to introduce about 65K allocators before it meets a high probability of collisions.

↣ βŸ¨π’…βŸ© is 3 drift bits defines allowed clock drift. It shows the value of less important faction of time. The value supports step-wise drift from 30 seconds to 36 minutes.

↣ βŸ¨π’”βŸ© is 14-bit of monotonic strictly locally ordered integer. It helps to avoid collisions when multiple events happens during single millisecond or when the clock set backwards. The 14-bit value allows to have about 16K allocations per millisecond and over 10M per second on single node. Each instance of application process runs a unique sequence of integers. The implementation ensures that the same integer is not returned more than once on the current process. Restart of the process resets the sequence.

The library supports casting of 96-bit identifier to 64-bit by dropping βŸ¨π’βŸ© fraction. This optimization reduces a storage footprint if application uses persistent allocators.

  3bit        47 bit            14 bit
   |-|------------------------|-------|
   βŸ¨π’…βŸ©           βŸ¨π’•βŸ©              βŸ¨π’”βŸ©

Getting started

The latest version of the library is available at main branch. All development, including new features and bug fixes, take place on the main branch using forking and pull requests as described in contribution guidelines. The stable version is available via Golang modules.

Use go get to retrieve the library and add it as dependency to your application.

go get github.com/fogfish/guid/v2

Here is minimal example (also available in playground):

package main

import (
  "fmt"
  "time"

  "github.com/fogfish/guid/v2"
)

func useDefaultClock() {
  a := guid.G(guid.Clock)
  time.Sleep(1 * time.Second)
  b := guid.G(guid.Clock)
  fmt.Printf("%s < %s is %v\n", a, b, guid.Before(a, b))
}

func useCustomClock() {
  clock := guid.NewClock(
    guid.WithNodeID(0xffffffff),
  )

  c := guid.G(clock)
  time.Sleep(1 * time.Second)
  d := guid.G(clock)
  fmt.Printf("%s < %s is %v\n", c, d, guid.Before(c, d))
}

func main() {
  useDefaultClock()
  useCustomClock()
}

The library api specification is available via Go doc.

How To Contribute

The library is Apache 2.0 licensed and accepts contributions via GitHub pull requests:

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

The build and testing process requires Go version 1.13 or later.

Build and run in your development console.

git clone https://github.com/fogfish/guid
cd guid
go test

License

See LICENSE

References

  1. Lamport timestamps
  2. Universal Unique Identifiers,
  3. Twitter Snowflake
  4. Flake

Documentation ΒΆ

Overview ΒΆ

Package guid implements interface to generate k-ordered unique identifiers in lock-free and decentralized manner for Golang applications. We says that sequence A is k-ordered if it consists of strictly ordered subsequences of length k:

𝑨[π’Š βˆ’ π’Œ] ≀ 𝑨[π’Š] ≀ 𝑨[π’Š + π’Œ] for all π’Š such that π’Œ < π’Š ≀ π’βˆ’π’Œ.

Key features ΒΆ

This library aims important objectives:

↣ IDs allocation does not require centralized authority or coordination with other nodes.

↣ IDs are suitable for partial event ordering in distributed environment and helps on detection of causality violation.

↣ IDs are roughly sortable by allocation order ("time").

↣ IDs reduce indexes footprints and optimize lookup latency.

Inspiration ΒΆ

The event ordering in distributed computing is resolved using various techniques, e.g. Lamport timestamps (https://en.wikipedia.org/wiki/Lamport_timestamps), Universal Unique Identifiers (https://tools.ietf.org/html/rfc4122), Twitter Snowflake (https://blog.twitter.com/engineering/en_us/a/2010/announcing-snowflake.html) and many other techniques are offered by open source libraries. `guid` is a Golang port of https://github.com/fogfish/uid.

All these solution made a common conclusion, globally unique ID is a triple βŸ¨π’•, 𝒍, π’”βŸ©:

↣ βŸ¨π’•βŸ© monotonically increasing clock or timestamp is a primary dimension to roughly sort events,

↣ βŸ¨π’βŸ© is spatially unique identifier of ID allocator so called node location,

↣ βŸ¨π’”βŸ© sequence is a monotonic integer, which prevents clock collisions. The value is global for the node

The `guid` library addresses few issues observed in other solutions.

Every byte counts when application is processing or storing large volume of events. This library implements fixed size 96-bit identity schema, which is castable to 64-bit under certain occasion. It is about 25% improvement to compare with UUID or similar 128-bit identity schemas. Twitters Snowflake is also 64-bit.

Most of identity schemas uses monotonically increasing clock (timestamp) to roughly order events. The resolution of clock varies from nanoseconds to milliseconds. We found that usage of timestamp is not perfectly aligned with the goal of decentralized ID allocations. Usage of time synchronization protocol becomes necessary at distributed systems. Strictly speaking, NTP server becomes an authority to coordinate clock synchronization. This happens because schemas uses time fraction βŸ¨π’•βŸ© as a primary sorting key. In contrast with other libraries, `guid` do not give priority to single βŸ¨π’•βŸ© or βŸ¨π’βŸ© fraction of identity triple. It uses dynamic schema where the location fraction has higher priority than time only at particular precision. It allows to keep ordering consistent even if clocks on other node is skewed.

Identity Schema ΒΆ

A fixed size of 96-bit is used to implement identity schema

3bit  47 bit - 𝒅 bit         32 bit      𝒅 bit  14 bit
 |-|-------------------|----------------|-----|-------|
 βŸ¨π’…βŸ©        βŸ¨π’•βŸ©                βŸ¨π’βŸ©         βŸ¨π’•βŸ©     βŸ¨π’”βŸ©

↣ βŸ¨π’•βŸ© is 47-bit UTC timestamp with millisecond precision. It is derived from nanosecond UNIX timesamp by shifting it by 17 bits (time.Now().UnixNano() >> 17). The library is able to change the base timestamp to any value in-order to address Year 2038 problem.

↣ βŸ¨π’βŸ© is 32-bits node/spacial identifier. It is allocated randomly to each node using cryptographic random generator or application provided value. The node identity has higher sorting priority than seconds faction of timestamp. This allows to order events if clock drifts on nodes. The random allocation give an application ability to introduce about 65K allocators before it meets a high probability of collisions.

↣ βŸ¨π’…βŸ© is 3 drift bits defines allowed clock drift. It shows the value of less important faction of time. The value supports step-wise drift from 30 seconds to 36 minutes.

↣ βŸ¨π’”βŸ© is 14-bit of monotonic strictly locally ordered integer. It helps to avoid collisions when multiple events happens during single millisecond or when the clock set backwards. The 14-bit value allows to have about 16K allocations per millisecond and over 10M per second on single node. Each instance of application process runs a unique sequence of integers. The implementation ensures that the same integer is not returned more than once on the current process. Restart of the process resets the sequence.

The library supports casting of 96-bit identifier to 64-bit by dropping βŸ¨π’βŸ© fraction. This optimization reduces a storage footprint if application uses persistent allocators.

3bit        47 bit            14 bit
 |-|------------------------|-------|
 βŸ¨π’…βŸ©           βŸ¨π’•βŸ©              βŸ¨π’”βŸ©

Applications ΒΆ

The schema has wide range of applications where globally unique id are required.

↣ object identity: use library to generate unique identifiers.

↣ replacement of auto increment types: out-of-the-box replacement for auto increment fields in databases.

↣ vector clock: defines a logical clock for each process.

↣ CRDTs: LWW Register, OR-set and others.

Index ΒΆ

Constants ΒΆ

This section is empty.

Variables ΒΆ

This section is empty.

Functions ΒΆ

func After ΒΆ

func After(a, b K) bool

After checks if k-ordered value A is after value B

func Base62 ΒΆ added in v2.1.0

func Base62(uid K) string

Encodes k-ordered value to lexicographically sortable base62 strings

func Before ΒΆ

func Before(a, b K) bool

Before checks if k-ordered value A is before value B

func Bytes ΒΆ

func Bytes(uid K) []byte

Bytes encodes k-odered value to byte slice

func EpochI ΒΆ

func EpochI(uid K) time.Time

EpochI (inverse) convers βŸ¨π’•βŸ© timestamp fraction from identifier as unix timestamp

func EpochT ΒΆ

func EpochT(uid K) time.Time

EpochT convers βŸ¨π’•βŸ© timestamp fraction from identifier as unix timestamp

func Equal ΒΆ

func Equal(a, b K) bool

Equal compares k-order UIDs, returns true if values are equal

func Node ΒΆ

func Node(uid K) uint64

Node returns βŸ¨π’βŸ© location fraction from identifier.

func Seq ΒΆ

func Seq(uid K) uint64

Seq returns βŸ¨π’”βŸ© sequence value. The value of monotonic unique integer at the time of K-ordered value creation.

func Split ΒΆ

func Split(n uint64, uid K) (bytes []byte)

Split decomposes UID value to bytes slice. The function acts as binary comprehension, the value n defines number of bits to extract into each cell.

func String ΒΆ

func String(uid K) string

String encodes k-ordered value to lexicographically sortable strings

func Time ΒΆ

func Time(uid K) uint64

Time returns βŸ¨π’•βŸ© timestamp fraction from identifier in nano seconds

Types ΒΆ

type Chronos ΒΆ

type Chronos interface {
	// Spatially unique identifier βŸ¨π’βŸ© of ID allocator so called node location
	L() uint64
	// Monotonically increasing logical clock βŸ¨π’•βŸ©
	T() (uint64, uint64)
}

Chronos is an abstraction of logical clock used by library.

var Clock Chronos = NewClock()

Clock is global default instance of logical clock

If the application needs own default clock e.g. inverse one, it declares own clock and pair of GID & LID functions.

func NewClock ΒΆ

func NewClock(opts ...Config) Chronos

Creates instance of logical clock

func NewClockMock ΒΆ

func NewClockMock(opts ...Config) Chronos

Create mock instance of logical clock

type Config ΒΆ

type Config func(*clock)

Config option of default logical clock behavior. Config options allows to define custom strategies to generate βŸ¨π’βŸ© location or βŸ¨π’•βŸ© timestamp.

func WithClock ΒΆ

func WithClock(ticker func() uint64) Config

WithClock configures a custom timestamp generator function

func WithClockInverse ΒΆ

func WithClockInverse() Config

WithClockInverse configures inverse unix timestamp as generator function

func WithClockUnix ΒΆ

func WithClockUnix() Config

WithClockUnix configures unix timestamp time.Now().UnixNano() as generator function

func WithNodeFromEnv ΒΆ

func WithNodeFromEnv() Config

WithNodeFromEnv configures βŸ¨π’βŸ© spatially unique identifier using env variable.

CONFIG_GUID_NODE_ID - defines location id as a string

func WithNodeID ΒΆ

func WithNodeID(id uint64) Config

WithNodeID explicitly configures βŸ¨π’βŸ© spatially unique identifier

func WithNodeRandom ΒΆ

func WithNodeRandom() Config

WithNodeRandom configures βŸ¨π’βŸ© spatially unique identifier using cryptographic random generator

func WithUnique ΒΆ

func WithUnique(unique func() uint64) Config

WithUnique configures generator for βŸ¨π’”βŸ© monotonic strictly locally ordered integer

type K ΒΆ added in v2.0.2

type K struct{ Hi, Lo uint64 }

K is native representation of k-ordered number. The structure is dedicated for both local and global k-ordered values. The local k-ordered value do not uses Hi fraction (equal to 0). The global k-ordered value is 96-bit long and requires no central registration process.

Note: Golang struct is 128-bits but only 96-bits are used effectively. The serialization process ensures that only 96-bits are used.

func Diff ΒΆ

func Diff(a, b K) K

Diff approximates distance between k-order UIDs.

func FoldG ΒΆ

func FoldG(n uint64, bytes []byte) (uid K)

Fold composes UID value from byte slice. The operation is inverse to Split.

func FoldL ΒΆ

func FoldL(n uint64, bytes []byte) (uid K)

Fold composes UID value from byte slice. The operation is inverse to Split.

func FromBase62 ΒΆ added in v2.1.0

func FromBase62(val string) (K, error)

FromBase62 decodes converts k-order UID from base62 string

func FromBytes ΒΆ

func FromBytes(val []byte) (K, error)

FromBytes decodes converts k-order UID from bytes

func FromL ΒΆ

func FromL(clock Chronos, uid K) K

Casts local (64-bit) k-order UID to global (96-bit) one

func FromStringG ΒΆ added in v2.0.2

func FromStringG(val string) (K, error)

FromStringG decodes converts k-order UID from lexicographically sortable strings

func FromStringL ΒΆ added in v2.0.2

func FromStringL(val string) (K, error)

FromStringL decodes converts k-order UID from lexicographically sortable strings

func FromT ΒΆ

func FromT(t time.Time, drift ...time.Duration) K

FromT converts unix timestamp to local K-order value

func G ΒΆ

func G(clock Chronos, drift ...time.Duration) K

Generates globally unique 96-bit k-ordered identifier.

3bit  47 bit - 𝒅 bit         32 bit     𝒅 bit  14 bit
|-|-------------------|----------------|-----|-------|
βŸ¨π’…βŸ©        βŸ¨π’•βŸ©                βŸ¨π’βŸ©         βŸ¨π’•βŸ©     βŸ¨π’”βŸ©

func L ΒΆ

func L(clock Chronos, drift ...time.Duration) K

func ToL ΒΆ

func ToL(uid K) K

Casts global (96-bit) k-order value to local (64-bit) one

func Z ΒΆ

func Z(clock Chronos, drift ...time.Duration) (uid K)

Z returns "zero" local (64-bit) k-order identifier

func (K) MarshalJSON ΒΆ added in v2.0.2

func (uid K) MarshalJSON() (bytes []byte, err error)

MarshalJSON encodes k-ordered value to lexicographically sortable JSON strings

func (K) String ΒΆ added in v2.0.2

func (uid K) String() string

String encoding of K-Order value

func (*K) UnmarshalJSON ΒΆ added in v2.0.2

func (uid *K) UnmarshalJSON(b []byte) (err error)

UnmarshalJSON decodes lexicographically sortable strings to UID value

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL