mksuid

package module
v1.0.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: May 28, 2025 License: MIT Imports: 12 Imported by: 0

README

mksuid

mksuid is a fork with sub millisecond precision of the popular ksuid, an efficient, comprehensive, battle-tested Go library for generating and parsing a specific kind of globally unique identifier called a KSUID.

Install

go get -u github.com/demula/mksuid

What is a mKSUID?

mKSUID is for K-Sortable Unique IDentifier. It is a kind of globally unique identifier similar to a RFC 4122 UUID, built from the ground-up to be "naturally" sorted by generation timestamp without any special type-aware logic.

In short, running a set of mKSUIDs through the UNIX sort command will result in a list ordered by generation time.

Why use mKSUIDs?

There are numerous methods for generating unique identifiers, so why mKSUID?

  1. Naturally ordered by generation time
  2. Collision-free, coordination-free, dependency-free
  3. Highly portable representations
  4. Sub-millisecond precision

Only if #4 is needed to you use this project, for everything else use original ksuid project.

For a follow up read on the topic: A brief history of UUID

1. Naturally Ordered By Generation Time

Unlike the more ubiquitous UUIDv4, a mKSUID contains a timestamp component that allows them to be loosely sorted by generation time. This is not a strong guarantee (an invariant) as it depends on wall clocks, but is still incredibly useful in practice. Both the binary and text representations will sort by creation time without any special sorting logic.

2. Collision-free, Coordination-free, Dependency-free

While RFC 4122 UUIDv1s do include a time component, there aren't enough bytes of randomness to provide strong protection against collisions (duplicates). With such a low amount of entropy, it is feasible for a malicious party to guess generated IDs, creating a problem for systems whose security is, implicitly or explicitly, sensitive to an adversary guessing identifiers.

To fit into a 64-bit number space, Snowflake IDs and its derivatives require coordination to avoid collisions, which significantly increases the deployment complexity and operational burden.

A mKSUID includes 112 bits of pseudorandom data ("entropy"). This number space is 1024 times smaller than the 122 bits used by the well-accepted RFC 4122 UUIDv4 standard. Jointly with the sub-millisecond timestamp component, mKSUID can lower the probability of collisions, to the point of physical infeasibility in some specific practical implementations.

3. Highly Portable Representations

The text and binary representations are lexicographically sortable, which allows them to be dropped into systems which do not natively support mKSUIDs and retain their time-ordered property.

The text representation is an alphanumeric base62 encoding, so it "fits" anywhere alphanumeric strings are accepted. No delimiters are used, so stringified mKSUIDs won't be inadvertently truncated or tokenized when interpreted by software that is designed for human-readable text, a common problem for the text representation of RFC 4122 UUIDs.

4. Sub-millisecond precision

By shifting 16 bits from the payload to the timestamp representation we can deliver a time precision of 0.01ms.

How do mKSUIDs work?

Binary mKSUIDs are 20-bytes: a 48-bit unsigned integer UTC timestamp and a 112-bit randomly generated payload. The timestamp uses big-endian encoding, to support lexicographic sorting. The timestamp epoch is adjusted to Feb 2025, providing over 89 years of life. The payload is generated by a cryptographically-strong pseudorandom number generator.

The text representation is always 27 characters, encoded in alphanumeric base62 that will lexicographically sort by timestamp.

High Performance

This library is designed to be used in code paths that are performance critical. Its code has been tuned to eliminate all non-essential overhead. The KSUID type is derived from a fixed-size array, which eliminates the additional reference chasing and allocation involved in a variable-width type.

The API provides an interface for use in code paths which are sensitive to allocation. For example, the Append method can be used to parse the text representation and replace the contents of a KSUID value without additional heap allocation.

All public package level "pure" functions are concurrency-safe, protected by a global mutex. For hot loops that generate a large amount of KSUIDs from a single Goroutine, the Sequence type is provided to elide the potential contention.

By default, out of an abundance of caution, the cryptographically-secure PRNG is used to generate the random bits of a KSUID. This can be relaxed in extremely performance-critical code using the included FastRander type. FastRander uses the standard PRNG with a seed generated by the cryptographically-secure PRNG.

NOTE: While there is no evidence that FastRander will increase the probability of a collision, it shouldn't be used in scenarios where uniqueness is important to security, as there is an increased chance the generated IDs can be predicted by an adversary.

NOT Battle Tested

There are many trade-offs to this implementation and it is very important that the user understands and mitigate potential collision scenarios.

While adding extra segmentation in the timestamp, this increased substantially the sensitivity to traffic distribution of the collision space. The original second interval meant no matter how concentrated in a specific millisecond the traffic might be, the full 128 bits of random space was available ensuring no collision.

For this implementation we need to consider peaks on specific timestamp for calculating if we might run into potential collisions.

Interval Space Traffic Prob. collision[^1]
1 s 128 bit 1 T 1.44e-15
0.01 ms 112 bit 0.01 T 9.66e-15

As the table above shows, given the same density in the slot, the mKSUD is ~7 times more likely to have a collision than the original implementation. That said, 10 * 10^9 IDs in a 0.01 ms slot is significantly higher that a lot of the traffic that some systems see and even then the probability of collision is still small.

Nevertheless is should be noted that the following scenarios that increases probability of collision and should be avoided where possible:

  • The generator has millisecond accuracy. This means that traffic will pile up on the exact millisecond slot and not extend over the available sub-millisecond slots.

  • Very long batches created on the same slot.

Also data lifetime might be consider as 89 years might pass fast.

Plays Well With Others

Designed to be integrated with other libraries, the KSUID type implements many standard library interfaces, including:

  • Stringer
  • database/sql.Scanner and database/sql/driver.Valuer
  • encoding.BinaryMarshal and encoding.BinaryUnmarshal
  • encoding.TextMarshal and encoding.TextUnmarshal (encoding/json friendly!)

Command Line Tool

This package comes with a command-line tool mksuid, useful for generating mKSUIDs as well as inspecting the internal components of existing mKSUIDs. Machine-friendly output is provided for scripting use cases.

Given a Go build environment, it can be installed with the command:

go install github.com/demula/mksuid/cmd/mksuid

CLI Usage Examples

Generate a mKSUID
$ mksuid
0ujsswThIGTUYm2K8FjOOfXtY1K
Generate 4 mKSUIDs
$ mksuid -n 4
0ujsszwN8NRY24YaXiTIE2VWDTS
0ujsswThIGTUYm2K8FjOOfXtY1K
0ujssxh0cECutqzMgbtXSGnjorm
0ujsszgFvbiEr7CDgE3z8MAUPFt
Inspect the components of a mKSUID
$ mksuid -f inspect 0ujtsYcgvSTl8PAuAdqWYSMnLOv

REPRESENTATION:

  String: 0ujtsYcgvSTl8PAuAdqWYSMnLOv
     Raw: 0669F7EFB5A1CD34B5F99D1154FB6853345C9735

COMPONENTS:

       Time: 2027-05-17 04:46:50.14689 +0200 CEST
  Timestamp: 7052201014689
    Payload: CD34B5F99D1154FB6853345C9735
Generate a mKSUID and inspect its components
$ mksuid -f inspect

REPRESENTATION:

  String: 06nSoGRnELM99c9qpQTbnWHykBq
     Raw: 00C4B79B65B757F8A939A03018FA928766EB7B86

COMPONENTS:

       Time: 2025-05-28 18:15:40.04663 +0200 CEST
  Timestamp: 844894004663
    Payload: 57F8A939A03018FA928766EB7B86

Inspect a mKSUID with template formatted inspection output
$ mksuid -f template -t '{{ .Time }}: {{ .Payload }}' 0ujtsYcgvSTl8PAuAdqWYSMnLOv
2027-05-17 04:46:50.14689 +0200 CEST: CD34B5F99D1154FB6853345C9735
Inspect multiple mKSUIDs with template formatted output
$ mksuid -f template -t '{{ .Time }}: {{ .Payload }}' $(mksuid -n 4)
2025-05-28 20:28:10.24348 +0200 CEST: 65B1F219CA4E6BED6586580381C2
2025-05-28 20:28:10.24349 +0200 CEST: 0DECE2A8DE7F670E7E138FF77C82
2025-05-28 20:28:10.24349 +0200 CEST: 035D3C74371A713F4598212EA31F
2025-05-28 20:28:10.24349 +0200 CEST: 12B5FB87845977AA3605C05ABFA6
Generate mKSUIDs and output JSON using template formatting
$ mksuid -f template -t '{ "timestamp": "{{ .Timestamp }}", "payload": "{{ .Payload }}", "mksuid": "{{.String}}"}' -n 4
{ "timestamp": "845700406608", "payload": "80F12706D883B1F932716728DA09", "mksuid": "06nrkZ58B9uQRvJ9baExBnWliB7"}
{ "timestamp": "845700406608", "payload": "4617B8EB09229AAFDC64702FD975", "mksuid": "06nrkZ584e4kHTrZi7IVzZgZCUP"}
{ "timestamp": "845700406608", "payload": "2A0186B81833A0236C7FE6AE3A89", "mksuid": "06nrkZ581XL9Dibb1g0mByiixGT"}
{ "timestamp": "845700406608", "payload": "75E90559DFEA7B91A998B4E8D76D", "mksuid": "06nrkZ589wCv0rKPKO8N0zEI9bZ"}

OrNil functions

There are times when you are sure your mksuid is correct. But you need to get it from bytes or string and pass it it's to the structure. For this, there are OrNil functions that return mksuid.Nil on error and can be called directly in the structure.

Functions:

  • ParseOrNil()
  • FromPartsOrNil()
  • FromBytesOrNil()

An example of using the function without OrNil:

func getPosts(before, after []byte) {
 b, err := mksuid.FromBytes(before)
 if err != nil {
  // handle error
 }

 a, err := mksuid.FromBytes(after)
 if err != nil {
  // handle error
 }

 sortOptions := SortOptions{Before: b, After: a}
}

It is much more convenient to do it like this:

func getPosts(before, after []byte) {
 sortOptions := SortOptions{
  Before: mksuid.FromBytesOrNil(before),
  After:  mksuid.FromBytesOrNil(after),
 }
}

OrNil functions are also used in many other libraries:

Implementations for other languages

Footnotes

[^1]: Calculated using Birthday problem from thinxer on Github issue #8

License

mksuid source code is available under an MIT License.

Documentation

Index

Constants

This section is empty.

Variables

View Source
var FastRander = newRBG()

FastRander is an io.Reader that uses math/rand and is optimized for generating 16 bytes KSUID payloads. It is intended to be used as a performance improvements for programs that have no need for cryptographically secure KSUIDs and are generating a lot of them.

Functions

func Compare

func Compare(a, b KSUID) int

Implements comparison for KSUID type

func IsSorted

func IsSorted(ids []KSUID) bool

IsSorted checks whether a slice of KSUIDs is sorted

func PutUint48

func PutUint48(b []byte, v uint64)

PutUint48 stores v into b[:6] big endian ignoring the most 2 significant bits of v.

func SetRand

func SetRand(r io.Reader)

Sets the global source of random bytes for KSUID generation. This should probably only be set once globally. While this is technically thread-safe as in it won't cause corruption, there's no guarantee on ordering.

func Sort

func Sort(ids []KSUID)

Sorts the given slice of KSUIDs

func Uint48

func Uint48(b []byte) uint64

Uint48 returns the uint64 big endian representation of b[:6] droping the most 2 significant bits of uint64.

Types

type CompressedSet

type CompressedSet []byte

CompressedSet is an immutable data type which stores a set of KSUIDs.

func AppendCompressed

func AppendCompressed(set []byte, ids ...KSUID) CompressedSet

AppendCompressed uses the given byte slice as pre-allocated storage space to build a KSUID set.

Note that the set uses a compression technique to store the KSUIDs, so the resuling length is not 20 x len(ids). The rule of thumb here is for the given byte slice to reserve the amount of memory that the application would be OK to waste.

func Compress

func Compress(ids ...KSUID) CompressedSet

Compress creates and returns a compressed set of KSUIDs from the list given as arguments.

func (CompressedSet) GoString

func (set CompressedSet) GoString() string

String satisfies the fmt.GoStringer interface, returns a Go representation of the set.

func (CompressedSet) Iter

func (set CompressedSet) Iter() CompressedSetIter

Iter returns an iterator that produces all KSUIDs in the set.

func (CompressedSet) String

func (set CompressedSet) String() string

String satisfies the fmt.Stringer interface, returns a human-readable string representation of the set.

type CompressedSetIter

type CompressedSetIter struct {
	// KSUID is modified by calls to the Next method to hold the KSUID loaded
	// by the iterator.
	KSUID KSUID
	// contains filtered or unexported fields
}

CompressedSetIter is an iterator type returned by Set.Iter to produce the list of KSUIDs stored in a set.

Here's is how the iterator type is commonly used:

for it := set.Iter(); it.Next(); {
	id := it.KSUID
	// ...
}

CompressedSetIter values are not safe to use concurrently from multiple goroutines.

func (*CompressedSetIter) Next

func (it *CompressedSetIter) Next() bool

Next moves the iterator forward, returning true if there a KSUID was found, or false if the iterator as reached the end of the set it was created from.

type KSUID

type KSUID [byteLength]byte

KSUIDs are 20 bytes:

00-05 byte: uint48 BE UTC timestamp with custom epoch
06-19 byte: random "payload"
var (

	// Represents a completely empty (invalid) KSUID
	Nil KSUID
	// Represents the highest value a KSUID can have
	Max = KSUID{255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255}
)

func FromBytes

func FromBytes(b []byte) (KSUID, error)

Constructs a KSUID from a 20-byte binary representation

func FromBytesOrNil

func FromBytesOrNil(b []byte) KSUID

Constructs a KSUID from a 20-byte binary representation. Same behavior as FromBytes, but returns a Nil KSUID on error.

func FromParts

func FromParts(t time.Time, payload []byte) (KSUID, error)

Constructs a KSUID from constituent parts

func FromPartsOrNil

func FromPartsOrNil(t time.Time, payload []byte) KSUID

Constructs a KSUID from constituent parts. Same behavior as FromParts, but returns a Nil KSUID on error.

func New

func New() KSUID

Generates a new KSUID. In the strange case that random bytes can't be read, it will panic.

func NewRandom

func NewRandom() (ksuid KSUID, err error)

Generates a new KSUID

func NewRandomWithTime

func NewRandomWithTime(t time.Time) (ksuid KSUID, err error)

func Parse

func Parse(s string) (KSUID, error)

Parse decodes a string-encoded representation of a KSUID object

func ParseOrNil

func ParseOrNil(s string) KSUID

Parse decodes a string-encoded representation of a KSUID object. Same behavior as Parse, but returns a Nil KSUID on error.

func (KSUID) Append

func (i KSUID) Append(b []byte) []byte

Append appends the string representation of i to b, returning a slice to a potentially larger memory area.

func (KSUID) Bytes

func (i KSUID) Bytes() []byte

Raw byte representation of KSUID

func (KSUID) Get

func (i KSUID) Get() interface{}

Get satisfies the flag.Getter interface, making it possible to use KSUIDs as part of of the command line options of a program.

func (KSUID) IsNil

func (i KSUID) IsNil() bool

IsNil returns true if this is a "nil" KSUID

func (KSUID) MarshalBinary

func (i KSUID) MarshalBinary() ([]byte, error)

func (KSUID) MarshalText

func (i KSUID) MarshalText() ([]byte, error)

func (KSUID) Next

func (id KSUID) Next() KSUID

Next returns the next KSUID after id.

func (KSUID) Payload

func (i KSUID) Payload() []byte

The 16-byte random payload without the timestamp

func (KSUID) Prev

func (id KSUID) Prev() KSUID

Prev returns the previoud KSUID before id.

func (*KSUID) Scan

func (i *KSUID) Scan(src interface{}) error

Scan implements the sql.Scanner interface. It supports converting from string, []byte, or nil into a KSUID value. Attempting to convert from another type will return an error.

func (*KSUID) Set

func (i *KSUID) Set(s string) error

Set satisfies the flag.Value interface, making it possible to use KSUIDs as part of of the command line options of a program.

func (KSUID) String

func (i KSUID) String() string

String-encoded representation that can be passed through Parse()

func (KSUID) Time

func (i KSUID) Time() time.Time

The timestamp portion of the ID as a Time object

func (KSUID) Timestamp

func (i KSUID) Timestamp() uint64

The timestamp portion of the ID as a bare integer which is uncorrected for KSUID's special epoch.

func (*KSUID) UnmarshalBinary

func (i *KSUID) UnmarshalBinary(b []byte) error

func (*KSUID) UnmarshalText

func (i *KSUID) UnmarshalText(b []byte) error

func (KSUID) Value

func (i KSUID) Value() (driver.Value, error)

Value converts the KSUID into a SQL driver value which can be used to directly use the KSUID as parameter to a SQL query.

type Sequence

type Sequence struct {
	// The seed is used as base for the KSUID generator, all generated KSUIDs
	// share the same leading 18 bytes of the seed.
	Seed KSUID
	// contains filtered or unexported fields
}

Sequence is a KSUID generator which produces a sequence of ordered KSUIDs from a seed.

Up to 65536 KSUIDs can be generated by for a single seed.

A typical usage of a Sequence looks like this:

seq := ksuid.Sequence{
	Seed: ksuid.New(),
}
id, err := seq.Next()

Sequence values are not safe to use concurrently from multiple goroutines.

func (*Sequence) Bounds

func (seq *Sequence) Bounds() (min KSUID, max KSUID)

Bounds returns the inclusive min and max bounds of the KSUIDs that may be generated by the sequence. If all ids have been generated already then the returned min value is equal to the max.

func (*Sequence) Next

func (seq *Sequence) Next() (KSUID, error)

Next produces the next KSUID in the sequence, or returns an error if the sequence has been exhausted.

Directories

Path Synopsis
cmd
mksuid command

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL