uring

package module
v0.0.0-alpha1
Published: Apr 9, 2026 License: MIT Imports: 20 Imported by: 0

README

uring

Go package for the kernel-facing io_uring boundary on Linux 6.18+.

Overview

uring handles the kernel-facing boundary for Linux io_uring. It creates and starts rings, prepares SQEs, decodes CQEs, carries submission identity through user_data, and provides buffer registration, multishot operations, and listener-setup primitives.

uring draws a clear boundary: kernel-facing mechanics and observable completion facts live at the API edge; policy and composition stay above it.

The primary surfaces are:

  • Uring, the live ring handle and operation set
  • SQEContext, the submission identity carried in user_data
  • CQEView, the borrowed completion view returned by Wait
  • buffer provisioning through registered buffers and multi-size buffer groups

Installation

uring requires Linux kernel 6.18 or later. Check the running kernel first:

uname -r

Debian 13's stable kernel track may still be below 6.18. See Debian 13 kernel upgrade for the backports path to a kernel that meets the requirement.

go get code.hybscloud.com/uring

Debian 13 kernel upgrade

Debian 13 ships kernel 6.12 in its stable track. The trixie-backports suite provides a Debian-packaged 6.18+ kernel. See SETUP.md for step-by-step instructions.
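The 6.18 requirement can also be checked in code before a ring is created. The following is a minimal sketch that parses a `uname -r` style release string; `kernelAtLeast` is a hypothetical helper for illustration and is not part of uring, and the sample release strings are only examples.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// kernelAtLeast reports whether a `uname -r` style release string
// (for example "6.18.3-amd64") is at least major.minor.
// Hypothetical helper for illustration; not part of uring.
func kernelAtLeast(release string, major, minor int) (bool, error) {
	parts := strings.SplitN(release, ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("unexpected kernel release %q", release)
	}
	maj, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("bad major in %q: %w", release, err)
	}
	// The minor field may carry a suffix such as "18-rc1"; keep the leading digits only.
	field := parts[1]
	end := 0
	for end < len(field) && field[end] >= '0' && field[end] <= '9' {
		end++
	}
	minorNum, err := strconv.Atoi(field[:end])
	if err != nil {
		return false, fmt.Errorf("bad minor in %q: %w", release, err)
	}
	return maj > major || (maj == major && minorNum >= minor), nil
}

func main() {
	for _, rel := range []string{"6.12.38+deb13-amd64", "6.18.3-amd64", "7.0.0"} {
		ok, err := kernelAtLeast(rel, 6, 18)
		fmt.Println(rel, ok, err)
	}
}
```

The release string is passed in rather than read from the system, so the check stays testable on any host.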

Troubleshooting

Ring creation may return ENOMEM, EPERM, or ENOSYS depending on memlock limits, sysctl settings, or kernel support. Many container runtimes also block io_uring syscalls by default. See SETUP.md for diagnosis and resolution.
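As a rough illustration of that triage, the sketch below maps an error to the first thing worth checking. It assumes the error returned by ring creation unwraps to a `syscall.Errno`, which this page does not confirm; `setupHint` and `sampleErr` are hypothetical names, and SETUP.md remains the authoritative guide.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// sampleErr stands in for an error returned by ring creation (assumed shape).
var sampleErr = fmt.Errorf("io_uring_setup: %w", syscall.ENOMEM)

// setupHint maps a ring-creation error to a first thing to check.
// The mapping mirrors the causes listed above and is a starting point only.
func setupHint(err error) string {
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, syscall.ENOMEM):
		return "check the memlock limit (RLIMIT_MEMLOCK)"
	case errors.Is(err, syscall.EPERM):
		return "check io_uring sysctl settings or the container policy"
	case errors.Is(err, syscall.ENOSYS):
		return "check kernel io_uring support"
	default:
		return "unclassified: " + err.Error()
	}
}

func main() {
	fmt.Println(setupHint(sampleErr))
	fmt.Println(setupHint(nil))
}
```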

Ring lifecycle

New returns an unstarted ring and eagerly constructs the context pools. Call Start before submitting operations — it registers ring resources and enables the ring. uring assumes the 6.18+ baseline and carries no fallback branches for older kernels.

ring, err := uring.New(func(o *uring.Options) {
    o.Entries = uring.EntriesMedium
})
if err != nil {
    return err
}

if err := ring.Start(); err != nil {
    return err
}

cqes := make([]uring.CQEView, 64)
n, err := ring.Wait(cqes)
if err != nil && !errors.Is(err, iox.ErrWouldBlock) {
    return err
}

for i := range n {
    cqe := cqes[i]
    if cqe.Res < 0 {
        return fmt.Errorf("completion failed: op=%d fd=%d res=%d", cqe.Op(), cqe.FD(), cqe.Res)
    }
    fmt.Printf("completed op=%d on fd=%d with res=%d\n", cqe.Op(), cqe.FD(), cqe.Res)
}

Wait flushes pending submissions, then reaps completions. On single-issuer rings it also issues the kernel enter that keeps deferred task work moving once the SQ drains; the caller must serialize Wait/enter with submit-state operations. iox.ErrWouldBlock signals that no completion is currently observable at the boundary. The error is defined in code.hybscloud.com/iox.

Start and Stop form the ring lifecycle pair. Stop is idempotent and renders the ring permanently unusable — call it only after you have drained all in-flight operations, reaped outstanding CQEs, and quiesced live multishot subscriptions.

Types and operations

Type                 Role
Uring                Ring setup, submission, completion reaping, and operation methods
Options              Ring entries, registered-buffer budget, buffer-group scale, and completion visibility
SQEContext           Compact submission identity stored in user_data
CQEView              Borrowed completion record with decoded context accessors
ListenerOp           Handle to a listener creation operation with FD and accept helpers
BundleIterator       Iterates over buffers consumed in a bundle receive
IncrementalReceiver  Manages incremental buffer-ring receives (IOU_PBUF_RING_INC)
ZCTracker            Tracks the two-CQE zero-copy send lifecycle
ContextPools         Pools for indirect and extended submission contexts
ZCRXReceiver         Zero-copy receive lifecycle over a NIC RX queue
ZCRXConfig           Configuration for a ZCRX receive instance
ZCRXHandler          Callback interface for ZCRX data, errors, and shutdown
ZCRXBuffer           Delivered zero-copy receive view with kernel refill on release

Operations:

Area        Methods
Socket      TCP4Socket, TCP6Socket, UDP4Socket, UDP6Socket, UDPLITE4Socket, UDPLITE6Socket, SCTP4Socket, SCTP6Socket, UnixSocket, SocketRaw, plus *Direct variants
Connection  Bind, Listen, Accept, AcceptDirect, Connect, Shutdown
Socket I/O  Receive, Send, RecvMsg, SendMsg, ReceiveBundle, ReceiveZeroCopy, Multicast, MulticastZeroCopy
Multishot   AcceptMultishot, ReceiveMultishot, SubmitAcceptMultishot, SubmitAcceptDirectMultishot, SubmitReceiveMultishot, SubmitReceiveBundleMultishot
File I/O    Read, Write, ReadV, WriteV, ReadFixed, WriteFixed, ReadvFixed, WritevFixed
File mgmt   OpenAt, Close, Sync, Fallocate, FTruncate, Statx, RenameAt, UnlinkAt, MkdirAt, SymlinkAt, LinkAt
Xattr       FGetXattr, FSetXattr, GetXattr, SetXattr
Transfer    Splice, Tee, Pipe, SyncFileRange, FileAdvise
Timeout     Timeout, TimeoutRemove, TimeoutUpdate, LinkTimeout
Cancel      AsyncCancel, AsyncCancelFD, AsyncCancelOpcode, AsyncCancelAny, AsyncCancelAll
Poll        PollAdd, PollRemove, PollUpdate, PollAddLevel, PollAddMultishot, PollAddMultishotLevel
Async       EpollWait, FutexWait, FutexWake, FutexWaitV, Waitid
Ring msg    MsgRing, MsgRingFD, FixedFdInstall, FilesUpdate
Cmd         UringCmd, UringCmd128, Nop, Nop128

Nop128 and UringCmd128 require a ring created with Options.SQE128 and kernel support for the corresponding opcodes. Without both, they return ErrNotSupported.

Uring.Close submits IORING_OP_CLOSE for a target file descriptor. It is not a ring teardown method.

Context transport

SQEContext is the primary identity token. In direct mode it packs the opcode, SQE flags, buffer-group ID, and file descriptor into a single 64-bit value.

sqeCtx := uring.ForFD(fd).
    WithOp(uring.IORING_OP_RECV).
    WithBufGroup(groupID)

The three context modes are:

Mode      Representation          Typical use
Direct    Inline 64-bit payload   Common submit and reap path, zero allocation
Indirect  Pointer to IndirectSQE  Full SQE payload when 64 bits are not enough
Extended  Pointer to ExtSQE       Full SQE plus 64 bytes of user data

For the common path, start with ForFD or PackDirect and attach only the bits you need to see again at completion time. WithFlags replaces the entire flag set, so compute unions before calling it.
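The replace-not-merge behaviour of WithFlags is easy to trip over when chaining calls. The sketch below models it with a hypothetical `ctx` type that stores one flags byte; it is not the package's SQEContext. The two flag values are the IOSQE_IO_LINK and IOSQE_BUFFER_SELECT bits from the Linux io_uring UAPI.

```go
package main

import "fmt"

// SQE flag bits as defined by the Linux io_uring UAPI.
const (
	iosqeIOLink       uint8 = 1 << 2 // IOSQE_IO_LINK
	iosqeBufferSelect uint8 = 1 << 5 // IOSQE_BUFFER_SELECT
)

// ctx is a hypothetical stand-in for a context that stores one flags byte.
type ctx struct{ flags uint8 }

// withFlags replaces the whole flag set, mirroring the behaviour described above.
func (c ctx) withFlags(f uint8) ctx {
	c.flags = f
	return c
}

func main() {
	// Chaining two calls keeps only the last value.
	lost := ctx{}.withFlags(iosqeIOLink).withFlags(iosqeBufferSelect)
	// Computing the union first keeps both bits.
	kept := ctx{}.withFlags(iosqeIOLink | iosqeBufferSelect)
	fmt.Printf("chained=%#02x union=%#02x\n", lost.flags, kept.flags)
}
```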

When you need caller-owned metadata beyond the 64-bit direct layout, borrow an ExtSQE, write into its UserData through Ctx*Of or ViewCtx*, and pack it back into an SQEContext. Prefer scalar payloads. If a raw overlay or typed view stores Go pointers, interfaces, func values, slices, strings, maps, chans, or structs containing them, keep the live roots outside UserData — the GC does not trace those raw bytes.

ext := ring.ExtSQE()
meta := uring.CtxV1Of(ext)
meta.Val1 = requestSeq

sqeCtx := uring.PackExtended(ext)
fmt.Printf("sqe context mode=%d seq=%d\n", sqeCtx.Mode(), meta.Val1)

NewContextPools returns pools that are ready to use. Call Reset only once all borrowed contexts have been returned and you want to reuse the pool set.

Completion dispatch with CQEView

There is no separate completion-context type. All completion dispatch goes through CQEView — call cqe.Context() to recover the original submission token.

cqes := make([]uring.CQEView, 64)

n, err := ring.Wait(cqes)
if err != nil && !errors.Is(err, iox.ErrWouldBlock) {
    return err
}

for i := 0; i < n; i++ {
    cqe := cqes[i]
    if cqe.Res < 0 {
        return fmt.Errorf("completion failed: op=%d fd=%d res=%d", cqe.Op(), cqe.FD(), cqe.Res)
    }

    switch cqe.Op() {
    case uring.IORING_OP_ACCEPT:
        fmt.Printf("accepted fd=%d\n", cqe.Res)
    case uring.IORING_OP_RECV:
        if cqe.HasBuffer() {
            fmt.Printf("buffer id=%d\n", cqe.BufID())
        }
        if cqe.Extended() {
            seq := uring.CtxV1Of(cqe.ExtSQE()).Val1
            fmt.Printf("request seq=%d\n", seq)
        }
    }
}

CQEView decodes the matching context mode on demand at completion time. CQEView, IndirectSQE, ExtSQE, and borrowed buffers must not outlive their documented lifetimes.
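For readers unfamiliar with the raw CQE encoding that accessors such as Res, HasBuffer, and BufID sit on top of, the sketch below decodes the two kernel-level facts directly. The function names are hypothetical and this is not the package's implementation; the constants are the IORING_CQE_F_BUFFER bit and the IORING_CQE_BUFFER_SHIFT value from the Linux io_uring UAPI.

```go
package main

import (
	"fmt"
	"syscall"
)

// Values from the Linux io_uring UAPI.
const (
	cqeFBuffer     = 1 << 0 // IORING_CQE_F_BUFFER: a provided buffer was used
	cqeBufferShift = 16     // IORING_CQE_BUFFER_SHIFT: buffer ID is in the upper 16 bits
)

// resultErr converts a negative completion result into an errno.
// Non-negative results carry no error.
func resultErr(res int32) error {
	if res >= 0 {
		return nil
	}
	return syscall.Errno(-res)
}

// bufferID extracts the provided-buffer ID from CQE flags.
// ok is false when the completion did not use a provided buffer.
func bufferID(flags uint32) (id uint16, ok bool) {
	if flags&cqeFBuffer == 0 {
		return 0, false
	}
	return uint16(flags >> cqeBufferShift), true
}

func main() {
	fmt.Println(resultErr(-11)) // errno 11 is EAGAIN on Linux
	id, ok := bufferID(7<<16 | cqeFBuffer)
	fmt.Println(id, ok)
}
```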

Buffer provisioning

Two receive-buffer strategies are supported:

  • fixed-size provided buffers through ReadBufferSize and ReadBufferNum
  • multi-size buffer groups through MultiSizeBuffer

For most systems the configuration helpers are the easiest entry point:

opts := uring.OptionsForSystem(uring.MachineMemory4GB)
ring, err := uring.New(func(o *uring.Options) {
    *o = opts
})

Use OptionsForBudget to start from an explicit memory budget, or BufferConfigForBudget to inspect the tier layout chosen for a given budget.

Registered buffers require pinned memory. If large buffer registration fails, increase RLIMIT_MEMLOCK or use a smaller memory budget.

Multishot and listener operations

AcceptMultishot, ReceiveMultishot, SubmitAcceptMultishot, SubmitAcceptDirectMultishot, SubmitReceiveMultishot, and SubmitReceiveBundleMultishot each submit a multishot socket operation.

CQE routing policy stays outside the package. Listener setup progresses through DecodeListenerCQE, PrepareListenerBind, PrepareListenerListen, and SetListenerReady; the caller decides how to dispatch completions and when to stop the chain.

Architecture implementation

The implementation sits at this boundary:

  1. New builds a disabled kernel ring, constructs context pools, and selects a buffer strategy.
  2. Start registers buffers and enables the ring for the 6.18+ baseline.
  3. Operation methods express intent by writing SQEs.
  4. Wait flushes submissions and returns borrowed CQE views.
  5. Higher layers decide scheduling, retries, parking, and orchestration.

This keeps uring focused on kernel-facing mechanics and preserves completion meaning across the boundary.

Application-layer patterns

uring exposes kernel mechanics; scheduling, retry, connection tracking, and protocol interpretation belong in the layers above it. The patterns below outline common ways to structure that upper layer.

Ring-owning event loop

In single-issuer mode (the default), one goroutine serializes all submit-state operations. A typical loop submits pending work, applies caller-owned iox.Backoff when Wait reports no observable progress, and dispatches completions:

func runLoop(ring *uring.Uring, stop <-chan struct{}) error {
    cqes := make([]uring.CQEView, 64)
    var backoff iox.Backoff
    for {
        select {
        case <-stop:
            return nil
        default:
        }

        n, err := ring.Wait(cqes)
        if errors.Is(err, iox.ErrWouldBlock) {
            backoff.Wait()
            continue
        }
        if err != nil {
            return err
        }
        if n == 0 {
            backoff.Wait()
            continue
        }

        backoff.Reset()
        for i := range n {
            dispatch(ring, cqes[i])
        }
    }
}

All ring methods, including Send, Receive, AcceptMultishot, and Wait, run on this goroutine. Work from other goroutines enters the loop through a channel or a lock-free queue, not by calling ring methods directly. iox.Backoff stays caller-owned: call backoff.Wait() on iox.ErrWouldBlock or when Wait returns no CQEs, and backoff.Reset() after any batch with n > 0.
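The channel handoff described above can be sketched without the ring itself. In this minimal example `request`, `ownerLoop`, and `submitFrom` are hypothetical names, and the "submission" is simulated; in a real program the owner goroutine would be the only place that calls ring methods.

```go
package main

import "fmt"

// request is a hypothetical unit of work handed to the ring-owning goroutine.
type request struct {
	payload string
	done    chan<- string
}

// ownerLoop stands in for the ring-owning goroutine. It is the only place
// where ring methods would be called; here the submission is simulated.
func ownerLoop(requests <-chan request) {
	for req := range requests {
		// A real loop would call a ring method here and reply at completion time.
		req.done <- "submitted:" + req.payload
	}
}

// submitFrom can be called from any goroutine; it never touches the ring.
func submitFrom(requests chan<- request, payload string) string {
	done := make(chan string, 1)
	requests <- request{payload: payload, done: done}
	return <-done
}

func main() {
	requests := make(chan request, 16)
	go ownerLoop(requests)
	fmt.Println(submitFrom(requests, "hello"))
	close(requests)
}
```

A buffered reply channel keeps the owner goroutine from blocking if the caller has already given up on the result.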

Multishot subscription lifecycle

A multishot operation produces a stream of CQEs until the kernel sends a final one (without IORING_CQE_F_MORE). The framework layer tracks subscriptions and handles resubmission:

handler := uring.NewMultishotSubscriber().
    OnStep(func(step uring.MultishotStep) uring.MultishotAction {
        if step.Err != nil {
            return uring.MultishotStop
        }
        connFD := iofd.FD(step.CQE.Res)
        registerConnection(connFD)
        return uring.MultishotContinue
    }).
    OnStop(func(err error, cancelled bool) {
        if !cancelled {
            resubscribeAccept()
        }
    })

_, err := ring.AcceptMultishot(acceptCtx, handler.Handler())

The OnStep callback (OnMultishotStep on a handler type) observes each completion; return MultishotContinue to keep the stream or MultishotStop to request cancellation. The OnStop callback (OnMultishotStop on a handler type) runs once at the terminal state. Use it for cleanup and conditional resubscription.
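The terminal condition for a multishot stream is visible in the raw CQE flags. The sketch below shows the check in isolation; `isFinal` and `countUntilFinal` are hypothetical helpers, and the flag value is IORING_CQE_F_MORE from the Linux io_uring UAPI.

```go
package main

import "fmt"

// IORING_CQE_F_MORE from the Linux io_uring UAPI: more CQEs will follow.
const cqeFMore uint32 = 1 << 1

// isFinal reports whether a multishot CQE is the last one for its SQE.
func isFinal(flags uint32) bool { return flags&cqeFMore == 0 }

// countUntilFinal returns how many CQEs belong to the stream, including
// the terminal one, or -1 when no terminal CQE is present.
func countUntilFinal(flags []uint32) int {
	for i, f := range flags {
		if isFinal(f) {
			return i + 1
		}
	}
	return -1
}

func main() {
	stream := []uint32{cqeFMore, cqeFMore, 0}
	fmt.Println(countUntilFinal(stream))
}
```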

Per-connection state with typed contexts

Extended contexts carry per-connection references through the submit → complete round-trip without a global lookup table:

type ConnState struct {
    Addr    netip.AddrPort
    Created int64
}

ext := ring.ExtSQE()
ctx := uring.Ctx1V1Of[ConnState](ext)
ctx.Ref1 = connState
ctx.Val1 = sequenceNumber

sqeCtx := uring.PackExtended(ext)
if err := ring.Send(sqeCtx, &fd, payload); err != nil {
    ring.PutExtSQE(ext)
    return err
}

At completion time, recover the state through the same typed view:

ext := cqe.ExtSQE()
ctx := uring.Ctx1V1Of[ConnState](ext)
conn := ctx.Ref1
seq := ctx.Val1
ring.PutExtSQE(ext)

Keep live Go pointer roots reachable outside UserData. The GC does not trace those raw bytes. The sidecar root set attached to each ExtSQE slot handles this for internal multishot and listener protocols, but framework code that places typed refs must keep them reachable independently.
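One way to follow this advice is to store only a scalar handle in the raw bytes and keep the Go values in a side table that the GC can trace. The sketch below is a standalone illustration under that assumption; `rootTable`, `connState`, and the 64-byte array are hypothetical stand-ins and do not reflect the package's internal sidecar.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// connState is a hypothetical per-connection record that holds Go pointers.
type connState struct {
	name string
	peer *string
}

// rootTable keeps the live Go values reachable; raw bytes only carry an index.
type rootTable struct {
	slots []*connState
	free  []uint32
}

// put stores st and returns a scalar handle that is safe to write into raw bytes.
func (t *rootTable) put(st *connState) uint32 {
	if n := len(t.free); n > 0 {
		h := t.free[n-1]
		t.free = t.free[:n-1]
		t.slots[h] = st
		return h
	}
	t.slots = append(t.slots, st)
	return uint32(len(t.slots) - 1)
}

// take returns the value for h and releases the slot for reuse.
func (t *rootTable) take(h uint32) *connState {
	st := t.slots[h]
	t.slots[h] = nil
	t.free = append(t.free, h)
	return st
}

func main() {
	var table rootTable
	var userData [64]byte // stand-in for the 64-byte raw user-data area

	peer := "192.0.2.1:443"
	h := table.put(&connState{name: "conn-1", peer: &peer})
	binary.LittleEndian.PutUint32(userData[:4], h) // submit side: store the scalar only

	got := table.take(binary.LittleEndian.Uint32(userData[:4])) // completion side
	fmt.Println(got.name, *got.peer)
}
```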

Deadline composition

LinkTimeout attaches a deadline to the preceding SQE through an IOSQE_IO_LINK chain. The operation and the timeout race: whichever finishes first wins, and the other is cancelled.

recvCtx := uring.ForFD(fd).
    WithOp(uring.IORING_OP_RECV).
    WithBufGroup(group)

if err := ring.Receive(recvCtx, &fd, nil, uring.WithFlags(uring.IOSQE_IO_LINK)); err != nil {
    return err
}

timeoutCtx := uring.PackDirect(uring.IORING_OP_LINK_TIMEOUT, 0, 0, 0)
if err := ring.LinkTimeout(timeoutCtx, 5*time.Second); err != nil {
    return err
}

The framework layer handles both outcomes: a successful receive cancels the timeout, and a fired timeout cancels the receive. Both produce CQEs that the dispatch loop must observe.
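A dispatch loop can tell the two outcomes apart from the pair of results. The sketch below assumes the usual kernel convention: a cancelled linked timeout completes with -ECANCELED, and a fired timeout completes with -ETIME while the operation completes with -ECANCELED. The errno values are the Linux ones, the names are hypothetical, and other combinations (for example an interrupted operation) fall into the catch-all case.

```go
package main

import "fmt"

// Linux errno values used by linked-timeout completions.
const (
	eTime     = 62  // ETIME
	eCanceled = 125 // ECANCELED
)

type outcome int

const (
	opCompleted outcome = iota // the operation finished (with success or its own error) before the deadline
	timedOut                   // the deadline fired and the operation was cancelled
	unexpected                 // any other combination; inspect both results
)

// classify interprets the results of an operation and its linked timeout.
func classify(opRes, timeoutRes int32) outcome {
	switch {
	case timeoutRes == -eCanceled:
		return opCompleted
	case timeoutRes == -eTime && opRes == -eCanceled:
		return timedOut
	default:
		return unexpected
	}
}

func main() {
	fmt.Println(classify(128, -eCanceled) == opCompleted)
	fmt.Println(classify(-eCanceled, -eTime) == timedOut)
}
```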

TCP usage patterns

These are the shortest flows, meant to be read alongside the tests:

Scenario     Main APIs                                                 Reference
Echo server  ListenerManager, AcceptMultishot, ReceiveMultishot, Send  listener_example_test.go, examples/multishot_test.go, examples/echo_test.go
Client       TCP4Socket, Connect, Send, Receive                        socket_integration_test.go

TCP echo server

ListenerManager prepares the socket → bind → listen chain for you. Once the listener is live, start multishot accept and multishot receive on the connection FDs.

pool := uring.NewContextPools(32)
manager := uring.NewListenerManager(ring, pool)

listenerOp, err := manager.ListenTCP4(addr, 128, listenerHandler)
if err != nil {
    return err
}

acceptSub, err := listenerOp.AcceptMultishot(acceptHandler)
if err != nil {
    return err
}
defer acceptSub.Cancel()

recvCtx := uring.ForFD(clientFD).WithBufGroup(readGroup)
recvSub, err := ring.ReceiveMultishot(recvCtx, recvHandler)
if err != nil {
    return err
}
defer recvSub.Cancel()

listener_example_test.go covers listener setup with multishot accept, examples/multishot_test.go covers handler-side multishot receive CQEs, and examples/echo_test.go covers the full loopback echo flow.

TCP client

Create a socket, wait for the IORING_OP_SOCKET completion, then wrap the returned FD in an iofd.FD for Connect, Send, and Receive.

clientCtx := uring.PackDirect(uring.IORING_OP_SOCKET, 0, 0, 0)
if err := ring.TCP4Socket(clientCtx); err != nil {
    return err
}

// socketCQE is the IORING_OP_SOCKET completion reaped by a ring.Wait call (elided here).
clientFD := iofd.NewFD(int(socketCQE.Res))

connectCtx := uring.PackDirect(uring.IORING_OP_CONNECT, 0, 0, int32(clientFD))
if err := ring.Connect(connectCtx, remoteAddr); err != nil {
    return err
}

sendCtx := uring.PackDirect(uring.IORING_OP_SEND, 0, 0, int32(clientFD))
if err := ring.Send(sendCtx, &clientFD, payload); err != nil {
    return err
}

recvCtx := uring.PackDirect(uring.IORING_OP_RECV, 0, 0, int32(clientFD))
if err := ring.Receive(recvCtx, &clientFD, buf); err != nil {
    return err
}

After each submit, reuse the Wait loop from the ring lifecycle section to observe the matching completion. socket_integration_test.go at the package level covers the connect/send cycle.

Zero-copy receive (ZCRX)

ZCRXReceiver drives zero-copy receive from a NIC hardware RX queue through io_uring.

NewZCRXReceiver requires a ring with 32-byte CQEs (IORING_SETUP_CQE32). The current Options surface does not expose that setup flag, so rings created through the standard New path cause this constructor to return ErrNotSupported.

Lifecycle

  1. Create the receiver with NewZCRXReceiver on a ring with 32-byte CQEs. The constructor registers the ZCRX interface queue, maps the refill area, and prepares the refill ring.
  2. Call Start to submit the extended RECV_ZC operation.
  3. On the CQE dispatch path, ZCRX completions route to the ZCRXHandler:
    • OnData delivers a ZCRXBuffer pointing into the NIC-mapped area. Call Release when done to return the slot to the kernel. Return false to request a best-effort stop.
    • OnError delivers CQE errors. Return false to request a best-effort stop.
    • OnStopped fires once during terminal retirement, before the state reaches Stopped.
  4. Call Stop to submit an async cancel. The receiver transitions through Stopping → Retiring → Stopped.
  5. Poll Stopped until it returns true, stop the owning ring, then call Close to release the mapped area and the refill-ring mapping.

State machine

Idle → Active → Stopping → Retiring → Stopped

Stop reverts to Active if cancel submission fails. Close is idempotent.
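The transitions above, including the revert on a failed cancel submission, can be written down as a small table. This is a sketch with hypothetical names for reasoning about the lifecycle; it is not the receiver's actual implementation.

```go
package main

import "fmt"

type state int

const (
	idle state = iota
	active
	stopping
	retiring
	stopped
)

var names = [...]string{"Idle", "Active", "Stopping", "Retiring", "Stopped"}

func (s state) String() string { return names[s] }

// next lists the forward transitions from the state machine above.
var next = map[state]state{
	idle:     active,
	active:   stopping,
	stopping: retiring,
	retiring: stopped,
}

// canTransition reports whether moving from one state to another is allowed.
// Besides the forward edges, Stopping may revert to Active when cancel
// submission fails.
func canTransition(from, to state) bool {
	if from == stopping && to == active {
		return true
	}
	n, ok := next[from]
	return ok && n == to
}

func main() {
	fmt.Println(idle, "->", next[idle], canTransition(idle, active))
	fmt.Println(stopping, "->", active, canTransition(stopping, active))
	fmt.Println(idle, "->", stopped, canTransition(idle, stopped))
}
```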

Handler contract

  • OnData and OnError are called serially from the CQE dispatch goroutine.
  • Release is single-producer; call it only from the dispatch goroutine.
  • Stop must not race with CQE dispatch. The caller is responsible for this serialization.

Examples

The example tests in uring/examples/ show the API in practice.

  • multishot_test.go: multishot accept, multishot receive, and subscription stop behavior
  • file_io_test.go: basic file reads, writes, and batching
  • fixed_buffers_test.go: registered buffers and fixed-buffer I/O
  • vectored_io_test.go: vectored read and write operations
  • splice_tee_test.go: splice and tee zero-copy data transfer
  • zerocopy_test.go: zero-copy send paths and completion tracking
  • poll_test.go: poll-based readiness workflows
  • buffer_ring_test.go: buffer ring provisioning and multi-size buffer groups
  • context_test.go: direct, indirect, and extended SQEContext flows plus CQEView access
  • echo_test.go: TCP echo server and UDP ping-pong flows
  • timeout_test.go: timeout and linked-timeout operations

The package-level listener_example_test.go covers listener creation with multishot accept, and socket_integration_test.go covers the TCP client connect/send flow.

Operational notes

  • Enable NotifySucceed when you need a visible CQE for every successful operation.
  • ring.Features reports actual SQ/CQ entry counts, SQE slot width, and the byte order used to interpret user_data.
  • Leave MultiIssuers unset for the default single-issuer configuration (SINGLE_ISSUER + DEFER_TASKRUN) when a single execution path serializes submit-state operations (submit, Wait/enter, Stop, and resize). Set it only when multiple goroutines need concurrent submission or wait-side enter — this switches the ring to the shared-submit COOP_TASKRUN configuration.
  • EpollWait requires timeout to remain 0; use LinkTimeout when you need a deadline.
  • Release or discard borrowed completion views and pooled contexts promptly.
  • ListenerOp.Close closes the listener FD immediately. If a setup CQE is still pending, drain it first, then call Close again to return the borrowed ExtSQE to the pool.

Platform support

uring targets Go 1.26+ and Linux 6.18+ on the real kernel-backed path. Most source files and example tests carry a //go:build linux guard. Darwin files provide API-compatible stubs for Darwin builds; they do not change the Linux runtime baseline.

License

MIT, see LICENSE.

©2026 Hayabusa Cloud Co., Ltd.

Documentation

Overview

Package uring provides the kernel-boundary `io_uring` surface for Linux 6.18+. Its core Linux `io_uring` implementation was refactored from `code.hybscloud.com/sox` into this dedicated package. It prepares SQEs, decodes CQEs, transports submission context through `user_data`, and exposes kernel-boundary facts; dispatch, retry, and other orchestration policy stay in higher layers.

// Create TCP socket
socketCtx := uring.PackDirect(uring.IORING_OP_SOCKET, 0, 0, 0)
ring.TCP4Socket(socketCtx)

// Submit low-level multishot accept (one SQE, multiple CQEs)
acceptCtx := uring.PackDirect(uring.IORING_OP_ACCEPT, 0, 0, listenerFD)
ring.SubmitAcceptMultishot(acceptCtx)

// Start multishot receive with buffer selection
recvCtx := uring.ForFD(clientFD).WithBufGroup(bufGroupID)
sub, err := ring.ReceiveMultishot(recvCtx, recvHandler)

Completions return the kernel result together with the submission context.

cqes := make([]uring.CQEView, 64)
n, err := ring.Wait(cqes)  // Poll CQ, returns iox.ErrWouldBlock if empty

for i := range n {
    cqe := cqes[i]
    if cqe.Res < 0 {
        return fmt.Errorf("completion failed: op=%d fd=%d res=%d", cqe.Op(), cqe.FD(), cqe.Res)
    }
    fmt.Printf("completed op=%d on fd=%d with res=%d\n", cqe.Op(), cqe.FD(), cqe.Res)
}

Uring.SubmitAcceptMultishot, Uring.SubmitReceiveMultishot, and Uring.SubmitReceiveBundleMultishot submit raw multishot SQEs and keep the kernel-boundary flow explicit. Uring.AcceptMultishot and Uring.ReceiveMultishot use the same kernel path and return a MultishotSubscription when caller code wants callback-driven retirement.

sqeCtx := uring.ForFD(listenerFD)
sub, err := ring.AcceptMultishot(sqeCtx, handler)

// Process CQEs - higher layers decide how to route decoded CQEs
for i := range n {
    dispatch(handler, cqes[i])
}

// Cancel when done
sub.Cancel()

Listener setup advances with DecodeListenerCQE, PrepareListenerBind, PrepareListenerListen, and SetListenerReady. ListenerManager is a thin convenience for the initial SOCKET submission and returns a ListenerOp. If ListenerOp.Close races a pending listener setup CQE, drain that CQE before the final Close that returns the pooled listener context.

pool := uring.NewContextPools(16)
manager := uring.NewListenerManager(ring, pool)

addr := &net.TCPAddr{IP: net.IPv4(127, 0, 0, 1), Port: 8080}
op, err := manager.ListenTCP4(addr, 128, handler)

// Caller decodes CQEs and chains bind→listen via Prepare helpers
// After LISTEN completes, start accepting:
acceptSub, err := op.AcceptMultishot(acceptHandler)

Extended-mode raw `UserData` is caller-beware storage. Prefer scalar payloads there; if raw overlays or typed context views place Go pointers, interfaces, func values, maps, slices, strings, chans, or structs containing them in those bytes, caller code must keep the live roots outside `UserData`.

SQEContext packs submission metadata into `user_data`.

Direct mode layout (zero allocation, most common):

┌─────────┬─────────┬──────────────┬────────────────────────────┬────┐
│ Op (8b) │Flags(8b)│ BufGrp (16b) │        FD (30b)            │Mode│
└─────────┴─────────┴──────────────┴────────────────────────────┴────┘
  Bits 0-7  Bits 8-15  Bits 16-31     Bits 32-61              Bits 62-63

Mode bits (62-63): 00=Direct, 01=Indirect (64B ptr), 10=Extended (128B ptr)

Pack context for submission:

ctx := uring.PackDirect(
    uring.IORING_OP_RECV,   // Op: operation type
    0,                      // Flags: SQE flags
    bufferGroupID,          // BufGroup: for buffer selection
    clientFD,               // FD: target file descriptor
)

If `IOSQE_FIXED_FILE` is set, the FD field stores the registered file index instead of a raw file descriptor.

Or use the fluent builder:

ctx := uring.ForFD(clientFD).WithOp(uring.IORING_OP_RECV).WithBufGroup(groupID)
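To make the bit layout concrete, the sketch below packs and unpacks the direct-mode fields by hand, following the diagram above. `packDirect` and `unpackDirect` here are local illustrations, not the package's functions; how the package treats negative or oversized FD values is not stated on this page, so this version simply truncates to 30 bits.

```go
package main

import "fmt"

const (
	modeDirect   uint64 = 0 // 00
	modeIndirect uint64 = 1 // 01
	modeExtended uint64 = 2 // 10
	fdMask       uint64 = 1<<30 - 1
)

// packDirect lays the fields out as in the diagram above.
// FD values that do not fit in 30 bits are truncated in this sketch.
func packDirect(op, flags uint8, bufGroup uint16, fd int32) uint64 {
	return uint64(op) |
		uint64(flags)<<8 |
		uint64(bufGroup)<<16 |
		(uint64(uint32(fd))&fdMask)<<32 |
		modeDirect<<62
}

// unpackDirect reverses packDirect and also returns the two mode bits.
func unpackDirect(v uint64) (op, flags uint8, bufGroup uint16, fd int32, mode uint64) {
	op = uint8(v)
	flags = uint8(v >> 8)
	bufGroup = uint16(v >> 16)
	fd = int32((v >> 32) & fdMask)
	mode = v >> 62
	return
}

func main() {
	const opRecv = 27 // IORING_OP_RECV in the Linux UAPI
	v := packDirect(opRecv, 0, 3, 42)
	op, flags, grp, fd, mode := unpackDirect(v)
	fmt.Printf("%#016x op=%d flags=%d group=%d fd=%d mode=%d\n", v, op, flags, grp, fd, mode)
}
```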

Handler Patterns

Handler helpers provide convenience step/action adapters. They do not change the underlying CQE facts and are optional.

Subscriber pattern (functional callbacks):

handler := uring.NewMultishotSubscriber().
    OnStep(func(step uring.MultishotStep) uring.MultishotAction {
        if step.Err == nil {
            return uring.MultishotContinue
        }
        return uring.MultishotStop
    }).
    OnStop(func(err error, cancelled bool) {
        log.Println("stopped", err, cancelled)
    })

Noop embedding pattern (override only needed methods):

type myHandler struct {
    uring.NoopMultishotHandler
    connections int
}

func (h *myHandler) OnMultishotStep(step uring.MultishotStep) uring.MultishotAction {
    if step.Err == nil && step.CQE.Res >= 0 {
        h.connections++
        return uring.MultishotContinue
    }
    return h.NoopMultishotHandler.OnMultishotStep(step)
}

Handlers return either `MultishotContinue` to keep a live subscription or `MultishotStop` to request cancellation after the current step. The request is local until the cancel SQE is successfully enqueued.

Buffer Groups

Buffer groups enable kernel-side buffer selection for receive operations. The kernel picks an available buffer from the group at completion time; userspace does not select or assign buffers per receive.

opts := uring.OptionsForBudget(256 * uring.MiB)
ring, _ := uring.New(func(opt *uring.Options) {
    *opt = opts
})

Supported Operations

Socket creation:

Socket operations:

File operations:

Control operations:

Registration:

Ring management:

Capability queries:

Zero-copy receive (ZCRX):

Performance

The hot submit and reap paths are designed to remain zero-allocation. See the benchmark tests for current machine-specific numbers.

Ring Setup

Create and start an io_uring instance:

ring, err := uring.New(func(opt *uring.Options) {
    opt.Entries = uring.EntriesMedium // 2048 entries
})
if err != nil {
    return err
}
if err := ring.Start(); err != nil {
    return err
}

Memory Barriers

The package uses dwcas.BarrierAcquire and dwcas.BarrierRelease for SQ/CQ ring synchronization. On amd64 (TSO), these are compiler barriers. On arm64, they emit DMB ISHLD/ISHST instructions. User code does not manage these barriers.

Dependencies

Index

Constants

const (
	KiB = 1 << 10
	MiB = 1 << 20
	GiB = 1 << 30
)

Memory size constants for budget specification.

const (
	MachineMemory512MB = 512 * MiB
	MachineMemory1GB   = 1 * GiB
	MachineMemory2GB   = 2 * GiB
	MachineMemory4GB   = 4 * GiB
	MachineMemory8GB   = 8 * GiB
	MachineMemory16GB  = 16 * GiB
	MachineMemory32GB  = 32 * GiB
	MachineMemory64GB  = 64 * GiB
	MachineMemory96GB  = 96 * GiB
	MachineMemory128GB = 128 * GiB
)

Common machine memory sizes for OptionsForSystem.

const (
	BufferSizePico   = iobuf.BufferSizePico   // 32 B
	BufferSizeNano   = iobuf.BufferSizeNano   // 128 B
	BufferSizeMicro  = iobuf.BufferSizeMicro  // 512 B
	BufferSizeSmall  = iobuf.BufferSizeSmall  // 2 KiB
	BufferSizeMedium = iobuf.BufferSizeMedium // 8 KiB
	BufferSizeBig    = iobuf.BufferSizeBig    // 32 KiB
	BufferSizeLarge  = iobuf.BufferSizeLarge  // 128 KiB
	BufferSizeGreat  = iobuf.BufferSizeGreat  // 512 KiB
	BufferSizeHuge   = iobuf.BufferSizeHuge   // 2 MiB
	BufferSizeVast   = iobuf.BufferSizeVast   // 8 MiB
	BufferSizeGiant  = iobuf.BufferSizeGiant  // 32 MiB
	BufferSizeTitan  = iobuf.BufferSizeTitan  // 128 MiB
)

Buffer size constants re-exported from iobuf for API compatibility. These follow a power-of-4 progression starting at 32 bytes.
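The progression is regular enough to compute: each tier is four times the previous one, from 32 B up to 128 MiB. The sketch below picks the smallest tier that holds a given payload; `tierFor` is a hypothetical helper, and how the package itself assigns tiers is not described here.

```go
package main

import "fmt"

// tierFor returns the smallest tier size, in bytes, that holds n bytes.
// It returns 0 when n exceeds the largest tier (128 MiB).
func tierFor(n int) int {
	size := 32 // Pico tier
	for i := 0; i < 12; i++ { // twelve tiers, Pico through Titan
		if n <= size {
			return size
		}
		size *= 4
	}
	return 0
}

func main() {
	for _, n := range []int{1, 33, 1500, 2048, 70000} {
		fmt.Println(n, "->", tierFor(n))
	}
}
```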

const (
	EPERM           = uintptr(zcall.EPERM)
	EINTR           = uintptr(zcall.EINTR)
	EAGAIN          = uintptr(zcall.EAGAIN)
	EWOULDBLOCK     = EAGAIN
	ENOMEM          = uintptr(zcall.ENOMEM)
	EACCES          = uintptr(zcall.EACCES)
	EFAULT          = uintptr(zcall.EFAULT)
	EBUSY           = uintptr(zcall.EBUSY)
	EEXIST          = uintptr(zcall.EEXIST)
	ENAMETOOLONG    = uintptr(zcall.ENAMETOOLONG)
	ENODEV          = uintptr(zcall.ENODEV)
	EINVAL          = uintptr(zcall.EINVAL)
	EPIPE           = uintptr(zcall.EPIPE)
	EMFILE          = uintptr(zcall.EMFILE)
	ENFILE          = uintptr(zcall.ENFILE)
	ENOSYS          = uintptr(zcall.ENOSYS)
	ENOTSUP         = uintptr(zcall.ENOTSUP)
	EINPROGRESS     = uintptr(zcall.EINPROGRESS)
	EALREADY        = uintptr(zcall.EALREADY)
	ENOTSOCK        = uintptr(zcall.ENOTSOCK)
	EDESTADDRREQ    = uintptr(zcall.EDESTADDRREQ)
	EMSGSIZE        = uintptr(zcall.EMSGSIZE)
	EPROTOTYPE      = uintptr(zcall.EPROTOTYPE)
	ENOPROTOOPT     = uintptr(zcall.ENOPROTOOPT)
	EPROTONOSUPPORT = uintptr(zcall.EPROTONOSUPPORT)
	EOPNOTSUPP      = uintptr(zcall.EOPNOTSUPP)
	EAFNOSUPPORT    = uintptr(zcall.EAFNOSUPPORT)
	EADDRINUSE      = uintptr(zcall.EADDRINUSE)
	EADDRNOTAVAIL   = uintptr(zcall.EADDRNOTAVAIL)
	ENETDOWN        = uintptr(zcall.ENETDOWN)
	ENETUNREACH     = uintptr(zcall.ENETUNREACH)
	ENETRESET       = uintptr(zcall.ENETRESET)
	ECONNABORTED    = uintptr(zcall.ECONNABORTED)
	ECONNRESET      = uintptr(zcall.ECONNRESET)
	ENOBUFS         = uintptr(zcall.ENOBUFS)
	EISCONN         = uintptr(zcall.EISCONN)
	ENOTCONN        = uintptr(zcall.ENOTCONN)
	ESHUTDOWN       = uintptr(zcall.ESHUTDOWN)
	ETIMEDOUT       = uintptr(zcall.ETIMEDOUT)
	ECONNREFUSED    = uintptr(zcall.ECONNREFUSED)
	EHOSTDOWN       = uintptr(zcall.EHOSTDOWN)
	EHOSTUNREACH    = uintptr(zcall.EHOSTUNREACH)
	ECANCELED       = uintptr(zcall.ECANCELED)
)

Errno constants aliased from zcall for architecture-safe error handling.

const (
	EntriesPico   = 1 << 3  // 8 entries
	EntriesNano   = 1 << 5  // 32 entries
	EntriesMicro  = 1 << 7  // 128 entries
	EntriesSmall  = 1 << 9  // 512 entries
	EntriesMedium = 1 << 11 // 2048 entries
	EntriesLarge  = 1 << 13 // 8192 entries
	EntriesHuge   = 1 << 15 // 32768 entries
)

Uring entry count constants define submission queue sizes. Each value is four times the previous one: 8, 32, 128, 512, 2048, 8192, and 32768.

const (
	IORING_SETUP_IOPOLL             = zcall.IORING_SETUP_IOPOLL
	IORING_SETUP_SQPOLL             = zcall.IORING_SETUP_SQPOLL
	IORING_SETUP_SQ_AFF             = zcall.IORING_SETUP_SQ_AFF
	IORING_SETUP_CQSIZE             = zcall.IORING_SETUP_CQSIZE
	IORING_SETUP_CLAMP              = zcall.IORING_SETUP_CLAMP
	IORING_SETUP_ATTACH_WQ          = zcall.IORING_SETUP_ATTACH_WQ
	IORING_SETUP_R_DISABLED         = zcall.IORING_SETUP_R_DISABLED
	IORING_SETUP_SUBMIT_ALL         = zcall.IORING_SETUP_SUBMIT_ALL
	IORING_SETUP_COOP_TASKRUN       = zcall.IORING_SETUP_COOP_TASKRUN
	IORING_SETUP_TASKRUN_FLAG       = zcall.IORING_SETUP_TASKRUN_FLAG
	IORING_SETUP_SQE128             = zcall.IORING_SETUP_SQE128
	IORING_SETUP_CQE32              = zcall.IORING_SETUP_CQE32
	IORING_SETUP_SINGLE_ISSUER      = zcall.IORING_SETUP_SINGLE_ISSUER
	IORING_SETUP_DEFER_TASKRUN      = zcall.IORING_SETUP_DEFER_TASKRUN
	IORING_SETUP_NO_MMAP            = zcall.IORING_SETUP_NO_MMAP
	IORING_SETUP_REGISTERED_FD_ONLY = zcall.IORING_SETUP_REGISTERED_FD_ONLY
	IORING_SETUP_NO_SQARRAY         = zcall.IORING_SETUP_NO_SQARRAY
	IORING_SETUP_HYBRID_IOPOLL      = zcall.IORING_SETUP_HYBRID_IOPOLL
	IORING_SETUP_CQE_MIXED          = zcall.IORING_SETUP_CQE_MIXED // Allow both 16b and 32b CQEs
	IORING_SETUP_SQE_MIXED          = zcall.IORING_SETUP_SQE_MIXED // Allow both 64b and 128b SQEs
)

const (
	IORING_ENTER_GETEVENTS       = zcall.IORING_ENTER_GETEVENTS
	IORING_ENTER_SQ_WAKEUP       = zcall.IORING_ENTER_SQ_WAKEUP
	IORING_ENTER_SQ_WAIT         = zcall.IORING_ENTER_SQ_WAIT
	IORING_ENTER_EXT_ARG         = zcall.IORING_ENTER_EXT_ARG
	IORING_ENTER_REGISTERED_RING = zcall.IORING_ENTER_REGISTERED_RING
	IORING_ENTER_ABS_TIMER       = zcall.IORING_ENTER_ABS_TIMER   // Absolute timeout
	IORING_ENTER_EXT_ARG_REG     = zcall.IORING_ENTER_EXT_ARG_REG // Use registered wait region
	IORING_ENTER_NO_IOWAIT       = zcall.IORING_ENTER_NO_IOWAIT   // Skip I/O wait
)

const (
	IORING_OFF_SQ_RING    int64 = 0
	IORING_OFF_CQ_RING    int64 = 0x8000000
	IORING_OFF_SQES       int64 = 0x10000000
	IORING_OFF_PBUF_RING        = 0x80000000
	IORING_OFF_PBUF_SHIFT       = 16
	IORING_OFF_MMAP_MASK        = 0xf8000000
)

const (
	IORING_SQ_NEED_WAKEUP = 1 << iota
	IORING_SQ_CQ_OVERFLOW
	IORING_SQ_TASKRUN
)

const (
	IOSQE_FIXED_FILE       = zcall.IOSQE_FIXED_FILE
	IOSQE_IO_DRAIN         = zcall.IOSQE_IO_DRAIN
	IOSQE_IO_LINK          = zcall.IOSQE_IO_LINK
	IOSQE_IO_HARDLINK      = zcall.IOSQE_IO_HARDLINK
	IOSQE_ASYNC            = zcall.IOSQE_ASYNC
	IOSQE_BUFFER_SELECT    = zcall.IOSQE_BUFFER_SELECT
	IOSQE_CQE_SKIP_SUCCESS = zcall.IOSQE_CQE_SKIP_SUCCESS
)
View Source
const (
	IORING_POLL_ADD_MULTI = 1 << iota
	IORING_POLL_UPDATE_EVENTS
	IORING_POLL_UPDATE_USER_DATA
	IORING_POLL_ADD_LEVEL
)
View Source
const (
	IORING_ASYNC_CANCEL_ALL = 1 << iota
	IORING_ASYNC_CANCEL_FD
	IORING_ASYNC_CANCEL_ANY
	IORING_ASYNC_CANCEL_FD_FIXED
	IORING_ASYNC_CANCEL_USERDATA
	IORING_ASYNC_CANCEL_OP
)
View Source
const (
	IORING_CQE_F_BUFFER        = 1 << 0
	IORING_CQE_F_MORE          = 1 << 1
	IORING_CQE_F_SOCK_NONEMPTY = 1 << 2
	IORING_CQE_F_NOTIF         = 1 << 3
	IORING_CQE_F_BUF_MORE      = 1 << 4  // Buffer partially consumed (incremental mode)
	IORING_CQE_F_SKIP          = 1 << 5  // Skip CQE (gap filler for ring wrap)
	IORING_CQE_F_32            = 1 << 15 // 32-byte CQE in mixed mode
)
View Source
const (
	IORING_REGISTER_BUFFERS          = zcall.IORING_REGISTER_BUFFERS
	IORING_UNREGISTER_BUFFERS        = zcall.IORING_UNREGISTER_BUFFERS
	IORING_REGISTER_FILES            = zcall.IORING_REGISTER_FILES
	IORING_UNREGISTER_FILES          = zcall.IORING_UNREGISTER_FILES
	IORING_REGISTER_EVENTFD          = zcall.IORING_REGISTER_EVENTFD
	IORING_UNREGISTER_EVENTFD        = zcall.IORING_UNREGISTER_EVENTFD
	IORING_REGISTER_FILES_UPDATE     = zcall.IORING_REGISTER_FILES_UPDATE
	IORING_REGISTER_EVENTFD_ASYNC    = zcall.IORING_REGISTER_EVENTFD_ASYNC
	IORING_REGISTER_PROBE            = zcall.IORING_REGISTER_PROBE
	IORING_REGISTER_PERSONALITY      = zcall.IORING_REGISTER_PERSONALITY
	IORING_UNREGISTER_PERSONALITY    = zcall.IORING_UNREGISTER_PERSONALITY
	IORING_REGISTER_RESTRICTIONS     = zcall.IORING_REGISTER_RESTRICTIONS
	IORING_REGISTER_ENABLE_RINGS     = zcall.IORING_REGISTER_ENABLE_RINGS
	IORING_REGISTER_FILES2           = zcall.IORING_REGISTER_FILES2
	IORING_REGISTER_FILES_UPDATE2    = zcall.IORING_REGISTER_FILES_UPDATE2
	IORING_REGISTER_BUFFERS2         = zcall.IORING_REGISTER_BUFFERS2
	IORING_REGISTER_BUFFERS_UPDATE   = zcall.IORING_REGISTER_BUFFERS_UPDATE
	IORING_REGISTER_IOWQ_AFF         = zcall.IORING_REGISTER_IOWQ_AFF
	IORING_UNREGISTER_IOWQ_AFF       = zcall.IORING_UNREGISTER_IOWQ_AFF
	IORING_REGISTER_IOWQ_MAX_WORKERS = zcall.IORING_REGISTER_IOWQ_MAX_WORKERS
	IORING_REGISTER_RING_FDS         = zcall.IORING_REGISTER_RING_FDS
	IORING_UNREGISTER_RING_FDS       = zcall.IORING_UNREGISTER_RING_FDS
	IORING_REGISTER_PBUF_RING        = zcall.IORING_REGISTER_PBUF_RING
	IORING_UNREGISTER_PBUF_RING      = zcall.IORING_UNREGISTER_PBUF_RING
	IORING_REGISTER_SYNC_CANCEL      = zcall.IORING_REGISTER_SYNC_CANCEL
	IORING_REGISTER_FILE_ALLOC_RANGE = zcall.IORING_REGISTER_FILE_ALLOC_RANGE
	IORING_REGISTER_PBUF_STATUS      = zcall.IORING_REGISTER_PBUF_STATUS
	IORING_REGISTER_NAPI             = zcall.IORING_REGISTER_NAPI
	IORING_UNREGISTER_NAPI           = zcall.IORING_UNREGISTER_NAPI
	IORING_REGISTER_CLOCK            = zcall.IORING_REGISTER_CLOCK         // Register clock source
	IORING_REGISTER_CLONE_BUFFERS    = zcall.IORING_REGISTER_CLONE_BUFFERS // Clone buffers from another ring
	IORING_REGISTER_SEND_MSG_RING    = zcall.IORING_REGISTER_SEND_MSG_RING // Send MSG_RING without ring
	IORING_REGISTER_ZCRX_IFQ         = zcall.IORING_REGISTER_ZCRX_IFQ      // Register ZCRX interface queue
	IORING_REGISTER_RESIZE_RINGS     = zcall.IORING_REGISTER_RESIZE_RINGS  // Resize CQ ring
	IORING_REGISTER_MEM_REGION       = zcall.IORING_REGISTER_MEM_REGION    // Memory region setup (6.19+)
	IORING_REGISTER_QUERY            = zcall.IORING_REGISTER_QUERY         // Query ring state (6.19+)
	IORING_REGISTER_ZCRX_CTRL        = zcall.IORING_REGISTER_ZCRX_CTRL     // ZCRX control operations (6.19+)
)
View Source
const (
	FUTEX2_SIZE_U8  uint32 = 0x00 // 8-bit futex
	FUTEX2_SIZE_U16 uint32 = 0x01 // 16-bit futex
	FUTEX2_SIZE_U32 uint32 = 0x02 // 32-bit futex (most common)
	FUTEX2_SIZE_U64 uint32 = 0x03 // 64-bit futex
	FUTEX2_NUMA     uint32 = 0x04 // NUMA-aware futex
	FUTEX2_PRIVATE  uint32 = 128  // Private futex (process-local, faster)
)

Futex2 flags for FutexWait/FutexWake operations. These follow the futex2(2) interface, not the legacy futex(2) v1 flags.
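
As a minimal sketch of how these flags combine, the snippet below ORs a size selector with the private bit. The constants are local copies of the values listed above so the example compiles without importing the package; `futex2SizeMask` and `futexFlags` are hypothetical names introduced for illustration, and the mask value assumes the size selector occupies the low two bits.

```go
package main

import "fmt"

// Local copies of the futex2 flag values listed above.
const (
	futex2SizeU32  uint32 = 0x02
	futex2Private  uint32 = 128
	futex2SizeMask uint32 = 0x03 // assumption: the low two bits select the size
)

// futexFlags combines a size selector with the optional private bit.
func futexFlags(size uint32, private bool) uint32 {
	flags := size & futex2SizeMask
	if private {
		flags |= futex2Private
	}
	return flags
}

func main() {
	flags := futexFlags(futex2SizeU32, true)
	fmt.Printf("flags=%#x size=%#x private=%t\n",
		flags, flags&futex2SizeMask, flags&futex2Private != 0)
}
```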

View Source
const (
	// IORING_MSG_DATA sends data (result + userData) to target ring's CQ.
	IORING_MSG_DATA uint64 = 0

	// IORING_MSG_SEND_FD transfers a fixed file from source to target ring.
	IORING_MSG_SEND_FD uint64 = 1
)

MSG_RING command types for the addr field.

View Source
const (
	// IORING_MSG_RING_CQE_SKIP skips posting CQE to target ring.
	// The source ring still gets a completion.
	IORING_MSG_RING_CQE_SKIP uint32 = 1 << 0

	// IORING_MSG_RING_FLAGS_PASS passes the specified flags to target CQE.
	IORING_MSG_RING_FLAGS_PASS uint32 = 1 << 1
)

MSG_RING flags for MsgRing operations.

View Source
const (
	IOU_PBUF_RING_MMAP = 1 // Kernel allocates memory, app uses mmap
	IOU_PBUF_RING_INC  = 2 // Incremental buffer consumption mode
)

Buffer ring registration flags.

View Source
const (
	IORING_ZCRX_AREA_SHIFT = 48
	IORING_ZCRX_AREA_MASK  = ^((uint64(1) << IORING_ZCRX_AREA_SHIFT) - 1)
)

ZCRX area shift and mask for encoding area ID into offsets.
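
The following sketch illustrates one way the shift and mask could be applied: the area ID sits in the bits at and above the shift, and the byte offset stays below it. The constants mirror the definitions above; `encodeAreaOffset` and `decodeAreaOffset` are hypothetical helpers, not part of the package.

```go
package main

import "fmt"

// Local copies of the ZCRX area constants listed above.
const (
	zcrxAreaShift = 48
	zcrxAreaMask  = ^((uint64(1) << zcrxAreaShift) - 1)
)

// encodeAreaOffset places areaID above the shift and keeps off below it.
func encodeAreaOffset(areaID uint16, off uint64) uint64 {
	return uint64(areaID)<<zcrxAreaShift | (off &^ zcrxAreaMask)
}

// decodeAreaOffset splits an encoded value back into area ID and offset.
func decodeAreaOffset(v uint64) (areaID uint16, off uint64) {
	return uint16((v & zcrxAreaMask) >> zcrxAreaShift), v &^ zcrxAreaMask
}

func main() {
	v := encodeAreaOffset(3, 0x1000)
	id, off := decodeAreaOffset(v)
	fmt.Printf("encoded=%#x area=%d offset=%#x\n", v, id, off)
}
```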

View Source
const (
	ZCRX_CTRL_FLUSH_RQ = 0 // Flush refill queue
	ZCRX_CTRL_EXPORT   = 1 // Export ZCRX state
)

ZCRX control operations.

View Source
const (
	IO_URING_QUERY_OPCODES = 0 // Query supported opcodes
	IO_URING_QUERY_ZCRX    = 1 // Query ZCRX capabilities
	IO_URING_QUERY_SCQ     = 2 // Query SQ/CQ ring info
)

Query operation types for IORING_REGISTER_QUERY.

View Source
const (
	IO_URING_NAPI_REGISTER_OP   = 0 // Register/unregister (backward compatible)
	IO_URING_NAPI_STATIC_ADD_ID = 1 // Add NAPI ID with static tracking
	IO_URING_NAPI_STATIC_DEL_ID = 2 // Delete NAPI ID with static tracking
)

NAPI operation types.

View Source
const (
	IO_URING_NAPI_TRACKING_DYNAMIC  = 0   // Dynamic tracking (default)
	IO_URING_NAPI_TRACKING_STATIC   = 1   // Static tracking
	IO_URING_NAPI_TRACKING_INACTIVE = 255 // Inactive/disabled
)

NAPI tracking strategies.

View Source
const (
	SOCKET_URING_OP_SIOCINQ      = 0 // Get input queue size
	SOCKET_URING_OP_SIOCOUTQ     = 1 // Get output queue size
	SOCKET_URING_OP_GETSOCKOPT   = 2 // Get socket option
	SOCKET_URING_OP_SETSOCKOPT   = 3 // Set socket option
	SOCKET_URING_OP_TX_TIMESTAMP = 4 // TX timestamp support
	SOCKET_URING_OP_GETSOCKNAME  = 5 // Get socket name
)

Socket uring command operations.

View Source
const (
	IORING_TIMESTAMP_HW_SHIFT   = 16                             // CQE flags bit shift for HW timestamp
	IORING_TIMESTAMP_TYPE_SHIFT = IORING_TIMESTAMP_HW_SHIFT + 1  // CQE flags bit shift for timestamp type
	IORING_CQE_F_TSTAMP_HW      = 1 << IORING_TIMESTAMP_HW_SHIFT // Hardware timestamp flag
)

Timestamp constants for SOCKET_URING_OP_TX_TIMESTAMP.
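
A rough sketch of decoding these bits from a CQE flags word follows. It assumes the timestamp type occupies the bits from `IORING_TIMESTAMP_TYPE_SHIFT` upward; `decodeTimestampFlags` is a hypothetical helper and the constants are local copies.

```go
package main

import "fmt"

// Local copies of the timestamp constants listed above.
const (
	timestampHWShift   = 16
	timestampTypeShift = timestampHWShift + 1
	cqeFTstampHW       = 1 << timestampHWShift
)

// decodeTimestampFlags reports the hardware-timestamp bit and the timestamp
// type, assuming the type is stored in the bits above the hardware flag.
func decodeTimestampFlags(flags uint32) (hw bool, tsType uint32) {
	return flags&cqeFTstampHW != 0, flags >> timestampTypeShift
}

func main() {
	flags := uint32(cqeFTstampHW | 2<<timestampTypeShift)
	hw, tsType := decodeTimestampFlags(flags)
	fmt.Printf("hw=%t type=%d\n", hw, tsType)
}
```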

View Source
const (
	IORING_NOP_INJECT_RESULT = 1 << 0 // Inject result from sqe->result
	IORING_NOP_FILE          = 1 << 1 // NOP with file reference
	IORING_NOP_FIXED_FILE    = 1 << 2 // NOP with fixed file
	IORING_NOP_FIXED_BUFFER  = 1 << 3 // NOP with fixed buffer
	IORING_NOP_TW            = 1 << 4 // NOP via task work
	IORING_NOP_CQE32         = 1 << 5 // NOP produces 32-byte CQE
)

NOP operation flags for IORING_OP_NOP.

View Source
const (
	IORING_REGISTER_SRC_REGISTERED = 1 << 0 // Source ring is registered
	IORING_REGISTER_DST_REPLACE    = 1 << 1 // Replace destination buffers
)

Clone buffers registration flags.

View Source
const (
	IOPrioClassNone = 0
	IOPrioClassRT   = 1 // Real-time
	IOPrioClassBE   = 2 // Best-effort
	IOPrioClassIDLE = 3 // Idle
)

I/O priority class constants for WithIOPrioClass.
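
For orientation, Linux packs an I/O priority as a class in the upper bits and a level in the lower 13 bits (`IOPRIO_CLASS_SHIFT` is 13). The sketch below shows that encoding only; how WithIOPrioClass takes its argument is not shown here, and `ioPrioValue` is a hypothetical helper.

```go
package main

import "fmt"

const (
	ioPrioClassBE    = 2  // local copy of IOPrioClassBE
	ioPrioClassShift = 13 // Linux IOPRIO_CLASS_SHIFT
)

// ioPrioValue packs a priority class and level the way Linux ioprio values
// are laid out: class above the shift, level in the low 13 bits.
func ioPrioValue(class, level uint16) uint16 {
	return class<<ioPrioClassShift | level&(1<<ioPrioClassShift-1)
}

func main() {
	fmt.Printf("%#x\n", ioPrioValue(ioPrioClassBE, 4))
}
```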

View Source
const (
	IORING_OP_NOP uint8 = iota
	IORING_OP_READV
	IORING_OP_WRITEV
	IORING_OP_FSYNC
	IORING_OP_READ_FIXED
	IORING_OP_WRITE_FIXED
	IORING_OP_POLL_ADD
	IORING_OP_POLL_REMOVE
	IORING_OP_SYNC_FILE_RANGE
	IORING_OP_SENDMSG
	IORING_OP_RECVMSG
	IORING_OP_TIMEOUT
	IORING_OP_TIMEOUT_REMOVE
	IORING_OP_ACCEPT
	IORING_OP_ASYNC_CANCEL
	IORING_OP_LINK_TIMEOUT
	IORING_OP_CONNECT
	IORING_OP_FALLOCATE
	IORING_OP_OPENAT
	IORING_OP_CLOSE
	IORING_OP_FILES_UPDATE
	IORING_OP_STATX
	IORING_OP_READ
	IORING_OP_WRITE
	IORING_OP_FADVISE
	IORING_OP_MADVISE
	IORING_OP_SEND
	IORING_OP_RECV
	IORING_OP_OPENAT2
	IORING_OP_EPOLL_CTL
	IORING_OP_SPLICE
	IORING_OP_PROVIDE_BUFFERS
	IORING_OP_REMOVE_BUFFERS
	IORING_OP_TEE
	IORING_OP_SHUTDOWN
	IORING_OP_RENAMEAT
	IORING_OP_UNLINKAT
	IORING_OP_MKDIRAT
	IORING_OP_SYMLINKAT
	IORING_OP_LINKAT
	IORING_OP_MSG_RING
	IORING_OP_FSETXATTR
	IORING_OP_SETXATTR
	IORING_OP_FGETXATTR
	IORING_OP_GETXATTR
	IORING_OP_SOCKET
	IORING_OP_URING_CMD
	IORING_OP_SEND_ZC
	IORING_OP_SENDMSG_ZC
	IORING_OP_READ_MULTISHOT
	IORING_OP_WAITID
	IORING_OP_FUTEX_WAIT
	IORING_OP_FUTEX_WAKE
	IORING_OP_FUTEX_WAITV
	IORING_OP_FIXED_FD_INSTALL
	IORING_OP_FTRUNCATE
	IORING_OP_BIND
	IORING_OP_LISTEN
	IORING_OP_RECV_ZC      // Zero-copy receive
	IORING_OP_EPOLL_WAIT   // Epoll wait
	IORING_OP_READV_FIXED  // Vectored read with fixed buffers
	IORING_OP_WRITEV_FIXED // Vectored write with fixed buffers
	IORING_OP_PIPE         // Create pipe
	IORING_OP_NOP128       // 128-byte NOP opcode
	IORING_OP_URING_CMD128 // 128-byte uring command opcode
)

IORING_OP_* values encode io_uring operation types in SQEs and SQEContext.

View Source
const (
	IORING_TIMEOUT_ABS = 1 << iota
	IORING_TIMEOUT_UPDATE
	IORING_TIMEOUT_BOOTTIME
	IORING_TIMEOUT_REALTIME
	IORING_LINK_TIMEOUT_UPDATE
	IORING_TIMEOUT_ETIME_SUCCESS
	IORING_TIMEOUT_MULTISHOT
	IORING_TIMEOUT_CLOCK_MASK  = IORING_TIMEOUT_BOOTTIME | IORING_TIMEOUT_REALTIME
	IORING_TIMEOUT_UPDATE_MASK = IORING_TIMEOUT_UPDATE | IORING_LINK_TIMEOUT_UPDATE
)

Timeout operation flags.
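
The `iota` block above yields one bit per flag, so the clock mask evaluates to 12 (BOOTTIME | REALTIME). As a small sketch, the helper below checks that a flags word selects at most one clock source, which the kernel requires; `validClock` is a hypothetical name and the constants are local copies.

```go
package main

import "fmt"

// Local copies of the timeout flags, in the same order as above.
const (
	timeoutAbs = 1 << iota
	timeoutUpdate
	timeoutBoottime
	timeoutRealtime
	linkTimeoutUpdate
	timeoutEtimeSuccess
	timeoutMultishot
	timeoutClockMask = timeoutBoottime | timeoutRealtime
)

// validClock reports whether flags select at most one clock source.
func validClock(flags uint32) bool {
	clock := flags & timeoutClockMask
	return clock&(clock-1) == 0 // zero or exactly one bit set
}

func main() {
	fmt.Println(validClock(timeoutAbs | timeoutBoottime))
	fmt.Println(validClock(timeoutBoottime | timeoutRealtime))
}
```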

View Source
const (
	IORING_ACCEPT_MULTISHOT  = 1 << 0 // Multi-shot accept: one SQE, multiple completions
	IORING_ACCEPT_DONTWAIT   = 1 << 1 // Non-blocking accept
	IORING_ACCEPT_POLL_FIRST = 1 << 2 // Poll for connection before accepting
)

Accept operation flags.

View Source
const (
	IORING_RECVSEND_POLL_FIRST  = 1 << iota // Poll before send/recv
	IORING_RECV_MULTISHOT                   // Multi-shot receive
	IORING_RECVSEND_FIXED_BUF               // Use registered buffer
	IORING_SEND_ZC_REPORT_USAGE             // Report zero-copy usage
	IORING_RECVSEND_BUNDLE                  // Bundle mode
	IORING_SEND_VECTORIZED                  // Vectorized send
)

Send/receive operation flags.

View Source
const (
	DefaultBufferNumPico   = 1 << 15 // 32768 × 32 B = 1 MiB
	DefaultBufferNumNano   = 1 << 14 // 16384 × 128 B = 2 MiB
	DefaultBufferNumMicro  = 1 << 13 // 8192 × 512 B = 4 MiB
	DefaultBufferNumSmall  = 1 << 12 // 4096 × 2 KiB = 8 MiB
	DefaultBufferNumMedium = 1 << 11 // 2048 × 8 KiB = 16 MiB
	DefaultBufferNumBig    = 1 << 10 // 1024 × 32 KiB = 32 MiB
	DefaultBufferNumLarge  = 1 << 9  // 512 × 128 KiB = 64 MiB
	DefaultBufferNumGreat  = 1 << 8  // 256 × 512 KiB = 128 MiB
	DefaultBufferNumHuge   = 1 << 7  // 128 × 2 MiB = 256 MiB
	DefaultBufferNumVast   = 1 << 6  // 64 × 8 MiB = 512 MiB
	DefaultBufferNumGiant  = 1 << 5  // 32 × 32 MiB = 1 GiB
	DefaultBufferNumTitan  = 1 << 4  // 16 × 128 MiB = 2 GiB
)

Default buffer counts per tier. Smaller buffers have more instances to handle high-frequency small I/O. Larger buffers have fewer instances due to memory constraints.

View Source
const (
	NetworkUnix = sock.NetworkUnix
	NetworkIPv4 = sock.NetworkIPv4
	NetworkIPv6 = sock.NetworkIPv6
)

Network family aliases.

View Source
const (
	SizeofSockaddrAny   = sock.SizeofSockaddrAny
	SizeofSockaddrInet4 = sock.SizeofSockaddrInet4
	SizeofSockaddrInet6 = sock.SizeofSockaddrInet6
	SizeofSockaddrUnix  = sock.SizeofSockaddrUnix
)

Raw socket address size constants.

View Source
const (
	AF_UNIX  = sock.AF_UNIX
	AF_LOCAL = sock.AF_LOCAL
	AF_INET  = sock.AF_INET
	AF_INET6 = sock.AF_INET6

	SOCK_STREAM    = sock.SOCK_STREAM
	SOCK_DGRAM     = sock.SOCK_DGRAM
	SOCK_RAW       = sock.SOCK_RAW
	SOCK_SEQPACKET = sock.SOCK_SEQPACKET
	SOCK_NONBLOCK  = sock.SOCK_NONBLOCK
	SOCK_CLOEXEC   = sock.SOCK_CLOEXEC

	IPPROTO_IP   = sock.IPPROTO_IP
	IPPROTO_RAW  = sock.IPPROTO_RAW
	IPPROTO_TCP  = sock.IPPROTO_TCP
	IPPROTO_UDP  = sock.IPPROTO_UDP
	IPPROTO_IPV6 = sock.IPPROTO_IPV6
	IPPROTO_SCTP = sock.IPPROTO_SCTP

	MSG_WAITALL  = sock.MSG_WAITALL
	MSG_ZEROCOPY = sock.MSG_ZEROCOPY

	SHUT_RD   = sock.SHUT_RD
	SHUT_WR   = sock.SHUT_WR
	SHUT_RDWR = sock.SHUT_RDWR

	PROT_READ  = zcall.PROT_READ
	PROT_WRITE = zcall.PROT_WRITE

	MAP_SHARED   = zcall.MAP_SHARED
	MAP_POPULATE = zcall.MAP_POPULATE
)

Socket, protocol, message, shutdown, and memory-mapping aliases.

View Source
const (
	EPOLL_CTL_ADD = 1
	EPOLL_CTL_DEL = 2
	EPOLL_CTL_MOD = 3

	EPOLLIN  = 0x1
	EPOLLOUT = 0x4
	EPOLLET  = 0x80000000
)

Epoll constants.

View Source
const AT_FDCWD = -100

AT_FDCWD is the special value for current working directory.

View Source
const FUTEX_BITSET_MATCH_ANY uint64 = 0xFFFFFFFF

FUTEX_BITSET_MATCH_ANY matches any waker when used as mask in FutexWait.

View Source
const (
	IORING_CQE_BUFFER_SHIFT = 16
)
View Source
const IORING_FILE_INDEX_ALLOC uint32 = 0xFFFFFFFF

IORING_FILE_INDEX_ALLOC is passed as file_index to have io_uring allocate a free direct descriptor slot. The allocated index is returned in cqe->res. The operation fails with -ENFILE if no free slot is available.

View Source
const IORING_FIXED_FD_NO_CLOEXEC uint32 = 1 << 0

IORING_FIXED_FD_NO_CLOEXEC omits O_CLOEXEC when installing a fixed fd. By default, FixedFdInstall sets O_CLOEXEC on the new regular fd.

View Source
const (
	IORING_MEM_REGION_REG_WAIT_ARG = 1 // Expose region as registered wait arguments
)

Memory region registration flags.

View Source
const (
	IORING_MEM_REGION_TYPE_USER = 1 // User-provided memory
)

Memory region types.

View Source
const (
	IORING_NOTIF_USAGE_ZC_COPIED = 1 << 31 // Data was copied instead of zero-copy
)

Notification CQE usage flags for zero-copy operations.
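
A minimal sketch of testing this bit: when usage reporting is requested, the notification CQE's result carries the flag in its top bit, so the signed result has to be reinterpreted as unsigned first. `zcFellBackToCopy` is a hypothetical helper; the constant is a local copy.

```go
package main

import "fmt"

// Local copy of IORING_NOTIF_USAGE_ZC_COPIED.
const notifUsageZCCopied uint32 = 1 << 31

// zcFellBackToCopy reports whether a notification result has the
// "data was copied" bit set. res is signed, so convert before masking.
func zcFellBackToCopy(res int32) bool {
	return uint32(res)&notifUsageZCCopied != 0
}

func main() {
	fmt.Println(zcFellBackToCopy(0))
	fmt.Println(zcFellBackToCopy(-1 << 31))
}
```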

View Source
const IORING_REGISTER_USE_REGISTERED_RING = zcall.IORING_REGISTER_USE_REGISTERED_RING

IORING_REGISTER_USE_REGISTERED_RING is a flag that can be OR'd with register opcodes to use a registered ring fd instead of a regular fd.

View Source
const (
	IORING_REG_WAIT_TS = 1 << 0 // Timestamp in wait region
)

Registered wait flags.

View Source
const (
	IORING_RSRC_REGISTER_SPARSE = 1 << 0 // Sparse registration
)

Resource registration flags.

View Source
const (
	IORING_RW_ATTR_FLAG_PI = 1 << 0 // PI (Protection Information) attribute
)

RW attribute flags for sqe->attr_type_mask.

View Source
const (
	IORING_ZCRX_AREA_DMABUF = 1 // Use DMA buffer
)

ZCRX area registration flags.

View Source
const (
	IO_URING_OP_SUPPORTED = 1 << 0
)
View Source
const IPPROTO_UDPLITE = sock.IPPROTO_UDPLITE

IPPROTO_UDPLITE is UDP-Lite protocol number.

View Source
const O_LARGEFILE = 0x8000

O_LARGEFILE flag for openat.

View Source
const SizeofOpenHow = 24

SizeofOpenHow is the size of OpenHow structure.

View Source
const (
	ZCRX_REG_IMPORT = 1 // Import mode
)

ZCRX registration flags.

Variables

View Source
var (
	ErrInvalidParam = iofd.ErrInvalidParam
	ErrInterrupted  = iofd.ErrInterrupted
	ErrNoMemory     = iofd.ErrNoMemory
	ErrPermission   = iofd.ErrPermission
)

Common errors reused from iofd for semantic consistency across the ecosystem.

View Source
var (
	// ErrInProgress indicates the operation is in progress.
	ErrInProgress = errors.New("uring: operation in progress")

	// ErrFaultParams indicates a fault in parameters (bad address).
	ErrFaultParams = errors.New("uring: fault in parameters")

	// ErrProcessFileLimit indicates the process file descriptor limit was reached.
	ErrProcessFileLimit = errors.New("uring: process file descriptor limit")

	// ErrSystemFileLimit indicates the system file descriptor limit was reached.
	ErrSystemFileLimit = errors.New("uring: system file descriptor limit")

	// ErrNoDevice indicates no such device.
	ErrNoDevice = errors.New("uring: no such device")

	// ErrNotSupported indicates the operation is not supported.
	ErrNotSupported = errors.New("uring: operation not supported")

	// ErrBusy indicates the resource is busy.
	ErrBusy = errors.New("uring: resource busy")

	// ErrClosed indicates the ring has already been stopped.
	ErrClosed = errors.New("uring: ring closed")

	// ErrCQOverflow indicates the CQ overflow condition was observed while the CQ appeared empty.
	ErrCQOverflow = errors.New("uring: completion queue overflow")

	// ErrExists indicates the resource already exists.
	ErrExists = errors.New("uring: already exists")

	// ErrNameTooLong indicates a pathname exceeds the kernel limit.
	ErrNameTooLong = errors.New("uring: name too long")

	// ErrNotFound indicates the resource was not found.
	ErrNotFound = errors.New("uring: not found")

	// ErrCanceled indicates the operation was canceled.
	ErrCanceled = errors.New("uring: operation canceled")

	// ErrTimedOut indicates the operation timed out.
	ErrTimedOut = errors.New("uring: operation timed out")

	// ErrConnectionRefused indicates the connection was refused.
	ErrConnectionRefused = errors.New("uring: connection refused")

	// ErrConnectionReset indicates the connection was reset.
	ErrConnectionReset = errors.New("uring: connection reset")

	// ErrNotConnected indicates the socket is not connected.
	ErrNotConnected = errors.New("uring: not connected")

	// ErrAlreadyConnected indicates the socket is already connected.
	ErrAlreadyConnected = errors.New("uring: already connected")

	// ErrAddressInUse indicates the address is already in use.
	ErrAddressInUse = errors.New("uring: address in use")

	// ErrNetworkUnreachable indicates the network is unreachable.
	ErrNetworkUnreachable = errors.New("uring: network unreachable")

	// ErrHostUnreachable indicates the host is unreachable.
	ErrHostUnreachable = errors.New("uring: host unreachable")

	// ErrBrokenPipe indicates the pipe is broken (EPIPE).
	ErrBrokenPipe = errors.New("uring: broken pipe")

	// ErrNoBufferSpace indicates no buffer space available (ENOBUFS).
	ErrNoBufferSpace = errors.New("uring: no buffer space available")
)

Error definitions for uring operations.
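
Sentinel errors like these are usually matched with `errors.Is`. The sketch below shows one possible translation from a negative completion result to a sentinel; the mapping and the `errFromRes` helper are assumptions for illustration, not the package's actual conversion, and the two sentinels are local stand-ins.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

// Local stand-ins for two of the sentinels listed above.
var (
	errCanceled = errors.New("uring: operation canceled")
	errTimedOut = errors.New("uring: operation timed out")
)

// errFromRes is a hypothetical helper: it maps a negative CQE result
// (a negated errno) to a sentinel error, or returns the raw errno.
func errFromRes(res int32) error {
	if res >= 0 {
		return nil
	}
	switch errno := syscall.Errno(-res); errno {
	case syscall.ECANCELED:
		return errCanceled
	case syscall.ETIME:
		return errTimedOut
	default:
		return errno
	}
}

func main() {
	err := errFromRes(-int32(syscall.ECANCELED))
	fmt.Println(errors.Is(err, errCanceled), err)
}
```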

View Source
var ErrNotReady = errors.New("listener not ready")

ErrNotReady indicates the listener is not yet ready for accept.

Functions

func AlignedMemBlock

func AlignedMemBlock() []byte

AlignedMemBlock returns a page-aligned memory block.

func CastUserData

func CastUserData[T any](ext *ExtSQE) *T

CastUserData casts `ExtSQE.UserData` to `*T`. The returned pointer is borrowed from `ext` and is valid only until release. `T` must fit within `ExtSQE.UserData`.

`ExtSQE.UserData` is raw, caller-beware storage. Prefer scalar payloads here. If a raw overlay stores Go pointers, interfaces, func values, maps, slices, strings, chans, or structs containing them in these bytes, the caller must keep those values reachable from live roots outside `UserData`.
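
To make the overlay idea concrete, here is a sketch of a typed cast over raw bytes with a scalar-only payload. `extSQE` is a hypothetical stand-in for ExtSQE; its 64-byte `UserData` size is an assumption, and `castUserData` only illustrates the technique, not the package's implementation.

```go
package main

import (
	"fmt"
	"unsafe"
)

// extSQE is a hypothetical stand-in; the 64-byte UserData size is assumed.
type extSQE struct {
	_        [0]uint64 // forces 8-byte alignment of the struct
	SQE      [64]byte
	UserData [64]byte
}

// connState is scalar-only, so no Go pointers live in the raw bytes.
type connState struct {
	ConnID uint64
	Bytes  uint32
	Flags  uint32
}

// castUserData overlays *T on the UserData bytes after a size check.
func castUserData[T any](ext *extSQE) *T {
	var zero T
	if unsafe.Sizeof(zero) > unsafe.Sizeof(ext.UserData) {
		panic("payload does not fit in UserData")
	}
	return (*T)(unsafe.Pointer(&ext.UserData[0]))
}

func main() {
	ext := new(extSQE)
	st := castUserData[connState](ext)
	st.ConnID, st.Bytes = 42, 4096
	again := castUserData[connState](ext) // same bytes, same view
	fmt.Println(again.ConnID, again.Bytes)
}
```

The returned pointer aliases the slot's bytes, which is why it stays valid only while the slot is borrowed.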

func ContextUserData

func ContextUserData[T any](ctx context.Context) T

ContextUserData extracts a typed value from context. Returns the zero value of T if not found.

func ContextWithUserData

func ContextWithUserData[T any](ctx context.Context, val T) context.Context

ContextWithUserData returns a new context with the typed value stored.
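
One common way to build helpers of this shape on top of the standard `context` package is a generic, per-type key. The sketch below is an assumption about the pattern, not the package's actual implementation; `userDataKey`, `contextWithUserData`, `contextUserData`, and `roundTrip` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
)

// userDataKey is a hypothetical per-type key: each T gets a distinct key type.
type userDataKey[T any] struct{}

// contextWithUserData stores val under the key for its type.
func contextWithUserData[T any](ctx context.Context, val T) context.Context {
	return context.WithValue(ctx, userDataKey[T]{}, val)
}

// contextUserData returns the stored T, or the zero value if none is present.
func contextUserData[T any](ctx context.Context) T {
	v, _ := ctx.Value(userDataKey[T]{}).(T)
	return v
}

// roundTrip stores an int and then looks up both an int and a string.
func roundTrip() (int, string) {
	ctx := contextWithUserData(context.Background(), 42)
	return contextUserData[int](ctx), contextUserData[string](ctx)
}

func main() {
	id, name := roundTrip()
	fmt.Printf("id=%d name=%q\n", id, name)
}
```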

func PrepareListenerBind

func PrepareListenerBind(ext *ExtSQE, fd iofd.FD)

PrepareListenerBind fills ext's SQE for IORING_OP_BIND using the sockaddr stored by PrepareListenerSocket. fd is the socket returned by the SOCKET completion.

func PrepareListenerListen

func PrepareListenerListen(ext *ExtSQE, fd iofd.FD)

PrepareListenerListen fills ext's SQE for IORING_OP_LISTEN. fd is the bound socket; the backlog is taken from the stored context.

func PrepareListenerSocket

func PrepareListenerSocket(ext *ExtSQE, domain, sockType, proto int, sa Sockaddr, backlog int, handler ListenerHandler) error

PrepareListenerSocket fills ext's SQE for IORING_OP_SOCKET and stores the sockaddr and backlog for the subsequent stages. Small sockaddrs stay inline in ext.UserData; oversized ones stay anchored in the pooled sidecar. After calling this, submit with ring.SubmitExtended(PackExtended(ext)).

ext must be a pool-borrowed slot obtained from ContextPools.Extended. Passing a non-pooled ExtSQE is undefined behavior: the sidecar anchors live past the end of a standalone ExtSQE object.

A nil handler is normalized to NoopListenerHandler.

func SetListenerReady

func SetListenerReady(ext *ExtSQE)

SetListenerReady marks the listener context as ready. Call after LISTEN completes successfully.

Types

type Addr

type Addr = sock.Addr

Addr is the network address interface used by connect and bind helpers.

type AttrPI

type AttrPI struct {
	Flags  uint16 // PI flags
	AppTag uint16 // Application tag
	Len    uint32 // Length
	Addr   uint64 // Address
	Seed   uint64 // Seed value
	// contains filtered or unexported fields
}

AttrPI is the PI attribute information for read/write operations. Matches struct io_uring_attr_pi in Linux.

type BigBuffer

type BigBuffer = iobuf.BigBuffer

Buffer types re-exported from iobuf.

type BufferGroupsConfig

type BufferGroupsConfig struct {
	PicoNum   int // 32 B buffers
	NanoNum   int // 128 B buffers
	MicroNum  int // 512 B buffers
	SmallNum  int // 2 KiB buffers
	MediumNum int // 8 KiB buffers
	BigNum    int // 32 KiB buffers
	LargeNum  int // 128 KiB buffers
	GreatNum  int // 512 KiB buffers
	HugeNum   int // 2 MiB buffers
	VastNum   int // 8 MiB buffers
	GiantNum  int // 32 MiB buffers
	TitanNum  int // 128 MiB buffers
}

BufferGroupsConfig configures buffer counts for each tier.

Each field specifies the number of buffers to allocate for that tier. A zero count disables the tier (no memory allocated).

Memory usage calculation:

Total = Sum(TierSize × TierCount × Scale)

Example with the default config (Scale=1):

Pico:   32768 × 32 B   = 1 MiB
Nano:   16384 × 128 B  = 2 MiB
Micro:  8192 × 512 B   = 4 MiB
Small:  4096 × 2 KiB   = 8 MiB
Medium: 2048 × 8 KiB   = 16 MiB
Big:    1024 × 32 KiB  = 32 MiB
Large:  512 × 128 KiB  = 64 MiB
                Total  ≈ 127 MiB per scale
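
The formula above can be checked with a few lines of arithmetic. The sketch below recomputes the default total from the tier sizes and counts in the table; `tier`, `defaultTiers`, and `totalBytes` are hypothetical names used only for this calculation.

```go
package main

import "fmt"

const (
	KiB = 1 << 10
	MiB = 1 << 20
)

// tier pairs a buffer size with the number of buffers in that tier.
type tier struct {
	name  string
	size  int // bytes per buffer
	count int // buffers in the tier
}

// defaultTiers mirrors the seven tiers of the default config shown above.
var defaultTiers = []tier{
	{"Pico", 32, 32768},
	{"Nano", 128, 16384},
	{"Micro", 512, 8192},
	{"Small", 2 * KiB, 4096},
	{"Medium", 8 * KiB, 2048},
	{"Big", 32 * KiB, 1024},
	{"Large", 128 * KiB, 512},
}

// totalBytes evaluates Sum(TierSize × TierCount × Scale).
func totalBytes(tiers []tier, scale int) int {
	total := 0
	for _, t := range tiers {
		total += t.size * t.count * scale
	}
	return total
}

func main() {
	fmt.Printf("%d MiB\n", totalBytes(defaultTiers, 1)/MiB) // prints "127 MiB"
}
```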

func BufferConfigForBudget

func BufferConfigForBudget(budget int) (BufferGroupsConfig, int)

BufferConfigForBudget returns a BufferGroupsConfig and scale for the given memory budget. Use this when you want fine-grained control over Options while using budget-based buffer configuration.

Budget handling matches OptionsForBudget:

  • registered buffers use 25% of the budget (minimum 8 MiB)
  • ring overhead is reserved from the same budget
  • buffer groups use the remaining memory

The returned scale should be passed to Options.MultiSizeBuffer.

Example:

cfg, scale := BufferConfigForBudget(256 * MiB)
// cfg contains tier configuration, scale is the multiplier

func DefaultBufferGroupsConfig

func DefaultBufferGroupsConfig() BufferGroupsConfig

DefaultBufferGroupsConfig returns the default configuration. It enables the first 7 tiers (Pico through Large), totaling ~127 MiB per scale.

func FullBufferGroupsConfig

func FullBufferGroupsConfig() BufferGroupsConfig

FullBufferGroupsConfig returns configuration with all 12 tiers enabled. It uses ~4 GiB per scale. Use it only on high-memory systems.

func MinimalBufferGroupsConfig

func MinimalBufferGroupsConfig() BufferGroupsConfig

MinimalBufferGroupsConfig returns a reduced configuration. It enables the first 5 tiers (Pico through Medium), totaling ~31 MiB per scale.

type BundleIterator

type BundleIterator struct {
	// contains filtered or unexported fields
}

BundleIterator iterates over buffers consumed in a bundle receive operation. Bundle receives allow receiving multiple buffers in a single syscall, with data spanning the logical sequence of buffer IDs starting at the CQE's first ID.

The iterator handles buffer ring wrap-around using the ring mask.

func NewBundleIterator

func NewBundleIterator(cqe CQEView, bufBacking []byte, bufSize int, ringEntries int) (BundleIterator, bool)

NewBundleIterator creates an iterator for the buffers consumed by a bundle CQE.

Parameters:

  • cqe: the CQE from a bundle receive operation
  • bufBacking: backing memory for the full ring, such as the slice returned by AlignedMem
  • bufSize: size of each buffer in the ring
  • ringEntries: number of entries in the buffer ring; must be a power of two

bufBacking must remain alive for the iterator's lifetime and must cover at least bufSize*ringEntries bytes.

Returns a zero BundleIterator and false if the CQE indicates no data was received or if the constructor arguments are invalid.

func (BundleIterator) All

func (it BundleIterator) All() iter.Seq[[]byte]

All returns an iterator function for use with Go 1.23+ range-over-func. Each iteration yields one buffer from the bundle.

Usage:

for buf := range it.All() {
    process(buf)
}

func (BundleIterator) AllWithSlotID

func (it BundleIterator) AllWithSlotID() iter.Seq2[uint16, []byte]

AllWithSlotID returns an iterator that yields both buffer data and masked ring slot ID. Useful when you need to track which ring slots were consumed.

Usage:

for id, buf := range it.AllWithSlotID() {
    fmt.Printf("Slot ID %d: %d bytes\n", id, len(buf))
}

func (BundleIterator) Buffer

func (it BundleIterator) Buffer(index int) []byte

Buffer returns the buffer at the given index without advancing the iterator. Index must be in range [0, Count()). The last buffer may be partial.

func (BundleIterator) Collect

func (it BundleIterator) Collect() [][]byte

Collect returns all buffers as a slice. This allocates a new slice; for zero-allocation iteration, use All().

func (BundleIterator) CopyTo

func (it BundleIterator) CopyTo(dst []byte) int

CopyTo copies all bundle data to the destination slice. Returns the number of bytes copied.

func (BundleIterator) Count

func (it BundleIterator) Count() int

Count returns the number of buffers consumed in this bundle.

func (BundleIterator) Recycle

func (it BundleIterator) Recycle(ur *Uring)

Recycle returns all consumed buffers to the buffer ring via provide and commits them with advance. This MUST be called after the bundle data has been fully processed to prevent buffer ring entry leaks.

Recycle is single-threaded: do not call it concurrently with another Recycle on the same Uring, or with any other path that can race with buffer ring provide/advance.

The group info (gidOffset, group) is captured at construction time.

func (BundleIterator) SlotID

func (it BundleIterator) SlotID(index int) uint16

SlotID returns the masked ring slot ID at the given index in the bundle. Handles ring wrap-around automatically. Index must be in range [0, Count()).

func (BundleIterator) TotalBytes

func (it BundleIterator) TotalBytes() int

TotalBytes returns the total bytes received in this bundle.

type CQEView

type CQEView struct {
	Res   int32  // Completion result (directly accessible)
	Flags uint32 // CQE flags (directly accessible)
	// contains filtered or unexported fields
}

CQEView provides a view into a completion queue entry. It exposes kernel completion facts directly and lets higher layers decide how to route or interpret them. When available, it also exposes the submission context that produced those facts.

Property Patterns

| FullSQE() | Extended() | Mode     | Available Data                              |
|-----------|------------|----------|---------------------------------------------|
| false     | false      | Direct   | Op, SQE flags, BufGroup, FD, Res, CQE flags |
| true      | false      | Indirect | + full ioUringSqe copy                      |
| true      | true       | Extended | + borrowed `ExtSQE` escape hatch            |

Usage

n, err := ring.Wait(cqes)
if err != nil {
    return err
}
for i := range n {
    cqe := cqes[i]
    // Observe the kernel facts first.
    if cqe.Res < 0 {
        return fmt.Errorf("completion failed: op=%d fd=%d res=%d", cqe.Op(), cqe.FD(), cqe.Res)
    }
    fmt.Printf("completed op=%d on fd=%d with res=%d\n", cqe.Op(), cqe.FD(), cqe.Res)
    if cqe.HasMore() {
        // Higher layers decide whether to keep routing this live stream.
    }
    if cqe.FullSQE() {
        // Indirect and Extended modes also expose the submitted SQE.
        fmt.Printf("submitted opcode=%d\n", cqe.SQE().Opcode())
    }
}

func (*CQEView) BufGroup

func (c *CQEView) BufGroup() uint16

BufGroup returns the observed submission buffer group index. It is non-zero only when buffer selection was part of the submission.

func (*CQEView) BufID

func (c *CQEView) BufID() uint16

BufID returns the buffer ID from CQE flags. Only valid when IORING_CQE_F_BUFFER flag is set.
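
As a sketch of the decoding this describes, the buffer ID sits in the upper 16 bits of the CQE flags and is meaningful only when the buffer flag is set. The constants are local copies of `IORING_CQE_F_BUFFER` and `IORING_CQE_BUFFER_SHIFT`; `bufID` is a hypothetical free function, not the method itself.

```go
package main

import "fmt"

// Local copies of the CQE flag and shift listed earlier.
const (
	cqeFBuffer     = 1 << 0
	cqeBufferShift = 16
)

// bufID extracts the buffer ID from flags; ok is false when the
// buffer flag is not set and the upper bits carry no ID.
func bufID(flags uint32) (id uint16, ok bool) {
	if flags&cqeFBuffer == 0 {
		return 0, false
	}
	return uint16(flags >> cqeBufferShift), true
}

func main() {
	id, ok := bufID(7<<cqeBufferShift | cqeFBuffer)
	fmt.Println(id, ok)
}
```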

func (*CQEView) BundleBuffers

func (c *CQEView) BundleBuffers(bufferSize int) (startID uint16, count int)

BundleBuffers returns the logical range of buffer IDs consumed. The returned startID is the first buffer ID; count is the number of buffers. The range [startID, startID+count) is logical and may wrap around the ring. Callers must apply (id & ringMask) to obtain physical buffer IDs, or use BundleIterator which handles wrap-around automatically.
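
The wrap-around rule can be sketched as follows: each logical ID in the range is masked with `ringEntries-1`, which works because the ring size is a power of two. `bundleSlots` is a hypothetical helper written for this illustration.

```go
package main

import "fmt"

// bundleSlots maps the logical range [startID, startID+count) to physical
// slot IDs. ringEntries must be a power of two.
func bundleSlots(startID uint16, count, ringEntries int) []uint16 {
	mask := uint16(ringEntries - 1)
	slots := make([]uint16, 0, count)
	for i := 0; i < count; i++ {
		slots = append(slots, (startID+uint16(i))&mask)
	}
	return slots
}

func main() {
	// Four buffers starting at ID 6 in an 8-entry ring wrap back to slot 0.
	fmt.Println(bundleSlots(6, 4, 8)) // prints "[6 7 0 1]"
}
```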

func (*CQEView) BundleCount

func (c *CQEView) BundleCount(bufferSize int) int

BundleCount returns the number of buffers consumed in a bundle operation. For receive bundles, the count is derived from the result (bytes received) divided by the buffer size, so it is accurate only when bufferSize matches the actual size of the buffers in the ring.

func (*CQEView) BundleStartID

func (c *CQEView) BundleStartID() uint16

BundleStartID returns the starting buffer ID for a bundle operation. For bundle receives, buffers are consumed contiguously from this ID. Only valid when IORING_CQE_F_BUFFER flag is set.

func (*CQEView) Context

func (c *CQEView) Context() SQEContext

Context returns the underlying SQEContext. Use this for advanced mode-specific inspection beyond the CQEView helpers.

func (*CQEView) ExtSQE

func (c *CQEView) ExtSQE() *ExtSQE

ExtSQE returns the borrowed ExtSQE backing Extended mode contexts. Caller should check Extended() first.

func (*CQEView) Extended

func (c *CQEView) Extended() bool

Extended reports whether extended user data is available. Returns true only for Extended mode.

func (*CQEView) FD

func (c *CQEView) FD() iofd.FD

FD returns the file descriptor associated with the operation. Always available.

func (*CQEView) FullSQE

func (c *CQEView) FullSQE() bool

FullSQE reports whether full SQE information is available. Returns true for Indirect and Extended modes.

func (*CQEView) HasBuffer

func (c *CQEView) HasBuffer() bool

HasBuffer reports whether a buffer ID is available in the flags.

func (*CQEView) HasBufferMore

func (c *CQEView) HasBufferMore() bool

HasBufferMore reports whether the buffer was partially consumed (incremental mode). When set, the same buffer ID remains valid for additional data.

func (*CQEView) HasMore

func (c *CQEView) HasMore() bool

HasMore reports whether more completions are coming (multishot).

func (*CQEView) IsNotification

func (c *CQEView) IsNotification() bool

IsNotification reports whether this is a zero-copy notification CQE. Zero-copy sends generate two CQEs: one for completion, one for notification when the buffer can be reused.
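
A sketch of telling the two CQEs apart from their flags: the notification carries `IORING_CQE_F_NOTIF`, and the first CQE carries `IORING_CQE_F_MORE` when a notification is still to come. The constants are local copies; `zcEvent`, `classifyZC`, and `notifPending` are hypothetical names.

```go
package main

import "fmt"

// Local copies of the CQE flags listed earlier.
const (
	cqeFMore  = 1 << 1
	cqeFNotif = 1 << 3
)

type zcEvent int

const (
	zcSendResult     zcEvent = iota // first CQE: Res carries the send result
	zcBufferReleased                // notification CQE: buffer may be reused
)

// classifyZC distinguishes the send result from the buffer notification.
func classifyZC(flags uint32) zcEvent {
	if flags&cqeFNotif != 0 {
		return zcBufferReleased
	}
	return zcSendResult
}

// notifPending reports whether a send-result CQE announces a later notification.
func notifPending(flags uint32) bool {
	return flags&cqeFNotif == 0 && flags&cqeFMore != 0
}

func main() {
	fmt.Println(classifyZC(cqeFMore) == zcSendResult, notifPending(cqeFMore))
	fmt.Println(classifyZC(cqeFNotif) == zcBufferReleased, notifPending(cqeFNotif))
}
```

Until the notification arrives, the send buffer should be treated as still in use.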

func (*CQEView) Op

func (c *CQEView) Op() uint8

Op returns the IORING_OP_* opcode. Always available (extracted from Direct mode context or from SQE in other modes).

func (*CQEView) SQE

func (c *CQEView) SQE() SQEView

SQE returns a view of the submitted SQE when the context retains one. Caller should check FullSQE() first; Direct mode returns an invalid view because it keeps only compact completion-context facts.

func (*CQEView) SocketNonEmpty

func (c *CQEView) SocketNonEmpty() bool

SocketNonEmpty reports whether the socket has more data available. This is set when a short read/recv occurred but more data remains.

type CloneBuffers

type CloneBuffers struct {
	SrcFD  uint32 // Source ring file descriptor
	Flags  uint32 // IORING_REGISTER_SRC_* flags
	SrcOff uint32 // Source buffer offset
	DstOff uint32 // Destination buffer offset
	Nr     uint32 // Number of buffers to clone
	// contains filtered or unexported fields
}

CloneBuffers describes a buffer clone operation. Matches struct io_uring_clone_buffers in Linux.

type ContextPools

type ContextPools struct {
	// contains filtered or unexported fields
}

ContextPools holds pooled IndirectSQE and ExtSQE contexts. IndirectSQE slots use explicit aligned backing; extended slots pair each ExtSQE with adjacent GC-visible sidecar anchors.

func NewContextPools

func NewContextPools(capacity int) *ContextPools

NewContextPools creates pooled IndirectSQE and ExtSQE contexts with the given per-pool capacity. New pools are ready for immediate use.

func (*ContextPools) Capacity

func (p *ContextPools) Capacity() int

Capacity returns the per-pool slot count.

func (*ContextPools) Extended

func (p *ContextPools) Extended() *ExtSQE

Extended borrows an ExtSQE from the pool. Returns nil if exhausted.

func (*ContextPools) ExtendedAvailable

func (p *ContextPools) ExtendedAvailable() int

ExtendedAvailable returns the number of ExtSQE slots available.

func (*ContextPools) Indirect

func (p *ContextPools) Indirect() *IndirectSQE

Indirect borrows an IndirectSQE from the pool. Returns nil if exhausted.

func (*ContextPools) IndirectAvailable

func (p *ContextPools) IndirectAvailable() int

IndirectAvailable returns the number of IndirectSQE slots available.

func (*ContextPools) PutExtended

func (p *ContextPools) PutExtended(ext *ExtSQE)

PutExtended returns an ExtSQE to the pool and clears its sidecar anchors.

func (*ContextPools) PutIndirect

func (p *ContextPools) PutIndirect(indirect *IndirectSQE)

PutIndirect returns an IndirectSQE to the pool.

func (*ContextPools) Reset

func (p *ContextPools) Reset()

Reset scrubs pooled slot state and reinitializes both pool queues, making all slots available again.
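
The borrow, return, and reset contract described above can be sketched with a small fixed-capacity pool. This is a single-goroutine illustration under stated assumptions: the slot and fixedPool types and all method names are hypothetical, and the package's own pools may use a different structure entirely.

```go
package main

import "fmt"

// slot is a hypothetical stand-in for a pooled context such as ExtSQE.
type slot struct {
	UserData [64]byte
}

// fixedPool is an illustrative fixed-capacity pool. It is not safe for
// concurrent use and does not guard against returning a slot twice.
type fixedPool struct {
	slots []slot
	free  []*slot
}

func newFixedPool(capacity int) *fixedPool {
	p := &fixedPool{
		slots: make([]slot, capacity),
		free:  make([]*slot, 0, capacity),
	}
	p.Reset()
	return p
}

// Get borrows a slot, or returns nil when the pool is exhausted.
func (p *fixedPool) Get() *slot {
	n := len(p.free)
	if n == 0 {
		return nil
	}
	s := p.free[n-1]
	p.free = p.free[:n-1]
	return s
}

// Put scrubs the slot and returns it; the caller must not use it afterwards.
func (p *fixedPool) Put(s *slot) {
	*s = slot{}
	p.free = append(p.free, s)
}

func (p *fixedPool) Available() int { return len(p.free) }
func (p *fixedPool) Capacity() int  { return len(p.slots) }

// Reset scrubs every slot and makes all of them available again.
func (p *fixedPool) Reset() {
	p.free = p.free[:0]
	for i := range p.slots {
		p.slots[i] = slot{}
		p.free = append(p.free, &p.slots[i])
	}
}

func main() {
	p := newFixedPool(2)
	a, b := p.Get(), p.Get()
	fmt.Println(p.Get() == nil, p.Available()) // third borrow fails
	p.Put(a)
	p.Put(b)
	fmt.Println(p.Available() == p.Capacity())
}
```

Returning nil on exhaustion lets the caller apply its own backpressure policy instead of blocking inside the pool.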

type Ctx0

type Ctx0 struct {
	Fn   Handler  // 8 bytes
	Data [56]byte // 56 bytes
}

Ctx0 has 0 refs, 0 vals, and 56 bytes of data.
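
Every context layout in this family is meant to fill the 64-byte UserData area exactly. A minimal sketch of how that can be checked, assuming a 64-bit platform where a func value, a pointer, and an int64 each take 8 bytes, is shown below. The lower-case types are local replicas written for this sketch, and handler is a placeholder for the package's Handler type.

```go
package main

import (
	"fmt"
	"unsafe"
)

// handler is a placeholder for the package's Handler type. Any Go func
// value occupies one pointer-sized word.
type handler = func()

// Local replicas of three documented layouts, used for size checks only.
type ctx0 struct {
	Fn   handler
	Data [56]byte
}

type ctx0V3 struct {
	Fn               handler
	Val1, Val2, Val3 int64
	Data             [32]byte
}

type ctx2V2[T1, T2 any] struct {
	Fn         handler
	Ref1       *T1
	Ref2       *T2
	Val1, Val2 int64
	Data       [24]byte
}

// sizes reports the size in bytes of each replica.
func sizes() [3]uintptr {
	return [3]uintptr{
		unsafe.Sizeof(ctx0{}),
		unsafe.Sizeof(ctx0V3{}),
		unsafe.Sizeof(ctx2V2[int, string]{}),
	}
}

func main() {
	// On a 64-bit platform each entry equals the UserData size of 64.
	fmt.Println(sizes())
}
```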

func Ctx0Of

func Ctx0Of(sqe *ExtSQE) *Ctx0

Ctx0Of is a shorthand for ViewCtx(sqe).Vals0(). Use when you need just a handler with max data space (56B).

func CtxOf

func CtxOf(sqe *ExtSQE) *Ctx0

CtxOf is a shorthand for ViewCtx(sqe).Vals0().

type Ctx0V1

type Ctx0V1 struct {
	Fn   Handler  // 8 bytes
	Val1 int64    // 8 bytes
	Data [48]byte // 48 bytes
}

Ctx0V1 has 0 refs, 1 val, and 48 bytes of data.

func Ctx0V1Of

func Ctx0V1Of(sqe *ExtSQE) *Ctx0V1

Ctx0V1Of is a shorthand for ViewCtx(sqe).Vals1(). Use when you need 0 refs and 1 val (e.g., handler + timestamp).

func CtxV1Of

func CtxV1Of(sqe *ExtSQE) *Ctx0V1

CtxV1Of is a shorthand for ViewCtx(sqe).Vals1().

type Ctx0V2

type Ctx0V2 struct {
	Fn   Handler  // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Data [40]byte // 40 bytes
}

Ctx0V2 has 0 refs, 2 vals, and 40 bytes of data.

func Ctx0V2Of

func Ctx0V2Of(sqe *ExtSQE) *Ctx0V2

Ctx0V2Of is a shorthand for ViewCtx(sqe).Vals2(). Use when you need 0 refs and 2 vals (e.g., handler + offset + length).

func CtxV2Of

func CtxV2Of(sqe *ExtSQE) *Ctx0V2

CtxV2Of is a shorthand for ViewCtx(sqe).Vals2().

type Ctx0V3

type Ctx0V3 struct {
	Fn   Handler  // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Val3 int64    // 8 bytes
	Data [32]byte // 32 bytes
}

Ctx0V3 has 0 refs, 3 vals, and 32 bytes of data.

func Ctx0V3Of

func Ctx0V3Of(sqe *ExtSQE) *Ctx0V3

Ctx0V3Of is a shorthand for ViewCtx(sqe).Vals3(). Use when you need 0 refs and 3 vals.

func CtxV3Of

func CtxV3Of(sqe *ExtSQE) *Ctx0V3

CtxV3Of is a shorthand for ViewCtx(sqe).Vals3().

type Ctx0V4

type Ctx0V4 struct {
	Fn   Handler  // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Val3 int64    // 8 bytes
	Val4 int64    // 8 bytes
	Data [24]byte // 24 bytes
}

Ctx0V4 has 0 refs, 4 vals, and 24 bytes of data.

func Ctx0V4Of

func Ctx0V4Of(sqe *ExtSQE) *Ctx0V4

Ctx0V4Of is a shorthand for ViewCtx(sqe).Vals4(). Use when you need 0 refs and 4 vals.

func CtxV4Of

func CtxV4Of(sqe *ExtSQE) *Ctx0V4

CtxV4Of is a shorthand for ViewCtx(sqe).Vals4().

type Ctx0V5

type Ctx0V5 struct {
	Fn   Handler  // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Val3 int64    // 8 bytes
	Val4 int64    // 8 bytes
	Val5 int64    // 8 bytes
	Data [16]byte // 16 bytes
}

Ctx0V5 has 0 refs, 5 vals, and 16 bytes of data.

func Ctx0V5Of

func Ctx0V5Of(sqe *ExtSQE) *Ctx0V5

Ctx0V5Of is a shorthand for ViewCtx(sqe).Vals5(). Use when you need 0 refs and 5 vals.

func CtxV5Of

func CtxV5Of(sqe *ExtSQE) *Ctx0V5

CtxV5Of is a shorthand for ViewCtx(sqe).Vals5().

type Ctx0V6

type Ctx0V6 struct {
	Fn   Handler // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Val3 int64   // 8 bytes
	Val4 int64   // 8 bytes
	Val5 int64   // 8 bytes
	Val6 int64   // 8 bytes
	Data [8]byte // 8 bytes
}

Ctx0V6 has 0 refs, 6 vals, and 8 bytes of data.

func Ctx0V6Of

func Ctx0V6Of(sqe *ExtSQE) *Ctx0V6

Ctx0V6Of is a shorthand for ViewCtx(sqe).Vals6(). Use when you need 0 refs and 6 vals.

func CtxV6Of

func CtxV6Of(sqe *ExtSQE) *Ctx0V6

CtxV6Of is a shorthand for ViewCtx(sqe).Vals6().

type Ctx0V7

type Ctx0V7 struct {
	Fn   Handler // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Val3 int64   // 8 bytes
	Val4 int64   // 8 bytes
	Val5 int64   // 8 bytes
	Val6 int64   // 8 bytes
	Val7 int64   // 8 bytes
}

Ctx0V7 has 0 refs, 7 vals, and 0 bytes of data.

func Ctx0V7Of

func Ctx0V7Of(sqe *ExtSQE) *Ctx0V7

Ctx0V7Of is a shorthand for ViewCtx(sqe).Vals7(). Use when you need 0 refs and 7 vals.

func CtxV7Of

func CtxV7Of(sqe *ExtSQE) *Ctx0V7

CtxV7Of is a shorthand for ViewCtx(sqe).Vals7().

type Ctx1

type Ctx1[T1 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Data [48]byte // 48 bytes
}

Ctx1 has 1 ref, 0 vals, 48 bytes data.

func Ctx1Of

func Ctx1Of[T any](sqe *ExtSQE) *Ctx1[T]

Ctx1Of is a shorthand for ViewCtx1[T](sqe).Vals0(). Use when you need 1 ref and 0 vals (e.g., handler + connection ref).

type Ctx1V1

type Ctx1V1[T1 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Val1 int64    // 8 bytes
	Data [40]byte // 40 bytes
}

Ctx1V1 has 1 ref, 1 val, 40 bytes data.

func Ctx1V1Of

func Ctx1V1Of[T any](sqe *ExtSQE) *Ctx1V1[T]

Ctx1V1Of is a shorthand for ViewCtx1[T](sqe).Vals1(), the most common case. Use when you need 1 ref and 1 val (e.g., connection + timestamp).

type Ctx1V2

type Ctx1V2[T1 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Data [32]byte // 32 bytes
}

Ctx1V2 has 1 ref, 2 vals, 32 bytes data.

func Ctx1V2Of

func Ctx1V2Of[T any](sqe *ExtSQE) *Ctx1V2[T]

Ctx1V2Of is a shorthand for ViewCtx1[T](sqe).Vals2(). Use when you need 1 ref and 2 vals (e.g., connection + offset + length).

type Ctx1V3

type Ctx1V3[T1 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Val3 int64    // 8 bytes
	Data [24]byte // 24 bytes
}

Ctx1V3 has 1 ref, 3 vals, 24 bytes data.

func Ctx1V3Of

func Ctx1V3Of[T any](sqe *ExtSQE) *Ctx1V3[T]

Ctx1V3Of is a shorthand for ViewCtx1[T](sqe).Vals3(). Use when you need 1 ref and 3 vals.

type Ctx1V4

type Ctx1V4[T1 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Val3 int64    // 8 bytes
	Val4 int64    // 8 bytes
	Data [16]byte // 16 bytes
}

Ctx1V4 has 1 ref, 4 vals, 16 bytes data.

func Ctx1V4Of

func Ctx1V4Of[T any](sqe *ExtSQE) *Ctx1V4[T]

Ctx1V4Of is a shorthand for ViewCtx1[T](sqe).Vals4(). Use when you need 1 ref and 4 vals.

type Ctx1V5

type Ctx1V5[T1 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Val3 int64   // 8 bytes
	Val4 int64   // 8 bytes
	Val5 int64   // 8 bytes
	Data [8]byte // 8 bytes
}

Ctx1V5 has 1 ref, 5 vals, 8 bytes data.

func Ctx1V5Of

func Ctx1V5Of[T any](sqe *ExtSQE) *Ctx1V5[T]

Ctx1V5Of is a shorthand for ViewCtx1[T](sqe).Vals5(). Use when you need 1 ref and 5 vals.

type Ctx1V6

type Ctx1V6[T1 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Val3 int64   // 8 bytes
	Val4 int64   // 8 bytes
	Val5 int64   // 8 bytes
	Val6 int64   // 8 bytes
}

Ctx1V6 has 1 ref, 6 vals, 0 bytes data.

func Ctx1V6Of

func Ctx1V6Of[T any](sqe *ExtSQE) *Ctx1V6[T]

Ctx1V6Of is a shorthand for ViewCtx1[T](sqe).Vals6(). Use when you need 1 ref and 6 vals.

type Ctx2

type Ctx2[T1, T2 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Data [40]byte // 40 bytes
}

Ctx2 has 2 refs, 0 vals, 40 bytes data.

func Ctx2Of

func Ctx2Of[T1, T2 any](sqe *ExtSQE) *Ctx2[T1, T2]

Ctx2Of is a shorthand for ViewCtx2[T1,T2](sqe).Vals0(). Use when you need 2 refs and 0 vals (e.g., conn + buffer).

type Ctx2V1

type Ctx2V1[T1, T2 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Val1 int64    // 8 bytes
	Data [32]byte // 32 bytes
}

Ctx2V1 has 2 refs, 1 val, 32 bytes data.

func Ctx2V1Of

func Ctx2V1Of[T1, T2 any](sqe *ExtSQE) *Ctx2V1[T1, T2]

Ctx2V1Of is a shorthand for ViewCtx2[T1,T2](sqe).Vals1(). Use when you need 2 refs and 1 val (e.g., conn + buf + offset).

type Ctx2V2

type Ctx2V2[T1, T2 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Data [24]byte // 24 bytes
}

Ctx2V2 has 2 refs, 2 vals, 24 bytes data.

func Ctx2V2Of

func Ctx2V2Of[T1, T2 any](sqe *ExtSQE) *Ctx2V2[T1, T2]

Ctx2V2Of is a shorthand for ViewCtx2[T1,T2](sqe).Vals2(). Use when you need 2 refs and 2 vals (e.g., conn + buf + offset + length).

type Ctx2V3

type Ctx2V3[T1, T2 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Val3 int64    // 8 bytes
	Data [16]byte // 16 bytes
}

Ctx2V3 has 2 refs, 3 vals, 16 bytes data.

func Ctx2V3Of

func Ctx2V3Of[T1, T2 any](sqe *ExtSQE) *Ctx2V3[T1, T2]

Ctx2V3Of is a shorthand for ViewCtx2[T1,T2](sqe).Vals3(). Use when you need 2 refs and 3 vals.

type Ctx2V4

type Ctx2V4[T1, T2 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Val3 int64   // 8 bytes
	Val4 int64   // 8 bytes
	Data [8]byte // 8 bytes
}

Ctx2V4 has 2 refs, 4 vals, 8 bytes data.

func Ctx2V4Of

func Ctx2V4Of[T1, T2 any](sqe *ExtSQE) *Ctx2V4[T1, T2]

Ctx2V4Of is a shorthand for ViewCtx2[T1,T2](sqe).Vals4(). Use when you need 2 refs and 4 vals.

type Ctx2V5

type Ctx2V5[T1, T2 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Val3 int64   // 8 bytes
	Val4 int64   // 8 bytes
	Val5 int64   // 8 bytes
}

Ctx2V5 has 2 refs, 5 vals, 0 bytes data.

func Ctx2V5Of

func Ctx2V5Of[T1, T2 any](sqe *ExtSQE) *Ctx2V5[T1, T2]

Ctx2V5Of is a shorthand for ViewCtx2[T1,T2](sqe).Vals5(). Use when you need 2 refs and 5 vals.

type Ctx3

type Ctx3[T1, T2, T3 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Ref3 *T3      // 8 bytes
	Data [32]byte // 32 bytes
}

Ctx3 has 3 refs, 0 vals, 32 bytes data.

func Ctx3Of

func Ctx3Of[T1, T2, T3 any](sqe *ExtSQE) *Ctx3[T1, T2, T3]

Ctx3Of is a shorthand for ViewCtx3[T1,T2,T3](sqe).Vals0(). Use when you need 3 refs and 0 vals.

type Ctx3V1

type Ctx3V1[T1, T2, T3 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Ref3 *T3      // 8 bytes
	Val1 int64    // 8 bytes
	Data [24]byte // 24 bytes
}

Ctx3V1 has 3 refs, 1 val, 24 bytes data.

func Ctx3V1Of

func Ctx3V1Of[T1, T2, T3 any](sqe *ExtSQE) *Ctx3V1[T1, T2, T3]

Ctx3V1Of is a shorthand for ViewCtx3[T1,T2,T3](sqe).Vals1(). Use when you need 3 refs and 1 val.

type Ctx3V2

type Ctx3V2[T1, T2, T3 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Ref3 *T3      // 8 bytes
	Val1 int64    // 8 bytes
	Val2 int64    // 8 bytes
	Data [16]byte // 16 bytes
}

Ctx3V2 has 3 refs, 2 vals, 16 bytes data.

func Ctx3V2Of

func Ctx3V2Of[T1, T2, T3 any](sqe *ExtSQE) *Ctx3V2[T1, T2, T3]

Ctx3V2Of is a shorthand for ViewCtx3[T1,T2,T3](sqe).Vals2(). Use when you need 3 refs and 2 vals.

type Ctx3V3

type Ctx3V3[T1, T2, T3 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Ref3 *T3     // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Val3 int64   // 8 bytes
	Data [8]byte // 8 bytes
}

Ctx3V3 has 3 refs, 3 vals, 8 bytes data.

func Ctx3V3Of

func Ctx3V3Of[T1, T2, T3 any](sqe *ExtSQE) *Ctx3V3[T1, T2, T3]

Ctx3V3Of is a shorthand for ViewCtx3[T1,T2,T3](sqe).Vals3(). Use when you need 3 refs and 3 vals.

type Ctx3V4

type Ctx3V4[T1, T2, T3 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Ref3 *T3     // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Val3 int64   // 8 bytes
	Val4 int64   // 8 bytes
}

Ctx3V4 has 3 refs, 4 vals, 0 bytes data.

func Ctx3V4Of

func Ctx3V4Of[T1, T2, T3 any](sqe *ExtSQE) *Ctx3V4[T1, T2, T3]

Ctx3V4Of is a shorthand for ViewCtx3[T1,T2,T3](sqe).Vals4(). Use when you need 3 refs and 4 vals.

type Ctx4

type Ctx4[T1, T2, T3, T4 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Ref3 *T3      // 8 bytes
	Ref4 *T4      // 8 bytes
	Data [24]byte // 24 bytes
}

Ctx4 has 4 refs, 0 vals, 24 bytes data.

func Ctx4Of

func Ctx4Of[T1, T2, T3, T4 any](sqe *ExtSQE) *Ctx4[T1, T2, T3, T4]

Ctx4Of is a shorthand for ViewCtx4[T1,T2,T3,T4](sqe).Vals0(). Use when you need 4 refs and 0 vals.

type Ctx4V1

type Ctx4V1[T1, T2, T3, T4 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Ref3 *T3      // 8 bytes
	Ref4 *T4      // 8 bytes
	Val1 int64    // 8 bytes
	Data [16]byte // 16 bytes
}

Ctx4V1 has 4 refs, 1 val, 16 bytes data.

func Ctx4V1Of

func Ctx4V1Of[T1, T2, T3, T4 any](sqe *ExtSQE) *Ctx4V1[T1, T2, T3, T4]

Ctx4V1Of is a shorthand for ViewCtx4[T1,T2,T3,T4](sqe).Vals1(). Use when you need 4 refs and 1 val.

type Ctx4V2

type Ctx4V2[T1, T2, T3, T4 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Ref3 *T3     // 8 bytes
	Ref4 *T4     // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Data [8]byte // 8 bytes
}

Ctx4V2 has 4 refs, 2 vals, 8 bytes data.

func Ctx4V2Of

func Ctx4V2Of[T1, T2, T3, T4 any](sqe *ExtSQE) *Ctx4V2[T1, T2, T3, T4]

Ctx4V2Of is a shorthand for ViewCtx4[T1,T2,T3,T4](sqe).Vals2(). Use when you need 4 refs and 2 vals.

type Ctx4V3

type Ctx4V3[T1, T2, T3, T4 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Ref3 *T3     // 8 bytes
	Ref4 *T4     // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
	Val3 int64   // 8 bytes
}

Ctx4V3 has 4 refs, 3 vals, 0 bytes data.

func Ctx4V3Of

func Ctx4V3Of[T1, T2, T3, T4 any](sqe *ExtSQE) *Ctx4V3[T1, T2, T3, T4]

Ctx4V3Of is a shorthand for ViewCtx4[T1,T2,T3,T4](sqe).Vals3(). Use when you need 4 refs and 3 vals.

type Ctx5

type Ctx5[T1, T2, T3, T4, T5 any] struct {
	Fn   Handler  // 8 bytes
	Ref1 *T1      // 8 bytes
	Ref2 *T2      // 8 bytes
	Ref3 *T3      // 8 bytes
	Ref4 *T4      // 8 bytes
	Ref5 *T5      // 8 bytes
	Data [16]byte // 16 bytes
}

Ctx5 has 5 refs, 0 vals, 16 bytes data.

func Ctx5Of

func Ctx5Of[T1, T2, T3, T4, T5 any](sqe *ExtSQE) *Ctx5[T1, T2, T3, T4, T5]

Ctx5Of is a shorthand for ViewCtx5[T1,T2,T3,T4,T5](sqe).Vals0(). Use when you need 5 refs and 0 vals.

type Ctx5V1

type Ctx5V1[T1, T2, T3, T4, T5 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Ref3 *T3     // 8 bytes
	Ref4 *T4     // 8 bytes
	Ref5 *T5     // 8 bytes
	Val1 int64   // 8 bytes
	Data [8]byte // 8 bytes
}

Ctx5V1 has 5 refs, 1 val, 8 bytes data.

func Ctx5V1Of

func Ctx5V1Of[T1, T2, T3, T4, T5 any](sqe *ExtSQE) *Ctx5V1[T1, T2, T3, T4, T5]

Ctx5V1Of is a shorthand for ViewCtx5[T1,T2,T3,T4,T5](sqe).Vals1(). Use when you need 5 refs and 1 val.

type Ctx5V2

type Ctx5V2[T1, T2, T3, T4, T5 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Ref3 *T3     // 8 bytes
	Ref4 *T4     // 8 bytes
	Ref5 *T5     // 8 bytes
	Val1 int64   // 8 bytes
	Val2 int64   // 8 bytes
}

Ctx5V2 has 5 refs, 2 vals, 0 bytes data.

func Ctx5V2Of

func Ctx5V2Of[T1, T2, T3, T4, T5 any](sqe *ExtSQE) *Ctx5V2[T1, T2, T3, T4, T5]

Ctx5V2Of is a shorthand for ViewCtx5[T1,T2,T3,T4,T5](sqe).Vals2(). Use when you need 5 refs and 2 vals.

type Ctx6

type Ctx6[T1, T2, T3, T4, T5, T6 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Ref3 *T3     // 8 bytes
	Ref4 *T4     // 8 bytes
	Ref5 *T5     // 8 bytes
	Ref6 *T6     // 8 bytes
	Data [8]byte // 8 bytes
}

Ctx6 has 6 refs, 0 vals, 8 bytes data.

func Ctx6Of

func Ctx6Of[T1, T2, T3, T4, T5, T6 any](sqe *ExtSQE) *Ctx6[T1, T2, T3, T4, T5, T6]

Ctx6Of is a shorthand for ViewCtx6[T1,T2,T3,T4,T5,T6](sqe).Vals0(). Use when you need 6 refs and 0 vals.

type Ctx6V1

type Ctx6V1[T1, T2, T3, T4, T5, T6 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Ref3 *T3     // 8 bytes
	Ref4 *T4     // 8 bytes
	Ref5 *T5     // 8 bytes
	Ref6 *T6     // 8 bytes
	Val1 int64   // 8 bytes
}

Ctx6V1 has 6 refs, 1 val, 0 bytes data.

func Ctx6V1Of

func Ctx6V1Of[T1, T2, T3, T4, T5, T6 any](sqe *ExtSQE) *Ctx6V1[T1, T2, T3, T4, T5, T6]

Ctx6V1Of is a shorthand for ViewCtx6[T1,T2,T3,T4,T5,T6](sqe).Vals1(). Use when you need 6 refs and 1 val.

type Ctx7

type Ctx7[T1, T2, T3, T4, T5, T6, T7 any] struct {
	Fn   Handler // 8 bytes
	Ref1 *T1     // 8 bytes
	Ref2 *T2     // 8 bytes
	Ref3 *T3     // 8 bytes
	Ref4 *T4     // 8 bytes
	Ref5 *T5     // 8 bytes
	Ref6 *T6     // 8 bytes
	Ref7 *T7     // 8 bytes
}

Ctx7 has 7 refs, 0 vals, 0 bytes data.

func Ctx7Of

func Ctx7Of[T1, T2, T3, T4, T5, T6, T7 any](sqe *ExtSQE) *Ctx7[T1, T2, T3, T4, T5, T6, T7]

Ctx7Of is a shorthand for ViewCtx7[T1,T2,T3,T4,T5,T6,T7](sqe).Vals0(). Use when you need 7 refs and 0 vals.

type CtxRefs0

type CtxRefs0 struct {
	// contains filtered or unexported fields
}

CtxRefs0 is a view into ExtSQE with 0 refs. Use its methods to select the number of vals (0-7).

func ViewCtx

func ViewCtx(sqe *ExtSQE) CtxRefs0

ViewCtx creates a CtxRefs0 for accessing the UserData with 0 refs.

Example:

c := uring.ViewCtx(sqe).Vals3()  // 0 refs, 3 vals
c.Val1 = timestamp
c.Val2 = flags
c.Val3 = seqNum
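
A typed view of this kind can be pictured as a reinterpretation of the 64 user-data bytes. The sketch below shows the general technique under stated assumptions: extSQE, vals3, and viewVals3 are hypothetical names, the handler slot is modelled as a uintptr so that no Go pointer is stored in raw bytes, and the package's actual implementation is not shown in this reference.

```go
package main

import (
	"fmt"
	"unsafe"
)

// extSQE is a hypothetical stand-in for the package's ExtSQE.
// The zero-length uint64 array forces 8-byte alignment so that UserData
// can be viewed as a struct of 64-bit fields.
type extSQE struct {
	_        [0]uint64
	SQE      [64]byte
	UserData [64]byte
}

// vals3 mirrors the documented Ctx0V3 shape (0 refs, 3 vals, 32 data bytes),
// with a uintptr in place of the handler slot.
type vals3 struct {
	Fn               uintptr
	Val1, Val2, Val3 int64
	Data             [32]byte
}

// viewVals3 reinterprets the user-data bytes as a vals3 without copying.
// Both types are 64 bytes, so the view never reaches past UserData.
func viewVals3(sqe *extSQE) *vals3 {
	return (*vals3)(unsafe.Pointer(&sqe.UserData))
}

func main() {
	var sqe extSQE
	c := viewVals3(&sqe)
	c.Val1, c.Val2, c.Val3 = 1700000000, 0x2, 42

	// A second view over the same slot observes the same values.
	again := viewVals3(&sqe)
	fmt.Println(again.Val1, again.Val2, again.Val3)
}
```

Because the view aliases the slot, it becomes invalid at the same moment the slot is returned to its pool.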

func (CtxRefs0) Vals0

func (v CtxRefs0) Vals0() *Ctx0

Vals0 returns a Ctx0 pointer (0 refs, 0 vals, 56B data).

func (CtxRefs0) Vals1

func (v CtxRefs0) Vals1() *Ctx0V1

Vals1 returns a Ctx0V1 pointer (0 refs, 1 val, 48B data).

func (CtxRefs0) Vals2

func (v CtxRefs0) Vals2() *Ctx0V2

Vals2 returns a Ctx0V2 pointer (0 refs, 2 vals, 40B data).

func (CtxRefs0) Vals3

func (v CtxRefs0) Vals3() *Ctx0V3

Vals3 returns a Ctx0V3 pointer (0 refs, 3 vals, 32B data).

func (CtxRefs0) Vals4

func (v CtxRefs0) Vals4() *Ctx0V4

Vals4 returns a Ctx0V4 pointer (0 refs, 4 vals, 24B data).

func (CtxRefs0) Vals5

func (v CtxRefs0) Vals5() *Ctx0V5

Vals5 returns a Ctx0V5 pointer (0 refs, 5 vals, 16B data).

func (CtxRefs0) Vals6

func (v CtxRefs0) Vals6() *Ctx0V6

Vals6 returns a Ctx0V6 pointer (0 refs, 6 vals, 8B data).

func (CtxRefs0) Vals7

func (v CtxRefs0) Vals7() *Ctx0V7

Vals7 returns a Ctx0V7 pointer (0 refs, 7 vals, 0B data).

type CtxRefs1

type CtxRefs1[T1 any] struct {
	// contains filtered or unexported fields
}

CtxRefs1 is a view into ExtSQE with 1 ref. Use its methods to select the number of vals (0-6).

func ViewCtx1

func ViewCtx1[T1 any](sqe *ExtSQE) CtxRefs1[T1]

ViewCtx1 creates a CtxRefs1 for accessing the UserData with 1 typed ref.

Example:

c := uring.ViewCtx1[Connection](sqe).Vals1()  // 1 ref, 1 val
c.Val1 = time.Now().UnixNano()

func (CtxRefs1[T1]) Vals0

func (v CtxRefs1[T1]) Vals0() *Ctx1[T1]

Vals0 returns a Ctx1 pointer (1 ref, 0 vals, 48B data).

func (CtxRefs1[T1]) Vals1

func (v CtxRefs1[T1]) Vals1() *Ctx1V1[T1]

Vals1 returns a Ctx1V1 pointer (1 ref, 1 val, 40B data).

func (CtxRefs1[T1]) Vals2

func (v CtxRefs1[T1]) Vals2() *Ctx1V2[T1]

Vals2 returns a Ctx1V2 pointer (1 ref, 2 vals, 32B data).

func (CtxRefs1[T1]) Vals3

func (v CtxRefs1[T1]) Vals3() *Ctx1V3[T1]

Vals3 returns a Ctx1V3 pointer (1 ref, 3 vals, 24B data).

func (CtxRefs1[T1]) Vals4

func (v CtxRefs1[T1]) Vals4() *Ctx1V4[T1]

Vals4 returns a Ctx1V4 pointer (1 ref, 4 vals, 16B data).

func (CtxRefs1[T1]) Vals5

func (v CtxRefs1[T1]) Vals5() *Ctx1V5[T1]

Vals5 returns a Ctx1V5 pointer (1 ref, 5 vals, 8B data).

func (CtxRefs1[T1]) Vals6

func (v CtxRefs1[T1]) Vals6() *Ctx1V6[T1]

Vals6 returns a Ctx1V6 pointer (1 ref, 6 vals, 0B data).

type CtxRefs2

type CtxRefs2[T1, T2 any] struct {
	// contains filtered or unexported fields
}

CtxRefs2 is a view into ExtSQE with 2 refs. Use its methods to select the number of vals (0-5).

func ViewCtx2

func ViewCtx2[T1, T2 any](sqe *ExtSQE) CtxRefs2[T1, T2]

ViewCtx2 creates a CtxRefs2 for accessing the UserData with 2 typed refs.

Example:

c := uring.ViewCtx2[Connection, Buffer](sqe).Vals2()  // 2 refs, 2 vals
c.Val1 = offset
c.Val2 = length

func (CtxRefs2[T1, T2]) Vals0

func (v CtxRefs2[T1, T2]) Vals0() *Ctx2[T1, T2]

Vals0 returns a Ctx2 pointer (2 refs, 0 vals, 40B data).

func (CtxRefs2[T1, T2]) Vals1

func (v CtxRefs2[T1, T2]) Vals1() *Ctx2V1[T1, T2]

Vals1 returns a Ctx2V1 pointer (2 refs, 1 val, 32B data).

func (CtxRefs2[T1, T2]) Vals2

func (v CtxRefs2[T1, T2]) Vals2() *Ctx2V2[T1, T2]

Vals2 returns a Ctx2V2 pointer (2 refs, 2 vals, 24B data).

func (CtxRefs2[T1, T2]) Vals3

func (v CtxRefs2[T1, T2]) Vals3() *Ctx2V3[T1, T2]

Vals3 returns a Ctx2V3 pointer (2 refs, 3 vals, 16B data).

func (CtxRefs2[T1, T2]) Vals4

func (v CtxRefs2[T1, T2]) Vals4() *Ctx2V4[T1, T2]

Vals4 returns a Ctx2V4 pointer (2 refs, 4 vals, 8B data).

func (CtxRefs2[T1, T2]) Vals5

func (v CtxRefs2[T1, T2]) Vals5() *Ctx2V5[T1, T2]

Vals5 returns a Ctx2V5 pointer (2 refs, 5 vals, 0B data).

type CtxRefs3

type CtxRefs3[T1, T2, T3 any] struct {
	// contains filtered or unexported fields
}

CtxRefs3 is a view into ExtSQE with 3 refs. Use its methods to select the number of vals (0-4).

func ViewCtx3

func ViewCtx3[T1, T2, T3 any](sqe *ExtSQE) CtxRefs3[T1, T2, T3]

ViewCtx3 creates a CtxRefs3 for accessing the UserData with 3 typed refs.

func (CtxRefs3[T1, T2, T3]) Vals0

func (v CtxRefs3[T1, T2, T3]) Vals0() *Ctx3[T1, T2, T3]

Vals0 returns a Ctx3 pointer (3 refs, 0 vals, 32B data).

func (CtxRefs3[T1, T2, T3]) Vals1

func (v CtxRefs3[T1, T2, T3]) Vals1() *Ctx3V1[T1, T2, T3]

Vals1 returns a Ctx3V1 pointer (3 refs, 1 val, 24B data).

func (CtxRefs3[T1, T2, T3]) Vals2

func (v CtxRefs3[T1, T2, T3]) Vals2() *Ctx3V2[T1, T2, T3]

Vals2 returns a Ctx3V2 pointer (3 refs, 2 vals, 16B data).

func (CtxRefs3[T1, T2, T3]) Vals3

func (v CtxRefs3[T1, T2, T3]) Vals3() *Ctx3V3[T1, T2, T3]

Vals3 returns a Ctx3V3 pointer (3 refs, 3 vals, 8B data).

func (CtxRefs3[T1, T2, T3]) Vals4

func (v CtxRefs3[T1, T2, T3]) Vals4() *Ctx3V4[T1, T2, T3]

Vals4 returns a Ctx3V4 pointer (3 refs, 4 vals, 0B data).

type CtxRefs4

type CtxRefs4[T1, T2, T3, T4 any] struct {
	// contains filtered or unexported fields
}

CtxRefs4 is a view into ExtSQE with 4 refs. Use its methods to select the number of vals (0-3).

func ViewCtx4

func ViewCtx4[T1, T2, T3, T4 any](sqe *ExtSQE) CtxRefs4[T1, T2, T3, T4]

ViewCtx4 creates a CtxRefs4 for accessing the UserData with 4 typed refs.

func (CtxRefs4[T1, T2, T3, T4]) Vals0

func (v CtxRefs4[T1, T2, T3, T4]) Vals0() *Ctx4[T1, T2, T3, T4]

Vals0 returns a Ctx4 pointer (4 refs, 0 vals, 24B data).

func (CtxRefs4[T1, T2, T3, T4]) Vals1

func (v CtxRefs4[T1, T2, T3, T4]) Vals1() *Ctx4V1[T1, T2, T3, T4]

Vals1 returns a Ctx4V1 pointer (4 refs, 1 val, 16B data).

func (CtxRefs4[T1, T2, T3, T4]) Vals2

func (v CtxRefs4[T1, T2, T3, T4]) Vals2() *Ctx4V2[T1, T2, T3, T4]

Vals2 returns a Ctx4V2 pointer (4 refs, 2 vals, 8B data).

func (CtxRefs4[T1, T2, T3, T4]) Vals3

func (v CtxRefs4[T1, T2, T3, T4]) Vals3() *Ctx4V3[T1, T2, T3, T4]

Vals3 returns a Ctx4V3 pointer (4 refs, 3 vals, 0B data).

type CtxRefs5

type CtxRefs5[T1, T2, T3, T4, T5 any] struct {
	// contains filtered or unexported fields
}

CtxRefs5 is a view into ExtSQE with 5 refs. Use its methods to select the number of vals (0-2).

func ViewCtx5

func ViewCtx5[T1, T2, T3, T4, T5 any](sqe *ExtSQE) CtxRefs5[T1, T2, T3, T4, T5]

ViewCtx5 creates a CtxRefs5 for accessing the UserData with 5 typed refs.

func (CtxRefs5[T1, T2, T3, T4, T5]) Vals0

func (v CtxRefs5[T1, T2, T3, T4, T5]) Vals0() *Ctx5[T1, T2, T3, T4, T5]

Vals0 returns a Ctx5 pointer (5 refs, 0 vals, 16B data).

func (CtxRefs5[T1, T2, T3, T4, T5]) Vals1

func (v CtxRefs5[T1, T2, T3, T4, T5]) Vals1() *Ctx5V1[T1, T2, T3, T4, T5]

Vals1 returns a Ctx5V1 pointer (5 refs, 1 val, 8B data).

func (CtxRefs5[T1, T2, T3, T4, T5]) Vals2

func (v CtxRefs5[T1, T2, T3, T4, T5]) Vals2() *Ctx5V2[T1, T2, T3, T4, T5]

Vals2 returns a Ctx5V2 pointer (5 refs, 2 vals, 0B data).

type CtxRefs6

type CtxRefs6[T1, T2, T3, T4, T5, T6 any] struct {
	// contains filtered or unexported fields
}

CtxRefs6 is a view into ExtSQE with 6 refs. Use its methods to select the number of vals (0-1).

func ViewCtx6

func ViewCtx6[T1, T2, T3, T4, T5, T6 any](sqe *ExtSQE) CtxRefs6[T1, T2, T3, T4, T5, T6]

ViewCtx6 creates a CtxRefs6 for accessing the UserData with 6 typed refs.

func (CtxRefs6[T1, T2, T3, T4, T5, T6]) Vals0

func (v CtxRefs6[T1, T2, T3, T4, T5, T6]) Vals0() *Ctx6[T1, T2, T3, T4, T5, T6]

Vals0 returns a Ctx6 pointer (6 refs, 0 vals, 8B data).

func (CtxRefs6[T1, T2, T3, T4, T5, T6]) Vals1

func (v CtxRefs6[T1, T2, T3, T4, T5, T6]) Vals1() *Ctx6V1[T1, T2, T3, T4, T5, T6]

Vals1 returns a Ctx6V1 pointer (6 refs, 1 val, 0B data).

type CtxRefs7

type CtxRefs7[T1, T2, T3, T4, T5, T6, T7 any] struct {
	// contains filtered or unexported fields
}

CtxRefs7 is a view into ExtSQE with 7 refs. The only option is Vals0 (no vals available).

func ViewCtx7

func ViewCtx7[T1, T2, T3, T4, T5, T6, T7 any](sqe *ExtSQE) CtxRefs7[T1, T2, T3, T4, T5, T6, T7]

ViewCtx7 creates a CtxRefs7 for accessing the UserData with 7 typed refs.

func (CtxRefs7[T1, T2, T3, T4, T5, T6, T7]) Vals0

func (v CtxRefs7[T1, T2, T3, T4, T5, T6, T7]) Vals0() *Ctx7[T1, T2, T3, T4, T5, T6, T7]

Vals0 returns a Ctx7 pointer (7 refs, 0 vals, 0B data).
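
The documented layouts all follow one budget: the handler, each ref, and each val take one 8-byte word of the 64-byte UserData, and the remainder is the Data array. The sketch below works that arithmetic out. dataBytes is a hypothetical helper written for this illustration and is not a package function.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	userDataBytes = 64 // size of ExtSQE.UserData
	wordBytes     = 8  // the handler, each ref, and each val take one word
)

var errOverBudget = errors.New("refs and vals must be non-negative and sum to at most 7")

// dataBytes returns the size of the trailing Data array for a layout with
// the given number of refs and vals.
func dataBytes(refs, vals int) (int, error) {
	if refs < 0 || vals < 0 || refs+vals > 7 {
		return 0, errOverBudget
	}
	return userDataBytes - wordBytes*(1+refs+vals), nil
}

func main() {
	for _, c := range [][2]int{{0, 0}, {1, 1}, {2, 5}, {4, 4}} {
		n, err := dataBytes(c[0], c[1])
		fmt.Printf("refs=%d vals=%d data=%d err=%v\n", c[0], c[1], n, err)
	}
}
```

This is why CtxRefs0 offers Vals0 through Vals7 while CtxRefs7 offers only Vals0.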

type DirectCQE

type DirectCQE struct {
	Res      int32  // Completion result (bytes transferred or negative errno)
	Flags    uint32 // CQE flags (IORING_CQE_F_*)
	Op       uint8  // IORING_OP_* opcode
	SQEFlags uint8  // SQE flags (IOSQE_*)
	BufGroup uint16 // Buffer group index
	FD       iofd.FD
}

DirectCQE is a zero-overhead CQE for Direct mode operations. It contains the completion result and unpacked context fields without any mode checking or pointer indirection.

Use WaitDirect when your application exclusively uses Direct mode (PackDirect) for all submissions. This avoids the 3-way mode check that the generic Wait/CQEView path requires per-CQE.

Layout: 24 bytes (3/8 of a 64-byte cache line, no padding needed)

func (*DirectCQE) BufID

func (c *DirectCQE) BufID() uint16

BufID returns the buffer ID from CQE flags. Only valid when HasBuffer() returns true.

func (*DirectCQE) HasBuffer

func (c *DirectCQE) HasBuffer() bool

HasBuffer reports whether a buffer ID is available.

func (*DirectCQE) HasMore

func (c *DirectCQE) HasMore() bool

HasMore reports whether more completions are coming (multishot).

func (*DirectCQE) IsNotification

func (c *DirectCQE) IsNotification() bool

IsNotification reports whether this is a zero-copy notification CQE.

func (*DirectCQE) IsSuccess

func (c *DirectCQE) IsSuccess() bool

IsSuccess reports whether the operation completed successfully.
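
The Res field follows the usual io_uring convention of a non-negative result on success and a negated errno on failure. A minimal sketch of splitting it into a byte count and a Go error follows. The result function is a hypothetical helper, and the assumption that IsSuccess corresponds to Res >= 0 is not confirmed by this reference.

```go
package main

import (
	"fmt"
	"syscall"
)

// result splits a raw CQE result into a byte count or an error.
// Assumption: res >= 0 means success and a negative res is -errno.
func result(res int32) (n int, err error) {
	if res < 0 {
		return 0, syscall.Errno(-res)
	}
	return int(res), nil
}

func main() {
	fmt.Println(result(512))
	fmt.Println(result(-int32(syscall.ECONNRESET)))
}
```

Converting to syscall.Errno keeps the value comparable with errors.Is against the standard errno constants.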

type EpollEvent

type EpollEvent struct {
	Events uint32

	Fd  int32
	Pad int32
	// contains filtered or unexported fields
}

EpollEvent represents an epoll event. Layout matches struct epoll_event in Linux.

type ExtCQE

type ExtCQE struct {
	Res   int32   // Completion result (bytes transferred or negative errno)
	Flags uint32  // CQE flags (IORING_CQE_F_*)
	Ext   *ExtSQE // Pointer to ExtSQE with full context
}

ExtCQE is a zero-overhead CQE for Extended mode operations. It provides direct access to the ExtSQE pointer without mode checking.

Use WaitExtended when your application exclusively uses Extended mode (PackExtended) for all submissions. This avoids the 3-way mode check that the generic Wait/CQEView path requires per-CQE.

Layout: 16 bytes (fits in 1/4 cache line)

func (*ExtCQE) BufID

func (c *ExtCQE) BufID() uint16

BufID returns the buffer ID from CQE flags. Only valid when HasBuffer() returns true.

func (*ExtCQE) FD

func (c *ExtCQE) FD() iofd.FD

FD returns the file descriptor from the stored SQE.

func (*ExtCQE) HasBuffer

func (c *ExtCQE) HasBuffer() bool

HasBuffer reports whether a buffer ID is available.

func (*ExtCQE) HasBufferMore

func (c *ExtCQE) HasBufferMore() bool

HasBufferMore reports whether the buffer was partially consumed.

func (*ExtCQE) HasMore

func (c *ExtCQE) HasMore() bool

HasMore reports whether more completions are coming (multishot).

func (*ExtCQE) IsNotification

func (c *ExtCQE) IsNotification() bool

IsNotification reports whether this is a zero-copy notification CQE.

func (*ExtCQE) IsSuccess

func (c *ExtCQE) IsSuccess() bool

IsSuccess reports whether the operation completed successfully.

func (*ExtCQE) Op

func (c *ExtCQE) Op() uint8

Op returns the IORING_OP_* opcode from the stored SQE.

type ExtSQE

type ExtSQE struct {
	SQE      ioUringSqe // 64 bytes - full system context
	UserData [64]byte   // 64 bytes - flexible user interpretation
}

ExtSQE stores a full SQE and 64 bytes of user data. Callers must stop using it after the matching pool release.

type Features

type Features struct {
	// SQEntries is the actual number of SQ entries allocated by the kernel.
	SQEntries int
	// CQEntries is the actual number of CQ entries allocated by the kernel.
	CQEntries int
	// SQEBytes is the width of each mapped SQE slot in bytes.
	SQEBytes int
	// UserDataByteOrder is the byte order for user_data field interpretation.
	UserDataByteOrder binary.ByteOrder
}

Features reports per-ring sizing and metadata returned at creation time.

type GiantBuffer

type GiantBuffer = iobuf.GiantBuffer

Buffer types re-exported from iobuf.

type GreatBuffer

type GreatBuffer = iobuf.GreatBuffer

Buffer types re-exported from iobuf.

type Handler

type Handler = func(ring *Uring, sqe *ioUringSqe, cqe *ioUringCqe)

Handler is the callback function signature for completion handling. It receives the Uring instance, the original SQE, and the completion result.

When embedded in raw `ExtSQE.UserData`, only static non-capturing functions are safe to store directly. Capturing closures and other live Go roots must stay outside raw user-data bytes.

Example:

func handleRecv(ring *Uring, sqe *ioUringSqe, cqe *ioUringCqe) {
    if cqe.res < 0 {
        // Handle error
        return
    }
    // Process received data
}
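
Since capturing closures must stay outside the raw user-data bytes, one possible arrangement is to keep them in an ordinary Go table and carry only an integer token in a val slot. The sketch below illustrates that idea. callbackTable and its methods are hypothetical, are not part of the package, and are not safe for concurrent use as written.

```go
package main

import "fmt"

// callbackTable keeps closures reachable by the garbage collector and hands
// out integer tokens that are safe to store in a raw val slot.
type callbackTable struct {
	next int64
	fns  map[int64]func(res int32)
}

func newCallbackTable() *callbackTable {
	return &callbackTable{fns: make(map[int64]func(res int32))}
}

// Register stores fn and returns the token to place in user data.
func (t *callbackTable) Register(fn func(res int32)) int64 {
	t.next++
	t.fns[t.next] = fn
	return t.next
}

// Complete removes and runs the callback for token.
// It reports whether a callback was registered under that token.
func (t *callbackTable) Complete(token int64, res int32) bool {
	fn, ok := t.fns[token]
	if !ok {
		return false
	}
	delete(t.fns, token)
	fn(res)
	return true
}

func main() {
	table := newCallbackTable()
	total := 0
	// This closure captures total, so it must not live in raw bytes.
	token := table.Register(func(res int32) { total += int(res) })

	// The token would travel through a val field; the closure never does.
	fmt.Println(table.Complete(token, 128), total)
}
```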

type HugeBuffer

type HugeBuffer = iobuf.HugeBuffer

Buffer types re-exported from iobuf.

type IncrementalHandler

type IncrementalHandler interface {
	// OnData is called with the newly received fragment.
	// hasMore indicates whether additional data remains in the same buffer.
	// The buffer is valid until this callback returns.
	OnData(buf []byte, hasMore bool)

	// OnComplete is called when all fragments have been received.
	OnComplete()

	// OnError is called when an error occurs.
	OnError(err error)
}

IncrementalHandler handles incremental receive completion events.
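
A handler that needs the data after OnData returns has to copy it, because the buffer is only valid during the callback. The sketch below shows one way to do that. The incrementalHandler interface repeats the documented method set so the example compiles on its own, and collector is a hypothetical type.

```go
package main

import (
	"errors"
	"fmt"
)

// incrementalHandler repeats the documented method set of IncrementalHandler.
type incrementalHandler interface {
	OnData(buf []byte, hasMore bool)
	OnComplete()
	OnError(err error)
}

// collector accumulates fragments into one message.
type collector struct {
	msg  []byte
	done bool
	err  error
}

// OnData copies the fragment, since buf may be reused after the call returns.
func (c *collector) OnData(buf []byte, hasMore bool) { c.msg = append(c.msg, buf...) }
func (c *collector) OnComplete()                     { c.done = true }
func (c *collector) OnError(err error)               { c.err = err }

// Compile-time check that collector satisfies the interface.
var _ incrementalHandler = (*collector)(nil)

func main() {
	c := &collector{}
	var h incrementalHandler = c

	h.OnData([]byte("hello, "), true)
	h.OnData([]byte("world"), false)
	h.OnComplete()
	fmt.Printf("%q done=%v\n", c.msg, c.done)

	h.OnError(errors.New("connection reset"))
	fmt.Println("err:", c.err)
}
```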

type IncrementalReceiver

type IncrementalReceiver struct {
	// contains filtered or unexported fields
}

IncrementalReceiver manages receives that use `IOU_PBUF_RING_INC`. `IORING_CQE_F_BUF_MORE` reports that the current buffer was only partially consumed, so later completions continue in the same buffer.

func NewIncrementalReceiver

func NewIncrementalReceiver(ring *Uring, pool *ContextPools, groupID uint16, bufSize int, bufBacking []byte, entries int) *IncrementalReceiver

NewIncrementalReceiver creates an incremental receiver.

func (*IncrementalReceiver) HandleCQE

func (r *IncrementalReceiver) HandleCQE(cqe CQEView) bool

HandleCQE processes a CQE from an incremental receive operation. It returns true if the CQE was handled and false if the CQE does not belong to an incremental receive.

func (*IncrementalReceiver) Recv

func (r *IncrementalReceiver) Recv(fd iofd.FD, handler IncrementalHandler) error

Recv submits an incremental receive operation. The handler will receive OnData for each data fragment, with hasMore indicating whether additional data remains. OnComplete is called when the message is fully received. Returns iox.ErrWouldBlock if the context pool is exhausted.

type IncrementalSubscriber

type IncrementalSubscriber struct {
	// contains filtered or unexported fields
}

IncrementalSubscriber adapts functions to `IncrementalHandler`.

func NewIncrementalSubscriber

func NewIncrementalSubscriber() *IncrementalSubscriber

NewIncrementalSubscriber creates a subscriber with default handlers.

func (*IncrementalSubscriber) Handler

Handler returns `s` as an `IncrementalHandler`.

func (*IncrementalSubscriber) OnComplete

func (s *IncrementalSubscriber) OnComplete(fn func()) *IncrementalSubscriber

OnComplete sets the completion handler.

func (*IncrementalSubscriber) OnData

func (s *IncrementalSubscriber) OnData(fn func(buf []byte, hasMore bool)) *IncrementalSubscriber

OnData sets the data handler.

func (*IncrementalSubscriber) OnError

func (s *IncrementalSubscriber) OnError(fn func(err error)) *IncrementalSubscriber

OnError sets the error handler.
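The function-adapter pattern behind a subscriber like this can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `subscriber`, `adapter`, and `newSubscriber` are hypothetical names, and the separate adapter type is one possible way to expose stored functions through the handler interface when the setter names coincide with the interface method names.

```go
package main

import "fmt"

// incrementalHandler mirrors the documented IncrementalHandler interface.
type incrementalHandler interface {
	OnData(buf []byte, hasMore bool)
	OnComplete()
	OnError(err error)
}

// subscriber is a hypothetical stand-in for IncrementalSubscriber.
type subscriber struct {
	onData     func(buf []byte, hasMore bool)
	onComplete func()
	onError    func(err error)
}

// newSubscriber installs no-op defaults so unset callbacks are safe to call.
func newSubscriber() *subscriber {
	return &subscriber{
		onData:     func([]byte, bool) {},
		onComplete: func() {},
		onError:    func(error) {},
	}
}

// Setters return the receiver so that calls can be chained.
func (s *subscriber) OnData(fn func(buf []byte, hasMore bool)) *subscriber {
	s.onData = fn
	return s
}

func (s *subscriber) OnComplete(fn func()) *subscriber {
	s.onComplete = fn
	return s
}

// adapter forwards interface calls to the stored functions.
type adapter struct{ s *subscriber }

func (a adapter) OnData(buf []byte, hasMore bool) { a.s.onData(buf, hasMore) }
func (a adapter) OnComplete()                     { a.s.onComplete() }
func (a adapter) OnError(err error)               { a.s.onError(err) }

// Handler returns the interface view of the subscriber.
func (s *subscriber) Handler() incrementalHandler { return adapter{s} }

func main() {
	total := 0
	h := newSubscriber().
		OnData(func(buf []byte, _ bool) { total += len(buf) }).
		Handler()
	h.OnData([]byte("abc"), true)
	h.OnData([]byte("de"), false)
	h.OnComplete() // default no-op
	h.OnError(nil) // default no-op
	fmt.Println(total) // prints 5
}
```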

type IndirectSQE

type IndirectSQE struct {
	// contains filtered or unexported fields
}

IndirectSQE stores a full SQE copy for indirect context. Callers must stop using it after the matching pool release.

type IoTimespec

type IoTimespec struct {
	TvSec  uint64 // Seconds
	TvNsec uint64 // Nanoseconds
}

IoTimespec is a 128-bit timespec for high-precision timestamps. Matches struct io_timespec in Linux.

type IoVec

type IoVec = zcall.Iovec

IoVec is the scatter/gather I/O vector type.

type LargeBuffer

type LargeBuffer = iobuf.LargeBuffer

Buffer types re-exported from iobuf.

type ListenerHandler

type ListenerHandler interface {
	OnSocketCreated(fd iofd.FD) bool
	OnBound() bool
	OnListening()
	OnError(op uint8, err error)
}

ListenerHandler receives completion events during listener setup.

type ListenerManager

type ListenerManager struct {
	// contains filtered or unexported fields
}

ListenerManager provides convenience methods for starting asynchronous listener creation. It prepares and submits the initial SOCKET SQE. The caller is responsible for routing CQEs and advancing the chain (bind→listen) with DecodeListenerCQE and the PrepareListenerBind/PrepareListenerListen helpers.

func NewListenerManager

func NewListenerManager(ring *Uring, pool *ContextPools) *ListenerManager

NewListenerManager creates a listener manager.

func (*ListenerManager) ListenTCP4

func (m *ListenerManager) ListenTCP4(addr *net.TCPAddr, backlog int, handler ListenerHandler) (*ListenerOp, error)

ListenTCP4 prepares and submits the initial SOCKET for TCP IPv4. The caller must decode subsequent CQEs via DecodeListenerCQE and advance the chain with PrepareListenerBind/PrepareListenerListen.

func (*ListenerManager) ListenTCP6

func (m *ListenerManager) ListenTCP6(addr *net.TCPAddr, backlog int, handler ListenerHandler) (*ListenerOp, error)

ListenTCP6 prepares and submits the initial SOCKET for TCP IPv6.

func (*ListenerManager) ListenUnix

func (m *ListenerManager) ListenUnix(addr *net.UnixAddr, backlog int, handler ListenerHandler) (*ListenerOp, error)

ListenUnix prepares and submits the initial SOCKET for Unix domain.

func (*ListenerManager) Pool

func (m *ListenerManager) Pool() *ContextPools

Pool returns the context pool.

func (*ListenerManager) Ring

func (m *ListenerManager) Ring() *Uring

Ring returns the underlying Uring instance.

type ListenerOp

type ListenerOp struct {
	// contains filtered or unexported fields
}

ListenerOp is a handle to a listener creation operation. The caller drives the setup state machine; ListenerOp holds the listener FD and provides convenience methods around it.

func (*ListenerOp) AcceptMultishot

func (op *ListenerOp) AcceptMultishot(handler MultishotHandler, options ...OpOptionFunc) (*MultishotSubscription, error)

AcceptMultishot starts a multishot accept subscription on a ready listener. Options are forwarded to Uring.AcceptMultishot. Returns ErrNotReady until the listener FD is set and valid.

func (*ListenerOp) Close

func (op *ListenerOp) Close()

Close releases resources and closes the listener FD if one is open. Close is not safe for concurrent use; the caller must serialize calls. Drain all in-flight operations before the final Close: while a listener setup SQE is still in flight, Close keeps the pooled ExtSQE borrowed, and the caller must drain that CQE and call Close again to finish cleanup.

func (*ListenerOp) Ext

func (op *ListenerOp) Ext() *ExtSQE

Ext returns the ExtSQE for use with Prepare helpers and SubmitExtended.

func (*ListenerOp) FD

func (op *ListenerOp) FD() iofd.FD

FD returns the listener socket file descriptor. Returns -1 if the socket hasn't been created yet.

func (*ListenerOp) SetFD

func (op *ListenerOp) SetFD(fd iofd.FD)

SetFD stores the socket FD obtained from SOCKET completion.

type ListenerState

type ListenerState uint8

ListenerState tracks which stage of socket→bind→listen has completed.

const (
	ListenerStateInit   ListenerState = iota // Initial state
	ListenerStateSocket                      // Socket creation completed
	ListenerStateBind                        // Bind completed
	ListenerStateListen                      // Listen completed
	ListenerStateReady                       // Listener is ready for accept
	ListenerStateFailed                      // Operation failed
)

type ListenerStep

type ListenerStep struct {
	Op      uint8         // Kernel op that completed (IORING_OP_SOCKET, _BIND, _LISTEN)
	FD      iofd.FD       // Socket FD (meaningful after SOCKET completes)
	State   ListenerState // State at the time of this completion
	Err     error         // Non-nil if the operation failed
	Ext     *ExtSQE       // The ExtSQE carrying this listener context
	Handler ListenerHandler
}

ListenerStep is the decoded result of a listener-related CQE. The caller decides what happens next based on this value.

func DecodeListenerCQE

func DecodeListenerCQE(cqe CQEView) (ListenerStep, bool)

DecodeListenerCQE decodes a CQE into a ListenerStep. Returns (step, true) if the CQE belongs to a listener operation, or (zero, false) if it does not.
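Because the caller drives the socket→bind→listen chain, the decision taken after each decoded step can be modeled as a small state machine. The sketch below is self-contained and uses local stand-ins only. It assumes that the step's state names the stage that has just completed, which the documentation does not spell out, and `advance` and the action names are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

type listenerState uint8

// Same ordering as the documented ListenerState constants.
const (
	stateInit listenerState = iota
	stateSocket
	stateBind
	stateListen
	stateReady
	stateFailed
)

// listenerStep is a trimmed, hypothetical version of ListenerStep.
type listenerStep struct {
	state listenerState // assumed: the stage that just completed
	err   error
}

// nextAction names what the caller would do next.
type nextAction string

const (
	doBind   nextAction = "prepare bind"
	doListen nextAction = "prepare listen"
	doAccept nextAction = "start accepting"
	doClose  nextAction = "close and report"
	doNone   nextAction = "nothing"
)

// advance maps one decoded step to the caller's next move. proceed is the
// boolean returned by the handler's OnSocketCreated or OnBound callback.
func advance(s listenerStep, proceed bool) nextAction {
	if s.err != nil || s.state == stateFailed {
		return doClose
	}
	switch s.state {
	case stateSocket:
		if proceed {
			return doBind
		}
		return doClose
	case stateBind:
		if proceed {
			return doListen
		}
		return doClose
	case stateListen, stateReady:
		return doAccept
	default:
		return doNone
	}
}

func main() {
	for _, s := range []listenerStep{{state: stateSocket}, {state: stateBind}, {state: stateListen}} {
		fmt.Println(advance(s, true))
	}
	failed := listenerStep{state: stateBind, err: errors.New("bind failed")}
	fmt.Println(advance(failed, true))
}
```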

type ListenerSubscriber

type ListenerSubscriber struct {
	// contains filtered or unexported fields
}

ListenerSubscriber adapts functions to `ListenerHandler`.

func NewListenerSubscriber

func NewListenerSubscriber() *ListenerSubscriber

NewListenerSubscriber creates a subscriber with default handlers.

func (*ListenerSubscriber) Handler

func (s *ListenerSubscriber) Handler() ListenerHandler

Handler returns `s` as a `ListenerHandler`.

func (*ListenerSubscriber) OnBound

func (s *ListenerSubscriber) OnBound(fn func() bool) *ListenerSubscriber

OnBound sets the bound handler.

func (*ListenerSubscriber) OnError

func (s *ListenerSubscriber) OnError(fn func(op uint8, err error)) *ListenerSubscriber

OnError sets the error handler.

func (*ListenerSubscriber) OnListening

func (s *ListenerSubscriber) OnListening(fn func()) *ListenerSubscriber

OnListening sets the listening handler.

func (*ListenerSubscriber) OnSocketCreated

func (s *ListenerSubscriber) OnSocketCreated(fn func(fd iofd.FD) bool) *ListenerSubscriber

OnSocketCreated sets the socket-created handler.

type MediumBuffer

type MediumBuffer = iobuf.MediumBuffer

Buffer types re-exported from iobuf.

type MemRegionReg

type MemRegionReg struct {
	RegionUptr uint64 // Pointer to RegionDesc
	Flags      uint64 // Registration flags (IORING_MEM_REGION_REG_*)
	// contains filtered or unexported fields
}

MemRegionReg is the registration structure for memory regions. Matches struct io_uring_mem_region_reg in Linux.

type MicroBuffer

type MicroBuffer = iobuf.MicroBuffer

Buffer types re-exported from iobuf.

type Msghdr

type Msghdr = zcall.Msghdr

Msghdr represents a message header for sendmsg/recvmsg. Layout matches struct msghdr in Linux (LP64).

type MultishotAction

type MultishotAction uint8

MultishotAction tells `uring` whether to keep or stop a live subscription.

const (
	// MultishotContinue keeps the subscription live.
	// It applies only while the observed step still carries `IORING_CQE_F_MORE`.
	MultishotContinue MultishotAction = iota

	// MultishotStop requests cancellation after the current callback returns.
	// A final step ignores it because the kernel is already closing the stream.
	MultishotStop
)

type MultishotHandler

type MultishotHandler interface {
	// OnMultishotStep handles one observed CQE.
	// Return `MultishotStop` to request async cancellation after a non-final step.
	OnMultishotStep(step MultishotStep) MultishotAction

	// OnMultishotStop handles the terminal stop of the subscription.
	// It runs at most once if callbacks stay enabled. It carries the terminal
	// error or cancellation cause, but not a borrowed CQE view.
	OnMultishotStop(err error, cancelled bool)
}

MultishotHandler handles multishot step observations and the terminal stop. Retry or resubmission policy stays above `uring`.

type MultishotStep

type MultishotStep struct {
	CQE       CQEView
	Err       error
	Cancelled bool
}

MultishotStep describes one observed multishot CQE.

`CQE` is borrowed and valid only during `OnMultishotStep`. Negative `CQE.Res` values are decoded into `Err`. `Cancelled` reports that the kernel step was `-ECANCELED`.

func (MultishotStep) Final

func (s MultishotStep) Final() bool

Final reports whether the observed CQE lacked `IORING_CQE_F_MORE`.

func (MultishotStep) HasMore

func (s MultishotStep) HasMore() bool

HasMore reports whether the observed CQE carried `IORING_CQE_F_MORE`.
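The relationship between `IORING_CQE_F_MORE`, `Final`, `HasMore`, and the stop rule can be sketched with plain values. The flag value `1 << 1` comes from the Linux UAPI header; everything else here (`step`, `shouldCancel`, `terminal`) is a hypothetical stand-in and not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
	"syscall"
)

const cqeFMore = 1 << 1 // IORING_CQE_F_MORE in the Linux UAPI header

type action uint8

const (
	cont action = iota
	stop
)

// step is a trimmed, hypothetical stand-in for MultishotStep.
type step struct {
	res   int32
	flags uint32
}

func (s step) hasMore() bool { return s.flags&cqeFMore != 0 }
func (s step) final() bool   { return !s.hasMore() }

// err decodes a negative result into an errno.
func (s step) err() error {
	if s.res < 0 {
		return syscall.Errno(-s.res)
	}
	return nil
}

func (s step) cancelled() bool { return errors.Is(s.err(), syscall.ECANCELED) }

// shouldCancel applies the documented rule: a stop request only matters on a
// non-final step, because a final step means the stream is already closing.
func shouldCancel(s step, a action) bool { return a == stop && s.hasMore() }

// terminal models the last CQE of a cancelled request: negative errno, no F_MORE.
var terminal = step{res: -int32(syscall.ECANCELED)}

func main() {
	live := step{res: 7, flags: cqeFMore}
	fmt.Println(live.hasMore(), shouldCancel(live, stop))
	fmt.Println(terminal.final(), terminal.cancelled(), shouldCancel(terminal, stop))
}
```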

type MultishotSubscriber

type MultishotSubscriber struct {
	// contains filtered or unexported fields
}

MultishotSubscriber adapts functions to `MultishotHandler`.

func NewMultishotSubscriber

func NewMultishotSubscriber() *MultishotSubscriber

NewMultishotSubscriber creates a subscriber with default handlers.

func (*MultishotSubscriber) Handler

Handler returns `s` as a `MultishotHandler`.

func (*MultishotSubscriber) OnMultishotStep

func (s *MultishotSubscriber) OnMultishotStep(step MultishotStep) MultishotAction

OnMultishotStep implements MultishotHandler.

func (*MultishotSubscriber) OnMultishotStop

func (s *MultishotSubscriber) OnMultishotStop(err error, cancelled bool)

OnMultishotStop implements MultishotHandler.

func (*MultishotSubscriber) OnStep

OnStep sets the multishot step handler.

func (*MultishotSubscriber) OnStop

func (s *MultishotSubscriber) OnStop(fn func(err error, cancelled bool)) *MultishotSubscriber

OnStop sets the multishot terminal stop handler.

type MultishotSubscription

type MultishotSubscription struct {
	// contains filtered or unexported fields
}

MultishotSubscription tracks one multishot `io_uring` operation. It delivers zero or more step callbacks until cancelled or exhausted.

Lifecycle:

  1. Create it with `Uring.AcceptMultishot` or `Uring.ReceiveMultishot`
  2. Observe `OnMultishotStep` for each CQE
  3. End it with `Cancel`, `Unsubscribe`, or a terminal kernel completion
  4. If callbacks stay enabled, `OnMultishotStop` runs at most once

Thread Safety: `Cancel` and `Unsubscribe` are safe from any goroutine. Observer callbacks run on the goroutine that dispatches the CQE, usually `Wait`.

func (*MultishotSubscription) Active

func (s *MultishotSubscription) Active() bool

Active reports whether the subscription has not yet reached its terminal CQE. A cancelling subscription remains active until the terminal CQE arrives.

func (*MultishotSubscription) Cancel

func (s *MultishotSubscription) Cancel() error

Cancel asks the kernel to stop this multishot operation. The subscription remains live until a terminal CQE arrives. It is safe to call more than once.

If callbacks remain enabled, later CQEs may still deliver `OnMultishotStep` before the terminal CQE delivers `OnMultishotStop`.

func (*MultishotSubscription) State

State returns the current subscription state.

func (*MultishotSubscription) Unsubscribe

func (s *MultishotSubscription) Unsubscribe()

Unsubscribe suppresses future callbacks and best-effort cancels the subscription. It also suppresses callbacks that have not started yet, including `OnMultishotStop`. Use it when you do not need a terminal callback.

In-flight callbacks may still finish after `Unsubscribe` returns. If the cancel submission fails, the kernel request can remain live until it terminates naturally, but further callbacks stay suppressed.
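One way to picture the two guarantees above (callbacks suppressed after `Unsubscribe`, and at most one terminal callback) is a pair of atomic flags checked on the dispatch path. This is a hypothetical model for illustration only; the type and field names are assumptions and the package may implement the gating differently.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// subscription is a hypothetical model of the callback gating.
type subscription struct {
	muted   atomic.Bool // set by unsubscribe: suppress callbacks not yet started
	stopped atomic.Bool // set once the terminal callback has been claimed
	steps   int         // number of step callbacks delivered
	stops   int         // number of terminal callbacks delivered
}

// unsubscribe may be called from any goroutine.
func (s *subscription) unsubscribe() { s.muted.Store(true) }

// dispatch models delivery of one CQE on the dispatching goroutine.
// final marks the terminal completion.
func (s *subscription) dispatch(final bool) {
	if s.muted.Load() {
		return // callbacks that have not started yet are suppressed
	}
	s.steps++
	if final && s.stopped.CompareAndSwap(false, true) {
		s.stops++ // the terminal callback runs at most once
	}
}

func main() {
	var a subscription
	a.dispatch(false)
	a.dispatch(true)
	fmt.Println(a.steps, a.stops) // prints 2 1

	var b subscription
	b.dispatch(false)
	b.unsubscribe()
	b.dispatch(true)
	fmt.Println(b.steps, b.stops) // prints 1 0
}
```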

type NanoBuffer

type NanoBuffer = iobuf.NanoBuffer

Buffer types re-exported from iobuf.

type NapiReg

type NapiReg struct {
	BusyPollTo     uint32 // Busy poll timeout in microseconds
	PreferBusyPoll uint8  // Prefer busy poll over sleeping
	Opcode         uint8  // IO_URING_NAPI_* operation

	OpParam uint32 // Operation parameter (strategy or NAPI ID)
	// contains filtered or unexported fields
}

NapiReg is the registration structure for NAPI busy polling. Matches struct io_uring_napi in Linux.

type NetworkType

type NetworkType = sock.NetworkType

NetworkType represents the network address family.

type NoopIncrementalHandler

type NoopIncrementalHandler struct{}

NoopIncrementalHandler provides default implementations for IncrementalHandler. Embed it in custom handlers to override only the methods you need.

Default behavior:

  • OnData: no-op
  • OnComplete: no-op
  • OnError: no-op

Example:

type myHandler struct {
   uring.NoopIncrementalHandler
   totalBytes int
}

func (h *myHandler) OnData(buf []byte, hasMore bool) {
   h.totalBytes += len(buf)
}

func (NoopIncrementalHandler) OnComplete

func (NoopIncrementalHandler) OnComplete()

OnComplete is a no-op.

func (NoopIncrementalHandler) OnData

func (NoopIncrementalHandler) OnData([]byte, bool)

OnData is a no-op.

func (NoopIncrementalHandler) OnError

func (NoopIncrementalHandler) OnError(error)

OnError is a no-op.

type NoopListenerHandler

type NoopListenerHandler struct{}

NoopListenerHandler provides default implementations for ListenerHandler. Embed it in custom handlers to override only the methods you need.

Default behavior:

  • OnSocketCreated: returns true (continue to bind)
  • OnBound: returns true (continue to listen)
  • OnListening: no-op
  • OnError: no-op

Example:

type myHandler struct {
   uring.NoopListenerHandler
   fd iofd.FD
}

func (h *myHandler) OnSocketCreated(fd iofd.FD) bool {
   h.fd = fd
   // Set custom socket options here
   return true
}

func (NoopListenerHandler) OnBound

func (NoopListenerHandler) OnBound() bool

OnBound returns true to continue the chain.

func (NoopListenerHandler) OnError

func (NoopListenerHandler) OnError(uint8, error)

OnError is a no-op.

func (NoopListenerHandler) OnListening

func (NoopListenerHandler) OnListening()

OnListening is a no-op.

func (NoopListenerHandler) OnSocketCreated

func (NoopListenerHandler) OnSocketCreated(iofd.FD) bool

OnSocketCreated returns true to continue the chain.

type NoopMultishotHandler

type NoopMultishotHandler struct{}

NoopMultishotHandler provides default implementations for MultishotHandler. Embed it in custom handlers to override only the methods you need.

Default behavior:

  • successful steps continue
  • error steps stop
  • terminal stop is ignored

Example:

type myObserver struct {
   uring.NoopMultishotHandler
   connections int
}

func (o *myObserver) OnMultishotStep(step uring.MultishotStep) uring.MultishotAction {
   if step.Err == nil && step.CQE.Res >= 0 {
       o.connections++
       return uring.MultishotContinue
   }
   return o.NoopMultishotHandler.OnMultishotStep(step)
}

func (NoopMultishotHandler) OnMultishotStep

func (NoopMultishotHandler) OnMultishotStep(step MultishotStep) MultishotAction

OnMultishotStep returns the default action for the observed step.

func (NoopMultishotHandler) OnMultishotStop

func (NoopMultishotHandler) OnMultishotStop(error, bool)

OnMultishotStop is a no-op.

type NoopZCHandler

type NoopZCHandler struct{}

NoopZCHandler provides default implementations for ZCHandler. Embed it in custom handlers to override only the methods you need.

Default behavior:

  • OnCompleted: no-op
  • OnNotification: no-op

Example:

type myHandler struct {
   uring.NoopZCHandler
   bytesSent int32
}

func (h *myHandler) OnCompleted(result int32) {
   if result > 0 {
       h.bytesSent += result
   }
}

func (NoopZCHandler) OnCompleted

func (NoopZCHandler) OnCompleted(int32)

OnCompleted is a no-op.

func (NoopZCHandler) OnNotification

func (NoopZCHandler) OnNotification(int32)

OnNotification is a no-op.

type NoopZCRXHandler

type NoopZCRXHandler struct{}

NoopZCRXHandler provides default ZCRXHandler behavior. Embed to override selectively.

func (NoopZCRXHandler) OnData

func (NoopZCRXHandler) OnData(buf *ZCRXBuffer) bool

OnData releases the buffer and continues.

func (NoopZCRXHandler) OnError

func (NoopZCRXHandler) OnError(error) bool

OnError stops on any error.

func (NoopZCRXHandler) OnStopped

func (NoopZCRXHandler) OnStopped()

OnStopped is a no-op.

type OpOption

type OpOption struct {
	// SQE control flags (IOSQE_IO_LINK, IOSQE_IO_DRAIN, IOSQE_ASYNC, etc.)
	Flags uint8
	// I/O priority. For file I/O: (class<<13)|level where class ∈ {RT=1,BE=2,IDLE=3}.
	// For io_uring-specific ops: operation-defined flags (e.g. IORING_ACCEPT_MULTISHOT).
	IOPrio uint16
	// Registered file table slot index for direct descriptor operations.
	FileIndex uint32
	// Registered credential personality ID (0 = caller's credentials).
	Personality uint16
	// Listen backlog depth.
	Backlog int
	// Kernel-provided buffer size for receive with buffer selection.
	ReadBufferSize int
	// Write buffer sizing hint.
	WriteBufferSize int
	// Timeout duration for timeout-class operations.
	Duration time.Duration
	// File permission mode for open/mkdir operations.
	FileMode os.FileMode
	// Advisory hint for fadvise/madvise operations.
	Fadvise int
	// File offset for positioned I/O operations.
	Offset int64
	// Size limit (nil = use buffer length).
	N *int
	// Completion count for timeout operations.
	Count int
	// Timeout clock and behavior flags (IORING_TIMEOUT_ABS, _BOOTTIME, _REALTIME, etc.)
	TimeoutFlags uint32
	// Splice/tee operation flags (SPLICE_F_MOVE, _NONBLOCK, _MORE, _GIFT).
	SpliceFlags uint32
	// Fsync behavior flags (IORING_FSYNC_DATASYNC).
	FsyncFlags uint32
}

OpOption holds per-operation configuration. Each operation reads only the fields it uses, and ignores the rest.

func (*OpOption) Apply

func (uo *OpOption) Apply(opts ...OpOptionFunc)

Apply applies the given option functions to the OpOption.

type OpOptionFunc

type OpOptionFunc func(opt *OpOption)

OpOptionFunc is a functional option that configures an OpOption.
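The functional-option pattern used by `OpOption`, `Apply`, and the `With*` helpers can be sketched on a reduced struct. The types below are local, hypothetical stand-ins with only three of the documented fields. Whether the real `WithFlags` overwrites or ORs into the existing value is not stated in the documentation, so overwriting is an assumption here.

```go
package main

import "fmt"

// opOption is a hypothetical subset of OpOption.
type opOption struct {
	Flags  uint8
	Offset int64
	N      *int // nil means "use the buffer length"
}

type opOptionFunc func(opt *opOption)

func withFlags(flags uint8) opOptionFunc {
	return func(o *opOption) { o.Flags = flags } // assumed: overwrite, not OR
}

func withOffset(off int64) opOptionFunc {
	return func(o *opOption) { o.Offset = off }
}

func withN(n int) opOptionFunc {
	return func(o *opOption) { o.N = &n }
}

// apply runs the option functions in order; later options win.
func (o *opOption) apply(opts ...opOptionFunc) {
	for _, fn := range opts {
		fn(o)
	}
}

// length resolves the effective transfer size for a buffer of bufLen bytes.
func (o *opOption) length(bufLen int) int {
	if o.N == nil {
		return bufLen
	}
	return *o.N
}

func main() {
	var o opOption
	o.apply(withOffset(4096), withN(512), withFlags(1<<2)) // 1<<2 is IOSQE_IO_LINK
	fmt.Println(o.Offset, o.length(8192), o.Flags)         // prints 4096 512 4
}
```

A pointer for `N` lets the zero value of the struct mean "no limit set", which a plain `int` could not distinguish from an explicit limit of zero.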

func WithBacklog

func WithBacklog(n int) OpOptionFunc

WithBacklog sets the listen backlog depth.

func WithCount

func WithCount(cnt int) OpOptionFunc

WithCount sets the completion count for timeout operations.

func WithDuration

func WithDuration(d time.Duration) OpOptionFunc

WithDuration sets the timeout duration.

func WithFadvise

func WithFadvise(advice int) OpOptionFunc

WithFadvise sets the advisory hint for fadvise/madvise operations.

func WithFileIndex

func WithFileIndex(idx uint32) OpOptionFunc

WithFileIndex sets the file index for direct descriptor operations. For socket/accept direct, pass IORING_FILE_INDEX_ALLOC for auto-allocation, or a specific slot index (0-based) into the registered file table.

func WithFileMode

func WithFileMode(mode os.FileMode) OpOptionFunc

WithFileMode sets the file permission mode for open/mkdir operations.

func WithFlags

func WithFlags(flags uint8) OpOptionFunc

WithFlags sets SQE control flags (IOSQE_IO_LINK, IOSQE_IO_DRAIN, etc.).

func WithFsyncDataSync

func WithFsyncDataSync() OpOptionFunc

WithFsyncDataSync uses fdatasync semantics (sync data only, skip metadata).

func WithIOPrio

func WithIOPrio(ioprio uint16) OpOptionFunc

WithIOPrio sets the raw I/O priority value.

func WithIOPrioClass

func WithIOPrioClass(class, level uint8) OpOptionFunc

WithIOPrioClass sets I/O priority from class (RT=1, BE=2, IDLE=3) and level (0-7).
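The `(class<<13)|level` encoding described for `IOPrio` works out as in the sketch below. The helper names are hypothetical, and the level mask of 7 follows the 0-7 range stated above.

```go
package main

import "fmt"

const (
	classRT   uint8 = 1
	classBE   uint8 = 2
	classIdle uint8 = 3

	classShift = 13
	levelMask  = 7 // levels 0-7
)

// ioprio packs a priority class and level into one 16-bit value.
func ioprio(class, level uint8) uint16 {
	return uint16(class)<<classShift | uint16(level)
}

// classOf and levelOf unpack the two fields again.
func classOf(p uint16) uint8 { return uint8(p >> classShift) }
func levelOf(p uint16) uint8 { return uint8(p & levelMask) }

func main() {
	p := ioprio(classBE, 4)
	fmt.Println(p, classOf(p), levelOf(p)) // prints 16388 2 4
}
```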

func WithN

func WithN(n int) OpOptionFunc

WithN sets the size limit for read/write/splice operations.

func WithOffset

func WithOffset(offset int64) OpOptionFunc

WithOffset sets the file offset for positioned I/O operations.

func WithPersonality

func WithPersonality(id uint16) OpOptionFunc

WithPersonality sets the registered credential personality ID for per-operation credential override. Register personalities via IORING_REGISTER_PERSONALITY.

func WithReadBufferSize

func WithReadBufferSize(n int) OpOptionFunc

WithReadBufferSize sets the kernel-provided buffer size for receive operations with buffer selection.

func WithSpliceFlags

func WithSpliceFlags(flags uint32) OpOptionFunc

WithSpliceFlags sets splice/tee operation flags. Combine SPLICE_F_MOVE, SPLICE_F_NONBLOCK, SPLICE_F_MORE, SPLICE_F_GIFT.

func WithTimeoutAbsolute

func WithTimeoutAbsolute() OpOptionFunc

WithTimeoutAbsolute treats the timeout duration as an absolute timestamp.

func WithTimeoutBootTime

func WithTimeoutBootTime() OpOptionFunc

WithTimeoutBootTime uses CLOCK_BOOTTIME (survives system suspend).

func WithTimeoutEtimeSuccess

func WithTimeoutEtimeSuccess() OpOptionFunc

WithTimeoutEtimeSuccess converts -ETIME into successful completion.

func WithTimeoutFlags

func WithTimeoutFlags(flags uint32) OpOptionFunc

WithTimeoutFlags sets timeout clock and behavior flags. Combine IORING_TIMEOUT_ABS, IORING_TIMEOUT_BOOTTIME, IORING_TIMEOUT_REALTIME, IORING_TIMEOUT_ETIME_SUCCESS, or IORING_TIMEOUT_MULTISHOT.
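Combining the flags is a bitwise OR. The sketch below uses the bit positions from the Linux UAPI header and adds a small check that at most one clock flag is set, since the kernel is understood to reject a timeout that names more than one clock source. `singleClock` is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"math/bits"
)

const (
	timeoutAbs          uint32 = 1 << 0 // IORING_TIMEOUT_ABS
	timeoutBoottime     uint32 = 1 << 2 // IORING_TIMEOUT_BOOTTIME
	timeoutRealtime     uint32 = 1 << 3 // IORING_TIMEOUT_REALTIME
	timeoutEtimeSuccess uint32 = 1 << 5 // IORING_TIMEOUT_ETIME_SUCCESS
	timeoutMultishot    uint32 = 1 << 6 // IORING_TIMEOUT_MULTISHOT

	clockMask = timeoutBoottime | timeoutRealtime
)

// singleClock reports whether flags selects at most one clock source.
func singleClock(flags uint32) bool {
	return bits.OnesCount32(flags&clockMask) <= 1
}

func main() {
	flags := timeoutAbs | timeoutBoottime
	fmt.Printf("%#x %v\n", flags, singleClock(flags)) // prints 0x5 true
	fmt.Println(singleClock(timeoutBoottime | timeoutRealtime)) // prints false
}
```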

func WithTimeoutMultishot

func WithTimeoutMultishot() OpOptionFunc

WithTimeoutMultishot makes the timeout fire repeatedly.

func WithTimeoutRealTime

func WithTimeoutRealTime() OpOptionFunc

WithTimeoutRealTime uses CLOCK_REALTIME (wall clock, NTP-adjusted).

func WithWriteBufferSize

func WithWriteBufferSize(n int) OpOptionFunc

WithWriteBufferSize sets the write buffer sizing hint.

type OpenHow

type OpenHow struct {
	Flags   uint64
	Mode    uint64
	Resolve uint64
}

OpenHow is the structure for openat2 syscall. Layout matches struct open_how in Linux.

type OptionFunc

type OptionFunc = func(opt *Options)

OptionFunc mutates Options during New construction.

var (
	// LargeLockedBufferMemOptions uses the maximum registered buffer memory (128 MiB).
	// This equals the default and keeps the helper-based option style available.
	LargeLockedBufferMemOptions OptionFunc = func(opt *Options) {
		opt.LockedBufferMem = registerBufferSize * registerBufferNum
	}
	// MultiSizeBufferOptions enables multi-size buffer groups.
	MultiSizeBufferOptions OptionFunc = func(opt *Options) {
		opt.MultiSizeBuffer = 1
	}
)

type Options

type Options struct {
	// Entries specifies the number of SQE slots (use Entries* constants).
	Entries int
	// LockedBufferMem is the total memory for registered buffers (bytes).
	LockedBufferMem int
	// ReadBufferSize is the size of each read buffer (bytes).
	ReadBufferSize int
	// ReadBufferNum is the number of read buffers to allocate.
	ReadBufferNum int
	// ReadBufferGidOffset is the base group ID for read buffers.
	ReadBufferGidOffset int
	// WriteBufferSize is the size of each write buffer (bytes).
	WriteBufferSize int
	// WriteBufferNum is the number of write buffers to allocate.
	WriteBufferNum int
	// MultiSizeBuffer enables multiple buffer size groups when > 0.
	MultiSizeBuffer int
	// MultiIssuers enables the shared-submit configuration for rings that accept
	// submissions from multiple goroutines. When false, New requests
	// SINGLE_ISSUER + DEFER_TASKRUN and callers must serialize submit-state
	// operations such as submit, Wait/enter, Stop, and ring resize so the
	// default fast path can skip shared synchronization. When true, it requests
	// COOP_TASKRUN and keeps the shared-submit synchronization path.
	MultiIssuers bool
	// NotifySucceed ensures CQEs are generated for all successful operations.
	NotifySucceed bool
	// IndirectSubmissionQueue enables the SQ array. When false, New requests
	// IORING_SETUP_NO_SQARRAY to reduce ring memory in the direct-index submit path.
	IndirectSubmissionQueue bool
	// SQE128 requests 128-byte SQE slots for the ring. This is required for
	// Nop128 and UringCmd128. When false, the ring keeps the default 64-byte SQE
	// layout.
	SQE128 bool
	// HybridPolling enables hybrid I/O polling mode (IORING_SETUP_HYBRID_IOPOLL).
	// This delays polling to reduce CPU usage while maintaining low latency.
	// Requires: O_DIRECT files on polling-capable storage devices (e.g., NVMe).
	// Available since kernel 6.13.
	HybridPolling bool
}

Options configures the io_uring instance behavior. All fields have sensible defaults if not specified.

func OptionsForBudget

func OptionsForBudget(budget int) Options

OptionsForBudget returns Options configured for the given memory budget.

budget is the total memory, in bytes, to allocate for the io_uring instance. Budgets from 16 MiB to 128 GiB are supported; values outside this range are clamped.

Memory is distributed as:

  • Ring entries: sized to match expected throughput (more budget = more entries)
  • Registered buffers: 25% for zero-copy operations (minimum 8 MiB, no maximum cap)
  • Buffer groups: use the remaining budget after registered buffers and ring overhead

Example:

// 256 MiB budget for a medium server
opts := OptionsForBudget(256 * MiB)
ring, err := New(func(o *Options) { *o = opts })

// 64 MiB budget for a memory-constrained environment
opts = OptionsForBudget(64 * MiB)

func OptionsForSystem

func OptionsForSystem(systemMemory int) Options

OptionsForSystem returns Options configured for a machine with the given total memory. It derives a memory budget (25% of system memory) and delegates to OptionsForBudget.

Use the MachineMemory* constants for common configurations:

// 1GB machine (e.g., Linode Nanode, small CI runner)
opts := OptionsForSystem(MachineMemory1GB)
ring, err := New(func(o *Options) { *o = opts })

// 4GB machine (e.g., medium VM)
opts = OptionsForSystem(MachineMemory4GB)

Or pass the actual system memory:

opts := OptionsForSystem(3840 * MiB)  // 3.75 GiB
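The sizing arithmetic documented for OptionsForBudget and OptionsForSystem can be worked through with a few lines of integer math. This is a sketch of the stated rules only (clamp to 16 MiB..128 GiB, 25% registered with an 8 MiB floor, 25% of system memory as budget). The helper names are hypothetical, the order of clamping and splitting is an assumption, and ring-entry and buffer-group sizing are not modeled.

```go
package main

import "fmt"

const (
	MiB int64 = 1 << 20
	GiB int64 = 1 << 30

	minBudget     = 16 * MiB
	maxBudget     = 128 * GiB
	minRegistered = 8 * MiB
)

// clampBudget applies the documented 16 MiB..128 GiB range.
func clampBudget(b int64) int64 {
	if b < minBudget {
		return minBudget
	}
	if b > maxBudget {
		return maxBudget
	}
	return b
}

// registeredShare is 25% of the clamped budget, with an 8 MiB floor.
func registeredShare(budget int64) int64 {
	share := clampBudget(budget) / 4
	if share < minRegistered {
		share = minRegistered
	}
	return share
}

// budgetForSystem derives the budget as 25% of total system memory.
func budgetForSystem(systemMemory int64) int64 {
	return clampBudget(systemMemory / 4)
}

func main() {
	fmt.Println(registeredShare(256*MiB) / MiB) // prints 64
	fmt.Println(registeredShare(16*MiB) / MiB)  // prints 8
	fmt.Println(budgetForSystem(4*GiB) / MiB)   // prints 1024
}
```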

func (*Options) Apply

func (uo *Options) Apply(opts ...OptionFunc)

Apply applies option helpers in order.

type PicoBuffer

type PicoBuffer = iobuf.PicoBuffer

Buffer types re-exported from iobuf.

type PollCloser

type PollCloser = iofd.PollCloser

PollCloser extends PollFd with close capability.

type PollFd

type PollFd = iofd.PollFd

PollFd represents a file descriptor that can be polled by uring helpers.

type QueryHdr

type QueryHdr struct {
	NextEntry uint64 // Pointer to next query entry
	QueryData uint64 // Query-specific data pointer
	QueryOp   uint32 // Query operation type (IO_URING_QUERY_*)
	Size      uint32 // Size of the query response
	Result    int32  // Result code
	// contains filtered or unexported fields
}

QueryHdr is the header for query operations. Matches struct io_uring_query_hdr in Linux.

type QueryOpcode

type QueryOpcode struct {
	NrRequestOpcodes  uint32 // Number of supported IORING_OP_* opcodes
	NrRegisterOpcodes uint32 // Number of supported IORING_REGISTER_* opcodes
	FeatureFlags      uint64 // Raw kernel feature bitmask returned by IORING_REGISTER_QUERY
	RingSetupFlags    uint64 // Bitmask of IORING_SETUP_* flags
	EnterFlags        uint64 // Bitmask of IORING_ENTER_* flags
	SqeFlags          uint64 // Bitmask of IOSQE_* flags
	NrQueryOpcodes    uint32 // Number of available query opcodes
	// contains filtered or unexported fields
}

QueryOpcode describes the io_uring operations the kernel supports. Matches struct io_uring_query_opcode in Linux.

type QuerySCQ

type QuerySCQ struct {
	HdrSize      uint64 // SQ/CQ rings header size
	HdrAlignment uint64 // Header alignment requirement
}

QuerySCQ describes the SQ/CQ ring header layout. Matches struct io_uring_query_scq in Linux.

type QueryZCRX

type QueryZCRX struct {
	RegisterFlags uint64 // Bitmask of supported ZCRX_REG_* flags
	AreaFlags     uint64 // Bitmask of IORING_ZCRX_AREA_* flags
	NrCtrlOpcodes uint32 // Number of supported ZCRX_CTRL_* opcodes

	RqHdrSize      uint32 // Refill ring header size
	RqHdrAlignment uint32 // Header alignment requirement
	// contains filtered or unexported fields
}

QueryZCRX describes ZCRX capabilities. Matches struct io_uring_query_zcrx in Linux.

type RawSockaddr

type RawSockaddr = sock.RawSockaddr

RawSockaddr is the base raw socket address structure.

type RawSockaddrAny

type RawSockaddrAny = sock.RawSockaddrAny

RawSockaddrAny is the widest raw socket address storage type.

type RawSockaddrInet4

type RawSockaddrInet4 = sock.RawSockaddrInet4

RawSockaddrInet4 is the raw IPv4 socket address structure.

type RawSockaddrInet6

type RawSockaddrInet6 = sock.RawSockaddrInet6

RawSockaddrInet6 is the raw IPv6 socket address structure.

type RawSockaddrUnix

type RawSockaddrUnix = sock.RawSockaddrUnix

RawSockaddrUnix is the raw Unix domain socket address structure.

type RegWait

type RegWait struct {
	Ts          Timespec // Timeout specification
	MinWaitUsec uint32   // Minimum wait time in microseconds
	Flags       uint32   // IORING_REG_WAIT_* flags
	Sigmask     uint64   // Signal mask
	SigmaskSz   uint32   // Signal mask size
	// contains filtered or unexported fields
}

RegWait is a registered wait region entry. Matches struct io_uring_reg_wait in Linux.

type RegionDesc

type RegionDesc struct {
	UserAddr   uint64 // User address of the region
	Size       uint64 // Size of the region
	Flags      uint32 // Region flags
	ID         uint32 // Region identifier
	MmapOffset uint64 // Offset for mmap
	// contains filtered or unexported fields
}

RegionDesc describes a memory region for io_uring. Matches struct io_uring_region_desc in Linux.

type RegisterBuffer

type RegisterBuffer = [registerBufferSize]byte

RegisterBuffer is the fixed-size byte array that backs a single registered buffer.

type RegisterBufferPool

type RegisterBufferPool struct {
	// contains filtered or unexported fields
}

RegisterBufferPool is a pool of registered buffers for io_uring. Uses 4KB buffers optimized for page-aligned I/O operations.

func NewRegisterBufferPool

func NewRegisterBufferPool(capacity int) *RegisterBufferPool

NewRegisterBufferPool creates a new buffer pool with the given capacity.

func (*RegisterBufferPool) Fill

func (p *RegisterBufferPool) Fill(factory func() RegisterBuffer)

Fill populates the pool using the provided factory function.

type SQEContext

type SQEContext uint64

SQEContext encodes `io_uring.user_data`. Direct mode packs opcode, flags, buffer group, and fd inline. Indirect and extended modes store aligned pointers in the low 62 bits.

const (
	// CtxModeDirect indicates inline context (8B, zero allocation).
	CtxModeDirect SQEContext = 0 << 62

	// CtxModeIndirect indicates pointer to IndirectSQE (64B).
	CtxModeIndirect SQEContext = 1 << 62

	// CtxModeExtended indicates pointer to ExtSQE (128B).
	CtxModeExtended SQEContext = 2 << 62
)

Context mode constants (bits 62-63).
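The mode tag in bits 62-63 and the 30-bit sign-extended fd can be illustrated with plain `uint64` arithmetic. The mode values and the 62-bit payload mask follow the constants above. The positions of the direct-mode fields are NOT documented here, so the layout in this sketch (op in bits 0-7, flags in 8-15, buffer group in 16-31, fd in 32-61) is purely an assumption for illustration, and the helper names are hypothetical.

```go
package main

import "fmt"

const (
	modeShift           = 62
	modeMask     uint64 = 3 << modeShift
	ptrMask      uint64 = 1<<modeShift - 1
	modeDirect   uint64 = 0 << modeShift
	modeIndirect uint64 = 1 << modeShift
	modeExtended uint64 = 2 << modeShift

	// Hypothetical direct-mode layout, chosen only for this sketch.
	fdShift = 32
	fdBits  = 30
	fdMask  = 1<<fdBits - 1
)

// packDirect packs the four inline fields under the assumed layout.
func packDirect(op, flags uint8, bufGroup uint16, fd int32) uint64 {
	return modeDirect |
		uint64(op) |
		uint64(flags)<<8 |
		uint64(bufGroup)<<16 |
		uint64(uint32(fd)&fdMask)<<fdShift
}

// mode extracts the two tag bits.
func mode(c uint64) uint64 { return c & modeMask }

// fdOf extracts the 30-bit field and sign-extends it to int32 by shifting it
// to the top of a 32-bit word and arithmetic-shifting it back down.
func fdOf(c uint64) int32 {
	v := uint32(c>>fdShift) & fdMask
	return int32(v<<(32-fdBits)) >> (32 - fdBits)
}

func main() {
	c := packDirect(13, 0, 7, -1)
	fmt.Println(mode(c) == modeDirect, fdOf(c)) // prints true -1

	e := modeExtended | 0x1000 // a tagged, aligned pointer value
	fmt.Println(mode(e) == modeExtended, e&ptrMask) // prints true 4096
}
```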

func ForFD

func ForFD(fd int32) SQEContext

ForFD returns a direct-mode context with only the fd set.

func PackDirect

func PackDirect(op, flags uint8, bufGroup uint16, fd int32) SQEContext

PackDirect packs direct-mode submission context.

func PackExtended

func PackExtended(sqe *ExtSQE) SQEContext

PackExtended packs an extended-mode pointer.

func PackIndirect

func PackIndirect(sqe *IndirectSQE) SQEContext

PackIndirect packs an indirect-mode pointer.

func SQEContextFromRaw

func SQEContextFromRaw(v uint64) SQEContext

SQEContextFromRaw creates an SQEContext from a raw uint64 value. Used when decoding CQE.userData.

func (SQEContext) BufGroup

func (c SQEContext) BufGroup() uint16

BufGroup returns the buffer group index.

func (SQEContext) ExtSQE

func (c SQEContext) ExtSQE() *ExtSQE

ExtSQE returns the extended pointer stored in `c`.

func (SQEContext) FD

func (c SQEContext) FD() int32

FD returns the sign-extended 30-bit file descriptor.
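
Sign-extending a 30-bit field is what lets negative descriptors such as -1 survive a round trip through the packed word. The sketch below shows one common way to do it with a shift pair; `packFD30` and `unpackFD30` are hypothetical helper names, and the position of the field inside the real SQEContext is not assumed here.

```go
package main

import "fmt"

const fdBits = 30

// packFD30 keeps the low 30 bits of fd (two's complement truncation).
func packFD30(fd int32) uint32 { return uint32(fd) & (1<<fdBits - 1) }

// unpackFD30 sign-extends a 30-bit field back to int32: shift the field's
// sign bit up to bit 31, then arithmetic-shift back down.
func unpackFD30(field uint32) int32 {
	const shift = 32 - fdBits
	return int32(field<<shift) >> shift
}

func main() {
	for _, fd := range []int32{0, 7, -1, -100, 1<<29 - 1} {
		fmt.Println(fd, unpackFD30(packFD30(fd)))
	}
}
```

Values outside the range -2^29 to 2^29-1 do not round-trip, which is the cost of fitting the descriptor next to the other inline fields.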

func (SQEContext) Flags

func (c SQEContext) Flags() uint8

Flags returns the `IOSQE_*` flags.

func (SQEContext) HasBufferSelect

func (c SQEContext) HasBufferSelect() bool

HasBufferSelect reports whether the IOSQE_BUFFER_SELECT flag is set. Only valid for Direct mode contexts.

func (SQEContext) IndirectSQE

func (c SQEContext) IndirectSQE() *IndirectSQE

IndirectSQE returns the indirect pointer stored in `c`.

func (SQEContext) IsDirect

func (c SQEContext) IsDirect() bool

IsDirect reports whether this is a Direct mode context (inline data).

func (SQEContext) IsExtended

func (c SQEContext) IsExtended() bool

IsExtended reports whether this is an Extended mode context (pointer to 128B).

func (SQEContext) IsIndirect

func (c SQEContext) IsIndirect() bool

IsIndirect reports whether this is an Indirect mode context (pointer to 64B).

func (SQEContext) Mode

func (c SQEContext) Mode() SQEContext

Mode returns the context mode (Direct, Indirect, Extended, or Reserved).

func (SQEContext) Op

func (c SQEContext) Op() uint8

Op returns the `IORING_OP_*` opcode.

func (SQEContext) Raw

func (c SQEContext) Raw() uint64

Raw returns the underlying uint64 value for direct use in SQE.userData.

func (SQEContext) WithBufGroup

func (c SQEContext) WithBufGroup(bufGroup uint16) SQEContext

WithBufGroup returns a new context with the buffer group replaced. For Direct mode, modifies the inline bits. For Indirect/Extended modes, writes to the pointed-to SQE struct.

func (SQEContext) WithFD

func (c SQEContext) WithFD(fd iofd.FD) SQEContext

WithFD returns a new context with the file descriptor replaced. For Direct mode, modifies the inline bits. For Indirect/Extended modes, writes to the pointed-to SQE struct.

func (SQEContext) WithFlags

func (c SQEContext) WithFlags(flags uint8) SQEContext

WithFlags returns a new context with the flags replaced. For Direct mode, modifies the inline bits. For Indirect/Extended modes, writes to the pointed-to SQE struct.

func (SQEContext) WithOp

func (c SQEContext) WithOp(op uint8) SQEContext

WithOp returns a new context with the opcode replaced. For Direct mode, modifies the inline bits. For Indirect/Extended modes, writes to the pointed-to SQE struct.

type SQEView

type SQEView struct {
	// contains filtered or unexported fields
}

SQEView provides read-only access to submission queue entry fields. It wraps the internal ioUringSqe structure and exposes its fields through public accessor methods.

Usage

cqe := cqes[i]
if cqe.FullSQE() {
    sqe := cqe.SQE()
    op := sqe.Opcode()
    fd := sqe.FD()
    addr := sqe.Addr()
    // ...
}

Field Layout (64 bytes)

┌────────────┬────────────┬────────────┬────────────┐
│ opcode (1) │ flags (1)  │ ioprio (2) │   fd (4)   │  bytes 0-7
├────────────┴────────────┴────────────┴────────────┤
│                    off (8)                        │  bytes 8-15
├───────────────────────────────────────────────────┤
│                   addr (8)                        │  bytes 16-23
├────────────────────────┬──────────────────────────┤
│       len (4)          │      uflags (4)          │  bytes 24-31
├────────────────────────┴──────────────────────────┤
│                  userData (8)                     │  bytes 32-39
├────────────┬───────────┬──────────────────────────┤
│bufIndex(2) │person.(2) │     spliceFdIn (4)       │  bytes 40-47
├────────────┴───────────┴──────────────────────────┤
│                   pad[0] (8)                      │  bytes 48-55
├───────────────────────────────────────────────────┤
│                   pad[1] (8)                      │  bytes 56-63
└───────────────────────────────────────────────────┘
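
The byte offsets in the layout above can be exercised with a small standalone decoder. This is only a sketch: `sqeFields` and `decodeSQE` are hypothetical names, the input is a plain 64-byte array rather than a real SQEView, and a little-endian host is assumed.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// sqeFields mirrors the documented 64-byte layout (padding omitted).
type sqeFields struct {
	Opcode      uint8
	Flags       uint8
	IoPrio      uint16
	FD          int32
	Off         uint64
	Addr        uint64
	Len         uint32
	UFlags      uint32
	UserData    uint64
	BufIndex    uint16
	Personality uint16
	SpliceFdIn  int32
}

// decodeSQE reads each field at the offset shown in the layout diagram.
func decodeSQE(raw *[64]byte) sqeFields {
	le := binary.LittleEndian // assumption: little-endian host
	return sqeFields{
		Opcode:      raw[0],
		Flags:       raw[1],
		IoPrio:      le.Uint16(raw[2:4]),
		FD:          int32(le.Uint32(raw[4:8])),
		Off:         le.Uint64(raw[8:16]),
		Addr:        le.Uint64(raw[16:24]),
		Len:         le.Uint32(raw[24:28]),
		UFlags:      le.Uint32(raw[28:32]),
		UserData:    le.Uint64(raw[32:40]),
		BufIndex:    le.Uint16(raw[40:42]),
		Personality: le.Uint16(raw[42:44]),
		SpliceFdIn:  int32(le.Uint32(raw[44:48])),
	}
}

func main() {
	var raw [64]byte
	raw[0] = 22 // opcode byte
	binary.LittleEndian.PutUint32(raw[4:8], 9)
	binary.LittleEndian.PutUint32(raw[24:28], 4096)
	binary.LittleEndian.PutUint64(raw[32:40], 0xdeadbeef)
	f := decodeSQE(&raw)
	fmt.Printf("op=%d fd=%d len=%d user_data=%#x\n", f.Opcode, f.FD, f.Len, f.UserData)
}
```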

func ViewExtSQE

func ViewExtSQE(ext *ExtSQE) SQEView

ViewExtSQE creates an SQEView from an ExtSQE.

func ViewSQE

func ViewSQE(indirect *IndirectSQE) SQEView

ViewSQE creates an SQEView from an IndirectSQE.

func (SQEView) Addr

func (v SQEView) Addr() uint64

Addr returns the address/pointer field. Interpretation depends on the operation:

  • For read/write: buffer address
  • For accept: sockaddr pointer
  • For socket: protocol

func (SQEView) BufGroup

func (v SQEView) BufGroup() uint16

BufGroup returns the buffer group ID. Only valid when IOSQE_BUFFER_SELECT flag is set.

func (SQEView) BufIndex

func (v SQEView) BufIndex() uint16

BufIndex returns the buffer index for fixed buffer operations.

func (SQEView) FD

func (v SQEView) FD() iofd.FD

FD returns the file descriptor for the operation.

func (SQEView) FileIndex

func (v SQEView) FileIndex() uint32

FileIndex returns the file index for direct file operations. This is a union with spliceFdIn.

func (SQEView) Flags

func (v SQEView) Flags() uint8

Flags returns the IOSQE_* submission flags.

func (SQEView) HasAsync

func (v SQEView) HasAsync() bool

HasAsync reports whether IOSQE_ASYNC flag is set.

func (SQEView) HasBufferSelect

func (v SQEView) HasBufferSelect() bool

HasBufferSelect reports whether IOSQE_BUFFER_SELECT flag is set.

func (SQEView) HasCQESkipSuccess

func (v SQEView) HasCQESkipSuccess() bool

HasCQESkipSuccess reports whether IOSQE_CQE_SKIP_SUCCESS flag is set.

func (SQEView) HasFixedFile

func (v SQEView) HasFixedFile() bool

HasFixedFile reports whether IOSQE_FIXED_FILE flag is set.

func (SQEView) HasIODrain

func (v SQEView) HasIODrain() bool

HasIODrain reports whether IOSQE_IO_DRAIN flag is set.

func (SQEView) HasIOHardlink

func (v SQEView) HasIOHardlink() bool

HasIOHardlink reports whether IOSQE_IO_HARDLINK flag is set.

func (SQEView) HasIOLink

func (v SQEView) HasIOLink() bool

HasIOLink reports whether IOSQE_IO_LINK flag is set.

func (SQEView) IoPrio

func (v SQEView) IoPrio() uint16

IoPrio returns the I/O priority.

func (SQEView) Len

func (v SQEView) Len() uint32

Len returns the length field. Interpretation depends on the operation:

  • For read/write: buffer length
  • For accept: file flags
  • For socket: socket type

func (SQEView) Off

func (v SQEView) Off() uint64

Off returns the offset field. Interpretation depends on the operation:

  • For read/write: file offset
  • For accept: sockaddr length pointer
  • For timeout: count or flags

func (SQEView) Opcode

func (v SQEView) Opcode() uint8

Opcode returns the IORING_OP_* operation code.

func (SQEView) Personality

func (v SQEView) Personality() uint16

Personality returns the personality ID for credentials.

func (SQEView) RawFD

func (v SQEView) RawFD() int32

RawFD returns the raw file descriptor as int32.

func (SQEView) SpliceFDIn

func (v SQEView) SpliceFDIn() int32

SpliceFDIn returns the splice input file descriptor.

func (SQEView) UFlags

func (v SQEView) UFlags() uint32

UFlags returns the union flags field. Interpretation depends on the operation:

  • For read/write: RW flags
  • For poll: poll events
  • For timeout: timeout flags
  • For accept: accept flags
  • For open: open flags
  • For send/recv: msg flags

func (SQEView) UserData

func (v SQEView) UserData() uint64

UserData returns the user data field (the packed SQEContext).

func (SQEView) Valid

func (v SQEView) Valid() bool

Valid reports whether the view points to a valid SQE.

type ScopedExtSQE

type ScopedExtSQE struct {
	Ext *ExtSQE
	// contains filtered or unexported fields
}

ScopedExtSQE manages ExtSQE lifecycle to prevent pool leaks. It ensures the ExtSQE is returned to the pool on error paths. The wrapped ExtSQE is borrowed until either Submitted transfers release responsibility to completion handling or Release returns it to the pool.

Usage pattern:

scope := ring.NewScopedExtSQE()
if scope.Ext == nil {
    return iox.ErrWouldBlock
}
defer scope.Release()  // Returns to pool if not submitted

// ... setup ExtSQE ...

if err := ring.Submit(ctx); err != nil {
    return err  // Release() will return ExtSQE to pool
}
scope.Submitted()  // Mark as submitted, Release() becomes no-op

func (*ScopedExtSQE) Release

func (s *ScopedExtSQE) Release()

Release returns the ExtSQE to the pool if not submitted. This is safe to call multiple times and after Submitted(). After Release, the ExtSQE and every typed/raw view derived from it are invalid.

func (*ScopedExtSQE) Submitted

func (s *ScopedExtSQE) Submitted()

Submitted marks the ExtSQE as submitted. After this call, Release() becomes a no-op. The CQE handler is responsible for returning the ExtSQE to the pool, and any state that must outlive PutExtSQE must be copied before release.
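
The release-unless-submitted contract can be modelled without a ring. The sketch below uses hypothetical stand-ins (`slot`, `pool`, `scoped`, `submit`) to show why a deferred Release is safe on both the error path and the success path; it is not the package's implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// slot and pool are hypothetical stand-ins for ExtSQE and its pool.
type slot struct{ id int }

type pool struct{ free []*slot }

// get pops a slot, or returns nil when the pool is exhausted.
func (p *pool) get() *slot {
	if len(p.free) == 0 {
		return nil
	}
	s := p.free[len(p.free)-1]
	p.free = p.free[:len(p.free)-1]
	return s
}

func (p *pool) put(s *slot) { p.free = append(p.free, s) }

// scoped models the documented scope: Release is a no-op after Submitted.
type scoped struct {
	Ext       *slot
	pool      *pool
	submitted bool
}

func newScoped(p *pool) scoped { return scoped{Ext: p.get(), pool: p} }

func (s *scoped) Submitted() { s.submitted = true }

func (s *scoped) Release() {
	if s.Ext == nil || s.submitted {
		return
	}
	s.pool.put(s.Ext)
	s.Ext = nil // makes repeated Release calls harmless
}

// submit borrows a slot and returns it on failure only.
func submit(p *pool, fail bool) error {
	sc := newScoped(p)
	if sc.Ext == nil {
		return errors.New("pool exhausted")
	}
	defer sc.Release()
	if fail {
		return errors.New("submit failed")
	}
	sc.Submitted() // ownership moves to completion handling
	return nil
}

func main() {
	p := &pool{free: []*slot{{id: 1}, {id: 2}}}
	fmt.Println(submit(p, true), len(p.free))
	fmt.Println(submit(p, false), len(p.free))
}
```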

func (*ScopedExtSQE) Valid

func (s *ScopedExtSQE) Valid() bool

Valid reports whether the scope has a valid ExtSQE.

type SendTargets

type SendTargets interface {
	// Count returns the number of targets.
	Count() int
	// FD returns the file descriptor at index i.
	FD(i int) iofd.FD
}

SendTargets represents a set of target sockets for multicast/broadcast.

type SmallBuffer

type SmallBuffer = iobuf.SmallBuffer

Buffer types re-exported from iobuf.

type Sockaddr

type Sockaddr = sock.Sockaddr

Sockaddr is the socket address interface used by socket operations.

func AddrToSockaddr

func AddrToSockaddr(addr Addr) Sockaddr

AddrToSockaddr converts an Addr to a Sockaddr. Submission paths that stage the result keep the returned Sockaddr alive.

type Socket

type Socket = sock.Socket

Socket is the minimal interface for socket operations.

type Statx

type Statx struct {
	Mask       uint32
	Blksize    uint32
	Attributes uint64
	Nlink      uint32
	Uid        uint32
	Gid        uint32
	Mode       uint16

	Ino              uint64
	Size             uint64
	Blocks           uint64
	Attributes_mask  uint64
	Atime            StatxTimestamp
	Btime            StatxTimestamp
	Ctime            StatxTimestamp
	Mtime            StatxTimestamp
	Rdev_major       uint32
	Rdev_minor       uint32
	Dev_major        uint32
	Dev_minor        uint32
	Mnt_id           uint64
	Dio_mem_align    uint32
	Dio_offset_align uint32
	// contains filtered or unexported fields
}

Statx represents the statx structure. Layout matches struct statx in Linux.

type StatxTimestamp

type StatxTimestamp struct {
	Sec  int64
	Nsec uint32
	// contains filtered or unexported fields
}

StatxTimestamp represents a timestamp in statx.
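
Converting a seconds-plus-nanoseconds pair into a time.Time is a one-liner with the standard library. The sketch below uses a local `statxTimestamp` type with the two exported fields shown above; the `Time` method is a hypothetical helper, not part of the documented API.

```go
package main

import (
	"fmt"
	"time"
)

// statxTimestamp mirrors the two exported fields of StatxTimestamp.
type statxTimestamp struct {
	Sec  int64
	Nsec uint32
}

// Time converts the timestamp to a UTC time.Time.
func (ts statxTimestamp) Time() time.Time {
	return time.Unix(ts.Sec, int64(ts.Nsec)).UTC()
}

func main() {
	ts := statxTimestamp{Sec: 1700000000, Nsec: 250}
	fmt.Println(ts.Time().Format(time.RFC3339Nano))
}
```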

type SubscriptionState

type SubscriptionState uint32

SubscriptionState reports where a multishot subscription is in its lifecycle.

Typestate model:

The valid operations depend on the current state. In a language with built-in typestate support, each state would be a distinct type.

State machine:

┌──────────────────────────────────────────────────────────────────┐
│                                                                  │
│   ┌──────────┐    Cancel()    ┌──────────────┐                   │
│   │  Active  │ ────────────▶  │  Cancelling  │                   │
│   └────┬─────┘                └──────┬───────┘                   │
│        │                             │                           │
│        │ Final CQE (!MORE)           │ Terminal CQE              │
│        │                             │                           │
│        ▼                             ▼                           │
│   ┌─────────────────────────────────────────────────────────┐    │
│   │                         Stopped                         │    │
│   │               (Terminal/Absorbing State)                │    │
│   └─────────────────────────────────────────────────────────┘    │
│                                                                  │
└──────────────────────────────────────────────────────────────────┘

Available operations:

State        | Allowed Operations
-------------|----------------------------------------------------
Active       | Cancel(), OnMultishotStep(), Active(), State()
Cancelling   | OnMultishotStep(), Active(), State()
Stopped      | Active(), State() [returns false/Stopped]

Invariants:

  • Active → Cancelling: Only via Cancel() with CAS
  • Active → Stopped: Only via a terminal CQE (success or error)
  • Cancelling → Stopped: Only via the terminal CQE after cancel wins the race
  • Stopped is absorbing: No outgoing transitions

const (
	// SubscriptionActive receives completions and can still cancel.
	SubscriptionActive SubscriptionState = iota

	// SubscriptionCancelling has enqueued cancel and awaits terminal CQE.
	SubscriptionCancelling

	// SubscriptionStopped is terminal; no more callbacks will start.
	SubscriptionStopped
)
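
The transitions listed in the invariants can be sketched with a single atomic word. The code below is a standalone model under assumed names (`state`, `subscription`, `onCQE`); it shows the CAS-guarded Active → Cancelling step and the absorbing Stopped state, and is not the package's implementation.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type state uint32

const (
	active state = iota
	cancelling
	stopped
)

// subscription holds the lifecycle state in one atomic word.
type subscription struct{ st atomic.Uint32 }

func (s *subscription) State() state { return state(s.st.Load()) }

// Cancel moves Active to Cancelling with a CAS and reports whether this
// call won; it fails from Cancelling and from Stopped.
func (s *subscription) Cancel() bool {
	return s.st.CompareAndSwap(uint32(active), uint32(cancelling))
}

// onCQE models one completion: a CQE without the MORE flag is terminal
// and moves the subscription to Stopped from either earlier state.
func (s *subscription) onCQE(more bool) {
	if !more {
		s.st.Store(uint32(stopped))
	}
}

func main() {
	var s subscription
	fmt.Println(s.Cancel(), s.State() == cancelling)
	s.onCQE(false)
	fmt.Println(s.Cancel(), s.State() == stopped)
}
```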

type Timespec

type Timespec = zcall.Timespec

Timespec represents a time value with nanosecond precision. Layout matches struct timespec in Linux.

type TitanBuffer

type TitanBuffer = iobuf.TitanBuffer

Buffer types re-exported from iobuf.

type UnderlyingProtocol

type UnderlyingProtocol = sock.UnderlyingProtocol

UnderlyingProtocol represents the transport protocol.

type Uring

type Uring struct {
	*Options
	// Features reports actual ring sizing and userdata metadata.
	Features *Features
	// contains filtered or unexported fields
}

Uring is the main io_uring interface for submitting and completing I/O operations. It wraps the kernel io_uring instance with buffer management and typed operations. Default rings use the single-issuer fast path, so submit-state operations are not safe for concurrent use by multiple goroutines; caller must serialize submit, Wait/enter, Stop, and ResizeRings unless MultiIssuers is enabled.

func New

func New(options ...OptionFunc) (*Uring, error)

New creates a new io_uring instance with the specified options. It returns an unstarted ring; call Start() to register ring resources and enable the ring before submitting operations.

func (*Uring) Accept

func (ur *Uring) Accept(sqeCtx SQEContext, options ...OpOptionFunc) error

Accept accepts a new connection from a listener socket. The fd in sqeCtx should be set to the listener socket.

func (*Uring) AcceptDirect

func (ur *Uring) AcceptDirect(sqeCtx SQEContext, fileIndex uint32, options ...OpOptionFunc) error

AcceptDirect accepts a connection directly into a registered file table slot. The fileIndex specifies which slot to use (0-based), or use IORING_FILE_INDEX_ALLOC for auto-allocation (the allocated index is returned in CQE res). SOCK_CLOEXEC is not supported with direct accept. Requires registered files via RegisterFiles or RegisterFilesSparse.

func (*Uring) AcceptMultishot

func (ur *Uring) AcceptMultishot(sqeCtx SQEContext, handler MultishotHandler, options ...OpOptionFunc) (*MultishotSubscription, error)

AcceptMultishot starts a multishot accept subscription. The handler receives one `MultishotStep` per CQE and one terminal stop. `step.CQE.Res` contains the accepted fd on success. Options may add SQE control flags and accept-specific ioprio flags such as `IORING_ACCEPT_POLL_FIRST`.

func (*Uring) AsyncCancel

func (ur *Uring) AsyncCancel(sqeCtx SQEContext, targetUserData uint64, options ...OpOptionFunc) error

AsyncCancel cancels a pending async operation.

func (*Uring) AsyncCancelAll

func (ur *Uring) AsyncCancelAll(sqeCtx SQEContext, options ...OpOptionFunc) error

AsyncCancelAll cancels all pending operations. The CQE result reports the number of cancelled operations.

func (*Uring) AsyncCancelAny

func (ur *Uring) AsyncCancelAny(sqeCtx SQEContext, options ...OpOptionFunc) error

AsyncCancelAny cancels any one pending operation. The CQE result is 0 on success, or -ENOENT if no operations are pending.

func (*Uring) AsyncCancelFD

func (ur *Uring) AsyncCancelFD(sqeCtx SQEContext, cancelAll bool, options ...OpOptionFunc) error

AsyncCancelFD cancels operations on a specific file descriptor. If cancelAll is true, it cancels all matching operations and the CQE result reports the count. Otherwise it cancels the first matching operation. The FD to cancel is taken from sqeCtx.FD().

func (*Uring) AsyncCancelOpcode

func (ur *Uring) AsyncCancelOpcode(sqeCtx SQEContext, opcode uint8, cancelAll bool, options ...OpOptionFunc) error

AsyncCancelOpcode cancels operations of a specific opcode type. If cancelAll is true, it cancels all matching operations and the CQE result reports the count. Otherwise it cancels the first matching operation.

func (*Uring) Bind

func (ur *Uring) Bind(sqeCtx SQEContext, addr Addr, options ...OpOptionFunc) error

Bind binds a socket to an address.

func (*Uring) BundleIterator

func (ur *Uring) BundleIterator(cqe CQEView, group uint16) (BundleIterator, bool)

BundleIterator constructs a BundleIterator for the given CQE and buffer group. The group parameter is the group ID as used during buffer ring registration. Returns a zero BundleIterator and false if the group is not registered or the CQE has no data.

The returned iterator borrows ring-owned backing memory. Process the buffers before calling Recycle, and call Recycle from a single goroutine without racing other buffer-ring recycle/advance activity on the same Uring.

func (*Uring) CQPending

func (ur *Uring) CQPending() int

CQPending returns the number of CQEs waiting to be reaped. Higher layers can use this to decide when to drain completions. On single-issuer rings it is not safe for concurrent use with ResizeRings; caller must serialize those operations.

func (*Uring) CloneBuffers

func (ur *Uring) CloneBuffers(srcFD int, srcOff, dstOff, count uint32, replace bool) error

CloneBuffers clones registered buffers from another io_uring ring. This enables buffer sharing between multiple rings.

Parameters:

  • srcFD: source ring file descriptor
  • srcOff: source buffer offset
  • dstOff: destination buffer offset
  • count: number of buffers to clone
  • replace: if true, replace existing buffers at destination offset

Requires kernel 6.19+.

func (*Uring) CloneBuffersFromRegistered

func (ur *Uring) CloneBuffersFromRegistered(srcRegisteredIdx int, srcOff, dstOff, count uint32, replace bool) error

CloneBuffersFromRegistered clones buffers from a registered ring. The source ring must be registered with IORING_REGISTER_RING_FDS. Requires kernel 6.19+.

func (*Uring) Close

func (ur *Uring) Close(sqeCtx SQEContext, options ...OpOptionFunc) error

Close submits `IORING_OP_CLOSE` for the file descriptor carried in sqeCtx. It closes a target fd; it does not tear down the Uring instance.

func (*Uring) Connect

func (ur *Uring) Connect(sqeCtx SQEContext, remote Addr, options ...OpOptionFunc) error

Connect initiates a socket connection to a remote address.

func (*Uring) EpollWait

func (ur *Uring) EpollWait(sqeCtx SQEContext, events []EpollEvent, timeout int32, options ...OpOptionFunc) error

EpollWait performs an epoll_wait operation via io_uring. The opcode does not accept an inline timeout; timeout must be 0 and callers should use LinkTimeout when they need a deadline. Caller must keep events valid until the operation completes.

func (*Uring) ExtSQE

func (ur *Uring) ExtSQE() *ExtSQE

ExtSQE acquires an ExtSQE from the pool for Extended mode submissions. Returns nil if the pool is exhausted (ring is full - natural backpressure). The returned ExtSQE is borrowed until PutExtSQE after the corresponding CQE is processed. Callers must not retain pointers into SQE or UserData after release.

func (*Uring) FGetXattr

func (ur *Uring) FGetXattr(sqeCtx SQEContext, name string, value []byte, options ...OpOptionFunc) error

FGetXattr gets an extended attribute from a file descriptor. The result length is returned in the CQE.

func (*Uring) FSetXattr

func (ur *Uring) FSetXattr(sqeCtx SQEContext, name string, value []byte, flags int, options ...OpOptionFunc) error

FSetXattr sets an extended attribute on a file descriptor.

func (*Uring) FTruncate

func (ur *Uring) FTruncate(sqeCtx SQEContext, length int64, options ...OpOptionFunc) error

FTruncate truncates a file to the specified length.

func (*Uring) Fallocate

func (ur *Uring) Fallocate(sqeCtx SQEContext, mode uint32, offset int64, length int64, options ...OpOptionFunc) error

Fallocate allocates space for a file.

func (*Uring) FileAdvise

func (ur *Uring) FileAdvise(sqeCtx SQEContext, offset int64, length int, advice int, options ...OpOptionFunc) error

FileAdvise provides advice about file access patterns.

func (*Uring) FilesUpdate

func (ur *Uring) FilesUpdate(sqeCtx SQEContext, fds []int32, offset int, options ...OpOptionFunc) error

FilesUpdate updates registered files at the specified offset. The fds slice contains the new file descriptors to register. Use -1 to unregister a slot.

func (*Uring) FixedFdInstall

func (ur *Uring) FixedFdInstall(sqeCtx SQEContext, fixedIndex int, flags uint32, options ...OpOptionFunc) error

FixedFdInstall installs a fixed (registered) file descriptor into the normal file descriptor table. Returns the new fd in the CQE result. The fixedIndex is the index in the registered files table.

func (*Uring) FutexWait

func (ur *Uring) FutexWait(sqeCtx SQEContext, addr *uint32, val uint64, mask uint64, flags uint32, options ...OpOptionFunc) error

FutexWait submits an async futex wait operation. If the value at addr still equals val, the operation waits until it is woken, using the specified mask and flags. Caller must keep addr valid until the operation completes.

func (*Uring) FutexWaitV

func (ur *Uring) FutexWaitV(sqeCtx SQEContext, waitv unsafe.Pointer, count uint32, flags uint32, options ...OpOptionFunc) error

FutexWaitV submits a vectored futex wait operation. Waits on multiple futexes simultaneously. The waitv pointer should point to a struct futex_waitv array with count elements. Caller must keep waitv valid until the operation completes.

func (*Uring) FutexWake

func (ur *Uring) FutexWake(sqeCtx SQEContext, addr *uint32, val uint64, mask uint64, flags uint32, options ...OpOptionFunc) error

FutexWake submits an async futex wake operation. Wakes up to val waiters on the futex at addr, using the specified mask and flags. Caller must keep addr valid until the operation completes.

func (*Uring) GetXattr

func (ur *Uring) GetXattr(sqeCtx SQEContext, path, name string, value []byte, options ...OpOptionFunc) error

GetXattr gets an extended attribute from a path. The result length is returned in the CQE.

func (*Uring) IndirectSQE

func (ur *Uring) IndirectSQE() *IndirectSQE

IndirectSQE acquires an IndirectSQE from the pool for Indirect mode submissions. Returns nil if the pool is exhausted. The returned IndirectSQE is borrowed until PutIndirectSQE.

func (*Uring) LinkAt

func (ur *Uring) LinkAt(sqeCtx SQEContext, oldDirfd int, oldPath, newPath string, flags int, options ...OpOptionFunc) error

LinkAt creates a hard link.

func (*Uring) LinkTimeout

func (ur *Uring) LinkTimeout(sqeCtx SQEContext, d time.Duration, options ...OpOptionFunc) error

LinkTimeout creates a linked timeout operation.

func (*Uring) Listen

func (ur *Uring) Listen(sqeCtx SQEContext, options ...OpOptionFunc) error

Listen starts listening on a socket.

func (*Uring) MkdirAt

func (ur *Uring) MkdirAt(sqeCtx SQEContext, path string, mode uint32, options ...OpOptionFunc) error

MkdirAt creates a directory.

func (*Uring) MsgRing

func (ur *Uring) MsgRing(sqeCtx SQEContext, userData int64, result int32, options ...OpOptionFunc) error

MsgRing sends a message to another io_uring instance. The sqeCtx.FD() should be the target ring's file descriptor. userData and result are passed to the target ring's CQE.

func (*Uring) MsgRingFD

func (ur *Uring) MsgRingFD(sqeCtx SQEContext, srcFD uint32, dstSlot uint32, userData int64, skipCQE bool, options ...OpOptionFunc) error

MsgRingFD transfers a fixed file descriptor to another io_uring instance. This is useful for multi-ring architectures where one ring accepts connections and passes them to worker rings.

Parameters:

  • sqeCtx: Context with target ring's FD (from RingFD())
  • srcFD: Fixed file index in the source ring (this ring)
  • dstSlot: Fixed file slot in the target ring to install the FD
  • userData: Value passed to the target ring's CQE
  • skipCQE: If true, no CQE is posted to the target ring

Both rings must have registered file tables.

func (*Uring) Multicast

func (ur *Uring) Multicast(sqeCtx SQEContext, targets SendTargets, bufIndex int, p []byte, offset int64, n int, options ...OpOptionFunc) error

Multicast sends data to multiple sockets, selecting copy vs zero-copy per message size.

Strategy selection (conservative thresholds, based on Linux 6.18 measurements): one io_uring cycle costs ~523ns and zero-copy needs 2 cycles (~1046ns overhead). Zero-copy is used only when the memcpy savings clearly exceed that overhead. In the list below, N is taken to be the number of targets:

  • N < 8: >= 8 KiB uses zero-copy (high bar, overhead not amortized)
  • N < 64: >= 4 KiB uses zero-copy
  • N < 512: >= 3 KiB uses zero-copy
  • N < 4096: >= 2 KiB uses zero-copy
  • N >= 4096: >= 1.5 KiB uses zero-copy (fully amortized)

For aggressive zero-copy usage, use MulticastZeroCopy instead.
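
The conservative table above can be written as a small pure function. This is a sketch that only transcribes the documented thresholds: `useZeroCopy` is a hypothetical name, and N is assumed to mean the number of targets.

```go
package main

import "fmt"

// useZeroCopy reports whether the conservative table would pick zero-copy
// for a payload of payloadBytes sent to the given number of targets.
func useZeroCopy(targets, payloadBytes int) bool {
	var threshold int
	switch {
	case targets < 8:
		threshold = 8 << 10 // 8 KiB
	case targets < 64:
		threshold = 4 << 10 // 4 KiB
	case targets < 512:
		threshold = 3 << 10 // 3 KiB
	case targets < 4096:
		threshold = 2 << 10 // 2 KiB
	default:
		threshold = 1536 // 1.5 KiB
	}
	return payloadBytes >= threshold
}

func main() {
	for _, c := range [][2]int{{4, 8192}, {4, 4096}, {100, 3072}, {5000, 1536}} {
		fmt.Printf("targets=%d bytes=%d zero-copy=%v\n", c[0], c[1], useZeroCopy(c[0], c[1]))
	}
}
```

The threshold falls as the target count grows because the fixed two-CQE overhead is spread over more sends.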

Zero-copy notes:

  • Produces two CQEs per send: completion (IORING_CQE_F_MORE) + notification
  • Buffer must not be modified until notification CQE is received
  • Requires TCP sockets; returns EOPNOTSUPP on Unix sockets or loopback

Parameters:

  • sqeCtx: base context (FD will be overwritten per target)
  • targets: collection of target sockets
  • bufIndex: registered buffer index (use -1 for non-registered buffer)
  • p: payload data (used when bufIndex < 0)
  • offset: offset within buffer
  • n: number of bytes to send

func (*Uring) MulticastZeroCopy

func (ur *Uring) MulticastZeroCopy(sqeCtx SQEContext, targets SendTargets, bufIndex int, offset int64, n int, options ...OpOptionFunc) error

MulticastZeroCopy sends data to multiple sockets using zero-copy with registered buffers. Because the caller has explicitly requested zero-copy, this method applies much lower size thresholds than Multicast.

Very aggressive thresholds (use ZC whenever there's any reasonable chance of benefit):

  • N < 4: >= 1.5 KiB uses zero-copy (minimal bar)
  • N < 16: >= 1 KiB uses zero-copy
  • N < 64: >= 512 B uses zero-copy
  • N < 256: >= 128 B uses zero-copy
  • N >= 256: any size uses zero-copy (fully amortized)

For conservative zero-copy usage, use Multicast instead.

Prerequisites:

  • Buffer must be registered via IORING_REGISTER_BUFFERS2
  • bufIndex must be a valid registered buffer index

Use this for:

  • Live streaming (same video/audio frame to thousands of viewers)
  • Real-time gaming (same game state to many players)
  • Any scenario with O(1) payload and O(N) targets

Zero-copy notes:

  • Produces two CQEs per send: completion (IORING_CQE_F_MORE) + notification
  • Buffer must not be modified until notification CQE is received
  • May return EOPNOTSUPP on Unix sockets or loopback

func (*Uring) MustExtSQE

func (ur *Uring) MustExtSQE() *ExtSQE

MustExtSQE gets an ExtSQE from the pool, panicking if exhausted. Use only in contexts where pool exhaustion is a programming error.

func (*Uring) NAPIAddStaticID

func (ur *Uring) NAPIAddStaticID(napiID uint32) error

NAPIAddStaticID adds a NAPI ID for static tracking mode. Use this when IO_URING_NAPI_TRACKING_STATIC is enabled. Requires kernel 6.19+.

func (*Uring) NAPIDelStaticID

func (ur *Uring) NAPIDelStaticID(napiID uint32) error

NAPIDelStaticID removes a NAPI ID from static tracking. Use this when IO_URING_NAPI_TRACKING_STATIC is enabled. Requires kernel 6.19+.

func (*Uring) NewScopedExtSQE

func (ur *Uring) NewScopedExtSQE() ScopedExtSQE

NewScopedExtSQE gets an ExtSQE from the pool wrapped in a scope. Returns a scope with nil Ext if the pool is exhausted. The borrowed ExtSQE and all derived UserData views remain valid only until Submitted or Release closes the scope.

func (*Uring) Nop

func (ur *Uring) Nop(sqeCtx SQEContext, options ...OpOptionFunc) error

Nop submits a no-op request.

func (*Uring) Nop128

func (ur *Uring) Nop128(sqeCtx SQEContext, options ...OpOptionFunc) error

Nop128 submits a 128-byte NOP operation. Requires a ring created with Options.SQE128; otherwise it returns ErrNotSupported.

func (*Uring) OpenAt

func (ur *Uring) OpenAt(sqeCtx SQEContext, pathname string, openFlags int, mode uint32, options ...OpOptionFunc) error

OpenAt opens a file at the given path relative to a directory fd.

func (*Uring) Pipe

func (ur *Uring) Pipe(sqeCtx SQEContext, fds *[2]int32, pipeFlags uint32, options ...OpOptionFunc) error

Pipe creates a pipe using io_uring. The fds parameter must point to a [2]int32 array; on successful completion the kernel has written the read end to fds[0] and the write end to fds[1]. Caller must keep fds valid until the operation completes.

func (*Uring) PollAdd

func (ur *Uring) PollAdd(sqeCtx SQEContext, events int, options ...OpOptionFunc) error

PollAdd adds a file descriptor to the poll set.

func (*Uring) PollAddLevel

func (ur *Uring) PollAddLevel(sqeCtx SQEContext, events int, options ...OpOptionFunc) error

PollAddLevel adds a level-triggered poll request. Unlike edge-triggered poll which fires once when state changes, level-triggered poll fires continuously while the condition is true.

func (*Uring) PollAddMultishot

func (ur *Uring) PollAddMultishot(sqeCtx SQEContext, events int, options ...OpOptionFunc) error

PollAddMultishot adds a persistent poll request that generates multiple CQEs. Unlike PollAdd, which requires re-submission after each event, multishot poll automatically re-arms and continues generating CQEs until cancelled. Each CQE has IORING_CQE_F_MORE set while the poll remains armed; the final CQE, posted when the poll terminates or is cancelled, has IORING_CQE_F_MORE cleared.
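
A consumer of multishot completions usually needs one check per CQE: is the MORE flag still set? The sketch below models that with a local `cqe` struct and a hypothetical `drain` helper; the flag value 1 << 1 is IORING_CQE_F_MORE from the Linux UAPI header, while everything else is illustrative.

```go
package main

import "fmt"

// cqeFMore is IORING_CQE_F_MORE (bit 1) from the Linux io_uring UAPI header.
const cqeFMore = 1 << 1

// cqe is a local stand-in carrying only the fields this sketch needs.
type cqe struct {
	Res   int32
	Flags uint32
}

// drain counts successful events and reports whether the poll is still
// armed after the batch. It stops at the first CQE without MORE.
func drain(cqes []cqe) (events int, armed bool) {
	for _, c := range cqes {
		if c.Res >= 0 {
			events++
		}
		if c.Flags&cqeFMore == 0 {
			return events, false // terminal CQE: re-submit if still needed
		}
	}
	return events, true
}

func main() {
	batch := []cqe{{Res: 1, Flags: cqeFMore}, {Res: 1, Flags: cqeFMore}, {Res: -125}}
	fmt.Println(drain(batch))
}
```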

func (*Uring) PollAddMultishotLevel

func (ur *Uring) PollAddMultishotLevel(sqeCtx SQEContext, events int, options ...OpOptionFunc) error

PollAddMultishotLevel combines multishot and level-triggered modes. This creates a persistent, level-triggered poll subscription.

func (*Uring) PollRemove

func (ur *Uring) PollRemove(sqeCtx SQEContext, options ...OpOptionFunc) error

PollRemove removes a file descriptor from the poll set.

func (*Uring) PollUpdate

func (ur *Uring) PollUpdate(sqeCtx SQEContext, oldUserData, newUserData uint64, newEvents, updateFlags int, options ...OpOptionFunc) error

PollUpdate modifies an existing poll request in-place without cancellation. This atomically updates the poll events and/or userData of an active poll.

Parameters:

  • sqeCtx: Context for this update operation
  • oldUserData: userData of the target poll request to update
  • newUserData: New userData (used if updateFlags includes IORING_POLL_UPDATE_USER_DATA)
  • newEvents: New poll events (used if updateFlags includes IORING_POLL_UPDATE_EVENTS)
  • updateFlags: Combination of:
      IORING_POLL_UPDATE_EVENTS - update the poll event mask
      IORING_POLL_UPDATE_USER_DATA - update the userData
      IORING_POLL_ADD_MULTI - make the updated poll multishot

The poll is located by matching oldUserData. If no matching poll is found, the operation returns -ENOENT. If updateFlags is 0 (or only ADD_MULTI without UPDATE_EVENTS or UPDATE_USER_DATA), the operation behaves as PollRemove.
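
The rule about which flag combinations degrade to a removal can be expressed as a bit test. The constants below use the values from the Linux UAPI header (ADD_MULTI = 1 << 0, UPDATE_EVENTS = 1 << 1, UPDATE_USER_DATA = 1 << 2); the local names and the `behavesAsRemove` helper are hypothetical, and the function only restates the paragraph above.

```go
package main

import "fmt"

// Flag values as defined in the Linux io_uring UAPI header.
const (
	pollAddMulti       = 1 << 0 // IORING_POLL_ADD_MULTI
	pollUpdateEvents   = 1 << 1 // IORING_POLL_UPDATE_EVENTS
	pollUpdateUserData = 1 << 2 // IORING_POLL_UPDATE_USER_DATA
)

// behavesAsRemove reports whether updateFlags carries neither update bit,
// in which case the documentation says the call acts like PollRemove.
func behavesAsRemove(updateFlags int) bool {
	return updateFlags&(pollUpdateEvents|pollUpdateUserData) == 0
}

func main() {
	fmt.Println(behavesAsRemove(0))
	fmt.Println(behavesAsRemove(pollAddMulti))
	fmt.Println(behavesAsRemove(pollUpdateEvents | pollAddMulti))
}
```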

func (*Uring) PutExtSQE

func (ur *Uring) PutExtSQE(sqe *ExtSQE)

PutExtSQE returns an ExtSQE to the pool after completion processing. Must be called exactly once per ExtSQE to maintain pool balance. After this call the ExtSQE, typed context views, and raw CastUserData overlays derived from it are invalid.

func (*Uring) PutIndirectSQE

func (ur *Uring) PutIndirectSQE(sqe *IndirectSQE)

PutIndirectSQE returns an IndirectSQE to the pool. After this call the IndirectSQE is invalid and must not be reused.

func (*Uring) QueryOpcodes

func (ur *Uring) QueryOpcodes() (*QueryOpcode, error)

QueryOpcodes queries the kernel for supported io_uring operations. Returns detailed information about supported opcodes, features, and flags. Requires kernel 6.19+.

func (*Uring) QuerySCQ

func (ur *Uring) QuerySCQ() (*QuerySCQ, error)

QuerySCQ queries the kernel for SQ/CQ ring information. Returns header size and alignment requirements for shared rings. Requires kernel 6.19+.

func (*Uring) QueryZCRX

func (ur *Uring) QueryZCRX() (*QueryZCRX, error)

QueryZCRX queries the kernel for ZCRX (zero-copy receive) capabilities. Returns information about supported ZCRX features and configuration. Requires kernel 6.19+.

func (*Uring) Read

func (ur *Uring) Read(sqeCtx SQEContext, b []byte, options ...OpOptionFunc) error

Read performs a read operation. Caller must keep b valid until the operation completes.

func (*Uring) ReadFixed

func (ur *Uring) ReadFixed(sqeCtx SQEContext, bufIndex int, options ...OpOptionFunc) ([]byte, error)

ReadFixed performs a read with a registered (fixed) buffer.

func (*Uring) ReadV

func (ur *Uring) ReadV(sqeCtx SQEContext, iovs [][]byte, options ...OpOptionFunc) error

ReadV performs a vectored read operation. Caller must keep iovs and their backing buffers valid until the operation completes.

func (*Uring) ReadvFixed

func (ur *Uring) ReadvFixed(sqeCtx SQEContext, offset int64, bufIndices []int, options ...OpOptionFunc) error

ReadvFixed performs a vectored read using registered buffers. All buffer indices must refer to previously registered buffers.

func (*Uring) Receive

func (ur *Uring) Receive(sqeCtx SQEContext, pollFD PollFd, b []byte, options ...OpOptionFunc) error

Receive performs a socket receive operation with MSG_WAITALL semantics. If b is nil, it uses buffer selection from the kernel-provided buffer ring. If b is non-nil, caller must keep b valid until the operation completes.

func (*Uring) ReceiveBundle

func (ur *Uring) ReceiveBundle(sqeCtx SQEContext, pollFD PollFd, options ...OpOptionFunc) error

ReceiveBundle performs a bundle receive operation. It grabs multiple contiguous buffers from the buffer group in a single operation. The CQE result holds the number of bytes received; use BundleBuffers() to get the buffer range. It always uses buffer selection from the kernel-provided buffer ring.

func (*Uring) ReceiveMultishot

func (ur *Uring) ReceiveMultishot(sqeCtx SQEContext, handler MultishotHandler, options ...OpOptionFunc) (*MultishotSubscription, error)

ReceiveMultishot starts a multishot receive subscription with buffer selection. The handler receives one `MultishotStep` per CQE and one terminal stop. Use `step.CQE.BufID()` for the buffer ID and `step.CQE.Res` for the byte count. Options may add SQE control flags and recv-specific ioprio flags such as `IORING_RECVSEND_POLL_FIRST`.

func (*Uring) ReceiveZeroCopy

func (ur *Uring) ReceiveZeroCopy(sqeCtx SQEContext, pollFD PollFd, n int, zcrxIfqIdx uint32, options ...OpOptionFunc) error

ReceiveZeroCopy performs a zero-copy receive operation. Supported Linux 6.18+ kernels provide the opcode; the operation still requires a registered ZCRX interface queue. zcrxIfqIdx is the queue index returned by RegisterZCRXIfq.

func (*Uring) RecvMsg

func (ur *Uring) RecvMsg(sqeCtx SQEContext, pollFD PollFd, buffers [][]byte, oob []byte, options ...OpOptionFunc) error

RecvMsg receives a message with control data. Caller must keep buffers and oob valid until the operation completes.

func (*Uring) RegisterBufRingIncremental

func (ur *Uring) RegisterBufRingIncremental(entries int, groupID uint16) (*ioUringBufRing, error)

RegisterBufRingIncremental registers a buffer ring in incremental consumption mode. In this mode, buffers can be partially consumed across multiple completions. The CQE will have IORING_CQE_F_BUF_MORE set if more data remains. entries follows the same range and rounding contract as RegisterBufRingMMAP.
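
In incremental mode the application has to remember where the previous completion stopped inside each buffer. The following is a minimal sketch of that bookkeeping, assuming the buffer ID sits in the upper 16 bits of the CQE flags as in the Linux UAPI. incTracker and span are hypothetical names, not part of this package.

```go
package main

import "fmt"

// Local copies of the Linux UAPI values (include/uapi/linux/io_uring.h).
const (
	cqeFBufMore    uint32 = 1 << 4 // IORING_CQE_F_BUF_MORE
	cqeBufferShift        = 16     // IORING_CQE_BUFFER_SHIFT
)

// incTracker (hypothetical) remembers how far earlier completions reached
// into each partially consumed buffer.
type incTracker struct{ offsets map[uint16]int }

func newIncTracker() *incTracker {
	return &incTracker{offsets: map[uint16]int{}}
}

// span returns the buffer ID, the half-open byte range [start, end) filled by
// this completion, and whether the buffer is finished and may be recycled.
func (t *incTracker) span(res int32, flags uint32) (bufID uint16, start, end int, done bool) {
	bufID = uint16(flags >> cqeBufferShift)
	start = t.offsets[bufID]
	end = start + int(res)
	done = flags&cqeFBufMore == 0
	if done {
		delete(t.offsets, bufID) // next use of this ID starts at offset 0
	} else {
		t.offsets[bufID] = end // more data will land after this point
	}
	return
}

func main() {
	t := newIncTracker()
	// Two completions on buffer 3: the first leaves room, the second ends it.
	fmt.Println(t.span(100, 3<<cqeBufferShift|cqeFBufMore))
	fmt.Println(t.span(50, 3<<cqeBufferShift))
}
```

The sketch expects only successful completions (res >= 0); error completions need to be filtered out before calling span.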

func (*Uring) RegisterBufRingMMAP

func (ur *Uring) RegisterBufRingMMAP(entries int, groupID uint16) (*ioUringBufRing, error)

RegisterBufRingMMAP registers a buffer ring with kernel-allocated memory. The kernel allocates the ring memory and the application uses mmap to access it. entries must be between 1 and 32768; non-power-of-two values round up. Returns the buffer ring for adding buffers.
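
The range and rounding contract for entries can be illustrated with a small stand-alone function. This is a sketch of the documented behavior, not the package's internal code; normalizeEntries and errEntriesRange are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"math/bits"
)

// maxBufRingEntries is the upper bound stated in the documentation above.
const maxBufRingEntries = 32768

// errEntriesRange is a hypothetical error used only by this sketch.
var errEntriesRange = errors.New("entries must be between 1 and 32768")

// normalizeEntries rejects out-of-range values and rounds everything else up
// to the next power of two.
func normalizeEntries(entries int) (int, error) {
	if entries < 1 || entries > maxBufRingEntries {
		return 0, errEntriesRange
	}
	// bits.Len(n-1) is the exponent of the smallest power of two >= n.
	return 1 << bits.Len(uint(entries-1)), nil
}

func main() {
	for _, n := range []int{1, 1000, 32768, 40000} {
		got, err := normalizeEntries(n)
		fmt.Println(n, got, err)
	}
}
```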

func (*Uring) RegisterBufRingWithFlags

func (ur *Uring) RegisterBufRingWithFlags(entries int, groupID uint16, flags uint16) (*ioUringBufRing, error)

RegisterBufRingWithFlags registers a buffer ring with specified flags. Flags can be combined, for example IOU_PBUF_RING_MMAP | IOU_PBUF_RING_INC. entries must be between 1 and 32768; non-power-of-two values round up.

func (*Uring) RegisterFiles

func (ur *Uring) RegisterFiles(fds []int32) error

RegisterFiles registers file descriptors for use with IOSQE_FIXED_FILE flag. Registered files bypass per-operation fget/fput kernel calls.

Once registered, use the file index (0-based) instead of the fd in SQEs, and set the IOSQE_FIXED_FILE flag.

Returns ErrExists if files are already registered. Use UnregisterFiles before re-registering.

func (*Uring) RegisterFilesSparse

func (ur *Uring) RegisterFilesSparse(count uint32) error

RegisterFilesSparse allocates a sparse file table of the given size. All entries are initially empty (-1) and can be populated dynamically using RegisterFilesUpdate.

Sparse registration is useful for applications that manage a dynamic set of file descriptors (e.g., connection pools, file caches).

func (*Uring) RegisterFilesUpdate

func (ur *Uring) RegisterFilesUpdate(offset uint32, fds []int32) error

RegisterFilesUpdate updates registered files at the specified offset. Use -1 to clear a slot, or a valid fd to set it.

This allows dynamic management of registered files without unregistering and re-registering the entire table.
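
The sparse-table semantics can be pictured with a user-space model. The sketch below only mirrors the documented behavior (slots start at -1, updates overwrite a range starting at offset). fileTable, newSparseTable, and update are hypothetical names, and the bounds check is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// fileTable is a hypothetical user-space mirror of a registered file table.
// A value of -1 marks an empty slot.
type fileTable []int32

// newSparseTable models RegisterFilesSparse: every slot starts empty.
func newSparseTable(count uint32) fileTable {
	t := make(fileTable, count)
	for i := range t {
		t[i] = -1
	}
	return t
}

// update models RegisterFilesUpdate: fds overwrite the slots starting at
// offset, and -1 clears a slot. The bounds error is an assumption.
func (t fileTable) update(offset uint32, fds []int32) error {
	if uint64(offset)+uint64(len(fds)) > uint64(len(t)) {
		return errors.New("update exceeds table size")
	}
	copy(t[offset:], fds)
	return nil
}

func main() {
	t := newSparseTable(4)
	_ = t.update(1, []int32{7, 9}) // fill slots 1 and 2
	_ = t.update(1, []int32{-1})   // clear slot 1 again
	fmt.Println(t)                 // [-1 -1 9 -1]
}
```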

func (*Uring) RegisterMemRegion

func (ur *Uring) RegisterMemRegion(region *RegionDesc, flags uint64) error

RegisterMemRegion registers a memory region with the io_uring ring. Memory regions enable shared access between user space and kernel. Requires kernel 6.19+.

func (*Uring) RegisterNAPI

func (ur *Uring) RegisterNAPI(busyPollTimeout uint32, preferBusyPoll bool, strategy uint32) error

RegisterNAPI enables NAPI busy polling for network operations. NAPI (New API) enables kernel-side batched network packet processing.

Parameters:

  • busyPollTimeout: timeout in microseconds for busy polling
  • preferBusyPoll: if true, prefer busy poll over sleeping
  • strategy: IO_URING_NAPI_TRACKING_* strategy

Requires kernel 6.19+.

func (*Uring) RegisterZCRXIfq

func (ur *Uring) RegisterZCRXIfq(ifIdx, ifRxq uint32, rqEntries uint32, area *ZCRXAreaReg, region *RegionDesc, rxBufLen uint32) (uint32, ZCRXOffsets, error)

RegisterZCRXIfq registers a zero-copy receive interface queue. region must describe the refill-ring memory (IORING_MEM_REGION_TYPE_USER). rxBufLen is the desired receive buffer chunk size; 0 means page size. Returns the ZCRX instance ID and kernel-reported refill ring offsets on success. Requires the Linux 6.18+ baseline and ZCRX-capable network hardware.

func (*Uring) RegisteredBuffer

func (ur *Uring) RegisteredBuffer(index int) []byte

RegisteredBuffer returns the registered buffer at the given index. Returns nil if the index is out of range. The returned slice shares memory with the kernel; writes are visible to zero-copy operations using the same buffer index.

func (*Uring) RegisteredBufferCount

func (ur *Uring) RegisteredBufferCount() int

RegisteredBufferCount returns the number of registered buffers.

func (*Uring) RegisteredFileCount

func (ur *Uring) RegisteredFileCount() int

RegisteredFileCount returns the number of registered files, or 0 if none.

func (*Uring) RenameAt

func (ur *Uring) RenameAt(sqeCtx SQEContext, oldPath, newPath string, flags int, options ...OpOptionFunc) error

RenameAt renames a file at a path.

func (*Uring) ResizeRings

func (ur *Uring) ResizeRings(newSQSize, newCQSize uint32) error

ResizeRings resizes the SQ and CQ rings of this io_uring instance, allowing ring sizes to be adjusted dynamically without recreating the ring. On single-issuer rings it is not safe for concurrent use with submit, Wait/enter, or Stop; caller must serialize those operations.

Requirements:

  • The ring must be created with IORING_SETUP_DEFER_TASKRUN flag
  • The ring must not be in CQ overflow condition
  • Sizes must be power-of-two values

Parameters:

  • newSQSize: New SQ ring size (0 keeps the current SQ size)
  • newCQSize: New CQ ring size (0 defaults to 2×newSQSize)

Supported on Linux 6.18+.
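
The size rules above can be sketched as a pre-check a caller might run before invoking ResizeRings. This is a hypothetical helper, not package code. It assumes that when both arguments are 0 the CQ default is derived from the current SQ size, which the documentation does not state explicitly.

```go
package main

import (
	"errors"
	"fmt"
)

// isPow2 reports whether n is a non-zero power of two.
func isPow2(n uint32) bool { return n != 0 && n&(n-1) == 0 }

// resolveResize (hypothetical) applies the documented defaults: a zero SQ
// size keeps the current one, and a zero CQ size becomes twice the SQ size.
// Assumption: the CQ default uses the resolved SQ size.
func resolveResize(curSQ, newSQ, newCQ uint32) (sq, cq uint32, err error) {
	sq = newSQ
	if sq == 0 {
		sq = curSQ
	}
	cq = newCQ
	if cq == 0 {
		cq = 2 * sq
	}
	if !isPow2(sq) || !isPow2(cq) {
		return 0, 0, errors.New("ring sizes must be powers of two")
	}
	return sq, cq, nil
}

func main() {
	fmt.Println(resolveResize(256, 0, 0))   // keep SQ, default CQ
	fmt.Println(resolveResize(256, 512, 0)) // grow SQ, default CQ
	fmt.Println(resolveResize(256, 300, 0)) // rejected: not a power of two
}
```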

func (*Uring) RingFD

func (ur *Uring) RingFD() int

RingFD returns the io_uring file descriptor. Required for cross-ring operations via IORING_OP_MSG_RING.

func (*Uring) SCTP4Socket

func (ur *Uring) SCTP4Socket(sqeCtx SQEContext, options ...OpOptionFunc) error

SCTP4Socket creates an SCTP IPv4 socket.

func (*Uring) SCTP4SocketDirect

func (ur *Uring) SCTP4SocketDirect(sqeCtx SQEContext, options ...OpOptionFunc) error

SCTP4SocketDirect creates an SCTP IPv4 socket directly into a registered file table slot. Uses IORING_FILE_INDEX_ALLOC for auto-allocation; returns slot index in CQE res.

func (*Uring) SCTP6Socket

func (ur *Uring) SCTP6Socket(sqeCtx SQEContext, options ...OpOptionFunc) error

SCTP6Socket creates an SCTP IPv6 socket.

func (*Uring) SCTP6SocketDirect

func (ur *Uring) SCTP6SocketDirect(sqeCtx SQEContext, options ...OpOptionFunc) error

SCTP6SocketDirect creates an SCTP IPv6 socket directly into a registered file table slot. Uses IORING_FILE_INDEX_ALLOC for auto-allocation; returns slot index in CQE res.

func (*Uring) SQAvailable

func (ur *Uring) SQAvailable() int

SQAvailable returns the number of SQEs available for submission. Higher layers can use this for admission control and backpressure. On single-issuer rings it is not safe for concurrent use with ResizeRings; caller must serialize those operations.

func (*Uring) Send

func (ur *Uring) Send(sqeCtx SQEContext, pollFD PollFd, p []byte, options ...OpOptionFunc) error

Send writes data to a socket. Caller must keep p valid until the operation completes.

func (*Uring) SendMsg

func (ur *Uring) SendMsg(sqeCtx SQEContext, pollFD PollFd, buffers [][]byte, oob []byte, to Addr, options ...OpOptionFunc) error

SendMsg sends a message with control data. Caller must keep buffers and oob valid until the operation completes.

func (*Uring) SetXattr

func (ur *Uring) SetXattr(sqeCtx SQEContext, path, name string, value []byte, flags int, options ...OpOptionFunc) error

SetXattr sets an extended attribute on a path.

func (*Uring) Shutdown

func (ur *Uring) Shutdown(sqeCtx SQEContext, how int, options ...OpOptionFunc) error

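Shutdown shuts down all or part of a full-duplex socket connection without closing the descriptor; how selects the direction (for example SHUT_RD, SHUT_WR, or SHUT_RDWR).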

func (*Uring) SocketDirect

func (ur *Uring) SocketDirect(sqeCtx SQEContext, domain, typ, proto int, fileIndex uint32, options ...OpOptionFunc) error

SocketDirect creates a socket directly into a registered file table slot. The fileIndex specifies which slot to use (0-based), or use IORING_FILE_INDEX_ALLOC for auto-allocation (the allocated index is returned in CQE res). Requires registered files via RegisterFiles or RegisterFilesSparse.

func (*Uring) SocketRaw

func (ur *Uring) SocketRaw(sqeCtx SQEContext, domain, typ, proto int, options ...OpOptionFunc) error

SocketRaw creates a socket using io_uring. The fd field in sqeCtx is ignored; for this opcode the SQE's fd field carries the socket domain instead.

func (*Uring) Splice

func (ur *Uring) Splice(sqeCtx SQEContext, fdIn int, n int, options ...OpOptionFunc) error

Splice transfers data between file descriptors.

func (*Uring) Start

func (ur *Uring) Start() (err error)

Start initializes the io_uring instance with buffers and enables the ring. Context pools are constructed eagerly by New and are intentionally not reset here so any SQEs borrowed before Start remain valid.

func (*Uring) Statx

func (ur *Uring) Statx(sqeCtx SQEContext, path string, flags, mask int, stat *Statx, options ...OpOptionFunc) error

Statx gets file status with extended information.

func (*Uring) Stop

func (ur *Uring) Stop() error

Stop tears down ring-owned resources and makes the ring permanently unusable. It is idempotent. On single-issuer rings it is not safe for concurrent use with submit, Wait/enter, or ResizeRings; caller must serialize those operations.

func (*Uring) SubmitAcceptDirectMultishot

func (ur *Uring) SubmitAcceptDirectMultishot(sqeCtx SQEContext, fileIndex uint32, options ...OpOptionFunc) error

SubmitAcceptDirectMultishot submits the raw kernel multishot accept-direct opcode. Each accepted connection uses the next available slot from auto-allocation. Requires IORING_FILE_INDEX_ALLOC as fileIndex for auto-allocation.

func (*Uring) SubmitAcceptMultishot

func (ur *Uring) SubmitAcceptMultishot(sqeCtx SQEContext, options ...OpOptionFunc) error

SubmitAcceptMultishot submits the raw kernel multishot accept opcode. For the managed subscription helper, use Uring.AcceptMultishot.

func (*Uring) SubmitExtended

func (ur *Uring) SubmitExtended(sqeCtx SQEContext) error

SubmitExtended submits an SQE using Extended mode context. The ExtSQE.SQE fields must be populated before calling this method. The io_uring.user_data field is set to the SQEContext (pointer + mode bits).

func (*Uring) SubmitReceiveBundleMultishot

func (ur *Uring) SubmitReceiveBundleMultishot(sqeCtx SQEContext, pollFD PollFd, options ...OpOptionFunc) error

SubmitReceiveBundleMultishot submits the raw kernel multishot bundle receive opcode. It combines multishot with bundle reception for high-throughput raw CQE flows.

func (*Uring) SubmitReceiveMultishot

func (ur *Uring) SubmitReceiveMultishot(sqeCtx SQEContext, pollFD PollFd, b []byte, options ...OpOptionFunc) error

SubmitReceiveMultishot submits the raw kernel multishot receive opcode. For the managed subscription helper, use Uring.ReceiveMultishot.

func (*Uring) SymlinkAt

func (ur *Uring) SymlinkAt(sqeCtx SQEContext, target, linkpath string, options ...OpOptionFunc) error

SymlinkAt creates a symbolic link.

func (*Uring) Sync

func (ur *Uring) Sync(sqeCtx SQEContext, options ...OpOptionFunc) error

Sync performs a file sync operation.

func (*Uring) SyncFileRange

func (ur *Uring) SyncFileRange(sqeCtx SQEContext, offset int64, length int, syncFlags int, options ...OpOptionFunc) error

SyncFileRange syncs a file range to storage.

func (*Uring) TCP4Socket

func (ur *Uring) TCP4Socket(sqeCtx SQEContext, options ...OpOptionFunc) error

TCP4Socket creates a TCP IPv4 socket.

func (*Uring) TCP4SocketDirect

func (ur *Uring) TCP4SocketDirect(sqeCtx SQEContext, options ...OpOptionFunc) error

TCP4SocketDirect creates a TCP IPv4 socket directly into a registered file table slot. Uses IORING_FILE_INDEX_ALLOC for auto-allocation; returns slot index in CQE res.

func (*Uring) TCP6Socket

func (ur *Uring) TCP6Socket(sqeCtx SQEContext, options ...OpOptionFunc) error

TCP6Socket creates a TCP IPv6 socket.

func (*Uring) TCP6SocketDirect

func (ur *Uring) TCP6SocketDirect(sqeCtx SQEContext, options ...OpOptionFunc) error

TCP6SocketDirect creates a TCP IPv6 socket directly into a registered file table slot. Uses IORING_FILE_INDEX_ALLOC for auto-allocation; returns slot index in CQE res.

func (*Uring) Tee

func (ur *Uring) Tee(sqeCtx SQEContext, fdIn int, length int, options ...OpOptionFunc) error

Tee duplicates data between pipes.

func (*Uring) Timeout

func (ur *Uring) Timeout(sqeCtx SQEContext, d time.Duration, options ...OpOptionFunc) error

Timeout submits a timeout request with the specified duration.

func (*Uring) TimeoutRemove

func (ur *Uring) TimeoutRemove(sqeCtx SQEContext, userData uint64, options ...OpOptionFunc) error

TimeoutRemove removes a timeout request.

func (*Uring) TimeoutUpdate

func (ur *Uring) TimeoutUpdate(sqeCtx SQEContext, userData uint64, d time.Duration, absolute bool, options ...OpOptionFunc) error

TimeoutUpdate modifies an existing timeout request in place, updating its expiration atomically rather than removing and re-adding it.

Parameters:

  • sqeCtx: Context for this update operation
  • userData: userData of the target timeout to update
  • d: New timeout duration
  • absolute: If true, d is treated as absolute time; if false, relative from now

func (*Uring) UDP4Socket

func (ur *Uring) UDP4Socket(sqeCtx SQEContext, options ...OpOptionFunc) error

UDP4Socket creates a UDP IPv4 socket.

func (*Uring) UDP4SocketDirect

func (ur *Uring) UDP4SocketDirect(sqeCtx SQEContext, options ...OpOptionFunc) error

UDP4SocketDirect creates a UDP IPv4 socket directly into a registered file table slot. Uses IORING_FILE_INDEX_ALLOC for auto-allocation; returns slot index in CQE res.

func (*Uring) UDP6Socket

func (ur *Uring) UDP6Socket(sqeCtx SQEContext, options ...OpOptionFunc) error

UDP6Socket creates a UDP IPv6 socket.

func (*Uring) UDP6SocketDirect

func (ur *Uring) UDP6SocketDirect(sqeCtx SQEContext, options ...OpOptionFunc) error

UDP6SocketDirect creates a UDP IPv6 socket directly into a registered file table slot. Uses IORING_FILE_INDEX_ALLOC for auto-allocation; returns slot index in CQE res.

func (*Uring) UDPLITE4Socket

func (ur *Uring) UDPLITE4Socket(sqeCtx SQEContext, options ...OpOptionFunc) error

UDPLITE4Socket creates a UDP-Lite IPv4 socket.

func (*Uring) UDPLITE4SocketDirect

func (ur *Uring) UDPLITE4SocketDirect(sqeCtx SQEContext, options ...OpOptionFunc) error

UDPLITE4SocketDirect creates a UDP-Lite IPv4 socket directly into a registered file table slot. Uses IORING_FILE_INDEX_ALLOC for auto-allocation; returns slot index in CQE res.

func (*Uring) UDPLITE6Socket

func (ur *Uring) UDPLITE6Socket(sqeCtx SQEContext, options ...OpOptionFunc) error

UDPLITE6Socket creates a UDP-Lite IPv6 socket.

func (*Uring) UDPLITE6SocketDirect

func (ur *Uring) UDPLITE6SocketDirect(sqeCtx SQEContext, options ...OpOptionFunc) error

UDPLITE6SocketDirect creates a UDP-Lite IPv6 socket directly into a registered file table slot. Uses IORING_FILE_INDEX_ALLOC for auto-allocation; returns slot index in CQE res.

func (*Uring) UnixSocket

func (ur *Uring) UnixSocket(sqeCtx SQEContext, options ...OpOptionFunc) error

UnixSocket creates a Unix domain socket.

func (*Uring) UnixSocketDirect

func (ur *Uring) UnixSocketDirect(sqeCtx SQEContext, options ...OpOptionFunc) error

UnixSocketDirect creates a Unix domain socket directly into a registered file table slot. Uses IORING_FILE_INDEX_ALLOC for auto-allocation; returns slot index in CQE res.

func (*Uring) UnlinkAt

func (ur *Uring) UnlinkAt(sqeCtx SQEContext, path string, flags int, options ...OpOptionFunc) error

UnlinkAt removes a file or directory.

func (*Uring) UnregisterFiles

func (ur *Uring) UnregisterFiles() error

UnregisterFiles removes all registered file descriptors. After unregistering, IOSQE_FIXED_FILE flag must not be used until new files are registered.

func (*Uring) UnregisterNAPI

func (ur *Uring) UnregisterNAPI() error

UnregisterNAPI disables NAPI busy polling for this ring.

func (*Uring) UringCmd

func (ur *Uring) UringCmd(sqeCtx SQEContext, cmdOp uint32, cmdData []byte, options ...OpOptionFunc) error

UringCmd submits a generic passthrough command. The cmdOp specifies the command operation, and cmdData provides optional data. The framework retains cmdData until the kernel consumes the SQE. Because the kernel driver may hold cmdData asynchronously until the CQE is posted, caller must keep cmdData valid until the completion is reaped.

func (*Uring) UringCmd128

func (ur *Uring) UringCmd128(sqeCtx SQEContext, cmdOp uint32, cmdData []byte, options ...OpOptionFunc) error

UringCmd128 submits a 128-byte passthrough command. It provides up to 80 bytes of inline command data inside the SQE and requires a ring created with Options.SQE128; otherwise it returns ErrNotSupported.
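
The 80-byte figure follows from the SQE layout: in struct io_uring_sqe the command area starts at byte offset 48, so a 128-byte SQE leaves 128 - 48 = 80 bytes. The sketch below shows that arithmetic and a caller-side size check. inlineCmdCapacity and fitsInline are hypothetical helpers and say nothing about how this package copies cmdData.

```go
package main

import "fmt"

const (
	sqeSize64  = 64  // standard SQE size in bytes
	sqeSize128 = 128 // SQE size on rings created with SQE128
	cmdOffset  = 48  // offset of the cmd area in struct io_uring_sqe
)

// inlineCmdCapacity returns how many command bytes fit inside the SQE itself.
func inlineCmdCapacity(sqe128 bool) int {
	if sqe128 {
		return sqeSize128 - cmdOffset
	}
	return sqeSize64 - cmdOffset
}

// fitsInline (hypothetical) checks cmdData against the inline capacity.
func fitsInline(cmdData []byte, sqe128 bool) bool {
	return len(cmdData) <= inlineCmdCapacity(sqe128)
}

func main() {
	fmt.Println(inlineCmdCapacity(false), inlineCmdCapacity(true)) // 16 80
	payload := make([]byte, 64)
	fmt.Println(fitsInline(payload, false), fitsInline(payload, true)) // false true
}
```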

func (*Uring) Wait

func (ur *Uring) Wait(cqes []CQEView) (n int, err error)

Wait flushes pending submissions, drives deferred task work when needed, and collects completion events into cqes. On single-issuer rings it is not safe for concurrent use with submit, Stop, or ResizeRings; caller must serialize those operations. It returns the number of events received, ErrCQOverflow when the ring enters CQ overflow and no CQEs are immediately claimable, or `iox.ErrWouldBlock` if the CQ is empty.

CQEView provides direct field access to Res and Flags, and methods to access the submission context based on mode (Direct, Indirect, Extended).

Example:

cqes := make([]CQEView, 64)
n, err := ring.Wait(cqes)
if err != nil {
    return err // ErrCQOverflow, or iox.ErrWouldBlock when the CQ is empty
}
for i := range n {
    cqe := &cqes[i]
    if cqe.Res < 0 {
        handleCompletionError(cqe.Op(), cqe.Res)
        continue
    }
    if cqe.Extended() {
        ext := cqe.ExtSQE()
        ctx := ViewCtx(ext).Vals1()
        seq := ctx.Val1
        handleCompletion(cqe.Res, seq)
    }
}

func (*Uring) WaitDirect

func (ur *Uring) WaitDirect(cqes []DirectCQE) (int, error)

WaitDirect retrieves completion events using Direct mode fast-path. This method skips mode detection since all CQEs are assumed to be from Direct mode submissions (PackDirect).

For applications using only Direct mode, this skips the mode dispatch that Wait([]CQEView) performs per CQE.

On single-issuer rings it is not safe for concurrent use with submit, Stop, or ResizeRings; caller must serialize those operations. Returns the number of CQEs retrieved, ErrCQOverflow when the ring enters CQ overflow and no CQEs are immediately claimable, or iox.ErrWouldBlock if none are available.

func (*Uring) WaitExtended

func (ur *Uring) WaitExtended(cqes []ExtCQE) (int, error)

WaitExtended retrieves completion events using Extended mode fast-path. This method skips mode detection since all CQEs are assumed to be from Extended mode submissions (PackExtended).

For applications using only Extended mode, this skips the mode dispatch that Wait([]CQEView) performs per CQE.

On single-issuer rings it is not safe for concurrent use with submit, Stop, or ResizeRings; caller must serialize those operations. Returns the number of CQEs retrieved, ErrCQOverflow when the ring enters CQ overflow and no CQEs are immediately claimable, or iox.ErrWouldBlock if none are available.

func (*Uring) Waitid

func (ur *Uring) Waitid(sqeCtx SQEContext, idtype, id int, infop unsafe.Pointer, options int, opts ...OpOptionFunc) error

Waitid waits for a process to change state asynchronously. idtype specifies which id to wait for (P_PID, P_PGID, P_ALL). The siginfo_t result is written to infop. Caller must keep infop valid until the operation completes.

func (*Uring) WithExtSQE

func (ur *Uring) WithExtSQE(fn func(ext *ExtSQE) error) error

WithExtSQE executes fn with an ExtSQE from the pool. The ExtSQE is automatically returned to the pool if fn returns an error.

If fn returns nil, the caller is responsible for eventually returning the ExtSQE to the pool (typically via CQE handler). Typed or raw UserData views produced inside fn are borrowed and must not escape past PutExtSQE.

Returns iox.ErrWouldBlock if the pool is exhausted.

Example:

err := ring.WithExtSQE(func(ext *uring.ExtSQE) error {
    ctx := uring.ViewCtx(ext).Vals0()
    ctx.Fn = handler
    ctx.Data = data

    sqeCtx := uring.PackExtended(ext)
    return ring.Read(sqeCtx, buf)
})

func (*Uring) Write

func (ur *Uring) Write(sqeCtx SQEContext, b []byte, options ...OpOptionFunc) error

Write performs a write operation. Caller must keep b valid until the operation completes.

func (*Uring) WriteFixed

func (ur *Uring) WriteFixed(sqeCtx SQEContext, bufIndex int, n int, options ...OpOptionFunc) error

WriteFixed performs a write with a registered (fixed) buffer.

func (*Uring) WriteV

func (ur *Uring) WriteV(sqeCtx SQEContext, iovs [][]byte, options ...OpOptionFunc) error

WriteV performs a vectored write operation. Caller must keep iovs and their backing buffers valid until the operation completes.

func (*Uring) WritevFixed

func (ur *Uring) WritevFixed(sqeCtx SQEContext, offset int64, bufIndices []int, lengths []int, options ...OpOptionFunc) error

WritevFixed performs a vectored write using registered buffers. All buffer indices must refer to previously registered buffers.

func (*Uring) ZCRXExport

func (ur *Uring) ZCRXExport(zcrxID uint32) (int, error)

ZCRXExport exports a ZCRX instance for cross-ring sharing. Returns a file descriptor that can be passed to another process. Requires kernel 6.19+.

func (*Uring) ZCRXFlushRQ

func (ur *Uring) ZCRXFlushRQ(zcrxID uint32) error

ZCRXFlushRQ flushes the ZCRX refill queue. This ensures all pending refill queue entries are processed.

type VastBuffer

type VastBuffer = iobuf.VastBuffer

Buffer types re-exported from iobuf.

type ZCHandler

type ZCHandler interface {
	// OnCompleted is called when the send CQE arrives.
	// The result is the number of bytes sent or a negative errno value.
	// The buffer must not be modified until OnNotification runs.
	OnCompleted(result int32)

	// OnNotification is called when the buffer can be safely reused.
	// This is the second CQE in the zero-copy two-CQE model and it carries
	// the original send result observed at completion time.
	OnNotification(result int32)
}

ZCHandler handles zero-copy send completion events.

type ZCRXAreaReg

type ZCRXAreaReg struct {
	Addr        uint64    // Base address of the area
	Len         uint64    // Length of the area
	RqAreaToken uint64    // Token for RQ area
	Flags       uint32    // IORING_ZCRX_AREA_* flags
	DmabufFD    uint32    // DMA buffer file descriptor
	Resv        [2]uint64 // Reserved
}

ZCRXAreaReg is the area registration for ZCRX. Matches struct io_uring_zcrx_area_reg in Linux.

type ZCRXBuffer

type ZCRXBuffer struct {
	// contains filtered or unexported fields
}

ZCRXBuffer wraps a delivered ZCRX receive view. Payload memory is kernel-owned until Release returns nil. After a successful release, the buffer must not be reused.

func (*ZCRXBuffer) Bytes

func (b *ZCRXBuffer) Bytes() []byte

Bytes returns the received data. Valid until Release.

func (*ZCRXBuffer) Len

func (b *ZCRXBuffer) Len() int

Len returns the received data length.

func (*ZCRXBuffer) Release

func (b *ZCRXBuffer) Release() error

Release returns the buffer to the kernel via the refill queue. Call on the CQE path. Returns iox.ErrWouldBlock when the refill ring is full.
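
Since Release can report a full refill ring, a caller needs somewhere to park buffers until a later pass. The following is one possible sketch of such a queue. releaser, releaseQueue, and errWouldBlock are hypothetical stand-ins (errWouldBlock plays the role of iox.ErrWouldBlock), and the sketch assumes both methods run on the CQE path.

```go
package main

import (
	"errors"
	"fmt"
)

// errWouldBlock is a stand-in for iox.ErrWouldBlock in this sketch.
var errWouldBlock = errors.New("would block")

// releaser mirrors the documented Release method of ZCRXBuffer.
type releaser interface{ Release() error }

// releaseQueue (hypothetical) parks buffers whose release hit a full ring.
type releaseQueue struct{ pending []releaser }

// release tries to release buf now and queues it when the ring is full.
func (q *releaseQueue) release(buf releaser) error {
	err := buf.Release()
	if errors.Is(err, errWouldBlock) {
		q.pending = append(q.pending, buf)
		return nil
	}
	return err
}

// retry releases queued buffers in order and stops at the first one that
// still cannot be released.
func (q *releaseQueue) retry() error {
	for len(q.pending) > 0 {
		err := q.pending[0].Release()
		if errors.Is(err, errWouldBlock) {
			return nil // still full; try again on a later pass
		}
		if err != nil {
			return err
		}
		q.pending = q.pending[1:]
	}
	return nil
}

// fakeBuf simulates a buffer whose refill ring may be full.
type fakeBuf struct {
	full     *bool
	released bool
}

func (b *fakeBuf) Release() error {
	if *b.full {
		return errWouldBlock
	}
	b.released = true
	return nil
}

func main() {
	full := true
	q := &releaseQueue{}
	b := &fakeBuf{full: &full}
	_ = q.release(b)
	fmt.Println(len(q.pending), b.released)
	full = false
	_ = q.retry()
	fmt.Println(len(q.pending), b.released)
}
```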

type ZCRXConfig

type ZCRXConfig struct {
	IfName       string // Network interface name (e.g., "eth0"). Required.
	RxQueue      int    // NIC RX queue index. Required.
	AreaSize     int    // Mapped receive area size in bytes.
	RqEntries    int    // Refill queue entries. Must be power of 2.
	ChunkSize    int    // Refill slot size. Must be a power-of-two multiple of page size.
	UseHugePages bool   // Use huge pages for the area.
}

ZCRXConfig configures a ZCRX receive instance bound to a NIC RX queue.

type ZCRXCqe

type ZCRXCqe struct {
	Off uint64 // Offset into the ZCRX area
	// contains filtered or unexported fields
}

ZCRXCqe is a zero-copy receive completion queue entry extension. Matches struct io_uring_zcrx_cqe in Linux.

type ZCRXCtrl

type ZCRXCtrl struct {
	ZcrxID uint32    // ZCRX instance ID
	Op     uint32    // ZCRX_CTRL_* operation
	Resv   [2]uint64 // Reserved
	// Union: either ZcExport or ZcFlush based on Op
	Data [48]byte // Large enough for both structures
}

ZCRXCtrl is the control structure for ZCRX operations. Matches struct zcrx_ctrl in Linux.

type ZCRXHandler

type ZCRXHandler interface {
	// OnData handles received data. Return false for best-effort stop. Zero-length buffers mark TCP EOF and carry no refill identity.
	OnData(buf *ZCRXBuffer) bool

	// OnError handles a CQE error. Return false for best-effort stop.
	OnError(err error) bool

	// OnStopped runs once on the CQE path during terminal retirement before state becomes Stopped.
	OnStopped()
}

ZCRXHandler handles ZCRX receive events.

type ZCRXIfqReg

type ZCRXIfqReg struct {
	IfIdx     uint32      // Network interface index
	IfRxq     uint32      // RX queue index
	RqEntries uint32      // Number of refill queue entries
	Flags     uint32      // ZCRX_REG_* flags
	AreaPtr   uint64      // Pointer to ZCRXAreaReg
	RegionPtr uint64      // Pointer to memory region descriptor
	Offsets   ZCRXOffsets // Offsets within the ring
	ZcrxID    uint32      // ZCRX instance ID (output)
	RxBufLen  uint32      // Chunk size hint; 0 defaults to page size
	Resv      [3]uint64   // Reserved
}

ZCRXIfqReg is the interface queue registration for ZCRX. Matches struct io_uring_zcrx_ifq_reg in Linux.

type ZCRXOffsets

type ZCRXOffsets struct {
	Head uint32 // Head offset
	Tail uint32 // Tail offset
	Rqes uint32 // RQE array offset

	Resv [2]uint64 // Reserved
	// contains filtered or unexported fields
}

ZCRXOffsets describes the layout of a ZCRX refill queue ring. Matches struct io_uring_zcrx_offsets in Linux.

type ZCRXReceiver

type ZCRXReceiver struct {
	// contains filtered or unexported fields
}

ZCRXReceiver manages a ZCRX receive lifecycle. External Stop and Close must not race CQE dispatch or terminal retirement. Call Close only after Stopped reports true.

func NewZCRXReceiver

func NewZCRXReceiver(ring *Uring, pool *ContextPools, cfg ZCRXConfig) (*ZCRXReceiver, error)

NewZCRXReceiver creates a ZCRX receiver. Returns ErrNotSupported if ZCRX is unavailable.

func (*ZCRXReceiver) Active

func (r *ZCRXReceiver) Active() bool

Active reports whether the receiver is receiving.

func (*ZCRXReceiver) Close

func (r *ZCRXReceiver) Close() error

Close releases ZCRX resources after the receiver has retired and the owning ring has been stopped. Returns iox.ErrWouldBlock while retirement is still in flight or while the ring still owns the ZCRX IFQ registration. Caller must drain all in-flight operations before calling Close. Close is not safe for concurrent use. Safe to call more than once; subsequent calls return nil.

func (*ZCRXReceiver) Start

func (r *ZCRXReceiver) Start(fd iofd.FD, handler ZCRXHandler) error

Start submits the multishot ZCRX receive. Call at most once. fd must remain valid until terminal retirement.

func (*ZCRXReceiver) Stop

func (r *ZCRXReceiver) Stop() error

Stop requests cancellation of the active multishot receive. Leaves the receiver active if cancel submission fails.

func (*ZCRXReceiver) Stopped

func (r *ZCRXReceiver) Stopped() bool

Stopped reports whether terminal retirement is complete. Close still requires the owning ring to have been stopped.

type ZCRXRqe

type ZCRXRqe struct {
	Off uint64 // Offset into the ZCRX area
	Len uint32 // Length of the buffer
	// contains filtered or unexported fields
}

ZCRXRqe is a zero-copy receive refill queue entry. Matches struct io_uring_zcrx_rqe in Linux.

type ZCSubscriber

type ZCSubscriber struct {
	// contains filtered or unexported fields
}

ZCSubscriber adapts functions to `ZCHandler`.

func NewZCSubscriber

func NewZCSubscriber() *ZCSubscriber

NewZCSubscriber creates a subscriber with default handlers.

func (*ZCSubscriber) Handler

func (s *ZCSubscriber) Handler() ZCHandler

Handler returns `s` as a `ZCHandler`.

func (*ZCSubscriber) OnCompleted

func (s *ZCSubscriber) OnCompleted(fn func(result int32)) *ZCSubscriber

OnCompleted sets the completed handler.

func (*ZCSubscriber) OnNotification

func (s *ZCSubscriber) OnNotification(fn func(result int32)) *ZCSubscriber

OnNotification sets the notification handler.

type ZCTracker

type ZCTracker struct {
	// contains filtered or unexported fields
}

ZCTracker manages zero-copy send operations through the two-CQE model. Zero-copy sends produce two CQEs:

  1. Operation CQE (IORING_CQE_F_MORE) - send completed, buffer still in use
  2. Notification CQE (IORING_CQE_F_NOTIF) - buffer can be reused

The tracker ensures:

  • Handlers are invoked in the correct order
  • Each handler is invoked exactly once
  • ExtSQE is returned to pool only after notification
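
The two-CQE sequence can be modeled with a small state holder. This is an illustrative sketch, not the tracker's implementation: the flag constants are local copies of the Linux UAPI values, zcHandler mirrors the ZCHandler interface, and zcOp and recorder are hypothetical. It assumes a first CQE without IORING_CQE_F_MORE is terminal.

```go
package main

import "fmt"

// Local copies of the Linux UAPI values (include/uapi/linux/io_uring.h).
const (
	cqeFMore  uint32 = 1 << 1 // IORING_CQE_F_MORE
	cqeFNotif uint32 = 1 << 3 // IORING_CQE_F_NOTIF
)

// zcHandler mirrors the ZCHandler interface shown in this documentation.
type zcHandler interface {
	OnCompleted(result int32)
	OnNotification(result int32)
}

// zcOp (hypothetical) tracks one in-flight zero-copy send.
type zcOp struct {
	handler zcHandler
	result  int32 // send result saved from the operation CQE
}

// handle processes one CQE for this operation and reports whether the
// operation is finished, so its context may go back to the pool.
func (op *zcOp) handle(res int32, flags uint32) (finished bool) {
	if flags&cqeFNotif != 0 {
		// Second CQE: the buffer is reusable; pass the saved send result.
		op.handler.OnNotification(op.result)
		return true
	}
	// First CQE: the send completed but the buffer may still be in use.
	op.result = res
	op.handler.OnCompleted(res)
	// Assumption: without F_MORE no notification CQE will follow.
	return flags&cqeFMore == 0
}

// recorder logs handler calls in order for the demo.
type recorder struct{ events []string }

func (r *recorder) OnCompleted(res int32) {
	r.events = append(r.events, fmt.Sprintf("completed %d", res))
}

func (r *recorder) OnNotification(res int32) {
	r.events = append(r.events, fmt.Sprintf("notified %d", res))
}

func main() {
	r := &recorder{}
	op := &zcOp{handler: r}
	fmt.Println(op.handle(512, cqeFMore)) // false: notification still pending
	fmt.Println(op.handle(0, cqeFNotif))  // true: safe to recycle
	fmt.Println(r.events)
}
```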

func NewZCTracker

func NewZCTracker(ring *Uring, pool *ContextPools) *ZCTracker

NewZCTracker creates a tracker for managing zero-copy sends.

func (*ZCTracker) HandleCQE

func (t *ZCTracker) HandleCQE(cqe CQEView) bool

HandleCQE processes a CQE that may be a zero-copy completion or notification. Returns true if the CQE was handled, false if it's not a ZC tracker CQE.

func (*ZCTracker) SendZC

func (t *ZCTracker) SendZC(fd iofd.FD, buf []byte, handler ZCHandler, options ...OpOptionFunc) error

SendZC submits a zero-copy send operation. The handler receives OnCompleted for the send CQE and OnNotification when the buffer can be safely reused. Returns iox.ErrWouldBlock if the context pool is exhausted.

func (*ZCTracker) SendZCFixed

func (t *ZCTracker) SendZCFixed(fd iofd.FD, bufIndex int, offset int, length int, handler ZCHandler, options ...OpOptionFunc) error

SendZCFixed submits a zero-copy send using a registered buffer. Returns iox.ErrWouldBlock if the context pool is exhausted.
