Documentation
¶
Overview ¶
Package engine defines the core Engine interface and associated types used by all I/O engine implementations (io_uring, epoll, adaptive, std).
This package is the dependency root for engine-related types. User code typically interacts with engines through the top-level celeris package rather than importing this package directly.
Documentation ¶
Full guides and examples: https://goceleris.dev/docs/engines
Index ¶
Constants ¶
This section is empty.
Variables ¶
var ErrQueueFull = errors.New("celeris/engine: worker write queue full")
ErrQueueFull is returned by WorkerLoop.Write when the worker's outbound queue is full and cannot accept more data without blocking. Callers should apply backpressure to their data source rather than retry in a tight loop.
var ErrSwitchingNotFrozen = errors.New("celeris/engine: adaptive engine switching is not frozen; call FreezeSwitching before registering driver connections")
ErrSwitchingNotFrozen is returned by adaptive engines when a driver attempts to register a connection without first calling SwitchFreezer.FreezeSwitching. Driver FDs cannot migrate across epoll/io_uring worker boundaries, so the adaptive engine must be held on a single sub-engine for the driver's lifetime.
var ErrUnknownFD = errors.New("celeris/engine: file descriptor not registered")
ErrUnknownFD is returned when a WorkerLoop operation references a file descriptor that is not registered on that worker.
Functions ¶
This section is empty.
Types ¶
type AcceptController ¶ added in v0.3.0
type AcceptController interface {
// PauseAccept stops accepting new connections. Existing connections continue.
PauseAccept() error
// ResumeAccept resumes accepting new connections after a pause.
ResumeAccept() error
}
AcceptController is implemented by engines that support dynamic accept control, used by the adaptive engine to pause/resume individual sub-engines during switches.
type CPUMonitor ¶ added in v1.5.0
CPUMonitor samples CPU utilization. It is the public interface accepted by the adaptive engine so external callers can supply their own monitor (or the built-in /proc/stat implementation) without depending on an internal package. Implementations must be safe for concurrent use: a single monitor may be sampled from more than one goroutine.
type CPUSample ¶ added in v1.5.0
CPUSample is a point-in-time CPU utilization measurement supplied by a CPUMonitor. Utilization is a fraction in [0,1]; zero means "no signal" and is the documented fallback when system-wide CPU data is unavailable.
type CapabilityProfile ¶
type CapabilityProfile struct {
// OS is the operating system name (e.g. "linux").
OS string
// KernelVersion is the full kernel version string (e.g. "5.15.0-91-generic").
KernelVersion string
// KernelMajor is the major kernel version number.
KernelMajor int
// KernelMinor is the minor kernel version number.
KernelMinor int
// IOUringTier is the detected io_uring capability tier (None through Optional).
IOUringTier Tier
// EpollAvailable is true if epoll is supported on this system.
EpollAvailable bool
// MultishotAccept is true if io_uring multishot accept is available (kernel 5.19+).
MultishotAccept bool
// MultishotRecv is true if io_uring multishot recv is available (kernel 5.19+).
MultishotRecv bool
// ProvidedBuffers is true if io_uring provided buffer rings are available (kernel 5.19+).
ProvidedBuffers bool
// SQPoll is true if io_uring SQ polling mode is available (kernel 6.0+).
SQPoll bool
// CoopTaskrun is true if IORING_SETUP_COOP_TASKRUN is available (kernel 5.19+).
CoopTaskrun bool
// SingleIssuer is true if IORING_SETUP_SINGLE_ISSUER is available (kernel 6.0+).
SingleIssuer bool
// LinkedSQEs is true if io_uring linked SQE chains are supported.
LinkedSQEs bool
// DeferTaskrun is true if IORING_SETUP_DEFER_TASKRUN is available (kernel 6.1+).
DeferTaskrun bool
// FixedFiles is true if io_uring fixed file descriptors are available.
FixedFiles bool
// SendZC is true if io_uring zero-copy send is available.
SendZC bool
// Sendfile is true if sendfile(2) is available for zero-copy file
// responses. True on Linux (kernel 2.6.33+, which every supported
// distro is well past); false on non-Linux platforms. The epoll
// engine implements engine.SendfileCapable and the H1 static-file
// response path uses it (Context.File, middleware/static); engines
// without SendfileCapable (iouring, std) fall back to read+write.
Sendfile bool
// Zerocopy is true if a MSG_ZEROCOPY userspace-buffer send path is
// available. Currently always false: the engine does not ship a
// MSG_ZEROCOPY send path (a correct one needs SO_ZEROCOPY +
// sendmsg(MSG_ZEROCOPY) + errqueue completion draining + buffer
// pinning). sendfile(2) covers the zero-copy file-serving workload
// via the Sendfile flag above; iouring has IORING_OP_SEND_ZC (kernel
// 6.0+) which is a separate flag (SendZC above). TCP MSG_ZEROCOPY
// landed in Linux 4.14, UDP in 5.0 — flip this on (gated at 4.14)
// only when the send path is actually implemented.
Zerocopy bool
// NumCPU is the number of logical CPUs.
NumCPU int
// NUMANodes is the number of NUMA nodes detected.
NUMANodes int
}
CapabilityProfile describes the I/O capabilities detected on the current system. The probe package populates this at startup to guide engine selection.
func NewDefaultProfile ¶
func NewDefaultProfile() CapabilityProfile
NewDefaultProfile returns a CapabilityProfile with safe defaults.
type Carryover ¶ added in v1.5.6
type Carryover struct {
// RemoteAddr is the peer address string captured by the source engine, so
// the destination need not re-Getpeername the fd.
RemoteAddr string
// Buffered holds bytes the source engine had already read off the socket but
// not yet processed — buffered pipelined NEXT-request bytes that sat in the
// HTTP/1 parser at a clean request boundary (#383). The destination replays
// them through its parser before arming its first read. Empty in the common
// case (the socket was fully drained at the boundary). Only ever carries
// next-request bytes — the source guarantees no in-progress request remains
// (AtRequestBoundary), so a fresh parser can consume them correctly.
Buffered []byte
}
Carryover is the minimal per-connection state transferred when a live connection is transplanted from one engine to another at a request boundary (#383). At a clean HTTP/1 boundary the connection is equivalent to a freshly accepted socket, so only cosmetic/identity state needs to move; any in-flight request bytes remain in the kernel socket receive buffer and are recv'd by the destination engine after it arms its first read.
type Engine ¶
type Engine interface {
// Listen starts the engine and blocks until ctx is canceled or a fatal
// error occurs. The engine begins accepting connections on the configured address.
Listen(ctx context.Context) error
// Shutdown gracefully drains in-flight connections. The provided context
// controls the deadline; if it expires, remaining connections are closed.
Shutdown(ctx context.Context) error
// Metrics returns a point-in-time snapshot of engine performance counters.
Metrics() EngineMetrics
// Type returns the engine type identifier (IOUring, Epoll, Adaptive, or Std).
Type() EngineType
// Addr returns the bound listener address, or nil if not yet listening.
Addr() net.Addr
}
Engine is the interface that all I/O engine implementations must satisfy. Implementations include io_uring, epoll, adaptive, and the standard library net/http server. Engine methods are safe for concurrent use after Listen is called.
type EngineMetrics ¶
type EngineMetrics struct {
// RequestCount is the cumulative number of requests handled by this engine.
RequestCount uint64
// ActiveConnections is the current number of open connections.
ActiveConnections int64
// ErrorCount is the cumulative number of connection-level or protocol errors.
ErrorCount uint64
// Throughput is the recent requests-per-second rate.
Throughput float64
// AsyncRoutes is the count of routes registered with .Async(true) on
// this engine's handler. Static after Listen — derived from the
// router's per-route async flags and exposed for diagnostics so
// operators can see how many handlers run on the per-conn dispatch
// goroutine vs inline on the worker. Zero on engines whose handler
// does not implement [github.com/goceleris/celeris/protocol/h2/stream.AsyncRouteResolver].
AsyncRoutes int
// AsyncPromotedConns is the cumulative number of connections that
// have been promoted from inline-on-worker to the per-conn dispatch
// goroutine via per-handler async (celeris #300). Counts promotions,
// not currently-promoted conns — every fresh async-mode connection
// that hits an async route increments this once. Useful to observe
// how often the inline → goroutine handoff fires vs the pure-sync
// inline fast path.
AsyncPromotedConns uint64
// Workers is the number of I/O workers (io_uring) or event loops
// (epoll) the engine is running. Static after Listen. The adaptive
// controller divides ActiveConnections by it to derive the
// conns-per-worker load signal that drives engine selection.
Workers int
// AcceptCount is the cumulative number of connections accepted by this
// engine since start. Together with elapsed time it yields the accept
// rate (new-connection arrival rate) used as a secondary load signal.
AcceptCount uint64
// CloseCount is the cumulative number of connections closed by this
// engine since start. AcceptCount - CloseCount tracks the live count;
// a high close rate relative to accepts indicates short-lived
// churn-style connections.
CloseCount uint64
// BytesRead is the cumulative number of payload bytes received from
// the network across all connections. Used with BytesWritten and
// RequestCount to derive the average bytes-per-request signal that
// suppresses io_uring selection for link-bound (large-payload)
// workloads where the engines tie.
BytesRead uint64
// BytesWritten is the cumulative number of payload bytes sent to the
// network across all connections. See BytesRead.
BytesWritten uint64
// AdaptiveSwitches is the cumulative count of completed epoll⇄io_uring
// switches performed by the adaptive engine. Zero for non-adaptive
// engines, which never switch. Live switching is the adaptive engine's
// most complex path; surfacing the count lets ops and benchmarks
// correlate a throughput or tail-latency anomaly with switching activity
// (a rare switch transient can skew a single benchmark pass).
AdaptiveSwitches uint64
}
EngineMetrics is a point-in-time snapshot of engine-level performance counters. Each engine maintains internal atomic counters and populates a fresh snapshot on each Engine.Metrics call.
v1.5.0: LatencyP50 / LatencyP99 / LatencyP999 were removed (celeris#321). The fields were declared but never written by any sub-engine (the sub-engines don't track per-request latency histograms — they only count requests). The adaptive engine's aggregator at adaptive/engine.go was the only reader and it received zeros from both sub-engines, so removing the fields changes nothing observable. SyscallRate, referenced by the issue, never existed in the tree.
type EngineType ¶
type EngineType uint8 //nolint:revive // user-approved name
EngineType identifies which I/O engine implementation is in use.
const ( IOUring EngineType // io_uring-based engine Epoll // epoll-based engine Adaptive // runtime-adaptive engine selection Std // net/http stdlib engine )
I/O engine implementation types. The zero value is engineDefault which resolves to Adaptive on Linux and Std elsewhere (see resource.Config.WithDefaults).
func (EngineType) IsDefault ¶ added in v1.1.0
func (t EngineType) IsDefault() bool
IsDefault reports whether the engine type is the unset zero value.
func (EngineType) String ¶
func (t EngineType) String() string
String returns the human-readable engine type name (e.g. "io_uring", "epoll").
type EventLoopProvider ¶ added in v1.4.0
type EventLoopProvider interface {
// NumWorkers returns the number of worker event loops available for
// driver FD registration. This is typically equal to the engine's
// configured worker count (one per CPU core).
NumWorkers() int
// WorkerLoop returns the WorkerLoop for worker n, where 0 <= n < NumWorkers().
// Calling WorkerLoop with an out-of-range index panics.
WorkerLoop(n int) WorkerLoop
}
EventLoopProvider is implemented by engines that expose their per-worker event loops for database/cache driver integration. Drivers call WorkerLoop to obtain a specific worker and register file descriptors on it, sharing the HTTP server's epoll instance or io_uring rings.
A nil return from [celeris.Server.EventLoopProvider] (or a type assertion failure here) indicates the engine does not support driver integration (e.g., the standard net/http fallback). Drivers should fall back to the standalone mini event loop in driver/internal/eventloop in that case.
Implementations must be safe for concurrent use after the engine has begun listening.
type Protocol ¶
type Protocol uint8
Protocol represents the HTTP protocol version.
const ( HTTP1 Protocol // HTTP/1.1 H2C // HTTP/2 cleartext Auto // auto-negotiate )
HTTP protocol version constants. The zero value is protocolDefault which resolves to Auto (see resource.Config.WithDefaults).
type SendfileCapable ¶ added in v1.5.0
type SendfileCapable interface {
Sendfile(fdOut int, file *os.File, offset, length int64, headers []byte) (int64, error)
}
SendfileCapable is an optional interface implemented by engines that support zero-copy file responses via sendfile(2). The H1 static-file response path type-asserts the engine for it; engines that do not implement it (iouring, std) fall back to the buffered read+write path. See celeris#317. The epoll engine implements it.
The fdOut argument is the engine's per-connection socket FD. The caller drives the connState lifecycle (locking, dirty-list membership, timeout tracking, EAGAIN resume) — the engine's sendfile path is a syscall shim that advances as far as the kernel send buffer allows and reports backpressure for the caller to resume.
offset and length describe the slice of the source file to send. length ≤ 0 means "send to EOF" (the implementation stats the file and uses size-offset). The headers slice, when non-empty, is flushed via write(2) before the sendfile loop begins.
Returns the number of BODY bytes sent on this call and an error. A short send under kernel-send-buffer pressure surfaces EAGAIN/EWOULDBLOCK (via errors.Is) along with the partial body count, so the caller defers and resumes; the returned count is never corrupted on EAGAIN. EINTR is retried internally. The HEAD-request invariant (never send a body) is the caller's responsibility — the caller must not invoke Sendfile for a HEAD request.
type SwitchFreezer ¶ added in v0.3.0
type SwitchFreezer interface {
// FreezeSwitching prevents the adaptive engine from switching.
FreezeSwitching()
// UnfreezeSwitching allows engine switching to resume.
UnfreezeSwitching()
}
SwitchFreezer is implemented by the adaptive engine to allow external code (e.g., benchmarks) to temporarily prevent engine switches.
type Tier ¶
type Tier uint8
Tier represents an io_uring capability tier. Ordering forms a strict hierarchy suitable for >= comparisons.
const ( None Tier = iota // no io_uring support Base // kernel 5.10+ (LTS-stable io_uring baseline; linked SQE chains, single-shot accept/recv) High // kernel 5.19+ (multishot accept/recv, provided buffers, fixed files, COOP_TASKRUN, SINGLE_ISSUER) Optional // kernel 6.0+ (adds SQPOLL, SEND_ZC; 6.1+ swaps COOP_TASKRUN → DEFER_TASKRUN) )
io_uring capability tiers in ascending order of feature availability.
type TransplantTarget ¶ added in v1.5.6
type TransplantTarget interface {
// AdoptConn takes ownership of fd — a connected, non-blocking real socket
// sitting at an HTTP/1 request boundary — and begins serving it. The caller
// must have already removed fd from its own engine. On a returned error the
// caller still owns fd and is responsible for closing it.
AdoptConn(fd int, carry Carryover) error
}
TransplantTarget is implemented by an engine that can adopt a live, already connected file descriptor handed off from another engine mid-flight (#383).
Protocol: the SOURCE engine detaches the fd from its own event loop FIRST (EPOLL_CTL_DEL / ring-cancel + remove from its conn table), so the fd is owned by no engine at the moment AdoptConn is called; incoming bytes simply queue in the kernel socket receive buffer until the destination arms its read. Only then does it call AdoptConn.
Implementations must be safe to call from any goroutine; the actual attach is deferred onto the destination engine's own worker thread (where SQE/epoll submission is safe).
type WorkerLoop ¶ added in v1.4.0
type WorkerLoop interface {
// RegisterConn adds fd to the worker's interest set and installs the
// data and close callbacks. The fd is expected to be already connected
// and in non-blocking mode. Returns an error if fd is already registered
// on this or another worker.
RegisterConn(fd int, onRecv func([]byte), onClose func(error)) error
// UnregisterConn removes fd from the worker's interest set and triggers
// the onClose callback (with a nil error) after any in-flight operations
// complete. The caller is responsible for closing the fd itself.
UnregisterConn(fd int) error
// Write enqueues data for transmission on fd. The call does not block on
// the kernel send buffer: data is buffered and flushed asynchronously by
// the worker. Returns [ErrQueueFull] if the worker's outbound queue is
// saturated. The data slice may be retained by the worker until the
// write completes; callers must not modify it after calling Write.
Write(fd int, data []byte) error
// CPUID returns the CPU this worker is pinned to, or -1 if the worker is
// not pinned. Drivers may use this to co-locate allocations (e.g.,
// NUMA-aware buffers) with the worker.
CPUID() int
}
WorkerLoop is the per-worker surface that drivers use to register file descriptors, send data, and receive callbacks when data arrives or the connection closes.
Thread safety ¶
All methods are safe to call from any goroutine. [Write] is serialized per-FD internally (at most one send is in flight per descriptor), following the same invariant the HTTP path relies on.
Callback semantics ¶
The onRecv and onClose callbacks passed to [RegisterConn] are invoked from the worker's own goroutine. The []byte slice passed to onRecv is only valid for the duration of the callback — callers must copy before retaining it. onClose is called exactly once, either in response to [UnregisterConn] or when the kernel reports a closed/errored connection. The error argument to onClose is nil on orderly shutdown and non-nil on I/O errors.
Drivers must not call [RegisterConn] / [UnregisterConn] / [Write] from within an onRecv or onClose callback targeting the same FD; use a separate goroutine if such re-entrance is needed.
Source Files
¶
Directories
¶
| Path | Synopsis |
|---|---|
|
Package epoll implements the epoll-based I/O engine for Linux.
|
Package epoll implements the epoll-based I/O engine for Linux. |
|
internal
|
|
|
bindiag
Package bindiag holds the shared bind/listen retry + diagnostic helpers used by both the epoll and iouring engines.
|
Package bindiag holds the shared bind/listen retry + diagnostic helpers used by both the epoll and iouring engines. |
|
Package iouring implements an engine backed by Linux io_uring.
|
Package iouring implements an engine backed by Linux io_uring. |
|
Package std provides an engine implementation backed by net/http.
|
Package std provides an engine implementation backed by net/http. |