Documentation ¶
Overview ¶
Package wcprof implements cheap wall-clock profiling for the engine.
It records operation intervals, wait (blocked-on) intervals, and link events into an in-memory buffer, for offline analysis of where wall-clock time goes and which operation classes are bottlenecks. It is intentionally not OTel: events are fixed-size structs appended to sharded in-memory buffers, with all strings interned. The expensive work (graph reconstruction, counterfactual simulation) happens offline after dumping the buffer via the engine debug endpoint.
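As an illustration of the interning mentioned above, here is a minimal sketch of a string table that hands out small integer IDs. The `interner` type, its methods, and the choice to reserve ID 0 for the empty string are assumptions made for this example, not the package's actual internals.

```go
package main

import (
	"fmt"
	"sync"
)

// interner maps strings to small integer IDs.
// Hypothetical type for illustration; not part of wcprof's API.
type interner struct {
	mu   sync.Mutex
	ids  map[string]uint32
	strs []string
}

// newInterner returns a table where ID 0 is reserved for the empty string
// (an assumption of this sketch).
func newInterner() *interner {
	return &interner{ids: map[string]uint32{"": 0}, strs: []string{""}}
}

// intern returns the ID for s, assigning the next free ID on first sight.
func (in *interner) intern(s string) uint32 {
	in.mu.Lock()
	defer in.mu.Unlock()
	if id, ok := in.ids[s]; ok {
		return id
	}
	id := uint32(len(in.strs))
	in.ids[s] = id
	in.strs = append(in.strs, s)
	return id
}

// lookup returns the string for id, or "" when id is unknown.
func (in *interner) lookup(id uint32) string {
	in.mu.Lock()
	defer in.mu.Unlock()
	if int(id) >= len(in.strs) {
		return ""
	}
	return in.strs[id]
}

func main() {
	in := newInterner()
	a := in.intern("exec.runContainer")
	b := in.intern("exec.runContainer")
	fmt.Println(a == b, in.lookup(a)) // true exec.runContainer
}
```

With a table like this, each event only needs to store a fixed-size uint32 per string, and the table itself can be written out once alongside the events.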
Recording can be enabled two ways:
- Engine-global: the _DAGGER_WCPROF environment variable, the engine's --wcprof flag, or POSTing "on" to the /debug/wcprof/enabled endpoint. All work on the engine is recorded until "off" is POSTed to the same endpoint.
- Per-session: a client connecting with ClientMetadata.Profile set (the hidden --profile CLI flag). Only work attributable to that session (including its nested module/SDK clients) is recorded: the session server marks the session's contexts via ContextWithProfiling, and the mark propagates to derived contexts.
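The engine-global toggle over HTTP might be driven by a small client like the one below. This is a sketch under assumptions: the base URL of the engine's debug listener is not given here, the `text/plain` content type and the 2xx status check are guesses, and `setRecording` and `fakeEngine` are hypothetical helpers. A local test server stands in for the engine so the example runs on its own.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// setRecording POSTs "on" or "off" to the wcprof endpoint under baseURL.
// baseURL is an assumption: the debug listener's address is not documented here.
func setRecording(baseURL string, on bool) error {
	body := "off"
	if on {
		body = "on"
	}
	resp, err := http.Post(baseURL+"/debug/wcprof/enabled", "text/plain", strings.NewReader(body))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode/100 != 2 {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	return nil
}

// fakeEngine is a stand-in server that remembers the last posted body.
func fakeEngine(last *string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodPost || r.URL.Path != "/debug/wcprof/enabled" {
			http.NotFound(w, r)
			return
		}
		b, _ := io.ReadAll(r.Body)
		*last = string(b)
	}))
}

func main() {
	var last string
	srv := fakeEngine(&last)
	defer srv.Close()
	if err := setRecording(srv.URL, true); err != nil {
		panic(err)
	}
	fmt.Println(last) // on
}
```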
When profiling has never been enabled, every recording call costs only a single atomic load plus a nil check.
Index ¶
- Constants
- func ContextWithOpID(ctx context.Context, opID uint64) context.Context
- func ContextWithProfiling(ctx context.Context) context.Context
- func CurrentOpID(ctx context.Context) uint64
- func DisableGlobal()
- func EnableGlobal()
- func Enabled(ctx context.Context) bool
- func EnsureRecorder()
- func GloballyEnabled() bool
- func Link(ctx context.Context, kind LinkKind, fromOpID, targetOpID uint64, ident string, ...)
- func NowNS() int64
- func ReadDump(rd io.Reader) (*DumpHeader, []DumpEvent, error)
- func RecordOp(ctx context.Context, kind OpKind, class string, opts OpOpts, ...) uint64
- type DumpEvent
- type DumpHeader
- type DumpOpenOp
- type Event
- type EventType
- type LinkKind
- type Op
- type OpKind
- type OpOpts
- type Outcome
- type Recorder
- type Wait
- type WaitReason
- type WorkType
Constants ¶
const DumpSchemaVersion = 1
DumpSchemaVersion identifies the dump wire format.
Functions ¶
func ContextWithOpID ¶
ContextWithOpID returns a context carrying the given op ID as the current op. Useful when handing work to a goroutine that should be attributed to an existing op.
func ContextWithProfiling ¶
ContextWithProfiling marks ctx (and any context derived from it) for recording even when engine-global recording is off. The session server applies this to contexts of sessions that opted into profiling.
func CurrentOpID ¶
CurrentOpID returns the profiling op ID carried by ctx, or 0.
func DisableGlobal ¶
func DisableGlobal()
DisableGlobal turns off engine-global recording. The recorder and its buffered events are retained for dumping, sessions that explicitly enabled profiling keep recording, and already-recording flows (contexts carrying a recorded op) run to completion.
func Enabled ¶
Enabled reports whether work running under ctx should be recorded. Call sites use it to skip computing op metadata (class/ident strings) when profiling is off. When profiling has never been enabled this is a single atomic load + nil check.
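The "single atomic load + nil check" fast path can be pictured as follows. The names `recorder`, `active`, and `record` are hypothetical and chosen for this sketch; the package's real internals may differ.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// recorder is a hypothetical stand-in for the package's recorder.
type recorder struct {
	mu     sync.Mutex
	events []string
}

// active holds the recorder once profiling has been enabled; nil until then.
var active atomic.Pointer[recorder]

// record appends ev if a recorder exists and reports whether it did.
// When profiling was never enabled, the cost is one atomic load and a nil check.
func record(ev string) bool {
	r := active.Load()
	if r == nil {
		return false
	}
	r.mu.Lock()
	r.events = append(r.events, ev)
	r.mu.Unlock()
	return true
}

func main() {
	fmt.Println(record("before")) // false: no recorder yet
	active.Store(&recorder{})
	fmt.Println(record("after")) // true
}
```

Because the disabled path never takes a lock or allocates, call sites can invoke recording functions unconditionally.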
func EnsureRecorder ¶
func EnsureRecorder()
EnsureRecorder creates the recorder if it does not exist yet, without turning on engine-global recording. Used when a session opts into profiling.
func GloballyEnabled ¶
func GloballyEnabled() bool
GloballyEnabled reports whether engine-global recording is on.
func Link ¶
func Link(ctx context.Context, kind LinkKind, fromOpID, targetOpID uint64, ident string, resultID uint64)
Link records a non-blocking correlation from fromOpID (0 means the current op in ctx) to an op, an interned ident, and/or a dagql result ID.
func NowNS ¶
func NowNS() int64
NowNS returns the current recorder-relative timestamp, or 0 when profiling is disabled.
func ReadDump ¶
func ReadDump(rd io.Reader) (*DumpHeader, []DumpEvent, error)
ReadDump parses a dump produced by WriteDump.
func RecordOp ¶
func RecordOp(ctx context.Context, kind OpKind, class string, opts OpOpts, startNS, endNS int64, outcome Outcome) uint64
RecordOp records an already-completed op interval with explicit timestamps (from NowNS). Useful when an op's boundaries are observed from callbacks where holding an *Op handle would race. Returns the op ID, or 0 when profiling is disabled.
Types ¶
type DumpEvent ¶
type DumpEvent struct {
Type string `json:"e"`
OpKind string `json:"k,omitempty"`
WorkType string `json:"w,omitempty"`
Outcome string `json:"o,omitempty"`
Reason string `json:"rs,omitempty"`
LinkKind string `json:"lk,omitempty"`
OpID uint64 `json:"id,omitempty"`
ParentID uint64 `json:"p,omitempty"`
TargetID uint64 `json:"t,omitempty"`
ResultID uint64 `json:"r,omitempty"`
ClassID uint32 `json:"c,omitempty"`
IdentID uint32 `json:"i,omitempty"`
ClientID uint32 `json:"cl,omitempty"`
StartNS int64 `json:"s"`
EndNS int64 `json:"d"`
}
DumpEvent is the JSON form of an Event. Short keys keep dumps compact.
type DumpHeader ¶
type DumpHeader struct {
SchemaVersion int `json:"schema_version"`
EpochUnixNano int64 `json:"epoch_unix_nano"`
DumpedUnixNano int64 `json:"dumped_unix_nano"`
DroppedEvents uint64 `json:"dropped_events"`
EventCount int `json:"event_count"`
Strings []string `json:"strings"`
// OpenOps are ops begun but not ended at dump time (e.g. in-flight or
// hung work).
OpenOps []DumpOpenOp `json:"open_ops,omitempty"`
}
DumpHeader is the first line of a dump. The remaining lines are one DumpEvent JSON object per line.
type DumpOpenOp ¶
type DumpOpenOp struct {
OpID uint64 `json:"op_id"`
ParentID uint64 `json:"parent_id,omitempty"`
Kind string `json:"kind"`
WorkType string `json:"work_type,omitempty"`
ClassID uint32 `json:"class_id,omitempty"`
IdentID uint32 `json:"ident_id,omitempty"`
ClientID uint32 `json:"client_id,omitempty"`
StartNS int64 `json:"start_ns"`
}
DumpOpenOp describes an in-progress op at dump time.
type Event ¶
type Event struct {
Type EventType
OpKind OpKind
WorkType WorkType
Outcome Outcome
Reason WaitReason
LinkKind LinkKind
OpID uint64
ParentID uint64
TargetID uint64
ResultID uint64
ClassID uint32
IdentID uint32
ClientID uint32
StartNS int64
EndNS int64
}
Event is one fixed-size profiling record. Field meaning varies by Type:
- Op: OpID/ParentID identify the op and its structural parent. StartNS and EndNS bound the interval. ClassID/IdentID/ClientID are interned strings. ResultID is the dagql shared result ID when known.
- Wait: ParentID is the waiting op, TargetID the awaited op (or 0 when waiting on a named resource, in which case IdentID names it).
- Link: ParentID is the from-op, TargetID an op (optional), IdentID an interned identifier (optional), ResultID a dagql result (optional).
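To make the per-type field meanings concrete, the sketch below sums how long each op spent blocked, using the rule that a Wait event's ParentID is the waiting op. The local `event` type mirrors only the fields used, the constants follow the EventType order, and `blockedNS` is a hypothetical helper; overlapping waits are counted twice, which a real analysis would need to handle.

```go
package main

import "fmt"

type eventType uint8

// These mirror the EventType constant order.
const (
	eventTypeInvalid eventType = iota
	eventTypeOp
	eventTypeWait
	eventTypeLink
)

// event mirrors the subset of Event fields used in this sketch.
type event struct {
	Type     eventType
	OpID     uint64
	ParentID uint64
	TargetID uint64
	StartNS  int64
	EndNS    int64
}

// blockedNS sums wait durations per waiting op (the wait's ParentID).
// Overlapping waits are counted twice; this is a deliberate simplification.
func blockedNS(events []event) map[uint64]int64 {
	out := make(map[uint64]int64)
	for _, ev := range events {
		if ev.Type != eventTypeWait {
			continue
		}
		out[ev.ParentID] += ev.EndNS - ev.StartNS
	}
	return out
}

func main() {
	events := []event{
		{Type: eventTypeOp, OpID: 1, StartNS: 0, EndNS: 100},
		{Type: eventTypeOp, OpID: 2, ParentID: 1, StartNS: 10, EndNS: 60},
		// Op 1 waits on op 2, then on a named resource (TargetID 0).
		{Type: eventTypeWait, ParentID: 1, TargetID: 2, StartNS: 10, EndNS: 60},
		{Type: eventTypeWait, ParentID: 1, StartNS: 70, EndNS: 80},
	}
	fmt.Println(blockedNS(events)[1]) // 60
}
```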
type EventType ¶
type EventType uint8
EventType discriminates the union in Event.
const (
EventTypeInvalid EventType = iota
// EventTypeOp is a completed operation interval.
EventTypeOp
// EventTypeWait is a completed wait interval: an op was blocked on
// another op (or named resource) from Start to End.
EventTypeWait
// EventTypeLink is a non-blocking correlation between an op and another
// op or an interned identifier.
EventTypeLink
)
type LinkKind ¶
type LinkKind uint8
LinkKind classifies link events.
const (
LinkKindInvalid LinkKind = iota
// LinkKindNestedClient: op (an exec) hosts a nested client whose ID is
// the link's interned ident. Ops recorded for that client belong under
// this exec.
LinkKindNestedClient
// LinkKindResult: op produced or returned the dagql result with
// ResultID.
LinkKindResult
// LinkKindReusedResult: op was satisfied by reusing the dagql result
// with ResultID (cache hit).
LinkKindReusedResult
)
type Op ¶
type Op struct {
// contains filtered or unexported fields
}
Op is a handle for an in-progress operation. A nil *Op is valid and all methods are no-ops, so call sites do not need to check whether profiling is enabled.
func BeginOp ¶
BeginOp starts recording an operation. It returns a derived context that carries the new op as the current op (so nested ops and waits parent to it), and a handle to end it. When profiling is disabled it returns ctx unchanged and a nil handle.
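The nil-handle convention relies on Go allowing method calls on nil pointer receivers. A minimal sketch of the pattern, using a hypothetical `op` type and `beginOp` function rather than the package's own, looks like this:

```go
package main

import "fmt"

// op is a hypothetical stand-in for an in-progress operation handle.
type op struct {
	name  string
	ended bool
}

// enabled stands in for the profiler's on/off state.
var enabled bool

// beginOp returns a handle when enabled, or nil when profiling is off.
func beginOp(name string) *op {
	if !enabled {
		return nil
	}
	return &op{name: name}
}

// end is safe on a nil receiver: it does nothing.
func (o *op) end() {
	if o == nil {
		return
	}
	o.ended = true
}

func main() {
	o := beginOp("call") // nil: profiling is off
	o.end()              // no panic, no check needed at the call site
	fmt.Println(o == nil) // true

	enabled = true
	o = beginOp("call")
	o.end()
	fmt.Println(o.ended) // true
}
```

The benefit is that instrumented code reads the same whether or not profiling is on; the branch lives inside the methods rather than at every call site.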
func (*Op) EndWithResult ¶
EndWithResult records the op's completion, associating it with a dagql shared result ID when non-zero.
func (*Op) OutcomeHint ¶
OutcomeHint returns the hint set by SetOutcomeHint, or OutcomeNone.
func (*Op) SetIdent ¶
SetIdent updates the op's instance identity (useful when it only becomes known after the op began, e.g. a recipe digest derived mid-call). Must be called from the goroutine that owns the op, before End.
func (*Op) SetOutcomeHint ¶
SetOutcomeHint stores an outcome decided mid-op (e.g. joined vs executed), to be read back via OutcomeHint by the code that ends the op.
type OpKind ¶
type OpKind uint8
OpKind classifies operations.
const (
OpKindInvalid OpKind = iota
// OpKindCall is one dagql GetOrInitCall invocation (per caller,
// including cache hits and singleflight joiners).
OpKindCall
// OpKindCallExec is the shared execution of a call's resolver function.
// Singleflighted callers all wait on one of these.
OpKindCallExec
// OpKindLazy is one run of a lazy evaluation callback for a result.
OpKindLazy
// OpKindExecPhase is a setup/run phase of a container exec
// (e.g. exec.setupNetwork, exec.runContainer).
OpKindExecPhase
// OpKindExec is the overall run of a container by the executor.
OpKindExec
// OpKindServiceStart is the start (incl. health check) of a service.
OpKindServiceStart
// OpKindSessionPhase is a per-query session serving phase
// (e.g. session.workspaceLoad, session.query).
OpKindSessionPhase
// OpKindIO is a leaf I/O operation (git fetch, image pull, filesync...).
OpKindIO
// OpKindInternal is engine-internal background work (gc, persistence).
OpKindInternal
)
type OpOpts ¶
type OpOpts struct {
// Ident is an instance identity for the op (e.g. recipe digest, exec ID).
Ident string
// ClientID is the dagger client this op serves, when known.
ClientID string
// ResultID may be set at Begin time when already known.
WorkType WorkType
}
OpOpts carries optional per-op metadata.
type Outcome ¶
type Outcome uint8
Outcome describes how an op completed.
const (
OutcomeNone Outcome = iota
// OutcomeHit: dagql call satisfied from cache.
OutcomeHit
// OutcomeExecuted: dagql call missed cache and this caller spawned the
// execution.
OutcomeExecuted
// OutcomeJoined: dagql call missed cache and joined an in-flight
// execution started by another caller.
OutcomeJoined
// OutcomeDoNotCache: dagql call executed inline without caching.
OutcomeDoNotCache
// OutcomeOK: generic success for non-call ops.
OutcomeOK
// OutcomeError: op failed.
OutcomeError
// OutcomeCanceled: op canceled.
OutcomeCanceled
)
type Recorder ¶
type Recorder struct {
// contains filtered or unexported fields
}
Recorder collects profiling events. Safe for concurrent use.
func Active ¶
func Active() *Recorder
Active returns the recorder, or nil if profiling was never enabled. The recorder outlives DisableGlobal so buffered events can still be dumped; use GloballyEnabled/Enabled to ask whether work is being recorded.
func NewRecorder ¶
NewRecorder returns a recorder with the given event cap (<=0 means default).
type Wait ¶
type Wait struct {
// contains filtered or unexported fields
}
Wait is a handle for an in-progress wait interval. A nil *Wait is valid and End is a no-op.
func BeginWait ¶
func BeginWait(ctx context.Context, targetOpID uint64, reason WaitReason) *Wait
BeginWait starts recording that the current op (from ctx) is blocked on targetOpID for the given reason. Returns nil when profiling is disabled.
func BeginWaitIdent ¶
func BeginWaitIdent(ctx context.Context, ident string, reason WaitReason) *Wait
BeginWaitIdent is BeginWait for waits on a named resource (e.g. a lock) rather than another op.
type WaitReason ¶
type WaitReason uint8
WaitReason describes why an op was blocked.
const (
WaitReasonInvalid WaitReason = iota
// WaitReasonCallExec: blocked waiting for a call's resolver execution
// (the caller that spawned it).
WaitReasonCallExec
// WaitReasonSingleflight: blocked joining another caller's in-flight
// execution of the same call.
WaitReasonSingleflight
// WaitReasonLazy: blocked waiting for a lazy evaluation to finish.
WaitReasonLazy
// WaitReasonService: blocked waiting for a service to start/be healthy.
WaitReasonService
// WaitReasonLock: blocked acquiring a lock (target is an interned
// resource name, not an op).
WaitReasonLock
// WaitReasonExec: blocked waiting for a container exec to finish.
WaitReasonExec
// WaitReasonIO: blocked on external I/O.
WaitReasonIO
)
func (WaitReason) String ¶
func (r WaitReason) String() string
Directories ¶
| Path | Synopsis |
|---|---|
| wcanalyze | Package wcanalyze reconstructs an operation graph from a wcprof dump and runs offline wall-clock bottleneck analysis over it: self-time accounting, a replay-based counterfactual simulator, and per-class what-if rankings. |