Documentation ¶
Overview ¶
Package aksmachinepoller provides a GET-based poller that tracks the provisioning status of an individual AKS machine by repeatedly issuing GET requests for the machine until it reaches a terminal state. This is an alternative to the Azure SDK poller, which polls AKS operation objects (through GET operation).
This approach works because both the success status and the provisioning error details can be derived from the AKS machine object itself (through the ProvisioningError field). One use case is batched AKS machine provisioning, where the batch coordinator sends one API call for N machines and gets back a single SDK poller for the entire batch, which cannot track individual machines. Each machine needs its own poller, and polling GET machine is the most straightforward approach.
The poller sits on top of the same SDK HTTP client, so each GET call still passes through the SDK pipeline's per-request retry policy. See the Options doc comment for a detailed comparison with the SDK poller.
Note: there is a proposal to stop relying on ProvisioningError from machine objects and rely on AKS operation errors instead. That would require a way to return errors for batched requests (potentially via the upcoming ARM batch API) and a rewrite of the error handling around AKS error formats instead of CRP error formats. If that transition happens, this approach would need to be revisited.
Types ¶
type AKSMachineGetter ¶
type AKSMachineGetter interface {
Get(ctx context.Context, resourceGroupName string, resourceName string, agentPoolName string, aksMachineName string, options *armcontainerservice.MachinesClientGetOptions) (armcontainerservice.MachinesClientGetResponse, error)
}
type Options ¶
type Options struct {
// PollInterval is the interval between GET requests to check operation state.
// Equivalent to SDK's PollUntilDoneOptions.Frequency (default 30s).
PollInterval time.Duration
// RetryDelay is the initial delay before retrying after a transient GET error or
// unexpected state (nil/unrecognized provisioning state). Doubles on each consecutive
// retry, capped at MaxRetryDelay (exponential backoff).
// Equivalent to SDK's policy.RetryOptions.RetryDelay (default 800ms), but applied at
// the polling loop level rather than per-HTTP-request.
RetryDelay time.Duration
// MaxRetryDelay is the maximum delay between retries (exponential backoff cap).
// Equivalent to SDK's policy.RetryOptions.MaxRetryDelay (default 60s).
MaxRetryDelay time.Duration
// MaxRetries is the maximum number of consecutive retry attempts for transient GET
// errors or unexpected states before giving up. The retry count resets whenever
// a healthy non-terminal state (Creating/Updating) is observed (see the Options doc
// comment). Equivalent to SDK's policy.RetryOptions.MaxRetries (default 3), but
// scoped to the polling session rather than individual HTTP requests.
MaxRetries int
}
Options contains configuration for polling long-running operations.
How this poller relates to the Azure SDK poller ¶
The Azure SDK's runtime.Poller has two layers:
- Polling loop — exposes one option: Frequency (interval between polls, default 30s).
- HTTP pipeline — each poll request passes through policy.RetryOptions, which handles transient HTTP errors with exponential backoff (RetryDelay=800ms, MaxRetryDelay=60s, MaxRetries=3, status codes 408/429/500/502/503/504). Retries are scoped per-request: each poll gets its own fresh retry budget.
Our GET-based poller sits on top of the same SDK HTTP client, so each GET call still benefits from the SDK pipeline's per-request retries. The options here control an additional retry layer at the polling loop level, handling cases the SDK pipeline cannot: successful GETs that return unexpected state (nil or unrecognized provisioning state), or transient errors that persist beyond the SDK pipeline's per-request budget.
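As an illustration of the loop-level backoff that RetryDelay and MaxRetryDelay describe, here is a minimal sketch. The helper backoffDelay is hypothetical and not part of this package; it only shows how a delay that doubles on each consecutive retry and is capped at a maximum could be computed. The values in main are the SDK defaults quoted above, not necessarily what DefaultOptions returns.

```go
package main

import (
	"fmt"
	"time"
)

// backoffDelay is a hypothetical helper (not part of aksmachinepoller).
// It returns the delay before the given consecutive retry (0-based):
// retryDelay doubled once per earlier retry, capped at maxRetryDelay.
func backoffDelay(retryDelay, maxRetryDelay time.Duration, retry int) time.Duration {
	d := retryDelay
	for i := 0; i < retry; i++ {
		// Stop doubling once the next step would reach or pass the cap.
		if d >= maxRetryDelay/2 {
			return maxRetryDelay
		}
		d *= 2
	}
	if d > maxRetryDelay {
		return maxRetryDelay
	}
	return d
}

func main() {
	// Assumed values: the SDK defaults mentioned in the text above.
	for retry := 0; retry < 9; retry++ {
		fmt.Println(retry, backoffDelay(800*time.Millisecond, 60*time.Second, retry))
	}
}
```

With these inputs the delay grows from 800ms to 51.2s over the first seven retries and stays at 60s afterwards.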
Differences from the SDK poller and why ¶
Retry-After headers: The SDK poller honors Retry-After on poll responses, overriding Frequency. We use a fixed PollInterval because the server does not typically send Retry-After on successful provisioning GET responses. Per-request Retry-After (e.g., on 429s) is still handled by the SDK HTTP pipeline underneath.
Retry budget reset: We reset the retry count when a healthy non-terminal state (Creating/Updating) is observed. The SDK doesn't need this because its retries are per-request (each poll starts fresh). Ours are per-session (one budget across the entire loop), so without resetting, intermittent errors across a long-running operation would accumulate and exhaust the budget even though the operation is making progress. The reset makes our session-scoped budget behave like the SDK's per-request budget.
Per-try timeout (TryTimeout): Not implemented. The SDK HTTP pipeline's transport-level timeouts and context cancellation provide equivalent protection.
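The effect of the retry budget reset described above can be sketched with a small simulation. Everything here is a simplified stand-in, not the package's implementation: observation, healthy, transient, and exhausted are hypothetical names, and the sketch assumes that "giving up" means the consecutive retry count exceeds MaxRetries.

```go
package main

import "fmt"

// observation is a simplified stand-in for the result of one poll.
type observation int

const (
	healthy   observation = iota // Creating/Updating was observed
	transient                    // transient GET error or unexpected state
)

// exhausted reports whether a session-scoped budget of maxRetries runs out
// over the given sequence of polls. With reset=true, the retry count goes
// back to zero whenever a healthy state is observed.
func exhausted(polls []observation, maxRetries int, reset bool) bool {
	retries := 0
	for _, o := range polls {
		switch o {
		case healthy:
			if reset {
				retries = 0
			}
		case transient:
			retries++
			if retries > maxRetries {
				return true
			}
		}
	}
	return false
}

func main() {
	// Four isolated transient failures spread across a long operation.
	polls := []observation{
		transient, healthy, transient, healthy,
		transient, healthy, transient, healthy,
	}
	fmt.Println("without reset:", exhausted(polls, 3, false)) // true
	fmt.Println("with reset:", exhausted(polls, 3, true))     // false
}
```

Without the reset, the fourth isolated failure exhausts a budget of three even though every failure was followed by a healthy poll; with the reset, the count never rises above one.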
func DefaultOptions ¶
func DefaultOptions() Options
DefaultOptions returns production poller configuration.
func InstantOptions ¶
func InstantOptions() Options
InstantOptions returns poller configuration for tests where the fake returns Succeeded immediately. Uses minimal intervals to avoid delays while still exercising the polling code path.
type Poller ¶
type Poller struct {
// contains filtered or unexported fields
}
Poller polls AKS machine instances until they reach a terminal state. This follows Azure SDK polling patterns with exponential backoff for transient errors.
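The basic shape of a GET-based polling loop can be sketched as follows. This is not the Poller's implementation: getState, pollUntilTerminal, and runScripted are hypothetical names, the GET call is replaced by a scripted function, retry handling is omitted, and the terminal states "Succeeded" and "Failed" are an assumption for illustration.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// getState is a hypothetical stand-in for a GET call that returns the
// machine's provisioning state.
type getState func(ctx context.Context) (string, error)

// pollUntilTerminal calls get, then waits interval, until the state is
// terminal (assumed here: "Succeeded" or "Failed") or ctx is done.
// Retries for transient errors are deliberately left out.
func pollUntilTerminal(ctx context.Context, get getState, interval time.Duration) (string, error) {
	for {
		state, err := get(ctx)
		if err != nil {
			return "", err
		}
		if state == "Succeeded" || state == "Failed" {
			return state, nil
		}
		select {
		case <-ctx.Done():
			return "", ctx.Err()
		case <-time.After(interval):
		}
	}
}

// runScripted polls a fake getter that replays the given states in order.
func runScripted(states []string) (string, error) {
	i := 0
	get := func(context.Context) (string, error) {
		if i >= len(states) {
			return "", errors.New("script exhausted")
		}
		s := states[i]
		i++
		return s, nil
	}
	return pollUntilTerminal(context.Background(), get, time.Millisecond)
}

func main() {
	final, err := runScripted([]string{"Creating", "Creating", "Succeeded"})
	fmt.Println(final, err)
}
```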
func (*Poller) PollUntilDone ¶
func (p *Poller) PollUntilDone(ctx context.Context) (*armcontainerservice.ErrorDetail, error)
PollUntilDone polls the AKS machine instance with GET calls until its provisioning state reaches a terminal state. If provisioning succeeds, it returns nil. If provisioning fails, it returns the provisioning error. The second return value (error) is non-nil only when the function itself did not perform as expected. For example, receiving a proper provisioning error from the AKS machine API is the expected behavior of this function and is not treated as a function error.
ASSUMPTION: the AKS machine creation has already begun, and is visible from the API (using GET).
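A caller therefore has three outcomes to distinguish. The sketch below shows one way to do that, using a stand-in type so it runs without the Azure SDK: errorDetail mirrors only the Code and Message fields of armcontainerservice.ErrorDetail, and describeOutcome, deref, and errGetFailed are hypothetical names. The error code and message in main are illustrative values only.

```go
package main

import (
	"errors"
	"fmt"
)

// errorDetail is a reduced stand-in for armcontainerservice.ErrorDetail.
type errorDetail struct {
	Code    *string
	Message *string
}

// errGetFailed is a hypothetical poller-level failure used for the demo.
var errGetFailed = errors.New("GET machine kept failing")

// describeOutcome interprets the two return values of a PollUntilDone-style
// call: check the function error first, then the provisioning error.
func describeOutcome(provErr *errorDetail, err error) string {
	switch {
	case err != nil:
		return "poller error: " + err.Error()
	case provErr != nil:
		return "provisioning failed: " + deref(provErr.Code) + ": " + deref(provErr.Message)
	default:
		return "provisioning succeeded"
	}
}

// deref returns the pointed-to string, or "" for a nil pointer.
func deref(s *string) string {
	if s == nil {
		return ""
	}
	return *s
}

func main() {
	code, msg := "AllocationFailed", "not enough capacity" // illustrative values
	fmt.Println(describeOutcome(nil, nil))
	fmt.Println(describeOutcome(&errorDetail{Code: &code, Message: &msg}, nil))
	fmt.Println(describeOutcome(nil, errGetFailed))
}
```

Checking the function error before the provisioning error keeps a poller malfunction from being mistaken for a provisioning result.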