recipe

package
v0.11.1
Published: Mar 21, 2026 License: Apache-2.0 Imports: 28 Imported by: 0

Documentation

Overview

Package recipe provides recipe building and matching functionality.

Package recipe provides configuration recipe generation based on deployment criteria.

The recipe package generates tailored configuration recommendations for GPU-accelerated Kubernetes clusters. It uses a metadata-driven model where base configurations are enhanced with criteria-specific overlays to produce deployment-ready component references.

Core Types

Criteria: Specifies target deployment parameters

type Criteria struct {
    Service     CriteriaServiceType     // eks, gke, aks, any
    Accelerator CriteriaAcceleratorType // h100, gb200, b200, a100, l40, any
    Intent      CriteriaIntentType      // training, inference, any
    OS          CriteriaOSType          // ubuntu, cos, rhel, any
    Nodes       int                     // node count (0 = any)
}

RecipeResult: Generated configuration result

type RecipeResult struct {
    Header                              // API version, kind, metadata
    Criteria      *Criteria             // Input criteria
    MatchedRules  []string              // Applied overlay rules
    ComponentRefs []ComponentRef        // Component references (Helm or Kustomize)
    Constraints   []ConstraintRef       // Validation constraints
}

Recipe: Legacy format still used by bundlers

type Recipe struct {
    Header                              // API version, kind, metadata
    Request      *RequestInfo           // Input metadata (optional)
    MatchedRules []string               // Applied overlay rules
    Measurements []*measurement.Measurement // Configuration data
}

Builder: Generates recipes from criteria

type Builder struct {
    Version string  // Builder version for tracking
}

Criteria Types

Service types for Kubernetes environments:

  • CriteriaServiceEKS: Amazon EKS
  • CriteriaServiceGKE: Google GKE
  • CriteriaServiceAKS: Azure AKS
  • CriteriaServiceAny: Any service (wildcard)

Accelerator types for GPU selection:

  • CriteriaAcceleratorH100: NVIDIA H100
  • CriteriaAcceleratorGB200: NVIDIA GB200
  • CriteriaAcceleratorB200: NVIDIA B200
  • CriteriaAcceleratorA100: NVIDIA A100
  • CriteriaAcceleratorL40: NVIDIA L40
  • CriteriaAcceleratorAny: Any accelerator (wildcard)

Intent types for workload optimization:

  • CriteriaIntentTraining: ML training workloads
  • CriteriaIntentInference: Inference workloads
  • CriteriaIntentAny: Generic workloads

Usage

Basic recipe generation with criteria:

criteria := recipe.NewCriteria()
criteria.Service = recipe.CriteriaServiceEKS
criteria.Accelerator = recipe.CriteriaAcceleratorH100
criteria.Intent = recipe.CriteriaIntentTraining

ctx := context.Background()
builder := recipe.NewBuilder()
result, err := builder.BuildFromCriteria(ctx, criteria)
if err != nil {
    log.Fatal(err)
}

fmt.Printf("Matched rules: %v\n", result.MatchedRules)
for _, ref := range result.ComponentRefs {
    fmt.Printf("Component: %s, Version: %s\n", ref.Name, ref.Version)
}

HTTP handler for API server:

builder := recipe.NewBuilder()
http.HandleFunc("/v1/recipe", builder.HandleRecipes)

Parse criteria from HTTP request:

criteria, err := recipe.ParseCriteriaFromRequest(r)
if err != nil {
    http.Error(w, err.Error(), http.StatusBadRequest)
    return
}

Query Parameters (HTTP API - GET)

The HTTP handler accepts these query parameters for GET requests:

  • service: eks, gke, aks, any (default: any)
  • accelerator: h100, gb200, b200, a100, l40, any (default: any)
  • gpu: alias for accelerator (backwards compatibility)
  • intent: training, inference, any (default: any)
  • os: ubuntu, cos, rhel, any (default: any)
  • nodes: integer node count (default: 0 = any)

Criteria Files (CLI and HTTP API - POST)

Criteria can be defined in a Kubernetes-style YAML or JSON file using the RecipeCriteria resource type. This provides an alternative to individual CLI flags or query parameters.

RecipeCriteria: Kubernetes-style resource for criteria definition

type RecipeCriteria struct {
    Kind       string    // Must be "RecipeCriteria"
    APIVersion string    // Must be "aicr.nvidia.com/v1alpha1"
    Metadata   struct {
        Name string       // Optional descriptive name
    }
    Spec *Criteria        // The criteria specification
}

Example criteria file (criteria.yaml):

kind: RecipeCriteria
apiVersion: aicr.nvidia.com/v1alpha1
metadata:
  name: gb200-eks-ubuntu-training
spec:
  service: eks
  os: ubuntu
  accelerator: gb200
  intent: training

Load criteria from a file:

criteria, err := recipe.LoadCriteriaFromFile("/path/to/criteria.yaml")
if err != nil {
    log.Fatal(err)
}
result, err := builder.BuildFromCriteria(ctx, criteria)

Parse criteria from HTTP request body (POST):

criteria, err := recipe.ParseCriteriaFromBody(r.Body, r.Header.Get("Content-Type"))
if err != nil {
    http.Error(w, err.Error(), http.StatusBadRequest)
    return
}

CLI usage with criteria file:

aicr recipe --criteria /path/to/criteria.yaml --output recipe.yaml

CLI flags can override criteria file values:

aicr recipe --criteria criteria.yaml --service gke --output recipe.yaml

HTTP API POST request:

curl -X POST http://localhost:8080/v1/recipe \
  -H "Content-Type: application/yaml" \
  -d @criteria.yaml

Criteria Matching

Criteria use asymmetric matching with priority-based resolution:

Recipe Wildcard (recipe field = "any"):

  • Recipe "any" acts as a wildcard, matching any query value
  • Example: Recipe with accelerator="any" matches query accelerator="h100"

Query Wildcard (query field = "any"):

  • Query "any" only matches recipes that also have "any"
  • Prevents generic queries from matching overly-specific recipes
  • Example: Query accelerator="any" does NOT match recipe accelerator="h100"

Exact Match:

  • Query service="eks", accelerator="h100" matches recipe with same values

Priority:

  • More specific overlays take precedence
  • Multiple matching overlays are applied in priority order
  • Later overlays can override earlier ones
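
The priority rules above can be illustrated with a small sketch. It assumes overlays are applied in ascending priority order, so that the most specific (highest-priority) overlay is applied last and wins. That ordering, the overlay type, and the flat settings map are assumptions made for illustration and are not taken from the package source.

```go
package main

import (
	"fmt"
	"sort"
)

// overlay is a hypothetical, simplified overlay: a name, a priority, and a
// flat set of key/value settings.
type overlay struct {
	Name     string
	Priority int
	Settings map[string]string
}

// applyOverlays copies base, then applies overlays in ascending priority
// order so that later (higher-priority) overlays override earlier ones.
// It returns the merged settings and the names in application order.
func applyOverlays(base map[string]string, overlays []overlay) (map[string]string, []string) {
	sorted := append([]overlay(nil), overlays...) // do not reorder the caller's slice
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].Priority < sorted[j].Priority
	})

	out := make(map[string]string, len(base))
	for k, v := range base {
		out[k] = v
	}
	var applied []string
	for _, o := range sorted {
		for k, v := range o.Settings {
			out[k] = v
		}
		applied = append(applied, o.Name)
	}
	return out, applied
}

func main() {
	base := map[string]string{"example.key": "from-base"}
	merged, applied := applyOverlays(base, []overlay{
		{Name: "specific", Priority: 100, Settings: map[string]string{"example.key": "from-specific"}},
		{Name: "generic", Priority: 10, Settings: map[string]string{"example.key": "from-generic"}},
	})
	fmt.Println(applied, merged)
}
```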

Metadata Store Model

Recipe generation uses YAML metadata files:

  1. Load overlays/base.yaml (common component versions and settings)
  2. Find matching overlay files based on criteria
  3. Merge overlay configurations into result
  4. Return RecipeResult with component references

Base structure (recipes/overlays/base.yaml):

apiVersion: aicr.nvidia.com/v1alpha1
kind: Base
metadata:
  name: base
  version: v1.0.0
components:
  - name: gpu-operator
    version: v25.3.3
    repository: https://helm.ngc.nvidia.com/nvidia

Overlay structure (recipes/overlays/*.yaml):

apiVersion: aicr.nvidia.com/v1alpha1
kind: Overlay
metadata:
  name: h100-training
  priority: 100
match:
  accelerator: h100
  intent: training
components:
  - name: gpu-operator
    version: v25.3.3
    values:
      mig.strategy: mixed

RecipeInput Interface

The RecipeInput interface allows bundlers to work with both legacy Recipe and new RecipeResult formats:

type RecipeInput interface {
    GetMeasurements() []*measurement.Measurement
    GetComponentRef(name string) *ComponentRef
    GetValuesForComponent(name string) (map[string]any, error)
}

Error Handling

BuildFromCriteria returns errors when:

  • Criteria is nil
  • Metadata store cannot be loaded
  • No matching overlays found
  • Component configuration is invalid

ParseCriteriaFromRequest returns errors when:

  • Service type is invalid
  • Accelerator type is invalid
  • Intent type is invalid
  • Nodes count is negative or non-numeric

Data Source

Recipe metadata is embedded at build time from:

  • recipes/overlays/base.yaml (base component versions)
  • recipes/overlays/*.yaml (criteria-specific overlays)

The metadata store is loaded once and cached (singleton pattern with sync.Once).
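
A load-once cache of this kind is typically written as follows. This is a minimal sketch of the sync.Once pattern only: store, loadStore, and getStore are hypothetical names, and the loader returns fixed data instead of parsing embedded YAML.

```go
package main

import (
	"fmt"
	"sync"
)

// store is a hypothetical stand-in for the parsed metadata store.
type store struct {
	Overlays []string
}

var (
	storeOnce sync.Once
	cached    *store
	cachedErr error
	loadCalls int // counts loader invocations, for demonstration only
)

// loadStore simulates the expensive parse of the embedded metadata files.
func loadStore() (*store, error) {
	loadCalls++
	return &store{Overlays: []string{"base", "h100-training"}}, nil
}

// getStore runs loadStore at most once, even under concurrent callers.
// Every call returns the same cached store and error.
func getStore() (*store, error) {
	storeOnce.Do(func() {
		cached, cachedErr = loadStore()
	})
	return cached, cachedErr
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_, _ = getStore()
		}()
	}
	wg.Wait()
	fmt.Println("loader calls:", loadCalls)
}
```

Note that with this pattern a load error is cached as well; retrying requires resetting the sync.Once, which is one reason test-only reset hooks exist in designs like this.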

Observability

The recipe builder exports Prometheus metrics:

  • recipe_built_duration_seconds: Time to build recipe
  • recipe_rule_match_total{status}: Rule matching statistics

Integration

The recipe package is used by:

  • pkg/cli - recipe command for CLI usage
  • pkg/api - API recipe endpoint
  • pkg/bundler - Bundle generation from recipes

It depends on:

  • pkg/measurement - Measurement data structures
  • pkg/version - Version parsing
  • pkg/header - Common header types
  • pkg/errors - Structured error handling

Component Types

The recipe system supports two component deployment types:

Helm Components:

  • Use Helm charts for deployment
  • Configured via helm section in registry.yaml
  • Support values files and inline overrides

Kustomize Components:

  • Use Kustomize for deployment
  • Configured via kustomize section in registry.yaml
  • Support Git/OCI sources with path and tag

The component registry (recipes/registry.yaml) determines component defaults. Components must have either helm OR kustomize configuration.

Subpackages

  • recipe/version - Semantic version parsing (moved to pkg/version)
  • recipe/header - Common header structures (moved to pkg/header)

Index

Constants

const (
	EnvAllowedAccelerators = "AICR_ALLOWED_ACCELERATORS"
	EnvAllowedServices     = "AICR_ALLOWED_SERVICES"
	EnvAllowedIntents      = "AICR_ALLOWED_INTENTS"
	EnvAllowedOSTypes      = "AICR_ALLOWED_OS"
)

Environment variable names for allowlist configuration.

const (
	// DefaultMaxFileSize is the default maximum file size (10MB).
	DefaultMaxFileSize = 10 * 1024 * 1024
)
const RecipeAPIVersion = "aicr.nvidia.com/v1alpha1"

RecipeAPIVersion is the API version for recipe metadata and result resources.

const RecipeCriteriaAPIVersion = "aicr.nvidia.com/v1alpha1"

RecipeCriteriaAPIVersion is the API version for RecipeCriteria resources.

const RecipeCriteriaKind = "RecipeCriteria"

RecipeCriteriaKind is the kind value for RecipeCriteria resources.

const RecipeMetadataKind = "RecipeMetadata"

RecipeMetadataKind is the kind value for RecipeMetadata resources.

const RecipeResultKind = "RecipeResult"

RecipeResultKind is the kind value for RecipeResult resources.

Variables

This section is empty.

Functions

func GetCriteriaAcceleratorTypes

func GetCriteriaAcceleratorTypes() []string

GetCriteriaAcceleratorTypes returns all supported accelerator types sorted alphabetically.

func GetCriteriaIntentTypes

func GetCriteriaIntentTypes() []string

GetCriteriaIntentTypes returns all supported intent types sorted alphabetically.

func GetCriteriaOSTypes

func GetCriteriaOSTypes() []string

GetCriteriaOSTypes returns all supported OS types sorted alphabetically.

func GetCriteriaPlatformTypes

func GetCriteriaPlatformTypes() []string

GetCriteriaPlatformTypes returns all supported platform types sorted alphabetically.

func GetCriteriaServiceTypes

func GetCriteriaServiceTypes() []string

GetCriteriaServiceTypes returns all supported service types sorted alphabetically.

func GetEmbeddedFS

func GetEmbeddedFS() embed.FS

GetEmbeddedFS returns the embedded data filesystem. This is used by the CLI to create layered data providers.

func GetManifestContent

func GetManifestContent(path string) ([]byte, error)

GetManifestContent retrieves a manifest file from the data provider. Path should be relative to data directory (e.g., "components/gpu-operator/manifests/dcgm-exporter.yaml").

func HydrateResult added in v0.11.0

func HydrateResult(result *RecipeResult) (map[string]any, error)

HydrateResult builds a fully hydrated map from a RecipeResult. Component values are merged via GetValuesForComponent so the output contains the final resolved configuration, not file references.

func MatchesCriteriaField added in v0.9.0

func MatchesCriteriaField(recipeValue, queryValue string) bool

MatchesCriteriaField implements asymmetric matching for a single criteria field. Returns true if the recipe field matches the query field.

Matching rules:

  • Query is "any"/empty → only matches if recipe is also "any"/empty
  • Recipe is "any"/empty → matches any query value (recipe is generic/wildcard)
  • Otherwise → must match exactly
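
The three rules can be written out directly. The sketch below is a hypothetical re-implementation for illustration (matchesField and isWildcard are not the package's identifiers), and it assumes case-sensitive comparison.

```go
package main

import "fmt"

// isWildcard reports whether a criteria value is "any" or empty.
func isWildcard(v string) bool { return v == "" || v == "any" }

// matchesField applies the asymmetric rules in order:
//  1. a wildcard query only matches a wildcard recipe,
//  2. a wildcard recipe matches any specific query,
//  3. otherwise the two values must be equal.
func matchesField(recipeValue, queryValue string) bool {
	switch {
	case isWildcard(queryValue):
		return isWildcard(recipeValue)
	case isWildcard(recipeValue):
		return true
	default:
		return recipeValue == queryValue
	}
}

func main() {
	fmt.Println(matchesField("any", "h100"))  // true: recipe wildcard
	fmt.Println(matchesField("h100", "any"))  // false: query wildcard, specific recipe
	fmt.Println(matchesField("h100", "h100")) // true: exact match
	fmt.Println(matchesField("h100", "a100")) // false: different values
}
```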

func ResetComponentRegistryForTesting added in v0.7.8

func ResetComponentRegistryForTesting()

ResetComponentRegistryForTesting resets the singleton registry so it will be reloaded from the current DataProvider on the next call to GetComponentRegistry. This must only be called from tests.

func Select added in v0.11.0

func Select(hydrated map[string]any, selector string) (any, error)

Select walks a dot-path selector against a hydrated map and returns the value at that path. Returns ErrCodeNotFound for invalid paths. An empty selector returns the entire map.

func SetDataProvider

func SetDataProvider(provider DataProvider)

SetDataProvider sets the global data provider. This should be called before any recipe operations if using external data. Note: This invalidates cached data, so callers should ensure this is called early in the application lifecycle.

Types

type AllowLists

type AllowLists struct {
	// Accelerators is the list of allowed accelerator types (e.g., "h100", "l40").
	// If empty, all accelerator types are allowed.
	Accelerators []CriteriaAcceleratorType

	// Services is the list of allowed service types (e.g., "eks", "gke").
	// If empty, all service types are allowed.
	Services []CriteriaServiceType

	// Intents is the list of allowed intent types (e.g., "training", "inference").
	// If empty, all intent types are allowed.
	Intents []CriteriaIntentType

	// OSTypes is the list of allowed OS types (e.g., "ubuntu", "rhel").
	// If empty, all OS types are allowed.
	OSTypes []CriteriaOSType
}

AllowLists defines which criteria values are permitted for API requests. An empty or nil slice means all values are allowed for that criteria type. This is used by the API server to restrict which values can be requested, while the CLI always allows all values.

func ParseAllowListsFromEnv

func ParseAllowListsFromEnv() (*AllowLists, error)

ParseAllowListsFromEnv parses allowlist configuration from environment variables. Returns nil if no allowlist environment variables are set. Environment variables:

  • AICR_ALLOWED_ACCELERATORS: comma-separated list of accelerator types (e.g., "h100,l40")
  • AICR_ALLOWED_SERVICES: comma-separated list of service types (e.g., "eks,gke")
  • AICR_ALLOWED_INTENTS: comma-separated list of intent types (e.g., "training,inference")
  • AICR_ALLOWED_OS: comma-separated list of OS types (e.g., "ubuntu,rhel")

Invalid values in the environment variables are skipped with a warning logged.
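
The skip-and-warn behavior for a single variable can be sketched as follows. parseAllowList is a hypothetical helper, and trimming whitespace and lower-casing entries are assumptions of this sketch, not documented behavior.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"strings"
)

// parseAllowList splits a comma-separated value, normalizes each entry, and
// keeps only entries present in valid. Invalid entries are logged and
// skipped; empty entries are ignored.
func parseAllowList(raw string, valid map[string]bool) []string {
	var out []string
	for _, item := range strings.Split(raw, ",") {
		item = strings.ToLower(strings.TrimSpace(item)) // assumption: normalize input
		if item == "" {
			continue
		}
		if !valid[item] {
			log.Printf("skipping invalid allowlist value %q", item)
			continue
		}
		out = append(out, item)
	}
	return out
}

func main() {
	valid := map[string]bool{"h100": true, "gb200": true, "b200": true, "a100": true, "l40": true}
	raw := os.Getenv("AICR_ALLOWED_ACCELERATORS")
	fmt.Println(parseAllowList(raw, valid))
}
```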

func (*AllowLists) AcceleratorStrings

func (a *AllowLists) AcceleratorStrings() []string

AcceleratorStrings returns the allowed accelerator types as strings.

func (*AllowLists) IntentStrings

func (a *AllowLists) IntentStrings() []string

IntentStrings returns the allowed intent types as strings.

func (*AllowLists) IsEmpty

func (a *AllowLists) IsEmpty() bool

IsEmpty returns true if no allowlists are configured (all values allowed).

func (*AllowLists) OSTypeStrings

func (a *AllowLists) OSTypeStrings() []string

OSTypeStrings returns the allowed OS types as strings.

func (*AllowLists) ServiceStrings

func (a *AllowLists) ServiceStrings() []string

ServiceStrings returns the allowed service types as strings.

func (*AllowLists) ValidateCriteria

func (a *AllowLists) ValidateCriteria(c *Criteria) error

ValidateCriteria checks if the given criteria values are permitted by the allowlists. Returns nil if validation passes, or an error with details about what value is not allowed. The "any" value is always allowed regardless of the allowlist configuration.
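
The per-field check implied by these rules can be sketched as below. checkAllowed is a hypothetical helper; treating an empty value the same as "any" is an assumption based on the documented defaults.

```go
package main

import "fmt"

// checkAllowed returns nil when value is permitted for field:
//   - an empty allowlist permits every value,
//   - "any" (or an unset value) is always permitted,
//   - otherwise value must appear in allowed.
func checkAllowed(field, value string, allowed []string) error {
	if len(allowed) == 0 || value == "" || value == "any" {
		return nil
	}
	for _, a := range allowed {
		if a == value {
			return nil
		}
	}
	return fmt.Errorf("%s %q is not allowed (allowed: %v)", field, value, allowed)
}

func main() {
	allowed := []string{"h100", "l40"}
	fmt.Println(checkAllowed("accelerator", "h100", allowed))
	fmt.Println(checkAllowed("accelerator", "any", allowed))
	fmt.Println(checkAllowed("accelerator", "a100", allowed))
}
```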

type Builder

type Builder struct {
	Version    string
	AllowLists *AllowLists
}

Builder constructs RecipeResult payloads based on Criteria specifications. It loads recipe metadata, applies matching overlays, and generates tailored configuration recipes.

func NewBuilder

func NewBuilder(opts ...Option) *Builder

NewBuilder creates a new Builder instance with the provided functional options.

func (*Builder) BuildFromCriteria

func (b *Builder) BuildFromCriteria(ctx context.Context, c *Criteria) (*RecipeResult, error)

BuildFromCriteria creates a RecipeResult payload for the provided criteria. It loads the metadata store, applies matching overlays, and returns a RecipeResult with merged components and computed deployment order.

func (*Builder) BuildFromCriteriaWithEvaluator

func (b *Builder) BuildFromCriteriaWithEvaluator(ctx context.Context, c *Criteria, evaluator ConstraintEvaluatorFunc) (*RecipeResult, error)

BuildFromCriteriaWithEvaluator creates a RecipeResult payload for the provided criteria, filtering overlays based on constraint evaluation against snapshot data.

When an evaluator function is provided:

  • Overlays that match by criteria but fail constraint evaluation are excluded
  • Constraint warnings are included in the result metadata for visibility
  • Only overlays whose constraints pass (or have no constraints) are merged

The evaluator function is typically created by wrapping validator.EvaluateConstraint with the snapshot data.

func (*Builder) HandleQuery added in v0.11.0

func (b *Builder) HandleQuery(w http.ResponseWriter, r *http.Request)

HandleQuery processes query requests. It resolves a recipe from criteria, hydrates all component values, and returns the value at the given selector path. Supports GET with query parameters (+selector) and POST with JSON/YAML body.

func (*Builder) HandleRecipes

func (b *Builder) HandleRecipes(w http.ResponseWriter, r *http.Request)

HandleRecipes processes recipe requests using the criteria-based system. It supports GET requests with query parameters and POST requests with a JSON or YAML body to specify recipe criteria. The response is a RecipeResult with component references and constraints. Errors are returned in a structured format.

type ComponentConfig

type ComponentConfig struct {
	// Name is the component identifier used in recipes (e.g., "gpu-operator").
	Name string `yaml:"name"`

	// DisplayName is the human-readable name used in templates and output.
	DisplayName string `yaml:"displayName"`

	// ValueOverrideKeys are alternative keys for --set flag matching.
	// Example: ["gpuoperator"] allows --set gpuoperator:key=value
	ValueOverrideKeys []string `yaml:"valueOverrideKeys,omitempty"`

	// Helm contains default Helm chart settings.
	Helm HelmConfig `yaml:"helm,omitempty"`

	// Kustomize contains default Kustomize settings.
	Kustomize KustomizeConfig `yaml:"kustomize,omitempty"`

	// NodeScheduling defines paths for injecting node selectors and tolerations.
	NodeScheduling NodeSchedulingConfig `yaml:"nodeScheduling,omitempty"`

	PodScheduling PodSchedulingConfig `yaml:"podScheduling,omitempty"`

	// Validations defines component-specific validation checks.
	Validations []ComponentValidationConfig `yaml:"validations,omitempty"`

	// HealthCheck defines custom health check configuration for this component.
	HealthCheck HealthCheckConfig `yaml:"healthCheck,omitempty"`
}

ComponentConfig defines the bundler configuration for a component. This replaces the per-component Go packages with declarative YAML.

func (*ComponentConfig) GetAcceleratedNodeSelectorPaths

func (c *ComponentConfig) GetAcceleratedNodeSelectorPaths() []string

GetAcceleratedNodeSelectorPaths returns all accelerated node selector paths for a component.

func (*ComponentConfig) GetAcceleratedTaintStrPaths

func (c *ComponentConfig) GetAcceleratedTaintStrPaths() []string

GetAcceleratedTaintStrPaths returns all accelerated taint string paths for a component.

func (*ComponentConfig) GetAcceleratedTolerationPaths

func (c *ComponentConfig) GetAcceleratedTolerationPaths() []string

GetAcceleratedTolerationPaths returns all accelerated toleration paths for a component.

func (*ComponentConfig) GetNodeCountPaths added in v0.8.0

func (c *ComponentConfig) GetNodeCountPaths() []string

GetNodeCountPaths returns Helm value paths where the node count is injected.

func (*ComponentConfig) GetSystemNodeSelectorPaths

func (c *ComponentConfig) GetSystemNodeSelectorPaths() []string

GetSystemNodeSelectorPaths returns all system node selector paths for a component.

func (*ComponentConfig) GetSystemTolerationPaths

func (c *ComponentConfig) GetSystemTolerationPaths() []string

GetSystemTolerationPaths returns all system toleration paths for a component.

func (*ComponentConfig) GetType

func (c *ComponentConfig) GetType() ComponentType

GetType returns the component deployment type based on which config is present. Returns ComponentTypeKustomize if Kustomize.DefaultSource is set, otherwise returns ComponentTypeHelm (the default).

func (*ComponentConfig) GetValidations

func (c *ComponentConfig) GetValidations() []ComponentValidationConfig

GetValidations returns all validation configurations for a component.

func (*ComponentConfig) GetWorkloadSelectorPaths

func (c *ComponentConfig) GetWorkloadSelectorPaths() []string

GetWorkloadSelectorPaths returns all workload selector paths for a component.

type ComponentRef

type ComponentRef struct {
	// Name is the unique identifier for this component.
	Name string `json:"name" yaml:"name"`

	// Namespace is the Kubernetes namespace for deploying this component.
	Namespace string `json:"namespace,omitempty" yaml:"namespace,omitempty"`

	// Chart is the Helm chart name (e.g., "gpu-operator").
	Chart string `json:"chart,omitempty" yaml:"chart,omitempty"`

	// Type is the deployment type (Helm, Kustomize).
	Type ComponentType `json:"type" yaml:"type"`

	// Source is the repository URL or OCI reference.
	Source string `json:"source" yaml:"source"`

	// Version is the chart/component version (for Helm).
	Version string `json:"version,omitempty" yaml:"version,omitempty"`

	// Tag is the image/resource tag (for Kustomize).
	Tag string `json:"tag,omitempty" yaml:"tag,omitempty"`

	// ValuesFile is the path to the values file (relative to data directory).
	ValuesFile string `json:"valuesFile,omitempty" yaml:"valuesFile,omitempty"`

	// Overrides contains inline values that override those from ValuesFile.
	// Merge order: base values → ValuesFile → Overrides (highest precedence).
	Overrides map[string]any `json:"overrides,omitempty" yaml:"overrides,omitempty"`

	// Patches is a list of patch files to apply (for Kustomize).
	Patches []string `json:"patches,omitempty" yaml:"patches,omitempty"`

	// DependencyRefs is a list of component names this component depends on.
	DependencyRefs []string `json:"dependencyRefs,omitempty" yaml:"dependencyRefs,omitempty"`

	// ManifestFiles lists manifest files to include in the component bundle.
	// Paths are relative to the data directory.
	// Example: ["components/gpu-operator/manifests/dcgm-exporter.yaml"]
	ManifestFiles []string `json:"manifestFiles,omitempty" yaml:"manifestFiles,omitempty"`

	// Path is the path within the repository to the kustomization (for Kustomize).
	Path string `json:"path,omitempty" yaml:"path,omitempty"`

	// Cleanup indicates whether to uninstall this component after validation.
	// Used for validation infrastructure components (e.g., nccl-doctor).
	Cleanup bool `json:"cleanup,omitempty" yaml:"cleanup,omitempty"`

	// ExpectedResources lists Kubernetes resources that should exist after deployment.
	// Used by deployment phase validation to verify component health.
	ExpectedResources []ExpectedResource `json:"expectedResources,omitempty" yaml:"expectedResources,omitempty"`

	// HealthCheckAsserts contains raw Chainsaw-style assert YAML loaded from the
	// registry's healthCheck.assertFile via the DataProvider. When non-empty, the
	// expected-resources check runs Chainsaw CLI to evaluate assertions instead of
	// the default auto-discovery + typed replica checks.
	HealthCheckAsserts string `json:"healthCheckAsserts,omitempty" yaml:"healthCheckAsserts,omitempty"`
}

ComponentRef represents a reference to a deployable component.
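
The merge order noted on the Overrides field (base values, then ValuesFile, then Overrides) can be illustrated with a layered map merge. The sketch assumes a recursive merge of nested maps, which this documentation does not confirm; mergeInto and the sample keys are hypothetical.

```go
package main

import "fmt"

// mergeInto merges src into dst. Nested maps are merged key by key into a
// map owned by dst, so src is never mutated; any other value in src replaces
// the value in dst.
func mergeInto(dst, src map[string]any) {
	for k, v := range src {
		srcMap, ok := v.(map[string]any)
		if !ok {
			dst[k] = v
			continue
		}
		dstMap, ok := dst[k].(map[string]any)
		if !ok {
			dstMap = map[string]any{}
			dst[k] = dstMap
		}
		mergeInto(dstMap, srcMap)
	}
}

func main() {
	// Three layers in increasing precedence; keys are illustrative only.
	base := map[string]any{"driver": map[string]any{"enabled": true, "version": "from-base"}}
	valuesFile := map[string]any{"driver": map[string]any{"version": "from-values-file"}}
	overrides := map[string]any{"driver": map[string]any{"enabled": false}}

	merged := map[string]any{}
	for _, layer := range []map[string]any{base, valuesFile, overrides} {
		mergeInto(merged, layer)
	}
	fmt.Println(merged)
}
```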

func (*ComponentRef) ApplyRegistryDefaults

func (ref *ComponentRef) ApplyRegistryDefaults(config *ComponentConfig)

ApplyRegistryDefaults fills in ComponentRef fields from ComponentConfig defaults. This applies registry defaults for fields that are not already set in the ComponentRef.

func (ComponentRef) IsEnabled added in v0.10.13

func (c ComponentRef) IsEnabled() bool

IsEnabled returns whether this component is enabled for deployment. A component is disabled when its Overrides map contains enabled: false. Components without an explicit enabled override are enabled by default.
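
The rule reads naturally as a lookup on the Overrides map. In the sketch below, componentRef and isEnabled are hypothetical stand-ins, and treating a non-boolean enabled value as "enabled" is an assumption.

```go
package main

import "fmt"

// componentRef is a hypothetical, reduced version of ComponentRef.
type componentRef struct {
	Name      string
	Overrides map[string]any
}

// isEnabled reports false only when Overrides contains enabled: false.
// A missing key, a nil map, or a non-boolean value counts as enabled.
func (c componentRef) isEnabled() bool {
	v, ok := c.Overrides["enabled"] // reading from a nil map is safe in Go
	if !ok {
		return true
	}
	b, isBool := v.(bool)
	return !isBool || b
}

func main() {
	refs := []componentRef{
		{Name: "gpu-operator"},
		{Name: "disabled-example", Overrides: map[string]any{"enabled": false}},
	}
	for _, ref := range refs {
		fmt.Println(ref.Name, ref.isEnabled())
	}
}
```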

type ComponentRegistry

type ComponentRegistry struct {
	APIVersion string            `yaml:"apiVersion"`
	Kind       string            `yaml:"kind"`
	Components []ComponentConfig `yaml:"components"`
	// contains filtered or unexported fields
}

ComponentRegistry holds the declarative configuration for all components. This is loaded from embedded recipe data (recipes/registry.yaml) at startup.

func GetComponentRegistry

func GetComponentRegistry() (*ComponentRegistry, error)

GetComponentRegistry returns the global component registry. The registry is loaded once from embedded data and cached. Returns an error if the registry file cannot be loaded or parsed.

func (*ComponentRegistry) Count

func (r *ComponentRegistry) Count() int

Count returns the number of components in the registry.

func (*ComponentRegistry) Get

Get returns the component configuration by name. Returns nil if the component is not found.

func (*ComponentRegistry) GetByOverrideKey

func (r *ComponentRegistry) GetByOverrideKey(key string) *ComponentConfig

GetByOverrideKey returns the component configuration by value override key. This is used for matching --set flags like --set gpuoperator:key=value. Returns nil if no component matches the key.

func (*ComponentRegistry) Names

func (r *ComponentRegistry) Names() []string

Names returns all component names in the registry.

func (*ComponentRegistry) Validate

func (r *ComponentRegistry) Validate() []error

Validate checks the component registry for errors. Returns a slice of validation errors (empty if valid).

type ComponentType

type ComponentType string

ComponentType represents the type of component deployment.

const (
	ComponentTypeHelm      ComponentType = "Helm"
	ComponentTypeKustomize ComponentType = "Kustomize"
)

ComponentType constants for supported deployment types.

type ComponentValidationConfig

type ComponentValidationConfig struct {
	// Function is the name of the validation function to execute (e.g., "CheckWorkloadSelectorMissing").
	Function string `yaml:"function"`

	// Severity determines whether failures are warnings or errors ("warning" or "error").
	Severity string `yaml:"severity"`

	// Conditions are optional conditions that must be met for the validation to run.
	// Values are arrays of strings for OR matching (single element arrays are equivalent to single values).
	// Example: {"intent": ["training"]} or {"intent": ["training", "inference"]}
	Conditions map[string][]string `yaml:"conditions,omitempty"`

	// Message is an optional detail message to append to validation failures/warnings.
	Message string `yaml:"message,omitempty"`
}

ComponentValidationConfig defines a component-specific validation check.

type Constraint

type Constraint struct {
	// Name is the constraint identifier (e.g., "k8s", "worker-os").
	Name string `json:"name" yaml:"name"`

	// Value is the constraint expression (e.g., ">= 1.30", "ubuntu").
	Value string `json:"value" yaml:"value"`

	// Severity indicates the constraint severity ("error" or "warning").
	Severity string `json:"severity,omitempty" yaml:"severity,omitempty"`

	// Remediation provides actionable guidance for fixing failed constraints.
	Remediation string `json:"remediation,omitempty" yaml:"remediation,omitempty"`

	// Unit specifies the unit for numeric constraints (e.g., "GB/s").
	Unit string `json:"unit,omitempty" yaml:"unit,omitempty"`
}

Constraint represents a deployment constraint/assumption.

type ConstraintEvalResult

type ConstraintEvalResult struct {
	// Passed indicates if the constraint was satisfied.
	Passed bool

	// Actual is the actual value extracted from the snapshot.
	Actual string

	// Error contains the error if evaluation failed (e.g., value not found).
	Error error
}

ConstraintEvalResult represents the result of evaluating a single constraint. This mirrors the result from pkg/validator to avoid circular imports.

type ConstraintEvaluatorFunc

type ConstraintEvaluatorFunc func(constraint Constraint) ConstraintEvalResult

ConstraintEvaluatorFunc is a function type for evaluating constraints. It takes a constraint and returns the evaluation result. This function type allows the recipe package to use constraint evaluation from the validator package without creating a circular dependency.
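
The dependency-breaking idea is that the caller builds a closure over its own data and passes it in. The sketch below mirrors the two types locally and uses plain string equality on a hypothetical name-to-value snapshot map; it does not handle constraint expressions such as ">= 1.30", and none of its names come from the validator package.

```go
package main

import "fmt"

// Local mirrors of Constraint and ConstraintEvalResult, reduced to the
// fields this sketch needs.
type constraint struct {
	Name  string
	Value string
}

type evalResult struct {
	Passed bool
	Actual string
	Error  error
}

type evaluatorFunc func(c constraint) evalResult

// newEvaluator returns an evaluator that closes over snapshot. The recipe
// side only sees the function value, so it needs no import of the package
// that owns the snapshot.
func newEvaluator(snapshot map[string]string) evaluatorFunc {
	return func(c constraint) evalResult {
		actual, ok := snapshot[c.Name]
		if !ok {
			return evalResult{Error: fmt.Errorf("no value for constraint %q in snapshot", c.Name)}
		}
		// Simplification: equality only, no version or range expressions.
		return evalResult{Passed: actual == c.Value, Actual: actual}
	}
}

func main() {
	eval := newEvaluator(map[string]string{"worker-os": "ubuntu"})
	fmt.Println(eval(constraint{Name: "worker-os", Value: "ubuntu"}).Passed)
	fmt.Println(eval(constraint{Name: "worker-os", Value: "rhel"}).Passed)
	fmt.Println(eval(constraint{Name: "k8s", Value: ">= 1.30"}).Error)
}
```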

type ConstraintWarning

type ConstraintWarning struct {
	// Overlay is the name of the overlay that was excluded.
	Overlay string `json:"overlay" yaml:"overlay"`

	// Constraint is the name of the constraint that failed.
	Constraint string `json:"constraint" yaml:"constraint"`

	// Expected is the expected constraint value.
	Expected string `json:"expected" yaml:"expected"`

	// Actual is the actual value from the snapshot (if found).
	Actual string `json:"actual,omitempty" yaml:"actual,omitempty"`

	// Reason explains why the constraint evaluation resulted in exclusion.
	Reason string `json:"reason" yaml:"reason"`
}

ConstraintWarning represents a warning about an overlay that matched criteria but was excluded due to failing constraint validation against the snapshot.

type Criteria

type Criteria struct {
	// Service is the Kubernetes service type (eks, gke, aks, oke, self-managed).
	Service CriteriaServiceType `json:"service,omitempty" yaml:"service,omitempty"`

	// Accelerator is the GPU/accelerator type (h100, gb200, b200, a100, l40).
	Accelerator CriteriaAcceleratorType `json:"accelerator,omitempty" yaml:"accelerator,omitempty"`

	// Intent is the workload intent (training, inference).
	Intent CriteriaIntentType `json:"intent,omitempty" yaml:"intent,omitempty"`

	// OS is the worker node operating system type.
	OS CriteriaOSType `json:"os,omitempty" yaml:"os,omitempty"`

	// Platform is the platform/framework type (kubeflow).
	Platform CriteriaPlatformType `json:"platform,omitempty" yaml:"platform,omitempty"`

	// Nodes is the number of worker nodes (0 means any/unspecified).
	Nodes int `json:"nodes,omitempty" yaml:"nodes,omitempty"`
}

Criteria represents the input parameters for recipe matching. All fields are optional and default to "any" if not specified.

func BuildCriteria

func BuildCriteria(opts ...CriteriaOption) (*Criteria, error)

BuildCriteria creates a Criteria from functional options.

func ExtractCriteriaFromSnapshot

func ExtractCriteriaFromSnapshot(snap *snapshotter.Snapshot) *Criteria

ExtractCriteriaFromSnapshot extracts criteria from a snapshot. This maps snapshot measurements to criteria fields.

func LoadCriteriaFromFile

func LoadCriteriaFromFile(path string) (*Criteria, error)

LoadCriteriaFromFile loads criteria from a YAML or JSON file. The file format is auto-detected from the file extension. All fields are optional and default to "any" if not specified.

Example file (YAML):

kind: RecipeCriteria
apiVersion: aicr.nvidia.com/v1alpha1
metadata:
  name: gb200-eks-ubuntu-training
spec:
  service: eks
  os: ubuntu
  accelerator: gb200
  intent: training

func LoadCriteriaFromFileWithContext

func LoadCriteriaFromFileWithContext(ctx context.Context, path string) (*Criteria, error)

LoadCriteriaFromFileWithContext loads criteria from a YAML or JSON file with context support. The file format is auto-detected from the file extension. All fields are optional and default to "any" if not specified.

For HTTP/HTTPS URLs, the context is used for timeout and cancellation. For local file paths, the context is currently not used but is accepted for API consistency.

Example file (YAML):

kind: RecipeCriteria
apiVersion: aicr.nvidia.com/v1alpha1
metadata:
  name: gb200-eks-ubuntu-training
spec:
  service: eks
  os: ubuntu
  accelerator: gb200
  intent: training

func NewCriteria

func NewCriteria() *Criteria

NewCriteria creates a new Criteria with all fields set to "any".

func ParseCriteriaFromBody

func ParseCriteriaFromBody(body io.Reader, contentType string) (*Criteria, error)

ParseCriteriaFromBody parses criteria from an io.Reader (HTTP request body). Supports JSON and YAML based on the Content-Type header. All fields are optional and default to "any" if not specified.

Supported Content-Types:

  • application/json
  • application/x-yaml, application/yaml, text/yaml

If Content-Type is empty or unrecognized, JSON is assumed.

Example JSON body:

{
  "kind": "RecipeCriteria",
  "apiVersion": "aicr.nvidia.com/v1alpha1",
  "metadata": {"name": "my-criteria"},
  "spec": {"service": "eks", "accelerator": "h100"}
}
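The Content-Type dispatch described above might look roughly like the following. This is a sketch under assumptions: `bodyFormat` is a hypothetical helper, and stripping parameters such as `charset` with `mime.ParseMediaType` is a choice made here, not something the package documentation confirms.

```go
package main

import (
	"fmt"
	"mime"
)

// bodyFormat picks a decoder name ("json" or "yaml") from a Content-Type value.
func bodyFormat(contentType string) string {
	// ParseMediaType lower-cases the type and drops parameters
	// such as "; charset=utf-8". It fails on an empty header.
	mt, _, err := mime.ParseMediaType(contentType)
	if err != nil {
		return "json" // empty or malformed header: fall back to JSON
	}
	switch mt {
	case "application/x-yaml", "application/yaml", "text/yaml":
		return "yaml"
	default:
		return "json" // unrecognized types are treated as JSON
	}
}

func main() {
	for _, ct := range []string{"", "application/json", "text/yaml", "text/plain"} {
		fmt.Printf("%q -> %s\n", ct, bodyFormat(ct))
	}
}
```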

func ParseCriteriaFromRequest

func ParseCriteriaFromRequest(r *http.Request) (*Criteria, error)

ParseCriteriaFromRequest parses recipe criteria from HTTP query parameters. All parameters are optional and default to "any" if not specified. Supported parameters: service, accelerator (alias: gpu), intent, os, nodes.

func ParseCriteriaFromValues

func ParseCriteriaFromValues(values url.Values) (*Criteria, error)

ParseCriteriaFromValues parses recipe criteria from URL values. All parameters are optional and default to "any" if not specified. Supported parameters: service, accelerator (alias: gpu), intent, os, nodes.

func (*Criteria) Matches

func (c *Criteria) Matches(other *Criteria) bool

Matches checks if this recipe criteria matches the given query criteria. Uses asymmetric matching:

  • Query "any" (or empty) = ONLY matches recipes that are also "any"/empty for that field
  • Recipe "any" (or empty) = wildcard (matches any query value for that field)
  • Query specific + Recipe specific = must match exactly

This ensures a generic query (e.g., accelerator=any) only matches generic recipes (e.g., accelerator=any), while a specific query (e.g., accelerator=gb200) can match both generic recipes and recipes with that specific value.
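The three rules can be sketched on a reduced criteria type. The `crit` struct, `isAny`, and `fieldMatches` are hypothetical names for illustration; the real Criteria has more fields, and how Nodes participates in matching is not covered here.

```go
package main

import "fmt"

// isAny treats both "" and "any" as the wildcard value.
func isAny(s string) bool { return s == "" || s == "any" }

// fieldMatches applies the asymmetric rule to a single field.
func fieldMatches(recipe, query string) bool {
	if isAny(recipe) {
		return true // recipe wildcard accepts every query value
	}
	if isAny(query) {
		return false // a generic query never selects a specific recipe
	}
	return recipe == query // both specific: must match exactly
}

// crit is a two-field stand-in for recipe.Criteria.
type crit struct{ Service, Accelerator string }

// matches reports whether recipe criteria c matches query q on every field.
func (c crit) matches(q crit) bool {
	return fieldMatches(c.Service, q.Service) &&
		fieldMatches(c.Accelerator, q.Accelerator)
}

func main() {
	generic := crit{Service: "eks"} // accelerator left as wildcard
	specific := crit{Service: "eks", Accelerator: "gb200"}

	specificQuery := crit{Service: "eks", Accelerator: "gb200"}
	genericQuery := crit{Service: "eks", Accelerator: "any"}

	fmt.Println(generic.matches(specificQuery))  // true
	fmt.Println(specific.matches(specificQuery)) // true
	fmt.Println(generic.matches(genericQuery))   // true
	fmt.Println(specific.matches(genericQuery))  // false
}
```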

func (*Criteria) Specificity

func (c *Criteria) Specificity() int

Specificity returns a score indicating how specific this criteria is. Higher scores mean more specific criteria (fewer "any" fields). Used for ordering overlay application - more specific overlays are applied later.
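One plausible scoring scheme, shown only as a sketch: count the fields that carry a concrete value, then sort ascending so the most specific overlay is merged last. The `overlay` type, `score`, and `orderForApply` are hypothetical, and the package's actual weights may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// overlay is a hypothetical stand-in: a name plus its criteria field values.
type overlay struct {
	Name   string
	Fields []string // "" or "any" means unspecified
}

// score counts the fields that carry a concrete value.
func score(o overlay) int {
	n := 0
	for _, f := range o.Fields {
		if f != "" && f != "any" {
			n++
		}
	}
	return n
}

// orderForApply sorts least specific first, so that later (more specific)
// overlays win when merged on top of earlier ones.
func orderForApply(overlays []overlay) {
	sort.SliceStable(overlays, func(i, j int) bool {
		return score(overlays[i]) < score(overlays[j])
	})
}

func main() {
	overlays := []overlay{
		{Name: "h100-eks-training", Fields: []string{"eks", "h100", "training"}},
		{Name: "eks", Fields: []string{"eks", "any", "any"}},
		{Name: "eks-training", Fields: []string{"eks", "any", "training"}},
	}
	orderForApply(overlays)
	for _, o := range overlays {
		fmt.Println(score(o), o.Name)
	}
}
```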

func (*Criteria) String

func (c *Criteria) String() string

String returns a human-readable representation of the criteria.

func (*Criteria) Validate added in v0.11.0

func (c *Criteria) Validate() error

Validate checks that all non-empty criteria fields contain valid values. This runs the same parsing/normalization as ParseCriteriaFromValues, ensuring POST request bodies are validated the same as GET query parameters.

type CriteriaAcceleratorType

type CriteriaAcceleratorType string

CriteriaAcceleratorType represents the GPU/accelerator type.

const (
	CriteriaAcceleratorAny   CriteriaAcceleratorType = "any"
	CriteriaAcceleratorH100  CriteriaAcceleratorType = "h100"
	CriteriaAcceleratorGB200 CriteriaAcceleratorType = "gb200"
	CriteriaAcceleratorB200  CriteriaAcceleratorType = "b200"
	CriteriaAcceleratorA100  CriteriaAcceleratorType = "a100"
	CriteriaAcceleratorL40   CriteriaAcceleratorType = "l40"
)

CriteriaAcceleratorType constants for supported accelerators.

func ParseCriteriaAcceleratorType

func ParseCriteriaAcceleratorType(s string) (CriteriaAcceleratorType, error)

ParseCriteriaAcceleratorType parses a string into a CriteriaAcceleratorType.

type CriteriaIntentType

type CriteriaIntentType string

CriteriaIntentType represents the workload intent.

const (
	CriteriaIntentAny       CriteriaIntentType = "any"
	CriteriaIntentTraining  CriteriaIntentType = "training"
	CriteriaIntentInference CriteriaIntentType = "inference"
)

CriteriaIntentType constants for supported workload intents.

func ParseCriteriaIntentType

func ParseCriteriaIntentType(s string) (CriteriaIntentType, error)

ParseCriteriaIntentType parses a string into a CriteriaIntentType.

type CriteriaOSType

type CriteriaOSType string

CriteriaOSType represents an operating system type.

const (
	CriteriaOSAny         CriteriaOSType = "any"
	CriteriaOSUbuntu      CriteriaOSType = "ubuntu"
	CriteriaOSRHEL        CriteriaOSType = "rhel"
	CriteriaOSCOS         CriteriaOSType = "cos"
	CriteriaOSAmazonLinux CriteriaOSType = "amazonlinux"
)

CriteriaOSType constants for supported operating systems.

func ParseCriteriaOSType

func ParseCriteriaOSType(s string) (CriteriaOSType, error)

ParseCriteriaOSType parses a string into a CriteriaOSType.

type CriteriaOption

type CriteriaOption func(*Criteria) error

CriteriaOption is a functional option for building Criteria.

func WithCriteriaAccelerator

func WithCriteriaAccelerator(s string) CriteriaOption

WithCriteriaAccelerator sets the accelerator type.

func WithCriteriaIntent

func WithCriteriaIntent(s string) CriteriaOption

WithCriteriaIntent sets the intent type.

func WithCriteriaNodes

func WithCriteriaNodes(n int) CriteriaOption

WithCriteriaNodes sets the number of nodes.

func WithCriteriaOS

func WithCriteriaOS(s string) CriteriaOption

WithCriteriaOS sets the OS type.

func WithCriteriaPlatform

func WithCriteriaPlatform(s string) CriteriaOption

WithCriteriaPlatform sets the platform type.

func WithCriteriaService

func WithCriteriaService(s string) CriteriaOption

WithCriteriaService sets the service type.

type CriteriaPlatformType

type CriteriaPlatformType string

CriteriaPlatformType represents a platform/framework type.

const (
	CriteriaPlatformAny      CriteriaPlatformType = "any"
	CriteriaPlatformDynamo   CriteriaPlatformType = "dynamo"
	CriteriaPlatformKubeflow CriteriaPlatformType = "kubeflow"
)

CriteriaPlatformType constants for supported platforms.

func ParseCriteriaPlatformType

func ParseCriteriaPlatformType(s string) (CriteriaPlatformType, error)

ParseCriteriaPlatformType parses a string into a CriteriaPlatformType.

type CriteriaServiceType

type CriteriaServiceType string

CriteriaServiceType represents the Kubernetes service/platform type for criteria.

const (
	CriteriaServiceAny  CriteriaServiceType = "any"
	CriteriaServiceEKS  CriteriaServiceType = "eks"
	CriteriaServiceGKE  CriteriaServiceType = "gke"
	CriteriaServiceAKS  CriteriaServiceType = "aks"
	CriteriaServiceOKE  CriteriaServiceType = "oke"
	CriteriaServiceKind CriteriaServiceType = "kind"
)

CriteriaServiceType constants for supported Kubernetes services.

func ParseCriteriaServiceType

func ParseCriteriaServiceType(s string) (CriteriaServiceType, error)

ParseCriteriaServiceType parses a string into a CriteriaServiceType.

type DataProvider

type DataProvider interface {
	// ReadFile reads a file by path (relative to data directory).
	ReadFile(path string) ([]byte, error)

	// WalkDir walks the directory tree rooted at root.
	WalkDir(root string, fn fs.WalkDirFunc) error

	// Source returns a description of where data came from (for debugging).
	Source(path string) string
}

DataProvider abstracts access to recipe data files. This allows layering external directories over embedded data.

func GetDataProvider

func GetDataProvider() DataProvider

GetDataProvider returns the global data provider. Returns the embedded provider if none was set.

type EmbeddedDataProvider

type EmbeddedDataProvider struct {
	// contains filtered or unexported fields
}

EmbeddedDataProvider wraps an embed.FS to implement DataProvider.

func NewEmbeddedDataProvider

func NewEmbeddedDataProvider(efs embed.FS, prefix string) *EmbeddedDataProvider

NewEmbeddedDataProvider creates a provider from an embedded filesystem.

func (*EmbeddedDataProvider) ReadFile

func (p *EmbeddedDataProvider) ReadFile(path string) ([]byte, error)

ReadFile reads a file from the embedded filesystem.

func (*EmbeddedDataProvider) Source

func (p *EmbeddedDataProvider) Source(path string) string

Source returns "embedded" for all paths.

func (*EmbeddedDataProvider) WalkDir

func (p *EmbeddedDataProvider) WalkDir(root string, fn fs.WalkDirFunc) error

WalkDir walks the embedded filesystem.

type ExpectedResource

type ExpectedResource struct {
	// Kind is the resource kind (e.g., "Deployment", "DaemonSet").
	Kind string `json:"kind" yaml:"kind"`

	// Name is the resource name.
	Name string `json:"name" yaml:"name"`

	// Namespace is the resource namespace (optional for cluster-scoped resources).
	Namespace string `json:"namespace,omitempty" yaml:"namespace,omitempty"`
}

ExpectedResource represents a Kubernetes resource that should exist after deployment.

type HealthCheckConfig added in v0.7.8

type HealthCheckConfig struct {
	// AssertFile is the path to a Chainsaw-style assert YAML file (relative to data directory).
	// When set, the expected-resources check uses Chainsaw CLI to evaluate assertions
	// instead of the default auto-discovery + typed replica checks.
	AssertFile string `yaml:"assertFile,omitempty"`
}

HealthCheckConfig defines custom health check settings for a component.

type HelmConfig

type HelmConfig struct {
	// DefaultRepository is the default Helm repository URL.
	DefaultRepository string `yaml:"defaultRepository,omitempty"`

	// DefaultChart is the chart name (e.g., "nvidia/gpu-operator").
	DefaultChart string `yaml:"defaultChart,omitempty"`

	// DefaultVersion is the default chart version if not specified in recipe.
	DefaultVersion string `yaml:"defaultVersion,omitempty"`

	// DefaultNamespace is the Kubernetes namespace for deploying this component.
	DefaultNamespace string `yaml:"defaultNamespace,omitempty"`
}

HelmConfig contains default Helm chart settings for a component.

type KustomizeConfig

type KustomizeConfig struct {
	// DefaultSource is the default Git repository or OCI reference.
	DefaultSource string `yaml:"defaultSource,omitempty"`

	// DefaultPath is the path within the repository to the kustomization.
	DefaultPath string `yaml:"defaultPath,omitempty"`

	// DefaultTag is the default Git tag, branch, or commit.
	DefaultTag string `yaml:"defaultTag,omitempty"`
}

KustomizeConfig contains default Kustomize settings for a component.

type LayeredDataProvider

type LayeredDataProvider struct {
	// contains filtered or unexported fields
}

LayeredDataProvider overlays an external directory on top of embedded data. For registryFileName: merges external components with embedded (external takes precedence). For all other files: external completely replaces embedded if present.

func NewLayeredDataProvider

func NewLayeredDataProvider(embedded *EmbeddedDataProvider, config LayeredProviderConfig) (*LayeredDataProvider, error)

NewLayeredDataProvider creates a provider that layers external data over embedded. Returns an error if:

  • External directory doesn't exist
  • External directory doesn't contain registryFileName
  • Path traversal is detected
  • File size exceeds limits

func (*LayeredDataProvider) ExternalDir added in v0.8.0

func (p *LayeredDataProvider) ExternalDir() string

ExternalDir returns the path to the external data directory.

func (*LayeredDataProvider) ExternalFiles added in v0.8.0

func (p *LayeredDataProvider) ExternalFiles() []string

ExternalFiles returns a sorted list of file paths that came from the external data directory. Paths are relative to the external directory root.

func (*LayeredDataProvider) ReadFile

func (p *LayeredDataProvider) ReadFile(path string) ([]byte, error)

ReadFile reads a file, checking external directory first. For registryFileName, returns merged content. For other files, external completely replaces embedded.

func (*LayeredDataProvider) Source

func (p *LayeredDataProvider) Source(path string) string

Source returns "external" or "embedded" depending on where the file comes from.

func (*LayeredDataProvider) WalkDir

func (p *LayeredDataProvider) WalkDir(root string, fn fs.WalkDirFunc) error

WalkDir walks both embedded and external directories. External files take precedence over embedded files.

type LayeredProviderConfig

type LayeredProviderConfig struct {
	// ExternalDir is the path to the external data directory.
	ExternalDir string

	// MaxFileSize is the maximum allowed file size in bytes (default: 10MB).
	MaxFileSize int64

	// AllowSymlinks allows symlinks in the external directory (default: false).
	AllowSymlinks bool
}

LayeredProviderConfig configures the layered data provider.

type MetadataStore

type MetadataStore struct {
	// Base is the base recipe metadata.
	Base *RecipeMetadata

	// Overlays maps overlay names to their recipe metadata.
	Overlays map[string]*RecipeMetadata

	// ValuesFiles contains embedded values file contents indexed by filename.
	ValuesFiles map[string][]byte
}

MetadataStore holds the base recipe and all overlays.

func (*MetadataStore) BuildRecipeResult

func (s *MetadataStore) BuildRecipeResult(ctx context.Context, criteria *Criteria) (*RecipeResult, error)

BuildRecipeResult builds a RecipeResult by merging base with matching overlays. Each matching overlay is resolved through its inheritance chain before merging. This enables multi-level inheritance: base → intermediate → overlay.

func (*MetadataStore) BuildRecipeResultWithEvaluator

func (s *MetadataStore) BuildRecipeResultWithEvaluator(ctx context.Context, criteria *Criteria, evaluator ConstraintEvaluatorFunc) (*RecipeResult, error)

BuildRecipeResultWithEvaluator builds a RecipeResult by merging base with matching overlays, filtering overlays based on constraint evaluation using the provided evaluator function.

This method extends BuildRecipeResult with constraint-aware filtering:

  • Each overlay that matches by criteria is tested against its constraints
  • Overlays with failing constraints are excluded from the merge
  • Warnings about excluded overlays are included in the result metadata

The evaluator function is called for each constraint in each matching overlay. If evaluator is nil, this method behaves identically to BuildRecipeResult.
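The filtering step can be sketched as follows. Everything here is hypothetical: the `overlay` type, the `evalFunc` signature (the real ConstraintEvaluatorFunc is defined elsewhere and may differ), and the warning text.

```go
package main

import "fmt"

// overlay is a hypothetical stand-in carrying constraint expressions as strings.
type overlay struct {
	Name        string
	Constraints []string
}

// evalFunc is an assumed evaluator shape: it reports whether one constraint holds.
type evalFunc func(constraint string) bool

// filter splits overlays into applied and excluded names and records one
// warning per failed constraint. A nil evaluator keeps every overlay.
func filter(overlays []overlay, eval evalFunc) (applied, excluded, warnings []string) {
	for _, o := range overlays {
		ok := true
		if eval != nil {
			for _, c := range o.Constraints {
				if !eval(c) {
					ok = false
					warnings = append(warnings,
						fmt.Sprintf("%s: constraint %q failed", o.Name, c))
				}
			}
		}
		if ok {
			applied = append(applied, o.Name)
		} else {
			excluded = append(excluded, o.Name)
		}
	}
	return applied, excluded, warnings
}

func main() {
	overlays := []overlay{
		{Name: "eks", Constraints: []string{"k8s>=1.30"}},
		{Name: "eks-training", Constraints: []string{"kernel>=6.8"}},
	}
	// Toy evaluator: only the Kubernetes constraint holds.
	eval := func(c string) bool { return c == "k8s>=1.30" }
	applied, excluded, warnings := filter(overlays, eval)
	fmt.Println(applied, excluded, warnings)
}
```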

func (*MetadataStore) FindMatchingOverlays

func (s *MetadataStore) FindMatchingOverlays(criteria *Criteria) []*RecipeMetadata

FindMatchingOverlays finds all overlays that match the given criteria. Returns overlays sorted by specificity (least specific first).

func (*MetadataStore) GetRecipeByName

func (s *MetadataStore) GetRecipeByName(name string) (*RecipeMetadata, bool)

GetRecipeByName returns a recipe metadata by name. Returns the base recipe if name is "base", otherwise looks up in overlays.

func (*MetadataStore) GetValuesFile

func (s *MetadataStore) GetValuesFile(filename string) ([]byte, error)

GetValuesFile returns the content of a values file by filename.

type NodeSchedulingConfig

type NodeSchedulingConfig struct {
	// System defines paths for system component scheduling.
	System SchedulingPaths `yaml:"system,omitempty"`

	// Accelerated defines paths for GPU/accelerated node scheduling.
	Accelerated SchedulingPaths `yaml:"accelerated,omitempty"`

	// NodeCountPaths are Helm value paths where the bundle-time node count is injected (e.g. estimatedNodeCount for skyhook-operator).
	NodeCountPaths []string `yaml:"nodeCountPaths,omitempty"`
}

NodeSchedulingConfig defines paths for node scheduling injection.

type NodeSelection

type NodeSelection struct {
	// Selector specifies label-based node selection.
	Selector map[string]string `json:"selector,omitempty" yaml:"selector,omitempty"`

	// MaxNodes limits the number of nodes to validate.
	MaxNodes int `json:"maxNodes,omitempty" yaml:"maxNodes,omitempty"`

	// ExcludeNodes lists node names to exclude from validation.
	ExcludeNodes []string `json:"excludeNodes,omitempty" yaml:"excludeNodes,omitempty"`
}

NodeSelection defines node filtering for validation scope.

type Option

type Option func(*Builder)

Option is a functional option for configuring Builder instances.

func WithAllowLists

func WithAllowLists(al *AllowLists) Option

WithAllowLists returns an Option that sets criteria allowlists for the Builder. When allowlists are configured, the Builder will reject criteria values that are not in the allowed list. This is used by the API server to restrict which criteria values can be requested.

func WithVersion

func WithVersion(version string) Option

WithVersion returns an Option that sets the Builder version string. The version is included in recipe metadata for tracking purposes.

type PodSchedulingConfig

type PodSchedulingConfig struct {
	// Workload defines paths for workload pod scheduling.
	Workload WorkloadSchedulingPaths `yaml:"workload,omitempty"`
}

PodSchedulingConfig defines paths for pod scheduling injection.

type QueryRequest added in v0.11.0

type QueryRequest struct {
	Criteria *Criteria `json:"criteria" yaml:"criteria"`
	Selector string    `json:"selector" yaml:"selector"`
}

QueryRequest represents a query API request body for POST.

type Recipe

type Recipe struct {
	header.Header `json:",inline" yaml:",inline"`

	Request      *RequestInfo               `json:"request,omitempty" yaml:"request,omitempty"`
	MatchedRules []string                   `json:"matchedRules,omitempty" yaml:"matchedRules,omitempty"`
	Measurements []*measurement.Measurement `json:"measurements" yaml:"measurements"`
}

Recipe represents the recipe response structure.

func (*Recipe) GetComponentRef

func (r *Recipe) GetComponentRef(name string) *ComponentRef

GetComponentRef returns nil for Recipe (v1 format doesn't have components).

func (*Recipe) GetCriteria

func (r *Recipe) GetCriteria() *Criteria

GetCriteria returns nil for Recipe (v1 format doesn't have criteria).

func (*Recipe) GetValuesForComponent

func (r *Recipe) GetValuesForComponent(name string) (map[string]any, error)

GetValuesForComponent extracts values from measurements for Recipe. This maintains backward compatibility with the legacy measurements-based format.

func (*Recipe) GetVersion

func (r *Recipe) GetVersion() string

GetVersion returns the recipe version from metadata.

func (*Recipe) Validate

func (v *Recipe) Validate() error

Validate validates a recipe against all registered bundlers that implement Validator.

func (*Recipe) ValidateStructure

func (v *Recipe) ValidateStructure() error

ValidateStructure performs basic structural validation.

type RecipeCriteria

type RecipeCriteria struct {
	// Kind is always "RecipeCriteria".
	Kind string `json:"kind" yaml:"kind"`

	// APIVersion is the API version (e.g., "aicr.nvidia.com/v1alpha1").
	APIVersion string `json:"apiVersion" yaml:"apiVersion"`

	// Metadata contains the name and other metadata.
	Metadata struct {
		// Name is the unique identifier for this criteria set.
		Name string `json:"name" yaml:"name"`
	} `json:"metadata" yaml:"metadata"`

	// Spec contains the actual criteria specification.
	Spec *Criteria `json:"spec" yaml:"spec"`
}

RecipeCriteria represents a Kubernetes-style criteria resource. This is the format used in criteria files and API requests.

Example:

kind: RecipeCriteria
apiVersion: aicr.nvidia.com/v1alpha1
metadata:
  name: gb200-eks-ubuntu-training
spec:
  service: eks
  os: ubuntu
  accelerator: gb200
  intent: training
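Decoding the JSON form of this resource with the standard library might look like the sketch below. The `spec` and `envelope` types are local mirrors trimmed to a few fields, `decode` is a hypothetical helper, and rejecting other kinds or substituting an empty spec are assumptions about reasonable handling, not documented behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// spec mirrors a subset of the documented Criteria fields.
type spec struct {
	Service     string `json:"service,omitempty"`
	Accelerator string `json:"accelerator,omitempty"`
}

// envelope mirrors the documented RecipeCriteria shape.
type envelope struct {
	Kind       string `json:"kind"`
	APIVersion string `json:"apiVersion"`
	Metadata   struct {
		Name string `json:"name"`
	} `json:"metadata"`
	Spec *spec `json:"spec"`
}

// decode parses a JSON body and applies two assumed checks.
func decode(body string) (*envelope, error) {
	var e envelope
	if err := json.NewDecoder(strings.NewReader(body)).Decode(&e); err != nil {
		return nil, err
	}
	if e.Kind != "RecipeCriteria" {
		return nil, fmt.Errorf("unexpected kind %q", e.Kind)
	}
	if e.Spec == nil {
		e.Spec = &spec{} // assumed: a missing spec means every field is unspecified
	}
	return &e, nil
}

func main() {
	e, err := decode(`{
	  "kind": "RecipeCriteria",
	  "apiVersion": "aicr.nvidia.com/v1alpha1",
	  "metadata": {"name": "my-criteria"},
	  "spec": {"service": "eks", "accelerator": "h100"}
	}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(e.Metadata.Name, e.Spec.Service, e.Spec.Accelerator)
}
```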

type RecipeInput

type RecipeInput interface {
	// GetComponentRef returns the component reference for a given component name.
	// Returns nil if the component is not found.
	GetComponentRef(name string) *ComponentRef

	// GetValuesForComponent returns the values map for a given component.
	// For Recipe, this extracts values from measurements.
	// For RecipeResult, this loads values from the component's valuesFile.
	GetValuesForComponent(name string) (map[string]any, error)

	// GetVersion returns the recipe version (CLI version that generated the recipe).
	// Returns empty string if version is not available.
	GetVersion() string

	// GetCriteria returns the criteria used to generate this recipe.
	// Returns nil if criteria is not available (e.g., for legacy Recipe format).
	GetCriteria() *Criteria
}

RecipeInput is an interface that both Recipe and RecipeResult implement. This allows bundlers to work with either format during the transition period.

type RecipeMetadata

type RecipeMetadata struct {
	RecipeMetadataHeader `json:",inline" yaml:",inline"`

	// Spec contains the recipe specification.
	Spec RecipeMetadataSpec `json:"spec" yaml:"spec"`
}

RecipeMetadata represents a recipe definition (base or overlay).

type RecipeMetadataHeader

type RecipeMetadataHeader struct {
	// Kind is always "RecipeMetadata".
	Kind string `json:"kind" yaml:"kind"`

	// APIVersion is the API version (e.g., "aicr.nvidia.com/v1alpha1").
	APIVersion string `json:"apiVersion" yaml:"apiVersion"`

	// Metadata contains the name and other metadata.
	Metadata struct {
		Name string `json:"name" yaml:"name"`
	} `json:"metadata" yaml:"metadata"`
}

RecipeMetadataHeader contains the Kubernetes-style header fields.

type RecipeMetadataSpec

type RecipeMetadataSpec struct {
	// Base is the name of the parent recipe to inherit from.
	// If empty, the recipe inherits from "base" (the root base.yaml).
	// This enables multi-level inheritance chains like:
	//   base → eks → eks-training → h100-eks-training
	Base string `json:"base,omitempty" yaml:"base,omitempty"`

	// Criteria defines when this recipe/overlay applies.
	// Only present in overlay files, not in base.
	Criteria *Criteria `json:"criteria,omitempty" yaml:"criteria,omitempty"`

	// Constraints are deployment assumptions/requirements.
	Constraints []Constraint `json:"constraints,omitempty" yaml:"constraints,omitempty"`

	// ComponentRefs is the list of components to deploy.
	ComponentRefs []ComponentRef `json:"componentRefs,omitempty" yaml:"componentRefs,omitempty"`

	// Validation defines multi-phase validation configuration.
	// Presence of a phase implies it is enabled.
	Validation *ValidationConfig `json:"validation,omitempty" yaml:"validation,omitempty"`
}

RecipeMetadataSpec contains the specification for a recipe.
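Resolving the Base chain for one recipe can be sketched as a walk up the parent links. The `chain` function and the `parents` map (recipe name to its Base value) are hypothetical; the cycle check is an added safeguard, and the sketch does not verify that each named parent exists.

```go
package main

import "fmt"

// chain returns the inheritance path for name, root first.
// parents maps a recipe name to its Base field; "" means "inherit from base".
func chain(parents map[string]string, name string) ([]string, error) {
	var out []string
	seen := map[string]bool{}
	cur := name
	for {
		if seen[cur] {
			return nil, fmt.Errorf("inheritance cycle at %q", cur)
		}
		seen[cur] = true
		out = append([]string{cur}, out...) // prepend so the root ends up first
		if cur == "base" {
			return out, nil
		}
		parent := parents[cur]
		if parent == "" {
			parent = "base" // empty Base inherits from the root recipe
		}
		cur = parent
	}
}

func main() {
	parents := map[string]string{
		"eks":               "",
		"eks-training":      "eks",
		"h100-eks-training": "eks-training",
	}
	fmt.Println(chain(parents, "h100-eks-training"))
}
```

Merging the specs along the returned path, in order, lets each level override the one before it.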

func (*RecipeMetadataSpec) Merge

func (s *RecipeMetadataSpec) Merge(other *RecipeMetadataSpec)

Merge merges another RecipeMetadataSpec into this one. The other spec takes precedence for conflicts.

func (*RecipeMetadataSpec) TopologicalSort

func (s *RecipeMetadataSpec) TopologicalSort() ([]string, error)

TopologicalSort returns components in dependency order (dependencies first). Components with no dependencies come first, then components that depend only on already-listed components, etc.
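A dependency ordering of this kind is commonly produced with Kahn's algorithm; the sketch below uses it, without claiming it is what the package does. `topoSort` and its map input (component name to dependency names) are hypothetical, and the component names other than gpu-operator are made up for the example.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// topoSort orders names so that every dependency precedes its dependents.
// It fails on unknown dependencies and on cycles.
func topoSort(deps map[string][]string) ([]string, error) {
	indegree := make(map[string]int, len(deps))
	dependents := make(map[string][]string)
	for name, ds := range deps {
		indegree[name] += 0 // make sure every component has an entry
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("%s depends on unknown component %s", name, d)
			}
			indegree[name]++
			dependents[d] = append(dependents[d], name)
		}
	}
	var ready []string
	for name, n := range indegree {
		if n == 0 {
			ready = append(ready, name)
		}
	}
	var order []string
	for len(ready) > 0 {
		sort.Strings(ready) // keep the output deterministic
		next := ready[0]
		ready = ready[1:]
		order = append(order, next)
		for _, dep := range dependents[next] {
			indegree[dep]--
			if indegree[dep] == 0 {
				ready = append(ready, dep)
			}
		}
	}
	if len(order) != len(deps) {
		return nil, errors.New("circular dependency detected")
	}
	return order, nil
}

func main() {
	order, err := topoSort(map[string][]string{
		"gpu-operator":     {"cert-manager"},
		"network-operator": {"cert-manager"},
		"cert-manager":     nil,
	})
	fmt.Println(order, err)
}
```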

func (*RecipeMetadataSpec) ValidateDependencies

func (s *RecipeMetadataSpec) ValidateDependencies() error

ValidateDependencies validates that all dependencyRefs reference existing components. Returns an error if any dependency is missing or if there are circular dependencies.

type RecipeResult

type RecipeResult struct {
	// Kind is always "RecipeResult".
	Kind string `json:"kind" yaml:"kind"`

	// APIVersion is the API version.
	APIVersion string `json:"apiVersion" yaml:"apiVersion"`

	// Metadata contains result metadata.
	Metadata struct {
		// Version is the recipe version (CLI version that generated this recipe).
		Version string `json:"version,omitempty" yaml:"version,omitempty"`

		// AppliedOverlays lists the overlay names in order of application.
		AppliedOverlays []string `json:"appliedOverlays,omitempty" yaml:"appliedOverlays,omitempty"`

		// ExcludedOverlays lists overlays that matched criteria but were excluded
		// due to failing constraint validation against the snapshot.
		// Only populated when a snapshot is provided during recipe generation.
		ExcludedOverlays []string `json:"excludedOverlays,omitempty" yaml:"excludedOverlays,omitempty"`

		// ConstraintWarnings contains details about why specific overlays were excluded.
		// Helps users understand why certain environment-specific configurations
		// were not applied and what would need to change to include them.
		ConstraintWarnings []ConstraintWarning `json:"constraintWarnings,omitempty" yaml:"constraintWarnings,omitempty"`
	} `json:"metadata" yaml:"metadata"`

	// Criteria is the input criteria used to generate this result.
	Criteria *Criteria `json:"criteria" yaml:"criteria"`

	// Constraints is the merged list of constraints.
	Constraints []Constraint `json:"constraints,omitempty" yaml:"constraints,omitempty"`

	// ComponentRefs is the merged list of components.
	ComponentRefs []ComponentRef `json:"componentRefs" yaml:"componentRefs"`

	// DeploymentOrder is the topologically sorted component names for deployment.
	// Components should be deployed in this order to satisfy dependencies.
	DeploymentOrder []string `json:"deploymentOrder" yaml:"deploymentOrder"`

	// Validation defines multi-phase validation configuration.
	// Inherited from recipe metadata during merging.
	Validation *ValidationConfig `json:"validation,omitempty" yaml:"validation,omitempty"`
}

RecipeResult represents the final merged recipe output.

func (*RecipeResult) GetComponentRef

func (r *RecipeResult) GetComponentRef(name string) *ComponentRef

GetComponentRef returns the component reference for a given component name.

func (*RecipeResult) GetCriteria

func (r *RecipeResult) GetCriteria() *Criteria

GetCriteria returns the criteria used to generate this recipe result.

func (*RecipeResult) GetValuesForComponent

func (r *RecipeResult) GetValuesForComponent(name string) (map[string]any, error)

GetValuesForComponent loads values from the component's valuesFile and inline overrides. Merge order: base values → ValuesFile → Overrides (highest precedence). This supports three patterns:

  1. ValuesFile only: Traditional separate file approach
  2. Overrides only: Fully self-contained recipe with inline overrides
  3. ValuesFile + Overrides: Hybrid - reusable base with recipe-specific tweaks
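The precedence order can be illustrated with a small map merge. This sketch assumes nested maps are merged recursively while scalars and lists are replaced; the documentation states only the order of precedence, so the recursion is an assumption, and `merge` is a hypothetical helper.

```go
package main

import "fmt"

// merge returns a new map with src layered over dst. Nested maps are merged
// recursively; any other value in src replaces the one in dst.
func merge(dst, src map[string]any) map[string]any {
	out := make(map[string]any, len(dst)+len(src))
	for k, v := range dst {
		out[k] = v
	}
	for k, v := range src {
		if sm, ok := v.(map[string]any); ok {
			if dm, ok := out[k].(map[string]any); ok {
				out[k] = merge(dm, sm)
				continue
			}
		}
		out[k] = v
	}
	return out
}

func main() {
	base := map[string]any{"driver": map[string]any{"enabled": true, "version": "550"}}
	valuesFile := map[string]any{"driver": map[string]any{"version": "570"}}
	overrides := map[string]any{"driver": map[string]any{"enabled": false}}

	// Later layers win: base values, then the values file, then overrides.
	final := merge(merge(base, valuesFile), overrides)
	fmt.Println(final)
}
```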

func (*RecipeResult) GetVersion

func (r *RecipeResult) GetVersion() string

GetVersion returns the recipe version from metadata.

type RequestInfo

type RequestInfo struct {
	Os        string `json:"os,omitempty" yaml:"os,omitempty"`
	OsVersion string `json:"osVersion,omitempty" yaml:"osVersion,omitempty"`
	Service   string `json:"service,omitempty" yaml:"service,omitempty"`
	K8s       string `json:"k8s,omitempty" yaml:"k8s,omitempty"`
	GPU       string `json:"gpu,omitempty" yaml:"gpu,omitempty"`
	Intent    string `json:"intent,omitempty" yaml:"intent,omitempty"`
}

RequestInfo holds simplified request metadata for documentation purposes. This replaces the old Query type with just the fields needed for bundle documentation.

type SchedulingPaths

type SchedulingPaths struct {
	// NodeSelectorPaths are paths where node selectors are injected.
	NodeSelectorPaths []string `yaml:"nodeSelectorPaths,omitempty"`

	// TolerationPaths are paths where tolerations are injected.
	TolerationPaths []string `yaml:"tolerationPaths,omitempty"`

	// TaintPaths are paths where taints are injected as structured objects.
	// Intended to be used instead of TaintStrPaths for components that need to set specific parts of taints
	// and can't process the string format.
	TaintPaths []string `yaml:"taintPaths,omitempty"`

	// TaintStrPaths are paths where taints are injected as strings (format: key=value:effect or key:effect).
	TaintStrPaths []string `yaml:"taintStrPaths,omitempty"`
}

SchedulingPaths holds the Helm value paths for node scheduling.
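The two taint encodings differ only in whether the taint is a single string or a structured object. A sketch of converting the documented string format (key=value:effect or key:effect) into its parts follows; the `taint` type and `parseTaint` are hypothetical, and the sample taint strings are illustrative.

```go
package main

import (
	"fmt"
	"strings"
)

// taint is a structured form of a node taint.
type taint struct{ Key, Value, Effect string }

// parseTaint splits "key=value:effect" or "key:effect" into its parts.
func parseTaint(s string) (taint, error) {
	i := strings.LastIndex(s, ":")
	if i <= 0 || i == len(s)-1 {
		return taint{}, fmt.Errorf("taint %q: want key=value:effect or key:effect", s)
	}
	keyValue, effect := s[:i], s[i+1:]
	key, value, _ := strings.Cut(keyValue, "=") // value stays empty without "="
	if key == "" {
		return taint{}, fmt.Errorf("taint %q: empty key", s)
	}
	return taint{Key: key, Value: value, Effect: effect}, nil
}

func main() {
	for _, s := range []string{"nvidia.com/gpu=present:NoSchedule", "dedicated:NoExecute", "bad"} {
		t, err := parseTaint(s)
		fmt.Printf("%+v %v\n", t, err)
	}
}
```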

type ValidationConfig

type ValidationConfig struct {
	// Readiness defines readiness validation phase settings.
	Readiness *ValidationPhase `json:"readiness,omitempty" yaml:"readiness,omitempty"`

	// Deployment defines deployment validation phase settings.
	Deployment *ValidationPhase `json:"deployment,omitempty" yaml:"deployment,omitempty"`

	// Performance defines performance validation phase settings.
	Performance *ValidationPhase `json:"performance,omitempty" yaml:"performance,omitempty"`

	// Conformance defines conformance validation phase settings.
	Conformance *ValidationPhase `json:"conformance,omitempty" yaml:"conformance,omitempty"`
}

ValidationConfig defines validation phases and settings.

type ValidationPhase

type ValidationPhase struct {
	// Timeout is the maximum duration for this phase (e.g., "10m").
	Timeout string `json:"timeout,omitempty" yaml:"timeout,omitempty"`

	// Constraints are phase-level constraints to evaluate.
	Constraints []Constraint `json:"constraints,omitempty" yaml:"constraints,omitempty"`

	// Checks are named validation checks to run in this phase.
	Checks []string `json:"checks,omitempty" yaml:"checks,omitempty"`

	// NodeSelection defines which nodes to include in validation.
	NodeSelection *NodeSelection `json:"nodeSelection,omitempty" yaml:"nodeSelection,omitempty"`

	// Infrastructure references a componentRef that provides validation infrastructure.
	// Example: "nccl-doctor" for performance testing.
	Infrastructure string `json:"infrastructure,omitempty" yaml:"infrastructure,omitempty"`
}

ValidationPhase represents a single validation phase configuration.

type WorkloadSchedulingPaths

type WorkloadSchedulingPaths struct {
	// WorkloadSelectorPaths are paths where workload selectors are injected.
	WorkloadSelectorPaths []string `yaml:"workloadSelectorPaths,omitempty"`
}

WorkloadSchedulingPaths holds the Helm value paths for workload scheduling.
