semantic

package
v0.27.0 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 14, 2026 License: GPL-3.0 Imports: 21 Imported by: 0

Documentation

Overview

Package semantic provides a semantic model for Dockerfiles that enables cross-instruction analysis such as stage resolution, variable scoping, and COPY --from validation.

The semantic model is built in a single pass from a ParseResult and is immutable after construction. Construction-time violations are accumulated and returned with the model.

Variables

var DefaultShell = []string{"/bin/sh", "-c"}

DefaultShell is the default shell used by Docker for Linux RUN instructions.

Functions

func DefaultWindowsShell added in v0.19.0

func DefaultWindowsShell() []string

DefaultWindowsShell returns the default shell for Windows containers. Returns a fresh copy to avoid mutation.
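
The fresh-copy guarantee matters because a shared slice can be altered by any caller. The following is a minimal sketch of the pattern, not the package's implementation: `windowsShell` and `defaultWindowsShell` are hypothetical names, and the value assumes Docker's documented Windows default of `cmd /S /C`.

```go
package main

import "fmt"

// windowsShell is a hypothetical backing value; the real package may store
// or build its default differently.
var windowsShell = []string{"cmd", "/S", "/C"}

// defaultWindowsShell returns a copy so callers cannot alter the shared default.
func defaultWindowsShell() []string {
	out := make([]string, len(windowsShell))
	copy(out, windowsShell)
	return out
}

func main() {
	a := defaultWindowsShell()
	a[0] = "powershell" // changes the caller's copy only
	b := defaultWindowsShell()
	fmt.Println(a[0], b[0]) // prints "powershell cmd"
}
```

By contrast, DefaultShell is an exported variable, so callers that need to modify it should copy it first.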

func ExpectedPlatform

func ExpectedPlatform(info *StageInfo, model *Model) (string, []string)

ExpectedPlatform determines the expected platform for a stage.

Resolution order:

  1. FROM --platform if present and resolvable via the semantic model's fromArgEval
  2. DOCKER_DEFAULT_PLATFORM environment variable
  3. Default container platform (linux/<host-arch>)

Returns the platform string (e.g., "linux/amd64") and any unresolved ARG names.
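
The resolution order above can be sketched as a simple fallback chain. This is an illustration under assumptions: `resolvePlatform` and `hostDefaultPlatform` are hypothetical helpers, the FROM --platform value is taken as already expanded (empty when absent or unresolvable), and the unresolved ARG names that ExpectedPlatform also returns are omitted.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// hostDefaultPlatform is a hypothetical helper for step 3: linux/<host-arch>.
func hostDefaultPlatform() string {
	return "linux/" + runtime.GOARCH
}

// resolvePlatform is a hypothetical helper mirroring the documented order.
// fromPlatform is the expanded FROM --platform value, or "" if it is absent
// or could not be resolved. getenv is injected so the lookup is testable.
func resolvePlatform(fromPlatform string, getenv func(string) string) string {
	if fromPlatform != "" {
		return fromPlatform // 1. FROM --platform
	}
	if p := getenv("DOCKER_DEFAULT_PLATFORM"); p != "" {
		return p // 2. environment override
	}
	return hostDefaultPlatform() // 3. default container platform
}

func main() {
	fmt.Println(resolvePlatform("linux/arm64", os.Getenv))
	fmt.Println(resolvePlatform("", os.Getenv))
}
```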

Types

type ArgEntry

type ArgEntry struct {
	// Name is the argument name.
	Name string
	// Value is the default value (nil means no default).
	Value *string
	// Location is where the ARG was declared.
	Location []parser.Range
}

ArgEntry represents a single ARG declaration.

type BaseImageOS added in v0.19.0

type BaseImageOS int

BaseImageOS represents the detected operating system of a stage's base image.

const (
	// BaseImageOSUnknown means the OS could not be determined from static analysis.
	BaseImageOSUnknown BaseImageOS = iota
	// BaseImageOSLinux indicates a Linux-based base image.
	BaseImageOSLinux
	// BaseImageOSWindows indicates a Windows-based base image.
	BaseImageOSWindows
)

type BaseImageRef

type BaseImageRef struct {
	// Raw is the original base image string (e.g., "ubuntu:22.04", "builder").
	Raw string

	// IsStageRef is true if this references another stage in the Dockerfile.
	IsStageRef bool

	// StageIndex is the index of the referenced stage, or -1 if external image.
	StageIndex int

	// Platform is the --platform value if specified.
	Platform string

	// Location is the location of the FROM instruction.
	Location []parser.Range
}

BaseImageRef contains information about a stage's base image.

type Builder

type Builder struct {
	// contains filtered or unexported fields
}

Builder constructs a semantic model from a parse result. It performs single-pass analysis and accumulates violations.

func NewBuilder

func NewBuilder(pr *dockerfile.ParseResult, buildArgs map[string]string, file string) *Builder

NewBuilder creates a new semantic model builder.

func (*Builder) Build

func (b *Builder) Build() *Model

Build constructs the semantic model. This performs single-pass analysis of the Dockerfile, detecting construction-time violations (e.g., instruction order issues).

func (*Builder) WithShellDirectives

func (b *Builder) WithShellDirectives(directives []directive.ShellDirective) *Builder

WithShellDirectives sets the shell directives to be applied during build.

type CopyFromRef

type CopyFromRef struct {
	// From is the original --from value.
	From string

	// IsStageRef is true if this references another stage.
	IsStageRef bool

	// StageIndex is the index of the referenced stage, or -1 if not found/external.
	StageIndex int

	// Command is a reference to the COPY instruction.
	Command *instructions.CopyCommand

	// Location is the location of the COPY instruction.
	Location []parser.Range
}

CopyFromRef contains information about a COPY --from reference.

type EnvEntry

type EnvEntry struct {
	// Name is the environment variable name.
	Name string
	// Value is the environment variable value.
	Value string
	// Location is where the ENV was declared.
	Location []parser.Range
}

EnvEntry represents a single ENV declaration.

type FromArgRef

type FromArgRef struct {
	// Name is the referenced variable name without $ or ${}.
	Name string
	// Suggest is an optional suggested variable name.
	Suggest string
}

FromArgRef represents a variable reference (e.g., $FOO) used in a FROM instruction that was not declared in the global ARG scope.

type FromArgsInfo

type FromArgsInfo struct {
	// UndefinedBaseName contains variable references used in the base image name
	// expression (the part after FROM) that are not declared in the global ARG scope.
	UndefinedBaseName []FromArgRef

	// UndefinedPlatform contains variable references used in the --platform expression
	// that are not declared in the global ARG scope.
	UndefinedPlatform []FromArgRef

	// InvalidDefaultBaseName is true when evaluating the base image expression using
	// only default values for global ARGs results in an empty or invalid image name.
	InvalidDefaultBaseName bool
}

FromArgsInfo contains semantic analysis results for the FROM instruction of a stage.

type HeredocShellOverride added in v0.19.0

type HeredocShellOverride struct {
	// Line is the 1-based Dockerfile line of the RUN instruction.
	Line int

	// Shell is the shell name from the shebang (e.g., "bash", "sh", "ksh").
	Shell string

	// Variant is the shell variant derived from Shell.
	Variant shell.Variant
}

HeredocShellOverride records a per-instruction shell override from a BuildKit heredoc shebang line (e.g., #!/bin/bash in a RUN <<EOF body). Docker respects these shebangs and uses the specified interpreter.
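
As an illustration of how a shell name such as "bash" might be derived from a heredoc body, here is a minimal sketch. `shellFromShebang` is a hypothetical helper, not part of this package, and its handling of `env` is an assumption; the package's own detection may differ.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// shellFromShebang is a hypothetical helper: it extracts the interpreter name
// from the first line of a heredoc body, or returns "" if there is no shebang.
func shellFromShebang(body string) string {
	first, _, _ := strings.Cut(body, "\n")
	first = strings.TrimSpace(first)
	if !strings.HasPrefix(first, "#!") {
		return ""
	}
	fields := strings.Fields(strings.TrimPrefix(first, "#!"))
	if len(fields) == 0 {
		return ""
	}
	name := path.Base(fields[0])
	// "#!/usr/bin/env bash" names the interpreter in the next field.
	if name == "env" && len(fields) > 1 {
		name = path.Base(fields[1])
	}
	return name
}

func main() {
	fmt.Println(shellFromShebang("#!/bin/bash\nset -e\necho hi\n")) // prints "bash"
	fmt.Println(shellFromShebang("#!/usr/bin/env ksh\necho hi\n")) // prints "ksh"
}
```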

type Issue

type Issue struct {
	// Location is where the issue occurred (first range).
	Location parser.Range

	// File is the path to the Dockerfile.
	File string

	// Code is the rule code (e.g., "DL3024").
	Code string

	// Message is a human-readable description.
	Message string

	// DocURL links to documentation about this issue.
	DocURL string

	// Severity overrides the default severity for this issue.
	// Zero value (SeverityError) is the default for backward compatibility.
	Severity rules.Severity
}

Issue represents a semantic problem detected during model construction. This is similar to rules.Violation but without the dependency on the rules package to avoid import cycles. The lint.go command converts these to rules.Violation before output.

type Model

type Model struct {
	// contains filtered or unexported fields
}

Model represents the semantic analysis of a Dockerfile. It provides O(1) lookups for stages, variable resolution with proper precedence, and dependency graph analysis for COPY --from.

The model is immutable after construction. All methods are safe for concurrent read access.

func NewModel

func NewModel(pr *dockerfile.ParseResult, buildArgs map[string]string, file string) *Model

NewModel creates a semantic model from a parse result. This is a convenience wrapper around NewBuilder().Build().

func (*Model) ConstructionIssues

func (m *Model) ConstructionIssues() []Issue

ConstructionIssues returns issues detected during model construction. The caller should convert these to rules.Violation for output.

func (*Model) ExternalImageStages

func (m *Model) ExternalImageStages() func(yield func(*StageInfo) bool)

ExternalImageStages returns an iterator over stages that use external images (not "scratch" and not referencing another stage in the Dockerfile). This is useful for rules that need to check image tags/versions.
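
The returned value has the shape of a push iterator. The sketch below shows one way to produce and consume such an iterator; `stage`, `externalStages`, and `collectBases` are hypothetical stand-ins, not the package's types. Calling the iterator directly works on any Go version; on Go 1.23 and later the same value can also be used with `for s := range ...`.

```go
package main

import "fmt"

// stage is a hypothetical stand-in for *semantic.StageInfo.
type stage struct {
	Base       string
	IsStageRef bool
}

// externalStages mimics the documented iterator shape: it yields only stages
// whose base is neither "scratch" nor another stage in the Dockerfile.
func externalStages(all []*stage) func(yield func(*stage) bool) {
	return func(yield func(*stage) bool) {
		for _, s := range all {
			if s.Base == "scratch" || s.IsStageRef {
				continue
			}
			if !yield(s) {
				return // the consumer asked to stop early
			}
		}
	}
}

// collectBases drains the iterator by calling it with a yield function.
func collectBases(seq func(yield func(*stage) bool)) []string {
	var out []string
	seq(func(s *stage) bool {
		out = append(out, s.Base)
		return true
	})
	return out
}

func main() {
	stages := []*stage{
		{Base: "golang:1.22"},
		{Base: "builder", IsStageRef: true},
		{Base: "scratch"},
		{Base: "alpine:3.20"},
	}
	fmt.Println(collectBases(externalStages(stages))) // prints "[golang:1.22 alpine:3.20]"
}
```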

func (*Model) FromDescendants added in v0.21.2

func (m *Model) FromDescendants(stageIdx int, skip func(childIdx int) bool) []int

FromDescendants returns all stage indices that transitively inherit from stageIdx via FROM <stage> references.

If skip is non-nil it is called for each direct child; when it returns true the child (and its entire subtree) is excluded from the result.
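
The traversal can be pictured as a depth-first walk over a parent-to-children map. The sketch below is an assumption-laden illustration: `fromDescendants` and the `children` map are hypothetical, the skip callback is applied to every child encountered during the walk, and the result order of the real method may differ.

```go
package main

import "fmt"

// fromDescendants is a hypothetical sketch. children maps a stage index to the
// stages whose FROM names it. skip, if non-nil, prunes a child and its subtree.
func fromDescendants(children map[int][]int, root int, skip func(int) bool) []int {
	var out []int
	seen := map[int]bool{root: true}
	var walk func(parent int)
	walk = func(parent int) {
		for _, c := range children[parent] {
			if seen[c] || (skip != nil && skip(c)) {
				continue // pruned: neither c nor its descendants are visited
			}
			seen[c] = true
			out = append(out, c)
			walk(c)
		}
	}
	walk(root)
	return out
}

func main() {
	// Stage 0 is the base of 1 and 2; stage 1 is the base of 3; stage 2 of 4.
	children := map[int][]int{0: {1, 2}, 1: {3}, 2: {4}}
	fmt.Println(fromDescendants(children, 0, nil)) // prints "[1 3 2 4]"
	fmt.Println(fromDescendants(children, 0, func(c int) bool { return c == 1 })) // prints "[2 4]"
}
```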

func (*Model) Graph

func (m *Model) Graph() *StageGraph

Graph returns the stage dependency graph.

func (*Model) MetaArgs

func (m *Model) MetaArgs() []instructions.ArgCommand

MetaArgs returns the global ARG instructions before the first FROM.

func (*Model) OnbuildInstructions

func (m *Model) OnbuildInstructions(stageIdx int) []OnbuildInstruction

OnbuildInstructions returns parsed ONBUILD commands for the given stage. Returns nil if the index is out of bounds or the stage has no ONBUILD instructions.

func (*Model) RecheckUndefinedVars

func (m *Model) RecheckUndefinedVars(stageIdx int, resolvedEnv map[string]string) []StageUndefinedVars

RecheckUndefinedVars re-runs the undefined-var analysis for the specified stage and all stages that transitively inherit from it (via FROM <stage>). It uses the provided base image environment instead of the static approximation. This is used by the async pipeline when base image env has been resolved from the registry.

func (*Model) ResolveVariable

func (m *Model) ResolveVariable(stageIndex int, name string) (string, bool)

ResolveVariable resolves a variable name in the context of a stage. Resolution precedence: BuildArgs > Stage ENV > Stage ARG > Global ARG. Returns the value and true if found, or empty string and false if not.

func (*Model) Stage

func (m *Model) Stage(index int) *instructions.Stage

Stage returns the stage at the given index (0-based). Returns nil if the index is out of bounds.

func (*Model) StageByName

func (m *Model) StageByName(name string) *instructions.Stage

StageByName returns the stage with the given name. Returns nil if no stage with that name exists. Stage names are case-insensitive per Docker semantics.

func (*Model) StageCount

func (m *Model) StageCount() int

StageCount returns the number of stages in the Dockerfile.

func (*Model) StageIndexByName

func (m *Model) StageIndexByName(name string) (int, bool)

StageIndexByName returns the index of the stage with the given name. Returns -1 and false if no stage with that name exists.
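
Case-insensitive lookup with a (-1, false) miss can be sketched with a map keyed by lowercased names. `nameIndex`, `newNameIndex`, and `lookup` are hypothetical names used for illustration; the package's internal index may be built differently.

```go
package main

import (
	"fmt"
	"strings"
)

// nameIndex is a hypothetical lookup table from lowercased stage name to index.
type nameIndex map[string]int

// newNameIndex builds the table; unnamed stages (empty name) are not indexed.
func newNameIndex(names []string) nameIndex {
	idx := make(nameIndex, len(names))
	for i, n := range names {
		if n == "" {
			continue
		}
		idx[strings.ToLower(n)] = i
	}
	return idx
}

// lookup mirrors the documented contract: (-1, false) when the name is unknown.
func (x nameIndex) lookup(name string) (int, bool) {
	i, ok := x[strings.ToLower(name)]
	if !ok {
		return -1, false
	}
	return i, true
}

func main() {
	idx := newNameIndex([]string{"Builder", "", "runtime"})
	fmt.Println(idx.lookup("BUILDER")) // prints "0 true"
	fmt.Println(idx.lookup("missing")) // prints "-1 false"
}
```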

func (*Model) StageInfo

func (m *Model) StageInfo(index int) *StageInfo

StageInfo returns enhanced information for the stage at the given index. Returns nil if the index is out of bounds.

func (*Model) Stages

func (m *Model) Stages() []instructions.Stage

Stages returns all stages. The returned slice is a read-only reference; callers must not modify it.

type OnbuildInstruction

type OnbuildInstruction struct {
	// Command is the parsed typed command (RunCommand, CopyCommand, etc.).
	Command instructions.Command

	// SourceLine is the original 1-based line number of the ONBUILD instruction
	// in the Dockerfile.
	SourceLine int
}

OnbuildInstruction represents a parsed ONBUILD trigger command.

type PackageInstall

type PackageInstall struct {
	// Manager is the package manager used.
	Manager shell.PackageManager

	// Packages is the list of packages being installed.
	Packages []string

	// Line is the 1-based line number of the RUN instruction.
	Line int
}

PackageInstall represents a package installation in a RUN command.

type ShellSetting

type ShellSetting struct {
	// Shell is the shell command array used to execute RUN instructions
	// (Docker semantics), e.g., ["/bin/bash", "-c"].
	Shell []string

	// Variant is the shell variant used for lint parsing (may be influenced
	// by inline directives like "# hadolint shell=bash").
	Variant shell.Variant

	// Source indicates where Variant came from.
	Source ShellSource

	// Line is the 0-based line number where the shell was set (for directives/instructions).
	// -1 for default shell.
	Line int
}

ShellSetting represents the active shell configuration for a stage.

type ShellSource

type ShellSource int

ShellSource indicates where the shell setting came from.

const (
	// ShellSourceDefault indicates the default shell is being used.
	ShellSourceDefault ShellSource = iota
	// ShellSourceInstruction indicates the shell was set via SHELL instruction.
	ShellSourceInstruction
	// ShellSourceDirective indicates the shell was set via a comment directive.
	ShellSourceDirective
)

type StageGraph

type StageGraph struct {
	// contains filtered or unexported fields
}

StageGraph represents the dependency graph between stages. It tracks cross-stage relationships (COPY --from and FROM stage refs) to enable reachability analysis.

func (*StageGraph) DependsOn

func (g *StageGraph) DependsOn(stageA, stageB int) bool

DependsOn returns true if stageA depends on stageB (directly or transitively). A stage depends on another if it copies from it (COPY --from) or uses it as a base image (FROM <stage>).

func (*StageGraph) DetectCycles added in v0.14.0

func (g *StageGraph) DetectCycles() [][]int

DetectCycles finds cycles in the stage dependency graph. It returns at least one cycle per strongly connected component; overlapping or shorter cycles within the same SCC may not all be reported. Each cycle is a slice of stage indices forming a directed cycle (e.g., [0, 2, 1] means stage 0 → stage 2 → stage 1 → stage 0). Returns nil if the graph is acyclic.

The algorithm uses DFS with 3-color marking. Cycles are returned in canonical form: rotated so the smallest index is first, then deduplicated.
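
A minimal sketch of 3-color DFS cycle detection with canonical rotation and deduplication follows. It is an illustration, not the package's code: `detectCycles` and `canonical` are hypothetical, the graph is given as an adjacency list in which `deps[i]` lists the stages that stage i points to, and like the documented method it does not enumerate every cycle in a strongly connected component.

```go
package main

import "fmt"

const (
	white = iota // not yet visited
	gray         // on the current DFS path
	black        // fully explored
)

// canonical returns a rotated copy of c with the smallest index first.
func canonical(c []int) []int {
	lo := 0
	for i, v := range c {
		if v < c[lo] {
			lo = i
		}
	}
	out := make([]int, 0, len(c))
	out = append(out, c[lo:]...)
	return append(out, c[:lo]...)
}

// detectCycles is a hypothetical sketch; it returns nil for an acyclic graph.
func detectCycles(deps [][]int) [][]int {
	color := make([]int, len(deps))
	seen := map[string]bool{}
	var path []int
	var cycles [][]int
	var visit func(n int)
	visit = func(n int) {
		color[n] = gray
		path = append(path, n)
		for _, m := range deps[n] {
			switch color[m] {
			case white:
				visit(m)
			case gray:
				// Back edge: the cycle is the path suffix that starts at m.
				start := 0
				for path[start] != m {
					start++
				}
				c := canonical(path[start:])
				if key := fmt.Sprint(c); !seen[key] {
					seen[key] = true
					cycles = append(cycles, c)
				}
			}
		}
		path = path[:len(path)-1]
		color[n] = black
	}
	for n := range deps {
		if color[n] == white {
			visit(n)
		}
	}
	return cycles
}

func main() {
	// stage 0 → stage 2 → stage 1 → stage 0
	fmt.Println(detectCycles([][]int{{2}, {0}, {1}})) // prints "[[0 2 1]]"
}
```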

func (*StageGraph) DirectDependencies

func (g *StageGraph) DirectDependencies(stageIndex int) []int

DirectDependencies returns the stages that stageIndex directly depends on (via COPY --from or FROM stage refs).

func (*StageGraph) DirectDependents

func (g *StageGraph) DirectDependents(stageIndex int) []int

DirectDependents returns the stages that directly depend on stageIndex.

func (*StageGraph) ExternalRefs

func (g *StageGraph) ExternalRefs(stageIndex int) []string

ExternalRefs returns the external image references in stageIndex.

func (*StageGraph) IsReachable

func (g *StageGraph) IsReachable(stageIndex, finalStageIndex int) bool

IsReachable returns true if stageIndex is reachable from finalStageIndex. A stage is reachable if:

  1. It is the final stage itself
  2. The final stage (or any reachable stage) depends on it
  3. It is a base image for a reachable stage

func (*StageGraph) StageCount

func (g *StageGraph) StageCount() int

StageCount returns the total number of stages.

func (*StageGraph) UnreachableStages

func (g *StageGraph) UnreachableStages() []int

UnreachableStages returns indices of stages that are not reachable from the final stage. These are stages that don't contribute to the final image.
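
Reachability from the final stage can be sketched as a graph walk over dependency edges. In this hypothetical illustration, `unreachable` takes an adjacency list where `deps[i]` holds the stages that stage i depends on (through COPY --from or FROM <stage>) and treats the last stage as the final one; the package's internal representation is not shown in this documentation.

```go
package main

import "fmt"

// unreachable is a hypothetical sketch: it returns the indices of stages that
// the final stage does not depend on, directly or transitively.
func unreachable(deps [][]int) []int {
	if len(deps) == 0 {
		return nil
	}
	final := len(deps) - 1
	reach := make([]bool, len(deps))
	reach[final] = true
	stack := []int{final}
	for len(stack) > 0 {
		n := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		for _, d := range deps[n] {
			if !reach[d] {
				reach[d] = true
				stack = append(stack, d)
			}
		}
	}
	var out []int
	for i, ok := range reach {
		if !ok {
			out = append(out, i)
		}
	}
	return out
}

func main() {
	// Stage 3 (final) depends on 1, which depends on 0. Stage 2 is unused.
	deps := [][]int{{}, {0}, {}, {1}}
	fmt.Println(unreachable(deps)) // prints "[2]"
}
```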

type StageInfo

type StageInfo struct {
	// Index is the 0-based stage index.
	Index int

	// Stage is a reference to the BuildKit stage.
	Stage *instructions.Stage

	// BaseImageOS is the detected operating system of the base image.
	// Determined by heuristics (image name, platform, escape directive, SHELL instruction).
	BaseImageOS BaseImageOS

	// ShellSetting contains the active shell configuration including variant and source.
	ShellSetting ShellSetting

	// BaseImage contains information about the FROM image reference.
	BaseImage *BaseImageRef

	// FromArgs contains semantic analysis results for the stage's FROM instruction
	// (ARG usage in base name and platform, and default validity checks).
	FromArgs FromArgsInfo

	// Variables contains the variable scope for this stage.
	Variables *VariableScope

	// EffectiveEnv is the approximate effective environment for this stage after
	// evaluating ARG and ENV instructions (matching BuildKit's word expansion
	// environment semantics for linting).
	//
	// It is used for UndefinedVar analysis and for inheriting environment keys
	// when another stage uses this stage as its base.
	EffectiveEnv map[string]string

	// UndefinedVars contains variable references (e.g., $FOO) used in stage
	// commands that are not defined at the point of use.
	UndefinedVars []UndefinedVarRef

	// CopyFromRefs contains all COPY --from references in this stage.
	CopyFromRefs []CopyFromRef

	// OnbuildCopyFromRefs contains COPY --from references in ONBUILD instructions.
	// These are triggered when the image is used as a base for another build.
	OnbuildCopyFromRefs []CopyFromRef

	// OnbuildInstructions contains all parsed ONBUILD trigger commands for this stage.
	// Each ONBUILD expression is parsed into a typed command using BuildKit's parser.
	OnbuildInstructions []OnbuildInstruction

	// HeredocShellOverrides contains per-instruction shell overrides detected
	// from heredoc shebang lines. Rules can use this to determine the effective
	// shell for a specific RUN instruction instead of the stage-level shell.
	HeredocShellOverrides []HeredocShellOverride

	// InstalledPackages contains packages installed via system package managers.
	// Tracked from RUN commands that use apt-get, apk, yum, dnf, etc.
	InstalledPackages []PackageInstall

	// IsLastStage is true if this is the final stage in the Dockerfile.
	IsLastStage bool
}

StageInfo contains enhanced information about a build stage. It augments BuildKit's instructions.Stage with semantic analysis data.

func (*StageInfo) HasPackage

func (s *StageInfo) HasPackage(pkg string) bool

HasPackage checks if a package was installed in this stage.

func (*StageInfo) IsExternalImage

func (s *StageInfo) IsExternalImage() bool

IsExternalImage returns true if this stage's base image is an external image (not "scratch" and not a reference to another stage in the Dockerfile). This is useful for rules that need to check image tags/versions.

func (*StageInfo) IsLinux added in v0.19.0

func (s *StageInfo) IsLinux() bool

IsLinux returns true if the base image was detected as Linux.

func (*StageInfo) IsScratch added in v0.14.0

func (s *StageInfo) IsScratch() bool

IsScratch returns true if this stage uses FROM scratch as its base image.

func (*StageInfo) IsWindows added in v0.19.0

func (s *StageInfo) IsWindows() bool

IsWindows returns true if the base image was detected as Windows.

func (*StageInfo) PackageManagers

func (s *StageInfo) PackageManagers() []shell.PackageManager

PackageManagers returns the set of package managers used in this stage.

type StageUndefinedVars

type StageUndefinedVars struct {
	StageIdx int
	Undefs   []UndefinedVarRef
}

StageUndefinedVars groups undefined variable results by stage index.

type UndefinedVarRef

type UndefinedVarRef struct {
	// Name is the referenced variable name without $ or ${}.
	Name string
	// Suggest is an optional suggested variable name.
	Suggest string
	// Location is where the undefined variable was used.
	Location []parser.Range
}

UndefinedVarRef represents a variable reference (e.g., $FOO) used in a stage command that was not defined at the point of use.

type VariableScope

type VariableScope struct {
	// contains filtered or unexported fields
}

VariableScope manages ARG and ENV variable resolution for a stage. It implements Docker's variable precedence rules.

func NewGlobalScope

func NewGlobalScope() *VariableScope

NewGlobalScope creates a new global variable scope.

func NewStageScope

func NewStageScope(parent *VariableScope) *VariableScope

NewStageScope creates a new stage scope with the given parent (global scope).

func (*VariableScope) AddArg

func (s *VariableScope) AddArg(name string, value *string, location []parser.Range)

AddArg adds an ARG declaration to the scope.

func (*VariableScope) AddArgCommand

func (s *VariableScope) AddArgCommand(cmd *instructions.ArgCommand)

AddArgCommand adds all arguments from an ARG instruction.

func (*VariableScope) AddEnv

func (s *VariableScope) AddEnv(name, value string, location []parser.Range)

AddEnv adds an ENV declaration to the scope.

func (*VariableScope) AddEnvCommand

func (s *VariableScope) AddEnvCommand(cmd *instructions.EnvCommand)

AddEnvCommand adds all variables from an ENV instruction.

func (*VariableScope) Args

func (s *VariableScope) Args() []*ArgEntry

Args returns all ARG entries in declaration order.

func (*VariableScope) Envs

func (s *VariableScope) Envs() []*EnvEntry

Envs returns all ENV entries in declaration order.

func (*VariableScope) GetArg

func (s *VariableScope) GetArg(name string) *ArgEntry

GetArg returns the ARG entry for the given name, searching up the scope chain. Like HasArg, this checks existence across the entire scope chain, not resolvability; use Resolve to check whether a variable is actually accessible in this stage's context.

func (*VariableScope) GetEnv

func (s *VariableScope) GetEnv(name string) *EnvEntry

GetEnv returns the ENV entry for the given name, or nil if not found.

func (*VariableScope) HasArg

func (s *VariableScope) HasArg(name string) bool

HasArg returns true if the variable is declared as an ARG anywhere in the scope chain (this scope or any parent). This checks existence across the entire scope chain, not resolvability: a global ARG will return true even if the stage hasn't redeclared it.

To check if a variable is actually resolvable in this stage's context, use Resolve(name, nil) and check the boolean result.

func (*VariableScope) Parent

func (s *VariableScope) Parent() *VariableScope

Parent returns the parent scope (nil for global scope).

func (*VariableScope) Resolve

func (s *VariableScope) Resolve(name string, buildArgs map[string]string) (string, bool)

Resolve looks up a variable by name using Docker's precedence rules. Precedence (highest first):

  1. Stage ENV (environment variables always take precedence)
  2. Stage ARG with build-arg override
  3. Stage ARG with default value (inheriting from global if no local default)

Note: Global ARGs are ONLY visible in a stage if the stage explicitly declares them with `ARG NAME`. A global `ARG FOO=1` is NOT automatically available in stage instructions until the stage redeclares it.

Returns the value and true if found, or empty string and false if not.
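
The precedence rules above can be sketched with plain maps. This is a hypothetical model, not the VariableScope implementation: `scope`, `resolve`, and `strPtr` are illustrative names, and returning ("", false) for an ARG that is declared but has no value anywhere is an assumption.

```go
package main

import "fmt"

// scope is a hypothetical model of a stage scope and its global parent.
// A nil *string means the ARG was declared without a default.
type scope struct {
	env    map[string]string  // stage ENV values
	args   map[string]*string // ARGs declared in the stage
	global map[string]*string // ARGs declared before the first FROM
}

func strPtr(s string) *string { return &s }

// resolve follows the documented order: stage ENV, then a stage ARG with a
// build-arg override, then the stage ARG default (inherited from the global
// ARG when the stage gives none).
func (s *scope) resolve(name string, buildArgs map[string]string) (string, bool) {
	if v, ok := s.env[name]; ok {
		return v, true
	}
	def, declared := s.args[name]
	if !declared {
		return "", false // global ARGs are invisible until redeclared
	}
	if v, ok := buildArgs[name]; ok {
		return v, true
	}
	if def != nil {
		return *def, true
	}
	if g := s.global[name]; g != nil {
		return *g, true
	}
	return "", false // assumption: declared but no value anywhere
}

func main() {
	s := &scope{
		env:    map[string]string{},
		args:   map[string]*string{},
		global: map[string]*string{"FOO": strPtr("1")},
	}
	fmt.Println(s.resolve("FOO", nil)) // global only: prints " false"
	s.args["FOO"] = nil                // the stage redeclares ARG FOO
	fmt.Println(s.resolve("FOO", nil)) // prints "1 true"
	fmt.Println(s.resolve("FOO", map[string]string{"FOO": "2"})) // prints "2 true"
}
```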
