Documentation
¶
Overview ¶
Package schema models the document-structure schemas that drive MDS020 (required-structure). A schema describes what a Markdown document's front matter, filename, and heading tree must look like.
Two sources feed the same in-memory representation:
- Inline. A YAML schema block under kinds.<name>.schema: in .mdsmith.yml.
- File. A proto.md file referenced by rules.required-structure.schema: (the legacy heading-template form).
Both parse into a Schema whose Sections is a recursive tree of Scope nodes. The validator walks a document's AST against that tree, emitting diagnostics through the lint.Diagnostic shape.
See plan/146_inline-schema-in-kinds.md for the design context. Plan 156 collapses each section entry to one discriminator (`heading:` — null, string, or mapping) with a single matcher (`regex:`) and a cardinality field (`repeat: {min, max}`); see docs/reference/section-schema.md for the full grammar.
Index ¶
- Constants
- func BuildIndex(f *lint.File, sch *Schema) ([]byte, error)
- func EmbeddedTypesCUE() string
- func FormatSchemaRef(sch *Schema, key string) string
- func HeadingStem(sc *Scope) (stem string, fmvars []string, hasDigits bool)
- func LookupShortcut(name string) (string, bool)
- func MatchesHeading(sc Scope, dh DocHeading, fm map[string]any) bool
- func NonBodyDiagLine(f *lint.File) int
- func RenderExpected(expr string) string
- func RenderHint(expr string, actual any) string
- func ScopeRunIndices(scopes []Scope, currentIdx int, heads []DocHeading, ...) []int
- func ShortcutNames() []string
- func Validate(f *lint.File, sch *Schema, docFM map[string]any, fmIsCUE bool, mkDiag MakeDiag) []lint.Diagnostic
- func ValidateAcronyms(f *lint.File, sch *Schema, docFM map[string]any, mkDiag MakeDiag) []lint.Diagnostic
- func ValidateContent(f *lint.File, sch *Schema, docFM map[string]any, mkDiag MakeDiag) []lint.Diagnostic
- func ValidateCrossReferences(f *lint.File, sch *Schema, mkDiag MakeDiag) []lint.Diagnostic
- func ValidateFrontmatter(sch *Schema, fm map[string]any) error
- func ValidateFrontmatterDiags(f *lint.File, sch *Schema, docFM map[string]any, mkDiag MakeDiag) []lint.Diagnostic
- func ValidateFrontmatterSyntax(sch *Schema) error
- func ValidateIndex(f *lint.File, sch *Schema, mkDiag MakeDiag) []lint.Diagnostic
- func WriteIndex(f *lint.File, sch *Schema) error
- type AcronymRule
- type ContentEntry
- type ContentMatch
- type CrossRef
- type DocHeading
- type FileHeading
- type FileReader
- type IndexHeading
- type IndexSpec
- type MakeDiag
- type MatchTree
- type Matcher
- type Repeat
- type Schema
- type SchemaDiagnostic
- type Scope
- type ScopeMatch
Constants ¶
const ( IndexIncludeStepMap = "step-map" IndexIncludeCrossRefs = "cross-ref-graph" IndexIncludeWordCounts = "word-counts" IndexIncludeHeadingsFlat = "headings" )
Valid include keys for IndexSpec.Include.
const ( ContentKindCodeBlock = "code-block" ContentKindTable = "table" ContentKindList = "list" ContentKindParagraph = "paragraph" ContentKindUnlisted = "unlisted" )
Content-entry kind discriminators. The on-disk YAML carries one of these strings under `kind:`; the validator dispatches on the same constant.
const SectionWildcard = "..."
SectionWildcard is the literal text the file-based parser recognises in a proto.md heading row (`## ...`) as a positional slot — a heading run that matches any text zero or more times. The inline parser rejects this string when it appears as `heading:` text; authors must use the mapping form (`heading: { regex: '.+', repeat: { min: 0 } }`).
Variables ¶
This section is empty.
Functions ¶
func BuildIndex ¶
BuildIndex computes the JSON index document the IndexSpec asks for and returns its serialised bytes. The returned bytes are pretty-printed with two-space indentation so the file is reviewable when diffed. Errors from sub-builders (currently only `cross-ref-graph` can fail, on a bad regex) propagate so ValidateIndex / Fix surface them as diagnostics instead of silently shipping a partial index.
func EmbeddedTypesCUE ¶
func EmbeddedTypesCUE() string
EmbeddedTypesCUE returns the source of the shortcut library shipped at `cue/types/types.cue`. The runtime registry below is the actual lookup table; this accessor exposes the documented source for tooling (e.g. an `mdsmith help schema-types` command) and for the drift test that pins registry entries to the file contents.
func FormatSchemaRef ¶ added in v0.17.0
FormatSchemaRef builds the "source:line" suffix used by every SchemaDiagnostic so callers outside the schema package emit the same shape. An unknown source falls back to "schema".
func HeadingStem ¶ added in v0.18.0
HeadingStem splits a scope's matcher regex into its literal stem (the heading text with `\#(digits)` / `\#(fmvar(name))` interpolations removed and regexp escaping reversed) plus the `fmvar` field names and whether a `digits` capture is present. It works uniformly for proto-sugar scopes and hand-written inline `regex:` bodies, so the projector's key seam does not need to know which parser produced the scope.
func LookupShortcut ¶
LookupShortcut returns the canonical CUE expression registered under name, or (` `, false) when name is not in the registry. Callers should use resolveBareName for end-to-end shortcut handling; this helper exists for tests and for tools that want to introspect a single entry.
func MatchesHeading ¶
func MatchesHeading(sc Scope, dh DocHeading, fm map[string]any) bool
MatchesHeading reports whether sc matches dh's heading text. Exported so callers outside the validator (notably the per-scope rule walker in internal/rules/requiredstructure) reuse the same matching semantics. fm is the document's parsed front matter and must be supplied so `\#(fmvar(...))` patterns resolve correctly; pass nil only when the schema is known to use literal-only matchers.
func NonBodyDiagLine ¶ added in v0.17.0
NonBodyDiagLine returns the body-coord line value that, after lint.File.AdjustDiagnostics adds f.LineOffset, lands on the absolute first line of the file (typically the opening `---` fence of stripped front matter). It is the canonical anchor for diagnostics that do not correspond to a specific body line — schema-level compile failures, filename pattern violations, and structure diagnostics for sections that are missing entirely.
The previous "anchor at line 1" pattern landed on the first body line in front-matter-stripped mode, which engine.filterGeneratedDiags could mistakenly drop if the document body started with a generated section (e.g. a leading <?catalog?> directive). Using `1 - LineOffset` produces a non-positive body-coord that filterGeneratedDiags cannot match against any generated line range, and the engine's AdjustDiagnostics adds the offset back so the surfaced diagnostic still anchors at the file's first line.
When f.LineOffset == 0 (the non-stripped path, used by tests via lint.NewFile) the return value is just 1, matching the previous behaviour.
func RenderExpected ¶ added in v0.17.0
RenderExpected converts a raw CUE constraint expression into a user-facing "expected" string. It recognises the common shapes listed in plan 147 — string disjunctions, regex matchers, numeric ranges, non-empty strings, and bool — and falls back to the verbatim expression when nothing matches. The fallback is deliberately the raw CUE so the user can still copy/paste into the schema; an over-eager translation that drops type information would be worse than the literal expression.
func RenderHint ¶ added in v0.17.0
RenderHint returns a best-effort suggestion for fixing the violation. It only fires on a small set of shapes:
- String disjunctions: when the actual value is within Levenshtein distance 2 of one of the literal alternatives, suggest that literal.
- Integer ranges: when the actual value is just outside the declared bounds, suggest the nearest bound.
All other shapes return the empty string. A noisy hint (for instance, a regex pattern naïvely echoed at the user) is worse than no hint, so the extractor errs on the side of silence.
actual is the raw front-matter value as decoded from JSON; numbers arrive as float64 from json.Unmarshal.
func ScopeRunIndices ¶ added in v0.17.0
func ScopeRunIndices( scopes []Scope, currentIdx int, heads []DocHeading, expectedLevel, parentStart, parentEnd int, claimed map[int]bool, docFM map[string]any, ) []int
ScopeRunIndices returns the doc-heading indices that scopes[currentIdx] would claim inside the [parentStart, parentEnd) window, following the same contiguous-run semantics matchScope uses: scan forward from the first match for additional same-level matches, but stop at the first same-level heading that does not match (deeper headings inside the matched section are skipped silently).
The helper also mirrors matchScope's yield rules so per-scope walkers stay in step with structural validation:
- A broad matcher (`regex: '.+'`) yields to any later named scope whose matcher would claim the heading.
- A non-broad matcher yields to a later named scope only after its `repeat.min` has been satisfied.
When no in-level match is found the helper falls back to the first wrong-level match in the window. Used by the per-scope walkers (acronyms, content, rules) so they only visit occurrences the structural validator would also have claimed as part of the same run.
func ShortcutNames ¶
func ShortcutNames() []string
ShortcutNames returns the sorted list of registered shortcut names. Used by error messages, the docs page, and the drift test.
func Validate ¶
func Validate( f *lint.File, sch *Schema, docFM map[string]any, fmIsCUE bool, mkDiag MakeDiag, ) []lint.Diagnostic
Validate walks the document AST against sch, emitting diagnostics for missing/extra/out-of-order sections, level mismatches, frontmatter that fails the schema's CUE constraints, and filename patterns. mkDiag builds the diagnostic with the caller's rule ID.
docFM is the document's parsed front matter (nil when absent). When fmIsCUE is true, the front-matter values are themselves CUE expressions (the `cue-frontmatter` placeholder); the CUE check is skipped because the values are not concrete data.
func ValidateAcronyms ¶
func ValidateAcronyms( f *lint.File, sch *Schema, docFM map[string]any, mkDiag MakeDiag, ) []lint.Diagnostic
ValidateAcronyms flags all-caps tokens (length 2-6) that appear for the first time inside a configured scope without a parenthesised expansion. KnownSafe tokens are exempt; a missing scope list applies the check document-wide.
First-use state is per-scope. An acronym defined inside "Check" must be re-introduced when it first appears inside "Expected" — the rule treats the two scope passes independently so each named scope reads as self-contained.
Heading lines are excluded from the scan in both modes: prose rules apply to body text, and a section heading like "## OIDC configuration" should not be flagged as the first use of OIDC when the body that follows immediately spells out the expansion.
func ValidateContent ¶ added in v0.16.0
func ValidateContent( f *lint.File, sch *Schema, docFM map[string]any, mkDiag MakeDiag, ) []lint.Diagnostic
ValidateContent runs the content-entry walker for every scope in sch that declares `content:`. A content entry constrains the shape of an AST node (code block, table, list, paragraph) inside the matched section's body. Diagnostics surface alongside the heading-tree results emitted by Validate.
docFM is the document's parsed front matter; it threads through to the per-scope walker so `\#(fmvar(...))` matchers resolve when pairing a scope with its heading.
The function does its own document parse with the table extension enabled — lint.NewFile's parser is CommonMark-only, so GFM tables would otherwise appear as paragraphs. The parse is skipped when the schema declares no content entries anywhere.
func ValidateCrossReferences ¶
ValidateCrossReferences walks the document's inline text nodes and, for each cross-reference pattern, checks that every match resolves to an existing heading slug. Unresolved references produce a diagnostic at the source position of the match. Lines whose raw text matches SkipLinesMatching are exempt.
func ValidateFrontmatter ¶
ValidateFrontmatter compiles sch.Frontmatter into a CUE schema and unifies it with fm (the document's parsed front matter).
func ValidateFrontmatterDiags ¶ added in v0.17.0
func ValidateFrontmatterDiags( f *lint.File, sch *Schema, docFM map[string]any, mkDiag MakeDiag, ) []lint.Diagnostic
ValidateFrontmatterDiags exposes the per-error CUE-diagnostic walker to callers outside the validator (notably the requiredstructure rule's legacy file-schema path, which has its own heading-template parser but reuses the schema package's actionable front-matter diagnostics).
Line numbers on returned diagnostics are in the engine's body-coordinate system: they are the absolute file line of the offending front-matter key minus f.LineOffset. In front-matter-stripped mode (f.LineOffset > 0) the body- coordinate value is non-positive for keys inside the FM block; lint.File.AdjustDiagnostics then shifts it back into the absolute file line. Callers that bypass the engine — for instance unit tests inspecting the raw slice — must either run the result through f.AdjustDiagnostics or normalise the values themselves before treating them as 1-based positions.
func ValidateFrontmatterSyntax ¶
ValidateFrontmatterSyntax checks that the schema's frontmatter constraints compile as CUE. Returns nil if there are no constraints.
func ValidateIndex ¶
ValidateIndex compares the on-disk index file (if any) against the bytes BuildIndex would emit. When the file is missing or its content differs, a diagnostic asks the user to run `mdsmith fix` so the artefact stays in sync. Comparison normalises CRLF line endings to LF so a Windows checkout with `core.autocrlf=true` does not flag a semantically-identical file as stale. Read errors other than "file does not exist" surface as a distinct diagnostic. If the last Fix tried to write this index and failed, the cached I/O error is reported in place of the generic "missing / out of date" message so users can act on the real cause instead of running fix again. `mdsmith check` still respects the read-only contract: it never touches the file.
func WriteIndex ¶
WriteIndex writes the JSON index produced by BuildIndex next to the source file. Output paths are resolved relative to the source file's directory; absolute paths (including Windows drive-letter and leading-backslash forms), parent-traversal segments, and symlinks that escape the allowed root are rejected so a schema cannot trick fix into writing outside the project. Parent directories are created on demand so a nested `output:` path (e.g. `.mdsmith/index/runbook.json`) works on a clean checkout.
The allowed root is f.RootDir when set (the project root), and the source file's directory otherwise. After mkdir we EvalSymlinks the parent directory and verify it still resolves inside that root, so a `sub` directory that turns out to be a symlink to `/etc` is caught before any bytes are written.
The target file itself is also Lstat-checked: if an existing symlink sits at the index path (an in-root symlink that points outside the project — e.g. `.runbook-index.json` → `/etc/passwd`), os.WriteFile would follow it and clobber the link target. We reject the write instead. The write goes through a sibling temp file + os.Rename so the directory entry is replaced atomically and never as a symlink-follow operation.
On error WriteIndex records the failure in the package-level indexWriteErr cache keyed by f.Path so the next Check surfaces the underlying I/O error instead of repeating the generic "missing / out of date" message — otherwise a misconfiguration (e.g. `output: "."` resolving to a directory) would trap users in a fix loop with no signal about what is actually wrong. A successful write clears the entry.
Types ¶
type AcronymRule ¶
AcronymRule configures first-use acronym detection. KnownSafe is the allowlist of tokens that may appear without expansion. Scope, if non-empty, restricts the check to text inside sections whose heading text matches one of the listed names; empty Scope applies the check document-wide.
type ContentEntry ¶ added in v0.16.0
type ContentEntry struct {
// Kind names the AST shape this entry matches. One of
// `ContentKind*`. Reject unknown values at parse time.
Kind string
// Required reports whether a matching node must appear at this
// position. Defaults to true for literal entries; the parser
// rejects `required:` on `kind: unlisted` outright.
Required bool
// Lang constrains a `code-block` entry's info string. Empty means
// any language is accepted. Today the match is exact equality; a
// future plan can extend the field with a regex form.
Lang string
// Columns constrains a `table` entry's header row. Empty means any
// header is accepted. When set, the doc table's header row must
// equal Columns element-wise.
Columns []string
// Ordered, OrderedSet constrain a `list` entry's bullet style.
// OrderedSet reports whether the schema author set `ordered:`;
// when false, any bullet style passes.
Ordered bool
OrderedSet bool
// MinItems and MaxItems bound the count of items in a `list`
// entry's match. Zero means unbounded.
MinItems int
MaxItems int
}
ContentEntry describes one positional non-heading AST node that must appear inside a section's body. Each entry has a discriminator (Kind) and a small set of kind-specific constraint fields. Fields not relevant to the entry's kind are zero-valued.
type ContentMatch ¶ added in v0.18.0
type ContentMatch struct {
Entry *ContentEntry
Node ast.Node
Line int
}
ContentMatch pairs a schema ContentEntry with the AST node that satisfied it and that node's 1-based source line.
type CrossRef ¶
CrossRef declares one text-pattern → slug-template binding. The validator searches text nodes for Pattern; for each match it fills the captured groups (numeric `{n}` or named `{slug}`) into MustMatch, slugifies the result, and looks it up in the document's heading slug set. Lines whose raw text matches SkipLinesMatching are exempt — typically blockquoted stale text.
type DocHeading ¶
DocHeading is a heading collected from the document under validation.
func ExtractDocHeadings ¶
func ExtractDocHeadings(f *lint.File) []DocHeading
ExtractDocHeadings walks the document AST and collects every heading in source order, with its source line.
type FileHeading ¶
FileHeading is a heading collected from a schema markdown file.
type FileReader ¶
FileReader resolves a schema path or include reference to the underlying bytes. RootFS, when set, is used to constrain reads to the project root; otherwise reads go through os.ReadFile.
type IndexHeading ¶
type IndexHeading struct {
Level int `json:"level"`
Text string `json:"text"`
Slug string `json:"slug"`
Line int `json:"line"`
}
IndexHeading is one entry in the flat heading list emitted by the "headings" include.
type IndexSpec ¶
IndexSpec configures the index side-output. Output is the path (relative to the source file's directory) where `mdsmith fix` writes the JSON document. Include selects which sub-objects are emitted; the set is closed so downstream tools can parse the file without referencing a schema.
type MakeDiag ¶
type MakeDiag func(file string, line int, msg string) lint.Diagnostic
MakeDiag is the diagnostic constructor the validator uses. Callers supply it so the schema package stays free of rule-ID coupling.
type MatchTree ¶ added in v0.18.0
type MatchTree struct {
// Frontmatter is the document's decoded front matter, unchanged.
Frontmatter map[string]any
// Root is a synthetic node whose Children are the top-level
// section matches in document order. Root.Scope is nil.
Root *ScopeMatch
}
MatchTree is the projection-ready record of how a document's AST satisfied a composed Schema. It is produced after a successful schema match (extraction is gated on conformance) and consumed by internal/extract to build a data tree without re-matching.
Validate keeps its diagnostic-only return; BuildMatchTree is a separate walk so MDS020 is unaffected.
func BuildMatchTree ¶ added in v0.18.0
BuildMatchTree walks f against the composed schema and records the matched headings, captures, and content nodes. It assumes the document already conforms (callers gate on a clean Validate), so it uses the validator's in-order run/yield helpers without the error-recovery branches.
type Matcher ¶ added in v0.17.0
type Matcher struct {
// Regex is the body of a CUE raw-interpolation string
// (`#"..."#`). The validator compiles it as Go RE2 after
// substituting the two helpers in scope: `digits` (the literal
// string `(?P<n>[0-9]+)`) and `fmvar(name)` (the front-matter
// field `name`, regex-escaped). Backslash passes through to RE2
// without doubling; interpolation uses `\#(expr)` inside the
// pattern.
Regex string
// Repeat bounds how many consecutive matching headings this
// matcher claims. The zero value (Repeat{Set: false}) means
// "exactly one"; see Repeat.Bounds for the canonical (min, max)
// pair the validator consumes.
Repeat Repeat
// Sequential, with a `digits` named capture in Regex, asserts
// the captured `n` values are strictly increasing without gaps.
// Without a `digits` capture in the pattern the parser rejects
// `sequential: true`.
Sequential bool
}
Matcher describes how a Scope claims one or more consecutive headings. Authors set it inline as the mapping form (`heading: { regex, repeat?, sequential? }`); the bare-string sugar (`heading: "Overview"`) and the proto.md heading-row tokens (`## ?`, `## ...`, `## Step {n}`, `## {id}`) desugar to the same shape.
type Repeat ¶ added in v0.17.0
type Repeat struct {
// Set reports whether the `repeat:` key was present in the YAML
// mapping. When false, Bounds() returns (1, 1) — the matcher
// claims exactly one heading. When true, an omitted `min:`
// defaults to 0 and an omitted `max:` to unbounded (Max == 0).
Set bool
// Min is the minimum match count. Ignored when Set is false.
Min int
// Max is the maximum match count, with 0 meaning unbounded.
// Ignored when Set is false. The parser rejects `repeat: {
// max: 0 }` so Max == 0 unambiguously signals "unbounded".
Max int
}
Repeat bounds the cardinality of a Matcher's heading run.
type Schema ¶
type Schema struct {
// Frontmatter maps each front-matter key to a CUE expression that
// constrains its value. The map preserves user keys verbatim,
// including any trailing "?" optional-field marker.
Frontmatter map[string]string
// FrontmatterLines maps each front-matter key to the 1-based
// line of its constraint in the schema source, when known. The
// file-based parser populates this from the proto.md
// frontmatter via yaml.Node line metadata; the inline parser
// does not (the config loader unmarshals into typed values
// before this code sees it). Lookup is by the same key form
// stored in Frontmatter (with the optional "?" suffix
// preserved).
FrontmatterLines map[string]int
// Filename is a glob the document basename must match. Empty
// means no filename constraint. Authors set it as a top-level
// `filename:` key on the schema (plan 156 dropped the
// `require.filename:` nesting).
Filename string
// Sections holds the top-level section list at RootLevel; each
// Scope may itself nest further sections one heading level
// deeper. Inline schemas always start at H2 (RootLevel=2), so
// the document H1 is owned by first-line-heading and any
// title-bearing frontmatter field rather than represented here.
// File-based schemas can root at H1 (e.g. a `# ?` wildcard in
// proto.md, RootLevel=1) — that H1 scope is part of Sections
// and its children appear at level 2.
Sections []Scope
// Closed reports whether the root scope is strict: when true,
// unlisted top-level headings produce a diagnostic; when false,
// they are tolerated between listed sections. File-based schemas
// default to Closed=true to preserve the historical
// heading-template semantics; inline schemas default to false
// per plan 146.
Closed bool
// Source is a human-readable label (file path for file-based
// schemas, kind name for inline schemas) used in diagnostics
// referring to the schema itself.
Source string
// RootLevel is the heading level of entries in Sections.
// Inline schemas use 2 (H1 belongs to the title). File-based
// schemas adopt whatever level the topmost heading in the
// proto.md uses — usually 1 for a `# ?` wildcard, 2 when the
// file declares only `## ...` rows.
RootLevel int
// CrossReferences lists patterns whose matches in the document
// body must resolve to a heading slug. See plan 143.
CrossReferences []CrossRef
// Acronyms, if non-nil, asks the validator to flag first-use
// acronyms (length 2-6, all caps) that lack a parenthesised
// expansion. See plan 143.
Acronyms *AcronymRule
// Index, if non-nil, asks `mdsmith fix` to emit a JSON
// side-output describing the document. `mdsmith check` reports
// staleness (missing or outdated file) as a diagnostic so the
// fixer is triggered, but never writes the file itself. See
// plan 143.
Index *IndexSpec
}
Schema is the parsed representation of a single document schema. It is produced by the inline YAML parser or the proto.md file parser; both feed the same struct.
func Compose ¶ added in v0.17.1
Compose merges multiple schemas into one. The composed schema's frontmatter constraints are the union of each input's keys; for keys declared in more than one input, the CUE expressions are conjoined with `&` so a value must satisfy every constraint. Sections are merged so that scopes sharing the same heading label combine their child sections recursively; remaining scopes append in input order. The stricter `closed:` wins (any input that sets Closed=true makes the composed scope closed) and the cardinality intersects (the composed run length must satisfy every input; disjoint ranges have an empty intersection and return an error). Filename uses the first non-empty value; conflicting patterns cause an error so the caller surfaces a clear diagnostic rather than silently ignoring one constraint. Acronyms, CrossReferences, and Index slots are joined from inputs that declare them; conflicts on Index.Output return an error.
Compose returns nil when given no inputs. With a single non-nil input it returns that input unchanged.
func ParseFile ¶
func ParseFile(r *FileReader, path string) (*Schema, error)
ParseFile loads the proto.md schema at path through r and returns the parsed Schema. It expands <?include?> directives, extracts <?require?> filename constraints, and normalises the flat heading-template into a Scope tree (each H2 becomes a top-level Scope, each H3 nests beneath the previous H2, and so on).
File-based schemas default to Closed=true at the root so the historical heading-template behaviour (extras flagged) survives the migration. Inline schemas opt back to open scopes per plan 146.
Plan 156 maps the proto.md heading-row tokens to the new Matcher shape: `## ?` → `{regex: '.+'}`, `## ...` → `{regex: '.+', repeat: {min: 0}}`, `## Step {n}` → `{regex: 'Step \#(digits)'}`, `## {id}` → `{regex: '\#(fmvar(id))'}`. The `<?require?>` directive in proto.md bodies is unchanged.
Note: MDS020's file-schema path (internal/rules/requiredstructure) still routes through its legacy `parseSchema` / `parsedSchema` pipeline, not this `ParseFile`. The new shape parses correctly here and is exercised by the schema-package tests, but proto.md authors won't benefit from the new Matcher / fmvar / digits semantics until that path is migrated (a follow-up plan tracks the cutover so the `{field}` heading/body sync feature survives the move).
func ParseInline ¶
ParseInline builds a Schema from the YAML-decoded inline form found under kinds.<name>.schema: in .mdsmith.yml. The input is the raw map[string]any produced by goyaml so callers do not have to share a dependency on a specific YAML schema struct.
source is a label used in diagnostics that point back at the schema (typically "kind <name>").
Inline schemas default to open scopes (Closed: false) per plan 146. The validator's open-scope semantics still enforce required sections and listed-section ordering; only unlisted headings are tolerated.
Plan 156 collapses the section-entry vocabulary: each entry sets exactly one `heading:` key (null, string, or mapping) and the mapping carries `regex:`, `repeat:`, and `sequential:`. The `aliases:`, `required:`, scope-level `repeats:`/`sequential:`/ `min:`/`max:`, `{unlisted: true}` mapping, and schema-level `require:` shapes are gone; the parser rejects them with a "removed; see plan 156" diagnostic naming the replacement.
func (*Schema) EffectiveRootLevel ¶
EffectiveRootLevel returns the heading level of the root scope list, falling back to 2 when unset.
func (*Schema) FrontmatterCUE ¶
FrontmatterCUE returns a CUE struct literal that constrains the document front matter to the schema. The result is suitable for compiling with cuelang and unifying against a JSON-encoded document front matter. Keys with a trailing "?" are emitted as optional CUE fields with the marker stripped from the label.
type SchemaDiagnostic ¶ added in v0.17.0
type SchemaDiagnostic struct {
// Field names the offending input. For front-matter CUE errors
// it is the top-level key (or dotted path for nested values).
// For structure failures it is the formatted heading
// (e.g. "## Goal"). For filename failures it is "filename".
Field string
// Actual is the value the user wrote, rendered for display:
// strings appear quoted, scalars raw. Empty when the user
// supplied nothing concrete (e.g. a missing required section).
Actual string
// Expected describes the constraint in user vocabulary
// (e.g. `one of: "open", "in-progress", "done"`). Empty when
// the expectation is implied by Field — for instance, a
// "missing required section" diagnostic does not repeat the
// section name as the expected value.
Expected string
// Hint, when non-empty, points the reader at a likely fix
// (typically the nearest valid literal or numeric bound).
// Hints are best-effort; a noisy hint is worse than none, so
// the extractor only fires on a small set of shapes.
Hint string
// SchemaRef names the schema source and (when known) the line
// of the constraint, so the reader can locate the rule
// without parsing the message. Examples: "plan/proto.md:4",
// "inline kind schema".
SchemaRef string
}
SchemaDiagnostic is the structured form of an MDS020 violation. Every CUE-error, structure-mismatch, filename-mismatch, and require-directive failure flows through this type so the rendered message names the field, shows the value, names the constraint, and (when applicable) suggests a fix.
The type lives in the schema package because schema.Validate is the producer; internal/rules/requiredstructure is the sole consumer and would import this from anywhere else. Keeping the formatter alongside the producer avoids duplicating extractor logic across the two packages and sidesteps the import cycle that would result from defining it under internal/rules/.
The `Schema` prefix on the name disambiguates from lint.Diagnostic at import sites where both packages are visible — plan 147 names the type explicitly, and renaming to schema.Diagnostic would require touching every caller for purely cosmetic reasons.
func (SchemaDiagnostic) Format ¶ added in v0.17.0
func (d SchemaDiagnostic) Format() string
Format renders the diagnostic as the two-line message described in plan 147: the first line carries field/actual/expected, an optional hint follows in parentheses on its own indented line, and the schema reference appears on a trailing line so it stays greppable without parsing the message body.
type Scope ¶
type Scope struct {
// Heading is the diagnostic-friendly label for this scope —
// the bare-string literal for the sugar form, the regex body
// for the mapping form, or empty for the preamble. The
// validator does not match on this field; Matcher drives all
// heading claims.
Heading string
// Matcher, when non-nil, describes how this scope claims one or
// more headings. Preamble scopes leave Matcher nil because they
// have no heading text to compare against — their range is
// `[parent-start, first-child-heading)`.
Matcher *Matcher
// Sections is the recursive list of nested sections (one level
// deeper in the document tree).
Sections []Scope
// Preamble reports whether this scope describes the implicit
// section before any heading — the document's lead-in content.
// Authors write it inline as `heading: null`. A preamble scope
// has no heading text to match; its range is [parent-start,
// first-child-heading). Only valid as the first entry of its
// section list.
Preamble bool
// Closed reports whether this scope is strict: when true,
// unlisted child headings produce a diagnostic; when false,
// they are tolerated between listed sub-sections.
Closed bool
// Rules carries per-scope rule-config overrides. Each entry maps
// a rule name to a settings map. The MDS020 walker re-runs each
// named rule with these settings against the document and
// filters diagnostics to the scope's heading range.
//
// Today the override stacks on the rule's defaults, not the
// file's full effective config (defaults → kinds → file globs →
// scope). Threading the full merge through the engine is a
// tracked follow-up on plan 146; see docs/guides/schemas.md.
Rules map[string]map[string]any
// Content carries non-heading AST-node constraints inside the
// matched section: required code blocks, tables, lists, and
// paragraphs, in positional order. Plan 149 added this; entries
// follow the same out-of-order + unlisted-slot semantics the
// heading-tree validator uses.
Content []ContentEntry
}
Scope binds an AST subtree (a section) to a set of constraints and per-rule config overrides. The root scope's children are the top-level (H2) section list; their children are H3, and so on. Levels come from depth in the tree.
Plan 156 collapses the section-entry vocabulary to three orthogonal axes:
- Discriminator — `heading:` value (`null`, string, or mapping).
- Matcher — `regex:` inside the mapping form.
- Cardinality — `repeat: { min, max }` inside the mapping form.
The legacy `aliases:`, `required:`, scope-level `repeats:`/`sequential:`/`min:`/`max:`, and `{unlisted: true}` shapes are gone; the parser rejects them with a "removed; see plan 156" diagnostic naming the replacement.
type ScopeMatch ¶ added in v0.18.0
type ScopeMatch struct {
// Scope is the schema scope this match satisfied. Nil only for
// the synthetic MatchTree.Root.
Scope *Scope
// Preamble reports whether this is the `heading: null`
// no-heading section (content before the first child heading).
Preamble bool
// Heading is the document heading that matched. Zero-valued for
// the preamble and the synthetic root.
Heading DocHeading
// Captures holds every placeholder bound by this heading: named
// regex groups (the `n` digits capture) plus `{field}` fmvar
// placeholders resolved from front matter. Nil when the scope's
// heading is a plain literal.
Captures map[string]string
// Children are the matched child scopes in document order.
Children []*ScopeMatch
// Content are the matched content entries in declared order.
Content []ContentMatch
}
ScopeMatch records one matched section (or the no-heading preamble). A repeating scope produces one ScopeMatch per occurrence, all sharing the same Scope pointer so the projector can group consecutive occurrences into an array.