Documentation ¶
Overview ¶
Package parser provides AST-based parsing for Raven markdown files.
This file implements goldmark-first parsing: the markdown AST is used to identify code blocks (which are skipped) and text content (from which Raven syntax such as traits and references is extracted).
Index ¶
- Variables
- func ExtractEmbeddedRefs(value string) []string
- func FieldValueFromYAML(value interface{}) schema.FieldValue
- func FrontmatterBounds(lines []string) (startLine int, endLine int, ok bool)
- func IsRefOnTraitLine(traitLine, refLine int) bool
- func NormalizeFenceLine(line string) string
- func ParseFenceMarker(line string) (ch byte, n int, ok bool)
- func ParseFieldValue(s string) schema.FieldValue
- func ParseTraitValue(s string) schema.FieldValue
- func RemoveInlineCode(line string) string
- func SerializeTypeDeclaration(typeName string, fields map[string]schema.FieldValue) string
- func Slugify(text string) string
- func StripTraitAnnotations(line string) string
- type ASTContent
- type EmbeddedTypeInfo
- type ExtractedRef
- type FenceState
- type Frontmatter
- type Heading
- type ParseOptions
- type ParsedDocument
- type ParsedObject
- type ParsedRef
- type ParsedTrait
- type RefExtractOptions
- type Reference
- type TraitAnnotation
- type TypeDeclaration
Constants ¶
This section is empty.
Variables ¶
var TraitHighlightPattern = regexp.MustCompile(`@([\w-]+)(?:\(([^)]*)\))?`)
TraitHighlightPattern is a regex for highlighting traits in already-parsed content. It is simpler than traitRegex because it needs no context validation: it is used only for display, on content that has already been parsed. Capture groups: [1] = trait name, [2] = value (if present).
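As a rough illustration of the two capture groups, the sketch below copies the pattern shown above into a standalone program. The helper name `traitPairs` and the sample sentence are made up for this example and are not part of the package.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same expression as TraitHighlightPattern, copied so the sketch is self-contained.
var traitHighlight = regexp.MustCompile(`@([\w-]+)(?:\(([^)]*)\))?`)

// traitPairs (hypothetical helper) returns a (name, value) pair for every match in s.
// The value is "" when the trait has no parenthesized value.
func traitPairs(s string) [][2]string {
	var out [][2]string
	for _, m := range traitHighlight.FindAllStringSubmatch(s, -1) {
		out = append(out, [2]string{m[1], m[2]})
	}
	return out
}

func main() {
	for _, p := range traitPairs("Ship the report @due(2025-01-15) @highlight") {
		fmt.Printf("name=%q value=%q\n", p[0], p[1])
	}
}
```

Because the pattern does no context validation, it also matches any other `@word` in the text, which is presumably why it is reserved for content that has already been parsed.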
Functions ¶
func ExtractEmbeddedRefs ¶
func ExtractEmbeddedRefs(value string) []string
ExtractEmbeddedRefs parses embedded refs in trait values such as [[[path/to/file]], [[other]]]. It handles array syntax, where refs are wrapped in an extra pair of brackets.
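One plausible way to pull targets out of an array-style value is a regex that matches `[[target]]` and ignores any surrounding brackets. The sketch below is an assumption about the behavior, not the package's implementation; `embeddedRefs` and the `wikilink` pattern are hypothetical names, and display text (such as a `|alias` suffix) is not handled.

```go
package main

import (
	"fmt"
	"regexp"
)

// wikilink matches [[target]] where target contains no square brackets.
// Assumed pattern for illustration only.
var wikilink = regexp.MustCompile(`\[\[([^\[\]]+)\]\]`)

// embeddedRefs is a hypothetical stand-in for ExtractEmbeddedRefs.
func embeddedRefs(value string) []string {
	var refs []string
	for _, m := range wikilink.FindAllStringSubmatch(value, -1) {
		refs = append(refs, m[1])
	}
	return refs
}

func main() {
	// The outer brackets of the array are skipped because a target
	// may not itself contain "[" or "]".
	fmt.Println(embeddedRefs("[[[path/to/file]], [[other]]]"))
}
```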
func FieldValueFromYAML ¶
func FieldValueFromYAML(value interface{}) schema.FieldValue
FieldValueFromYAML converts a YAML value to a FieldValue.
func FrontmatterBounds ¶
func FrontmatterBounds(lines []string) (startLine int, endLine int, ok bool)
FrontmatterBounds returns the opening and closing frontmatter line indices. It only detects frontmatter when the first line is '---'. If frontmatter is present but unclosed, endLine is -1.
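The rules described above can be sketched in a few lines. This is a minimal re-implementation under stated assumptions: indices are 0-based, surrounding whitespace on the delimiter line is ignored, and `ok` stays true for unclosed frontmatter. None of these details are confirmed by the documentation, and `frontmatterBounds` is a hypothetical name.

```go
package main

import (
	"fmt"
	"strings"
)

// frontmatterBounds is a hypothetical sketch of the documented behavior.
func frontmatterBounds(lines []string) (startLine, endLine int, ok bool) {
	// Frontmatter is only recognized when the very first line is "---".
	if len(lines) == 0 || strings.TrimSpace(lines[0]) != "---" {
		return 0, 0, false
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimSpace(lines[i]) == "---" {
			return 0, i, true
		}
	}
	// Opened but never closed: report -1 for the end line.
	return 0, -1, true
}

func main() {
	doc := []string{"---", "type: person", "---", "# Freya"}
	start, end, ok := frontmatterBounds(doc)
	fmt.Println(start, end, ok)
}
```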
func IsRefOnTraitLine ¶
func IsRefOnTraitLine(traitLine, refLine int) bool
IsRefOnTraitLine returns true if a reference is on the same line as a trait. This implements the CONTENT SCOPE RULE: refs on the same line as a trait are considered associated with that trait's content.
This function is the single source of truth for trait-to-reference association. The query executor uses this same logic (matching by file_path and line_number).
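Read literally, the content scope rule reduces to a line-number comparison. The sketch below shows how a caller might use it to collect the refs that belong to one trait; the `traitLoc` and `refLoc` types and the `refsForTrait` helper are invented for the example.

```go
package main

import "fmt"

// traitLoc and refLoc are hypothetical, trimmed-down records.
type traitLoc struct {
	Name string
	Line int
}

type refLoc struct {
	Target string
	Line   int
}

// isRefOnTraitLine mirrors the documented rule: same line means associated.
func isRefOnTraitLine(traitLine, refLine int) bool {
	return traitLine == refLine
}

// refsForTrait returns the targets of all refs on the trait's line.
func refsForTrait(t traitLoc, refs []refLoc) []string {
	var out []string
	for _, r := range refs {
		if isRefOnTraitLine(t.Line, r.Line) {
			out = append(out, r.Target)
		}
	}
	return out
}

func main() {
	refs := []refLoc{{"people/freya", 3}, {"projects/bifrost", 4}}
	fmt.Println(refsForTrait(traitLoc{"due", 3}, refs)) // prints [people/freya]
}
```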
func NormalizeFenceLine ¶
func NormalizeFenceLine(line string) string
NormalizeFenceLine prepares a line for fence marker detection. It strips leading whitespace, blockquote prefixes, and list markers so we can detect fenced code blocks in common markdown contexts (including inside lists).
func ParseFenceMarker ¶
func ParseFenceMarker(line string) (ch byte, n int, ok bool)
ParseFenceMarker checks if a line (after normalization) starts a code fence. Returns the fence character, fence length, and whether it's a valid fence.
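To make the normalize-then-parse flow concrete, here is a simplified sketch. It assumes a fence is a run of at least three backticks or tildes, strips only bullet markers (`-`, `*`, `+`) and not ordered-list markers, and removes all leading indentation. The real functions may differ on each of these points; the lowercase names are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeFenceLine strips indentation, blockquote prefixes and one bullet marker.
func normalizeFenceLine(line string) string {
	s := strings.TrimLeft(line, " \t")
	// Drop any number of blockquote prefixes, e.g. "> > ```".
	for strings.HasPrefix(s, ">") {
		s = strings.TrimLeft(s[1:], " \t")
	}
	// Drop a single bullet marker followed by a space.
	if len(s) >= 2 && strings.ContainsRune("-*+", rune(s[0])) && s[1] == ' ' {
		s = strings.TrimLeft(s[2:], " \t")
	}
	return s
}

// parseFenceMarker reports the fence character and the length of the run.
func parseFenceMarker(line string) (ch byte, n int, ok bool) {
	if line == "" || (line[0] != '`' && line[0] != '~') {
		return 0, 0, false
	}
	ch = line[0]
	for n < len(line) && line[n] == ch {
		n++
	}
	if n < 3 {
		return 0, 0, false
	}
	return ch, n, true
}

func main() {
	ch, n, ok := parseFenceMarker(normalizeFenceLine("  > - ```go"))
	fmt.Printf("%c %d %v\n", ch, n, ok)
}
```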
func ParseFieldValue ¶
func ParseFieldValue(s string) schema.FieldValue
ParseFieldValue parses a single field value using the same rules as ::type() declarations.
func ParseTraitValue ¶
func ParseTraitValue(s string) schema.FieldValue
ParseTraitValue parses a trait value using strict date/datetime validation.
func RemoveInlineCode ¶
func RemoveInlineCode(line string) string
RemoveInlineCode removes inline code spans from a line, replacing them with spaces to preserve character positions for other parsing operations. Handles both single backticks (`code`) and double backticks (``code with `backtick` inside``).
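The position-preserving idea can be sketched as follows: find an opening run of backticks, look for a closing run of the same length, and overwrite everything in between (delimiters included) with spaces. This is an illustrative approximation that works on bytes, so it preserves byte offsets; `removeInlineCode` is a hypothetical name and the package's own handling of edge cases may differ.

```go
package main

import "fmt"

// removeInlineCode blanks out inline code spans while keeping the line length.
func removeInlineCode(line string) string {
	b := []byte(line)
	i := 0
	for i < len(b) {
		if b[i] != '`' {
			i++
			continue
		}
		// Measure the opening backtick run.
		start := i
		for i < len(b) && b[i] == '`' {
			i++
		}
		n := i - start
		// Look for a closing run of exactly n backticks.
		end := -1
		for j := i; j < len(b); {
			if b[j] != '`' {
				j++
				continue
			}
			k := j
			for k < len(b) && b[k] == '`' {
				k++
			}
			if k-j == n {
				end = k
				break
			}
			j = k
		}
		if end < 0 {
			continue // unmatched opener: leave it as literal text
		}
		for p := start; p < end; p++ {
			b[p] = ' '
		}
		i = end
	}
	return string(b)
}

func main() {
	fmt.Printf("%q\n", removeInlineCode("see `@due(x)` now"))
	fmt.Printf("%q\n", removeInlineCode("a ``x ` y`` b"))
}
```

Keeping the length unchanged means offsets computed on the cleaned line can still be applied to the original line.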
func SerializeTypeDeclaration ¶
func SerializeTypeDeclaration(typeName string, fields map[string]schema.FieldValue) string
SerializeTypeDeclaration serializes a type declaration back to ::typename(fields) format.
func StripTraitAnnotations ¶
func StripTraitAnnotations(line string) string
StripTraitAnnotations removes all trait annotations from a line and returns the remaining content.
CONTENT SCOPE RULE: A trait's content consists of all text on the same line as the trait annotation, with trait annotations removed.
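A rough sketch of the content scope rule: delete every annotation from the line and tidy the leftover whitespace. It reuses the display pattern from TraitHighlightPattern rather than the context-validating traitRegex (which is not shown in this documentation), and the whitespace collapsing is an assumption. `stripTraits` is a hypothetical name.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Display pattern copied from TraitHighlightPattern; the real function
// presumably uses a stricter, context-aware pattern.
var traitPattern = regexp.MustCompile(`@([\w-]+)(?:\(([^)]*)\))?`)

// stripTraits removes all trait annotations and collapses leftover whitespace.
func stripTraits(line string) string {
	stripped := traitPattern.ReplaceAllString(line, "")
	return strings.Join(strings.Fields(stripped), " ")
}

func main() {
	fmt.Printf("%q\n", stripTraits("Send invoice @due(2025-01-15) @priority(high)"))
}
```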
Types ¶
type ASTContent ¶
type ASTContent struct {
Headings []Heading
Traits []TraitAnnotation
Refs []Reference
TypeDecls map[int]*EmbeddedTypeInfo // heading line -> type decl
}
ASTContent holds all Raven syntax extracted from a markdown AST.
func ExtractFromAST ¶
func ExtractFromAST(content []byte, startLine int) (*ASTContent, error)
ExtractFromAST parses markdown content with goldmark and extracts all Raven-specific syntax (headings, traits, references, type declarations).
Code blocks (fenced, indented, inline) are automatically skipped - any @traits or [[references]] inside code will not be extracted.
type EmbeddedTypeInfo ¶
type EmbeddedTypeInfo struct {
TypeName string
ID string
Fields map[string]schema.FieldValue
// Line is the 1-indexed line number of the ::type(...) declaration in the file.
// This is the declaration line (not the heading line).
Line int
}
EmbeddedTypeInfo contains simplified embedded type info for the document parser.
func ParseEmbeddedType ¶
func ParseEmbeddedType(line string, lineNumber int) *EmbeddedTypeInfo
ParseEmbeddedType parses an embedded type declaration from a line. Returns nil if the line is not a type declaration. The ID field may be empty - the caller should derive it from the heading if so.
Supports both forms:
- ::typename(field=value, ...) - with parentheses and optional fields
- ::typename - shorthand for ::typename() with no fields
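The two forms can be illustrated with a small standalone parser. This is only a sketch: it stores field values as plain strings instead of schema.FieldValue, splits fields naively on commas (so quoted or array values containing commas would break), and the regex is an assumption. `embeddedType` and `parseEmbeddedType` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// typeDecl accepts "::name" with an optional "(...)" argument list.
var typeDecl = regexp.MustCompile(`^::([\w-]+)(?:\((.*)\))?\s*$`)

// embeddedType is a simplified stand-in for EmbeddedTypeInfo.
type embeddedType struct {
	TypeName string
	Fields   map[string]string
}

// parseEmbeddedType returns nil when the line is not a type declaration.
func parseEmbeddedType(line string) *embeddedType {
	m := typeDecl.FindStringSubmatch(strings.TrimSpace(line))
	if m == nil {
		return nil
	}
	decl := &embeddedType{TypeName: m[1], Fields: map[string]string{}}
	for _, part := range strings.Split(m[2], ",") {
		key, val, found := strings.Cut(part, "=")
		if !found {
			continue // empty argument list or malformed pair
		}
		decl.Fields[strings.TrimSpace(key)] = strings.TrimSpace(val)
	}
	return decl
}

func main() {
	for _, line := range []string{"::meeting(id=standup, time=09:00)", "::meeting", "## Weekly standup"} {
		if d := parseEmbeddedType(line); d != nil {
			fmt.Println(d.TypeName, d.Fields)
		} else {
			fmt.Println("not a type declaration:", line)
		}
	}
}
```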
type ExtractedRef ¶
ExtractedRef represents a resolved ref target and optional display text.
func ExtractRefsFromFieldValue ¶
func ExtractRefsFromFieldValue(fv schema.FieldValue, opts RefExtractOptions) []ExtractedRef
ExtractRefsFromFieldValue extracts refs from a FieldValue using the provided options.
type FenceState ¶
FenceState tracks whether we're inside a fenced code block.
func (*FenceState) UpdateFenceState ¶
func (fs *FenceState) UpdateFenceState(line string) bool
UpdateFenceState updates the fence state based on a line. Returns true if the line is a fence marker (opening or closing).
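A line-by-line scanner might use such a tracker to skip everything inside fenced code. The sketch below is an assumption about how the state could be kept; the struct fields, the `update` method, and the closing rule (same character, run at least as long as the opener, nothing after it) are illustrative and not taken from the package.

```go
package main

import (
	"fmt"
	"strings"
)

// fenceState is a hypothetical stand-in for FenceState.
type fenceState struct {
	inFence bool
	ch      byte // fence character of the open fence
	n       int  // length of the opening run
}

// update returns true when line opens or closes a fence.
func (fs *fenceState) update(line string) bool {
	s := strings.TrimLeft(line, " \t")
	if s == "" || (s[0] != '`' && s[0] != '~') {
		return false
	}
	ch, n := s[0], 0
	for n < len(s) && s[n] == ch {
		n++
	}
	if n < 3 {
		return false
	}
	if !fs.inFence {
		fs.inFence, fs.ch, fs.n = true, ch, n
		return true
	}
	// Inside a fence, only a matching run with no trailing text closes it.
	if ch == fs.ch && n >= fs.n && strings.TrimSpace(s[n:]) == "" {
		fs.inFence = false
		return true
	}
	return false
}

// linesOutsideFences keeps only lines that are not part of a fenced block.
func linesOutsideFences(lines []string) []string {
	var fs fenceState
	var out []string
	for _, l := range lines {
		if fs.update(l) || fs.inFence {
			continue
		}
		out = append(out, l)
	}
	return out
}

func main() {
	doc := []string{"@due(today) write docs", "```", "@not-a-trait", "```", "@highlight done"}
	fmt.Println(linesOutsideFences(doc))
}
```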
type Frontmatter ¶
type Frontmatter struct {
// ObjectType is the type field (if present).
ObjectType string
// Fields are all other fields.
Fields map[string]schema.FieldValue
// Raw is the raw frontmatter content.
Raw string
// EndLine is the line where frontmatter ends (1-indexed).
EndLine int
}
Frontmatter represents parsed frontmatter data.
func ParseFrontmatter ¶
func ParseFrontmatter(content string) (*Frontmatter, error)
ParseFrontmatter parses YAML frontmatter from markdown content. Returns nil if no frontmatter is found.
type ParseOptions ¶
type ParseOptions struct {
// ObjectsRoot is the root directory for typed objects (e.g., "objects/").
// If set, this prefix is stripped from file paths when computing object IDs.
ObjectsRoot string
// PagesRoot is the root directory for untyped pages (e.g., "pages/").
// If set, this prefix is stripped from file paths when computing object IDs.
PagesRoot string
}
ParseOptions contains options for parsing documents.
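To show what stripping a root prefix could look like, here is a small sketch of ID derivation. It assumes forward-slash paths relative to the vault, roots written with a trailing slash, and removal of the `.md` extension; the last point in particular is a guess and is not stated in the documentation. `objectID` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// objectID derives an ID from a vault-relative path by dropping the
// extension (assumed) and the first matching root prefix.
func objectID(filePath, objectsRoot, pagesRoot string) string {
	id := strings.TrimSuffix(filePath, ".md")
	for _, root := range []string{objectsRoot, pagesRoot} {
		if root != "" && strings.HasPrefix(id, root) {
			return strings.TrimPrefix(id, root)
		}
	}
	return id
}

func main() {
	fmt.Println(objectID("objects/people/freya.md", "objects/", "pages/"))
	fmt.Println(objectID("pages/ideas.md", "objects/", "pages/"))
	fmt.Println(objectID("daily/2025-01-15.md", "objects/", "pages/"))
}
```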
type ParsedDocument ¶
type ParsedDocument struct {
FilePath string // File path relative to vault
RawContent string // Raw markdown content
Body string // Content without frontmatter (for full-text search indexing)
Objects []*ParsedObject // All objects in this document
Traits []*ParsedTrait // All traits in this document
Refs []*ParsedRef // All references in this document
}
ParsedDocument represents a fully parsed document.
func ParseDocument ¶
func ParseDocument(content string, filePath string, vaultPath string) (*ParsedDocument, error)
ParseDocument parses a markdown document.
func ParseDocumentWithOptions ¶
func ParseDocumentWithOptions(content string, filePath string, vaultPath string, opts *ParseOptions) (*ParsedDocument, error)
ParseDocumentWithOptions parses a markdown document with custom options.
type ParsedObject ¶
type ParsedObject struct {
ID string // Unique ID (path for file-level, path#id for embedded)
ObjectType string // Type name
Fields map[string]schema.FieldValue // Fields/metadata
Heading *string // Heading text (for embedded objects)
HeadingLevel *int // Heading level (for embedded objects)
ParentID *string // Parent object ID (for embedded objects)
LineStart int // Line where this object starts
LineEnd *int // Line where this object ends (embedded only)
}
ParsedObject represents a parsed object (file-level or embedded).
type ParsedRef ¶
type ParsedRef struct {
SourceID string // Source object ID
TargetRaw string // Raw target (as written)
DisplayText *string // Display text
Line int // Line number
Start int // Start position
End int // End position
}
ParsedRef represents a parsed reference.
type ParsedTrait ¶
type ParsedTrait struct {
TraitType string // Trait type name (e.g., "due", "priority", "highlight")
Value *schema.FieldValue // Trait value (nil for boolean traits)
Content string // The content the trait annotates
ParentObjectID string // Parent object ID
Line int // Line number
}
ParsedTrait represents a parsed trait annotation.
func (*ParsedTrait) HasValue ¶
func (t *ParsedTrait) HasValue() bool
HasValue returns true if this trait has a value.
func (*ParsedTrait) ValueString ¶
func (t *ParsedTrait) ValueString() string
ValueString returns the value as a string, or empty string if no value.
type RefExtractOptions ¶
type RefExtractOptions struct {
// AllowBareStrings treats plain strings as ref targets.
AllowBareStrings bool
// AllowWikilinksInString scans string values for wikilink references.
AllowWikilinksInString bool
// AllowTripleBrackets passes allowTriple=true to the wikilink parser.
AllowTripleBrackets bool
}
RefExtractOptions controls how refs are extracted from a FieldValue.
type Reference ¶
type Reference struct {
TargetRaw string // The raw target (as written)
DisplayText *string // Display text (if different from target)
Line int // Line number where found (1-indexed)
Start int // Start position in line
End int // End position in line
}
Reference represents a parsed [[wikilink]] reference.
func ExtractRefs ¶
ExtractRefs extracts references from content. It automatically skips refs inside fenced code blocks and inline code spans.
type TraitAnnotation ¶
type TraitAnnotation struct {
TraitName string
// Value is the single trait value (nil for boolean traits like @highlight)
Value *schema.FieldValue
Content string // Full line content with all trait annotations removed
Line int
StartOffset int
EndOffset int
}
TraitAnnotation represents a parsed @trait() annotation.
func ParseTrait ¶
func ParseTrait(line string, lineNumber int) *TraitAnnotation
ParseTrait parses a single trait from a line (returns first match).
func ParseTraitAnnotations ¶
func ParseTraitAnnotations(line string, lineNumber int) []TraitAnnotation
ParseTraitAnnotations parses all trait annotations from a text segment.
Note: This function ignores inline code for matching by removing code spans before running the trait regex. This keeps @traits inside inline code from being parsed while preserving the original line content for display.
func (*TraitAnnotation) HasValue ¶
func (t *TraitAnnotation) HasValue() bool
HasValue returns true if this trait has a value.
func (*TraitAnnotation) ValueString ¶
func (t *TraitAnnotation) ValueString() string
ValueString returns the value as a string, or empty string if no value.
type TypeDeclaration ¶
type TypeDeclaration struct {
TypeName string
ID string
Fields map[string]schema.FieldValue
Line int
}
TypeDeclaration represents a parsed ::type() declaration.
func ParseTypeDeclaration ¶
func ParseTypeDeclaration(line string, lineNumber int) (*TypeDeclaration, error)
ParseTypeDeclaration parses a type declaration from a line. Supports both ::typename(args...) and ::typename (without parentheses).