parser

package
v0.0.5
Published: Mar 8, 2026 | License: MIT | Imports: 16 | Imported by: 0

Documentation

Overview

Package parser provides AST-based parsing for Raven markdown files.

This file implements goldmark-first parsing: the markdown AST is used to identify code blocks, which are skipped, and text content, from which Raven syntax such as traits and references is extracted.

Package parser handles parsing markdown files.

Constants

This section is empty.

Variables

var TraitHighlightPattern = regexp.MustCompile(`@([\w-]+)(?:\(([^)]*)\))?`)

TraitHighlightPattern is a regex for highlighting traits in already-parsed content. It's simpler than traitRegex because it doesn't need context validation - it's used for display purposes on content that has already been parsed. Capture groups: [1] = trait name, [2] = value (if present)
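To make the two capture groups concrete, here is a minimal sketch that copies the documented pattern into a standalone program so it runs without importing the package. The helper name traitPairs and the sample line are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Same pattern as the documented TraitHighlightPattern, copied here so the
// example is self-contained.
var traitHighlight = regexp.MustCompile(`@([\w-]+)(?:\(([^)]*)\))?`)

// traitPairs returns a (name, value) pair for every match in line.
// Hypothetical helper, not part of the package.
func traitPairs(line string) [][2]string {
	var out [][2]string
	for _, m := range traitHighlight.FindAllStringSubmatch(line, -1) {
		// m[1] is the trait name; m[2] is the value, or "" when the
		// optional parenthesized group did not participate in the match.
		out = append(out, [2]string{m[1], m[2]})
	}
	return out
}

func main() {
	for _, p := range traitPairs("Ship release @due(2026-03-08) @highlight") {
		fmt.Printf("name=%q value=%q\n", p[0], p[1])
	}
}
```

Because the pattern performs no context validation, it also matches any other "@word" sequence in the text, which is why it suits display of content that has already been parsed rather than first-pass parsing.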

Functions

func ExtractEmbeddedRefs

func ExtractEmbeddedRefs(value string) []string

ExtractEmbeddedRefs parses embedded refs in trait values such as [[[path/to/file]], [[other]]]. It handles array syntax, where the refs are wrapped in an extra pair of brackets.
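The array case can be illustrated with a small standalone sketch. It does not call the package; the pattern wikilinkSketch, the function embeddedRefsSketch, and the handling of a "|display" suffix are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// wikilinkSketch matches [[target]] where target contains no square brackets.
// Illustrative pattern only, not the package's internal regex.
var wikilinkSketch = regexp.MustCompile(`\[\[([^\[\]]+)\]\]`)

// embeddedRefsSketch returns the target of every wikilink found in value.
// Assumption: an optional "|display" suffix inside the link is dropped.
func embeddedRefsSketch(value string) []string {
	var targets []string
	for _, m := range wikilinkSketch.FindAllStringSubmatch(value, -1) {
		target, _, _ := strings.Cut(m[1], "|")
		targets = append(targets, strings.TrimSpace(target))
	}
	return targets
}

func main() {
	// The outer brackets of the array are skipped because the pattern
	// requires a non-bracket character right after "[[".
	fmt.Println(embeddedRefsSketch("[[[path/to/file]], [[other]]]"))
}
```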

func FieldValueFromYAML

func FieldValueFromYAML(value interface{}) schema.FieldValue

FieldValueFromYAML converts a YAML value to a FieldValue.

func FrontmatterBounds

func FrontmatterBounds(lines []string) (startLine int, endLine int, ok bool)

FrontmatterBounds returns the opening and closing frontmatter line indices. It only detects frontmatter when the first line is '---'. If frontmatter is present but unclosed, endLine is -1.
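The contract above can be sketched as follows. This is a standalone approximation, not the package's code: it assumes 0-based indices, assumes ok is true for the unclosed case, and returns -1 for both indices when no frontmatter is detected. The name frontmatterBoundsSketch is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// frontmatterBoundsSketch approximates the documented behavior.
// Assumptions: indices are 0-based, a marker is a line equal to "---" after
// trailing whitespace is trimmed, and an unclosed block reports ok == true.
func frontmatterBoundsSketch(lines []string) (startLine, endLine int, ok bool) {
	if len(lines) == 0 || strings.TrimRight(lines[0], " \t\r") != "---" {
		return -1, -1, false // frontmatter must start on the first line
	}
	for i := 1; i < len(lines); i++ {
		if strings.TrimRight(lines[i], " \t\r") == "---" {
			return 0, i, true
		}
	}
	return 0, -1, true // opened but never closed
}

func main() {
	lines := strings.Split("---\ntype: person\n---\n# Ada", "\n")
	start, end, ok := frontmatterBoundsSketch(lines)
	fmt.Println(start, end, ok)
}
```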

func IsRefOnTraitLine

func IsRefOnTraitLine(traitLine, refLine int) bool

IsRefOnTraitLine returns true if a reference is on the same line as a trait. This implements the CONTENT SCOPE RULE: refs on the same line as a trait are considered associated with that trait's content.

This function is the single source of truth for trait-to-reference association. The query executor uses this same logic (matching by file_path and line_number).
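As a rough illustration of the content scope rule, the sketch below groups refs under a trait by comparing line numbers. The types traitAt and refAt and both functions are hypothetical stand-ins; only the same-line rule itself comes from the documentation.

```go
package main

import "fmt"

// traitAt and refAt are hypothetical stand-ins for parsed traits and refs.
type traitAt struct {
	Name string
	Line int
}

type refAt struct {
	Target string
	Line   int
}

// onTraitLine states the rule as documented: same line means associated.
func onTraitLine(traitLine, refLine int) bool {
	return traitLine == refLine
}

// refsFor collects the targets of refs that share a line with t.
func refsFor(t traitAt, refs []refAt) []string {
	var out []string
	for _, r := range refs {
		if onTraitLine(t.Line, r.Line) {
			out = append(out, r.Target)
		}
	}
	return out
}

func main() {
	refs := []refAt{{"people/ada", 3}, {"projects/raven", 3}, {"people/bob", 4}}
	fmt.Println(refsFor(traitAt{"due", 3}, refs))
}
```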

func NormalizeFenceLine

func NormalizeFenceLine(line string) string

NormalizeFenceLine prepares a line for fence marker detection. It strips leading whitespace, blockquote prefixes, and list markers so we can detect fenced code blocks in common markdown contexts (including inside lists).

func ParseFenceMarker

func ParseFenceMarker(line string) (ch byte, n int, ok bool)

ParseFenceMarker checks if a line (after normalization) starts a code fence. Returns the fence character, fence length, and whether it's a valid fence.
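Normalization followed by marker detection might look roughly like this. The sketch is simplified and standalone: it only strips ">" prefixes and "-", "*", "+" bullets (ordered list markers are ignored), and it assumes the CommonMark rule that a fence is a run of at least three backticks or tildes. The names normalizeSketch and fenceMarkerSketch are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeSketch strips leading whitespace, blockquote prefixes, and
// bullet list markers. Simplified: ordered list markers are not handled.
func normalizeSketch(line string) string {
	s := strings.TrimLeft(line, " \t")
	for {
		switch {
		case strings.HasPrefix(s, ">"):
			s = strings.TrimLeft(s[1:], " \t")
		case strings.HasPrefix(s, "- "), strings.HasPrefix(s, "* "), strings.HasPrefix(s, "+ "):
			s = strings.TrimLeft(s[2:], " \t")
		default:
			return s
		}
	}
}

// fenceMarkerSketch reports the fence character, the run length, and whether
// the normalized line starts a fence (assumed: at least three ` or ~).
func fenceMarkerSketch(normalized string) (ch byte, n int, ok bool) {
	if normalized == "" || (normalized[0] != '`' && normalized[0] != '~') {
		return 0, 0, false
	}
	ch = normalized[0]
	for n < len(normalized) && normalized[n] == ch {
		n++
	}
	if n < 3 {
		return 0, 0, false
	}
	return ch, n, true
}

func main() {
	// A tilde fence nested in a blockquote and a list item.
	ch, n, ok := fenceMarkerSketch(normalizeSketch("  > - ~~~~go"))
	fmt.Printf("%c %d %v\n", ch, n, ok)
}
```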

func ParseFieldValue

func ParseFieldValue(s string) schema.FieldValue

ParseFieldValue parses a single field value using the same rules as ::type() declarations.

func ParseTraitValue

func ParseTraitValue(s string) schema.FieldValue

ParseTraitValue parses a trait value using strict date/datetime validation.

func RemoveInlineCode

func RemoveInlineCode(line string) string

RemoveInlineCode removes inline code spans from a line, replacing them with spaces to preserve character positions for other parsing operations. It handles both single-backtick spans (`code`) and double-backtick spans (``code with `backtick` inside``).
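The position-preserving replacement can be sketched as below. This standalone version is an assumption about the approach, not the package's code: it matches an opening backtick run with the next run of the same length, blanks the whole span byte by byte (so byte offsets, rather than rune counts, are preserved), and leaves unmatched backticks untouched. The name blankInlineCode is hypothetical.

```go
package main

import "fmt"

// blankInlineCode overwrites inline code spans, delimiters included, with
// spaces so that byte offsets in the rest of the line stay valid.
func blankInlineCode(line string) string {
	b := []byte(line)
	i := 0
	for i < len(b) {
		if b[i] != '`' {
			i++
			continue
		}
		// Measure the opening backtick run.
		open := i
		for i < len(b) && b[i] == '`' {
			i++
		}
		runLen := i - open
		// Look for a closing run of exactly the same length.
		closeEnd := -1
		for j := i; j < len(b); {
			if b[j] != '`' {
				j++
				continue
			}
			k := j
			for k < len(b) && b[k] == '`' {
				k++
			}
			if k-j == runLen {
				closeEnd = k
				break
			}
			j = k
		}
		if closeEnd < 0 {
			continue // unmatched opener: keep it as literal text
		}
		for p := open; p < closeEnd; p++ {
			b[p] = ' '
		}
		i = closeEnd
	}
	return string(b)
}

func main() {
	fmt.Printf("%q\n", blankInlineCode("see `@due(x)` and @todo"))
}
```

Keeping the length unchanged means that offsets computed on the blanked line, such as a trait's start and end, still index correctly into the original line.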

func SerializeTypeDeclaration

func SerializeTypeDeclaration(typeName string, fields map[string]schema.FieldValue) string

SerializeTypeDeclaration serializes a type declaration back to ::typename(fields) format.

func Slugify

func Slugify(text string) string

Slugify converts a heading text to a URL-friendly slug.
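The exact slug rules are not documented here, so the following is only a sketch of one common convention: lowercase the text, keep letters and digits, and collapse every other run of characters into a single hyphen with none at the ends. The function name slugifySketch and these rules are assumptions and may differ from what Slugify actually produces.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// slugifySketch lowercases text and joins runs of letters and digits with
// single hyphens. Assumed rules, not the package's documented behavior.
func slugifySketch(text string) string {
	var b strings.Builder
	pendingDash := false
	for _, r := range strings.ToLower(text) {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			// Emit a separator only between two kept runs, never first.
			if pendingDash && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingDash = false
			b.WriteRune(r)
		} else {
			pendingDash = true
		}
	}
	return b.String()
}

func main() {
	fmt.Println(slugifySketch("Weekly Standup: Q1 Plans!"))
}
```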

func StripTraitAnnotations

func StripTraitAnnotations(line string) string

StripTraitAnnotations removes all trait annotations from a line and returns the remaining content.

CONTENT SCOPE RULE: A trait's content consists of all text on the same line as the trait annotation, with trait annotations removed.
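One way to picture the content scope rule is to delete every annotation from the line and keep what is left. The sketch below reuses the documented TraitHighlightPattern regex for matching; the whitespace collapsing and the name stripTraitsSketch are assumptions, and the real function may treat list markers or spacing differently.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// Copy of the documented TraitHighlightPattern so the example is standalone.
var traitPattern = regexp.MustCompile(`@([\w-]+)(?:\(([^)]*)\))?`)

// stripTraitsSketch removes all trait annotations and collapses the
// whitespace they leave behind. Whitespace handling is an assumption.
func stripTraitsSketch(line string) string {
	stripped := traitPattern.ReplaceAllString(line, "")
	return strings.Join(strings.Fields(stripped), " ")
}

func main() {
	fmt.Println(stripTraitsSketch("- @due(2026-03-08) Send invoice @priority(high)"))
}
```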

Types

type ASTContent

type ASTContent struct {
	Headings  []Heading
	Traits    []TraitAnnotation
	Refs      []Reference
	TypeDecls map[int]*EmbeddedTypeInfo // heading line -> type decl
}

ASTContent holds all Raven syntax extracted from a markdown AST.

func ExtractFromAST

func ExtractFromAST(content []byte, startLine int) (*ASTContent, error)

ExtractFromAST parses markdown content with goldmark and extracts all Raven-specific syntax (headings, traits, references, type declarations).

Code blocks (fenced, indented, inline) are automatically skipped - any @traits or [[references]] inside code will not be extracted.

type EmbeddedTypeInfo

type EmbeddedTypeInfo struct {
	TypeName string
	ID       string
	Fields   map[string]schema.FieldValue
	// Line is the 1-indexed line number of the ::type(...) declaration in the file.
	// This is the declaration line (not the heading line).
	Line int
}

EmbeddedTypeInfo contains simplified embedded type info for the document parser.

func ParseEmbeddedType

func ParseEmbeddedType(line string, lineNumber int) *EmbeddedTypeInfo

ParseEmbeddedType parses an embedded type declaration from a line. Returns nil if the line is not a type declaration. The ID field may be empty - the caller should derive it from the heading if so.

Supports both forms:

  • ::typename(field=value, ...) - with parentheses and optional fields
  • ::typename - shorthand for ::typename() with no fields
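The two accepted forms can be recognized with a single optional group, as in this standalone sketch. The pattern, the function parseTypeDeclSketch, and the naive comma and "=" splitting (no quoted values, arrays, or refs) are assumptions for illustration; the package's real grammar is richer and returns schema.FieldValue values rather than strings.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// typeDeclSketch matches "::name" with an optional "(...)" argument list.
var typeDeclSketch = regexp.MustCompile(`^::([\w-]+)(?:\((.*)\))?\s*$`)

// parseTypeDeclSketch returns the type name and raw string fields.
// Simplification: fields are split on "," and "=" with no quoting support.
func parseTypeDeclSketch(line string) (name string, fields map[string]string, ok bool) {
	m := typeDeclSketch.FindStringSubmatch(strings.TrimSpace(line))
	if m == nil {
		return "", nil, false // not a type declaration
	}
	fields = map[string]string{}
	for _, part := range strings.Split(m[2], ",") {
		k, v, found := strings.Cut(part, "=")
		if !found {
			continue // empty argument list or malformed pair
		}
		fields[strings.TrimSpace(k)] = strings.TrimSpace(v)
	}
	return m[1], fields, true
}

func main() {
	for _, line := range []string{"::meeting(id=standup, time=09:00)", "::meeting", "# Heading"} {
		name, fields, ok := parseTypeDeclSketch(line)
		fmt.Println(name, fields, ok)
	}
}
```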

type ExtractedRef

type ExtractedRef struct {
	TargetRaw   string
	DisplayText *string
}

ExtractedRef represents a resolved ref target and optional display text.

func ExtractRefsFromFieldValue

func ExtractRefsFromFieldValue(fv schema.FieldValue, opts RefExtractOptions) []ExtractedRef

ExtractRefsFromFieldValue extracts refs from a FieldValue using the provided options.

type FenceState

type FenceState struct {
	InFence  bool
	FenceCh  byte
	FenceLen int
}

FenceState tracks whether we're inside a fenced code block.

func (*FenceState) UpdateFenceState

func (fs *FenceState) UpdateFenceState(line string) bool

UpdateFenceState updates the fence state based on a line. Returns true if the line is a fence marker (opening or closing).
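A line-by-line scanner built on this kind of state might look like the sketch below. It is standalone and simplified: the type fenceSketch mirrors the documented fields, the closing rule (same character, at least as long as the opener, nothing else on the line) is assumed from CommonMark, and blockquote or list normalization is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// fenceSketch mirrors the documented FenceState fields.
type fenceSketch struct {
	inFence  bool
	fenceCh  byte
	fenceLen int
}

// update reports whether line is a fence marker and updates the state.
// Assumed closing rule: same character, length >= opener, no trailing text.
func (fs *fenceSketch) update(line string) bool {
	s := strings.TrimSpace(line)
	if s == "" || (s[0] != '`' && s[0] != '~') {
		return false
	}
	ch, n := s[0], 0
	for n < len(s) && s[n] == ch {
		n++
	}
	if n < 3 {
		return false
	}
	if !fs.inFence {
		fs.inFence, fs.fenceCh, fs.fenceLen = true, ch, n
		return true
	}
	if ch == fs.fenceCh && n >= fs.fenceLen && n == len(s) {
		*fs = fenceSketch{} // closing fence: reset the state
		return true
	}
	return false
}

// linesOutsideFences keeps lines that are neither markers nor fenced content.
func linesOutsideFences(doc string) []string {
	var fs fenceSketch
	var out []string
	for _, line := range strings.Split(doc, "\n") {
		if fs.update(line) || fs.inFence {
			continue
		}
		out = append(out, line)
	}
	return out
}

func main() {
	doc := "@due(today) ship it\n~~~\n@not-a-trait\n~~~\nafter"
	fmt.Printf("%q\n", linesOutsideFences(doc))
}
```

Checking the return value before the inFence field matters: the closing marker resets the state, so it would otherwise be mistaken for ordinary content.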

type Frontmatter

type Frontmatter struct {
	// ObjectType is the type field (if present).
	ObjectType string

	// Fields are all other fields.
	Fields map[string]schema.FieldValue

	// Raw is the raw frontmatter content.
	Raw string

	// EndLine is the line where frontmatter ends (1-indexed).
	EndLine int
}

Frontmatter represents parsed frontmatter data.

func ParseFrontmatter

func ParseFrontmatter(content string) (*Frontmatter, error)

ParseFrontmatter parses YAML frontmatter from markdown content. Returns nil if no frontmatter is found.

type Heading

type Heading struct {
	Level int
	Text  string
	Line  int // 1-indexed
}

Heading represents a parsed heading.

type ParseOptions

type ParseOptions struct {
	// ObjectsRoot is the root directory for typed objects (e.g., "objects/").
	// If set, this prefix is stripped from file paths when computing object IDs.
	ObjectsRoot string

	// PagesRoot is the root directory for untyped pages (e.g., "pages/").
	// If set, this prefix is stripped from file paths when computing object IDs.
	PagesRoot string
}

ParseOptions contains options for parsing documents.

type ParsedDocument

type ParsedDocument struct {
	FilePath   string          // File path relative to vault
	RawContent string          // Raw markdown content
	Body       string          // Content without frontmatter (for full-text search indexing)
	Objects    []*ParsedObject // All objects in this document
	Traits     []*ParsedTrait  // All traits in this document
	Refs       []*ParsedRef    // All references in this document
}

ParsedDocument represents a fully parsed document.

func ParseDocument

func ParseDocument(content string, filePath string, vaultPath string) (*ParsedDocument, error)

ParseDocument parses a markdown document.

func ParseDocumentWithOptions

func ParseDocumentWithOptions(content string, filePath string, vaultPath string, opts *ParseOptions) (*ParsedDocument, error)

ParseDocumentWithOptions parses a markdown document with custom options.

type ParsedObject

type ParsedObject struct {
	ID           string                       // Unique ID (path for file-level, path#id for embedded)
	ObjectType   string                       // Type name
	Fields       map[string]schema.FieldValue // Fields/metadata
	Heading      *string                      // Heading text (for embedded objects)
	HeadingLevel *int                         // Heading level (for embedded objects)
	ParentID     *string                      // Parent object ID (for embedded objects)
	LineStart    int                          // Line where this object starts
	LineEnd      *int                         // Line where this object ends (embedded only)
}

ParsedObject represents a parsed object (file-level or embedded).

type ParsedRef

type ParsedRef struct {
	SourceID    string  // Source object ID
	TargetRaw   string  // Raw target (as written)
	DisplayText *string // Display text
	Line        int     // Line number
	Start       int     // Start position
	End         int     // End position
}

ParsedRef represents a parsed reference.

type ParsedTrait

type ParsedTrait struct {
	TraitType      string             // Trait type name (e.g., "due", "priority", "highlight")
	Value          *schema.FieldValue // Trait value (nil for boolean traits)
	Content        string             // The content the trait annotates
	ParentObjectID string             // Parent object ID
	Line           int                // Line number
}

ParsedTrait represents a parsed trait annotation.

func (*ParsedTrait) HasValue

func (t *ParsedTrait) HasValue() bool

HasValue returns true if this trait has a value.

func (*ParsedTrait) ValueString

func (t *ParsedTrait) ValueString() string

ValueString returns the value as a string, or empty string if no value.

type RefExtractOptions

type RefExtractOptions struct {
	// AllowBareStrings treats plain strings as ref targets.
	AllowBareStrings bool
	// AllowWikilinksInString scans string values for wikilink references.
	AllowWikilinksInString bool
	// AllowTripleBrackets passes allowTriple=true to the wikilink parser.
	AllowTripleBrackets bool
}

RefExtractOptions controls how refs are extracted from a FieldValue.

type Reference

type Reference struct {
	TargetRaw   string  // The raw target (as written)
	DisplayText *string // Display text (if different from target)
	Line        int     // Line number where found (1-indexed)
	Start       int     // Start position in line
	End         int     // End position in line
}

Reference represents a parsed [[wikilink]] reference.

func ExtractRefs

func ExtractRefs(content string, startLine int) []Reference

ExtractRefs extracts references from content. It automatically skips refs inside fenced code blocks and inline code spans.

type TraitAnnotation

type TraitAnnotation struct {
	TraitName string
	// Value is the single trait value (nil for boolean traits like @highlight)
	Value       *schema.FieldValue
	Content     string // Full line content with all trait annotations removed
	Line        int
	StartOffset int
	EndOffset   int
}

TraitAnnotation represents a parsed @trait() annotation.

func ParseTrait

func ParseTrait(line string, lineNumber int) *TraitAnnotation

ParseTrait parses a single trait from a line (returns first match).

func ParseTraitAnnotations

func ParseTraitAnnotations(line string, lineNumber int) []TraitAnnotation

ParseTraitAnnotations parses all trait annotations from a text segment.

Note: This function ignores inline code for matching by removing code spans before running the trait regex. This keeps @traits inside inline code from being parsed while preserving the original line content for display.

func (*TraitAnnotation) HasValue

func (t *TraitAnnotation) HasValue() bool

HasValue returns true if this trait has a value.

func (*TraitAnnotation) ValueString

func (t *TraitAnnotation) ValueString() string

ValueString returns the value as a string, or empty string if no value.

type TypeDeclaration

type TypeDeclaration struct {
	TypeName string
	ID       string
	Fields   map[string]schema.FieldValue
	Line     int
}

TypeDeclaration represents a parsed ::type() declaration.

func ParseTypeDeclaration

func ParseTypeDeclaration(line string, lineNumber int) (*TypeDeclaration, error)

ParseTypeDeclaration parses a type declaration from a line. Supports both ::typename(args...) and ::typename (without parentheses).
