parser

package
v1.2.0 Latest
Published: Jan 11, 2026 License: MIT Imports: 6 Imported by: 0

Documentation

Overview

Package parser extracts structured content from LLM responses.

Core types:

  • Response: Contains structured data extracted from an LLM response
  • CodeBlock: A fenced code block with language and content
  • Parser: Extracts code blocks, JSON, YAML, sections, and lists

Example usage:

p := parser.NewParser()
resp := p.Parse(llmOutput)

// Access extracted code blocks
for _, block := range resp.CodeBlocks {
    fmt.Printf("Language: %s\nCode:\n%s\n", block.Language, block.Content)
}

// Access parsed JSON
for _, data := range resp.JSONBlocks {
    fmt.Printf("JSON: %v\n", data)
}

Convenience functions:

data := parser.ExtractJSON(response)
code := parser.ExtractCode(response, "go")
parsed := parser.Parse(response)

Constants

This section is empty.

Variables

var (
	// PhaseMarkers matches common phase completion markers.
	PhaseMarkers = NewMarkerMatcher(
		"phase_complete",
		"phase_blocked",
	)

	// TaskMarkers matches task-level markers.
	TaskMarkers = NewMarkerMatcher(
		"task_complete",
		"task_blocked",
		"task_failed",
	)
)

Common marker matchers for orchestration systems.

Functions

func ExtractCode

func ExtractCode(response, language string) string

ExtractCode is a convenience function for code extraction.

func ExtractJSON

func ExtractJSON(response string) map[string]any

ExtractJSON is a convenience function for JSON extraction.

func GetBlockedReason added in v1.1.0

func GetBlockedReason(content string) string

GetBlockedReason extracts the reason from a phase_blocked marker. It returns an empty string if no blocked marker is found.

func IsPhaseBlocked added in v1.1.0

func IsPhaseBlocked(content string) bool

IsPhaseBlocked checks if the content contains a phase blocked marker. This is a convenience function using the default PhaseMarkers matcher.

func IsPhaseComplete added in v1.1.0

func IsPhaseComplete(content string) bool

IsPhaseComplete checks if the content contains a phase completion marker. This is a convenience function using the default PhaseMarkers matcher.

Types

type CodeBlock

type CodeBlock struct {
	// Language is the language specifier after the opening fence (e.g., "go", "python").
	Language string

	// Content is the code inside the block, excluding fences.
	Content string

	// Raw is the complete block including the fences.
	Raw string
}

CodeBlock represents a fenced code block.

type Marker added in v1.1.0

type Marker struct {
	// Tag is the marker name (e.g., "phase_complete", "phase_blocked").
	Tag string

	// Value is the content between the opening and closing tags.
	Value string

	// Raw is the full matched text including tags.
	Raw string
}

Marker represents a detected XML-style marker in content. Markers are used for structured signaling in LLM responses, such as phase completion or blocking indicators.

type MarkerMatcher added in v1.1.0

type MarkerMatcher struct {
	// contains filtered or unexported fields
}

MarkerMatcher finds XML-style markers in content. It compiles regex patterns for each registered tag and caches them for efficient repeated matching.

Example markers:

<phase_complete>true</phase_complete>
<phase_blocked>reason: need clarification</phase_blocked>
<implement_complete>true</implement_complete>

func NewMarkerMatcher added in v1.1.0

func NewMarkerMatcher(tags ...string) *MarkerMatcher

NewMarkerMatcher creates a matcher for the given tag names. Tags should be provided without angle brackets.

Example:

matcher := NewMarkerMatcher("phase_complete", "phase_blocked")
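The per-tag pattern compilation and caching described for MarkerMatcher can be sketched with the standard regexp package. This is a minimal illustration, not the package's actual implementation: compileMarker, markerCache, and firstValue are hypothetical names, and the pattern shape is an assumption. Go's regexp engine has no backreferences, so one pattern per tag is a natural fit.

```go
package main

import (
	"fmt"
	"regexp"
)

// compileMarker builds a pattern for one XML-style tag (hypothetical helper).
// (?s) lets the value span several lines; .*? stops at the nearest closing tag.
func compileMarker(tag string) *regexp.Regexp {
	t := regexp.QuoteMeta(tag)
	return regexp.MustCompile(`(?s)<` + t + `>(.*?)</` + t + `>`)
}

// markerCache maps tag names to compiled patterns so each tag compiles once.
type markerCache map[string]*regexp.Regexp

// firstValue returns the text between the first matching tag pair, if any.
func (c markerCache) firstValue(content, tag string) (string, bool) {
	re, ok := c[tag]
	if !ok {
		re = compileMarker(tag)
		c[tag] = re
	}
	m := re.FindStringSubmatch(content)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	cache := markerCache{}
	v, ok := cache.firstValue("done <phase_complete>true</phase_complete>", "phase_complete")
	fmt.Println(v, ok) // prints "true true"
}
```

The map here is not safe for concurrent writes; a real matcher that allows tags to be added at run time would need a lock around it.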

func (*MarkerMatcher) AddTag added in v1.1.0

func (m *MarkerMatcher) AddTag(tag string)

AddTag adds a new tag to match. This is safe for concurrent use.
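One common way to make a mutable tag set safe for concurrent use is to guard it with a sync.RWMutex. The sketch below shows that pattern only; the source does not say how MarkerMatcher synchronizes internally, and tagSet, add, and list are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// tagSet is a hypothetical stand-in for a matcher's tag registry.
type tagSet struct {
	mu   sync.RWMutex
	tags map[string]struct{}
}

func newTagSet(tags ...string) *tagSet {
	s := &tagSet{tags: make(map[string]struct{})}
	for _, t := range tags {
		s.tags[t] = struct{}{}
	}
	return s
}

// add takes the write lock, so concurrent callers never race on the map.
func (s *tagSet) add(tag string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.tags[tag] = struct{}{}
}

// list takes the read lock and returns a sorted copy of the tags.
func (s *tagSet) list() []string {
	s.mu.RLock()
	defer s.mu.RUnlock()
	out := make([]string, 0, len(s.tags))
	for t := range s.tags {
		out = append(out, t)
	}
	sort.Strings(out)
	return out
}

func main() {
	s := newTagSet("phase_complete")
	var wg sync.WaitGroup
	for _, t := range []string{"phase_blocked", "implement_complete"} {
		wg.Add(1)
		go func(tag string) {
			defer wg.Done()
			s.add(tag)
		}(t)
	}
	wg.Wait()
	fmt.Println(s.list()) // prints "[implement_complete phase_blocked phase_complete]"
}
```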

func (*MarkerMatcher) Contains added in v1.1.0

func (m *MarkerMatcher) Contains(content, tag string) bool

Contains checks if any marker with the given tag exists in content.

func (*MarkerMatcher) ContainsAny added in v1.1.0

func (m *MarkerMatcher) ContainsAny(content string) bool

ContainsAny checks if any of the registered markers exist in content.

func (*MarkerMatcher) ContainsValue added in v1.1.0

func (m *MarkerMatcher) ContainsValue(content, tag, value string) bool

ContainsValue checks if a marker with the specific tag and value exists. The comparison is case-insensitive and trims whitespace from the value first.
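A comparison that ignores case and surrounding whitespace can be written with strings.EqualFold and strings.TrimSpace. The helper name sameValue is hypothetical, and whether the package trims both operands or only the marker value is an assumption here.

```go
package main

import (
	"fmt"
	"strings"
)

// sameValue reports whether two marker values match after trimming
// leading and trailing whitespace and ignoring case (hypothetical helper).
func sameValue(got, want string) bool {
	return strings.EqualFold(strings.TrimSpace(got), strings.TrimSpace(want))
}

func main() {
	fmt.Println(sameValue("  TRUE\n", "true")) // prints "true"
	fmt.Println(sameValue("false", "true"))    // prints "false"
}
```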

func (*MarkerMatcher) FindAll added in v1.1.0

func (m *MarkerMatcher) FindAll(content string) []Marker

FindAll returns all markers found in content for all registered tags.

func (*MarkerMatcher) FindAllForTag added in v1.1.0

func (m *MarkerMatcher) FindAllForTag(content, tag string) []Marker

FindAllForTag returns all markers for a specific tag.

func (*MarkerMatcher) FindFirst added in v1.1.0

func (m *MarkerMatcher) FindFirst(content, tag string) (Marker, bool)

FindFirst returns the first marker found for the given tag. The boolean result is false if no marker is found.

func (*MarkerMatcher) GetValue added in v1.1.0

func (m *MarkerMatcher) GetValue(content, tag string) string

GetValue extracts the value of a marker with the given tag. Returns empty string if the marker is not found.

func (*MarkerMatcher) Tags added in v1.1.0

func (m *MarkerMatcher) Tags() []string

Tags returns the list of tags this matcher looks for.

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

Parser extracts structured content from LLM responses.

func NewParser

func NewParser() *Parser

NewParser creates a new response parser with compiled regexes.

func (*Parser) ExtractAllCode

func (p *Parser) ExtractAllCode(response string) []CodeBlock

ExtractAllCode extracts all code blocks from the response.

func (*Parser) ExtractCode

func (p *Parser) ExtractCode(response, language string) string

ExtractCode extracts the first code block with the given language. If language is empty, returns the first code block found.
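The behavior described above (first block for a language, or the first block of any kind when language is empty) could be approximated as follows. This is a sketch under stated assumptions, not the parser's real code: blockRe and firstCode are hypothetical names, the language match is exact, and the fence string is built with strings.Repeat only so the example can sit inside a fenced block itself.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fence is three backticks, assembled at run time.
var fence = strings.Repeat("`", 3)

// blockRe captures the language specifier (group 1) and body (group 2)
// of a fenced code block. (?s) lets the body span multiple lines.
var blockRe = regexp.MustCompile(`(?s)` + fence + `([\w+#-]*)[ \t]*\r?\n(.*?)` + fence)

// firstCode returns the body of the first block whose language matches.
// An empty language matches any block; no match yields "".
func firstCode(response, language string) string {
	for _, m := range blockRe.FindAllStringSubmatch(response, -1) {
		if language == "" || m[1] == language {
			return strings.TrimSuffix(m[2], "\n")
		}
	}
	return ""
}

func main() {
	resp := "Intro\n" + fence + "python\nprint(1)\n" + fence +
		"\nthen\n" + fence + "go\nx := 1\n" + fence + "\n"
	fmt.Println(firstCode(resp, "go")) // prints "x := 1"
	fmt.Println(firstCode(resp, ""))   // prints "print(1)"
}
```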

func (*Parser) ExtractJSON

func (p *Parser) ExtractJSON(response string) map[string]any

ExtractJSON extracts and parses the first JSON block found. It returns nil if no valid JSON block is found.
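To illustrate the "first valid object, else nil" contract, the sketch below scans for an opening brace and tries to decode a JSON object from that position with encoding/json. Where the real parser looks for JSON (fenced blocks, bare text, or both) is not stated in the source, so scanning the raw text is an assumption, and firstJSONObject is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// firstJSONObject tries every '{' in the text as the start of a JSON
// object and returns the first one that decodes. It returns nil if
// nothing parses.
func firstJSONObject(response string) map[string]any {
	for i := 0; i < len(response); i++ {
		if response[i] != '{' {
			continue
		}
		var obj map[string]any
		// Decode reads a single JSON value and ignores trailing text.
		dec := json.NewDecoder(strings.NewReader(response[i:]))
		if err := dec.Decode(&obj); err == nil {
			return obj
		}
	}
	return nil
}

func main() {
	resp := `Here is the result: {"status": "ok", "count": 2} Done.`
	obj := firstJSONObject(resp)
	fmt.Println(obj["status"], obj["count"]) // prints "ok 2"
}
```

Because the result is a map, callers should check for nil (or use the comma-ok form on lookups) before relying on a key being present.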

func (*Parser) ExtractJSONArray

func (p *Parser) ExtractJSONArray(response string) []map[string]any

ExtractJSONArray extracts and parses JSON arrays from code blocks. Returns all successfully parsed arrays.

func (*Parser) ExtractList

func (p *Parser) ExtractList(response string) []string

ExtractList extracts list items from the response. Supports both - and * bullet points.
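Bullet extraction of this kind is typically a multi-line regular expression. The following is a rough sketch, assuming that a bullet is a - or * followed by at least one space and that indentation is ignored; bulletRe and bulletItems are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// bulletRe matches lines that start with "-" or "*" followed by whitespace.
// (?m) makes ^ and $ match at line boundaries.
var bulletRe = regexp.MustCompile(`(?m)^[ \t]*[-*][ \t]+(.+)$`)

// bulletItems returns the text of each bullet line, in order.
func bulletItems(response string) []string {
	var items []string
	for _, m := range bulletRe.FindAllStringSubmatch(response, -1) {
		items = append(items, strings.TrimSpace(m[1]))
	}
	return items
}

func main() {
	resp := "Steps:\n- first\n* second\nplain line\n  - nested"
	fmt.Printf("%q\n", bulletItems(resp)) // prints ["first" "second" "nested"]
}
```

Requiring whitespace after the bullet character keeps lines such as `**bold** text` or `---` from being read as list items.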

func (*Parser) ExtractNumberedList

func (p *Parser) ExtractNumberedList(response string) []string

ExtractNumberedList extracts numbered list items.
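Numbered items can be matched the same way. The source does not say which numbering styles are recognized, so this sketch assumes both "1." and "1)" forms; numberedRe and numberedItems are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// numberedRe matches lines such as "1. item" or "2) item".
var numberedRe = regexp.MustCompile(`(?m)^[ \t]*\d+[.)][ \t]+(.+)$`)

// numberedItems returns the text of each numbered line, without the number.
func numberedItems(response string) []string {
	var items []string
	for _, m := range numberedRe.FindAllStringSubmatch(response, -1) {
		items = append(items, strings.TrimSpace(m[1]))
	}
	return items
}

func main() {
	resp := "Plan:\n1. alpha\n2) beta\n10. gamma\nv1.2.0 release"
	fmt.Printf("%q\n", numberedItems(resp)) // prints ["alpha" "beta" "gamma"]
}
```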

func (*Parser) ExtractSection

func (p *Parser) ExtractSection(response, title string) string

ExtractSection extracts the content of a specific section by title.
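Assuming sections are delimited by markdown # headings, section lookup can be sketched as a single pass over the lines. In this simplified version any heading, at any level, ends the current section, and titles are compared case-insensitively; both choices are assumptions, and headingRe and section are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// headingRe matches a markdown heading line and captures its title.
var headingRe = regexp.MustCompile(`^#{1,6}[ \t]+(.+?)[ \t]*$`)

// section returns the text under the heading with the given title,
// up to the next heading. It returns "" if the title is not found.
func section(response, title string) string {
	var body []string
	inSection := false
	for _, line := range strings.Split(response, "\n") {
		if m := headingRe.FindStringSubmatch(line); m != nil {
			if inSection {
				break // the next heading closes the section
			}
			inSection = strings.EqualFold(strings.TrimSpace(m[1]), title)
			continue
		}
		if inSection {
			body = append(body, line)
		}
	}
	return strings.TrimSpace(strings.Join(body, "\n"))
}

func main() {
	resp := "# Plan\nStep one.\n\n## Risks\nNone known.\n"
	fmt.Println(section(resp, "Plan"))  // prints "Step one."
	fmt.Println(section(resp, "Risks")) // prints "None known."
}
```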

func (*Parser) ExtractYAML

func (p *Parser) ExtractYAML(response string) []map[string]any

ExtractYAML extracts and parses YAML blocks.

func (*Parser) HasCodeBlock

func (p *Parser) HasCodeBlock(response string) bool

HasCodeBlock checks if the response contains any code block.

func (*Parser) HasJSON

func (p *Parser) HasJSON(response string) bool

HasJSON checks if the response contains valid JSON.

func (*Parser) Parse

func (p *Parser) Parse(response string) *Response

Parse extracts structured content from an LLM response.

type Response

type Response struct {
	// Raw is the original response text.
	Raw string

	// Text is the response with code blocks removed.
	Text string

	// CodeBlocks contains all extracted code blocks.
	CodeBlocks []CodeBlock

	// JSONBlocks contains parsed JSON blocks.
	JSONBlocks []map[string]any

	// Sections contains extracted markdown sections by title.
	Sections map[string]string
}

Response contains structured data extracted from an LLM response.
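The Text field is described as the response with code blocks removed. One way to derive such a value is to delete every fenced span with a regular expression, as in this sketch. stripCode and blockRe are hypothetical names, and trimming the result is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// fence is three backticks, assembled at run time so this example can
// itself be placed inside a fenced block.
var fence = strings.Repeat("`", 3)

// blockRe matches a whole fenced block, fences included.
var blockRe = regexp.MustCompile(`(?s)` + fence + `.*?` + fence)

// stripCode removes every fenced block and trims surrounding whitespace.
func stripCode(response string) string {
	return strings.TrimSpace(blockRe.ReplaceAllString(response, ""))
}

func main() {
	resp := "Before\n" + fence + "go\nx := 1\n" + fence + "\nAfter"
	fmt.Printf("%q\n", stripCode(resp)) // prints "Before\n\nAfter"
}
```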

func Parse

func Parse(response string) *Response

Parse is a convenience function using the default parser.
