content

package
v1.0.1 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 23, 2026 | License: MIT | Imports: 3 | Imported by: 0

Documentation

Overview

Package content defines interfaces for parsing and processing document content.

Constants

This section is empty.

Variables

var ErrSectionNotFound = fmt.Errorf("section not found")

ErrSectionNotFound is returned when a section heading cannot be matched.
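
Callers will usually want to tell "no such section" apart from other failures. The following is a minimal sketch of that check using errors.Is. The names errSectionNotFound, lookup, and isMissing are hypothetical stand-ins defined here so the example compiles on its own; whether the package wraps the sentinel before returning it is not stated in the documentation, and errors.Is handles both cases.

```go
package main

import (
	"errors"
	"fmt"
)

// errSectionNotFound is a local stand-in for content.ErrSectionNotFound.
var errSectionNotFound = fmt.Errorf("section not found")

// lookup is a hypothetical helper that wraps the sentinel with extra context.
func lookup(sections map[string]string, name string) (string, error) {
	body, ok := sections[name]
	if !ok {
		return "", fmt.Errorf("lookup %q: %w", name, errSectionNotFound)
	}
	return body, nil
}

// isMissing reports whether err is, or wraps, the sentinel.
func isMissing(err error) bool {
	return errors.Is(err, errSectionNotFound)
}

func main() {
	_, err := lookup(map[string]string{"Install": "steps"}, "Usage")
	fmt.Println(isMissing(err)) // true
}
```

Comparing with errors.Is rather than == keeps the check valid if a caller or a future version adds context to the error with %w.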

Functions

This section is empty.

Types

type MarkdownProcessor

type MarkdownProcessor struct{}

MarkdownProcessor parses markdown documents by their ATX heading structure (headings written with one to six leading # characters).

func (*MarkdownProcessor) CanProcess

func (m *MarkdownProcessor) CanProcess(path, contentType string) bool

func (*MarkdownProcessor) Name

func (m *MarkdownProcessor) Name() string

func (*MarkdownProcessor) Outline

func (m *MarkdownProcessor) Outline(text string, maxDepth, maxSections int) OutlineResult

func (*MarkdownProcessor) ReadSection

func (m *MarkdownProcessor) ReadSection(text, sectionName string) (string, error)
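
To illustrate what outlining by ATX headings can look like, here is a simplified sketch. It is not the package's implementation: section, parseATX, and outline are hypothetical names, the heading rule is reduced to "1 to 6 # characters followed by a space", headings inside fenced code blocks are not skipped, and the meaning given to the truncated flag (a heading was dropped because of maxSections) is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// section mirrors a subset of the documented Section type (Size is omitted).
type section struct {
	Heading string
	Level   int
	Line    int // 1-based line number
}

// parseATX reports the level and text of an ATX heading line such as "## Usage".
// Simplification: no leading indent and no closing-hash handling.
func parseATX(line string) (level int, text string, ok bool) {
	for level < len(line) && line[level] == '#' {
		level++
	}
	if level == 0 || level > 6 || level >= len(line) || line[level] != ' ' {
		return 0, "", false
	}
	return level, strings.TrimSpace(line[level:]), true
}

// outline collects headings, honoring maxDepth and maxSections (0 = no limit).
// truncated is true when a heading was dropped because of maxSections.
func outline(text string, maxDepth, maxSections int) (sections []section, truncated bool) {
	for i, line := range strings.Split(text, "\n") {
		level, heading, ok := parseATX(strings.TrimRight(line, "\r"))
		if !ok || (maxDepth > 0 && level > maxDepth) {
			continue
		}
		if maxSections > 0 && len(sections) == maxSections {
			return sections, true
		}
		sections = append(sections, section{Heading: heading, Level: level, Line: i + 1})
	}
	return sections, false
}

func main() {
	doc := "# Guide\n\nintro\n\n## Install\n\nsteps\n\n### Linux\n\n## Usage\n"
	sections, truncated := outline(doc, 2, 2)
	for _, s := range sections {
		fmt.Printf("%d %q line %d\n", s.Level, s.Heading, s.Line)
	}
	fmt.Println("truncated:", truncated)
}
```

With maxDepth set to 2 the level-3 heading is skipped, and with maxSections set to 2 the final "Usage" heading is dropped and reported through the flag.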

type NoOpSummarizer

type NoOpSummarizer struct{}

NoOpSummarizer is the default Summarizer; it relies on agent-submitted summaries only.

func (*NoOpSummarizer) Summarize

func (n *NoOpSummarizer) Summarize(ctx context.Context, domain, path, content string) (string, error)

type OutlineResult

type OutlineResult struct {
	Sections  []Section `json:"sections"`
	Truncated bool      `json:"truncated,omitempty"`
}

OutlineResult holds the parsed structure of a document.
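
The struct tags determine the JSON shape of an outline. The sketch below copies the documented OutlineResult and Section declarations locally (so it compiles without the package's import path, which this page does not show) and marshals them with encoding/json; encode is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, so the sketch compiles on its own.
type Section struct {
	Heading string `json:"heading"`
	Level   int    `json:"level"`
	Line    int    `json:"line"`
	Size    int    `json:"size_chars"`
}

type OutlineResult struct {
	Sections  []Section `json:"sections"`
	Truncated bool      `json:"truncated,omitempty"`
}

// encode marshals an OutlineResult to a JSON string.
func encode(r OutlineResult) (string, error) {
	b, err := json.Marshal(r)
	return string(b), err
}

func main() {
	res := OutlineResult{Sections: []Section{{Heading: "Intro", Level: 1, Line: 1, Size: 42}}}
	out, _ := encode(res)
	fmt.Println(out) // "truncated" is omitted while it is false

	res.Truncated = true
	out, _ = encode(res)
	fmt.Println(out) // now ends with "truncated":true
}
```

Because of omitempty, consumers should treat a missing "truncated" key as false. Note also that a nil Sections slice encodes as null rather than [], which matters to clients that iterate the array without a null check.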

type Processor

type Processor interface {
	// Name returns the processor identifier (e.g. "markdown").
	Name() string

	// CanProcess returns true if this processor handles the given path/content type.
	CanProcess(path, contentType string) bool

	// Outline extracts the heading structure from content.
	// maxDepth limits heading levels (0 = no limit).
	// maxSections caps the number of returned sections (0 = no limit).
	Outline(content string, maxDepth, maxSections int) OutlineResult

	// ReadSection extracts the content under a named heading.
	// Uses case-insensitive substring matching on heading text.
	ReadSection(content, sectionName string) (string, error)
}

Processor parses document content into structural components. Implementations target specific content formats; this package provides MarkdownProcessor for markdown.
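
The ReadSection contract (case-insensitive substring matching on heading text) can be sketched for markdown as follows. This is an illustration under stated assumptions, not the package's code: readSection, headingLevel, and errNotFound are hypothetical names, the first matching heading wins, and a section is assumed to end at the next heading of the same or a shallower level, so nested subsections are included.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errNotFound stands in for the package's ErrSectionNotFound.
var errNotFound = errors.New("section not found")

// headingLevel returns the ATX level of line (1-6), or 0 if it is not a heading.
func headingLevel(line string) int {
	n := 0
	for n < len(line) && line[n] == '#' {
		n++
	}
	if n == 0 || n > 6 || n >= len(line) || line[n] != ' ' {
		return 0
	}
	return n
}

// readSection returns the text under the first heading whose text contains
// name, ignoring case. Assumption: the section ends at the next heading of
// the same or a shallower level.
func readSection(text, name string) (string, error) {
	lines := strings.Split(text, "\n")
	want := strings.ToLower(name)
	for i, line := range lines {
		level := headingLevel(line)
		if level == 0 || !strings.Contains(strings.ToLower(line[level:]), want) {
			continue
		}
		end := len(lines)
		for j := i + 1; j < len(lines); j++ {
			if l := headingLevel(lines[j]); l > 0 && l <= level {
				end = j
				break
			}
		}
		return strings.TrimSpace(strings.Join(lines[i+1:end], "\n")), nil
	}
	return "", errNotFound
}

func main() {
	doc := "# Guide\n\n## Install Steps\n\nrun make\n\n### Linux\n\napt\n\n## Usage\n\ncall it\n"
	body, err := readSection(doc, "install")
	fmt.Printf("%q %v\n", body, err)
	_, err = readSection(doc, "missing")
	fmt.Println(errors.Is(err, errNotFound))
}
```

Substring matching is forgiving for callers ("install" finds "Install Steps") but can be ambiguous when several headings share a word; in this sketch the earliest heading in the document is returned.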

type Section

type Section struct {
	Heading string `json:"heading"`
	Level   int    `json:"level"`
	Line    int    `json:"line"`       // 1-based line number
	Size    int    `json:"size_chars"` // characters in this section
}

Section represents a structural division of a document (e.g. a heading).

type Summarizer

type Summarizer interface {
	// Summarize returns a summary for the given content.
	// Returns empty string if summarization is not available.
	Summarize(ctx context.Context, domain, path, content string) (string, error)
}

Summarizer generates summaries for document content. Implementations can use LLM APIs, extractive methods, TF-IDF, etc.
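
As one example of the "extractive methods" mentioned above, the sketch below implements the interface by returning the first non-heading line of the content. The interface is copied locally so the example compiles by itself; leadSummarizer, its MaxRunes field, and summarizeLead are hypothetical and not part of the package.

```go
package main

import (
	"context"
	"fmt"
	"strings"
)

// Summarizer is a local copy of the documented interface.
type Summarizer interface {
	Summarize(ctx context.Context, domain, path, content string) (string, error)
}

// leadSummarizer is a hypothetical extractive implementation: it returns the
// first non-empty line that is not a markdown heading, cut to MaxRunes
// (0 = no cut). domain and path are accepted but unused.
type leadSummarizer struct {
	MaxRunes int
}

func (s leadSummarizer) Summarize(ctx context.Context, domain, path, content string) (string, error) {
	if err := ctx.Err(); err != nil {
		return "", err // respect cancellation before doing any work
	}
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		if r := []rune(line); s.MaxRunes > 0 && len(r) > s.MaxRunes {
			return string(r[:s.MaxRunes]), nil
		}
		return line, nil
	}
	return "", nil // empty string: no summary available
}

// Compile-time check that leadSummarizer satisfies the interface.
var _ Summarizer = leadSummarizer{}

// summarizeLead is a convenience wrapper that uses a background context.
func summarizeLead(content string, maxRunes int) (string, error) {
	return leadSummarizer{MaxRunes: maxRunes}.Summarize(context.Background(), "docs", "guide.md", content)
}

func main() {
	sum, err := summarizeLead("# Guide\n\nThis guide explains installation.\n", 20)
	fmt.Printf("%q %v\n", sum, err)
}
```

Returning an empty string with a nil error when nothing can be extracted follows the interface comment; cutting by runes rather than bytes avoids splitting a multi-byte character.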
