package content

Version: v1.0.1 (latest) | Published: Mar 23, 2026 | License: MIT | Imports: 3 | Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version

Repository

github.com/dmoose/doctrove

Documentation ¶

Overview ¶

Package content defines interfaces for parsing and processing document content.

Index ¶

Variables
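type MarkdownProcessor
- func (m *MarkdownProcessor) CanProcess(path, contentType string) bool
- func (m *MarkdownProcessor) Name() string
- func (m *MarkdownProcessor) Outline(text string, maxDepth, maxSections int) OutlineResult
- func (m *MarkdownProcessor) ReadSection(text, sectionName string) (string, error)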
type NoOpSummarizer
- func (n *NoOpSummarizer) Summarize(ctx context.Context, domain, path, content string) (string, error)
type OutlineResult
type Processor
type Section
type Summarizer

Constants ¶

This section is empty.

Variables ¶

var ErrSectionNotFound = fmt.Errorf("section not found")

ErrSectionNotFound is returned when a section heading cannot be matched.

Functions ¶

This section is empty.

Types ¶

type MarkdownProcessor ¶

type MarkdownProcessor struct{}

MarkdownProcessor parses markdown documents by ATX heading structure.

func (*MarkdownProcessor) CanProcess ¶

func (m *MarkdownProcessor) CanProcess(path, contentType string) bool

func (*MarkdownProcessor) Name ¶

func (m *MarkdownProcessor) Name() string

func (*MarkdownProcessor) Outline ¶

func (m *MarkdownProcessor) Outline(text string, maxDepth, maxSections int) OutlineResult
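
A minimal sketch of how an ATX-heading outline with the maxDepth and maxSections limits described on the Processor interface might work. This is not the package's implementation: section, atxHeading, and outline are hypothetical names, fenced code blocks and closing # sequences are not handled, and setting the truncated flag only when a further heading exists past the cap is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// section is a trimmed-down stand-in for the documented Section type.
type section struct {
	Heading string
	Level   int
	Line    int // 1-based line number
}

// atxHeading reports the level and text of line if it is an ATX heading:
// up to three leading spaces, one to six '#', then a space, tab, or end of line.
func atxHeading(line string) (int, string, bool) {
	trimmed := strings.TrimLeft(line, " ")
	if len(line)-len(trimmed) > 3 {
		return 0, "", false
	}
	level := 0
	for level < len(trimmed) && trimmed[level] == '#' {
		level++
	}
	if level == 0 || level > 6 {
		return 0, "", false
	}
	rest := trimmed[level:]
	if rest != "" && rest[0] != ' ' && rest[0] != '\t' {
		return 0, "", false // e.g. "#hashtag" is not a heading
	}
	return level, strings.TrimSpace(rest), true
}

// outline collects headings, treating 0 as "no limit" for both caps.
func outline(text string, maxDepth, maxSections int) (sections []section, truncated bool) {
	for i, line := range strings.Split(text, "\n") {
		level, heading, ok := atxHeading(line)
		if !ok || (maxDepth > 0 && level > maxDepth) {
			continue
		}
		if maxSections > 0 && len(sections) == maxSections {
			return sections, true // at least one more heading was dropped
		}
		sections = append(sections, section{heading, level, i + 1})
	}
	return sections, false
}

func main() {
	doc := "# Guide\n\nIntro\n\n## Install\n\n### Linux\n\n## Usage\n"
	secs, truncated := outline(doc, 2, 0)
	for _, s := range secs {
		fmt.Printf("%d %q line %d\n", s.Level, s.Heading, s.Line)
	}
	fmt.Println("truncated:", truncated)
	// 1 "Guide" line 1
	// 2 "Install" line 5
	// 2 "Usage" line 9
	// truncated: false
}
```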

func (*MarkdownProcessor) ReadSection ¶

func (m *MarkdownProcessor) ReadSection(text, sectionName string) (string, error)

type NoOpSummarizer ¶

type NoOpSummarizer struct{}

NoOpSummarizer is the default Summarizer. It performs no summarization of its own and relies on agent-submitted summaries only.

func (*NoOpSummarizer) Summarize ¶

func (n *NoOpSummarizer) Summarize(ctx context.Context, domain, path, content string) (string, error)

type OutlineResult ¶

type OutlineResult struct {
	Sections  []Section `json:"sections"`
	Truncated bool      `json:"truncated,omitempty"`
}

OutlineResult holds the parsed structure of a document.
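
The struct tags determine the JSON shape: Truncated carries omitempty, so the key appears only when the outline was cut short. The sketch below uses local copies of OutlineResult and Section (same fields and tags as documented on this page) so that it compiles on its own; encode is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented types, with identical JSON tags.
type Section struct {
	Heading string `json:"heading"`
	Level   int    `json:"level"`
	Line    int    `json:"line"`
	Size    int    `json:"size_chars"`
}

type OutlineResult struct {
	Sections  []Section `json:"sections"`
	Truncated bool      `json:"truncated,omitempty"`
}

// encode marshals r and panics on error; marshalling these types cannot fail.
func encode(r OutlineResult) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	full := OutlineResult{Sections: []Section{{Heading: "Install", Level: 2, Line: 5, Size: 120}}}
	fmt.Println(encode(full))
	// {"sections":[{"heading":"Install","level":2,"line":5,"size_chars":120}]}

	capped := full
	capped.Truncated = true
	fmt.Println(encode(capped))
	// {"sections":[{"heading":"Install","level":2,"line":5,"size_chars":120}],"truncated":true}

	// A nil Sections slice encodes as null, not [].
	fmt.Println(encode(OutlineResult{}))
	// {"sections":null}
}
```

Consumers that expect an array may want to normalize a nil Sections slice to an empty one before encoding.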

type Processor ¶

type Processor interface {
	// Name returns the processor identifier (e.g. "markdown").
	Name() string

	// CanProcess returns true if this processor handles the given path/content type.
	CanProcess(path, contentType string) bool

	// Outline extracts the heading structure from content.
	// maxDepth limits heading levels (0 = no limit).
	// maxSections caps the number of returned sections (0 = no limit).
	Outline(content string, maxDepth, maxSections int) OutlineResult

	// ReadSection extracts the content under a named heading.
	// Uses case-insensitive substring matching on heading text.
	ReadSection(content, sectionName string) (string, error)
}

Processor parses document content into structural components. Implementations exist for different content formats (markdown, etc.).
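
Since each Processor reports what it handles through CanProcess, a caller can keep a list of processors and dispatch to the first one that accepts a document. The following is a sketch under assumptions: processor mirrors only the two selection-related methods, mdStub and pick are hypothetical, and the matching rule (a .md suffix or a text/markdown content type) is illustrative rather than MarkdownProcessor's actual rule.

```go
package main

import (
	"fmt"
	"strings"
)

// processor mirrors the selection-related methods of the Processor interface.
type processor interface {
	Name() string
	CanProcess(path, contentType string) bool
}

// mdStub is a hypothetical processor; the real MarkdownProcessor may
// accept a different set of paths and content types.
type mdStub struct{}

func (mdStub) Name() string { return "markdown" }

func (mdStub) CanProcess(path, contentType string) bool {
	return strings.HasSuffix(strings.ToLower(path), ".md") ||
		strings.HasPrefix(strings.ToLower(contentType), "text/markdown")
}

// pick returns the first processor that accepts the document, or nil.
func pick(ps []processor, path, contentType string) processor {
	for _, p := range ps {
		if p.CanProcess(path, contentType) {
			return p
		}
	}
	return nil
}

func main() {
	registry := []processor{mdStub{}}

	if p := pick(registry, "docs/README.md", ""); p != nil {
		fmt.Println("using", p.Name())
	}
	if p := pick(registry, "logo.png", "image/png"); p == nil {
		fmt.Println("no processor for logo.png")
	}
	// using markdown
	// no processor for logo.png
}
```

Ordering the list from most to least specific lets a generic fallback processor sit last without shadowing the others.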

type Section ¶

type Section struct {
	Heading string `json:"heading"`
	Level   int    `json:"level"`
	Line    int    `json:"line"`       // 1-based line number
	Size    int    `json:"size_chars"` // characters in this section
}

Section represents a structural division of a document (e.g. a heading).

type Summarizer ¶

type Summarizer interface {
	// Summarize returns a summary for the given content.
	// Returns empty string if summarization is not available.
	Summarize(ctx context.Context, domain, path, content string) (string, error)
}

Summarizer generates summaries for document content. Implementations can use LLM APIs, extractive methods, TF-IDF, etc.
