Documentation ¶
Overview ¶
Package parsers provides a registry for language-specific parser plugins. Parsers extract metadata and chunk code for various programming languages.
Index ¶
- Variables
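- type ParserRegistry
- func NewRegistry(logger *slog.Logger) ParserRegistry
- func RegisterLanguagePlugins(logger *slog.Logger) (ParserRegistry, error)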
- type RSSFeedMetadata
- type RSSItemMetadata
- type RSSParser
- func NewRSSParser() *RSSParser
- func (p *RSSParser) CanHandle(path string, info fs.FileInfo) bool
- func (p *RSSParser) Chunk(content string, path string, opts *schema.CodeChunkingOptions) ([]schema.CodeChunk, error)
- func (p *RSSParser) Extensions() []string
- func (p *RSSParser) ExtractMetadata(content string, path string) (schema.FileMetadata, error)
- func (p *RSSParser) ExtractUsedSymbols(content string) []string
- func (p *RSSParser) IsGenerated(content string, path string) bool
- func (p *RSSParser) Name() string
Constants ¶
This section is empty.
Variables ¶
var ErrPluginNotFound = errors.New("language plugin not found")
ErrPluginNotFound is returned when a plugin is not found.
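A caller will usually want to tell "no plugin registered" apart from other failures. The sketch below shows one way to do that with `errors.Is`. The sentinel is redeclared locally so the example compiles on its own, and `lookup` is a hypothetical stand-in for a registry call such as `GetParser`. Whether the real registry wraps the sentinel or returns it directly is not stated in this documentation; `errors.Is` covers both cases.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPluginNotFound mirrors the package sentinel; it is redeclared here
// only so that this sketch is self-contained.
var ErrPluginNotFound = errors.New("language plugin not found")

// lookup is a hypothetical stand-in for a registry lookup (for example
// GetParser). It is not the package's implementation.
func lookup(language string) (string, error) {
	known := map[string]string{"go": "go-parser"}
	if name, ok := known[language]; ok {
		return name, nil
	}
	// Wrapping with %w keeps the sentinel detectable through errors.Is.
	return "", fmt.Errorf("lookup %q: %w", language, ErrPluginNotFound)
}

// isNotFound reports whether err is, or wraps, ErrPluginNotFound.
func isNotFound(err error) bool {
	return errors.Is(err, ErrPluginNotFound)
}

func main() {
	if _, err := lookup("cobol"); isNotFound(err) {
		fmt.Println("no parser registered; falling back to plain text")
	}
}
```

Comparing with `errors.Is` rather than `==` keeps the check valid if context is added to the error somewhere along the call chain.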
Functions ¶
This section is empty.
Types ¶
type ParserRegistry ¶
type ParserRegistry interface {
	// RegisterParser adds a new parser plugin to the registry.
	RegisterParser(plugin schema.ParserPlugin) error

	// GetParser returns the parser for the given language name.
	GetParser(language string) (schema.ParserPlugin, error)

	// GetParserForFile returns the appropriate parser for the given file.
	GetParserForFile(path string, info fs.FileInfo) (schema.ParserPlugin, error)

	// GetParserForExtension returns the parser for the given file extension.
	GetParserForExtension(ext string) (schema.ParserPlugin, error)

	// GetAllParsers returns all registered parsers.
	GetAllParsers() []schema.ParserPlugin
}
ParserRegistry tracks registered language plugins and provides methods to look up parsers by language, file path, or extension.
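To illustrate the lookup behaviour described above, here is a minimal map-backed sketch. It is not the package's implementation: `plugin` is a hypothetical, reduced stand-in for `schema.ParserPlugin`, `mapRegistry`, `Register`, and `ForPath` are illustrative names, and the lookup matches on file extension only, whereas the real `GetParserForFile` also receives an `fs.FileInfo`.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// errNotFound plays the role of ErrPluginNotFound in this sketch.
var errNotFound = errors.New("language plugin not found")

// plugin is a hypothetical, reduced stand-in for schema.ParserPlugin.
type plugin interface {
	Name() string
	Extensions() []string
}

// staticPlugin is a trivial plugin used only for demonstration.
type staticPlugin struct {
	name string
	exts []string
}

func (s staticPlugin) Name() string         { return s.name }
func (s staticPlugin) Extensions() []string { return s.exts }

// mapRegistry indexes plugins by name and by lower-cased extension.
type mapRegistry struct {
	byName map[string]plugin
	byExt  map[string]plugin
}

func newMapRegistry() *mapRegistry {
	return &mapRegistry{
		byName: make(map[string]plugin),
		byExt:  make(map[string]plugin),
	}
}

// Register adds p and rejects a second plugin with the same name.
func (r *mapRegistry) Register(p plugin) error {
	if _, dup := r.byName[p.Name()]; dup {
		return fmt.Errorf("plugin %q already registered", p.Name())
	}
	r.byName[p.Name()] = p
	for _, ext := range p.Extensions() {
		r.byExt[strings.ToLower(ext)] = p
	}
	return nil
}

// ForPath resolves a plugin from the file extension of path.
func (r *mapRegistry) ForPath(path string) (plugin, error) {
	ext := strings.ToLower(filepath.Ext(path))
	if p, ok := r.byExt[ext]; ok {
		return p, nil
	}
	return nil, fmt.Errorf("%s: %w", path, errNotFound)
}

func main() {
	reg := newMapRegistry()
	if err := reg.Register(staticPlugin{name: "go", exts: []string{".go"}}); err != nil {
		fmt.Println("register:", err)
		return
	}
	if p, err := reg.ForPath("cmd/main.go"); err == nil {
		fmt.Println("parser:", p.Name())
	}
	if _, err := reg.ForPath("README"); err != nil {
		fmt.Println("error:", err)
	}
}
```

Keeping separate name and extension indexes makes both lookups a single map access, at the cost of letting a later registration overwrite an extension claimed by an earlier plugin.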
func NewRegistry ¶
func NewRegistry(logger *slog.Logger) ParserRegistry
NewRegistry creates a new language plugin registry.
func RegisterLanguagePlugins ¶
func RegisterLanguagePlugins(logger *slog.Logger) (ParserRegistry, error)
RegisterLanguagePlugins initializes and populates a language registry with all built-in parser plugins (Go, TypeScript, Markdown, JSON, YAML, etc.).
type RSSFeedMetadata ¶ added in v0.15.0
type RSSFeedMetadata struct {
Title string `json:"title"` // Feed title
Link string `json:"link"` // Feed website URL
Language string `json:"language"` // Feed language
Description string `json:"description"` // Feed description
}
RSSFeedMetadata represents metadata extracted from an RSS feed channel.
type RSSItemMetadata ¶ added in v0.15.0
type RSSItemMetadata struct {
Title string `json:"title"` // Item title
Link string `json:"link"` // Item URL
PubDate time.Time `json:"pub_date"` // Publication date
Author string `json:"author"` // Author name
Categories []string `json:"categories"` // Categories/tags
GUID string `json:"guid"` // Unique identifier
Description string `json:"description"` // Short description
Content string `json:"content"` // Full content
}
RSSItemMetadata represents metadata extracted from an RSS feed item.
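The struct tags determine how an item looks once serialized. The sketch below redeclares the struct locally (copied from the definition above so the example compiles on its own) and encodes one item with `encoding/json`. The helpers `sampleItem` and `encodeItem` and all field values are made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// RSSItemMetadata is copied from the package documentation so that this
// sketch is self-contained.
type RSSItemMetadata struct {
	Title       string    `json:"title"`
	Link        string    `json:"link"`
	PubDate     time.Time `json:"pub_date"`
	Author      string    `json:"author"`
	Categories  []string  `json:"categories"`
	GUID        string    `json:"guid"`
	Description string    `json:"description"`
	Content     string    `json:"content"`
}

// sampleItem returns a made-up item used only for demonstration.
func sampleItem() RSSItemMetadata {
	return RSSItemMetadata{
		Title:      "Release notes",
		Link:       "https://example.com/posts/1",
		PubDate:    time.Date(2024, 1, 2, 3, 4, 5, 0, time.UTC),
		Categories: []string{"go", "release"},
		GUID:       "post-1",
	}
}

// encodeItem serializes an item using the struct's JSON tags.
func encodeItem(item RSSItemMetadata) (string, error) {
	b, err := json.Marshal(item)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := encodeItem(sampleItem())
	if err != nil {
		fmt.Println("encode:", err)
		return
	}
	fmt.Println(out)
}
```

Because none of the tags use `omitempty`, empty fields such as `author` are still emitted as empty strings, and `PubDate` is written in RFC 3339 form under the key `pub_date`.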
type RSSParser ¶ added in v0.15.0
type RSSParser struct{}
RSSParser implements the ParserPlugin interface for RSS/Atom feed content. It handles RSS 2.0, Atom 1.0, and JSON feeds, treating each feed item as a document.
func NewRSSParser ¶ added in v0.15.0
func NewRSSParser() *RSSParser
NewRSSParser creates a new RSS parser instance.
func (*RSSParser) CanHandle ¶ added in v0.15.0
func (p *RSSParser) CanHandle(path string, info fs.FileInfo) bool
CanHandle determines if this parser can handle the given file.
func (*RSSParser) Chunk ¶ added in v0.15.0
func (p *RSSParser) Chunk(content string, path string, opts *schema.CodeChunkingOptions) ([]schema.CodeChunk, error)
Chunk divides RSS content into processable chunks. For RSS feeds, the entire item content is treated as a single chunk.
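The single-chunk behaviour described above can be sketched as follows. This is an illustration, not the package's code: `chunk` is a hypothetical stand-in for `schema.CodeChunk` (whose fields are not shown in this documentation), and `singleChunk` is an illustrative name.

```go
package main

import (
	"fmt"
	"strings"
)

// chunk is a hypothetical stand-in for schema.CodeChunk; the field names
// are assumptions made for this sketch.
type chunk struct {
	Path      string
	Content   string
	StartLine int
	EndLine   int
}

// singleChunk returns the whole item content as one chunk, or no chunks
// when the content is blank.
func singleChunk(content, path string) []chunk {
	if strings.TrimSpace(content) == "" {
		return nil
	}
	// Ignore trailing newlines so they do not inflate the line count.
	trimmed := strings.TrimRight(content, "\n")
	lines := strings.Count(trimmed, "\n") + 1
	return []chunk{{Path: path, Content: content, StartLine: 1, EndLine: lines}}
}

func main() {
	chunks := singleChunk("Title\nBody of the item\n", "feed.xml")
	fmt.Println(len(chunks), chunks[0].EndLine)
}
```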
func (*RSSParser) Extensions ¶ added in v0.15.0
func (p *RSSParser) Extensions() []string
Extensions returns the file extensions this parser handles.
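func (*RSSParser) ExtractMetadata ¶ added in v0.15.0
func (p *RSSParser) ExtractMetadata(content string, path string) (schema.FileMetadata, error)
ExtractMetadata extracts metadata from RSS feed content. It returns only basic file metadata; the actual RSS metadata extraction happens in the loader.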
func (*RSSParser) ExtractUsedSymbols ¶ added in v0.15.0
func (p *RSSParser) ExtractUsedSymbols(content string) []string
ExtractUsedSymbols returns nil as RSS feeds don't have symbol references.
func (*RSSParser) IsGenerated ¶ added in v0.15.0
func (p *RSSParser) IsGenerated(content string, path string) bool
IsGenerated returns false as RSS feeds are typically not auto-generated code.
Directories ¶
| Path | Synopsis |
|---|---|
| | Package html provides an HTML parser plugin for transforming HTML content into clean Markdown suitable for LLM consumption and RAG applications. |
| | core.go - Main plugin file with goldmark integration |
| | extractor - Fixed version |