Documentation ¶
Overview ¶
Document Query is a simple query library for HTML documents.
Index ¶
- func DefaultNodeFilter(n *html.Node) bool
- func FindAll(root *html.Node, selector string) []*html.Node
- func FindDirectChild(n *html.Node, selector string) *html.Node
- func FindOne(root *html.Node, selector string) *html.Node
- func GetAttr(n *html.Node, key string) string
- func GetHref(n *html.Node) string
- func HasChild(n *html.Node, pattern string) bool
- func InnerText(n *html.Node, recurse bool) string
- func InnerTextWithFilter(n *html.Node, recurse bool, filter NodeFilter) string
- func RawInnerText(n *html.Node, recurse bool) string
- func Traverse(n *html.Node, ms []Matcher)
- type MatchFunc
- type MatchHandlerFunc
- type Matcher
- type NodeFilter
- type NodeMatcher
- type RecursiveNodeMatcher
Functions ¶
func DefaultNodeFilter ¶ added in v0.1.5
DefaultNodeFilter returns false for anchor elements that carry an aria-label attribute (the headerlink anchors commonly found on documentation sites), so text extraction skips them.
func FindAll ¶ added in v0.1.5
FindAll returns all elements in the subtree matching the selector. Searches the entire subtree depth-first.
func FindDirectChild ¶ added in v0.1.5
FindDirectChild returns the first direct child element matching the selector, or nil if none is found. Only checks immediate children, not deeper descendants.
func FindOne ¶ added in v0.1.5
FindOne returns the first element in the subtree matching the selector, or nil if none is found. Searches the entire subtree depth-first.
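The difference between a direct-child lookup and a depth-first subtree search can be sketched as follows. This is an illustration of the documented behavior, not the package's implementation: Node and el are hypothetical stand-ins for *html.Node so the sketch runs with the standard library only, the selector is assumed to be a bare tag name, and the search is assumed to cover descendants but not the root itself.

```go
package main

import "fmt"

// Node is a hypothetical stand-in for html.Node, reduced to the fields used here.
type Node struct {
	Data                    string
	FirstChild, NextSibling *Node
}

// el builds an element and links the given children as a sibling chain.
func el(tag string, children ...*Node) *Node {
	n := &Node{Data: tag}
	var prev *Node
	for _, c := range children {
		if prev == nil {
			n.FirstChild = c
		} else {
			prev.NextSibling = c
		}
		prev = c
	}
	return n
}

// findDirectChild checks immediate children only.
func findDirectChild(n *Node, tag string) *Node {
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if c.Data == tag {
			return c
		}
	}
	return nil
}

// findAll collects every matching descendant in depth-first (pre-order) order.
func findAll(root *Node, tag string) []*Node {
	var out []*Node
	for c := root.FirstChild; c != nil; c = c.NextSibling {
		if c.Data == tag {
			out = append(out, c)
		}
		out = append(out, findAll(c, tag)...)
	}
	return out
}

// findOne returns the first match in the same depth-first order, or nil.
func findOne(root *Node, tag string) *Node {
	for c := root.FirstChild; c != nil; c = c.NextSibling {
		if c.Data == tag {
			return c
		}
		if hit := findOne(c, tag); hit != nil {
			return hit
		}
	}
	return nil
}

func main() {
	// <div><section><p/></section><p/></div>
	root := el("div", el("section", el("p")), el("p"))

	fmt.Println(len(findAll(root, "p")))
	// The depth-first search reaches the nested <p> first,
	// while the direct-child lookup only sees the top-level <p>.
	fmt.Println(findOne(root, "p") == findDirectChild(root, "p"))
}
```

In this sketch the two lookups return different nodes for the same tag, which is the main reason to pick FindDirectChild when only the immediate level matters.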
func GetAttr ¶ added in v0.1.5
GetAttr returns the value of the named attribute, or "" if not found.
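Because a missing attribute yields "", a caller cannot tell an absent attribute from one that is present with an empty value. The sketch below illustrates that contract. Node and Attribute are hypothetical stand-ins mirroring the relevant fields of html.Node, getAttr is a re-implementation of the documented behavior, and hasAttr is a hypothetical helper that is not part of this package.

```go
package main

import "fmt"

// Attribute and Node are hypothetical stand-ins for the html package types.
type Attribute struct{ Key, Val string }

type Node struct {
	Data string
	Attr []Attribute
}

// getAttr mimics the documented contract: the attribute value, or "" if absent.
func getAttr(n *Node, key string) string {
	for _, a := range n.Attr {
		if a.Key == key {
			return a.Val
		}
	}
	return ""
}

// hasAttr (hypothetical) distinguishes "absent" from "present but empty".
func hasAttr(n *Node, key string) bool {
	for _, a := range n.Attr {
		if a.Key == key {
			return true
		}
	}
	return false
}

func main() {
	link := &Node{Data: "a", Attr: []Attribute{
		{Key: "href", Val: "/docs"},
		{Key: "title", Val: ""},
	}}

	fmt.Printf("%q\n", getAttr(link, "href"))
	// Both lookups below yield "", but only "title" is actually present.
	fmt.Printf("%q %v\n", getAttr(link, "title"), hasAttr(link, "title"))
	fmt.Printf("%q %v\n", getAttr(link, "id"), hasAttr(link, "id"))
}
```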
func InnerTextWithFilter ¶ added in v0.1.5
func InnerTextWithFilter(n *html.Node, recurse bool, filter NodeFilter) string
InnerTextWithFilter extracts text content from a node, skipping child nodes for which filter returns false. When recurse is true, element children are recursively processed and surrounded by spaces.
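A minimal sketch of filtered text extraction, under stated assumptions: Node, Attribute, and the node type constants are hypothetical stand-ins for the html package types, skipLabelledAnchors approximates the behavior described for DefaultNodeFilter, and innerText is one plausible reading of "surrounded by spaces". The package's exact whitespace handling may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// NodeType, Attribute and Node are hypothetical stand-ins for the html package types.
type NodeType int

const (
	TextNode NodeType = iota
	ElementNode
)

type Attribute struct{ Key, Val string }

type Node struct {
	Type                    NodeType
	Data                    string
	Attr                    []Attribute
	FirstChild, NextSibling *Node
}

// skipLabelledAnchors returns false for <a> elements that have an aria-label attribute.
func skipLabelledAnchors(n *Node) bool {
	if n.Type != ElementNode || n.Data != "a" {
		return true
	}
	for _, a := range n.Attr {
		if a.Key == "aria-label" {
			return false
		}
	}
	return true
}

// innerText concatenates text children; with recurse set, element children
// are processed recursively and their text is surrounded by spaces.
func innerText(n *Node, recurse bool, keep func(*Node) bool) string {
	var b strings.Builder
	for c := n.FirstChild; c != nil; c = c.NextSibling {
		if !keep(c) {
			continue // the filter rejected this child
		}
		switch {
		case c.Type == TextNode:
			b.WriteString(c.Data)
		case recurse:
			b.WriteString(" " + innerText(c, true, keep) + " ")
		}
	}
	return b.String()
}

func main() {
	// <h2>Install<a aria-label="Permalink">¶</a></h2>
	pilcrow := &Node{Type: TextNode, Data: "¶"}
	anchor := &Node{
		Type:       ElementNode,
		Data:       "a",
		Attr:       []Attribute{{Key: "aria-label", Val: "Permalink"}},
		FirstChild: pilcrow,
	}
	title := &Node{Type: TextNode, Data: "Install", NextSibling: anchor}
	h2 := &Node{Type: ElementNode, Data: "h2", FirstChild: title}

	keepAll := func(*Node) bool { return true }
	fmt.Printf("%q\n", innerText(h2, true, skipLabelledAnchors)) // anchor text dropped
	fmt.Printf("%q\n", innerText(h2, true, keepAll))             // anchor text kept
}
```

With the filter applied, the heading text comes back without the permalink glyph, which is the typical reason to filter headerlink anchors when scraping documentation pages.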
Types ¶
type MatchFunc ¶
func NewMatchFunc ¶
type MatchHandlerFunc ¶
type NodeFilter ¶ added in v0.1.5
NodeFilter returns true to include a child node, false to skip it.
type NodeMatcher ¶
type NodeMatcher struct {
// contains filtered or unexported fields
}
func NewNodeMatcher ¶
func NewNodeMatcher(matchFunc MatchFunc, handler MatchHandlerFunc, children ...Matcher) *NodeMatcher
func (*NodeMatcher) Handler ¶
func (m *NodeMatcher) Handler(n *html.Node)
func (*NodeMatcher) SubMatchers ¶
func (m *NodeMatcher) SubMatchers() []Matcher
type RecursiveNodeMatcher ¶ added in v0.1.5
type RecursiveNodeMatcher struct {
// contains filtered or unexported fields
}
RecursiveNodeMatcher handles recursive pattern matching for nested structures such as "ul > li > ol > li", where the pattern can repeat at different nesting levels.
func NewRecursiveNodeMatcher ¶ added in v0.1.5
func NewRecursiveNodeMatcher(pattern string, handler MatchHandlerFunc, recursive bool, children ...Matcher) *RecursiveNodeMatcher
NewRecursiveNodeMatcher creates a new recursive matcher.
- pattern: space-separated pattern like "ul > li > ol > li" or "ul li ol li"
- handler: function to call when the full pattern matches
- recursive: if true, pattern restarts after completion for nested structures
- children: additional matchers to apply after pattern completion
func (*RecursiveNodeMatcher) Handler ¶ added in v0.1.5
func (m *RecursiveNodeMatcher) Handler(n *html.Node)
func (*RecursiveNodeMatcher) Match ¶ added in v0.1.5
func (m *RecursiveNodeMatcher) Match(n *html.Node) bool
func (*RecursiveNodeMatcher) SubMatchers ¶ added in v0.1.5
func (m *RecursiveNodeMatcher) SubMatchers() []Matcher