parser

package
v0.0.0-...-85e6ade
Warning

This package is not in the latest version of its module.
Published: Jan 17, 2026 | License: MIT | Imports: 6 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type JSONOptions

type JSONOptions struct {
	IncludeAttributes  bool
	IncludeTextContent bool
	PrettyPrint        bool
	TrimWhitespace     bool
}

JSONOptions controls the behavior of HTML to JSON conversion.

func DefaultJSONOptions

func DefaultJSONOptions() JSONOptions

DefaultJSONOptions returns the default JSON conversion options.
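
To illustrate how the four JSONOptions flags could shape the output, here is a minimal, self-contained sketch. It does not import this package: `options`, `element`, `toMap`, `toJSON`, and the map keys `tag`, `attributes`, and `text` are all hypothetical stand-ins, and the actual output structure and default values are not stated in this documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// options is a local stand-in that mirrors the documented JSONOptions fields.
type options struct {
	IncludeAttributes  bool
	IncludeTextContent bool
	PrettyPrint        bool
	TrimWhitespace     bool
}

// element is a hypothetical, simplified view of one HTML element.
type element struct {
	Tag   string
	Attrs map[string]string
	Text  string
}

// toMap builds a map for one element, honoring the option flags.
// The key names are assumptions made for this sketch.
func toMap(e element, o options) map[string]any {
	m := map[string]any{"tag": e.Tag}
	if o.IncludeAttributes && len(e.Attrs) > 0 {
		m["attributes"] = e.Attrs
	}
	if o.IncludeTextContent {
		text := e.Text
		if o.TrimWhitespace {
			text = strings.TrimSpace(text)
		}
		m["text"] = text
	}
	return m
}

// toJSON encodes the map, indenting the output when PrettyPrint is set.
func toJSON(e element, o options) ([]byte, error) {
	m := toMap(e, o)
	if o.PrettyPrint {
		return json.MarshalIndent(m, "", "  ")
	}
	return json.Marshal(m)
}

func main() {
	e := element{Tag: "a", Attrs: map[string]string{"href": "/docs"}, Text: "  Docs \n"}
	opts := options{IncludeAttributes: true, IncludeTextContent: true, TrimWhitespace: true}
	out, err := toJSON(e, opts)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

With the real package, the analogous flow would presumably be to start from DefaultJSONOptions(), adjust individual fields, and pass the result to ToJSONWithOptions.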

type Parser

type Parser struct {
	// contains filtered or unexported fields
}

Parser represents an HTML document that can be queried with XPath.

func LoadFromURL

func LoadFromURL(url string) (*Parser, error)

LoadFromURL creates a Parser by fetching HTML from a URL.

Deprecated: Use gtmlp.ParseURL() instead, which includes HTTP client functionality.

func New

func New(htmlContent string) (*Parser, error)

New creates a new Parser from an HTML string.

func (*Parser) HTML

func (p *Parser) HTML() string

HTML returns the original HTML string.

func (*Parser) Root

func (p *Parser) Root() *html.Node

Root returns the root HTML node.

func (*Parser) String

func (p *Parser) String() string

String returns a string representation of the parser.

func (*Parser) ToJSON

func (p *Parser) ToJSON() ([]byte, error)

ToJSON converts the HTML document to JSON format using default options.

func (*Parser) ToJSONWithOptions

func (p *Parser) ToJSONWithOptions(opts JSONOptions) ([]byte, error)

ToJSONWithOptions converts the HTML document to JSON with custom options.

func (*Parser) ToMap

func (p *Parser) ToMap(opts JSONOptions) map[string]any

ToMap converts the parser document to a map structure.

func (*Parser) URL

func (p *Parser) URL() string

URL returns the source URL if the HTML was loaded from one.

func (*Parser) WithSuppressErrors

func (p *Parser) WithSuppressErrors() *Parser

WithSuppressErrors enables error suppression for XPath queries. When enabled, XPath queries that fail return nil instead of an error value.
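
The chaining style and the effect of suppression can be sketched as follows. This is one possible reading of the behavior, not the package's implementation: `querier`, `withSuppressErrors`, and `query` are hypothetical stand-ins, and the failure condition (an empty expression) is invented for the demonstration.

```go
package main

import (
	"errors"
	"fmt"
)

// querier is a hypothetical stand-in for a type with an error-suppression switch.
type querier struct {
	suppress bool
}

// withSuppressErrors turns suppression on and returns the receiver so calls can be chained.
func (q *querier) withSuppressErrors() *querier {
	q.suppress = true
	return q
}

// query simulates a lookup that fails on an empty expression.
func (q *querier) query(expr string) (*string, error) {
	if expr == "" {
		if q.suppress {
			return nil, nil // error swallowed: the caller only sees "no result"
		}
		return nil, errors.New("empty expression")
	}
	result := "matched " + expr
	return &result, nil
}

func main() {
	strict := &querier{}
	_, err := strict.query("")
	fmt.Println("strict error:", err)

	lenient := (&querier{}).withSuppressErrors()
	res, err := lenient.query("")
	fmt.Println("lenient result is nil:", res == nil, "error is nil:", err == nil)
}
```

Under this reading, callers that enable suppression can no longer tell a malformed expression from an empty match, so a nil check on the result becomes the only signal.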

func (*Parser) XPath

func (p *Parser) XPath(expr string) (*Selection, error)

XPath executes an XPath expression and returns the first matching node. Returns nil if no match is found.

func (*Parser) XPathAll

func (p *Parser) XPathAll(expr string) ([]*Selection, error)

XPathAll executes an XPath expression and returns all matching nodes.
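
The difference between a first-match call that may return nil and an all-matches call that returns a slice can be sketched without the package. Everything here is a hypothetical stand-in: `selection`, `findFirst`, and `findAll` match on plain string equality rather than XPath.

```go
package main

import "fmt"

// selection is a hypothetical stand-in for a matched node.
type selection struct{ text string }

// findAll returns every item whose text equals want (nil slice when none match).
func findAll(items []*selection, want string) []*selection {
	var out []*selection
	for _, it := range items {
		if it.text == want {
			out = append(out, it)
		}
	}
	return out
}

// findFirst returns the first match, or nil when nothing matches.
func findFirst(items []*selection, want string) *selection {
	for _, it := range items {
		if it.text == want {
			return it
		}
	}
	return nil
}

func main() {
	items := []*selection{{text: "a"}, {text: "b"}, {text: "a"}}
	fmt.Println("matches for a:", len(findAll(items, "a")))

	// A first-match result must be nil-checked before use.
	if sel := findFirst(items, "z"); sel == nil {
		fmt.Println("no match for z")
	}
}
```

The practical point for XPath and Find is that a nil *Selection with a nil error means "no match", so dereferencing the result without a check is unsafe.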

type ReturnType

type ReturnType string

ReturnType defines how to extract content from nodes.

const (
	// ReturnTypeText returns plain text content.
	ReturnTypeText ReturnType = "text"
	// ReturnTypeHTML returns HTML content.
	ReturnTypeHTML ReturnType = "html"
)

type Selection

type Selection struct {
	// contains filtered or unexported fields
}

Selection represents a selected HTML node.

func SelectionFromNode

func SelectionFromNode(node *html.Node) *Selection

SelectionFromNode creates a Selection from an html.Node.

func SelectionsFromNodes

func SelectionsFromNodes(nodes []*html.Node) []*Selection

SelectionsFromNodes creates a slice of Selections from a slice of html.Nodes.

func (*Selection) Attr

func (s *Selection) Attr(name string) string

Attr returns the value of an attribute on the selected node.

func (*Selection) AttrOr

func (s *Selection) AttrOr(name, defaultValue string) string

AttrOr returns the value of an attribute, or a default value if not found.
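
A default-value lookup of this kind can be sketched as below. The `attribute` type and the `lookup` and `attrOr` helpers are hypothetical stand-ins; in particular, treating a present-but-empty attribute as found (rather than falling back) is an assumption, since the documentation does not say how the package handles that case.

```go
package main

import "fmt"

// attribute is a hypothetical stand-in for one key/value pair on a node.
type attribute struct {
	key, val string
}

// lookup reports the value of the named attribute and whether it was present.
func lookup(attrs []attribute, name string) (string, bool) {
	for _, a := range attrs {
		if a.key == name {
			return a.val, true
		}
	}
	return "", false
}

// attrOr returns the attribute value, or fallback when the attribute is absent.
func attrOr(attrs []attribute, name, fallback string) string {
	if v, ok := lookup(attrs, name); ok {
		return v
	}
	return fallback
}

func main() {
	attrs := []attribute{{key: "href", val: "/home"}, {key: "title", val: ""}}
	fmt.Println(attrOr(attrs, "href", "#"))   // attribute present
	fmt.Println(attrOr(attrs, "rel", "none")) // attribute absent, fallback used
}
```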

func (*Selection) Children

func (s *Selection) Children() []*Selection

Children returns all direct children of the current selection.

func (*Selection) Content

func (s *Selection) Content(returnType ReturnType) string

Content returns the content based on the specified return type.
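
Dispatching on a string-typed enum such as ReturnType might look like the sketch below. `returnType`, `fragment`, and `content` are local stand-ins, and falling back to text for an unrecognized value is an assumption rather than documented behavior.

```go
package main

import "fmt"

// returnType mirrors the documented ReturnType string type.
type returnType string

const (
	returnTypeText returnType = "text"
	returnTypeHTML returnType = "html"
)

// fragment is a hypothetical stand-in holding both renderings of one node.
type fragment struct {
	text string
	html string
}

// content picks a rendering by return type.
// Unknown values fall back to text; that fallback is an assumption of this sketch.
func (f fragment) content(rt returnType) string {
	switch rt {
	case returnTypeHTML:
		return f.html
	default:
		return f.text
	}
}

func main() {
	f := fragment{text: "Hello", html: "<p>Hello</p>"}
	fmt.Println(f.content(returnTypeText))
	fmt.Println(f.content(returnTypeHTML))
}
```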

func (*Selection) Each

func (s *Selection) Each(fn func(int, *Selection))

Each iterates over all child element nodes and calls fn for each one.
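
An indexed callback over element children can be sketched like this. The `node` type and `each` method are hypothetical; whether the real index counts only element children (as it does here) or all child nodes is an assumption.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a parsed HTML node.
type node struct {
	isElement bool
	name      string
	children  []*node
}

// each calls fn for every element child, skipping text nodes.
// The index passed to fn counts element children only (an assumption).
func (n *node) each(fn func(int, *node)) {
	i := 0
	for _, c := range n.children {
		if !c.isElement {
			continue
		}
		fn(i, c)
		i++
	}
}

func main() {
	list := &node{isElement: true, name: "ul", children: []*node{
		{name: "#text"},
		{isElement: true, name: "li"},
		{name: "#text"},
		{isElement: true, name: "li"},
	}}
	list.each(func(i int, c *node) {
		fmt.Println(i, c.name)
	})
}
```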

func (*Selection) EvaluateXPath

func (s *Selection) EvaluateXPath(expr string) (any, error)

EvaluateXPath evaluates a raw XPath expression and returns the result.

func (*Selection) Find

func (s *Selection) Find(expr string) (*Selection, error)

Find executes an XPath expression relative to the current selection. Returns nil if no match is found.

func (*Selection) FindAll

func (s *Selection) FindAll(expr string) ([]*Selection, error)

FindAll executes an XPath expression relative to the current selection.

func (*Selection) FirstChild

func (s *Selection) FirstChild() *Selection

FirstChild returns the first child element (not text node).

func (*Selection) HTML

func (s *Selection) HTML() string

HTML returns the outer HTML of the selected node.

func (*Selection) InnerHTML

func (s *Selection) InnerHTML() string

InnerHTML returns the inner HTML of the selected node (excluding the outer tag).
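
The outer/inner distinction can be shown with a tiny serializer. This sketch uses a hypothetical `node` type and ignores attributes, escaping, and void elements, all of which a real HTML renderer has to handle.

```go
package main

import (
	"fmt"
	"strings"
)

// node is a hypothetical stand-in: an empty tag marks a text node.
type node struct {
	tag      string
	text     string
	children []*node
}

// outerHTML renders the node itself, including its opening and closing tags.
func outerHTML(n *node) string {
	if n.tag == "" {
		return n.text
	}
	return "<" + n.tag + ">" + innerHTML(n) + "</" + n.tag + ">"
}

// innerHTML renders only the node's children, without the node's own tags.
func innerHTML(n *node) string {
	var b strings.Builder
	for _, c := range n.children {
		b.WriteString(outerHTML(c))
	}
	return b.String()
}

func main() {
	p := &node{tag: "p", children: []*node{
		{text: "Hello "},
		{tag: "b", children: []*node{{text: "world"}}},
	}}
	fmt.Println(outerHTML(p))
	fmt.Println(innerHTML(p))
}
```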

func (*Selection) LastChild

func (s *Selection) LastChild() *Selection

LastChild returns the last child element (not text node).

func (*Selection) NextSibling

func (s *Selection) NextSibling() *Selection

NextSibling returns the next sibling element.

func (*Selection) Parent

func (s *Selection) Parent() *Selection

Parent returns the parent node of the current selection.

func (*Selection) PrevSibling

func (s *Selection) PrevSibling() *Selection

PrevSibling returns the previous sibling element.
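
Element-only sibling navigation, as described for NextSibling and PrevSibling, can be sketched as a walk that skips non-element nodes. The `node` type and the `link`, `nextElement`, and `prevElement` helpers are hypothetical; returning nil at either end of the sibling list is an assumption.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a node in a doubly linked sibling list.
type node struct {
	isElement  bool
	name       string
	prev, next *node
}

// link wires the given nodes together as consecutive siblings.
func link(nodes ...*node) {
	for i := 1; i < len(nodes); i++ {
		nodes[i-1].next = nodes[i]
		nodes[i].prev = nodes[i-1]
	}
}

// nextElement returns the nearest following sibling that is an element, or nil.
func nextElement(n *node) *node {
	for s := n.next; s != nil; s = s.next {
		if s.isElement {
			return s
		}
	}
	return nil
}

// prevElement returns the nearest preceding sibling that is an element, or nil.
func prevElement(n *node) *node {
	for s := n.prev; s != nil; s = s.prev {
		if s.isElement {
			return s
		}
	}
	return nil
}

func main() {
	h1 := &node{isElement: true, name: "h1"}
	gap := &node{name: "#text"} // whitespace between tags
	p := &node{isElement: true, name: "p"}
	link(h1, gap, p)

	fmt.Println(nextElement(h1).name)
	fmt.Println(prevElement(p).name)
	fmt.Println(nextElement(p) == nil)
}
```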

func (*Selection) Text

func (s *Selection) Text() string

Text returns the text content of the selected node.

func (*Selection) TextTrimmed

func (s *Selection) TextTrimmed() string

TextTrimmed returns the trimmed text content of the selected node.
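
As a rough illustration of the difference from Text, the sketch below trims with strings.TrimSpace. Whether the package trims exactly this way, or also collapses interior whitespace, is not stated here; `textTrimmed` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"strings"
)

// textTrimmed strips leading and trailing whitespace; interior whitespace is kept.
func textTrimmed(raw string) string {
	return strings.TrimSpace(raw)
}

func main() {
	raw := "\n    Hello,   world  \n"
	fmt.Printf("%q\n", raw)              // untrimmed text as it might appear in markup
	fmt.Printf("%q\n", textTrimmed(raw)) // edges trimmed, inner spacing unchanged
}
```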

func (*Selection) ToJSON

func (s *Selection) ToJSON(opts JSONOptions) ([]byte, error)

ToJSON converts a selection to JSON.

func (*Selection) ToMap

func (s *Selection) ToMap(opts JSONOptions) map[string]any

ToMap converts the selection to a map structure.
