textquery

package
v1.3.0 Latest
Warning

This package is not in the latest version of its module.

Published: Mar 6, 2026
License: AGPL-3.0, AGPL-3.0-or-later
Imports: 12
Imported by: 0

Documentation

Index

Constants

const (
	ModeCSS   = "css"
	ModeXPath = "xpath"
	ModeRegex = "regex"
	ModeForm  = "form"
	ModeJQ    = "jq"
)

Mode constants for extraction languages.

Variables

This section is empty.

Functions

func ConvertYAMLToJSON

func ConvertYAMLToJSON(v any) any

ConvertYAMLToJSON recursively converts YAML-parsed values to JSON-compatible types. yaml.v3 produces map[string]any for mappings, but may produce other map types for non-string keys. Exported for reuse by pkg/shape.
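The kind of conversion described here can be sketched with the standard library alone. This is not the package's implementation: `convertSketch` is a hypothetical name, and stringifying non-string keys with `fmt.Sprint` is an assumption about how such keys might be handled.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// convertSketch is a hypothetical stand-in for ConvertYAMLToJSON.
// It walks maps and slices recursively and rewrites map[any]any
// (which encoding/json cannot marshal) into map[string]any.
func convertSketch(v any) any {
	switch val := v.(type) {
	case map[string]any:
		out := make(map[string]any, len(val))
		for k, item := range val {
			out[k] = convertSketch(item)
		}
		return out
	case map[any]any:
		out := make(map[string]any, len(val))
		for k, item := range val {
			// Assumption: non-string keys are rendered with fmt.Sprint.
			out[fmt.Sprint(k)] = convertSketch(item)
		}
		return out
	case []any:
		out := make([]any, len(val))
		for i, item := range val {
			out[i] = convertSketch(item)
		}
		return out
	default:
		// Scalars (string, int, float64, bool, nil) pass through unchanged.
		return v
	}
}

func main() {
	in := map[string]any{
		"ports": map[any]any{80: "http", 443: "https"},
	}
	b, err := json.Marshal(convertSketch(in))
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(b))
	// {"ports":{"443":"https","80":"http"}}
}
```

Without the conversion step, `json.Marshal` rejects the nested `map[any]any` with an unsupported-type error, which is the practical reason a helper like this is useful.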

func DetectMode

func DetectMode(ct string) string

DetectMode returns the appropriate extraction mode for a content-type header.
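The documentation does not state which content types map to which modes. The sketch below shows one plausible mapping, purely as an illustration: `detectModeSketch`, the lowercase constant names, and every case in the switch (including the fallback to regex) are assumptions, not the package's actual rules.

```go
package main

import (
	"fmt"
	"mime"
	"strings"
)

// Local copies of the documented mode strings.
const (
	modeCSS   = "css"
	modeXPath = "xpath"
	modeRegex = "regex"
	modeForm  = "form"
	modeJQ    = "jq"
)

// detectModeSketch is a hypothetical stand-in for DetectMode.
// The mapping below is an assumption made for illustration.
func detectModeSketch(ct string) string {
	// Strip parameters such as "; charset=utf-8" and lowercase the type.
	mediaType, _, err := mime.ParseMediaType(ct)
	if err != nil {
		mediaType = strings.ToLower(strings.TrimSpace(ct))
	}
	switch {
	case mediaType == "text/html":
		return modeCSS
	case mediaType == "application/x-www-form-urlencoded":
		return modeForm
	case mediaType == "application/json", strings.HasSuffix(mediaType, "+json"):
		return modeJQ
	case mediaType == "application/xml", mediaType == "text/xml", strings.HasSuffix(mediaType, "+xml"):
		return modeXPath
	default:
		// Assumed fallback: treat anything else as plain text.
		return modeRegex
	}
}

func main() {
	for _, ct := range []string{
		"text/html; charset=utf-8",
		"application/json",
		"application/x-www-form-urlencoded",
		"application/atom+xml",
		"text/plain",
	} {
		fmt.Printf("%-40s %s\n", ct, detectModeSketch(ct))
	}
}
```

Parsing with `mime.ParseMediaType` rather than comparing raw header strings keeps parameters like `charset` from defeating the match.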

Types

type Engine

type Engine struct {
	// contains filtered or unexported fields
}

Engine dispatches text extraction queries to mode-specific handlers.

func NewEngine

func NewEngine() *Engine

NewEngine creates a new text query engine.

func (*Engine) Query

func (e *Engine) Query(body []byte, contentType, expression, mode string, maxResults int) (*QueryResult, error)

Query extracts data from a body using the specified mode and expression. If mode is empty, it is auto-detected from the content type.
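The dispatch behaviour described for Query (use the given mode, or detect one from the content type when mode is empty) might look roughly like the following. Every name here (`engineSketch`, `handler`, `detectSketch`, `stub`) is hypothetical, the handlers are stubs that only record which mode ran, and the real Engine's unexported fields are not known.

```go
package main

import (
	"fmt"
	"strings"
)

// handler is a hypothetical signature for a mode-specific extractor.
type handler func(body []byte, expression string, maxResults int) ([]any, error)

// engineSketch is an illustrative stand-in for Engine.
type engineSketch struct {
	handlers map[string]handler
}

// detectSketch is a deliberately tiny placeholder for DetectMode.
func detectSketch(contentType string) string {
	if strings.HasPrefix(strings.ToLower(contentType), "text/html") {
		return "css"
	}
	return "regex"
}

// stub returns a handler that only reports which mode handled the query.
func stub(mode string) handler {
	return func(_ []byte, expression string, _ int) ([]any, error) {
		return []any{mode + ":" + expression}, nil
	}
}

func newEngineSketch() *engineSketch {
	return &engineSketch{handlers: map[string]handler{
		"css":   stub("css"),
		"regex": stub("regex"),
	}}
}

// query resolves the mode (auto-detecting when it is empty) and
// forwards the call to the handler registered for that mode.
func (e *engineSketch) query(body []byte, contentType, expression, mode string, maxResults int) ([]any, string, error) {
	if mode == "" {
		mode = detectSketch(contentType)
	}
	h, ok := e.handlers[mode]
	if !ok {
		return nil, mode, fmt.Errorf("unsupported mode %q", mode)
	}
	vals, err := h(body, expression, maxResults)
	return vals, mode, err
}

func main() {
	e := newEngineSketch()

	// Empty mode: resolved from the content type.
	vals, mode, err := e.query([]byte("<p>hi</p>"), "text/html; charset=utf-8", "p", "", 10)
	fmt.Println(vals, mode, err)

	// Explicit mode with no registered handler.
	_, _, err = e.query(nil, "", "x", "jq", 10)
	fmt.Println(err)
}
```

A handler map keeps the dispatch table in one place, so adding a mode is a matter of registering one more function.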

func (*Engine) ValidateExpression

func (e *Engine) ValidateExpression(expression, mode string) error

ValidateExpression checks if an expression is valid for the given mode.

type QueryResult

type QueryResult struct {
	Values []any    `json:"values"`
	Count  int      `json:"count"`
	Mode   string   `json:"mode"`
	Errors []string `json:"errors,omitempty"`
}

QueryResult holds extraction results from a single body.
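To show what the struct tags imply for the JSON wire form, here is a small sketch that marshals a local mirror of the type. `queryResult` and `encodeResult` are names introduced only for this example; the field names and tags are copied from the definition above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// queryResult is a local mirror of the documented QueryResult struct,
// copied here only so the example compiles on its own.
type queryResult struct {
	Values []any    `json:"values"`
	Count  int      `json:"count"`
	Mode   string   `json:"mode"`
	Errors []string `json:"errors,omitempty"`
}

// encodeResult marshals a result to its JSON form.
func encodeResult(r queryResult) string {
	b, err := json.Marshal(r)
	if err != nil {
		return "marshal error: " + err.Error()
	}
	return string(b)
}

func main() {
	// No errors: the "errors" key is omitted because of omitempty.
	ok := queryResult{Values: []any{"a", "b"}, Count: 2, Mode: "css"}
	fmt.Println(encodeResult(ok))
	// {"values":["a","b"],"count":2,"mode":"css"}

	// With errors: the key is present.
	partial := queryResult{Values: []any{}, Count: 0, Mode: "regex", Errors: []string{"no match"}}
	fmt.Println(encodeResult(partial))
	// {"values":[],"count":0,"mode":"regex","errors":["no match"]}
}
```

Because `values` has no `omitempty`, a nil `Values` slice encodes as `null` rather than `[]`; consumers that expect an array may want to check for that.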

func QueryCSS

func QueryCSS(body []byte, expression string, maxResults int) (*QueryResult, error)

QueryCSS extracts text content from HTML using CSS selectors.

func QueryForm

func QueryForm(body []byte, expression string, maxResults int) (*QueryResult, error)

QueryForm extracts values from form-urlencoded bodies by key name. Expression "*" or "." returns all key-value pairs as a map. A specific key name returns the values for that key.
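The two behaviours described above (wildcard versus single key) can be approximated with `net/url`. This is a sketch, not the package's code: `queryFormSketch` is a hypothetical name, wrapping the wildcard map in a one-element slice is an assumption, and so is treating `maxResults <= 0` as "no limit".

```go
package main

import (
	"fmt"
	"net/url"
)

// queryFormSketch is a hypothetical stand-in for QueryForm.
func queryFormSketch(body []byte, expression string, maxResults int) ([]any, error) {
	form, err := url.ParseQuery(string(body))
	if err != nil {
		return nil, fmt.Errorf("parse form body: %w", err)
	}

	// "*" or "." selects every key-value pair, returned as one map.
	if expression == "*" || expression == "." {
		all := make(map[string]any, len(form))
		for k, vs := range form {
			all[k] = vs
		}
		return []any{all}, nil
	}

	// Otherwise return the values recorded for the named key.
	var out []any
	for _, v := range form[expression] {
		// Assumption: a non-positive maxResults means unlimited.
		if maxResults > 0 && len(out) >= maxResults {
			break
		}
		out = append(out, v)
	}
	return out, nil
}

func main() {
	body := []byte("tag=go&tag=yaml&user=alice")

	tags, _ := queryFormSketch(body, "tag", 0)
	fmt.Println(tags)

	first, _ := queryFormSketch(body, "tag", 1)
	fmt.Println(first)

	all, _ := queryFormSketch(body, "*", 0)
	fmt.Println(len(all))
}
```

A key that does not appear in the body yields an empty result rather than an error in this sketch; whether the real function does the same is not stated.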

func QueryRegex

func QueryRegex(body []byte, expression string, maxResults int) (*QueryResult, error)

QueryRegex extracts matches from text using Go regular expressions. When the regex has capture groups, returns the first capture group per match. When it has no capture groups, returns the full match.
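The capture-group rule above is easy to reproduce with the standard `regexp` package. The following is a sketch under stated assumptions: `queryRegexSketch` is a hypothetical name, it returns plain strings rather than a QueryResult, and it treats `maxResults <= 0` as "no limit".

```go
package main

import (
	"fmt"
	"regexp"
)

// queryRegexSketch is a hypothetical stand-in for QueryRegex.
// With capture groups it returns group 1 of each match;
// without them it returns the full match.
func queryRegexSketch(body []byte, expression string, maxResults int) ([]string, error) {
	re, err := regexp.Compile(expression)
	if err != nil {
		return nil, fmt.Errorf("compile %q: %w", expression, err)
	}

	// FindAllSubmatch treats a negative n as "all matches".
	n := -1
	if maxResults > 0 {
		n = maxResults
	}

	var out []string
	for _, m := range re.FindAllSubmatch(body, n) {
		if re.NumSubexp() > 0 {
			out = append(out, string(m[1])) // first capture group
		} else {
			out = append(out, string(m[0])) // whole match
		}
	}
	return out, nil
}

func main() {
	body := []byte("id=12 id=34")

	groups, _ := queryRegexSketch(body, `id=(\d+)`, 0)
	fmt.Println(groups) // [12 34]

	full, _ := queryRegexSketch(body, `id=\d+`, 0)
	fmt.Println(full) // [id=12 id=34]
}
```

Note that only the first group is returned per match, so an expression with several groups loses the later ones; wrapping the part of interest in the first group (and using `(?:...)` for the rest) sidesteps that.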

func QueryXPath

func QueryXPath(body []byte, ct, expression string, maxResults int) (*QueryResult, error)

QueryXPath extracts text content from XML or HTML using XPath expressions. Uses xmlquery for XML content types and htmlquery for HTML.
