extract

package
v0.1.1
This package is not in the latest version of its module.
Published: Mar 10, 2026 | License: MIT | Imports: 7 | Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func TransformValue

func TransformValue(raw, transform, targetType string) (any, string)

TransformValue applies an optional transform and type coercion to a raw string.
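
The one-line description leaves the accepted transform names, the target type names, and the meaning of the second return value unstated. The sketch below only illustrates the general shape of "optional transform, then type coercion". Everything in it is an assumption: `transformValueSketch` is a hypothetical stand-in, the transform names `trim` and `lower` and the target types `string`, `int`, and `float` are invented for illustration, and the second return value is assumed to be a warning message that is empty on success.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// transformValueSketch is a hypothetical stand-in, not the package's code.
// Assumed transforms: "" (none), "trim", "lower".
// Assumed target types: "" or "string", "int", "float".
// Assumed second return value: a warning, empty when nothing went wrong.
func transformValueSketch(raw, transform, targetType string) (any, string) {
	s := raw
	switch transform {
	case "":
		// No transform requested; use the raw string as-is.
	case "trim":
		s = strings.TrimSpace(s)
	case "lower":
		s = strings.ToLower(s)
	default:
		return s, fmt.Sprintf("unknown transform %q", transform)
	}

	switch targetType {
	case "", "string":
		return s, ""
	case "int":
		n, err := strconv.Atoi(s)
		if err != nil {
			// Fall back to the string value and report the problem.
			return s, fmt.Sprintf("cannot convert %q to int", s)
		}
		return n, ""
	case "float":
		f, err := strconv.ParseFloat(s, 64)
		if err != nil {
			return s, fmt.Sprintf("cannot convert %q to float", s)
		}
		return f, ""
	default:
		return s, fmt.Sprintf("unknown target type %q", targetType)
	}
}

func main() {
	v, warn := transformValueSketch(" 42 ", "trim", "int")
	fmt.Printf("%v (%T) warning=%q\n", v, v, warn)

	v, warn = transformValueSketch("n/a", "", "int")
	fmt.Printf("%v (%T) warning=%q\n", v, v, warn)
}
```

Returning the uncoerced string together with a warning, rather than an error, is one way to keep a batch extraction going when a single value fails to convert. Whether the real function behaves this way is not confirmed by this page.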

Types

type HealthStats

type HealthStats struct {
	TotalRecords    int
	TotalFields     int
	PopulatedFields int
	EmptyFields     int
}

HealthStats tracks extraction quality across a batch of pages.

func ComputeHealth

func ComputeHealth(result *Result) *HealthStats

ComputeHealth analyzes an extraction result and returns health stats.
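
As a rough illustration of how such stats could be derived from a `Result`, the sketch below counts one slot per record per entry in `Fields`. It uses local copies of the documented types so that it runs on its own. `computeHealthSketch` is a hypothetical stand-in, and the rule for what counts as "empty" (missing key, `nil`, or an empty string) is an assumption, not something this page states.

```go
package main

import "fmt"

// Local stand-ins mirroring the documented types.
type Record = map[string]any

type Result struct {
	Records  []Record
	Fields   []string // ordered field names
	Warnings []string
}

type HealthStats struct {
	TotalRecords    int
	TotalFields     int
	PopulatedFields int
	EmptyFields     int
}

// computeHealthSketch is a hypothetical stand-in for ComputeHealth.
// Assumption: a field is empty when its key is missing, its value is nil,
// or its value is the empty string.
func computeHealthSketch(result *Result) *HealthStats {
	stats := &HealthStats{TotalRecords: len(result.Records)}
	for _, rec := range result.Records {
		for _, field := range result.Fields {
			stats.TotalFields++
			v, ok := rec[field]
			s, isString := v.(string)
			if !ok || v == nil || (isString && s == "") {
				stats.EmptyFields++
				continue
			}
			stats.PopulatedFields++
		}
	}
	return stats
}

func main() {
	res := &Result{
		Fields: []string{"title", "price"},
		Records: []Record{
			{"title": "Widget", "price": 9.5},
			{"title": "Gadget"}, // price is missing
		},
	}
	fmt.Printf("%+v\n", *computeHealthSketch(res))
}
```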

func (*HealthStats) NeedsReInference

func (h *HealthStats) NeedsReInference(threshold float64) bool

NeedsReInference returns true if the success rate is below the threshold.

func (*HealthStats) SuccessRate

func (h *HealthStats) SuccessRate() float64

SuccessRate returns the percentage of fields that were populated.
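
The two methods above can be pictured with the minimal sketch below, written against a local copy of the struct. The method bodies are assumptions: the page does not say whether "percentage" means a 0 to 100 scale or a 0 to 1 fraction, nor what happens when there are no fields. This sketch assumes a 0 to 100 scale and a rate of 0 for an empty batch.

```go
package main

import "fmt"

// HealthStats is a local stand-in mirroring the documented struct.
type HealthStats struct {
	TotalRecords    int
	TotalFields     int
	PopulatedFields int
	EmptyFields     int
}

// SuccessRate is an assumed implementation: populated fields as a
// percentage (0-100) of all fields, and 0 when there are no fields.
func (h *HealthStats) SuccessRate() float64 {
	if h.TotalFields == 0 {
		return 0
	}
	return float64(h.PopulatedFields) / float64(h.TotalFields) * 100
}

// NeedsReInference reports whether the success rate is below threshold.
// The threshold is assumed to use the same 0-100 scale as SuccessRate.
func (h *HealthStats) NeedsReInference(threshold float64) bool {
	return h.SuccessRate() < threshold
}

func main() {
	h := &HealthStats{TotalRecords: 5, TotalFields: 20, PopulatedFields: 15, EmptyFields: 5}
	fmt.Println(h.SuccessRate())        // 75
	fmt.Println(h.NeedsReInference(80)) // true
}
```

If the real package reports a 0 to 1 fraction instead, the threshold passed to `NeedsReInference` would need to be on that scale, so it is worth checking the source before picking a value.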

type Record

type Record = map[string]any

Record is a single extracted data row.

type Result

type Result struct {
	Records  []Record
	Fields   []string // ordered field names
	Warnings []string
}

Result holds the extraction output.
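
Because `Record` is a map, iterating it directly gives no stable column order. The `Fields` slice is what carries the order. The sketch below shows one way a caller might flatten a `Result` into rows that follow `Fields`. It uses local copies of the documented types, and `rowsInFieldOrder` is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// Local stand-ins mirroring the documented types.
type Record = map[string]any

type Result struct {
	Records  []Record
	Fields   []string // ordered field names
	Warnings []string
}

// rowsInFieldOrder is a hypothetical helper: it turns each record into a
// slice of strings ordered by res.Fields. Missing or nil values become "".
func rowsInFieldOrder(res *Result) [][]string {
	rows := make([][]string, 0, len(res.Records))
	for _, rec := range res.Records {
		row := make([]string, len(res.Fields))
		for i, name := range res.Fields {
			if v, ok := rec[name]; ok && v != nil {
				row[i] = fmt.Sprint(v)
			}
		}
		rows = append(rows, row)
	}
	return rows
}

func main() {
	res := &Result{
		Fields: []string{"title", "price"},
		Records: []Record{
			{"price": 9.5, "title": "Widget"},
			{"title": "Gadget"},
		},
	}
	fmt.Println(strings.Join(res.Fields, ","))
	for _, row := range rowsInFieldOrder(res) {
		fmt.Println(strings.Join(row, ","))
	}
}
```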

func Apply

func Apply(strat *strategy.ExtractionStrategy, html []byte) (*Result, error)

Apply applies an extraction strategy to an HTML page and returns structured records.
