Documentation
Overview ¶
Package ltxmlharvest provides a MathWebSearch harvester for documents produced by LaTeXML.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func HarvestFS ¶
func HarvestFS(fsys fs.FS, accept func(path string) bool, uri func(path string) string, writer func(path string, harvest Harvest) error, logger *log.Logger)
HarvestFS recursively harvests all files in fsys. Files in the same directory are grouped into a single harvest.
func HarvestReader ¶
HarvestReader harvests a single reader and writes the output to writer.
Types ¶
type Harvest ¶
type Harvest []HarvestFragment
Harvest represents a single harvest. It implements sort.Interface.
func HarvestFragments ¶
HarvestFragments executes jobs and writes them to logger
func (Harvest) MarshalXML ¶
MarshalXML marshals this harvest into xml form
type HarvestFormula ¶
type HarvestFormula struct {
// ID of this formula
ID string
// Dual (Content + Presentation) MathML contained in this document
// Content and Presentation should be linked using "xref" attributes.
// May use "m" and "mws" namespaces.
DualMathML string
// Content MathML corresponding to the DualMathML above.
// Must use the "m" namespace.
ContentMathML string
}
HarvestFormula represents a single formula found within the harvest
func ReadFormula ¶
func ReadFormula(math *etree.Element) (HarvestFormula, error)
ReadFormula parses a formula based on element
type HarvestFragment ¶
type HarvestFragment struct {
// ID is an internal, but unique, id of this harvest fragment
// typically just the running id of this fragment
ID string
// URI is the URI of the corresponding document
URI string
// XHTMLContent of this document, substituting "math" + id for formulae
XHTMLContent string
// List of formulae within the harvest
Formulae []HarvestFormula
}
HarvestFragment represents a single document fragment within a harvest
func (HarvestFragment) MarshalXML ¶
func (frag HarvestFragment) MarshalXML(e *xml.Encoder, start xml.StartElement) error
MarshalXML marshals this document into xml
type Job ¶
type Job struct {
Reader func() (io.ReadCloser, error)
URI string
}
Job describes a job for the harvester
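Because Reader is a function rather than an open stream, a job can be created cheaply and its input opened only when the job runs. The sketch below redeclares a struct with the same two fields locally so that it compiles on its own; jobFromString and readJob are hypothetical helpers, not part of the package.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// Job mirrors the two fields of the documented Job type.
// It is redeclared here only to keep the sketch self-contained.
type Job struct {
	Reader func() (io.ReadCloser, error)
	URI    string
}

// jobFromString is a hypothetical helper that builds a Job whose Reader
// serves an in-memory string.
func jobFromString(content, uri string) Job {
	return Job{
		Reader: func() (io.ReadCloser, error) {
			return io.NopCloser(strings.NewReader(content)), nil
		},
		URI: uri,
	}
}

// readJob opens the job's reader, reads everything and closes it again.
func readJob(job Job) (string, error) {
	rc, err := job.Reader()
	if err != nil {
		return "", err
	}
	defer rc.Close()
	data, err := io.ReadAll(rc)
	if err != nil {
		return "", err
	}
	return string(data), nil
}

func main() {
	job := jobFromString("<html/>", "https://example.org/doc.html") // assumed URI
	content, err := readJob(job)
	if err != nil {
		panic(err)
	}
	fmt.Println(job.URI, len(content))
}
```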
func JobFromFile ¶
JobFromFile creates a new Job from a file and a uribase