page

package
v0.2.4
Warning

This package is not in the latest version of its module.

Published: Aug 27, 2018 License: MIT Imports: 6 Imported by: 1

Documentation

Index

Constants

const (
	// MIMEType defines the mime-type of page XML files.
	// See: https://github.com/PRImA-Research-Lab/PAGE-XML
	MIMEType = "application/alto+xml"
)

Variables

This section is empty.

Functions

This section is empty.

Types

type Page

type Page struct {
	// contains filtered or unexported fields
}

Page represents an open page XML file.

func Open

func Open(path string) (Page, error)

Open opens a page XML file.

func (Page) FindRegionByRefID

func (p Page) FindRegionByRefID(refID string) (Region, bool)

FindRegionByRefID returns the region with the given refID.

func (Page) FindRegionsByGroupID

func (p Page) FindRegionsByGroupID(groupID string) []Region

FindRegionsByGroupID returns all regions with the given group ID.
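The two lookups above differ in how they report a miss: FindRegionByRefID returns a bool alongside the Region, while FindRegionsByGroupID returns a slice. The sketch below shows the calling pattern only. Because the import path of this package is not given here, Page and Region are local stand-ins with the documented signatures; the regions field, the linear scans, the empty result for an unknown group, and the describe helper are assumptions for illustration, not the package's implementation.

```go
package main

import "fmt"

// Region is a stand-in for the package's Region type. Only the exported
// fields shown in the documentation are modeled.
type Region struct {
	GroupID, RefID string
}

// Page is a stand-in for the package's Page type. The real type keeps its
// fields unexported; the regions slice is an assumption.
type Page struct {
	regions []Region
}

// FindRegionByRefID mirrors the documented signature. A linear scan is
// assumed here; the real lookup strategy is not documented.
func (p Page) FindRegionByRefID(refID string) (Region, bool) {
	for _, r := range p.regions {
		if r.RefID == refID {
			return r, true
		}
	}
	return Region{}, false
}

// FindRegionsByGroupID mirrors the documented signature. Returning an
// empty result for an unknown group is an assumption.
func (p Page) FindRegionsByGroupID(groupID string) []Region {
	var out []Region
	for _, r := range p.regions {
		if r.GroupID == groupID {
			out = append(out, r)
		}
	}
	return out
}

// describe (hypothetical helper) checks the bool before using the Region.
func describe(p Page, refID string) string {
	r, ok := p.FindRegionByRefID(refID)
	if !ok {
		return "no region with refID " + refID
	}
	return fmt.Sprintf("region %s belongs to group %s", r.RefID, r.GroupID)
}

func main() {
	p := Page{regions: []Region{
		{GroupID: "g1", RefID: "r1"},
		{GroupID: "g1", RefID: "r2"},
		{GroupID: "g2", RefID: "r3"},
	}}
	fmt.Println(describe(p, "r2"))
	fmt.Println(describe(p, "r9"))
	fmt.Println(len(p.FindRegionsByGroupID("g1")))
}
```

With the real package, the same describe helper should work unchanged on a Page obtained from Open, since it relies only on the documented method signature.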

func (Page) Regions

func (p Page) Regions() []Region

Regions returns all regions in the page XML file.

type Region

type Region struct {
	GroupID, RefID string
	// contains filtered or unexported fields
}

Region defines a text region in the page XML file.

func (Region) FindLineByID

func (r Region) FindLineByID(id string) (TextLine, bool)

FindLineByID searches for a line with the given ID.

func (Region) Lines

func (r Region) Lines() []TextLine

Lines returns all lines in a region.

func (Region) TextEquivUnicodeAt

func (r Region) TextEquivUnicodeAt(i int) (string, bool)

TextEquivUnicodeAt returns the i-th TextEquiv/Unicode entry (indexing is zero-based).

type TextLine

type TextLine struct {
	ID string
	// contains filtered or unexported fields
}

TextLine represents a line of text in the page XML file.

func (TextLine) FindWordByID

func (l TextLine) FindWordByID(id string) (Word, bool)

FindWordByID searches for a word with the given ID.

func (TextLine) TextEquivUnicodeAt

func (l TextLine) TextEquivUnicodeAt(i int) (string, bool)

TextEquivUnicodeAt returns the i-th TextEquiv/Unicode element (the indexing is zero-based).

func (TextLine) Words

func (l TextLine) Words() []Word

Words returns all words in a line.

type Word

type Word struct {
	ID string
	// contains filtered or unexported fields
}

Word represents a word on a line.
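Page, Region, TextLine, and Word form a three-level hierarchy reached through Regions, Lines, and Words. The following is a minimal sketch of walking that hierarchy. All four types are local stand-ins that mirror the documented accessors; their unexported fields and the wordIDs helper are hypothetical, and the order of results is simply whatever order the accessors return.

```go
package main

import "fmt"

// Stand-ins for the package's types. Exported fields and method signatures
// follow the documentation; the unexported fields are assumptions.
type Word struct{ ID string }

type TextLine struct {
	ID    string
	words []Word
}

func (l TextLine) Words() []Word { return l.words }

type Region struct {
	GroupID, RefID string
	lines          []TextLine
}

func (r Region) Lines() []TextLine { return r.lines }

type Page struct{ regions []Region }

func (p Page) Regions() []Region { return p.regions }

// wordIDs (hypothetical helper) visits every region, line, and word in the
// order the accessors return them and collects the word IDs.
func wordIDs(p Page) []string {
	var ids []string
	for _, r := range p.Regions() {
		for _, l := range r.Lines() {
			for _, w := range l.Words() {
				ids = append(ids, w.ID)
			}
		}
	}
	return ids
}

func main() {
	p := Page{regions: []Region{{
		RefID: "r1",
		lines: []TextLine{
			{ID: "l1", words: []Word{{ID: "w1"}, {ID: "w2"}}},
			{ID: "l2", words: []Word{{ID: "w3"}}},
		},
	}}}
	fmt.Println(wordIDs(p))
}
```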

func (Word) TextEquivUnicodeAt

func (w Word) TextEquivUnicodeAt(i int) (string, bool)

TextEquivUnicodeAt returns the i-th TextEquiv/Unicode element (the indexing is zero-based).
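Region, TextLine, and Word all expose TextEquivUnicodeAt with the same signature, and no method for counting the entries is documented. One way to read every entry is to start at index 0 and stop at the first false result. The sketch below assumes that the bool is false exactly when the index is out of range, which the documentation does not state. The unicodeAter interface, the fakeWord stand-in, and the allUnicode helper are hypothetical names, not part of the package.

```go
package main

import "fmt"

// unicodeAter is a hypothetical interface capturing the method that
// Region, TextLine, and Word share according to the documentation.
type unicodeAter interface {
	TextEquivUnicodeAt(i int) (string, bool)
}

// fakeWord is a stand-in used so the example runs without the package.
type fakeWord struct{ texts []string }

// TextEquivUnicodeAt reports false for out-of-range indices (assumed
// behavior; the real package may define the bool differently).
func (w fakeWord) TextEquivUnicodeAt(i int) (string, bool) {
	if i < 0 || i >= len(w.texts) {
		return "", false
	}
	return w.texts[i], true
}

// allUnicode collects entries from index 0 upward until a lookup fails.
func allUnicode(u unicodeAter) []string {
	var out []string
	for i := 0; ; i++ {
		s, ok := u.TextEquivUnicodeAt(i)
		if !ok {
			return out
		}
		out = append(out, s)
	}
}

func main() {
	w := fakeWord{texts: []string{"Hello", "Hallo"}}
	fmt.Println(allUnicode(w))
}
```

Writing the helper against a small interface rather than a concrete type lets one function serve all three levels of the hierarchy, provided the assumption about the bool holds.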
