Documentation ¶
Overview ¶
Package lex provides general 'container' types such as Entry, Transcription, and Lemma.
The main unit here is the entry. The entry contains everything related to a lexicon entry: orthography, transcriptions, lemma, compound parts, sources/references, and so on. It is implemented as a Go struct, and it can automatically be mapped into a JSON object.
The Entry struct is defined here: https://godoc.org/github.com/stts-se/pronlex/lex#Entry
A few JSON examples:
// Minimal example (English)
{
  "strn": "things",
  "transcriptions": [
    {
      "strn": "' T I N z"
    }
  ]
}
// Entry "things" from the CMU (US English) lexicon
{
  "id": 112326,
  "lexRef": {
    "DBRef": "en_am_cmu_lex",
    "LexName": "en-us.cmu"
  },
  "strn": "things",
  "language": "en-us",
  "partOfSpeech": "",
  "morphology": "",
  "wordParts": "",
  "lemma": {
    "id": 0,
    "strn": "",
    "reading": "",
    "paradigm": ""
  },
  "transcriptions": [
    {
      "id": 120059,
      "entryId": 112326,
      "strn": "' T I N z",
      "language": "",
      "sources": []
    }
  ],
  "status": {
    "id": 112326,
    "name": "imported",
    "source": "cmu",
    "timestamp": "2017-09-20T13:13:21Z",
    "current": true
  },
  "entryValidations": [],
  "preferred": false,
  "tag": ""
}
// Entry "hästar" from the Swedish demo lexicon
{
  "id": 6,
  "lexRef": {
    "DBRef": "wikispeech_lexserver_testdb",
    "LexName": "sv"
  },
  "strn": "hästar",
  "language": "sv",
  "partOfSpeech": "NN",
  "morphology": "NEU IND PLU",
  "wordParts": "hästar",
  "lemma": {
    "id": 4,
    "strn": "häst",
    "reading": "",
    "paradigm": ""
  },
  "transcriptions": [
    {
      "id": 9,
      "entryId": 6,
      "strn": "\" h E . s t a r",
      "language": "sv",
      "sources": []
    }
  ],
  "status": {
    "id": 6,
    "name": "demo",
    "source": "auto",
    "timestamp": "2017-09-22T08:43:32Z",
    "current": true
  },
  "entryValidations": [],
  "preferred": false,
  "tag": ""
}
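As a minimal sketch of the JSON mapping, the minimal example can be decoded with encoding/json once it is written as strict JSON (quoted keys, no `//` label lines). The `entry` and `transcription` types below are trimmed local stand-ins for lex.Entry and lex.Transcription, and `decodeEntry` is a hypothetical helper, not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// transcription and entry are trimmed, local stand-ins for lex.Transcription
// and lex.Entry; only the fields used by the minimal example are mirrored.
type transcription struct {
	Strn string `json:"strn"`
}

type entry struct {
	Strn           string          `json:"strn"`
	Transcriptions []transcription `json:"transcriptions"`
}

// decodeEntry (hypothetical helper) parses one JSON object into an entry.
func decodeEntry(data string) (entry, error) {
	var e entry
	err := json.Unmarshal([]byte(data), &e)
	return e, err
}

func main() {
	// Strict JSON: keys are quoted and comments are not allowed.
	const minimal = `{"strn": "things", "transcriptions": [{"strn": "' T I N z"}]}`
	e, err := decodeEntry(minimal)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(e.Strn, len(e.Transcriptions), e.Transcriptions[0].Strn)
}
```

With the real package, the same call would presumably target a lex.Entry value instead of the stand-in.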
Index ¶
Constants ¶
This section is empty.
Variables ¶
var SourceDelimiter = " : "
SourceDelimiter is used to split a string of several sources into a slice
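A minimal sketch of such a split, assuming plain strings.Split semantics. The constant mirrors the value of lex.SourceDelimiter; `splitSources` and the source names "nst" and "manual" are hypothetical and only serve as illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// sourceDelimiter mirrors the value of lex.SourceDelimiter.
const sourceDelimiter = " : "

// splitSources (hypothetical helper) splits a delimited source string into a
// slice. An empty input yields an empty slice rather than a slice holding "".
func splitSources(s string) []string {
	if s == "" {
		return []string{}
	}
	return strings.Split(s, sourceDelimiter)
}

func main() {
	for _, src := range splitSources("nst : manual") {
		fmt.Println(src)
	}
}
```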
Functions ¶
This section is empty.
Types ¶
type DBRef ¶
type DBRef string
DBRef is a database reference string (for MariaDB: the database name; for SQLite: the database filename without extension)
type Entry ¶
type Entry struct {
ID int64 `json:"id,omitempty"`
LexRef LexRef `json:"lexRef,omitempty"`
Strn string `json:"strn"`
Language string `json:"language,omitempty"`
PartOfSpeech string `json:"partOfSpeech,omitempty"`
Morphology string `json:"morphology,omitempty"`
WordParts string `json:"wordParts,omitempty"`
Lemma Lemma `json:"lemma,omitempty"`
Transcriptions []Transcription `json:"transcriptions"`
EntryStatus EntryStatus `json:"status,omitempty"` // TODO Probably should be a slice of statuses?
EntryValidations []EntryValidation `json:"entryValidations,omitempty"`
// Preferred flag: 1=true, 0=false; schema triggers only one preferred per orthographic word
//Preferred int64 `json:"preferred"`
Preferred bool `json:"preferred,omitempty"`
Tag string `json:"tag,omitempty"`
Comments []EntryComment `json:"comments,omitempty"`
}
Entry defines a lexical entry. It does not correspond one-to-one to the entry db table, since it also contains data from associated tables (Lemma, Tag, Transcription, EntryValidations). The Tag field holds an arbitrary, optional, lower-case string to disambiguate between different lex.Entries sharing the same orthography. Two different lex.Entries cannot have identical lex.Entry.Tags (the database should not allow this).
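The struct tags determine what the JSON output looks like. The sketch below uses a trimmed local stand-in for lex.Entry (the types and `encodeEntry` are hypothetical, the tags are copied from the definition above) to show one detail of encoding/json: `omitempty` drops zero-valued numbers, strings, booleans and empty slices, but has no effect on struct-valued fields such as Lemma, which are therefore emitted as `{}`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// lemma, transcription and entry are trimmed, local stand-ins for the
// corresponding lex types; the JSON tags are copied from the package docs.
type lemma struct {
	ID   int64  `json:"id,omitempty"`
	Strn string `json:"strn,omitempty"`
}

type transcription struct {
	ID   int64  `json:"id,omitempty"`
	Strn string `json:"strn"`
}

type entry struct {
	ID             int64           `json:"id,omitempty"`
	Strn           string          `json:"strn"`
	Language       string          `json:"language,omitempty"`
	Lemma          lemma           `json:"lemma,omitempty"`
	Transcriptions []transcription `json:"transcriptions"`
	Preferred      bool            `json:"preferred,omitempty"`
	Tag            string          `json:"tag,omitempty"`
}

// encodeEntry (hypothetical helper) marshals an entry to a JSON string.
func encodeEntry(e entry) (string, error) {
	b, err := json.Marshal(e)
	return string(b), err
}

func main() {
	e := entry{Strn: "things", Transcriptions: []transcription{{Strn: "' T I N z"}}}
	s, err := encodeEntry(e)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	// prints {"strn":"things","lemma":{},"transcriptions":[{"strn":"' T I N z"}]}
	fmt.Println(s)
}
```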
type EntryComment ¶ added in v0.4.1
type EntryComment struct {
ID int64 `json:"id,omitempty"`
EntryID int64 `json:"entryId,omitempty"`
Source string `json:"source,omitempty"`
Label string `json:"label,omitempty"`
Comment string `json:"comment,omitempty"`
}
func (EntryComment) String ¶ added in v0.4.1
func (c EntryComment) String() string
type EntryFileWriter ¶
EntryFileWriter outputs formatted entries to an io.Writer. Example usage:
bf := bufio.NewWriter(f)
defer bf.Flush()
bfx := lex.EntryFileWriter{bf}
dbapi.LookUp(db, q, &bfx)
func (*EntryFileWriter) Size ¶
func (w *EntryFileWriter) Size() int
Size returns the size of the EntryFileWriter content
func (*EntryFileWriter) Write ¶
func (w *EntryFileWriter) Write(e Entry) error
Write is used to write one lex.Entry at a time to a file
type EntrySliceWriter ¶
type EntrySliceWriter struct {
Entries []Entry
}
EntrySliceWriter is a container for returning Entries from a LookUp call to the db. Example usage:
q := dbapi.Query{ ... }
var esw lex.EntrySliceWriter
err := dbapi.LookUp(db, q, &esw)
[...] esw.Entries // process Entries
func (*EntrySliceWriter) Size ¶
func (w *EntrySliceWriter) Size() int
Size returns the size of the EntrySliceWriter content
func (*EntrySliceWriter) Write ¶
func (w *EntrySliceWriter) Write(e Entry) error
Write is used to write one lex.Entry at a time to the Entries slice
type EntryStatus ¶
type EntryStatus struct {
ID int64 `json:"id,omitempty"`
Name string `json:"name,omitempty"`
Source string `json:"source,omitempty"`
//EntryID int64 `json:"entryId"`
//Timestamp int64 `json:"timestamp"`
Timestamp string `json:"timestamp,omitempty"`
Current bool `json:"current,omitempty"`
}
EntryStatus associates a status with an Entry. The status has a name (such as 'ok') and a source (a string identifying who or what generated the status).
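Timestamp is stored as a plain string. The values in the JSON examples in the overview ("2017-09-20T13:13:21Z") look like RFC 3339 timestamps, so, assuming that format holds in general, they can be converted with the standard time package. `parseStatusTimestamp` is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"time"
)

// parseStatusTimestamp (hypothetical helper) converts a status timestamp
// string to a time.Time, assuming the RFC 3339 layout seen in the examples.
func parseStatusTimestamp(ts string) (time.Time, error) {
	return time.Parse(time.RFC3339, ts)
}

func main() {
	t, err := parseStatusTimestamp("2017-09-20T13:13:21Z")
	if err != nil {
		fmt.Println("invalid timestamp:", err)
		return
	}
	fmt.Println(t.Year(), t.Month(), t.Day())
}
```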
type EntryValidation ¶
type EntryValidation struct {
ID int64 `json:"id,omitempty"`
// Lower case name of level of severity
Level string `json:"level"`
RuleName string `json:"ruleName"`
Message string `json:"Message"`
Timestamp string `json:"timestamp"`
}
EntryValidation associates a validation result to an Entry
func (EntryValidation) String ¶
func (ev EntryValidation) String() string
type EntryWriter ¶
EntryWriter is an interface defining things to which one can write an Entry. See EntrySliceWriter, for returning a slice of Entry, and EntryFileWriter, for writing Entries to file.
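The interface definition is not shown on this page. The sketch below assumes it consists of the two methods that both EntrySliceWriter and EntryFileWriter expose, Write and Size. All types here are local stand-ins, and `emit` is a hypothetical producer playing the role of a call such as dbapi.LookUp.

```go
package main

import "fmt"

// entry is a trimmed, local stand-in for lex.Entry.
type entry struct {
	Strn string
}

// entryWriter is an assumed shape for lex.EntryWriter, inferred from the
// methods shared by EntrySliceWriter and EntryFileWriter.
type entryWriter interface {
	Write(e entry) error
	Size() int
}

// sliceWriter collects entries in memory, in the spirit of EntrySliceWriter.
type sliceWriter struct {
	Entries []entry
}

// Write appends one entry to the slice.
func (w *sliceWriter) Write(e entry) error {
	w.Entries = append(w.Entries, e)
	return nil
}

// Size returns the number of collected entries.
func (w *sliceWriter) Size() int { return len(w.Entries) }

// emit (hypothetical) writes one entry per word to any entryWriter.
func emit(words []string, out entryWriter) error {
	for _, w := range words {
		if err := out.Write(entry{Strn: w}); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	var sw sliceWriter
	// The methods have pointer receivers, so the writer is passed by address.
	if err := emit([]string{"things", "hästar"}, &sw); err != nil {
		fmt.Println("write failed:", err)
		return
	}
	fmt.Println(sw.Size(), "entries collected")
}
```

Because the producer only depends on the interface, the same call can fill a slice in memory or stream to a file without changes.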
type Lemma ¶
type Lemma struct {
ID int64 `json:"id,omitempty"`
Strn string `json:"strn,omitempty"`
Reading string `json:"reading,omitempty"`
Paradigm string `json:"paradigm,omitempty"`
}
Lemma corresponds to a row of the lemma db table
type LexRef ¶
type LexRef struct {
DBRef DBRef `json:"dbRef,omitempty"`
LexName LexName `json:"lexName,omitempty"`
}
LexRef is a lexicon reference specified by DBRef and LexName
func ParseLexRef ¶
ParseLexRef is used to parse a lexicon reference string into a LexRef struct
var fullLexName = "pronlex:sv-se-nst"
var lexRef, _ = ParseLexRef(fullLexName)
// lexRef.DBRef = pronlex
// lexRef.LexName = sv-se-nst
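The parsing described for ParseLexRef can be approximated as follows. This is a sketch that assumes the reference is split at the first colon into a database part and a lexicon name; the actual function may apply further normalisation or stricter checks. `lexRef` and `parseLexRef` are local, hypothetical stand-ins.

```go
package main

import (
	"fmt"
	"strings"
)

// lexRef is a local stand-in for lex.LexRef, using plain strings.
type lexRef struct {
	DBRef   string
	LexName string
}

// parseLexRef (hypothetical re-implementation) splits "db:lexname" at the
// first colon and rejects input where either part is missing.
func parseLexRef(full string) (lexRef, error) {
	parts := strings.SplitN(full, ":", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return lexRef{}, fmt.Errorf("invalid lexicon reference %q, expected db:lexname", full)
	}
	return lexRef{DBRef: parts[0], LexName: parts[1]}, nil
}

func main() {
	ref, err := parseLexRef("pronlex:sv-se-nst")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(ref.DBRef, ref.LexName)
}
```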
type LexRefWithInfo ¶
LexRefWithInfo is a lexicon reference (LexRef) with additional info (SymbolSetName)
func NewLexRefWithInfo ¶ added in v0.4.1
func NewLexRefWithInfo(lexDB string, lexName string, symbolSetName string) LexRefWithInfo
NewLexRefWithInfo creates a lexicon reference with symbol set, from (downcased) input strings
type Transcription ¶
type Transcription struct {
ID int64 `json:"id,omitempty"`
EntryID int64 `json:"entryId,omitempty"`
Strn string `json:"strn"`
Language string `json:"language,omitempty"`
Sources []string `json:"sources,omitempty"`
}
Transcription corresponds to the transcription db table
func (*Transcription) AddSource ¶
func (t *Transcription) AddSource(s string) error
AddSource adds a source string at the beginning of the Transcription.Sources slice. If the source is already present, AddSource silently skips it. AddSource returns an error if the input string contains the SourceDelimiter string.
func (Transcription) SourcesString ¶
func (t Transcription) SourcesString() string
SourcesString returns the []string items of Transcription.Sources as a string, where the items are delimited by SourceDelimiter
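The documented behaviour of AddSource and SourcesString can be sketched as below. This is an illustrative re-implementation on a local stand-in type, not the package's own code; the method names are lower-cased to mark them as hypothetical, and the source names "nst" and "manual" are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// sourceDelimiter mirrors the value of lex.SourceDelimiter.
const sourceDelimiter = " : "

// transcription is a trimmed, local stand-in for lex.Transcription.
type transcription struct {
	Strn    string
	Sources []string
}

// addSource prepends s to Sources, skips sources that are already present,
// and rejects input containing the delimiter.
func (t *transcription) addSource(s string) error {
	if strings.Contains(s, sourceDelimiter) {
		return fmt.Errorf("source %q must not contain the delimiter %q", s, sourceDelimiter)
	}
	for _, existing := range t.Sources {
		if existing == s {
			return nil // already present: nothing to add
		}
	}
	t.Sources = append([]string{s}, t.Sources...)
	return nil
}

// sourcesString joins Sources with the delimiter.
func (t transcription) sourcesString() string {
	return strings.Join(t.Sources, sourceDelimiter)
}

func main() {
	tr := transcription{Strn: "' T I N z", Sources: []string{"nst"}}
	if err := tr.addSource("manual"); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(tr.sourcesString()) // prints manual : nst
}
```

Rejecting sources that contain the delimiter keeps the joined string unambiguous when it is later split back into a slice.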
type TranscriptionSlice ¶
type TranscriptionSlice []Transcription
TranscriptionSlice is used for sorting by ascending ID
func (TranscriptionSlice) Len ¶
func (a TranscriptionSlice) Len() int
func (TranscriptionSlice) Less ¶
func (a TranscriptionSlice) Less(i, j int) bool
func (TranscriptionSlice) Swap ¶
func (a TranscriptionSlice) Swap(i, j int)
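Len, Less and Swap together satisfy sort.Interface, so a TranscriptionSlice can be handed to sort.Sort. A minimal sketch on local stand-in types, assuming Less compares the ID fields (as the description of the type suggests); `sortedByID` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// transcription is a trimmed, local stand-in for lex.Transcription.
type transcription struct {
	ID   int64
	Strn string
}

// transcriptionSlice mirrors lex.TranscriptionSlice; Less is assumed to
// order by ascending ID.
type transcriptionSlice []transcription

func (a transcriptionSlice) Len() int           { return len(a) }
func (a transcriptionSlice) Less(i, j int) bool { return a[i].ID < a[j].ID }
func (a transcriptionSlice) Swap(i, j int)      { a[i], a[j] = a[j], a[i] }

// sortedByID (hypothetical helper) returns a sorted copy and leaves the
// input slice untouched.
func sortedByID(ts []transcription) []transcription {
	out := make([]transcription, len(ts))
	copy(out, ts)
	sort.Sort(transcriptionSlice(out))
	return out
}

func main() {
	ts := []transcription{{ID: 9}, {ID: 3}, {ID: 5}}
	for _, t := range sortedByID(ts) {
		fmt.Println(t.ID)
	}
}
```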