raw

package
v2.1.1 Latest
Warning: This package is not in the latest version of its module.
Published: Apr 19, 2026 License: Apache-2.0 Imports: 4 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func NewFile

func NewFile() corpus.File

NewFile returns a new File instance

func NewIterator

func NewIterator(scanner *bufio.Scanner) corpus.Iterator

NewIterator creates a new RawIterator from a scanner

func NewPayload

func NewPayload(line int, content string) corpus.Payload

NewPayload returns a new Payload from a line in the raw corpus. Since raw corpus files don't include line numbers, the line number must be provided separately.

func NewRawCorpus

func NewRawCorpus(filePath string) corpus.Corpus

NewRawCorpus returns a new raw corpus instance

Types

type File

type File struct {
	// contains filtered or unexported fields
}

File implements the corpus.File interface for raw corpus files. Raw corpus files are local files, so no caching is needed.

func (*File) CacheDir

func (f *File) CacheDir() string

CacheDir returns an empty string because raw files do not use caching.

func (*File) FilePath

func (f *File) FilePath() string

FilePath returns the path to the raw corpus file

func (*File) WithCacheDir

func (f *File) WithCacheDir(cacheDir string) corpus.File

WithCacheDir is a no-op for raw files since they don't use caching

func (*File) WithFileName

func (f *File) WithFileName(fileName string) corpus.File

WithFileName sets the file path for the raw corpus file
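
The File behavior described above (a recorded path, an always-empty cache directory, and a no-op WithCacheDir) can be sketched as follows. This is a minimal illustration, not the package's implementation: the `file` type and its single field are hypothetical stand-ins, and the builder methods return `*file` here, whereas the documented methods return corpus.File.

```go
package main

import "fmt"

// file is a hypothetical stand-in for the package's File type.
// The real struct's fields are unexported and not documented.
type file struct {
	path string
}

// FilePath returns the recorded path.
func (f *file) FilePath() string { return f.path }

// CacheDir always reports an empty string: nothing is cached.
func (f *file) CacheDir() string { return "" }

// WithCacheDir ignores its argument and returns the receiver unchanged.
func (f *file) WithCacheDir(cacheDir string) *file { return f }

// WithFileName records the path and returns the receiver for chaining.
func (f *file) WithFileName(fileName string) *file {
	f.path = fileName
	return f
}

func main() {
	f := (&file{}).WithCacheDir("/tmp/cache").WithFileName("payloads.txt")
	fmt.Printf("path=%q cache=%q\n", f.FilePath(), f.CacheDir())
}
```

In this sketch the cache directory passed to WithCacheDir is dropped, so CacheDir still reports an empty string after the chain.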

type Payload

type Payload struct {
	// contains filtered or unexported fields
}

Payload implements the corpus.Payload interface for raw corpus files. Raw corpus files contain one payload per line without line numbers.

func (*Payload) Content

func (p *Payload) Content() string

Content returns the payload content

func (*Payload) LineNumber

func (p *Payload) LineNumber() int

LineNumber returns the line number of the payload

func (*Payload) SetContent

func (p *Payload) SetContent(content string)

SetContent sets the content of the payload

func (*Payload) SetLineNumber

func (p *Payload) SetLineNumber(line int)

SetLineNumber sets the line number of the payload
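
Because a raw corpus line carries no number of its own, the caller supplies one when building a payload and can change it later. The sketch below shows that shape under stated assumptions: `payload` and `newPayload` are hypothetical stand-ins that mirror the documented method names, and the field layout is invented for illustration.

```go
package main

import "fmt"

// payload is a hypothetical stand-in for the package's Payload type.
type payload struct {
	line    int
	content string
}

// newPayload mirrors the documented NewPayload(line, content) shape.
func newPayload(line int, content string) *payload {
	return &payload{line: line, content: content}
}

func (p *payload) LineNumber() int           { return p.line }
func (p *payload) Content() string           { return p.content }
func (p *payload) SetLineNumber(line int)    { p.line = line }
func (p *payload) SetContent(content string) { p.content = content }

func main() {
	// The caller, not the file, decides the line number.
	p := newPayload(3, "GET /index.html")
	fmt.Println(p.LineNumber(), p.Content())

	// Pointer receivers let the setters modify the payload in place.
	p.SetContent("GET /admin")
	p.SetLineNumber(4)
	fmt.Println(p.LineNumber(), p.Content())
}
```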

type RawCorpus

type RawCorpus struct {
	// contains filtered or unexported fields
}

RawCorpus represents a corpus from a raw text file with one payload per line.

func (*RawCorpus) CloseIterator

func (c *RawCorpus) CloseIterator() error

CloseIterator closes the underlying file the iterator is using.

func (*RawCorpus) FetchCorpusFile

func (c *RawCorpus) FetchCorpusFile() corpus.File

FetchCorpusFile returns a File interface for the raw corpus file. Since raw files are local, no downloading is needed.

func (*RawCorpus) GetIterator

func (c *RawCorpus) GetIterator(cache corpus.File) corpus.Iterator

GetIterator returns an iterator for the corpus. Call CloseIterator to close the underlying file when done.
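
The pairing of GetIterator with CloseIterator suggests that the corpus holds an open file handle while iteration is in progress. A minimal sketch of that lifecycle, using only the standard library, might look like this. The names `rawCorpus`, `getScanner`, `closeIterator`, `countLines`, and `demo` are hypothetical, and the real package may manage the handle differently.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
)

// rawCorpus is a hypothetical stand-in that keeps the open file handle
// so it can be closed once iteration is finished.
type rawCorpus struct {
	path string
	fh   *os.File
}

// getScanner opens the file and returns a line scanner over it.
func (c *rawCorpus) getScanner() (*bufio.Scanner, error) {
	fh, err := os.Open(c.path)
	if err != nil {
		return nil, err
	}
	c.fh = fh
	return bufio.NewScanner(fh), nil
}

// closeIterator closes the file opened by getScanner, if any.
func (c *rawCorpus) closeIterator() error {
	if c.fh == nil {
		return nil
	}
	err := c.fh.Close()
	c.fh = nil
	return err
}

// countLines walks the corpus and always releases the file handle.
func countLines(path string) (int, error) {
	c := &rawCorpus{path: path}
	sc, err := c.getScanner()
	if err != nil {
		return 0, err
	}
	defer c.closeIterator()
	n := 0
	for sc.Scan() {
		n++
	}
	return n, sc.Err()
}

// demo writes a three-line temporary corpus and counts its lines.
func demo() (int, error) {
	tmp, err := os.CreateTemp("", "raw-corpus-*.txt")
	if err != nil {
		return 0, err
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.WriteString("one\ntwo\nthree\n"); err != nil {
		tmp.Close()
		return 0, err
	}
	if err := tmp.Close(); err != nil {
		return 0, err
	}
	return countLines(tmp.Name())
}

func main() {
	n, err := demo()
	fmt.Println(n, err)
}
```

Deferring the close immediately after a successful open keeps the handle from leaking if the loop exits early.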

func (*RawCorpus) Language

func (c *RawCorpus) Language() string

Language returns the language of the corpus

func (*RawCorpus) LocalPath

func (c *RawCorpus) LocalPath() string

LocalPath implements corpus.Corpus.

func (*RawCorpus) Size

func (c *RawCorpus) Size() string

Size returns the size of the corpus

func (*RawCorpus) Source

func (c *RawCorpus) Source() string

Source returns the source of the corpus

func (*RawCorpus) URL

func (c *RawCorpus) URL() string

URL returns the file path, which serves as the URL for a raw corpus.

func (*RawCorpus) WithLanguage

func (c *RawCorpus) WithLanguage(lang string) corpus.Corpus

WithLanguage sets the language of the corpus (informational only for raw corpus)

func (*RawCorpus) WithSize

func (c *RawCorpus) WithSize(size string) corpus.Corpus

WithSize sets the size of the corpus (informational only for raw corpus)

func (*RawCorpus) WithSource

func (c *RawCorpus) WithSource(source string) corpus.Corpus

WithSource sets the source of the corpus (informational only for raw corpus)

func (*RawCorpus) WithURL

func (c *RawCorpus) WithURL(url string) corpus.Corpus

WithURL sets the file path for the raw corpus

func (*RawCorpus) WithYear

func (c *RawCorpus) WithYear(year string) corpus.Corpus

WithYear sets the year of the corpus (informational only for raw corpus)

func (*RawCorpus) Year

func (c *RawCorpus) Year() string

Year returns the year of the corpus

type RawIterator

type RawIterator struct {
	// contains filtered or unexported fields
}

RawIterator implements the corpus.Iterator interface for raw corpus files. It reads one payload per line and generates line numbers automatically.

func (*RawIterator) HasNext

func (r *RawIterator) HasNext() bool

HasNext returns true if there is another line in the corpus

func (*RawIterator) Next

func (r *RawIterator) Next() corpus.Payload

Next returns the next payload from the corpus
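
One plausible way to build a HasNext/Next pair on top of a bufio.Scanner, with line numbers generated as lines are read, is sketched below. The types `rawIterator` and `payload` are hypothetical stand-ins, and two details are assumptions rather than documented behavior: that HasNext advances the scanner, and that numbering starts at 1.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// payload is a hypothetical stand-in for corpus.Payload.
type payload struct {
	line    int
	content string
}

// rawIterator is a hypothetical stand-in for RawIterator.
type rawIterator struct {
	scanner *bufio.Scanner
	line    int // number of lines returned so far
}

func newIterator(scanner *bufio.Scanner) *rawIterator {
	return &rawIterator{scanner: scanner}
}

// HasNext advances the scanner; it reports false at EOF or on error.
func (r *rawIterator) HasNext() bool { return r.scanner.Scan() }

// Next returns the line read by the last HasNext call, numbered from 1.
func (r *rawIterator) Next() payload {
	r.line++
	return payload{line: r.line, content: r.scanner.Text()}
}

// collect drains an iterator over the given text into a slice.
func collect(input string) []payload {
	it := newIterator(bufio.NewScanner(strings.NewReader(input)))
	var out []payload
	for it.HasNext() {
		out = append(out, it.Next())
	}
	return out
}

func main() {
	for _, p := range collect("alpha\nbeta\ngamma\n") {
		fmt.Printf("%d: %s\n", p.line, p.content)
	}
}
```

In this sketch HasNext is not idempotent: calling it twice without an intervening Next skips a line, so each HasNext should be followed by exactly one Next.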
