Documentation
Overview ¶
Package index provides interfaces for indexing document metadata and retrieving it from the index. Currently, the only implementation of those interfaces uses Bleve.
Index ¶
- Constants
- func CreateAuthorsIndex(path string) bleve.Index
- func CreateAuthorsMapping() mapping.IndexMapping
- func CreateDocumentsIndex(path string) bleve.Index
- func CreateDocumentsMapping() mapping.IndexMapping
- func MigrateAuthors(oldIndex, newIndex bleve.Index, batchSize int) error
- func MigrateDocuments(oldIndex, newIndex bleve.Index, batchSize int) error
- type Author
- type BleveIndexer
- func (b *BleveIndexer) AddFile(file string) (string, error)
- func (b *BleveIndexer) AddLibrary(batchSize int, forceIndexing bool) error
- func (b *BleveIndexer) Author(slug, lang string) (Author, error)
- func (b *BleveIndexer) Close() error
- func (b *BleveIndexer) Count() (uint64, error)
- func (b *BleveIndexer) Document(slug string) (Document, error)
- func (b *BleveIndexer) DocumentByID(ID string) (Document, error)
- func (b *BleveIndexer) Documents(IDs []string, sortBy []string) ([]Document, error)
- func (b *BleveIndexer) IndexAuthor(author Author) error
- func (b *BleveIndexer) IndexingProgress() (Progress, error)
- func (b *BleveIndexer) Languages() ([]string, error)
- func (b *BleveIndexer) LatestDocs(limit int) ([]Document, error)
- func (b *BleveIndexer) RemoveFile(file string) error
- func (b *BleveIndexer) SameAuthors(slugID string, quantity int) ([]Document, error)
- func (b *BleveIndexer) SameSeries(slugID string, quantity int) ([]Document, error)
- func (b *BleveIndexer) SameSubjects(slugID string, quantity int) ([]Document, error)
- func (b *BleveIndexer) Search(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
- func (b *BleveIndexer) SearchByAuthor(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
- func (b *BleveIndexer) SearchBySeries(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
- func (b *BleveIndexer) Slug(document Document, batchSlugs map[string]struct{}) string
- func (b *BleveIndexer) Subjects() ([]string, error)
- func (b *BleveIndexer) TotalWordCount(IDs []string) (float64, error)
- type Document
- type Progress
- type SearchFields
Constants ¶
const AuthorVersion = "1"
AuthorVersion identifies the mapping used for indexing authors. Any change to the mapping requires incrementing the version, to signal that a new index needs to be created.
const DocumentVersion = "v10"
DocumentVersion identifies the mapping used for indexing documents. Any change to the mapping requires incrementing the version, to signal that a new index needs to be created.
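One possible way to act on these version constants is to embed the version in the on-disk index location, so that a mapping change points at a fresh index. The sketch below only illustrates that idea: `versionedIndexPath`, the lowercase constant copies, and the directory layout are hypothetical and are not taken from the package.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// Hypothetical copies of the package constants, repeated here so the
// sketch compiles on its own.
const (
	authorVersion   = "1"
	documentVersion = "v10"
)

// versionedIndexPath is a hypothetical helper: it builds an index
// location that changes whenever the mapping version changes.
func versionedIndexPath(baseDir, kind, version string) string {
	return filepath.Join(baseDir, kind+"-"+version)
}

func main() {
	fmt.Println(versionedIndexPath("indexes", "documents", documentVersion))
	fmt.Println(versionedIndexPath("indexes", "authors", authorVersion))
}
```

With a scheme like this, bumping a version yields a path that does not exist yet, which an application could treat as the signal to create a new index and migrate the old one.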
Variables ¶
This section is empty.
Functions ¶
func CreateAuthorsIndex ¶ added in v4.16.0
func CreateAuthorsIndex(path string) bleve.Index
func CreateAuthorsMapping ¶ added in v4.16.0
func CreateAuthorsMapping() mapping.IndexMapping
func CreateDocumentsIndex ¶ added in v4.16.0
func CreateDocumentsIndex(path string) bleve.Index
func CreateDocumentsMapping ¶ added in v4.16.0
func CreateDocumentsMapping() mapping.IndexMapping
func MigrateAuthors ¶ added in v4.16.0
func MigrateAuthors(oldIndex, newIndex bleve.Index, batchSize int) error
MigrateAuthors migrates all authors from a legacy index to a new index. It searches only for items with Type = "author" and deletes them immediately after migration to avoid pagination issues and free up disk space.
func MigrateDocuments ¶ added in v4.16.0
func MigrateDocuments(oldIndex, newIndex bleve.Index, batchSize int) error
MigrateDocuments migrates all documents from a legacy index to a new documents index in batches. It always loads the first 1000 documents to avoid pagination issues, and deletes them immediately after migration to limit disk usage.
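The "always load the first page, then delete it" pattern described for the migration functions can be sketched with an in-memory stand-in for the indexes. This is an illustration under stated assumptions, not the package's implementation: the `store` type, `firstIDs`, and `migrate` are hypothetical, and no Bleve calls are involved.

```go
package main

import (
	"fmt"
	"sort"
)

// store is a hypothetical in-memory stand-in for an index: ID -> payload.
type store map[string]string

// firstIDs returns up to n IDs in a stable order, mimicking a query
// that always asks for the first page of results.
func (s store) firstIDs(n int) []string {
	ids := make([]string, 0, len(s))
	for id := range s {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	if len(ids) > n {
		ids = ids[:n]
	}
	return ids
}

// migrate copies every item from oldIdx to newIdx in batches. Because
// each migrated batch is deleted from oldIdx, the next "first page"
// only contains items that have not been migrated yet, so no page
// offset has to be tracked.
func migrate(oldIdx, newIdx store, batchSize int) error {
	if batchSize <= 0 {
		return fmt.Errorf("batch size must be positive, got %d", batchSize)
	}
	for {
		ids := oldIdx.firstIDs(batchSize)
		if len(ids) == 0 {
			return nil
		}
		for _, id := range ids {
			newIdx[id] = oldIdx[id]
			delete(oldIdx, id)
		}
	}
}

func main() {
	oldIdx := store{"a": "1", "b": "2", "c": "3"}
	newIdx := store{}
	if err := migrate(oldIdx, newIdx, 2); err != nil {
		fmt.Println("migration failed:", err)
		return
	}
	fmt.Println(len(oldIdx), len(newIdx)) // prints "0 3"
}
```

Deleting as it goes also means the old store shrinks while the new one grows, which matches the stated goal of keeping disk usage down during migration.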
Types ¶
type Author ¶ added in v4.3.0
type Author struct {
Slug string
Name string
BirthName string
DataSourceID string
RetrievedOn time.Time
WikipediaLink map[string]string
InstanceOf float64
Description map[string]string
DateOfBirth precisiondate.PrecisionDate
DateOfDeath precisiondate.PrecisionDate
Website string
DataSourceImage string
Gender float64
Pseudonyms []string
}
func (Author) BirthNameIncludesName ¶ added in v4.7.0
func (Author) YearOfBirthAbs ¶ added in v4.7.0
func (Author) YearOfDeathAbs ¶ added in v4.7.0
type BleveIndexer ¶
type BleveIndexer struct {
// contains filtered or unexported fields
}
func NewBleve ¶
func NewBleve(documentsIndex bleve.Index, authorsIndex bleve.Index, fs afero.Fs, libraryPath string, read map[string]metadata.Reader) *BleveIndexer
NewBleve creates a new BleveIndexer instance using the passed parameters
func (*BleveIndexer) AddFile ¶
func (b *BleveIndexer) AddFile(file string) (string, error)
AddFile adds a file to the index
func (*BleveIndexer) AddLibrary ¶
func (b *BleveIndexer) AddLibrary(batchSize int, forceIndexing bool) error
AddLibrary scans <libraryPath> for documents and adds them to the index in batches of <batchSize> if they haven't been previously indexed or if <forceIndexing> is true
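The batching and skip rules that AddLibrary describes can be sketched as follows. The helper `planBatches`, its parameters, and the fallback for a non-positive batch size are all hypothetical; the real method works against the library filesystem and the Bleve index, which this sketch replaces with a slice and a map.

```go
package main

import "fmt"

// planBatches is a hypothetical helper that groups the files needing
// indexing into batches of at most batchSize. Files already present in
// indexed are skipped unless force is true.
func planBatches(files []string, indexed map[string]bool, batchSize int, force bool) [][]string {
	if batchSize <= 0 {
		batchSize = 1 // assumption: fall back to one file per batch
	}
	var batches [][]string
	var current []string
	for _, f := range files {
		if indexed[f] && !force {
			continue
		}
		current = append(current, f)
		if len(current) == batchSize {
			batches = append(batches, current)
			current = nil
		}
	}
	if len(current) > 0 {
		batches = append(batches, current)
	}
	return batches
}

func main() {
	files := []string{"a.epub", "b.epub", "c.pdf", "d.epub"}
	indexed := map[string]bool{"b.epub": true}
	fmt.Println(planBatches(files, indexed, 2, false)) // prints "[[a.epub c.pdf] [d.epub]]"
	fmt.Println(planBatches(files, indexed, 2, true))  // prints "[[a.epub b.epub] [c.pdf d.epub]]"
}
```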
func (*BleveIndexer) Author ¶ added in v4.3.0
func (b *BleveIndexer) Author(slug, lang string) (Author, error)
func (*BleveIndexer) Count ¶
func (b *BleveIndexer) Count() (uint64, error)
Count returns the number of indexed documents
func (*BleveIndexer) DocumentByID ¶ added in v4.9.0
func (b *BleveIndexer) DocumentByID(ID string) (Document, error)
func (*BleveIndexer) Documents ¶
func (b *BleveIndexer) Documents(IDs []string, sortBy []string) ([]Document, error)
func (*BleveIndexer) IndexAuthor ¶ added in v4.7.0
func (b *BleveIndexer) IndexAuthor(author Author) error
func (*BleveIndexer) IndexingProgress ¶
func (b *BleveIndexer) IndexingProgress() (Progress, error)
func (*BleveIndexer) Languages ¶ added in v4.16.0
func (b *BleveIndexer) Languages() ([]string, error)
Languages returns a list of all unique languages in the indexed documents using faceted search.
func (*BleveIndexer) LatestDocs ¶ added in v4.7.0
func (b *BleveIndexer) LatestDocs(limit int) ([]Document, error)
func (*BleveIndexer) RemoveFile ¶
func (b *BleveIndexer) RemoveFile(file string) error
RemoveFile removes a file from the index
func (*BleveIndexer) SameAuthors ¶
func (b *BleveIndexer) SameAuthors(slugID string, quantity int) ([]Document, error)
SameAuthors returns an array of metadata of documents by the same authors that do not belong to the same collection.
func (*BleveIndexer) SameSeries ¶
func (b *BleveIndexer) SameSeries(slugID string, quantity int) ([]Document, error)
SameSeries returns an array of metadata of documents in the same series
func (*BleveIndexer) SameSubjects ¶
func (b *BleveIndexer) SameSubjects(slugID string, quantity int) ([]Document, error)
SameSubjects returns an array of metadata of documents by other authors that have subjects similar to those of the passed document and do not belong to the same collection. They are sorted by subject match and then by date, with those closest to the publishing date of the reference document first.
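The ordering described for SameSubjects can be illustrated with a plain in-memory sort. This sketch assumes that "subject match" means the number of shared subjects and that date closeness is measured in whole years; the `candidate` type and the `rank` function are hypothetical and do not reflect how the package queries Bleve.

```go
package main

import (
	"fmt"
	"sort"
)

// candidate is a hypothetical, reduced view of a document.
type candidate struct {
	Title    string
	Subjects []string
	Year     int
}

// shared counts how many subjects of c also appear in ref.
func shared(ref map[string]bool, c candidate) int {
	n := 0
	for _, s := range c.Subjects {
		if ref[s] {
			n++
		}
	}
	return n
}

func absInt(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// rank returns a sorted copy of cands: most shared subjects first, and
// among equals, the publication year closest to refYear first.
func rank(refSubjects []string, refYear int, cands []candidate) []candidate {
	ref := make(map[string]bool, len(refSubjects))
	for _, s := range refSubjects {
		ref[s] = true
	}
	out := append([]candidate(nil), cands...)
	sort.SliceStable(out, func(i, j int) bool {
		si, sj := shared(ref, out[i]), shared(ref, out[j])
		if si != sj {
			return si > sj
		}
		return absInt(out[i].Year-refYear) < absInt(out[j].Year-refYear)
	})
	return out
}

func main() {
	cands := []candidate{
		{Title: "A", Subjects: []string{"History"}, Year: 1990},
		{Title: "B", Subjects: []string{"History", "War"}, Year: 1950},
		{Title: "C", Subjects: []string{"History", "War"}, Year: 2001},
	}
	for _, c := range rank([]string{"History", "War"}, 2000, cands) {
		fmt.Print(c.Title, " ") // prints "C B A "
	}
	fmt.Println()
}
```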
func (*BleveIndexer) Search ¶
func (b *BleveIndexer) Search(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
Search looks for documents that match the passed keywords and filters. It returns a maximum of <resultsPerPage> documents, offset by <page>.
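How a page number and a page size translate into a window over the results can be sketched as below. The sketch assumes 1-based page numbers, which the documentation does not state, and `pageBounds` is a hypothetical helper rather than part of the package.

```go
package main

import "fmt"

// pageBounds returns the half-open range [start, end) of result
// positions for the given page. Assumption: pages are numbered from 1,
// and values below 1 are treated as the first page.
func pageBounds(total, page, resultsPerPage int) (start, end int) {
	if resultsPerPage < 1 || total <= 0 {
		return 0, 0
	}
	if page < 1 {
		page = 1
	}
	start = (page - 1) * resultsPerPage
	if start >= total {
		return total, total // past the last page: empty range
	}
	end = start + resultsPerPage
	if end > total {
		end = total
	}
	return start, end
}

func main() {
	start, end := pageBounds(25, 3, 10)
	fmt.Println(start, end) // prints "20 25"
}
```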
func (*BleveIndexer) SearchByAuthor ¶ added in v4.3.0
func (b *BleveIndexer) SearchByAuthor(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
func (*BleveIndexer) SearchBySeries ¶ added in v4.9.0
func (b *BleveIndexer) SearchBySeries(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
func (*BleveIndexer) Slug ¶
func (b *BleveIndexer) Slug(document Document, batchSlugs map[string]struct{}) string
Because the Bleve index is not updated until the batch is executed, the slugs processed in the current batch must be kept in memory so that the current document's slug can also be compared against them.
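The need for an in-memory set of batch slugs can be sketched as follows. Everything here is hypothetical: `uniqueSlug`, the `exists` callback standing in for an index lookup, the numeric suffix scheme, and the choice to record the returned slug in the map are assumptions made for illustration, not the behavior of the Slug method.

```go
package main

import (
	"fmt"
	"strconv"
)

// uniqueSlug is a hypothetical helper. exists reports whether a slug is
// already stored in the index; batchSlugs holds slugs assigned earlier
// in the current, not yet executed, batch. A candidate is accepted only
// if it is absent from both.
func uniqueSlug(base string, exists func(string) bool, batchSlugs map[string]struct{}) string {
	slug := base
	for i := 2; ; i++ {
		_, inBatch := batchSlugs[slug]
		if !inBatch && !exists(slug) {
			batchSlugs[slug] = struct{}{} // assumption: remember it for the rest of the batch
			return slug
		}
		slug = base + "-" + strconv.Itoa(i)
	}
}

func main() {
	indexed := map[string]bool{"dune": true}
	exists := func(s string) bool { return indexed[s] }
	batch := map[string]struct{}{}

	fmt.Println(uniqueSlug("dune", exists, batch)) // prints "dune-2"
	fmt.Println(uniqueSlug("dune", exists, batch)) // prints "dune-3"
	fmt.Println(uniqueSlug("emma", exists, batch)) // prints "emma"
}
```

Without the batch map, the second call would also return "dune-2", because the index lookup alone cannot see slugs that are still waiting in the batch.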
func (*BleveIndexer) Subjects ¶ added in v4.18.0
func (b *BleveIndexer) Subjects() ([]string, error)
Subjects returns a list of all unique subjects in the indexed documents using faceted search. It facets on the Subjects field (now a keyword field) to get complete subject names. Subject names are normalized to have only the first letter capitalized.
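The normalization step can be sketched independently of faceted search. This sketch assumes that "only the first letter capitalized" means upper-casing the first letter and lower-casing the rest; `normalizeSubject` and `uniqueSubjects` are hypothetical helpers, and the input slice stands in for facet terms returned by the index.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
	"unicode/utf8"
)

// normalizeSubject trims s, upper-cases its first letter and
// lower-cases the remainder.
func normalizeSubject(s string) string {
	s = strings.TrimSpace(s)
	if s == "" {
		return ""
	}
	first, size := utf8.DecodeRuneInString(s)
	return string(unicode.ToUpper(first)) + strings.ToLower(s[size:])
}

// uniqueSubjects normalizes every entry, drops empty values and
// duplicates, and returns the result sorted alphabetically.
func uniqueSubjects(raw []string) []string {
	seen := make(map[string]struct{})
	var out []string
	for _, r := range raw {
		n := normalizeSubject(r)
		if n == "" {
			continue
		}
		if _, ok := seen[n]; ok {
			continue
		}
		seen[n] = struct{}{}
		out = append(out, n)
	}
	sort.Strings(out)
	return out
}

func main() {
	raw := []string{"science fiction", "Science Fiction", "HISTORY", " history "}
	fmt.Println(uniqueSubjects(raw)) // prints "[History Science fiction]"
}
```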
func (*BleveIndexer) TotalWordCount ¶ added in v4.15.0
func (b *BleveIndexer) TotalWordCount(IDs []string) (float64, error)
TotalWordCount returns the sum of word counts for the given document IDs