Documentation ¶
Overview ¶
Package index provides interfaces for indexing document metadata and retrieving that metadata from the index. Currently, there is only one implementation of those interfaces, which uses Bleve.
Index ¶
- Constants
- Variables
- func CreateAuthorsIndex(path string) bleve.Index
- func CreateAuthorsMapping() mapping.IndexMapping
- func CreateDocumentsIndex(path string) bleve.Index
- func CreateDocumentsMapping() mapping.IndexMapping
- func MigrateAuthors(oldIndex, newIndex bleve.Index, batchSize int) error
- func MigrateDocuments(oldIndex, newIndex bleve.Index, batchSize int) error
- func NeedsReindexForIllustratedConfig(documentsIndex bleve.Index, currentMinSize float64) (bool, error)
- type Author
- type BleveIndexer
- func (b *BleveIndexer) AddLibrary(batchSize int, forceIndexing bool) error
- func (b *BleveIndexer) Author(slug, lang string) (Author, error)
- func (b *BleveIndexer) Close() error
- func (b *BleveIndexer) Count() (uint64, error)
- func (b *BleveIndexer) Cover(slug string, coverMaxWidth int) ([]byte, error)
- func (b *BleveIndexer) DeleteDocument(slug string) error
- func (b *BleveIndexer) Document(slug string) (Document, error)
- func (b *BleveIndexer) DocumentByID(ID string) (Document, error)
- func (b *BleveIndexer) Documents(slugs []string) (map[string]Document, error)
- func (b *BleveIndexer) File(slug string) (*IndexedFile, error)
- func (b *BleveIndexer) IndexAuthor(author Author) error
- func (b *BleveIndexer) IndexingProgress() (Progress, error)
- func (b *BleveIndexer) Languages() ([]string, error)
- func (b *BleveIndexer) LatestDocs(limit int) ([]Document, error)
- func (b *BleveIndexer) NewFile(fileName string, contents []byte) (string, error)
- func (b *BleveIndexer) SameAuthors(slugID string, quantity int) ([]Document, error)
- func (b *BleveIndexer) SameSeries(slugID string, quantity int) ([]Document, error)
- func (b *BleveIndexer) SameSubjects(slugID string, quantity int) ([]Document, error)
- func (b *BleveIndexer) Search(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
- func (b *BleveIndexer) SearchByAuthor(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
- func (b *BleveIndexer) SearchBySeries(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
- func (b *BleveIndexer) Slug(document Document, batchSlugs map[string]struct{}) string
- func (b *BleveIndexer) StartFileWatcher()
- func (b *BleveIndexer) Subjects() (map[string][]string, error)
- func (b *BleveIndexer) TotalWordCount(slugs []string) (float64, error)
- type Config
- type Document
- type IndexedFile
- type Progress
- type SearchFields
Constants ¶
const AuthorVersion = "1"
AuthorVersion identifies the mapping used for indexing authors. Any change to the mapping requires increasing the version, to signal that a new index needs to be created.
const DocumentVersion = "v11"
DocumentVersion identifies the mapping used for indexing documents. Any change to the mapping requires increasing the version, to signal that a new index needs to be created.
Variables ¶
var ErrDocumentNotFound = errors.New("document not found")
ErrDocumentNotFound is returned when a document cannot be found by slug.
Functions ¶
func CreateAuthorsIndex ¶ added in v4.16.0
func CreateAuthorsMapping ¶ added in v4.16.0
func CreateAuthorsMapping() mapping.IndexMapping
func CreateDocumentsIndex ¶ added in v4.16.0
func CreateDocumentsMapping ¶ added in v4.16.0
func CreateDocumentsMapping() mapping.IndexMapping
func MigrateAuthors ¶ added in v4.16.0
MigrateAuthors migrates all authors from a legacy index to a new index. It searches only for items with Type = "author" and deletes them immediately after migration to avoid pagination issues and free up disk space.
func MigrateDocuments ¶ added in v4.16.0
MigrateDocuments migrates all documents from a legacy index to a new documents index in batches. It always loads the first 1000 documents to avoid pagination issues, and deletes them from the legacy index immediately after migration to limit disk usage.
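The "always load the first batch, then delete it" approach described for both migration functions can be illustrated without Bleve. This is a sketch under stated assumptions: store, firstIDs and migrate are hypothetical names, and an in-memory map stands in for both indexes. Because every migrated item is removed from the source, the next query for "the first N" naturally returns the following items, so no page offset has to be tracked.

```go
package main

import (
	"fmt"
	"sort"
)

// store is a hypothetical in-memory stand-in for an index: ID -> document.
type store map[string]string

// firstIDs returns up to n IDs in sorted order, mimicking a query that
// always asks for the first page of results.
func firstIDs(s store, n int) []string {
	ids := make([]string, 0, len(s))
	for id := range s {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	if len(ids) > n {
		ids = ids[:n]
	}
	return ids
}

// migrate moves every document from old to fresh in batches and reports
// how many documents were moved.
func migrate(old, fresh store, batchSize int) int {
	if batchSize < 1 {
		return 0
	}
	moved := 0
	for {
		ids := firstIDs(old, batchSize)
		if len(ids) == 0 {
			return moved
		}
		for _, id := range ids {
			fresh[id] = old[id]
			delete(old, id) // deleting shrinks the source, so no offset is needed
			moved++
		}
	}
}

func main() {
	old := store{"a": "A", "b": "B", "c": "C"}
	fresh := store{}
	fmt.Println(migrate(old, fresh, 2), len(old), len(fresh)) // prints "3 0 3"
}
```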
func NeedsReindexForIllustratedConfig ¶ added in v4.20.0
func NeedsReindexForIllustratedConfig(documentsIndex bleve.Index, currentMinSize float64) (bool, error)
NeedsReindexForIllustratedConfig reports whether the documents index must be rebuilt because the stored illustrated-min-size config differs from currentMinSize (or is missing).
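The decision rule stated above (rebuild when the stored value is missing or differs) can be sketched as follows. This is not the package's implementation: needsReindex is a hypothetical function, and a plain map with an illustrative key stands in for wherever the index keeps the stored setting.

```go
package main

import "fmt"

// needsReindex reports whether a rebuild is required, given the values
// stored alongside the index (a hypothetical map) and the current setting.
func needsReindex(stored map[string]float64, key string, currentMinSize float64) bool {
	value, ok := stored[key]
	if !ok {
		// Nothing stored yet: the index predates the setting, so rebuild.
		return true
	}
	// Both values originate from configuration, so exact comparison is
	// acceptable in this sketch.
	return value != currentMinSize
}

func main() {
	stored := map[string]float64{"illustrated-min-size": 0.5}
	fmt.Println(needsReindex(stored, "illustrated-min-size", 0.5))                 // false
	fmt.Println(needsReindex(stored, "illustrated-min-size", 1.0))                 // true
	fmt.Println(needsReindex(map[string]float64{}, "illustrated-min-size", 0.5)) // true
}
```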
Types ¶
type Author ¶ added in v4.3.0
type Author struct {
Slug string
Name string
BirthName string
DataSourceID string
RetrievedOn time.Time
WikipediaLink map[string]string
InstanceOf float64
Description map[string]string
DateOfBirth precisiondate.PrecisionDate
DateOfDeath precisiondate.PrecisionDate
Website string
DataSourceImage string
Gender float64
Pseudonyms []string
}
func (Author) BirthNameIncludesName ¶ added in v4.7.0
func (Author) YearOfBirthAbs ¶ added in v4.7.0
func (Author) YearOfDeathAbs ¶ added in v4.7.0
type BleveIndexer ¶
type BleveIndexer struct {
// contains filtered or unexported fields
}
func NewBleve ¶
func NewBleve(documentsIndex bleve.Index, authorsIndex bleve.Index, fs afero.Fs, libraryPath string, read map[string]metadata.Reader, cfg Config) *BleveIndexer
NewBleve creates a new BleveIndexer instance using the passed parameters.
func (*BleveIndexer) AddLibrary ¶
func (b *BleveIndexer) AddLibrary(batchSize int, forceIndexing bool) error
AddLibrary scans <libraryPath> for documents and adds them to the index in batches of <batchSize> if they have not been indexed previously or if <forceIndexing> is true.
func (*BleveIndexer) Author ¶ added in v4.3.0
func (b *BleveIndexer) Author(slug, lang string) (Author, error)
func (*BleveIndexer) Count ¶
func (b *BleveIndexer) Count() (uint64, error)
Count returns the number of indexed documents.
func (*BleveIndexer) Cover ¶ added in v4.21.0
func (b *BleveIndexer) Cover(slug string, coverMaxWidth int) ([]byte, error)
Cover returns the cover image for the document identified by slug, resized to at most coverMaxWidth pixels wide.
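To make "at most coverMaxWidth pixels wide" concrete, here is a sketch of how target dimensions could be computed. It assumes the aspect ratio is preserved and that images narrower than the limit are left untouched; neither assumption is confirmed by the documentation above, and fitWidth is a hypothetical helper.

```go
package main

import "fmt"

// fitWidth returns the dimensions of an image scaled down so that its
// width does not exceed maxWidth, keeping the aspect ratio.
// Images already within the limit are returned unchanged (an assumption).
func fitWidth(width, height, maxWidth int) (int, int) {
	if maxWidth <= 0 || width <= maxWidth {
		return width, height
	}
	newHeight := height * maxWidth / width
	if newHeight < 1 {
		newHeight = 1 // never collapse an image to zero height
	}
	return maxWidth, newHeight
}

func main() {
	fmt.Println(fitWidth(1200, 1800, 600)) // prints "600 900"
	fmt.Println(fitWidth(400, 600, 600))   // prints "400 600"
}
```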
func (*BleveIndexer) DeleteDocument ¶ added in v4.21.0
func (b *BleveIndexer) DeleteDocument(slug string) error
DeleteDocument removes the document identified by slug from the index and deletes its file from the filesystem.
func (*BleveIndexer) DocumentByID ¶ added in v4.9.0
func (b *BleveIndexer) DocumentByID(ID string) (Document, error)
Deprecated: to be removed after migration.
func (*BleveIndexer) Documents ¶
func (b *BleveIndexer) Documents(slugs []string) (map[string]Document, error)
Documents returns documents for the given slugs in a single search. Missing or invalid slugs are omitted.
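Because missing or invalid slugs are omitted rather than reported as errors, callers need to check for presence in the returned map. The sketch below mimics that contract with a hypothetical documents function over a plain map of titles; it is not the Bleve-backed method.

```go
package main

import "fmt"

// documents mimics the "omit what is missing" contract: the result only
// contains entries for slugs that exist in the library.
func documents(library map[string]string, slugs []string) map[string]string {
	found := make(map[string]string, len(slugs))
	for _, slug := range slugs {
		if title, ok := library[slug]; ok {
			found[slug] = title
		}
	}
	return found
}

func main() {
	library := map[string]string{"dune": "Dune", "emma": "Emma"}
	requested := []string{"dune", "unknown", "emma"}
	found := documents(library, requested)

	// Iterate over the requested slugs to keep their order, and skip
	// the ones that were not returned.
	for _, slug := range requested {
		title, ok := found[slug]
		if !ok {
			fmt.Println(slug, "-> not indexed")
			continue
		}
		fmt.Println(slug, "->", title)
	}
}
```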
func (*BleveIndexer) File ¶ added in v4.21.0
func (b *BleveIndexer) File(slug string) (*IndexedFile, error)
File returns the raw document payload and metadata for the given slug.
func (*BleveIndexer) IndexAuthor ¶ added in v4.7.0
func (b *BleveIndexer) IndexAuthor(author Author) error
func (*BleveIndexer) IndexingProgress ¶
func (b *BleveIndexer) IndexingProgress() (Progress, error)
func (*BleveIndexer) Languages ¶ added in v4.16.0
func (b *BleveIndexer) Languages() ([]string, error)
Languages returns a list of all unique languages in the indexed documents using faceted search.
func (*BleveIndexer) LatestDocs ¶ added in v4.7.0
func (b *BleveIndexer) LatestDocs(limit int) ([]Document, error)
func (*BleveIndexer) NewFile ¶ added in v4.21.0
func (b *BleveIndexer) NewFile(fileName string, contents []byte) (string, error)
NewFile writes the given contents to the library as fileName, indexes it, and returns the document slug.
func (*BleveIndexer) SameAuthors ¶
func (b *BleveIndexer) SameAuthors(slugID string, quantity int) ([]Document, error)
SameAuthors returns an array of metadata of documents by the same authors which do not belong to the same collection.
func (*BleveIndexer) SameSeries ¶
func (b *BleveIndexer) SameSeries(slugID string, quantity int) ([]Document, error)
SameSeries returns an array of metadata of documents in the same series.
func (*BleveIndexer) SameSubjects ¶
func (b *BleveIndexer) SameSubjects(slugID string, quantity int) ([]Document, error)
SameSubjects returns an array of metadata of documents by other authors which have subjects similar to those of the passed document and do not belong to the same collection. They are sorted by number of matching subjects and by date, with those closest to the publishing date of the reference document first.
func (*BleveIndexer) Search ¶
func (b *BleveIndexer) Search(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
Search looks for documents which match the passed keywords and filters. It returns a maximum of <resultsPerPage> documents, offset by <page>.
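The relationship between <page>, <resultsPerPage> and the offset into the result set can be sketched as below. The sketch assumes pages are numbered from 1, which the documentation above does not state; offset and totalPages are hypothetical helpers, not part of this package.

```go
package main

import "fmt"

// offset returns the index of the first result on the given page,
// assuming 1-based page numbers (an assumption of this sketch).
func offset(page, resultsPerPage int) int {
	if page < 1 {
		page = 1
	}
	return (page - 1) * resultsPerPage
}

// totalPages returns how many pages are needed to show totalHits results.
func totalPages(totalHits, resultsPerPage int) int {
	if resultsPerPage < 1 {
		return 0
	}
	// Integer ceiling division.
	return (totalHits + resultsPerPage - 1) / resultsPerPage
}

func main() {
	fmt.Println(offset(3, 10))      // prints "20"
	fmt.Println(totalPages(45, 10)) // prints "5"
}
```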
func (*BleveIndexer) SearchByAuthor ¶ added in v4.3.0
func (b *BleveIndexer) SearchByAuthor(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
func (*BleveIndexer) SearchBySeries ¶ added in v4.9.0
func (b *BleveIndexer) SearchBySeries(searchFields SearchFields, page, resultsPerPage int) (result.Paginated[[]Document], error)
func (*BleveIndexer) Slug ¶
func (b *BleveIndexer) Slug(document Document, batchSlugs map[string]struct{}) string
Because the Bleve index is not updated until the batch is executed, the slugs processed in the current batch are kept in memory (batchSlugs) so that the current document's slug can also be compared against them.
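The reason for checking two places can be shown with a small sketch. Everything here is an assumption made for illustration: uniqueSlug is a hypothetical function, the numeric suffix scheme is invented, and a callback stands in for the query against the committed index. The point is only that a candidate must be free both in the index and in the not-yet-executed batch.

```go
package main

import "fmt"

// uniqueSlug returns base, or base with a numeric suffix, such that the
// result is neither in the committed index nor in the pending batch.
// The chosen slug is recorded in batchSlugs for later documents.
func uniqueSlug(base string, inIndex func(string) bool, batchSlugs map[string]struct{}) string {
	candidate := base
	for i := 2; ; i++ {
		_, inBatch := batchSlugs[candidate]
		if !inBatch && !inIndex(candidate) {
			batchSlugs[candidate] = struct{}{}
			return candidate
		}
		// Hypothetical suffix scheme: dune, dune-2, dune-3, ...
		candidate = fmt.Sprintf("%s-%d", base, i)
	}
}

func main() {
	indexed := map[string]bool{"dune": true}
	inIndex := func(slug string) bool { return indexed[slug] }
	batch := map[string]struct{}{}

	fmt.Println(uniqueSlug("dune", inIndex, batch)) // prints "dune-2"
	// The index still knows nothing about "dune-2"; only the batch does.
	fmt.Println(uniqueSlug("dune", inIndex, batch)) // prints "dune-3"
}
```

Without the in-memory set, the second call would also return "dune-2", and the two documents would collide once the batch is executed.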
func (*BleveIndexer) StartFileWatcher ¶ added in v4.21.0
func (b *BleveIndexer) StartFileWatcher()
StartFileWatcher starts watching the library path for file changes and updates the index. It blocks until the process exits. Call it in a goroutine.
func (*BleveIndexer) Subjects ¶ added in v4.18.0
func (b *BleveIndexer) Subjects() (map[string][]string, error)
Subjects returns subject groups: each slug with all display names that map to it. Uses Subjects field for faceting; names are normalized (first letter capitalized). Grouping uses slug.Make so variants like "cronica" and "crónica" share one slug (slug transliterates accents).
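The grouping described above can be approximated with the standard library alone. In this sketch simpleSlug is a deliberately small stand-in for slug.Make (it only handles a few accented letters), and capitalize and groupSubjects are hypothetical helpers; the real method reads the names from a Bleve facet rather than a slice.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
	"unicode/utf8"
)

// accents transliterates a handful of accented letters; slug.Make covers
// far more cases than this stand-in.
var accents = strings.NewReplacer("á", "a", "é", "e", "í", "i", "ó", "o", "ú", "u", "ü", "u", "ñ", "n")

// simpleSlug lowercases, strips the accents above and replaces every
// other non-alphanumeric character with a hyphen.
func simpleSlug(s string) string {
	s = accents.Replace(strings.ToLower(s))
	var b strings.Builder
	for _, r := range s {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			b.WriteRune(r)
		} else {
			b.WriteByte('-')
		}
	}
	return strings.Trim(b.String(), "-")
}

// capitalize upper-cases the first letter and leaves the rest untouched.
func capitalize(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	if size == 0 {
		return s
	}
	return string(unicode.ToUpper(r)) + s[size:]
}

// groupSubjects maps each slug to the distinct display names sharing it.
func groupSubjects(names []string) map[string][]string {
	seen := map[string]map[string]struct{}{}
	groups := map[string][]string{}
	for _, name := range names {
		display := capitalize(strings.TrimSpace(name))
		if display == "" {
			continue
		}
		key := simpleSlug(display)
		if seen[key] == nil {
			seen[key] = map[string]struct{}{}
		}
		if _, dup := seen[key][display]; dup {
			continue
		}
		seen[key][display] = struct{}{}
		groups[key] = append(groups[key], display)
	}
	for key := range groups {
		sort.Strings(groups[key])
	}
	return groups
}

func main() {
	groups := groupSubjects([]string{"cronica", "crónica", "Cronica", "history"})
	fmt.Println(groups["cronica"]) // prints "[Cronica Crónica]"
	fmt.Println(groups["history"]) // prints "[History]"
}
```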
func (*BleveIndexer) TotalWordCount ¶ added in v4.15.0
func (b *BleveIndexer) TotalWordCount(slugs []string) (float64, error)
TotalWordCount returns the sum of word counts for the documents matching the given slugs.
type Config ¶ added in v4.20.0
type Config struct {
// IllustratedMinAmount is the minimum number of illustrations (excluding cover) for a document to be considered illustrated.
IllustratedMinAmount int
// IllustratedMinSize is the minimum size in megapixels for an image to count as an illustration.
IllustratedMinSize float64
}
Config holds indexer configuration.
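How the two fields could combine into an "illustrated" decision is sketched below. The Config type is re-declared locally so the example is self-contained; image, megapixels and isIllustrated are hypothetical, and treating both thresholds as inclusive (>=) is an assumption drawn from the word "minimum" in the field comments.

```go
package main

import "fmt"

// Config mirrors the fields documented above (local copy for the sketch).
type Config struct {
	IllustratedMinAmount int
	IllustratedMinSize   float64
}

// image is a hypothetical description of one non-cover image.
type image struct {
	Width, Height int
}

// megapixels converts pixel dimensions to megapixels.
func megapixels(img image) float64 {
	return float64(img.Width*img.Height) / 1e6
}

// isIllustrated counts the images that are large enough and compares the
// count with the configured minimum. The cover must already be excluded.
func isIllustrated(cfg Config, images []image) bool {
	count := 0
	for _, img := range images {
		if megapixels(img) >= cfg.IllustratedMinSize {
			count++
		}
	}
	return count >= cfg.IllustratedMinAmount
}

func main() {
	images := []image{{1000, 1000}, {800, 600}, {1200, 900}} // 1.0, 0.48 and 1.08 MP
	fmt.Println(isIllustrated(Config{IllustratedMinAmount: 2, IllustratedMinSize: 0.5}, images)) // true
	fmt.Println(isIllustrated(Config{IllustratedMinAmount: 3, IllustratedMinSize: 0.5}, images)) // false
}
```

Because the size threshold changes which images count, a document's status depends on the configured value at indexing time, which is consistent with NeedsReindexForIllustratedConfig requiring a rebuild when that value changes.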
type Document ¶
type IndexedFile ¶ added in v4.21.0
IndexedFile holds the bytes and metadata for a document download.