Documentation ¶
Overview ¶
Package normalisers provides implementations of the Normaliser interface for various document formats. Each normaliser knows how to extract text content from a specific MIME type.
Normalisers are registered with the Registry at startup.
Types ¶
type Registry ¶
type Registry struct {
// contains filtered or unexported fields
}
Registry manages normaliser registrations.
func NewRegistry ¶
func NewRegistry() *Registry
NewRegistry creates a new normaliser registry with default normalisers.
func (*Registry) Normalise ¶
func (r *Registry) Normalise(ctx context.Context, raw *domain.RawDocument) (*driven.NormaliseResult, error)
Normalise transforms a raw document using the best matching normaliser.
func (*Registry) Register ¶
func (r *Registry) Register(n driven.Normaliser)
Register adds a normaliser to the registry.
func (*Registry) SupportedMIMETypes ¶
SupportedMIMETypes returns all MIME types that can be normalised.
Directories ¶
| Path | Synopsis |
|---|---|
| docx | Package docx provides a Normaliser implementation for Microsoft Word DOCX files. |
| github | Package github provides normalisers for GitHub-specific content types. |
| html | Package html provides a Normaliser implementation for HTML documents. |
| markdown | Package markdown provides a Normaliser implementation for Markdown files. |
| pdf | Package pdf provides a Normaliser implementation for PDF files. |
| plaintext | Package plaintext provides a Normaliser implementation for plain text files. |