normalisers

package
v0.2.3
Published: Dec 17, 2025 License: Apache-2.0 Imports: 15 Imported by: 0

Documentation

Overview

Package normalisers provides implementations of the Normaliser interface for various document formats. Each normaliser knows how to extract text content from a specific MIME type.

Normalisers are registered with the Registry at startup.

Types

type Registry

type Registry struct {
	// contains filtered or unexported fields
}

Registry manages normaliser registrations.

func NewRegistry

func NewRegistry() *Registry

NewRegistry creates a new normaliser registry with default normalisers.

func (*Registry) Normalise

func (r *Registry) Normalise(ctx context.Context, raw *domain.RawDocument) (*driven.NormaliseResult, error)

Normalise transforms a raw document using the best matching normaliser.

func (*Registry) Register

func (r *Registry) Register(n driven.Normaliser)

Register adds a normaliser to the registry.

func (*Registry) SupportedMIMETypes

func (r *Registry) SupportedMIMETypes() []string

SupportedMIMETypes returns all MIME types that can be normalised.

Directories

Package docx provides a Normaliser implementation for Microsoft Word DOCX files.
Package github provides normalisers for GitHub-specific content types.
Package html provides a Normaliser implementation for HTML documents.
Package markdown provides a Normaliser implementation for Markdown files.
Package notion provides normalisers for Notion documents.
Package pdf provides a Normaliser implementation for PDF files.
Package plaintext provides a Normaliser implementation for plain text files.