Documentation
Overview
Package rasterization converts document pages to images. It provides a generic Rasterizer interface and a Registry that dispatches by file extension. PDFs are supported today, and the design is extensible to spreadsheets, presentations, and other document types.
Types
type Cache
type Cache struct {
	// contains filtered or unexported fields
}
Cache renders document pages on demand and caches the resulting JPEGs on the filesystem. The cache is ephemeral: it is lost on container restart, and pages are rendered again the next time they are requested.
func NewCache
NewCache creates a Cache backed by the given Registry. The cacheDir is created if it does not exist.
type OfficeImageExtractor (added in v1.3.1)
type OfficeImageExtractor struct{}
OfficeImageExtractor extracts embedded images from Office Open XML documents (docx, pptx, xlsx). These formats are ZIP archives with media files at predictable paths.
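To illustrate the "predictable paths" point, the sketch below lists media entries in an Office Open XML archive using only the standard library. It is not OfficeImageExtractor's code: listMedia, buildArchive, and mediaDirs are hypothetical names, and the folder list reflects the conventional word/media/, ppt/media/, and xl/media/ locations.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
	"strings"
)

// mediaDirs are the conventional media folders in docx, pptx and xlsx files.
var mediaDirs = []string{"word/media/", "ppt/media/", "xl/media/"}

// listMedia returns the names of archive entries stored under a media folder.
func listMedia(data []byte) ([]string, error) {
	r, err := zip.NewReader(bytes.NewReader(data), int64(len(data)))
	if err != nil {
		return nil, err
	}
	var names []string
	for _, f := range r.File {
		for _, dir := range mediaDirs {
			// Skip directory entries, which end in a slash.
			if strings.HasPrefix(f.Name, dir) && !strings.HasSuffix(f.Name, "/") {
				names = append(names, f.Name)
				break
			}
		}
	}
	return names, nil
}

// buildArchive creates an in-memory ZIP with empty entries, for the demo only.
func buildArchive(names ...string) ([]byte, error) {
	var buf bytes.Buffer
	w := zip.NewWriter(&buf)
	for _, name := range names {
		if _, err := w.Create(name); err != nil {
			return nil, err
		}
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	data, err := buildArchive("[Content_Types].xml", "word/document.xml", "word/media/image1.png")
	if err != nil {
		panic(err)
	}
	names, err := listMedia(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // prints "[word/media/image1.png]"
}
```

Each matching entry could then be opened with its Open method and passed to image.Decode to obtain an image.Image.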
func NewOfficeImageExtractor (added in v1.3.1)
func NewOfficeImageExtractor() *OfficeImageExtractor
NewOfficeImageExtractor creates an OfficeImageExtractor.
func (*OfficeImageExtractor) Close (added in v1.3.1)
func (o *OfficeImageExtractor) Close() error
Close is a no-op because OfficeImageExtractor holds no persistent state.
type PdfiumRasterizer
type PdfiumRasterizer struct {
	// contains filtered or unexported fields
}
PdfiumRasterizer renders PDF pages using the PDFium library via WebAssembly (Wazero). No CGO or system libraries are required because the PDFium WASM binary is embedded in the go-pdfium module.
The underlying pdfium.Pdfium instance wraps a single WASM module whose internal state is not safe for concurrent use: parallel calls corrupt the module's memory and function tables, after which every subsequent call fails until the process restarts. The unexported mu field serialises all calls into the instance.
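The serialisation pattern described above can be sketched with a plain sync.Mutex. The engine and guarded types below are hypothetical stand-ins; they do not use go-pdfium and only show how a lock keeps one goroutine at a time inside a shared, non-thread-safe instance.

```go
package main

import (
	"fmt"
	"sync"
)

// engine is a hypothetical stand-in for a renderer that is unsafe to share.
type engine struct {
	calls int // plain int: unguarded concurrent writes would be a data race
}

// render pretends to render a page and records that it was called.
func (e *engine) render(page int) int {
	e.calls++
	return page
}

// guarded wraps the engine so only one goroutine is inside it at a time.
type guarded struct {
	mu  sync.Mutex
	eng engine
}

// render takes the lock for the full duration of the engine call.
func (g *guarded) render(page int) int {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.eng.render(page)
}

// renderConcurrently issues n renders from n goroutines and returns how many
// calls reached the engine.
func renderConcurrently(n int) int {
	var g guarded
	var wg sync.WaitGroup
	for page := 1; page <= n; page++ {
		wg.Add(1)
		go func(p int) {
			defer wg.Done()
			g.render(p)
		}(page)
	}
	wg.Wait()
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.eng.calls
}

func main() {
	fmt.Println(renderConcurrently(100)) // prints "100"
}
```

The trade-off is throughput: with a single lock, renders run one after another regardless of how many goroutines request them.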
func NewPdfiumRasterizer
func NewPdfiumRasterizer() (*PdfiumRasterizer, error)
NewPdfiumRasterizer initialises PDFium via the Wazero WebAssembly runtime and returns a Rasterizer for PDF files.
func (*PdfiumRasterizer) Close
func (r *PdfiumRasterizer) Close() error
Close releases all PDFium resources.
type Rasterizer
type Rasterizer interface {
	io.Closer
	// PageCount returns the number of renderable pages in the document.
	PageCount(path string) (int, error)
	// Render returns the given 1-based page as an image.
	Render(path string, page int) (image.Image, error)
}
Rasterizer converts document pages to images. For PDFs this means pages; for spreadsheets, sheets; for presentations, slides.
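A caller typically combines PageCount and Render in a loop that starts at page 1. The sketch below redeclares the interface locally so that it compiles on its own; blankDoc and renderAll are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"image"
	"io"
)

// Rasterizer mirrors the interface documented above.
type Rasterizer interface {
	io.Closer
	PageCount(path string) (int, error)
	Render(path string, page int) (image.Image, error)
}

// blankDoc is a hypothetical Rasterizer reporting a fixed number of blank pages.
type blankDoc struct{ pages int }

func (b blankDoc) Close() error { return nil }

func (b blankDoc) PageCount(string) (int, error) { return b.pages, nil }

func (b blankDoc) Render(_ string, page int) (image.Image, error) {
	if page < 1 || page > b.pages {
		return nil, fmt.Errorf("page %d out of range 1..%d", page, b.pages)
	}
	return image.NewRGBA(image.Rect(0, 0, 10, 10)), nil
}

// renderAll renders every page, numbering from 1 as Render expects.
func renderAll(r Rasterizer, path string) ([]image.Image, error) {
	n, err := r.PageCount(path)
	if err != nil {
		return nil, err
	}
	pages := make([]image.Image, 0, n)
	for page := 1; page <= n; page++ {
		img, err := r.Render(path, page)
		if err != nil {
			return nil, fmt.Errorf("render page %d: %w", page, err)
		}
		pages = append(pages, img)
	}
	return pages, nil
}

func main() {
	var r Rasterizer = blankDoc{pages: 3}
	defer r.Close()

	pages, err := renderAll(r, "slides.pptx")
	if err != nil {
		panic(err)
	}
	fmt.Println(len(pages)) // prints "3"
}
```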
type Registry
type Registry struct {
	// contains filtered or unexported fields
}
Registry maps file extensions to Rasterizer implementations.
func (*Registry) For
func (r *Registry) For(ext string) (Rasterizer, bool)
For returns the Rasterizer for the given extension, or nil and false if none is registered.
func (*Registry) Register
func (r *Registry) Register(ext string, rasterizer Rasterizer)
Register associates a file extension (e.g. ".pdf") with a Rasterizer.
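Dispatch by extension can be pictured as a map lookup keyed on strings such as ".pdf". The sketch below is a stand-alone illustration with the same method shapes as Register and For; the registry, stub, and lookup names are hypothetical, and lower-casing the extension before lookup is an assumption of the example rather than documented behaviour.

```go
package main

import (
	"fmt"
	"image"
	"io"
	"path/filepath"
	"strings"
)

// Rasterizer mirrors the interface documented for this package.
type Rasterizer interface {
	io.Closer
	PageCount(path string) (int, error)
	Render(path string, page int) (image.Image, error)
}

// stub is a do-nothing Rasterizer used only to populate the registry.
type stub struct{}

func (stub) Close() error                  { return nil }
func (stub) PageCount(string) (int, error) { return 1, nil }
func (stub) Render(string, int) (image.Image, error) {
	return image.NewRGBA(image.Rect(0, 0, 1, 1)), nil
}

// registry is an illustrative map-backed lookup, not the package's Registry.
type registry struct {
	byExt map[string]Rasterizer
}

// Register associates an extension such as ".pdf" with a Rasterizer.
func (r *registry) Register(ext string, rz Rasterizer) {
	if r.byExt == nil {
		r.byExt = make(map[string]Rasterizer)
	}
	r.byExt[ext] = rz
}

// For returns the Rasterizer for ext, or nil and false if none is registered.
func (r *registry) For(ext string) (Rasterizer, bool) {
	rz, ok := r.byExt[ext]
	return rz, ok
}

// lookup derives the extension from a file name before consulting the registry.
// Lower-casing is an assumption made by this sketch.
func lookup(r *registry, name string) (Rasterizer, bool) {
	return r.For(strings.ToLower(filepath.Ext(name)))
}

func main() {
	var reg registry
	reg.Register(".pdf", stub{})

	_, okPDF := lookup(&reg, "Report.PDF")
	_, okTXT := lookup(&reg, "notes.txt")
	fmt.Println(okPDF, okTXT) // prints "true false"
}
```

Note that filepath.Ext includes the leading dot, which matches the ".pdf" form shown for Register.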
type StandaloneImage (added in v1.3.1)
type StandaloneImage struct{}
StandaloneImage is a Rasterizer for plain image files. Each file is treated as a single "page".
func NewStandaloneImage (added in v1.3.1)
func NewStandaloneImage() *StandaloneImage
NewStandaloneImage creates a StandaloneImage rasterizer.
func (*StandaloneImage) Close (added in v1.3.1)
func (s *StandaloneImage) Close() error
Close is a no-op.