Documentation ¶
Overview ¶
Package ocr provides optical character recognition (OCR) using Tesseract. It extracts text from images so that the text can be indexed for full-text search.
Index ¶
- Variables
- func FormatOutput(result *Result, format string) (string, error)
- func GetLanguageName(code string) string
- type Box
- type Client
- func (c *Client) ExtractText(ctx context.Context, image []byte, mimeType string) (string, error)
- func (c *Client) ExtractTextToHOCR(ctx context.Context, image []byte, mimeType string) (string, error)
- func (c *Client) ExtractTextWithLayout(ctx context.Context, image []byte, mimeType string) (*Result, error)
- func (c *Client) GetAvailableLanguages(ctx context.Context) ([]string, error)
- func (c *Client) GetVersion(ctx context.Context) (string, error)
- func (c *Client) IsAvailable(ctx context.Context) bool
- func (c *Client) IsSupported(mimeType string) bool
- type Config
- func ConfigFromEnv() *Config
- func DefaultConfig() *Config
- type Line
- type Result
- type Word
Variables ¶
var SupportedMimeTypes = []string{
"image/png",
"image/jpeg",
"image/jpg",
"image/gif",
"image/bmp",
"image/webp",
}
Supported image MIME types for OCR.
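A membership check against this list can be sketched as follows. This is a minimal illustration, not the package's implementation: `supportedMimeTypes` is a local copy of the list and `isSupported` is a hypothetical helper that uses an exact, case-sensitive comparison, which may differ from how `(*Client).IsSupported` behaves.

```go
package main

import "fmt"

// supportedMimeTypes mirrors the SupportedMimeTypes list documented above.
var supportedMimeTypes = []string{
	"image/png",
	"image/jpeg",
	"image/jpg",
	"image/gif",
	"image/bmp",
	"image/webp",
}

// isSupported is a hypothetical helper: it reports whether mimeType
// appears in the list, using an exact, case-sensitive comparison.
func isSupported(mimeType string) bool {
	for _, t := range supportedMimeTypes {
		if t == mimeType {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isSupported("image/png"))       // true
	fmt.Println(isSupported("application/pdf")) // false
}
```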
Functions ¶
func FormatOutput ¶
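func FormatOutput(result *Result, format string) (string, error)
FormatOutput formats an OCR result according to format and returns it as a string.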
func GetLanguageName ¶
func GetLanguageName(code string) string
GetLanguageName returns the full name of the language identified by code.
Types ¶
type Box ¶
type Box struct {
X int32 `json:"x"`
Y int32 `json:"y"`
Width int32 `json:"width"`
Height int32 `json:"height"`
}
Box represents a bounding box.
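As one illustration of working with bounding boxes, the sketch below merges several boxes into the smallest box that covers them all, for example to derive a line box from word boxes. It assumes that (X, Y) is the top-left corner and that Width and Height extend right and down; the documentation does not state this. `union` is a hypothetical helper and `Box` here is a local copy of the type without its JSON tags.

```go
package main

import "fmt"

// Box is a local copy of the documented type (JSON tags omitted).
type Box struct {
	X, Y, Width, Height int32
}

// union returns the smallest Box covering every box in boxes.
// ok is false when boxes is empty.
// Assumption: (X, Y) is the top-left corner of each box.
func union(boxes []Box) (u Box, ok bool) {
	if len(boxes) == 0 {
		return Box{}, false
	}
	left, top := boxes[0].X, boxes[0].Y
	right, bottom := left+boxes[0].Width, top+boxes[0].Height
	for _, b := range boxes[1:] {
		if b.X < left {
			left = b.X
		}
		if b.Y < top {
			top = b.Y
		}
		if r := b.X + b.Width; r > right {
			right = r
		}
		if bt := b.Y + b.Height; bt > bottom {
			bottom = bt
		}
	}
	return Box{X: left, Y: top, Width: right - left, Height: bottom - top}, true
}

func main() {
	words := []Box{
		{X: 10, Y: 20, Width: 40, Height: 12},
		{X: 55, Y: 18, Width: 30, Height: 14},
	}
	line, _ := union(words)
	fmt.Printf("%+v\n", line) // {X:10 Y:18 Width:75 Height:14}
}
```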
type Client ¶
type Client struct {
// contains filtered or unexported fields
}
Client provides OCR functionality.
func (*Client) ExtractText ¶
func (c *Client) ExtractText(ctx context.Context, image []byte, mimeType string) (string, error)
ExtractText extracts text from an image using Tesseract OCR.
func (*Client) ExtractTextToHOCR ¶
func (c *Client) ExtractTextToHOCR(ctx context.Context, image []byte, mimeType string) (string, error)
ExtractTextToHOCR extracts text in hOCR format (HTML annotated with position information).
func (*Client) ExtractTextWithLayout ¶
func (c *Client) ExtractTextWithLayout(ctx context.Context, image []byte, mimeType string) (*Result, error)
ExtractTextWithLayout extracts text together with layout information (bounding boxes).
func (*Client) GetAvailableLanguages ¶
func (c *Client) GetAvailableLanguages(ctx context.Context) ([]string, error)
GetAvailableLanguages returns the list of available languages.
func (*Client) GetVersion ¶
func (c *Client) GetVersion(ctx context.Context) (string, error)
GetVersion returns the Tesseract version.
func (*Client) IsAvailable ¶
func (c *Client) IsAvailable(ctx context.Context) bool
IsAvailable reports whether Tesseract is available.
func (*Client) IsSupported ¶
func (c *Client) IsSupported(mimeType string) bool
IsSupported reports whether mimeType is supported for OCR.
type Config ¶
type Config struct {
// TesseractPath is the path to the tesseract executable
TesseractPath string
// DataPath is the path to the tessdata directory (optional)
DataPath string
// Languages are the languages to use for OCR (e.g., "chi_sim+eng")
Languages string
}
Config holds the OCR configuration.
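The Languages field uses Tesseract's convention of joining language codes with "+". The sketch below splits such a value into individual codes, which could be useful for checking each code against the list returned by GetAvailableLanguages. `splitLanguages` is a hypothetical helper and is not part of this package.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLanguages (hypothetical) breaks a Languages value such as
// "chi_sim+eng" into individual codes, skipping empty segments.
func splitLanguages(languages string) []string {
	var codes []string
	for _, code := range strings.Split(languages, "+") {
		if code = strings.TrimSpace(code); code != "" {
			codes = append(codes, code)
		}
	}
	return codes
}

func main() {
	fmt.Println(splitLanguages("chi_sim+eng")) // [chi_sim eng]
}
```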
func ConfigFromEnv ¶
func ConfigFromEnv() *Config
ConfigFromEnv creates OCR config from environment variables.
func DefaultConfig ¶
func DefaultConfig() *Config
DefaultConfig returns the default OCR configuration.
type Line ¶
type Line struct {
BoundingBox *Box `json:"bounding_box,omitempty"`
Text string `json:"text"`
Words []Word `json:"words,omitempty"`
}
Line represents a line of text.
type Result ¶
type Result struct {
Text string `json:"text"`
Languages string `json:"languages,omitempty"`
Words []Word `json:"words,omitempty"`
Lines []Line `json:"lines,omitempty"`
Confidence float64 `json:"confidence,omitempty"`
}
Result represents the OCR result with metadata.
func (*Result) MarshalJSON ¶
MarshalJSON implements custom JSON marshaling.