Documentation ¶
Overview ¶
Package doctool provides document parsing tools for agents. It extracts text content from common file formats so agents can process file attachments received through messenger platforms.
Problem: Users send files (PDFs, Word docs, CSVs, config files) that agents need to read and understand. Without this tool, file content would be opaque to the agent. This package converts documents into plain text that fits within the LLM context window.
Supported formats:
- .txt, .md, .json, .yaml, .yml, .xml, .log — read as-is
- .csv — parsed into a readable markdown table
- .docx — extracted via ZIP + XML parsing (no external dependencies)
- .pdf — multi-strategy extraction:
  1. pdftotext (poppler-utils) — fast, handles forms and text PDFs
  2. Tesseract OCR — handles scanned/image-based PDFs
  3. dslipak/pdf (pure Go) — fallback when system tools are unavailable
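To make the .csv entry above concrete, here is a minimal sketch of turning CSV text into a markdown table with the standard library. The helper name csvToMarkdown, the treatment of the first record as a header, and the escaping of pipe characters are assumptions for illustration; the package's actual conversion may differ.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// csvToMarkdown is a hypothetical helper (not part of doctool's API) that
// renders CSV text as a markdown table, using the first record as the header.
func csvToMarkdown(input string) (string, error) {
	r := csv.NewReader(strings.NewReader(input))
	r.FieldsPerRecord = -1 // tolerate rows with differing column counts
	records, err := r.ReadAll()
	if err != nil {
		return "", err
	}
	if len(records) == 0 {
		return "", nil
	}

	// The table is as wide as the widest record; short rows are padded.
	width := 0
	for _, rec := range records {
		if len(rec) > width {
			width = len(rec)
		}
	}

	var b strings.Builder
	writeRow := func(cells []string) {
		b.WriteString("|")
		for i := 0; i < width; i++ {
			cell := ""
			if i < len(cells) {
				// Keep each cell on one line and escape the column separator.
				cell = strings.ReplaceAll(cells[i], "\n", " ")
				cell = strings.ReplaceAll(cell, "|", `\|`)
			}
			b.WriteString(" " + cell + " |")
		}
		b.WriteString("\n")
	}

	writeRow(records[0])
	b.WriteString("|")
	for i := 0; i < width; i++ {
		b.WriteString(" --- |")
	}
	b.WriteString("\n")
	for _, rec := range records[1:] {
		writeRow(rec)
	}
	return b.String(), nil
}

func main() {
	out, err := csvToMarkdown("name,role\nada,admin\nbob,viewer\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

Padding short rows keeps the table rectangular, which most markdown renderers require.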
Safety guards:
- File size capped at 10 MB
- 30-second parse timeout (prevents hung I/O on network mounts or complex PDFs)
- PDF parsing runs in a separate goroutine guarded by a context deadline, because dslipak/pdf ignores context cancellation
- Output truncated at 32 KB to limit LLM context consumption
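The guards above could be sketched roughly as follows. This is an illustration, not the package's code: the names checkSize, parseWithTimeout, and truncate are hypothetical, and only the three limits (10 MB, 30 seconds, 32 KB) come from the list above. The parser is passed in as a plain function so the pattern works even when the underlying library accepts no context.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
	"unicode/utf8"
)

// Limits taken from the safety guards listed above.
const (
	maxFileSize   = 10 << 20 // 10 MB
	parseTimeout  = 30 * time.Second
	maxOutputSize = 32 << 10 // 32 KB
)

var errTooLarge = errors.New("file exceeds size limit")

// checkSize rejects files larger than maxFileSize (hypothetical helper).
func checkSize(size int64) error {
	if size > maxFileSize {
		return errTooLarge
	}
	return nil
}

// parseWithTimeout runs parse in its own goroutine and stops waiting once the
// timeout elapses. The parser itself is not interrupted; the caller just
// moves on, which is the only option when a parser ignores context.
func parseWithTimeout(timeout time.Duration, parse func() (string, error)) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	type result struct {
		text string
		err  error
	}
	// Buffered so a parser that finishes late can still send and exit.
	done := make(chan result, 1)
	go func() {
		text, err := parse()
		done <- result{text: text, err: err}
	}()

	select {
	case r := <-done:
		return r.text, r.err
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// truncate cuts text to at most limit bytes without splitting a UTF-8 rune.
func truncate(text string, limit int) string {
	if len(text) <= limit {
		return text
	}
	cut := limit
	for cut > 0 && !utf8.RuneStart(text[cut]) {
		cut--
	}
	return text[:cut]
}

func main() {
	if err := checkSize(11 << 20); err != nil {
		fmt.Println("rejected:", err)
	}
	text, err := parseWithTimeout(parseTimeout, func() (string, error) {
		return "parsed document text", nil
	})
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(truncate(text, maxOutputSize))
}
```

One trade-off of this pattern is that a parser that never returns leaves its goroutine running after the deadline; the buffered channel only guarantees that a late result does not block it further.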
Dependencies:
- github.com/dslipak/pdf — pure-Go PDF reader (fallback, no system deps)
- pdftotext (poppler-utils) — optional, for reliable text PDF extraction
- pdftoppm (poppler-utils) — optional, for PDF-to-image conversion (OCR pipeline)
- tesseract — optional, for OCR on scanned/image PDFs
- Install on macOS: brew install poppler tesseract
- Install on Debian: apt-get install poppler-utils tesseract-ocr
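Because the system tools are optional, a caller may want to check which ones are installed before choosing an extraction path. The sketch below uses exec.LookPath from the standard library for that check. The function names hasTool and pdfStrategies, the strategy labels, and the assumption that the OCR path needs both pdftoppm and tesseract are illustrative; the package's own detection logic is not shown in this documentation.

```go
package main

import (
	"fmt"
	"os/exec"
)

// hasTool reports whether an executable with the given name is on PATH.
func hasTool(name string) bool {
	_, err := exec.LookPath(name)
	return err == nil
}

// pdfStrategies returns the extraction strategies that look usable, in the
// order the overview lists them. The lookup function is injected so the
// selection logic can be exercised without the tools installed.
// Hypothetical helper: the labels and the pdftoppm+tesseract pairing are
// assumptions, not the package's actual behavior.
func pdfStrategies(lookup func(string) bool) []string {
	var strategies []string
	if lookup("pdftotext") {
		strategies = append(strategies, "pdftotext")
	}
	// Assumed: OCR needs pdftoppm to rasterize pages and tesseract to read them.
	if lookup("pdftoppm") && lookup("tesseract") {
		strategies = append(strategies, "tesseract-ocr")
	}
	// The pure-Go reader has no system dependencies, so it is always available.
	strategies = append(strategies, "dslipak/pdf")
	return strategies
}

func main() {
	for _, s := range pdfStrategies(hasTool) {
		fmt.Println("available:", s)
	}
}
```

Running this on a machine without poppler or tesseract should list only the pure-Go fallback.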
Types ¶
type ToolProvider ¶
type ToolProvider struct{}
ToolProvider wraps the document parsing tool and satisfies the tools.ToolProviders interface.
func NewToolProvider ¶
func NewToolProvider() *ToolProvider
NewToolProvider creates a ToolProvider for the document parsing tool.
func (*ToolProvider) GetTools ¶
func (p *ToolProvider) GetTools() []tool.Tool
GetTools returns the document parsing tool.