Documentation ¶
Overview ¶
Package table provides domain entities for PDF table extraction.
This is the Domain layer in DDD/Clean Architecture. It contains the core business logic for representing extracted tables.
Index ¶
- type Cell
- type Rectangle
- type Table
- func (t *Table) CellCount() int
- func (t *Table) GetCell(row, col int) *Cell
- func (t *Table) GetColumn(col int) []*Cell
- func (t *Table) GetRow(row int) []*Cell
- func (t *Table) HasMergedCells() bool
- func (t *Table) IsEmpty() bool
- func (t *Table) NonEmptyCellCount() int
- func (t *Table) SetCell(row, col int, cell *Cell) error
- func (t *Table) String() string
- func (t *Table) ToStringGrid() [][]string
- func (t *Table) Validate() error
- type TextAlign
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Cell ¶
type Cell struct {
	Text      string    // Text content (may contain newlines)
	Row       int       // Row index (0-based)
	Column    int       // Column index (0-based)
	RowSpan   int       // Number of rows this cell spans (1 = no merge)
	ColSpan   int       // Number of columns this cell spans (1 = no merge)
	Bounds    Rectangle // Bounding rectangle
	TextAlign TextAlign // Text alignment within cell
}
Cell represents a single cell in an extracted table.
A cell contains:
- Text content (potentially multi-line)
- Position within the table (row, column)
- Span information for merged cells
- Bounding rectangle
- Text alignment
Cells are value objects in DDD - they are compared by value, not identity.
func NewCell ¶
NewCell creates a new Cell with the given text and position.
By default, cells have RowSpan=1, ColSpan=1 (not merged), and TextAlign=AlignLeft.
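The defaults described above can be sketched as follows. This is a minimal illustration and not the package's source: the `newCell` signature, the underlying type of `TextAlign`, and the value of `AlignLeft` are all assumptions, and `Bounds` is left out for brevity.

```go
package main

import "fmt"

// TextAlign stands in for the package's TextAlign type.
// Its underlying type and the value of AlignLeft are assumptions.
type TextAlign int

const AlignLeft TextAlign = 0

// Cell mirrors the documented fields, minus Bounds.
type Cell struct {
	Text      string
	Row       int
	Column    int
	RowSpan   int
	ColSpan   int
	TextAlign TextAlign
}

// newCell is a hypothetical constructor that applies the documented defaults:
// RowSpan=1, ColSpan=1 (not merged) and TextAlign=AlignLeft.
func newCell(text string, row, col int) *Cell {
	return &Cell{
		Text:      text,
		Row:       row,
		Column:    col,
		RowSpan:   1,
		ColSpan:   1,
		TextAlign: AlignLeft,
	}
}

func main() {
	c := newCell("Total", 2, 0)
	fmt.Printf("%q at (%d,%d) spans %dx%d\n", c.Text, c.Row, c.Column, c.RowSpan, c.ColSpan)
}
```

Setting both spans to 1 rather than leaving them at Go's zero value keeps "no merge" distinguishable from an uninitialized cell.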
func NewCellWithBounds ¶
NewCellWithBounds creates a new Cell with text, position, and bounds.
func (*Cell) IsMerged ¶
IsMerged returns true if this cell is merged (spans multiple rows or columns).
func (*Cell) WithAlignment ¶
WithAlignment returns a new Cell with the specified text alignment.
func (*Cell) WithColSpan ¶
WithColSpan returns a new Cell with the specified column span.
func (*Cell) WithRowSpan ¶
WithRowSpan returns a new Cell with the specified row span.
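Because the With* methods are documented as returning a new Cell, one plausible reading is copy-on-modify: the receiver is left untouched and a modified copy is returned. The sketch below shows that pattern with a stand-in `Cell` type; `isMerged` and `withColSpan` are hypothetical lowercase counterparts, not the package's own implementation.

```go
package main

import "fmt"

// Cell is a stand-in carrying only the fields this sketch needs.
type Cell struct {
	Text    string
	RowSpan int
	ColSpan int
}

// isMerged reports whether the cell spans more than one row or column.
func (c *Cell) isMerged() bool {
	return c.RowSpan > 1 || c.ColSpan > 1
}

// withColSpan copies the receiver, changes the copy, and returns it,
// so the original cell is never mutated (assumed behavior).
func (c *Cell) withColSpan(span int) *Cell {
	updated := *c
	updated.ColSpan = span
	return &updated
}

func main() {
	header := &Cell{Text: "Q1 results", RowSpan: 1, ColSpan: 1}
	wide := header.withColSpan(3)

	fmt.Println(header.isMerged(), wide.isMerged()) // false true
	fmt.Println(header.ColSpan, wide.ColSpan)       // 1 3
}
```

Under this reading the calls can be chained, since each one yields a fresh value.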
type Rectangle ¶
type Rectangle struct {
	X      float64 // Bottom-left X coordinate
	Y      float64 // Bottom-left Y coordinate
	Width  float64 // Width
	Height float64 // Height
}
Rectangle represents a rectangular bounding box.
This is a value object in DDD - it's immutable and compared by value. Coordinates follow PDF convention: origin at bottom-left, Y increases upward.
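To make the coordinate convention concrete, the sketch below derives the top and right edges of a rectangle and converts its top edge to a distance measured from the top of the page, which is what top-left-origin renderers usually expect. The helper names and the 792-point page height (US Letter) are illustrative assumptions, not part of the package.

```go
package main

import "fmt"

// Rectangle mirrors the documented struct: (X, Y) is the bottom-left corner.
type Rectangle struct {
	X, Y, Width, Height float64
}

// top returns the Y coordinate of the upper edge (Y grows upward in PDF space).
func top(r Rectangle) float64 { return r.Y + r.Height }

// right returns the X coordinate of the right edge.
func right(r Rectangle) float64 { return r.X + r.Width }

// distanceFromPageTop converts the upper edge to a top-left-origin offset.
// pageHeight is supplied by the caller; it is not stored on Rectangle.
func distanceFromPageTop(r Rectangle, pageHeight float64) float64 {
	return pageHeight - top(r)
}

func main() {
	r := Rectangle{X: 72, Y: 600, Width: 200, Height: 50}

	fmt.Println(top(r), right(r))            // 650 272
	fmt.Println(distanceFromPageTop(r, 792)) // 142

	// Plain structs of comparable fields compare by value with ==.
	same := Rectangle{X: 72, Y: 600, Width: 200, Height: 50}
	fmt.Println(r == same) // true
}
```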
func NewRectangle ¶
NewRectangle creates a new Rectangle.
type Table ¶
type Table struct {
	Rows     [][]*Cell // Cells organized by row (row-major order)
	RowCount int       // Number of rows
	ColCount int       // Number of columns
	PageNum  int       // Page number where table was found (0-based)
	Bounds   Rectangle // Bounding rectangle
	Method   string    // "Lattice" or "Stream"
}
Table represents an extracted table with cell content.
A table is a rich domain entity that encapsulates:
- Cell content organized in rows and columns
- Metadata (page number, bounds, extraction method)
- Behavior for accessing and manipulating cells
Tables are aggregates in DDD - they are the root entity that manages a collection of cells.
A Table is the output of Phase 2.7 (Table Extraction). Its input is a TableRegion produced by Phase 2.6 (Table Detection).
func NewTable ¶
NewTable creates a new Table with the specified dimensions.
All cells are initialized to empty cells with proper row/column indices.
Parameters:
- rowCount: Number of rows
- colCount: Number of columns
Returns an error if either dimension is less than 1.
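A constructor with this contract might look like the sketch below. It uses trimmed stand-in types, and the `newTable` signature and error message are assumptions chosen to match the description, not the package's actual code.

```go
package main

import "fmt"

// Cell and Table are stand-ins with only the fields this sketch needs.
type Cell struct {
	Text             string
	Row, Column      int
	RowSpan, ColSpan int
}

type Table struct {
	Rows     [][]*Cell
	RowCount int
	ColCount int
}

// newTable is a hypothetical constructor: it rejects dimensions below 1 and
// fills every position with an empty cell that knows its own row and column.
func newTable(rowCount, colCount int) (*Table, error) {
	if rowCount < 1 || colCount < 1 {
		return nil, fmt.Errorf("invalid table dimensions %dx%d", rowCount, colCount)
	}
	rows := make([][]*Cell, rowCount)
	for r := range rows {
		rows[r] = make([]*Cell, colCount)
		for c := range rows[r] {
			rows[r][c] = &Cell{Row: r, Column: c, RowSpan: 1, ColSpan: 1}
		}
	}
	return &Table{Rows: rows, RowCount: rowCount, ColCount: colCount}, nil
}

func main() {
	tbl, err := newTable(2, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	last := tbl.Rows[1][2]
	fmt.Println(tbl.RowCount, tbl.ColCount, last.Row, last.Column) // 2 3 1 2

	_, err = newTable(0, 3)
	fmt.Println(err != nil) // true
}
```

Pre-filling every position means callers can read any in-bounds cell without a nil check.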
func (*Table) GetCell ¶
GetCell returns the cell at the specified row and column.
Returns nil if the position is out of bounds.
func (*Table) GetColumn ¶
GetColumn returns all cells in the specified column.
Returns nil if column is out of bounds.
func (*Table) GetRow ¶
GetRow returns all cells in the specified row.
Returns nil if row is out of bounds.
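The three accessors share one contract: an out-of-range index yields nil instead of a panic. The sketch below shows bounds-checked versions over a stand-in Table; `getCell` and `getColumn` are hypothetical names, and the code assumes Rows is rectangular (every row holds ColCount cells).

```go
package main

import "fmt"

// Cell and Table are stand-ins with only the fields this sketch needs.
type Cell struct {
	Text string
}

type Table struct {
	Rows     [][]*Cell
	RowCount int
	ColCount int
}

// getCell returns nil for any position outside the grid.
func (t *Table) getCell(row, col int) *Cell {
	if row < 0 || row >= t.RowCount || col < 0 || col >= t.ColCount {
		return nil
	}
	return t.Rows[row][col]
}

// getColumn gathers one cell per row, top to bottom.
// Rows are stored row-major, so a column has to be assembled.
func (t *Table) getColumn(col int) []*Cell {
	if col < 0 || col >= t.ColCount {
		return nil
	}
	column := make([]*Cell, 0, t.RowCount)
	for _, row := range t.Rows {
		column = append(column, row[col])
	}
	return column
}

func main() {
	tbl := &Table{RowCount: 2, ColCount: 2, Rows: [][]*Cell{
		{{Text: "Name"}, {Text: "Qty"}},
		{{Text: "Bolt"}, {Text: "40"}},
	}}

	fmt.Println(tbl.getCell(1, 0).Text)   // Bolt
	fmt.Println(tbl.getCell(5, 0) == nil) // true
	for _, c := range tbl.getColumn(1) {
		fmt.Println(c.Text)
	}
}
```

Callers should check the result for nil before dereferencing it, as the second call in `main` illustrates.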
func (*Table) HasMergedCells ¶
HasMergedCells returns true if any cell is merged (spans multiple rows/cols).
func (*Table) NonEmptyCellCount ¶
NonEmptyCellCount returns the number of non-empty cells.
func (*Table) SetCell ¶
SetCell sets the cell at the specified row and column.
Returns an error if the position is out of bounds.
func (*Table) ToStringGrid ¶
ToStringGrid converts the table to a plain two-dimensional grid of strings ([][]string).
This is useful for export formats (CSV, JSON) that don't support merged cells or formatting.
For merged cells, the text appears in the top-left cell, and merged positions contain empty strings.
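One way to obtain that behavior is sketched below with stand-in types. How the package stores the positions covered by a merged cell is not stated here, so the sketch assumes they may hold arbitrary cells and masks them out explicitly. `toStringGrid` is a hypothetical free function, and the CSV step uses only the standard library's `encoding/csv`.

```go
package main

import (
	"encoding/csv"
	"os"
)

// Cell and Table are stand-ins with only the fields this sketch needs.
type Cell struct {
	Text             string
	RowSpan, ColSpan int
}

type Table struct {
	Rows     [][]*Cell
	RowCount int
	ColCount int
}

// toStringGrid writes each cell's text at its own position and leaves every
// position covered by an earlier merged cell as an empty string.
func toStringGrid(t *Table) [][]string {
	grid := make([][]string, t.RowCount)
	covered := make([][]bool, t.RowCount)
	for r := range grid {
		grid[r] = make([]string, t.ColCount)
		covered[r] = make([]bool, t.ColCount)
	}
	for r := 0; r < t.RowCount; r++ {
		for c := 0; c < t.ColCount; c++ {
			cell := t.Rows[r][c]
			if cell == nil || covered[r][c] {
				continue
			}
			grid[r][c] = cell.Text
			// Mark everything the span reaches, except the top-left anchor.
			for dr := 0; dr < cell.RowSpan && r+dr < t.RowCount; dr++ {
				for dc := 0; dc < cell.ColSpan && c+dc < t.ColCount; dc++ {
					if dr != 0 || dc != 0 {
						covered[r+dr][c+dc] = true
					}
				}
			}
		}
	}
	return grid
}

func main() {
	tbl := &Table{RowCount: 2, ColCount: 3, Rows: [][]*Cell{
		{{Text: "Region", RowSpan: 1, ColSpan: 1}, {Text: "Sales", RowSpan: 1, ColSpan: 2}, {Text: "ignored", RowSpan: 1, ColSpan: 1}},
		{{Text: "East", RowSpan: 1, ColSpan: 1}, {Text: "10", RowSpan: 1, ColSpan: 1}, {Text: "12", RowSpan: 1, ColSpan: 1}},
	}}

	// Write the flattened grid as CSV to standard output.
	w := csv.NewWriter(os.Stdout)
	if err := w.WriteAll(toStringGrid(tbl)); err != nil {
		panic(err)
	}
}
```

Keeping the grid rectangular, with empty strings in covered positions, means every exported row has the same number of fields, which most CSV consumers expect.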