table

package
v0.6.0
Warning

This package is not in the latest version of its module.

Published: Feb 24, 2026 License: MIT Imports: 2 Imported by: 0

Documentation

Overview

Package table provides domain entities for PDF table extraction.

This is the Domain layer in DDD/Clean Architecture. It contains the core business logic for representing extracted tables.

Types

type Cell

type Cell struct {
	Text      string    // Text content (may contain newlines)
	Row       int       // Row index (0-based)
	Column    int       // Column index (0-based)
	RowSpan   int       // Number of rows this cell spans (1 = no merge)
	ColSpan   int       // Number of columns this cell spans (1 = no merge)
	Bounds    Rectangle // Bounding rectangle
	TextAlign TextAlign // Text alignment within cell
}

Cell represents a single cell in an extracted table.

A cell contains:

  • Text content (potentially multi-line)
  • Position within the table (row, column)
  • Span information for merged cells
  • Bounding rectangle
  • Text alignment

Cells are value objects in DDD: they are compared by value, not by identity.

func NewCell

func NewCell(text string, row, col int) *Cell

NewCell creates a new Cell with the given text and position.

By default, cells have RowSpan=1, ColSpan=1 (not merged), and TextAlign=AlignLeft.

func NewCellWithBounds

func NewCellWithBounds(text string, row, col int, bounds Rectangle) *Cell

NewCellWithBounds creates a new Cell with text, position, and bounds.

func (*Cell) IsEmpty

func (c *Cell) IsEmpty() bool

IsEmpty returns true if the cell has no text content.

func (*Cell) IsMerged

func (c *Cell) IsMerged() bool

IsMerged returns true if this cell is merged (spans multiple rows or columns).

func (*Cell) String

func (c *Cell) String() string

String returns a string representation of the cell.

func (*Cell) WithAlignment

func (c *Cell) WithAlignment(align TextAlign) *Cell

WithAlignment returns a new Cell with the specified text alignment.

func (*Cell) WithColSpan

func (c *Cell) WithColSpan(colSpan int) *Cell

WithColSpan returns a new Cell with the specified column span.

func (*Cell) WithRowSpan

func (c *Cell) WithRowSpan(rowSpan int) *Cell

WithRowSpan returns a new Cell with the specified row span.
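
The With* methods above are documented as returning a new Cell rather than modifying the receiver. The following is a minimal sketch of that copy-on-write pattern. It uses local stand-in types that mirror the documented fields, because the package's import path and source are not shown here; the method bodies are assumptions, not the package's implementation.

```go
package main

import "fmt"

// TextAlign and Cell are local stand-ins mirroring the documented fields.
// They are not the package's own source.
type TextAlign int

const (
	AlignLeft TextAlign = iota
	AlignCenter
	AlignRight
)

type Cell struct {
	Text      string
	Row       int
	Column    int
	RowSpan   int
	ColSpan   int
	TextAlign TextAlign
}

// NewCell applies the documented defaults: spans of 1 and left alignment.
func NewCell(text string, row, col int) *Cell {
	return &Cell{Text: text, Row: row, Column: col, RowSpan: 1, ColSpan: 1, TextAlign: AlignLeft}
}

// WithColSpan copies the receiver and changes only the column span
// (assumed behavior, inferred from "returns a new Cell").
func (c *Cell) WithColSpan(colSpan int) *Cell {
	cp := *c
	cp.ColSpan = colSpan
	return &cp
}

// IsMerged reports whether the cell spans more than one row or column.
func (c *Cell) IsMerged() bool { return c.RowSpan > 1 || c.ColSpan > 1 }

func main() {
	header := NewCell("Quarterly totals", 0, 0)
	wide := header.WithColSpan(3)

	fmt.Println(header.IsMerged(), header.ColSpan) // the original keeps its span of 1
	fmt.Println(wide.IsMerged(), wide.ColSpan)     // the copy carries the new span
}
```

Because each call returns a fresh value, calls can be chained without the intermediate cells affecting one another.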

type Rectangle

type Rectangle struct {
	X      float64 // Bottom-left X coordinate
	Y      float64 // Bottom-left Y coordinate
	Width  float64 // Width
	Height float64 // Height
}

Rectangle represents a rectangular bounding box.

This is a value object in DDD: it is immutable and compared by value. Coordinates follow the PDF convention, with the origin at the bottom-left and Y increasing upward.
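
With a bottom-left origin, the top edge is Y plus Height, which is the reverse of most screen coordinate systems. The sketch below illustrates that arithmetic with a local stand-in Rectangle that mirrors the documented fields. Treating points on the edges as inside in Contains is an assumption; the documentation does not say whether the real method is inclusive.

```go
package main

import "fmt"

// Rectangle is a local stand-in mirroring the documented fields.
type Rectangle struct {
	X, Y, Width, Height float64
}

func (r Rectangle) Left() float64   { return r.X }
func (r Rectangle) Right() float64  { return r.X + r.Width }
func (r Rectangle) Bottom() float64 { return r.Y }
func (r Rectangle) Top() float64    { return r.Y + r.Height } // Y grows upward

// Contains treats edge points as inside (an assumption for this sketch).
func (r Rectangle) Contains(x, y float64) bool {
	return x >= r.Left() && x <= r.Right() && y >= r.Bottom() && y <= r.Top()
}

func main() {
	// A 200x100 box whose bottom-left corner is at (72, 500).
	r := Rectangle{X: 72, Y: 500, Width: 200, Height: 100}

	fmt.Println(r.Top(), r.Right())   // 600 272
	fmt.Println(r.Contains(100, 550)) // true: inside the box
	fmt.Println(r.Contains(100, 650)) // false: above the top edge
}
```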

func NewRectangle

func NewRectangle(x, y, width, height float64) Rectangle

NewRectangle creates a new Rectangle.

func (Rectangle) Bottom

func (r Rectangle) Bottom() float64

Bottom returns the Y coordinate of the bottom edge.

func (Rectangle) Contains

func (r Rectangle) Contains(x, y float64) bool

Contains checks if a point (x, y) is inside the rectangle.

func (Rectangle) Left

func (r Rectangle) Left() float64

Left returns the X coordinate of the left edge.

func (Rectangle) Right

func (r Rectangle) Right() float64

Right returns the X coordinate of the right edge.

func (Rectangle) String

func (r Rectangle) String() string

String returns a string representation of the rectangle.

func (Rectangle) Top

func (r Rectangle) Top() float64

Top returns the Y coordinate of the top edge.

type Table

type Table struct {
	Rows     [][]*Cell // Cells organized by row (row-major order)
	RowCount int       // Number of rows
	ColCount int       // Number of columns
	PageNum  int       // Page number where table was found (0-based)
	Bounds   Rectangle // Bounding rectangle
	Method   string    // "Lattice" or "Stream"
}

Table represents an extracted table with cell content.

A table is a rich domain entity that encapsulates:

  • Cell content organized in rows and columns
  • Metadata (page number, bounds, extraction method)
  • Behavior for accessing and manipulating cells

Tables are aggregates in DDD: the Table is the root entity that manages a collection of cells.

A Table is the output of Phase 2.7 (Table Extraction). Its input is the TableRegion produced by Phase 2.6 (Table Detection).

func NewTable

func NewTable(rowCount, colCount int) (*Table, error)

NewTable creates a new Table with the specified dimensions.

All cells are initialized to empty cells with proper row/column indices.

Parameters:

  • rowCount: Number of rows
  • colCount: Number of columns

Returns an error if either dimension is less than 1.
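
The constructor contract described above (reject dimensions below 1, then fill every position with an empty cell that knows its own indices) could be sketched as follows. The types are local stand-ins and the error message is invented for illustration; the package's actual implementation is not shown here.

```go
package main

import (
	"errors"
	"fmt"
)

// Cell and Table are local stand-ins mirroring a subset of the documented fields.
type Cell struct {
	Text    string
	Row     int
	Column  int
	RowSpan int
	ColSpan int
}

type Table struct {
	Rows     [][]*Cell
	RowCount int
	ColCount int
}

// NewTable validates the dimensions, then allocates a row-major grid of
// empty cells whose Row and Column match their position.
func NewTable(rowCount, colCount int) (*Table, error) {
	if rowCount < 1 || colCount < 1 {
		return nil, errors.New("table dimensions must be at least 1x1") // hypothetical message
	}
	rows := make([][]*Cell, rowCount)
	for r := range rows {
		rows[r] = make([]*Cell, colCount)
		for c := range rows[r] {
			rows[r][c] = &Cell{Row: r, Column: c, RowSpan: 1, ColSpan: 1}
		}
	}
	return &Table{Rows: rows, RowCount: rowCount, ColCount: colCount}, nil
}

func main() {
	tbl, err := NewTable(2, 3)
	if err != nil {
		fmt.Println("unexpected error:", err)
		return
	}
	fmt.Println(len(tbl.Rows), len(tbl.Rows[0]), tbl.Rows[1][2].Column)

	// Zero rows is rejected before any allocation happens.
	if _, err := NewTable(0, 3); err != nil {
		fmt.Println("error:", err)
	}
}
```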

func (*Table) CellCount

func (t *Table) CellCount() int

CellCount returns the total number of cells in the table.

func (*Table) GetCell

func (t *Table) GetCell(row, col int) *Cell

GetCell returns the cell at the specified row and column.

Returns nil if the position is out of bounds.

func (*Table) GetColumn

func (t *Table) GetColumn(col int) []*Cell

GetColumn returns all cells in the specified column.

Returns nil if the column is out of bounds.

func (*Table) GetRow

func (t *Table) GetRow(row int) []*Cell

GetRow returns all cells in the specified row.

Returns nil if the row is out of bounds.
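
Since the accessors signal an out-of-bounds position with nil instead of an error, callers need a nil check before dereferencing. The sketch below shows one way such accessors might work over a row-major grid, using local stand-in types; the method bodies are assumptions and only the nil-on-out-of-bounds behavior comes from the documentation.

```go
package main

import "fmt"

// Cell and Table are local stand-ins mirroring a subset of the documented fields.
type Cell struct {
	Text   string
	Row    int
	Column int
}

type Table struct {
	Rows     [][]*Cell
	RowCount int
	ColCount int
}

// GetCell returns nil for any position outside the grid.
func (t *Table) GetCell(row, col int) *Cell {
	if row < 0 || row >= t.RowCount || col < 0 || col >= t.ColCount {
		return nil
	}
	return t.Rows[row][col]
}

// GetColumn walks the row-major grid and collects one cell per row.
func (t *Table) GetColumn(col int) []*Cell {
	if col < 0 || col >= t.ColCount {
		return nil
	}
	cells := make([]*Cell, 0, t.RowCount)
	for _, row := range t.Rows {
		cells = append(cells, row[col])
	}
	return cells
}

func main() {
	tbl := &Table{
		Rows: [][]*Cell{
			{{Text: "Item", Row: 0, Column: 0}, {Text: "Qty", Row: 0, Column: 1}},
			{{Text: "Bolt", Row: 1, Column: 0}, {Text: "40", Row: 1, Column: 1}},
		},
		RowCount: 2,
		ColCount: 2,
	}

	// Always check for nil before using the result.
	if c := tbl.GetCell(1, 1); c != nil {
		fmt.Println(c.Text)
	}
	fmt.Println(tbl.GetCell(5, 0) == nil)

	for _, c := range tbl.GetColumn(0) {
		fmt.Println(c.Text)
	}
}
```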

func (*Table) HasMergedCells

func (t *Table) HasMergedCells() bool

HasMergedCells returns true if any cell is merged (spans multiple rows/cols).

func (*Table) IsEmpty

func (t *Table) IsEmpty() bool

IsEmpty returns true if all cells are empty.

func (*Table) NonEmptyCellCount

func (t *Table) NonEmptyCellCount() int

NonEmptyCellCount returns the number of non-empty cells.

func (*Table) SetCell

func (t *Table) SetCell(row, col int, cell *Cell) error

SetCell sets the cell at the specified row and column.

Returns an error if the position is out of bounds.

func (*Table) String

func (t *Table) String() string

String returns a string representation of the table (for debugging).

func (*Table) ToStringGrid

func (t *Table) ToStringGrid() [][]string

ToStringGrid converts the table to a plain two-dimensional slice of strings ([][]string).

This is useful for export formats (CSV, JSON) that don't support merged cells or formatting.

For merged cells, the text appears in the top-left cell, and merged positions contain empty strings.
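
A [][]string grid is exactly the shape that the standard library's encoding/csv writer accepts, which is why this flattening suits CSV export. The sketch below uses a local stand-in Cell and a hypothetical helper named toStringGrid; it assumes that positions covered by a merged cell already hold empty cells, so copying each cell's text yields the documented result. The package's actual method may handle merges differently.

```go
package main

import (
	"encoding/csv"
	"log"
	"os"
)

// Cell is a local stand-in mirroring a subset of the documented fields.
type Cell struct {
	Text    string
	RowSpan int
	ColSpan int
}

// toStringGrid is a hypothetical helper: it copies each cell's text into a
// grid of the same shape, leaving "" for nil or empty cells.
func toStringGrid(rows [][]*Cell) [][]string {
	grid := make([][]string, len(rows))
	for r, row := range rows {
		grid[r] = make([]string, len(row))
		for c, cell := range row {
			if cell != nil {
				grid[r][c] = cell.Text
			}
		}
	}
	return grid
}

func main() {
	// "Region" spans two columns: its text sits in the top-left position and
	// the covered position (0,1) holds an empty cell.
	rows := [][]*Cell{
		{{Text: "Region", ColSpan: 2}, {Text: ""}, {Text: "Total"}},
		{{Text: "North"}, {Text: "South"}, {Text: "42"}},
	}

	// WriteAll writes every record and flushes the writer.
	w := csv.NewWriter(os.Stdout)
	if err := w.WriteAll(toStringGrid(rows)); err != nil {
		log.Fatal(err)
	}
}
```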

func (*Table) Validate

func (t *Table) Validate() error

Validate checks if the table structure is valid.

A valid table must have:

  • At least 1 row and 1 column
  • All rows have the same number of columns
  • All cells have correct row/column indices

Returns an error describing the first validation failure, or nil if valid.
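
The three rules above can be checked in a single pass that stops at the first problem. The following sketch implements them against local stand-in types. The error messages are invented for illustration, and the extra checks (row slice length, nil cells) are assumptions added so the sketch cannot panic; the package's real Validate may differ.

```go
package main

import "fmt"

// Cell and Table are local stand-ins mirroring a subset of the documented fields.
type Cell struct {
	Text   string
	Row    int
	Column int
}

type Table struct {
	Rows     [][]*Cell
	RowCount int
	ColCount int
}

// Validate returns an error for the first rule that fails, or nil.
func (t *Table) Validate() error {
	if t.RowCount < 1 || t.ColCount < 1 {
		return fmt.Errorf("table must have at least 1 row and 1 column, got %dx%d", t.RowCount, t.ColCount)
	}
	if len(t.Rows) != t.RowCount {
		return fmt.Errorf("have %d rows, want %d", len(t.Rows), t.RowCount)
	}
	for r, row := range t.Rows {
		if len(row) != t.ColCount {
			return fmt.Errorf("row %d has %d columns, want %d", r, len(row), t.ColCount)
		}
		for c, cell := range row {
			if cell == nil {
				return fmt.Errorf("cell (%d,%d) is nil", r, c)
			}
			if cell.Row != r || cell.Column != c {
				return fmt.Errorf("cell at (%d,%d) reports position (%d,%d)", r, c, cell.Row, cell.Column)
			}
		}
	}
	return nil
}

func main() {
	tbl := &Table{
		Rows:     [][]*Cell{{{Row: 0, Column: 0}, {Row: 0, Column: 5}}},
		RowCount: 1,
		ColCount: 2,
	}
	// The second cell carries the wrong column index, so validation fails there.
	fmt.Println(tbl.Validate())
}
```

Running a check like this after a series of SetCell calls is one way to catch cells that were stored at a position that disagrees with their own Row and Column fields.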

type TextAlign

type TextAlign int

TextAlign represents text alignment within a cell.

const (
	// AlignLeft indicates text is left-aligned.
	AlignLeft TextAlign = iota
	// AlignCenter indicates text is center-aligned.
	AlignCenter
	// AlignRight indicates text is right-aligned.
	AlignRight
)

func (TextAlign) String

func (ta TextAlign) String() string

String returns a string representation of the text alignment.
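
Because TextAlign has a String method, it satisfies fmt.Stringer and prints as text with the %v and %s verbs. The sketch below shows the usual switch-based pattern for an iota enum. The exact strings the package returns are not documented here, so "Left", "Center", "Right", and the fallback format are assumptions.

```go
package main

import "fmt"

// TextAlign is a local stand-in with the same constants as the documented type.
type TextAlign int

const (
	AlignLeft TextAlign = iota
	AlignCenter
	AlignRight
)

// String maps each constant to a label; the labels are assumed, not taken
// from the package.
func (ta TextAlign) String() string {
	switch ta {
	case AlignLeft:
		return "Left"
	case AlignCenter:
		return "Center"
	case AlignRight:
		return "Right"
	default:
		return fmt.Sprintf("TextAlign(%d)", int(ta))
	}
}

func main() {
	fmt.Println(AlignCenter)                      // fmt calls String automatically
	fmt.Printf("%v %d\n", AlignRight, AlignRight) // label, then underlying integer
	fmt.Println(TextAlign(9))                     // unknown values use the fallback
}
```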
