Documentation
Constants ¶
This section is empty.
Variables ¶
var TestBoundaryPositions = PDFTablePositions{
OriginalDateStart: 0, OriginalDateEnd: 2,
ReceiptStart: 3, ReceiptEnd: 5,
DetailStart: 6, DetailEnd: 8,
ARSAmountStart: 9, ARSAmountEnd: 11,
USDAmountStart: 12, USDAmountEnd: 13,
}
TestBoundaryPositions defines positions that match exactly with TestBoundaryText.
var TestBoundaryRow = Row{
RawText: TestBoundaryText,
RawOriginalDate: "abc",
RawReceiptNumber: "deF",
RawDetailWithMaybeInstallments: "GHI",
RawAmountARS: "JKL",
RawAmountUSD: "MN",
}
TestBoundaryRow represents a row with text exactly matching column boundaries.
var TestBoundaryText = "abcdeFGHIJKLMN"
TestBoundaryText is the raw text that produces TestBoundaryRow when parsed. Each column's content ends exactly at that column's end position, which exercises the boundary conditions.
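The relationship between TestBoundaryPositions and TestBoundaryText can be illustrated with a minimal sketch. The helper `sliceInclusive` is hypothetical and not part of this package; it only assumes that start and end positions are inclusive, as the PDFTablePositions documentation states, and that the text is long enough for the requested range.

```go
package main

import "fmt"

// sliceInclusive is a hypothetical helper, not part of the documented API.
// It returns the bytes of text from start through end, both inclusive.
// It assumes 0 <= start <= end < len(text) and single-byte characters.
func sliceInclusive(text string, start, end int) string {
	return text[start : end+1]
}

func main() {
	text := "abcdeFGHIJKLMN" // same value as TestBoundaryText

	// Bounds copied from TestBoundaryPositions.
	fmt.Println(sliceInclusive(text, 0, 2))   // OriginalDate column
	fmt.Println(sliceInclusive(text, 3, 5))   // Receipt column
	fmt.Println(sliceInclusive(text, 6, 8))   // Detail column
	fmt.Println(sliceInclusive(text, 9, 11))  // ARSAmount column
	fmt.Println(sliceInclusive(text, 12, 13)) // USDAmount column
}
```

Under those assumptions the program prints abc, deF, GHI, JKL and MN, which are the column values stored in TestBoundaryRow.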
var TestCardMovementRow = Row{
RawText: TestCardMovementText,
RawOriginalDate: "23 Diciem. 30",
RawReceiptNumber: "111111 *",
RawDetailWithMaybeInstallments: "AN ESTABLISHMENT A C.07/12",
RawAmountARS: "11.111,11",
RawAmountUSD: "",
}
TestCardMovementRow represents a typical credit card movement row: date, receipt number, detail, and ARS amount are populated, and the USD amount is empty.
var TestCardMovementText = "23 Diciem. 30 111111 * AN ESTABLISHMENT A C.07/12 11.111,11 "
TestCardMovementText is the raw text that produces TestCardMovementRow when parsed.
var TestSaldoAnteriorRow = Row{
RawText: TestSaldoAnteriorText,
RawOriginalDate: "",
RawReceiptNumber: "",
RawDetailWithMaybeInstallments: "SALDO ANTERIOR",
RawAmountARS: "222.111,66",
RawAmountUSD: "110,00",
}
TestSaldoAnteriorRow represents a "SALDO ANTERIOR" row with ARS and USD amounts.
var TestSaldoAnteriorText = " SALDO ANTERIOR 222.111,66 110,00 "
TestSaldoAnteriorText is the raw text that produces TestSaldoAnteriorRow when parsed.
var TestShortText = "short text"
TestShortText is the raw text that produces TestShortTextRow when parsed.
var TestShortTextRow = Row{
RawText: TestShortText,
RawOriginalDate: "short text",
RawReceiptNumber: "",
RawDetailWithMaybeInstallments: "",
RawAmountARS: "",
RawAmountUSD: "",
}
TestShortTextRow represents a row with text too short to contain all fields.
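One way a parser could arrive at TestShortTextRow is to clamp each column range to the length of the text. The sketch below shows that idea; `clampedColumn` is a hypothetical helper, and whether the package actually clamps this way is an assumption inferred only from the fixture values.

```go
package main

import "fmt"

// clampedColumn is a hypothetical helper, not the package's implementation.
// It returns text[start..end] with inclusive bounds, shortening the range
// when the text ends early and returning "" when the column starts past
// the end of the text. It indexes bytes, so it assumes single-byte characters.
func clampedColumn(text string, start, end int) string {
	if start < 0 || end < start || start >= len(text) {
		return ""
	}
	if end >= len(text) {
		end = len(text) - 1
	}
	return text[start : end+1]
}

func main() {
	text := "short text" // same value as TestShortText, 10 bytes long

	// Column bounds copied from TestTablePositions.
	fmt.Printf("%q\n", clampedColumn(text, 0, 12))  // date column, cut short by the text
	fmt.Printf("%q\n", clampedColumn(text, 14, 22)) // receipt column, entirely past the end
	fmt.Printf("%q\n", clampedColumn(text, 24, 74)) // detail column, entirely past the end
}
```

With this clamping the date column receives the whole text and the remaining columns are empty, which agrees with the field values in TestShortTextRow.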
var TestTablePositions = PDFTablePositions{
OriginalDateStart: 0,
OriginalDateEnd: 12,
ReceiptStart: 14,
ReceiptEnd: 22,
DetailStart: 24,
DetailEnd: 74,
ARSAmountStart: 76,
ARSAmountEnd: 91,
USDAmountStart: 93,
USDAmountEnd: 110,
}
TestTablePositions defines the standard table positions used in tests. These positions match the Santander PDF format used in most test cases.
var TestWhitespaceRow = Row{
RawText: TestWhitespaceText,
RawOriginalDate: "01 Enero",
RawReceiptNumber: "1 123456",
RawDetailWithMaybeInstallments: "* Detalle con espacios y\tcaracteres!@# 1.234,5",
RawAmountARS: "78,90",
RawAmountUSD: "",
}
TestWhitespaceRow represents a row with various whitespace and special characters.
var TestWhitespaceText = " 01 Enero 01 123456 * Detalle con espacios y\tcaracteres!@# 1.234,56 78,90 "
TestWhitespaceText is the raw text that produces TestWhitespaceRow when parsed.
Functions ¶
This section is empty.
Types ¶
type FakeTableIterator ¶
type FakeTableIterator struct {
// contains filtered or unexported fields
}
FakeTableIterator is a test-friendly implementation of a table row iterator that allows preloading with a slice of Rows to be returned in sequence.
func NewFakeTableIterator ¶
func NewFakeTableIterator(rows []Row) *FakeTableIterator
NewFakeTableIterator creates a new FakeTableIterator with the given rows.
func (*FakeTableIterator) Next ¶
func (f *FakeTableIterator) Next() (Row, bool)
Next returns the next Row in the sequence and a boolean indicating if there are more rows to return.
func (*FakeTableIterator) NextUtilRegexIsMatched ¶
func (f *FakeTableIterator) NextUtilRegexIsMatched(regex *regexp.Regexp) (Row, bool)
NextUtilRegexIsMatched returns the next Row whose RawText matches the given regex.
type PDFTablePositions ¶
type PDFTablePositions struct {
OriginalDateStart int
OriginalDateEnd int
ReceiptStart int
ReceiptEnd int
DetailStart int
DetailEnd int
ARSAmountStart int
ARSAmountEnd int
USDAmountStart int
USDAmountEnd int
}
PDFTablePositions holds the known column positions of the PDF table. Positions are zero-based, so the first character of a row is at position 0. Both the start and end position of each column are inclusive.
type RealTableIterator ¶
type RealTableIterator struct {
// contains filtered or unexported fields
}
RealTableIterator iterates over the rows of a PDF table. It is meant to be shared between functions, so that each function continues where the previous one stopped. The "Real" prefix only distinguishes it from the TableIterator interface and from FakeTableIterator; no better name was found.
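The sharing described above can be sketched with simplified stand-ins. The types `row` and `sliceIterator` and the functions `readHeader` and `readBody` are hypothetical; the sketch also assumes that the boolean returned by Next is true when a row was returned and false once the rows are exhausted.

```go
package main

import "fmt"

// row and sliceIterator are simplified stand-ins for Row and
// RealTableIterator; they are not the package's types.
type row struct{ RawText string }

type sliceIterator struct {
	rows []row
	pos  int
}

// Next returns the next row and true, or a zero row and false once the
// rows are exhausted. That meaning of the boolean is an assumption.
func (it *sliceIterator) Next() (row, bool) {
	if it.pos >= len(it.rows) {
		return row{}, false
	}
	r := it.rows[it.pos]
	it.pos++
	return r, true
}

// readHeader consumes exactly one row from the shared iterator.
func readHeader(it *sliceIterator) string {
	r, _ := it.Next()
	return r.RawText
}

// readBody consumes whatever rows are left after earlier callers.
func readBody(it *sliceIterator) []string {
	var out []string
	for r, ok := it.Next(); ok; r, ok = it.Next() {
		out = append(out, r.RawText)
	}
	return out
}

func main() {
	it := &sliceIterator{rows: []row{{"header"}, {"movement 1"}, {"movement 2"}}}
	fmt.Println(readHeader(it)) // first row only
	fmt.Println(readBody(it))   // the two remaining rows
}
```

Because both functions receive the same pointer, readBody starts at the second row instead of re-reading the header.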
func NewRealTableIterator ¶
func NewRealTableIterator(rows []Row) *RealTableIterator
func (*RealTableIterator) Next ¶
func (it *RealTableIterator) Next() (Row, bool)
Next implements the TableIterator interface. It returns the next Row in the sequence and a boolean indicating if there are more rows to return.
func (*RealTableIterator) NextUtilRegexIsMatched ¶
func (it *RealTableIterator) NextUtilRegexIsMatched(regex *regexp.Regexp) (Row, bool)
NextUtilRegexIsMatched implements the TableIterator interface. It iterates through the rows until a row matches the regex and returns it.
type Row ¶
type Row struct {
// Full text of the row
RawText string
// The columns of the row
RawOriginalDate string
RawReceiptNumber string
RawDetailWithMaybeInstallments string
RawAmountARS string
RawAmountUSD string
}
func (Row) MatchesMovementWithoutYearAndMonth ¶
type RowFactory ¶
type RowFactory struct {
// contains filtered or unexported fields
}
RowFactory is responsible for creating Row instances with specific table position configurations.
func NewRowFactory ¶
func NewRowFactory(positions PDFTablePositions) *RowFactory
NewRowFactory creates a new RowFactory with the given table positions.
func (*RowFactory) CreateRow ¶
func (f *RowFactory) CreateRow(rawText string) Row
CreateRow creates a new Row instance using the factory's stored positions and the provided text.
type TableIterator ¶
type TableIterator interface {
// Next returns the next Row in the sequence and a boolean indicating if there are more rows to return.
Next() (Row, bool)
// NextUtilRegexIsMatched iterates through the rows until a row matches the regex and returns that row.
NextUtilRegexIsMatched(regex *regexp.Regexp) (Row, bool)
}
type TableIteratorFactory ¶
type TableIteratorFactory struct {
// contains filtered or unexported fields
}
TableIteratorFactory is responsible for creating table iterators with specific configurations.
func NewTableIteratorFactory ¶
func NewTableIteratorFactory(rowFactory *RowFactory) *TableIteratorFactory
func (*TableIteratorFactory) CreateIterator ¶
func (f *TableIteratorFactory) CreateIterator(docIterator pdfwrapper.DocumentIterator) *RealTableIterator
CreateIterator creates a new RealTableIterator instance using the factory's configuration and the provided document iterator. It pre-parses all rows from the DocumentIterator during construction for efficient iteration.