testgen

v0.29.4
Published: Nov 24, 2025 License: MIT Imports: 10 Imported by: 0

Test Document Generator

A command-line tool to generate test PDF documents and their corresponding .txt sidecar files for testing godocs functionality.

Usage

# Run the generator
go run cmd/testgen/main.go

# Or build and run
go build -o testgen cmd/testgen/main.go
./testgen

Generated Test Documents

The tool creates test documents in the testdocs/ directory:

1. Empty PDF (1-empty.pdf)
  • Purpose: Test handling of PDFs with no text content
  • Content: Empty PDF with a single blank page
  • Sidecar: Empty 1-empty.txt file
2. Hello World PDF (2-hello.pdf)
  • Purpose: Test basic text extraction
  • Content: Simple "Hello World" text with a short description
  • Sidecar: 2-hello.txt with the same text content
3. Diagram PDF (3-diagram.pdf)
  • Purpose: Test handling of PDFs with graphics/diagrams
  • Content: System architecture flowchart diagram
  • Sidecar: 3-diagram.txt with description of the diagram
4. Long Text PDF (4-longtext.pdf)
  • Purpose: Test handling of documents with large amounts of text
  • Content: Go source code example (demonstrates code formatting)
  • Sidecar: 4-longtext.txt with the same source code

Use Cases

  • Ingestion Testing: Copy files to the ingress folder to test document ingestion
  • OCR Testing: Test OCR functionality with different document types
  • Sidecar Testing: Verify sidecar .txt file reading and writing
  • Search Testing: Test full-text search with various content types
  • Performance Testing: Generate multiple documents for load testing
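The ingestion use case above can be scripted. The following is a minimal sketch, not part of testgen itself: it copies everything in testdocs/ into an ingress folder. The folder name "ingress" and the helpers ingressPath and copyFile are assumptions made for illustration; substitute the directory your godocs instance is actually configured to watch.

```go
package main

import (
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// ingressPath returns where a source file lands inside the ingress folder.
// Hypothetical helper, named here for illustration only.
func ingressPath(ingressDir, src string) string {
	return filepath.Join(ingressDir, filepath.Base(src))
}

// copyFile copies src to dst, creating or truncating dst.
func copyFile(src, dst string) error {
	in, err := os.Open(src)
	if err != nil {
		return err
	}
	defer in.Close()

	out, err := os.Create(dst)
	if err != nil {
		return err
	}
	if _, err := io.Copy(out, in); err != nil {
		out.Close()
		return err
	}
	return out.Close()
}

func main() {
	// Assumption: "ingress" is a placeholder; pass the real folder as the first argument.
	ingressDir := "ingress"
	if len(os.Args) > 1 {
		ingressDir = os.Args[1]
	}

	files, err := filepath.Glob(filepath.Join("testdocs", "*"))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, src := range files {
		dst := ingressPath(ingressDir, src)
		if err := copyFile(src, dst); err != nil {
			fmt.Fprintln(os.Stderr, "copy failed:", err)
			os.Exit(1)
		}
		fmt.Println("copied", src, "->", dst)
	}
}
```

Run it after the generator has populated testdocs/. The target folder must already exist, since the sketch does not create it.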

Sidecar .txt Files

Each PDF has a corresponding .txt file that contains:

  • For text-based PDFs: The actual text content
  • For diagrams: A description of the diagram
  • For empty PDFs: An empty file

This makes it possible to test the USE_SIDECAR_TXT feature, in which godocs reads a pre-existing .txt file instead of running OCR on the PDF.
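As a rough illustration of the sidecar idea, the sketch below prefers the .txt file next to a PDF and falls back to OCR only when no sidecar exists. It is not godocs' actual implementation: the function names sidecarPath and extractText, the stand-in OCR callback, and the assumption that USE_SIDECAR_TXT is an environment variable set to "true" are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
	"strings"
)

// sidecarPath returns the .txt path that sits next to a PDF,
// e.g. testdocs/2-hello.pdf -> testdocs/2-hello.txt.
func sidecarPath(pdfPath string) string {
	return strings.TrimSuffix(pdfPath, filepath.Ext(pdfPath)) + ".txt"
}

// extractText is a hypothetical helper. When useSidecar is true and the
// sidecar exists, its content is returned (an empty file yields ""). When the
// sidecar is absent, or useSidecar is false, the ocr callback is used instead.
func extractText(pdfPath string, useSidecar bool, ocr func(string) (string, error)) (string, error) {
	if useSidecar {
		data, err := os.ReadFile(sidecarPath(pdfPath))
		if err == nil {
			return string(data), nil
		}
		if !errors.Is(err, fs.ErrNotExist) {
			return "", err
		}
	}
	return ocr(pdfPath)
}

func main() {
	// Assumption: the feature is toggled by an environment variable of the same name.
	useSidecar := os.Getenv("USE_SIDECAR_TXT") == "true"

	// Stand-in for a real OCR step.
	fakeOCR := func(path string) (string, error) {
		return "(OCR output for " + path + ")", nil
	}

	text, err := extractText("testdocs/2-hello.pdf", useSidecar, fakeOCR)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(text)
}
```

With this shape, 1-empty.pdf is a useful edge case: its sidecar exists but is empty, so the result is an empty string rather than a fallback to OCR.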

Directory Structure

testdocs/
├── 1-empty.pdf
├── 1-empty.txt
├── 2-hello.pdf
├── 2-hello.txt
├── 3-diagram.pdf
├── 3-diagram.txt
├── 4-longtext.pdf
└── 4-longtext.txt

Note: The testdocs/ directory is git-ignored and will not be committed to version control.
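A quick way to confirm the layout above is to check that every PDF has a matching sidecar. The following is a small sketch using only the standard library; the helper name missingSidecars is an assumption for illustration and is not part of testgen.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// missingSidecars takes the file names found in a directory and returns the
// .txt names that are expected (one per .pdf) but absent, in sorted order.
func missingSidecars(names []string) []string {
	present := make(map[string]bool, len(names))
	for _, n := range names {
		present[n] = true
	}
	var missing []string
	for _, n := range names {
		if filepath.Ext(n) != ".pdf" {
			continue
		}
		txt := strings.TrimSuffix(n, ".pdf") + ".txt"
		if !present[txt] {
			missing = append(missing, txt)
		}
	}
	sort.Strings(missing)
	return missing
}

func main() {
	entries, err := os.ReadDir("testdocs")
	if err != nil {
		fmt.Fprintln(os.Stderr, "could not read testdocs/ (run the generator first):", err)
		os.Exit(1)
	}
	var names []string
	for _, e := range entries {
		if !e.IsDir() {
			names = append(names, e.Name())
		}
	}
	if missing := missingSidecars(names); len(missing) > 0 {
		fmt.Println("missing sidecars:", missing)
		os.Exit(1)
	}
	fmt.Println("every PDF has a sidecar")
}
```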

Dependencies

  • github.com/jung-kurt/gofpdf - PDF generation library
