Documentation
Overview
Package filesql provides a file-based SQL driver implementation that enables querying CSV, TSV, LTSV, Parquet, and Excel (XLSX) files using SQLite3 SQL syntax.
filesql lets you treat structured data files as SQL databases without any import or transformation step. It uses SQLite3 as an in-memory database engine, providing full SQL capabilities including JOINs, aggregations, window functions, and CTEs.
Features
- Query CSV, TSV, LTSV, Parquet, and Excel (XLSX) files using standard SQL
- Automatic handling of compressed files (gzip, bzip2, xz, zstandard)
- Support for multiple input sources (files, directories, io.Reader, embed.FS)
- Efficient streaming for large files with configurable chunk sizes
- Cross-platform compatibility (Linux, macOS, Windows)
- Optional auto-save functionality to persist changes
Basic Usage
The simplest way to use filesql is with the Open or OpenContext functions:
db, err := filesql.Open("data.csv")
if err != nil {
    log.Fatal(err)
}
defer db.Close()

rows, err := db.Query("SELECT * FROM data WHERE age > 25")
if err != nil {
    log.Fatal(err)
}
defer rows.Close()
Advanced Usage
For more complex scenarios, use the Builder pattern:
builder := filesql.NewBuilder().
    AddPath("users.csv").
    AddPath("orders.tsv").
    EnableAutoSave("./output")

validatedBuilder, err := builder.Build(ctx)
if err != nil {
    log.Fatal(err)
}

db, err := validatedBuilder.Open(ctx)
if err != nil {
    log.Fatal(err)
}
defer db.Close()
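The builder also accepts non-file inputs. Below is a minimal sketch combining AddReader with SetDefaultChunkSize; it assumes a FileTypeCSV constant as the FileType value, so check the FileType documentation for the exact identifier in your version:

csvData := strings.NewReader("id,name\n1,Alice\n2,Bob\n")

builder := filesql.NewBuilder().
    AddReader(csvData, "people", filesql.FileTypeCSV). // FileTypeCSV is an assumed constant name
    SetDefaultChunkSize(500)                           // rows inserted per chunk

validated, err := builder.Build(ctx)
if err != nil {
    log.Fatal(err)
}
db, err := validated.Open(ctx)
if err != nil {
    log.Fatal(err)
}
defer db.Close()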
Table Naming
Table names are automatically derived from file paths:
- "users.csv" becomes table "users"
- "data.tsv.gz" becomes table "data"
- "/path/to/logs.ltsv" becomes table "logs"
- "sales.xlsx" with multiple sheets becomes tables "sales_Sheet1", "sales_Sheet2", etc.
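To confirm the derived names, you can list the tables from sqlite_master. A minimal sketch, assuming a compressed file named "data.tsv.gz":

db, err := filesql.Open("data.tsv.gz")
if err != nil {
    log.Fatal(err)
}
defer db.Close()

// Both the compression and format extensions are stripped,
// so the table is named "data".
rows, err := db.Query("SELECT name FROM sqlite_master WHERE type = 'table'")
if err != nil {
    log.Fatal(err)
}
defer rows.Close()

for rows.Next() {
    var name string
    if err := rows.Scan(&name); err != nil {
        log.Fatal(err)
    }
    fmt.Println(name) // prints "data"
}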
Data Modifications
INSERT, UPDATE, and DELETE operations affect only the in-memory database. Original files remain unchanged unless auto-save is enabled. To persist changes manually, use the DumpDatabase function.
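A typical modify-then-persist flow looks like the following sketch, assuming a table named "data" was loaded from "data.csv":

// The UPDATE changes only the in-memory database;
// data.csv on disk is untouched.
if _, err := db.Exec("UPDATE data SET age = age + 1"); err != nil {
    log.Fatal(err)
}

// Explicitly write the current state of all tables to ./output.
if err := filesql.DumpDatabase(db, "./output"); err != nil {
    log.Fatal(err)
}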
SQL Syntax
Since filesql uses SQLite3 as its underlying engine, all SQL syntax follows SQLite3's SQL dialect. This includes support for:
- Common Table Expressions (CTEs)
- Window functions
- JSON functions
- Date and time functions
- And all other SQLite3 features
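For instance, CTEs and window functions can be combined in a single query. A minimal sketch, assuming a "users" table with department and salary columns:

rows, err := db.Query(`
    WITH dept_avg AS (
        SELECT department, AVG(salary) AS avg_salary
        FROM users
        GROUP BY department
    )
    SELECT u.name,
           u.salary,
           RANK() OVER (ORDER BY u.salary DESC) AS overall_rank
    FROM users u
    JOIN dept_avg d ON d.department = u.department
    WHERE u.salary > d.avg_salary
`)
if err != nil {
    log.Fatal(err)
}
defer rows.Close()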
Column Name Handling
Duplicate column names are detected after trimming surrounding whitespace: headers that are identical after trimming, even when they differ only in case, are considered duplicates and will result in an error. Column names are otherwise preserved as written, maintaining backward compatibility.
For complete SQL syntax documentation, see: https://www.sqlite.org/lang.html
Index
- Constants
- Variables
- func DumpDatabase(db *sql.DB, outputDir string, opts ...DumpOptions) error
- func Open(paths ...string) (*sql.DB, error)
- func OpenContext(ctx context.Context, paths ...string) (*sql.DB, error)
- type ChunkSize
- type CompressionFactory
- func (f *CompressionFactory) CreateHandlerForFile(path string) CompressionHandler
- func (f *CompressionFactory) CreateReaderForFile(path string) (io.Reader, func() error, error)
- func (f *CompressionFactory) CreateWriterForFile(path string, compressionType CompressionType) (io.Writer, func() error, error)
- func (f *CompressionFactory) DetectCompressionType(path string) CompressionType
- func (f *CompressionFactory) GetBaseFileType(path string) FileType
- func (f *CompressionFactory) RemoveCompressionExtension(path string) string
- type CompressionHandler
- type CompressionType
- type DBBuilder
- func (b *DBBuilder) AddFS(filesystem fs.FS) *DBBuilder
- func (b *DBBuilder) AddPath(path string) *DBBuilder
- func (b *DBBuilder) AddPaths(paths ...string) *DBBuilder
- func (b *DBBuilder) AddReader(reader io.Reader, tableName string, fileType FileType) *DBBuilder
- func (b *DBBuilder) Build(ctx context.Context) (*DBBuilder, error)
- func (b *DBBuilder) DisableAutoSave() *DBBuilder
- func (b *DBBuilder) EnableAutoSave(outputDir string, options ...DumpOptions) *DBBuilder
- func (b *DBBuilder) EnableAutoSaveOnCommit(outputDir string, options ...DumpOptions) *DBBuilder
- func (b *DBBuilder) Open(ctx context.Context) (*sql.DB, error)
- func (b *DBBuilder) SetDefaultChunkSize(size int) *DBBuilder
- type DumpOptions
- type ErrorContext
- type FileType
- type MemoryInfo
- type MemoryLimit
- func (ml *MemoryLimit) CheckMemoryUsage() MemoryStatus
- func (ml *MemoryLimit) CreateMemoryError(operation string) error
- func (ml *MemoryLimit) Disable()
- func (ml *MemoryLimit) Enable()
- func (ml *MemoryLimit) GetMemoryInfo() MemoryInfo
- func (ml *MemoryLimit) IsEnabled() bool
- func (ml *MemoryLimit) SetWarningThreshold(threshold float64)
- func (ml *MemoryLimit) ShouldReduceChunkSize(chunkSize int) (bool, int)
- type MemoryPool
- func (mp *MemoryPool) ForceGC()
- func (mp *MemoryPool) GetByteBuffer() []byte
- func (mp *MemoryPool) GetRecordSlice() []Record
- func (mp *MemoryPool) GetStringSlice() []string
- func (mp *MemoryPool) PutByteBuffer(buf []byte)
- func (mp *MemoryPool) PutRecordSlice(slice []Record)
- func (mp *MemoryPool) PutStringSlice(slice []string)
- type MemoryStatus
- type OutputFormat
- type Record
- type TableName
Examples
- DBBuilder (Chaining)
- DBBuilder (ErrorHandling)
- DBBuilder.AddFS
- DBBuilder.AddFS (EmbedFS)
- DBBuilder.AddPath
- DBBuilder.AddPaths
- DBBuilder.AddReader
- DBBuilder.AddReader (Compressed)
- DBBuilder.AddReader (Multiple)
- DBBuilder.Build
- DBBuilder.EnableAutoSave
- DBBuilder.EnableAutoSaveOnCommit
- DBBuilder.Open
- DumpDatabase
- DumpDatabase (DataProcessing)
- DumpDatabase (MultipleFormats)
- DumpDatabase (WithOptions)
- DumpOptions.FileExtension
- NewBuilder
- Open
- Open (AdvancedSQL)
- Open (CompressionSupport)
- Open (Constraints)
- Open (CustomerInsights)
- Open (ErrorHandling)
- Open (FinancialDataAnalysis)
- Open (MultipleFiles)
- Open (PerformanceOptimization)
- Open (SalesAnalysis)
- Open (WebLogAnalysis)
- OpenContext
Constants
const (
    // DefaultRowsPerChunk is the default number of rows per chunk
    DefaultRowsPerChunk = 1000

    // DefaultChunkSize is the default chunk size (rows); alias for clarity
    DefaultChunkSize = DefaultRowsPerChunk

    // MinChunkSize is the minimum allowed rows per chunk
    MinChunkSize = 1

    // ValidationPeekSize is the size used for validation peek operations
    ValidationPeekSize = 1
)
Processing constants (rows-based)
const (
    // MaxSampleSize limits how many values to sample for type inference
    MaxSampleSize = 1000

    // MinConfidenceThreshold is the minimum percentage of values that must match a type
    MinConfidenceThreshold = 0.8

    // EarlyTerminationThreshold is the percentage of text values that triggers early termination
    EarlyTerminationThreshold = 0.5

    // MinDatetimeLength is the minimum reasonable length for datetime values
    MinDatetimeLength = 4

    // MaxDatetimeLength is the maximum reasonable length for datetime values
    MaxDatetimeLength = 35

    // SamplingStratificationFactor determines when to use stratified vs simple sampling
    SamplingStratificationFactor = 3

    // MinRealThreshold is the minimum percentage of real values needed to classify as REAL
    MinRealThreshold = 0.1
)
Type inference constants
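As a worked reading of these constants (an illustration, not a specification of the full inference algorithm): with MaxSampleSize = 1000 and MinConfidenceThreshold = 0.8, at least 800 of 1000 sampled values must parse as a candidate type before a column is classified as that type, and with MinRealThreshold = 0.1, at least 10% of sampled values must be real numbers before a numeric column is classified as REAL.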
Variables
var (
    // ErrEmptyData indicates that the data source contains no records
    ErrEmptyData = errors.New("filesql: empty data source")

    // ErrUnsupportedFormat indicates an unsupported file format
    ErrUnsupportedFormat = errors.New("filesql: unsupported file format")

    // ErrInvalidData indicates malformed or invalid data
    ErrInvalidData = errors.New("filesql: invalid data format")

    // ErrNoTables indicates no tables found in database
    ErrNoTables = errors.New("filesql: no tables found in database")

    // ErrFileNotFound indicates file not found
    ErrFileNotFound = errors.New("filesql: file not found")

    // ErrPermissionDenied indicates permission denied
    ErrPermissionDenied = errors.New("filesql: permission denied")

    // ErrMemoryLimit indicates memory limit exceeded
    ErrMemoryLimit = errors.New("filesql: memory limit exceeded")

    // ErrContextCancelled indicates context was cancelled
    ErrContextCancelled = errors.New("filesql: context cancelled")
)
Standard error messages and error creation functions for consistency
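A sketch of checking for a specific sentinel with errors.Is, assuming the errors returned by Open wrap these values (an assumption; inspect the error chain in your version):

db, err := filesql.Open("missing.csv")
if err != nil {
    if errors.Is(err, filesql.ErrFileNotFound) {
        // Handle the missing input file separately.
        log.Println("input file not found, using defaults")
        return
    }
    log.Fatal(err)
}
defer db.Close()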
Functions
func DumpDatabase
func DumpDatabase(db *sql.DB, outputDir string, opts ...DumpOptions) error
DumpDatabase saves all database tables to files in the specified directory.
Basic usage:
err := filesql.DumpDatabase(db, "./output")
This will save all tables as CSV files in the output directory.
Advanced usage with options:
// Default: Export as CSV files
err := DumpDatabase(db, "./output")

// Export as TSV files with gzip compression
options := NewDumpOptions().
    WithFormat(OutputFormatTSV).
    WithCompression(CompressionGZ)
err = DumpDatabase(db, "./output", options)
Example
ExampleDumpDatabase demonstrates exporting modified data
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "strings"
    "time"

    "github.com/nao1215/filesql"
)

// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_example")
    if err != nil {
        log.Fatal(err)
    }

    employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
    err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
    err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}

func main() {
    tmpDir := createTempTestData()
    defer os.RemoveAll(tmpDir)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    db, err := filesql.OpenContext(ctx, filepath.Join(tmpDir, "employees.csv"))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Modify data in memory
    _, err = db.ExecContext(ctx, `
        UPDATE employees
        SET salary = salary * 1.10
        WHERE department_id = 1
    `)
    if err != nil {
        log.Fatal(err)
    }

    // Add a new employee
    _, err = db.ExecContext(ctx, `
        INSERT INTO employees (id, name, department_id, salary, hire_date)
        VALUES (99, 'New Employee', 2, 60000, '2024-01-01')
    `)
    if err != nil {
        log.Fatal(err)
    }

    // Create output directory
    outputDir := filepath.Join(tmpDir, "output")
    err = os.MkdirAll(outputDir, 0750)
    if err != nil {
        log.Fatal(err)
    }

    // Export modified data
    err = filesql.DumpDatabase(db, outputDir)
    if err != nil {
        log.Fatal(err)
    }

    // Verify export by reading the exported file
    exportedFile := filepath.Join(outputDir, "employees.csv")
    if _, err := os.Stat(exportedFile); err != nil {
        log.Fatal("Exported file not found:", err)
    }

    // Count records in exported file
    db2, err := filesql.OpenContext(ctx, exportedFile)
    if err != nil {
        log.Fatal(err)
    }
    defer db2.Close()

    var count int
    err = db2.QueryRowContext(ctx, "SELECT COUNT(*) FROM employees").Scan(&count)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Original file: 8 employees\n")
    fmt.Printf("Modified and exported: %d employees\n", count)

    // Extract just the filename for consistent output (normalize path separators for cross-platform compatibility)
    exportPath := strings.Replace(exportedFile, tmpDir, "/tmp/filesql_example*", 1)
    exportPath = strings.ReplaceAll(exportPath, "\\", "/") // Convert Windows backslashes to forward slashes
    fmt.Printf("Export location: %s\n", exportPath)
}
Output:

Original file: 8 employees
Modified and exported: 9 employees
Export location: /tmp/filesql_example*/output/employees.csv
Example (DataProcessing)
package main

import (
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

func main() {
    tempDir := filepath.Join(os.TempDir(), "filesql_processing_example")
    if err := os.MkdirAll(tempDir, 0750); err != nil {
        log.Fatal(err)
    }
    defer os.RemoveAll(tempDir)

    // Open CSV file
    db, err := filesql.Open(filepath.Join("testdata", "sample.csv"))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Process data with SQL
    _, err = db.Exec(`
        UPDATE sample
        SET age = age + 1
        WHERE name LIKE '%John%'
    `)
    if err != nil {
        log.Fatal(err)
    }

    // Add aggregated data
    _, err = db.Exec(`
        INSERT INTO sample (id, name, age, email)
        SELECT 999, 'Summary: ' || COUNT(*), AVG(age), 'summary@example.com'
        FROM sample
        WHERE id < 999
    `)
    if err != nil {
        log.Fatal(err)
    }

    // Export processed data in different formats for different use cases

    // 1. TSV for spreadsheet import
    options := filesql.NewDumpOptions().WithFormat(filesql.OutputFormatTSV)
    spreadsheetDir := filepath.Join(tempDir, "for_spreadsheet")
    if err := filesql.DumpDatabase(db, spreadsheetDir, options); err != nil {
        log.Fatal(err)
    }
    fmt.Println("Exported TSV for spreadsheet import")

    // 2. Compressed CSV for archival
    options = filesql.NewDumpOptions().
        WithFormat(filesql.OutputFormatCSV).
        WithCompression(filesql.CompressionGZ)
    archiveDir := filepath.Join(tempDir, "for_archive")
    if err := filesql.DumpDatabase(db, archiveDir, options); err != nil {
        log.Fatal(err)
    }
    fmt.Println("Exported compressed CSV for archival")

    // 3. LTSV for log analysis
    options = filesql.NewDumpOptions().WithFormat(filesql.OutputFormatLTSV)
    logDir := filepath.Join(tempDir, "for_logs")
    if err := filesql.DumpDatabase(db, logDir, options); err != nil {
        log.Fatal(err)
    }
    fmt.Println("Exported LTSV for log analysis")

    // Show what was created
    dirs := []string{"for_spreadsheet", "for_archive", "for_logs"}
    for _, dir := range dirs {
        files, err := filepath.Glob(filepath.Join(tempDir, dir, "*"))
        if err != nil {
            log.Fatal(err)
        }
        for _, file := range files {
            fmt.Printf("%s: %s\n", dir, filepath.Base(file))
        }
    }
}
Output:

Exported TSV for spreadsheet import
Exported compressed CSV for archival
Exported LTSV for log analysis
for_spreadsheet: sample.tsv
for_archive: sample.csv.gz
for_logs: sample.ltsv
Example (MultipleFormats)
package main

import (
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

func main() {
    tempDir := filepath.Join(os.TempDir(), "filesql_formats_example")
    if err := os.MkdirAll(tempDir, 0750); err != nil {
        log.Fatal(err)
    }
    defer os.RemoveAll(tempDir)

    // Open CSV file and modify data
    db, err := filesql.Open(filepath.Join("testdata", "sample.csv"))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Add some data to demonstrate functionality
    _, err = db.Exec("INSERT INTO sample (id, name, age, email) VALUES (4, 'Alice Brown', 28, 'alice@example.com')")
    if err != nil {
        log.Fatal(err)
    }

    // Demonstrate different compression options
    compressionTypes := []struct {
        name        string
        compression filesql.CompressionType
        extension   string
    }{
        {"No compression", filesql.CompressionNone, ""},
        {"Gzip compression", filesql.CompressionGZ, ".gz"},
        {"XZ compression", filesql.CompressionXZ, ".xz"},
        {"Zstd compression", filesql.CompressionZSTD, ".zst"},
    }

    for _, ct := range compressionTypes {
        fmt.Printf("%s:\n", ct.name)

        options := filesql.NewDumpOptions().
            WithFormat(filesql.OutputFormatCSV).
            WithCompression(ct.compression)

        outputDir := filepath.Join(tempDir, "compression_"+ct.compression.String())
        if err := filesql.DumpDatabase(db, outputDir, options); err != nil {
            log.Fatal(err)
        }

        files, err := filepath.Glob(filepath.Join(outputDir, "*"))
        if err != nil {
            log.Fatal(err)
        }
        for _, file := range files {
            fmt.Printf("  %s\n", filepath.Base(file))
        }
    }
}
Output:

No compression:
  sample.csv
Gzip compression:
  sample.csv.gz
XZ compression:
  sample.csv.xz
Zstd compression:
  sample.csv.zst
Example (WithOptions)
package main

import (
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

func main() {
    // Create a temporary directory for output
    tempDir := filepath.Join(os.TempDir(), "filesql_dump_example")
    if err := os.MkdirAll(tempDir, 0750); err != nil {
        log.Fatal(err)
    }
    defer os.RemoveAll(tempDir)

    // Open CSV file
    db, err := filesql.Open(filepath.Join("testdata", "sample.csv"))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Example 1: Default CSV output (no options)
    fmt.Println("Example 1: Default CSV output")
    csvDir := filepath.Join(tempDir, "csv_output")
    if err := filesql.DumpDatabase(db, csvDir); err != nil {
        log.Fatal(err)
    }

    // List output files
    files1, err := filepath.Glob(filepath.Join(csvDir, "*"))
    if err != nil {
        log.Fatal(err)
    }
    for _, file := range files1 {
        fmt.Printf("Created: %s\n", filepath.Base(file))
    }

    // Example 2: TSV output with gzip compression
    fmt.Println("\nExample 2: TSV output with gzip compression")
    tsvDir := filepath.Join(tempDir, "tsv_output")
    options := filesql.NewDumpOptions().
        WithFormat(filesql.OutputFormatTSV).
        WithCompression(filesql.CompressionGZ)
    if err := filesql.DumpDatabase(db, tsvDir, options); err != nil {
        log.Fatal(err)
    }

    files2, err := filepath.Glob(filepath.Join(tsvDir, "*"))
    if err != nil {
        log.Fatal(err)
    }
    for _, file := range files2 {
        fmt.Printf("Created: %s\n", filepath.Base(file))
    }

    // Example 3: LTSV output with zstd compression
    fmt.Println("\nExample 3: LTSV output with zstd compression")
    ltsvDir := filepath.Join(tempDir, "ltsv_output")
    options3 := filesql.NewDumpOptions().
        WithFormat(filesql.OutputFormatLTSV).
        WithCompression(filesql.CompressionZSTD)
    if err := filesql.DumpDatabase(db, ltsvDir, options3); err != nil {
        log.Fatal(err)
    }

    files3, err := filepath.Glob(filepath.Join(ltsvDir, "*"))
    if err != nil {
        log.Fatal(err)
    }
    for _, file := range files3 {
        fmt.Printf("Created: %s\n", filepath.Base(file))
    }
}
Output:

Example 1: Default CSV output
Created: sample.csv

Example 2: TSV output with gzip compression
Created: sample.tsv.gz

Example 3: LTSV output with zstd compression
Created: sample.ltsv.zst
func Open
Open creates an SQL database from CSV, TSV, or LTSV files.
Quick start:
db, err := filesql.Open("data.csv")
if err != nil {
    return err
}
defer db.Close()

rows, err := db.Query("SELECT * FROM data WHERE age > 25")
Parameters:
- paths: One or more file paths or directories
- Files: "users.csv", "products.tsv", "logs.ltsv"
- Compressed: "data.csv.gz", "archive.tsv.bz2"
- Directories: "/data/" (loads all CSV/TSV/LTSV files recursively)
Table names:
- "users.csv" → table "users"
- "data.tsv.gz" → table "data"
- "/path/to/sales.csv" → table "sales"
Note: Original files are never modified. Changes exist only in memory. To save changes, use the DumpDatabase function.
Example with a complex query:
// Open a single CSV file
db, err := filesql.Open("data/users.csv")
if err != nil {
    log.Fatal(err)
}
defer db.Close()

// Complex query with JOINs, aggregation, and window functions
rows, err := db.Query(`
    SELECT
        u.name,
        u.department,
        u.salary,
        AVG(u.salary) OVER (PARTITION BY u.department) as dept_avg_salary,
        RANK() OVER (PARTITION BY u.department ORDER BY u.salary DESC) as salary_rank,
        COUNT(*) OVER (PARTITION BY u.department) as dept_size
    FROM users u
    WHERE u.salary > (
        SELECT AVG(salary) * 0.8
        FROM users
        WHERE department = u.department
    )
    ORDER BY u.department, u.salary DESC
`)
if err != nil {
    log.Fatal(err)
}
defer rows.Close()

// Process results
for rows.Next() {
    var name, dept string
    var salary, deptAvg float64
    var rank, deptSize int
    if err := rows.Scan(&name, &dept, &salary, &deptAvg, &rank, &deptSize); err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%s (%s): $%.2f (Rank: %d/%d, Dept Avg: $%.2f)\n",
        name, dept, salary, rank, deptSize, deptAvg)
}
Example
ExampleOpen demonstrates how to use filesql.Open() with complex SQL queries. This example shows advanced SQL features including JOINs, window functions, subqueries, and aggregations on CSV data loaded into an in-memory SQLite database.
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

func main() {
    // Create temporary test data files
    tmpDir := createTempTestData()
    defer os.RemoveAll(tmpDir)

    // Open the database with multiple files
    db, err := filesql.Open(filepath.Join(tmpDir, "employees.csv"), filepath.Join(tmpDir, "departments.csv"))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Complex query demonstrating multiple SQL features:
    // - JOINs between tables
    // - Window functions (RANK, AVG, COUNT)
    // - Subqueries with correlated conditions
    // - CASE statements
    // - Grouping and ordering
    query := `
        SELECT
            e.name,
            d.name as department_name,
            e.salary,
            d.budget,
            RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) as salary_rank_in_dept,
            AVG(e.salary) OVER (PARTITION BY e.department_id) as dept_avg_salary,
            COUNT(*) OVER (PARTITION BY e.department_id) as dept_employee_count,
            CASE
                WHEN e.salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) * 1.2
                THEN 'High Performer'
                WHEN e.salary < (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) * 0.8
                THEN 'Below Average'
                ELSE 'Average'
            END as performance_category,
            ROUND(e.salary / d.budget * 100, 2) as salary_budget_percentage
        FROM employees e
        JOIN departments d ON e.department_id = d.id
        WHERE e.salary > (
            SELECT AVG(salary) * 0.7
            FROM employees e2
            WHERE e2.department_id = e.department_id
        )
        AND d.budget > 500000
        ORDER BY d.name, e.salary DESC
        LIMIT 10
    `

    rows, err := db.QueryContext(context.Background(), query)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    fmt.Println("Employee Analysis Report:")
    fmt.Println("========================")

    for rows.Next() {
        var name, deptName, perfCategory string
        var salary, budget, deptAvgSalary, salaryBudgetPct float64
        var salaryRank, deptEmpCount int

        err := rows.Scan(&name, &deptName, &salary, &budget, &salaryRank,
            &deptAvgSalary, &deptEmpCount, &perfCategory, &salaryBudgetPct)
        if err != nil {
            log.Fatal(err)
        }

        fmt.Printf("%-15s | %-12s | $%7.0f | Rank: %d/%d | %s\n",
            name, deptName, salary, salaryRank, deptEmpCount, perfCategory)
    }

    if err = rows.Err(); err != nil {
        log.Fatal(err)
    }
}

// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_example")
    if err != nil {
        log.Fatal(err)
    }

    employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
    err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
    err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}
Output:

Employee Analysis Report:
========================
Alice Johnson   | Engineering  | $  95000 | Rank: 1/4 | Average
Bob Smith       | Engineering  | $  85000 | Rank: 2/4 | Average
Charlie Brown   | Engineering  | $  80000 | Rank: 3/4 | Average
David Wilson    | Engineering  | $  75000 | Rank: 4/4 | Average
Eve Davis       | Marketing    | $  70000 | Rank: 1/2 | Average
Frank Miller    | Marketing    | $  65000 | Rank: 2/2 | Average
Grace Lee       | Sales        | $  60000 | Rank: 1/2 | Average
Henry Taylor    | Sales        | $  55000 | Rank: 2/2 | Average
Example (AdvancedSQL)
ExampleOpen_advancedSQL demonstrates advanced SQL features available in SQLite3
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "strings"
    "time"

    "github.com/nao1215/filesql"
)

func main() {
    tmpDir := createAdvancedTestData()
    defer os.RemoveAll(tmpDir)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    db, err := filesql.OpenContext(ctx, tmpDir)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    fmt.Println("=== Advanced SQL Features ===")

    // Window functions with RANK() and ROW_NUMBER()
    fmt.Println("\n1. Window Functions - Employee Rankings by Department:")
    rows, err := db.QueryContext(ctx, `
        SELECT
            e.name,
            d.name as department,
            e.salary,
            RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) as salary_rank,
            ROW_NUMBER() OVER (ORDER BY e.salary DESC) as overall_rank
        FROM employees e
        JOIN departments d ON e.department_id = d.id
        ORDER BY e.department_id, salary_rank
    `)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("%-15s %-12s %-8s %-10s %s\n", "Name", "Department", "Salary", "Dept Rank", "Overall Rank")
    fmt.Println(strings.Repeat("-", 65))
    for rows.Next() {
        var name, department string
        var salary float64
        var salaryRank, overallRank int
        if err := rows.Scan(&name, &department, &salary, &salaryRank, &overallRank); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%-15s %-12s $%-7.0f %-10d %d\n", name, department, salary, salaryRank, overallRank)
    }
    _ = rows.Close() // Ignore close error in test cleanup

    // Common Table Expressions (CTE)
    fmt.Println("\n2. Common Table Expressions - Department Analysis:")
    rows, err = db.QueryContext(ctx, `
        WITH dept_stats AS (
            SELECT
                d.name as department,
                COUNT(e.id) as employee_count,
                AVG(e.salary) as avg_salary,
                MAX(e.salary) as max_salary,
                MIN(e.salary) as min_salary
            FROM departments d
            LEFT JOIN employees e ON d.id = e.department_id
            GROUP BY d.id, d.name
        ),
        company_avg AS (
            SELECT AVG(salary) as company_avg_salary FROM employees
        )
        SELECT
            ds.department,
            ds.employee_count,
            ds.avg_salary,
            ca.company_avg_salary,
            ds.avg_salary - ca.company_avg_salary as salary_diff,
            CASE
                WHEN ds.avg_salary > ca.company_avg_salary THEN 'Above Average'
                WHEN ds.avg_salary < ca.company_avg_salary THEN 'Below Average'
                ELSE 'At Average'
            END as comparison
        FROM dept_stats ds
        CROSS JOIN company_avg ca
        WHERE ds.employee_count > 0
        ORDER BY ds.avg_salary DESC
    `)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("%-12s %-5s %-10s %-12s %-10s %s\n", "Department", "Count", "Avg Salary", "Company Avg", "Difference", "Comparison")
    fmt.Println(strings.Repeat("-", 75))
    for rows.Next() {
        var department, comparison string
        var employeeCount int
        var avgSalary, companyAvg, salaryDiff float64
        if err := rows.Scan(&department, &employeeCount, &avgSalary, &companyAvg, &salaryDiff, &comparison); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%-12s %-5d $%-9.0f $%-11.0f $%-9.0f %s\n", department, employeeCount, avgSalary, companyAvg, salaryDiff, comparison)
    }
    _ = rows.Close() // Ignore close error in test cleanup

    // Text functions
    fmt.Println("\n3. Text Functions - Name Analysis:")
    rows, err = db.QueryContext(ctx, `
        SELECT
            name,
            LENGTH(name) as name_length,
            UPPER(SUBSTR(name, 1, 1)) || LOWER(SUBSTR(name, 2)) as formatted_name,
            INSTR(name, ' ') as space_position,
            CASE
                WHEN INSTR(name, ' ') > 0
                THEN SUBSTR(name, 1, INSTR(name, ' ') - 1)
                ELSE name
            END as first_name
        FROM employees
        WHERE LENGTH(name) > 8
        ORDER BY name_length DESC
    `)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("%-15s %-6s %-15s %-8s %s\n", "Name", "Length", "Formatted", "Space@", "First Name")
    fmt.Println(strings.Repeat("-", 60))
    for rows.Next() {
        var name, formattedName, firstName string
        var nameLength, spacePos int
        if err := rows.Scan(&name, &nameLength, &formattedName, &spacePos, &firstName); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%-15s %-6d %-15s %-8d %s\n", name, nameLength, formattedName, spacePos, firstName)
    }
    _ = rows.Close() // Ignore close error in test cleanup
}

// createAdvancedTestData creates test data for advanced SQL examples
func createAdvancedTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_advanced_example_*")
    if err != nil {
        log.Fatal(err)
    }

    employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2023-01-15
2,Bob Smith,2,85000,2023-02-20
3,Charlie Brown,1,80000,2023-03-10
4,David Wilson,1,75000,2023-04-05
5,Eve Davis,2,65000,2023-05-15
6,Frank Miller,3,70000,2023-06-01`
    employeesFile := filepath.Join(tmpDir, "employees.csv")
    err = os.WriteFile(employeesFile, []byte(employeesData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    departmentsData := `id,name,budget
1,Engineering,500000
2,Sales,300000
3,Marketing,200000
4,HR,150000`
    departmentsFile := filepath.Join(tmpDir, "departments.csv")
    err = os.WriteFile(departmentsFile, []byte(departmentsData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}
Output:

=== Advanced SQL Features ===

1. Window Functions - Employee Rankings by Department:
Name            Department   Salary   Dept Rank  Overall Rank
-----------------------------------------------------------------
Alice Johnson   Engineering  $95000   1          1
Charlie Brown   Engineering  $80000   2          3
David Wilson    Engineering  $75000   3          4
Bob Smith       Sales        $85000   1          2
Eve Davis       Sales        $65000   2          6
Frank Miller    Marketing    $70000   1          5

2. Common Table Expressions - Department Analysis:
Department   Count Avg Salary Company Avg  Difference Comparison
---------------------------------------------------------------------------
Engineering  3     $83333     $78333       $5000      Above Average
Sales        2     $75000     $78333       $-3333     Below Average
Marketing    1     $70000     $78333       $-8333     Below Average

3. Text Functions - Name Analysis:
Name            Length Formatted       Space@   First Name
------------------------------------------------------------
Alice Johnson   13     Alice johnson   6        Alice
Charlie Brown   13     Charlie brown   8        Charlie
David Wilson    12     David wilson    6        David
Frank Miller    12     Frank miller    6        Frank
Bob Smith       9      Bob smith       4        Bob
Eve Davis       9      Eve davis       4        Eve
Example (CompressionSupport)
ExampleOpen_compressionSupport demonstrates working with compressed files
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "strings"
    "time"

    "github.com/nao1215/filesql"
)

func main() {
    tmpDir := createCompressedTestData()
    defer os.RemoveAll(tmpDir)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    // Open compressed files seamlessly
    db, err := filesql.OpenContext(ctx, tmpDir)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    fmt.Println("=== Compression Support Demo ===")
    fmt.Println("Successfully loaded compressed files:")

    // List all tables from compressed files
    rows, err := db.QueryContext(ctx, `
        SELECT name, sql FROM sqlite_master
        WHERE type='table'
        ORDER BY name
    `)
    if err != nil {
        log.Fatal(err)
    }

    for rows.Next() {
        var tableName, createSQL string
        if err := rows.Scan(&tableName, &createSQL); err != nil {
            log.Fatal(err)
        }

        // Count records in each table
        var count int
        countQuery := "SELECT COUNT(*) FROM " + tableName
        if err := db.QueryRowContext(ctx, countQuery).Scan(&count); err != nil {
            log.Fatal(err)
        }

        fmt.Printf("- %s: %d records\n", tableName, count)
    }
    _ = rows.Close() // Ignore close error in test cleanup

    // Demonstrate querying across compressed files
    fmt.Println("\nCross-file analysis from compressed data:")
    analysisRows, err := db.QueryContext(ctx, `
        SELECT
            'logs' as source_table,
            COUNT(*) as total_records,
            MIN(timestamp) as earliest,
            MAX(timestamp) as latest
        FROM logs
        UNION ALL
        SELECT
            'products' as source_table,
            COUNT(*) as total_records,
            'N/A' as earliest,
            'N/A' as latest
        FROM products
        ORDER BY source_table
    `)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("%-12s %-8s %-19s %s\n", "Table", "Records", "Earliest", "Latest")
    fmt.Println(strings.Repeat("-", 60))
    for analysisRows.Next() {
        var sourceTable, earliest, latest string
        var totalRecords int
        if err := analysisRows.Scan(&sourceTable, &totalRecords, &earliest, &latest); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%-12s %-8d %-19s %s\n", sourceTable, totalRecords, earliest, latest)
    }
    _ = analysisRows.Close() // Ignore close error in test cleanup
}

// createCompressedTestData creates test data with compressed files
func createCompressedTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_compressed_example_*")
    if err != nil {
        log.Fatal(err)
    }

    logsData := `timestamp,level,message,user_id
2024-01-01 10:00:00,INFO,User login,1001
2024-01-01 11:30:00,INFO,Order created,1002
2024-01-01 12:15:00,ERROR,Payment failed,1003
2024-01-01 13:45:00,INFO,User logout,1001
2024-01-01 14:00:00,INFO,System backup completed,0`
    logsFile := filepath.Join(tmpDir, "logs.csv")
    err = os.WriteFile(logsFile, []byte(logsData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    productsData := `id,name,category,price,in_stock
1,Laptop Pro,Electronics,1299.99,true
2,Office Chair,Furniture,299.99,true
3,Wireless Mouse,Electronics,49.99,false`
    productsFile := filepath.Join(tmpDir, "products.csv")
    err = os.WriteFile(productsFile, []byte(productsData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}
Output:

=== Compression Support Demo ===
Successfully loaded compressed files:
- logs: 5 records
- products: 3 records

Cross-file analysis from compressed data:
Table        Records  Earliest            Latest
------------------------------------------------------------
logs         5        2024-01-01 10:00:00 2024-01-01 14:00:00
products     3        N/A                 N/A
Example (Constraints)
ExampleOpen_constraints demonstrates the constraint that modifications don't affect original files
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_example")
    if err != nil {
        log.Fatal(err)
    }

    employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
    err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
    err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}

func main() {
    tmpDir := createTempTestData()
    defer os.RemoveAll(tmpDir)

    db, err := filesql.Open(filepath.Join(tmpDir, "employees.csv"))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Show original data count
    var originalCount int
    err = db.QueryRowContext(context.Background(), "SELECT COUNT(*) FROM employees").Scan(&originalCount)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Original employee count: %d\n", originalCount)

    // Insert new data (only affects in-memory database)
    _, err = db.ExecContext(context.Background(), "INSERT INTO employees (id, name, department_id, salary, hire_date) VALUES (99, 'Test User', 1, 50000, '2023-01-01')")
    if err != nil {
        log.Fatal(err)
    }

    // Show in-memory count
    var memoryCount int
    err = db.QueryRowContext(context.Background(), "SELECT COUNT(*) FROM employees").Scan(&memoryCount)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("In-memory count after INSERT: %d\n", memoryCount)

    // Verify original file is unchanged by reopening
    db2, err := filesql.Open(filepath.Join(tmpDir, "employees.csv"))
    if err != nil {
        log.Fatal(err)
    }
    defer db2.Close()

    var fileCount int
    err = db2.QueryRowContext(context.Background(), "SELECT COUNT(*) FROM employees").Scan(&fileCount)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("File-based count (unchanged): %d\n", fileCount)
}
Output:

Original employee count: 8
In-memory count after INSERT: 9
File-based count (unchanged): 8
Example (CustomerInsights)
ExampleOpen_customerInsights demonstrates customer behavior analysis
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "strings"
    "time"

    "github.com/nao1215/filesql"
)

// createSalesTestData creates test data for sales analysis examples
func createSalesTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_sales_example")
    if err != nil {
        log.Fatal(err)
    }

    salesData := `order_id,customer_id,product_name,category,quantity,unit_price,order_date,region
1,101,Laptop Pro,Electronics,2,1299.99,2024-01-15,North
2,102,Wireless Mouse,Electronics,1,29.99,2024-01-16,South
3,103,Office Chair,Furniture,1,299.99,2024-01-17,East
4,101,USB Cable,Electronics,3,12.99,2024-01-18,North
5,104,Standing Desk,Furniture,1,599.99,2024-01-19,West
6,105,Bluetooth Speaker,Electronics,2,79.99,2024-01-20,South
7,106,Coffee Table,Furniture,1,199.99,2024-01-21,East
8,102,Keyboard,Electronics,1,89.99,2024-01-22,South
9,107,Monitor 24inch,Electronics,1,249.99,2024-01-23,North
10,103,Desk Lamp,Furniture,2,39.99,2024-01-24,East`
    err = os.WriteFile(filepath.Join(tmpDir, "sales.csv"), []byte(salesData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    customersData := `customer_id,name,email,city,registration_date
101,John Doe,john@example.com,New York,2023-06-01
102,Jane Smith,jane@example.com,Los Angeles,2023-07-15
103,Bob Johnson,bob@example.com,Chicago,2023-08-20
104,Alice Brown,alice@example.com,Houston,2023-09-10
105,Charlie Wilson,charlie@example.com,Phoenix,2023-10-05
106,Diana Lee,diana@example.com,Philadelphia,2023-11-12
107,Frank Miller,frank@example.com,San Antonio,2023-12-03`
    err = os.WriteFile(filepath.Join(tmpDir, "customers.csv"), []byte(customersData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}

func main() {
    tmpDir := createSalesTestData()
    defer os.RemoveAll(tmpDir)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    db, err := filesql.OpenContext(ctx, tmpDir)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Customer lifetime value and behavior analysis
    query := `
        SELECT
            c.name,
            c.city,
            COUNT(s.order_id) as total_orders,
            SUM(s.quantity * s.unit_price) as lifetime_value,
            AVG(s.quantity * s.unit_price) as avg_order_value,
            MIN(s.order_date) as first_purchase,
            MAX(s.order_date) as last_purchase,
            julianday(MAX(s.order_date)) - julianday(MIN(s.order_date)) as days_active,
            COUNT(DISTINCT s.category) as categories_purchased
        FROM customers c
        JOIN sales s ON c.customer_id = s.customer_id
        GROUP BY c.customer_id, c.name, c.city
        HAVING total_orders > 1
        ORDER BY lifetime_value DESC
    `

    rows, err := db.QueryContext(ctx, query)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    fmt.Println("Customer Insights (Multi-Purchase Customers):")
    fmt.Println("===========================================")
    fmt.Printf("%-12s %-12s %-7s %-10s %-10s %-12s %-12s %-6s %s\n",
        "Name", "City", "Orders", "LTV", "Avg Order", "First Buy", "Last Buy", "Days", "Categories")
    fmt.Println(strings.Repeat("-", 100))

    for rows.Next() {
        var name, city, firstPurchase, lastPurchase string
        var totalOrders, daysActive, categoriesPurchased int
        var lifetimeValue, avgOrderValue float64

        err := rows.Scan(&name, &city, &totalOrders, &lifetimeValue, &avgOrderValue,
            &firstPurchase, &lastPurchase, &daysActive, &categoriesPurchased)
        if err != nil {
            log.Fatal(err)
        }

        fmt.Printf("%-12s %-12s %-7d $%-9.2f $%-9.2f %-12s %-12s %-6d %d\n",
            name, city, totalOrders, lifetimeValue, avgOrderValue,
            firstPurchase, lastPurchase, daysActive, categoriesPurchased)
    }
}
Output:

Customer Insights (Multi-Purchase Customers):
===========================================
Name         City         Orders  LTV        Avg Order  First Buy    Last Buy     Days   Categories
----------------------------------------------------------------------------------------------------
John Doe     New York     2       $2638.95   $1319.47   2024-01-15   2024-01-18   3      1
Bob Johnson  Chicago      2       $379.97    $189.99    2024-01-17   2024-01-24   7      1
Jane Smith   Los Angeles  2       $119.98    $59.99     2024-01-16   2024-01-22   6      1
Example (ErrorHandling)
ExampleOpen_errorHandling demonstrates proper error handling patterns
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "strings"
    "time"

    "github.com/nao1215/filesql"
)

// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_example")
    if err != nil {
        log.Fatal(err)
    }

    employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
    err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
    err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}

func main() {
    // Example 1: Handling non-existent files gracefully
    _, err := filesql.Open("nonexistent.csv")
    if err != nil {
        fmt.Printf("Expected error for non-existent file: %v\n", err)
    }

    // Example 2: Context timeout handling
    ctx, cancel := context.WithTimeout(context.Background(), 1*time.Nanosecond) // Very short timeout
    defer cancel()

    // This will likely timeout
    tmpDir := createTempTestData()
    defer os.RemoveAll(tmpDir)

    time.Sleep(10 * time.Millisecond) // Ensure timeout triggers
    _, err = filesql.OpenContext(ctx, tmpDir)
    if err != nil {
        // Extract the core error message (context deadline exceeded)
        errMsg := err.Error()
        if strings.Contains(errMsg, "context deadline exceeded") {
            fmt.Printf("Expected timeout error: %s\n", "context deadline exceeded")
        } else {
            fmt.Printf("Expected timeout error: %v\n", err)
        }
    }

    // Example 3: Successful operation with proper error checking
    ctx2, cancel2 := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel2()

    db, err := filesql.OpenContext(ctx2, tmpDir)
    if err != nil {
        fmt.Printf("Unexpected error: %v\n", err)
        return
    }
    defer db.Close()

    // Test query with error handling
    rows, err := db.QueryContext(ctx2, "SELECT COUNT(*) FROM employees")
    if err != nil {
        fmt.Printf("Query error: %v\n", err)
        return
    }
    defer rows.Close()

    var count int
    if rows.Next() {
        if err := rows.Scan(&count); err != nil {
            fmt.Printf("Scan error: %v\n", err)
            return
        }
        fmt.Printf("Successfully counted %d employees\n", count)
    }
    if err := rows.Err(); err != nil {
        fmt.Printf("Rows iteration error: %v\n", err)
        return
    }
}
Output:

Expected error for non-existent file: failed to load file: path does not exist: nonexistent.csv
Expected timeout error: context deadline exceeded
Successfully counted 8 employees
Example (FinancialDataAnalysis)
ExampleOpen_financialDataAnalysis demonstrates financial data processing
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "strings"
    "time"

    "github.com/nao1215/filesql"
)

func main() {
    tmpDir := createFinancialTestData()
    defer os.RemoveAll(tmpDir)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    db, err := filesql.OpenContext(ctx, tmpDir)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    fmt.Println("=== Financial Data Analysis ===")

    // Monthly revenue trend
    fmt.Println("\n1. Monthly Revenue Trend:")
    rows, err := db.QueryContext(ctx, `
        SELECT
            strftime('%Y-%m', transaction_date) as month,
            COUNT(*) as transaction_count,
            SUM(amount) as total_revenue,
            AVG(amount) as avg_transaction,
            MAX(amount) as largest_transaction
        FROM transactions
        WHERE type = 'sale'
        GROUP BY month
        ORDER BY month
    `)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("%-8s %-6s %-10s %-8s %s\n", "Month", "Count", "Revenue", "Average", "Largest")
    fmt.Println(strings.Repeat("-", 50))
    for rows.Next() {
        var month string
        var count int
        var revenue, average, largest float64
        if err := rows.Scan(&month, &count, &revenue, &average, &largest); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%-8s %-6d $%-9.2f $%-7.2f $%.2f\n", month, count, revenue, average, largest)
    }
    _ = rows.Close() // Ignore close error in test cleanup

    // Expense category breakdown
    fmt.Println("\n2. Expense Categories:")
    rows, err = db.QueryContext(ctx, `
        SELECT
            category,
            COUNT(*) as transaction_count,
            SUM(ABS(amount)) as total_expense,
            ROUND(SUM(ABS(amount)) * 100.0 / (
                SELECT SUM(ABS(amount)) FROM transactions WHERE type = 'expense'
            ), 2) as percentage
        FROM transactions
        WHERE type = 'expense'
        GROUP BY category
        ORDER BY total_expense DESC
    `)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("%-15s %-6s %-12s %-10s\n", "Category", "Count", "Total", "Percentage")
    fmt.Println(strings.Repeat("-", 45))
    for rows.Next() {
        var category string
        var count int
        var expense, percentage float64
        if err := rows.Scan(&category, &count, &expense, &percentage); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%-15s %-6d $%-11.2f %-10.2f%%\n", category, count, expense, percentage)
    }
    _ = rows.Close() // Ignore close error in test cleanup

    // Cash flow summary
    fmt.Println("\n3. Cash Flow Summary:")
    var totalIncome, totalExpenses, netIncome float64
    err = db.QueryRowContext(ctx, `
        SELECT
            SUM(CASE WHEN type = 'sale' THEN amount ELSE 0 END) as total_income,
            SUM(CASE WHEN type = 'expense' THEN ABS(amount) ELSE 0 END) as total_expenses,
            SUM(CASE WHEN type = 'sale' THEN amount ELSE -ABS(amount) END) as net_income
        FROM transactions
    `).Scan(&totalIncome, &totalExpenses, &netIncome)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Total Income: $%.2f\n", totalIncome)
    fmt.Printf("Total Expenses: $%.2f\n", totalExpenses)
    fmt.Printf("Net Income: $%.2f\n", netIncome)
    fmt.Printf("Profit Margin: %.2f%%\n", (netIncome/totalIncome)*100)
}

// createFinancialTestData creates test data for financial analysis examples
func createFinancialTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_financial_example_*")
    if err != nil {
        log.Fatal(err)
    }

    transactionData := `transaction_id,transaction_date,type,category,amount,description
1,2024-01-15,sale,Product Sales,1500.00,Sale of premium product
2,2024-01-20,sale,Product Sales,850.00,Sale of standard product
3,2024-01-25,sale,Service,1200.00,Consulting service
4,2024-01-10,expense,Office Supplies,-150.00,Office equipment purchase
5,2024-01-18,expense,Marketing,-250.00,Social media advertising
6,2024-02-05,sale,Product Sales,1200.00,Sale of premium product
7,2024-02-15,sale,Service,1000.00,Training service
8,2024-02-08,expense,Office Supplies,-200.00,Stationery purchase`
    transactionFile := filepath.Join(tmpDir, "transactions.csv")
    err = os.WriteFile(transactionFile, []byte(transactionData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}
Output:

=== Financial Data Analysis ===

1. Monthly Revenue Trend:
Month    Count  Revenue    Average  Largest
--------------------------------------------------
2024-01  3      $3550.00   $1183.33 $1500.00
2024-02  2      $2200.00   $1100.00 $1200.00

2. Expense Categories:
Category        Count  Total        Percentage
---------------------------------------------
Office Supplies 2      $350.00      58.33     %
Marketing       1      $250.00      41.67     %

3. Cash Flow Summary:
Total Income: $5750.00
Total Expenses: $600.00
Net Income: $5150.00
Profit Margin: 89.57%
Example (MultipleFiles)
ExampleOpen_multipleFiles demonstrates opening multiple files and directories
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_example")
    if err != nil {
        log.Fatal(err)
    }

    employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
    err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
    err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}

func main() {
    tmpDir := createTempTestData()
    defer os.RemoveAll(tmpDir)

    // Open database with multiple paths (files and directories)
    db, err := filesql.Open(tmpDir)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Query to show all available tables
    rows, err := db.QueryContext(context.Background(), "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    fmt.Println("Available tables:")
    for rows.Next() {
        var tableName string
        if err := rows.Scan(&tableName); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("- %s\n", tableName)
    }
}
Output:

Available tables:
- departments
- employees
Example (PerformanceOptimization)
ExampleOpen_performanceOptimization demonstrates techniques for handling large datasets efficiently
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "time"

    "github.com/nao1215/filesql"
)

func main() {
    tmpDir := createLargeTestData()
    defer os.RemoveAll(tmpDir)

    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    db, err := filesql.OpenContext(ctx, tmpDir)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Technique 1: Use LIMIT and OFFSET for pagination
    fmt.Println("=== Performance Optimization Techniques ===")
    fmt.Println("\n1. Pagination with LIMIT and OFFSET:")

    pageSize := 3
    offset := 0
    for page := 1; page <= 2; page++ {
        rows, err := db.QueryContext(ctx, `
            SELECT customer_id, name, total_orders
            FROM customer_summary
            ORDER BY total_orders DESC
            LIMIT ? OFFSET ?
        `, pageSize, offset)
        if err != nil {
            log.Fatal(err)
        }

        fmt.Printf("Page %d:\n", page)
        for rows.Next() {
            var customerID int
            var name string
            var totalOrders int
            if err := rows.Scan(&customerID, &name, &totalOrders); err != nil {
                log.Fatal(err)
            }
            fmt.Printf(" - %s (ID: %d, Orders: %d)\n", name, customerID, totalOrders)
        }
        _ = rows.Close() // Ignore close error in test cleanup
        offset += pageSize
    }

    // Technique 2: Use indexes by querying with WHERE clauses on sorted columns
    fmt.Println("\n2. Efficient filtering with indexes:")
    rows, err := db.QueryContext(ctx, `
        SELECT name, email, registration_date
        FROM customer_summary
        WHERE total_spent > 1000
        ORDER BY total_spent DESC
    `)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    fmt.Println("High-value customers:")
    for rows.Next() {
        var name, email, regDate string
        if err := rows.Scan(&name, &email, &regDate); err != nil {
            log.Fatal(err)
        }
        fmt.Printf(" - %s (%s) - Registered: %s\n", name, email, regDate)
    }

    // Technique 3: Aggregate queries for summary statistics
    fmt.Println("\n3. Summary statistics:")
    var totalCustomers int
    var avgOrders, totalRevenue, avgSpent float64
    err = db.QueryRowContext(ctx, `
        SELECT
            COUNT(*) as total_customers,
            AVG(total_orders) as avg_orders,
            SUM(total_spent) as total_revenue,
            AVG(total_spent) as avg_spent
        FROM customer_summary
    `).Scan(&totalCustomers, &avgOrders, &totalRevenue, &avgSpent)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("Total customers: %d\n", totalCustomers)
    fmt.Printf("Average orders per customer: %.1f\n", avgOrders)
    fmt.Printf("Total revenue: $%.2f\n", totalRevenue)
    fmt.Printf("Average customer value: $%.2f\n", avgSpent)
}

// createLargeTestData creates test data for performance optimization examples
func createLargeTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_large_example_*")
    if err != nil {
        log.Fatal(err)
    }

    customerData := `customer_id,name,email,registration_date,total_orders,total_spent
1001,Premium Customer A,premium.a@example.com,2023-01-15,15,1500.00
1002,Premium Customer B,premium.b@example.com,2023-02-20,12,1200.00
1003,Premium Customer C,premium.c@example.com,2023-03-10,10,1000.00
1004,Regular Customer D,regular.d@example.com,2023-04-05,8,800.00
1005,Regular Customer E,regular.e@example.com,2023-05-15,6,600.00
1006,Regular Customer F,regular.f@example.com,2023-06-20,5,500.00
1007,Budget Customer G,budget.g@example.com,2023-07-10,3,300.00
1008,Budget Customer H,budget.h@example.com,2023-08-25,2,200.00
1009,New Customer I,new.i@example.com,2023-09-30,1,100.00
1010,New Customer J,new.j@example.com,2023-10-15,1,100.00`
    customerFile := filepath.Join(tmpDir, "customer_summary.csv")
    err = os.WriteFile(customerFile, []byte(customerData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}
Output:

=== Performance Optimization Techniques ===

1. Pagination with LIMIT and OFFSET:
Page 1:
 - Premium Customer A (ID: 1001, Orders: 15)
 - Premium Customer B (ID: 1002, Orders: 12)
 - Premium Customer C (ID: 1003, Orders: 10)
Page 2:
 - Regular Customer D (ID: 1004, Orders: 8)
 - Regular Customer E (ID: 1005, Orders: 6)
 - Regular Customer F (ID: 1006, Orders: 5)

2. Efficient filtering with indexes:
High-value customers:
 - Premium Customer A (premium.a@example.com) - Registered: 2023-01-15
 - Premium Customer B (premium.b@example.com) - Registered: 2023-02-20

3. Summary statistics:
Total customers: 10
Average orders per customer: 6.3
Total revenue: $6300.00
Average customer value: $630.00
Example (SalesAnalysis)
ExampleOpen_salesAnalysis demonstrates practical sales data analysis
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "strings"
    "time"

    "github.com/nao1215/filesql"
)

// createSalesTestData creates test data for sales analysis examples
func createSalesTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_sales_example")
    if err != nil {
        log.Fatal(err)
    }

    salesData := `order_id,customer_id,product_name,category,quantity,unit_price,order_date,region
1,101,Laptop Pro,Electronics,2,1299.99,2024-01-15,North
2,102,Wireless Mouse,Electronics,1,29.99,2024-01-16,South
3,103,Office Chair,Furniture,1,299.99,2024-01-17,East
4,101,USB Cable,Electronics,3,12.99,2024-01-18,North
5,104,Standing Desk,Furniture,1,599.99,2024-01-19,West
6,105,Bluetooth Speaker,Electronics,2,79.99,2024-01-20,South
7,106,Coffee Table,Furniture,1,199.99,2024-01-21,East
8,102,Keyboard,Electronics,1,89.99,2024-01-22,South
9,107,Monitor 24inch,Electronics,1,249.99,2024-01-23,North
10,103,Desk Lamp,Furniture,2,39.99,2024-01-24,East`
    err = os.WriteFile(filepath.Join(tmpDir, "sales.csv"), []byte(salesData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    customersData := `customer_id,name,email,city,registration_date
101,John Doe,john@example.com,New York,2023-06-01
102,Jane Smith,jane@example.com,Los Angeles,2023-07-15
103,Bob Johnson,bob@example.com,Chicago,2023-08-20
104,Alice Brown,alice@example.com,Houston,2023-09-10
105,Charlie Wilson,charlie@example.com,Phoenix,2023-10-05
106,Diana Lee,diana@example.com,Philadelphia,2023-11-12
107,Frank Miller,frank@example.com,San Antonio,2023-12-03`
    err = os.WriteFile(filepath.Join(tmpDir, "customers.csv"), []byte(customersData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}

func main() {
    tmpDir := createSalesTestData()
    defer os.RemoveAll(tmpDir)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    db, err := filesql.OpenContext(ctx, tmpDir)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Sales summary by category and region
    query := `
        SELECT
            category,
            region,
            COUNT(*) as order_count,
            SUM(quantity * unit_price) as total_revenue,
            AVG(quantity * unit_price) as avg_order_value,
            MIN(order_date) as first_order,
            MAX(order_date) as last_order
        FROM sales
        GROUP BY category, region
        ORDER BY total_revenue DESC
    `

    rows, err := db.QueryContext(ctx, query)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    fmt.Println("Sales Analysis by Category and Region:")
    fmt.Println("=====================================")
    fmt.Printf("%-12s %-8s %-6s %-10s %-12s %-12s %s\n",
        "Category", "Region", "Orders", "Revenue", "Avg Order", "First Order", "Last Order")
    fmt.Println(strings.Repeat("-", 80))

    for rows.Next() {
        var category, region, firstOrder, lastOrder string
        var orderCount int
        var totalRevenue, avgOrderValue float64

        err := rows.Scan(&category, &region, &orderCount, &totalRevenue, &avgOrderValue, &firstOrder, &lastOrder)
        if err != nil {
            log.Fatal(err)
        }

        fmt.Printf("%-12s %-8s %-6d $%-9.2f $%-11.2f %-12s %s\n",
            category, region, orderCount, totalRevenue, avgOrderValue, firstOrder, lastOrder)
    }
}
Output:

Sales Analysis by Category and Region:
=====================================
Category     Region   Orders Revenue    Avg Order    First Order  Last Order
--------------------------------------------------------------------------------
Electronics  North    3      $2888.94   $962.98      2024-01-15   2024-01-23
Furniture    West     1      $599.99    $599.99      2024-01-19   2024-01-19
Furniture    East     3      $579.96    $193.32      2024-01-17   2024-01-24
Electronics  South    3      $279.96    $93.32       2024-01-16   2024-01-22
Example (WebLogAnalysis)
ExampleOpen_webLogAnalysis demonstrates analyzing web server logs
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "strings"
    "time"

    "github.com/nao1215/filesql"
)

func main() {
    tmpDir := createWebLogTestData()
    defer os.RemoveAll(tmpDir)

    ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
    defer cancel()

    db, err := filesql.OpenContext(ctx, tmpDir)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    fmt.Println("=== Web Log Analysis ===")

    // Top pages by hits
    fmt.Println("\n1. Top Pages by Hits:")
    rows, err := db.QueryContext(ctx, `
        SELECT
            path,
            COUNT(*) as hits,
            COUNT(DISTINCT ip_address) as unique_visitors
        FROM access_logs
        WHERE status_code = 200
        GROUP BY path
        ORDER BY hits DESC
        LIMIT 5
    `)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("%-20s %-6s %s\n", "Path", "Hits", "Unique")
    fmt.Println(strings.Repeat("-", 35))
    for rows.Next() {
        var path string
        var hits, unique int
        if err := rows.Scan(&path, &hits, &unique); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%-20s %-6d %d\n", path, hits, unique)
    }
    _ = rows.Close() // Ignore close error in test cleanup

    // Error analysis
    fmt.Println("\n2. Error Analysis:")
    rows, err = db.QueryContext(ctx, `
        SELECT
            status_code,
            COUNT(*) as error_count,
            ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM access_logs), 2) as percentage
        FROM access_logs
        WHERE status_code >= 400
        GROUP BY status_code
        ORDER BY error_count DESC
    `)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("%-12s %-6s %-10s\n", "Status Code", "Count", "Percentage")
    fmt.Println(strings.Repeat("-", 30))
    for rows.Next() {
        var statusCode, errorCount int
        var percentage float64
        if err := rows.Scan(&statusCode, &errorCount, &percentage); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%-12d %-6d %-10.2f%%\n", statusCode, errorCount, percentage)
    }
    _ = rows.Close() // Ignore close error in test cleanup

    // Hourly traffic pattern
    fmt.Println("\n3. Traffic by Hour:")
    rows, err = db.QueryContext(ctx, `
        SELECT
            CAST(strftime('%H', timestamp) AS INTEGER) as hour,
            COUNT(*) as requests,
            AVG(response_time) as avg_response_time
        FROM access_logs
        GROUP BY hour
        ORDER BY hour
    `)
    if err != nil {
        log.Fatal(err)
    }

    fmt.Printf("%-5s %-9s %-12s\n", "Hour", "Requests", "Avg Response")
    fmt.Println(strings.Repeat("-", 28))
    for rows.Next() {
        var hour, requests int
        var avgResponseTime float64
        if err := rows.Scan(&hour, &requests, &avgResponseTime); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("%-5d %-9d %-12.0fms\n", hour, requests, avgResponseTime)
    }
    _ = rows.Close() // Ignore close error in test cleanup
}

// createWebLogTestData creates test data for web log analysis examples
func createWebLogTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_weblog_example_*")
    if err != nil {
        log.Fatal(err)
    }

    accessLogsData := `timestamp,ip_address,method,path,status_code,response_time,user_agent
2024-01-01 09:15:30,192.168.1.100,GET,/,200,150,Mozilla/5.0
2024-01-01 09:30:45,192.168.1.101,GET,/products,200,200,Mozilla/5.0
2024-01-01 10:05:15,192.168.1.100,GET,/,200,120,Mozilla/5.0
2024-01-01 10:20:30,192.168.1.102,GET,/about,200,180,Mozilla/5.0
2024-01-01 10:35:45,192.168.1.101,GET,/products,200,160,Mozilla/5.0
2024-01-01 11:10:15,192.168.1.103,GET,/contact,200,140,Mozilla/5.0
2024-01-01 11:25:30,192.168.1.100,GET,/,200,200,Mozilla/5.0
2024-01-01 11:40:45,192.168.1.104,GET,/missing,404,50,Mozilla/5.0
2024-01-01 14:15:30,192.168.1.105,GET,/notfound,404,100,Mozilla/5.0`
    accessLogsFile := filepath.Join(tmpDir, "access_logs.csv")
    err = os.WriteFile(accessLogsFile, []byte(accessLogsData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    return tmpDir
}
Output:

=== Web Log Analysis ===

1. Top Pages by Hits:
Path                 Hits   Unique
-----------------------------------
/                    3      1
/products            2      1
/contact             1      1
/about               1      1

2. Error Analysis:
Status Code  Count  Percentage
------------------------------
404          2      22.22     %

3. Traffic by Hour:
Hour  Requests  Avg Response
----------------------------
9     2         175         ms
10    3         153         ms
11    3         130         ms
14    1         100         ms
func OpenContext ΒΆ added in v0.0.2
OpenContext is like Open but accepts a context for cancellation and timeout control.
Use this when you need to:
- Set timeouts for loading large files
- Support cancellation in server applications
- Integrate with context-aware code
Example with timeout:
// Open a single CSV file with timeout
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()

db, err := filesql.OpenContext(ctx, "data/users.csv")
if err != nil {
    log.Fatal(err)
}
defer db.Close()

// Complex query with JOINs, aggregation, and window functions
rows, err := db.QueryContext(ctx, `
    SELECT
        u.name,
        u.department,
        u.salary,
        AVG(u.salary) OVER (PARTITION BY u.department) as dept_avg_salary,
        RANK() OVER (PARTITION BY u.department ORDER BY u.salary DESC) as salary_rank,
        COUNT(*) OVER (PARTITION BY u.department) as dept_size
    FROM users u
    WHERE u.salary > (
        SELECT AVG(salary) * 0.8
        FROM users
        WHERE department = u.department
    )
    ORDER BY u.department, u.salary DESC
`)
if err != nil {
    log.Fatal(err)
}
defer rows.Close()

// Process results
for rows.Next() {
    var name, dept string
    var salary, deptAvg float64
    var rank, deptSize int
    if err := rows.Scan(&name, &dept, &salary, &deptAvg, &rank, &deptSize); err != nil {
        log.Fatal(err)
    }
    fmt.Printf("%s (%s): $%.2f (Rank: %d/%d, Dept Avg: $%.2f)\n",
        name, dept, salary, rank, deptSize, deptAvg)
}
Example ΒΆ
ExampleOpenContext demonstrates opening files with context support for timeout and cancellation
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "time"

    "github.com/nao1215/filesql"
)

// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
    tmpDir, err := os.MkdirTemp("", "filesql_example")
    if err != nil {
        log.Fatal(err)
    }

    employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
    err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
    if err != nil {
        log.Fatal(err)
    }

    departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
    err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
    if err != nil {
        log.Fatal(err)
    }
    return tmpDir
}

func main() {
    tmpDir := createTempTestData()
    defer os.RemoveAll(tmpDir)

    // Create a context with timeout
    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    defer cancel()

    // Open database with context
    db, err := filesql.OpenContext(ctx, filepath.Join(tmpDir, "employees.csv"))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Query with context support
    rows, err := db.QueryContext(ctx, `
        SELECT name, salary
        FROM employees
        WHERE salary > 70000
        ORDER BY salary DESC
    `)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    fmt.Println("High earners (>$70,000):")
    for rows.Next() {
        var name string
        var salary float64
        if err := rows.Scan(&name, &salary); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("- %s: $%.0f\n", name, salary)
    }
}
Output:

High earners (>$70,000):
- Alice Johnson: $95000
- Bob Smith: $85000
- Charlie Brown: $80000
- David Wilson: $75000
Types ΒΆ
type ChunkSize ΒΆ added in v0.4.2
type ChunkSize int
ChunkSize represents a chunk size with validation
func NewChunkSize ΒΆ added in v0.4.2
NewChunkSize creates a new ChunkSize with validation
type CompressionFactory ΒΆ added in v0.4.4
type CompressionFactory struct{}
CompressionFactory provides factory methods for compression handling
func NewCompressionFactory ΒΆ added in v0.4.4
func NewCompressionFactory() *CompressionFactory
NewCompressionFactory creates a new compression factory
func (*CompressionFactory) CreateHandlerForFile ΒΆ added in v0.4.4
func (f *CompressionFactory) CreateHandlerForFile(path string) CompressionHandler
CreateHandlerForFile creates an appropriate compression handler for a given file path
func (*CompressionFactory) CreateReaderForFile ΒΆ added in v0.4.4
CreateReaderForFile opens a file and returns a reader that handles decompression
func (*CompressionFactory) CreateWriterForFile ΒΆ added in v0.4.4
func (f *CompressionFactory) CreateWriterForFile(path string, compressionType CompressionType) (io.Writer, func() error, error)
CreateWriterForFile creates a file and returns a writer that handles compression
func (*CompressionFactory) DetectCompressionType ΒΆ added in v0.4.4
func (f *CompressionFactory) DetectCompressionType(path string) CompressionType
DetectCompressionType detects the compression type from a file path
func (*CompressionFactory) GetBaseFileType ΒΆ added in v0.4.4
func (f *CompressionFactory) GetBaseFileType(path string) FileType
GetBaseFileType determines the base file type after removing compression extensions
func (*CompressionFactory) RemoveCompressionExtension ΒΆ added in v0.4.4
func (f *CompressionFactory) RemoveCompressionExtension(path string) string
RemoveCompressionExtension removes the compression extension from a file path if present
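Taken together, these factory methods cover compression detection, path manipulation, and transparent decompression. A minimal sketch using only the signatures documented above (the path "data.csv.gz" and the values in the comments are illustrative assumptions):

factory := filesql.NewCompressionFactory()

// Inspect the path without opening the file
ctype := factory.DetectCompressionType("data.csv.gz")     // CompressionGZ
base := factory.RemoveCompressionExtension("data.csv.gz") // "data.csv"
fmt.Println(ctype.String(), base)

if factory.GetBaseFileType("data.csv.gz") == filesql.FileTypeCSV {
    // Open the file with transparent decompression; the returned
    // close function must be called to release resources
    reader, closeFn, err := factory.CreateReaderForFile("data.csv.gz")
    if err != nil {
        log.Fatal(err)
    }
    defer closeFn()
    data, _ := io.ReadAll(reader)
    fmt.Printf("read %d decompressed bytes\n", len(data))
}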
type CompressionHandler ΒΆ added in v0.4.4
type CompressionHandler interface {
    // CreateReader wraps an io.Reader with a decompression reader if needed
    CreateReader(reader io.Reader) (io.Reader, func() error, error)

    // CreateWriter wraps an io.Writer with a compression writer if needed
    CreateWriter(writer io.Writer) (io.Writer, func() error, error)

    // Extension returns the file extension for this compression type (e.g., ".gz")
    Extension() string
}
CompressionHandler defines the interface for handling file compression/decompression
func NewCompressionHandler ΒΆ added in v0.4.4
func NewCompressionHandler(compressionType CompressionType) CompressionHandler
NewCompressionHandler creates a new compression handler for the given compression type
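A handler can also be constructed directly when you already know the compression type and hold a raw stream. A minimal sketch, where compressedStream is an assumed io.Reader over gzip data:

handler := filesql.NewCompressionHandler(filesql.CompressionGZ)
fmt.Println(handler.Extension()) // ".gz"

// Wrap the raw stream with decompression; closeFn must be called when done
plain, closeFn, err := handler.CreateReader(compressedStream) // compressedStream: assumed input
if err != nil {
    log.Fatal(err)
}
defer closeFn()
decompressed, _ := io.ReadAll(plain)
_ = decompressed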
type CompressionType ΒΆ added in v0.0.4
type CompressionType int
CompressionType represents the compression type
const (
    // CompressionNone represents no compression
    CompressionNone CompressionType = iota
    // CompressionGZ represents gzip compression
    CompressionGZ
    // CompressionBZ2 represents bzip2 compression
    CompressionBZ2
    // CompressionXZ represents xz compression
    CompressionXZ
    // CompressionZSTD represents zstd compression
    CompressionZSTD
)
func (CompressionType) Extension ΒΆ added in v0.2.0
func (c CompressionType) Extension() string
Extension returns the file extension for the compression type
func (CompressionType) String ΒΆ added in v0.2.0
func (c CompressionType) String() string
String returns the string representation of CompressionType
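Both methods are convenient when constructing output paths by hand; the values in the comments below follow the FileExtension example later in this document:

ct := filesql.CompressionZSTD
fmt.Println(ct.String())    // "zstd"
fmt.Println(ct.Extension()) // ".zst"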
type DBBuilder ΒΆ added in v0.1.0
type DBBuilder struct {
// contains filtered or unexported fields
}
DBBuilder configures and creates database connections from various data sources.
Basic usage:
builder := filesql.NewBuilder().
    AddPath("data.csv").
    AddPath("users.tsv")

validatedBuilder, err := builder.Build(ctx)
if err != nil {
    return err
}

db, err := validatedBuilder.Open(ctx)
if err != nil {
    return err
}
defer db.Close()
Supports:
- File paths (AddPath)
- Embedded filesystems (AddFS)
- io.Reader streams (AddReader)
- Auto-save functionality (EnableAutoSave)
Example (Chaining) ΒΆ
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"
    "testing/fstest"

    "github.com/nao1215/filesql"
)

func main() {
    // Create temporary files
    tempDir, _ := os.MkdirTemp("", "filesql-example")
    defer os.RemoveAll(tempDir)

    csvFile := filepath.Join(tempDir, "data1.csv")
    content1 := "id,name\n1,Alice\n2,Bob\n"
    os.WriteFile(csvFile, []byte(content1), 0644)

    tsvFile := filepath.Join(tempDir, "data2.tsv")
    content2 := "id\tproduct\n1\tLaptop\n2\tPhone\n"
    os.WriteFile(tsvFile, []byte(content2), 0644)

    // Create mock filesystem
    mockFS := fstest.MapFS{
        "logs.ltsv": &fstest.MapFile{Data: []byte("time:2024-01-01T00:00:00Z\tlevel:info\n")},
    }

    // Demonstrate method chaining
    ctx := context.Background()
    db, err := filesql.NewBuilder().
        AddPath(csvFile).
        AddPaths(tsvFile).
        AddFS(mockFS).
        Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    connection, err := db.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer connection.Close()

    // Count tables from different sources
    rows, err := connection.Query("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    var tableCount int
    for rows.Next() {
        var tableName string
        rows.Scan(&tableName)
        tableCount++
    }
    fmt.Printf("Successfully loaded %d tables from mixed sources\n", tableCount)
}
Output: Successfully loaded 3 tables from mixed sources
Example (ErrorHandling) ΒΆ
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

func main() {
    // Example 1: Build without inputs should fail
    builder := filesql.NewBuilder()
    ctx := context.Background()
    _, err := builder.Build(ctx)
    if err != nil {
        fmt.Printf("Expected error for no inputs: %v\n", err)
    }

    // Example 2: Open without Build should fail
    builder2 := filesql.NewBuilder().AddPath("nonexistent.csv")
    _, err = builder2.Open(ctx)
    if err != nil {
        fmt.Println("Expected error for Open without Build")
    }

    // Example 3: Non-existent file should fail during Build
    builder3 := filesql.NewBuilder().AddPath(filepath.Join("nonexistent", "file.csv"))
    _, err = builder3.Build(ctx)
    if err != nil {
        fmt.Println("Expected error for non-existent file")
    }

    // Example 4: Success case
    tempDir, _ := os.MkdirTemp("", "filesql-example")
    defer os.RemoveAll(tempDir)

    csvFile := filepath.Join(tempDir, "valid.csv")
    os.WriteFile(csvFile, []byte("id,name\n1,test\n"), 0644)

    builder4 := filesql.NewBuilder().AddPath(csvFile)
    validatedBuilder, err := builder4.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    db, err := validatedBuilder.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    fmt.Println("Success: Valid file loaded correctly")
}
Output:

Expected error for no inputs: at least one path must be provided
Expected error for Open without Build
Expected error for non-existent file
Success: Valid file loaded correctly
func NewBuilder ΒΆ added in v0.1.0
func NewBuilder() *DBBuilder
NewBuilder creates a new database builder.
Start here when you need:
- Multiple data sources (files, streams, embedded FS)
- Auto-save functionality
- Custom chunk sizes for large files
- More control than the simple Open() function
Example:
builder := filesql.NewBuilder().
    AddPath("data.csv").
    EnableAutoSave("./backup")
Example ΒΆ
ExampleNewBuilder demonstrates the basic usage of the Builder pattern. This is the recommended approach for most use cases, especially when working with embedded filesystems or when you need more control over the database creation process.
package main

import (
    "fmt"

    "github.com/nao1215/filesql"
)

func main() {
    // Create a new builder - this is the starting point for all Builder pattern usage
    builder := filesql.NewBuilder()

    // The builder supports method chaining for a fluent API
    // You can add individual paths or multiple paths at once
    builder.AddPath("users.csv").AddPaths("orders.tsv", "products.ltsv")

    fmt.Printf("Builder created successfully: %t\n", builder != nil)

    // In real usage, you would continue with:
    // ctx := context.Background()
    // validatedBuilder, err := builder.Build(ctx)
    // if err != nil { return err }
    // db, err := validatedBuilder.Open(ctx)
    // if err != nil { return err }
    // defer db.Close()
}
Output: Builder created successfully: true
func (*DBBuilder) AddFS ΒΆ added in v0.1.0
AddFS adds files from an embedded filesystem (go:embed).
Automatically finds all CSV, TSV, and LTSV files in the filesystem.
Example:
//go:embed data/*.csv data/*.tsv
var dataFS embed.FS

builder.AddFS(dataFS)
Returns self for chaining.
Example ΒΆ
package main

import (
    "context"
    "fmt"
    "log"
    "testing/fstest"

    "github.com/nao1215/filesql"
)

func main() {
    // Create mock filesystem with test data
    mockFS := fstest.MapFS{
        "users.csv":    &fstest.MapFile{Data: []byte("id,name,department\n1,Alice,Engineering\n2,Bob,Sales\n")},
        "products.tsv": &fstest.MapFile{Data: []byte("id\tname\tprice\n1\tLaptop\t1000\n2\tPhone\t500\n")},
        "logs.ltsv":    &fstest.MapFile{Data: []byte("time:2024-01-01T00:00:00Z\tlevel:info\tmsg:started\n")},
        "readme.txt":   &fstest.MapFile{Data: []byte("This file will be ignored\n")}, // unsupported format
    }

    // Use builder with filesystem
    builder := filesql.NewBuilder().AddFS(mockFS)

    ctx := context.Background()
    validatedBuilder, err := builder.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    db, err := validatedBuilder.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close() // Clean up temporary files

    // List all tables that were created from the filesystem
    rows, err := db.Query("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    var tableCount int
    for rows.Next() {
        var tableName string
        rows.Scan(&tableName)
        tableCount++
    }
    fmt.Printf("Created %d tables from filesystem\n", tableCount)
}
Output: Created 3 tables from filesystem
Example (EmbedFS) ΒΆ
package main

import (
    "context"
    "embed"
    "fmt"
    "io/fs"
    "log"

    "github.com/nao1215/filesql"
)

//go:embed testdata/embed_test/*.csv testdata/embed_test/*.tsv
var builderExampleFS embed.FS

func main() {
    // Use embedded test filesystem
    subFS, err := fs.Sub(builderExampleFS, "testdata/embed_test")
    if err != nil {
        log.Fatal(err)
    }

    // Use builder with embedded filesystem
    builder := filesql.NewBuilder().AddFS(subFS)

    ctx := context.Background()
    validatedBuilder, err := builder.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    db, err := validatedBuilder.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close() // Clean up temporary files

    // Count the number of tables created from embedded files
    rows, err := db.Query("SELECT COUNT(*) FROM sqlite_master WHERE type='table'")
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    var tableCount int
    if rows.Next() {
        rows.Scan(&tableCount)
    }
    fmt.Printf("Created %d tables from embedded files\n", tableCount)
}
Output: Created 3 tables from embedded files
func (*DBBuilder) AddPath ΒΆ added in v0.1.0
AddPath adds a file or directory to load.
Examples:
- Single file: AddPath("users.csv")
- Compressed: AddPath("data.tsv.gz")
- Directory: AddPath("/data/") // loads all CSV/TSV/LTSV files
Returns self for chaining.
Example ΒΆ
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

func main() {
    // Create temporary CSV file for example
    tempDir, _ := os.MkdirTemp("", "filesql-example")
    defer os.RemoveAll(tempDir)

    csvFile := filepath.Join(tempDir, "users.csv")
    content := "id,name,age\n1,Alice,30\n2,Bob,25\n"
    os.WriteFile(csvFile, []byte(content), 0644)

    // Use builder to add a single file path
    builder := filesql.NewBuilder().AddPath(csvFile)

    // Build and open database
    ctx := context.Background()
    validatedBuilder, err := builder.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    db, err := validatedBuilder.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Query the data
    rows, err := db.Query("SELECT COUNT(*) FROM users")
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    var count int
    if rows.Next() {
        rows.Scan(&count)
    }
    fmt.Printf("Number of users: %d\n", count)
}
Output: Number of users: 2
func (*DBBuilder) AddPaths ΒΆ added in v0.1.0
AddPaths adds multiple files or directories at once.
Example:
builder.AddPaths("users.csv", "products.tsv", "/data/logs/")
Returns self for chaining.
Example ΒΆ
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

func main() {
    // Create temporary files for example
    tempDir, _ := os.MkdirTemp("", "filesql-example")
    defer os.RemoveAll(tempDir)

    // Create users.csv
    usersFile := filepath.Join(tempDir, "users.csv")
    usersContent := "id,name\n1,Alice\n2,Bob\n"
    os.WriteFile(usersFile, []byte(usersContent), 0644)

    // Create products.csv
    productsFile := filepath.Join(tempDir, "products.csv")
    productsContent := "id,product_name\n1,Laptop\n2,Phone\n"
    os.WriteFile(productsFile, []byte(productsContent), 0644)

    // Use builder to add multiple file paths
    builder := filesql.NewBuilder().AddPaths(usersFile, productsFile)

    ctx := context.Background()
    validatedBuilder, err := builder.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    db, err := validatedBuilder.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Query data from both tables
    rows, err := db.Query(`
        SELECT u.name, p.product_name
        FROM users u
        JOIN products p ON u.id = p.id
    `)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    for rows.Next() {
        var name, product string
        rows.Scan(&name, &product)
        fmt.Printf("%s has %s\n", name, product)
    }
}
Output:

Alice has Laptop
Bob has Phone
func (*DBBuilder) AddReader ΒΆ added in v0.2.0
AddReader adds data from an io.Reader (file, network stream, etc.).
Parameters:
- reader: Any io.Reader (file, bytes.Buffer, http.Response.Body, etc.)
- tableName: Name for the SQL table (e.g., "users")
- fileType: Data format (FileTypeCSV, FileTypeTSV, FileTypeLTSV, etc.)
Example:
resp, _ := http.Get("https://example.com/data.csv")
builder.AddReader(resp.Body, "remote_data", FileTypeCSV)
Returns self for chaining.
Example ΒΆ
ExampleDBBuilder_AddReader demonstrates using io.Reader as a data source
package main

import (
    "context"
    "fmt"
    "log"
    "strings"

    "github.com/nao1215/filesql"
)

func main() {
    // CSV data from an io.Reader (could be from network, API response, etc.)
    csvData := `id,name,department,salary
1,Alice,Engineering,95000
2,Bob,Sales,78000
3,Charlie,Engineering,102000
4,Diana,Marketing,85000`

    // Create a reader from the CSV data
    reader := strings.NewReader(csvData)

    // Build database with Reader input
    ctx := context.Background()
    builder := filesql.NewBuilder().
        AddReader(reader, "employees", filesql.FileTypeCSV). // Specify table name and type explicitly
        SetDefaultChunkSize(5000)                            // Set 5000 rows per chunk for large data

    // Build validates the input
    validatedBuilder, err := builder.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    // Open creates the database connection
    db, err := validatedBuilder.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Query the data
    rows, err := db.Query(`
        SELECT name, department, salary
        FROM employees
        WHERE salary > 80000
        ORDER BY salary DESC
    `)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    // Process results
    fmt.Println("High earners (salary > 80,000):")
    for rows.Next() {
        var name, dept string
        var salary int
        if err := rows.Scan(&name, &dept, &salary); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("- %s (%s): $%d\n", name, dept, salary)
    }
}
Output:

High earners (salary > 80,000):
- Charlie (Engineering): $102000
- Alice (Engineering): $95000
- Diana (Marketing): $85000
Example (Compressed) ΒΆ
ExampleDBBuilder_AddReader_compressed demonstrates using compressed data from io.Reader
package main

import (
    "bytes"
    "context"
    "fmt"
    "log"

    "github.com/nao1215/filesql"
)

func main() {
    // Simulate compressed TSV data (in practice, this would be actual compressed data)
    tsvData := "product_id\tproduct_name\tprice\n1\tLaptop\t999\n2\tMouse\t25\n3\tKeyboard\t75"
    reader := bytes.NewReader([]byte(tsvData))

    ctx := context.Background()
    builder := filesql.NewBuilder().
        // Specify that this is TSV data (not actually compressed in this example)
        AddReader(reader, "products", filesql.FileTypeTSV)

    validatedBuilder, err := builder.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    db, err := validatedBuilder.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Query the products
    var count int
    err = db.QueryRow("SELECT COUNT(*) FROM products WHERE price < 100").Scan(&count)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Products under $100: %d\n", count)
}
Output: Products under $100: 2
Example (Multiple) ΒΆ
ExampleDBBuilder_AddReader_multiple demonstrates combining multiple readers with files
package main

import (
    "context"
    "fmt"
    "log"
    "strings"

    "github.com/nao1215/filesql"
)

func main() {
    // First reader: Users data
    usersCSV := `user_id,name,email
1,Alice,alice@example.com
2,Bob,bob@example.com`

    // Second reader: Orders data
    ordersCSV := `order_id,user_id,amount
101,1,250
102,2,180
103,1,320`

    ctx := context.Background()
    builder := filesql.NewBuilder().
        AddReader(strings.NewReader(usersCSV), "users", filesql.FileTypeCSV).
        AddReader(strings.NewReader(ordersCSV), "orders", filesql.FileTypeCSV).
        SetDefaultChunkSize(2500) // 2500 rows per chunk

    validatedBuilder, err := builder.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    db, err := validatedBuilder.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Join query across both tables
    rows, err := db.Query(`
        SELECT u.name, SUM(o.amount) as total_spent
        FROM users u
        JOIN orders o ON u.user_id = o.user_id
        GROUP BY u.user_id, u.name
        ORDER BY total_spent DESC
    `)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    fmt.Println("Customer spending:")
    for rows.Next() {
        var name string
        var total float64
        if err := rows.Scan(&name, &total); err != nil {
            log.Fatal(err)
        }
        fmt.Printf("- %s: $%.0f\n", name, total)
    }
}
Output:

Customer spending:
- Alice: $570
- Bob: $180
func (*DBBuilder) Build ΒΆ added in v0.1.0
Build validates all configured inputs and prepares the builder for opening a database. This method must be called before Open(). It performs the following operations:
1. Validates that at least one input source is configured
2. Checks existence and format of all file paths
3. Processes embedded filesystems by converting files to streaming readers
4. Validates that all files have supported extensions
After successful validation, the builder is ready to create database connections with Open(). The context is used for file operations and can be used for cancellation.
Returns the same builder instance for method chaining, or an error if validation fails.
Example ΒΆ
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

func main() {
    // Create temporary CSV file
    tempDir, _ := os.MkdirTemp("", "filesql-example")
    defer os.RemoveAll(tempDir)

    csvFile := filepath.Join(tempDir, "data.csv")
    content := "name,value\ntest,123\n"
    os.WriteFile(csvFile, []byte(content), 0644)

    // Build validates inputs and prepares for opening
    builder := filesql.NewBuilder().AddPath(csvFile)
    ctx := context.Background()
    validatedBuilder, err := builder.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Builder validated successfully: %t\n", validatedBuilder != nil)
}
Output: Builder validated successfully: true
func (*DBBuilder) DisableAutoSave ΒΆ added in v0.1.0
DisableAutoSave disables automatic saving (default behavior). Returns the builder for method chaining.
func (*DBBuilder) EnableAutoSave ΒΆ added in v0.1.0
func (b *DBBuilder) EnableAutoSave(outputDir string, options ...DumpOptions) *DBBuilder
EnableAutoSave automatically saves changes when the database is closed.
Parameters:
- outputDir: Where to save files
- "" (empty): Overwrite original files
- "./backup": Save to backup directory
Example:
builder.AddPath("data.csv"). EnableAutoSave("") // Auto-save to original file on db.Close()
Returns self for chaining.
Example ΒΆ
ExampleDBBuilder_EnableAutoSave demonstrates automatic saving on database close. This feature automatically saves modified data when the database connection is closed, providing a convenient way to persist changes without manual intervention.
package main

import (
    "context"
    "fmt"
    "os"
    "path/filepath"
    "time"

    "github.com/nao1215/filesql"
)

func main() {
    // Create temporary directory and test file
    tempDir, _ := os.MkdirTemp("", "filesql-autosave-example")
    defer os.RemoveAll(tempDir)

    // Create sample CSV file
    csvPath := filepath.Join(tempDir, "employees.csv")
    csvContent := "name,department,salary\nAlice,Engineering,80000\nBob,Marketing,65000\n"
    _ = os.WriteFile(csvPath, []byte(csvContent), 0600)

    // Create output directory
    outputDir := filepath.Join(tempDir, "backup")
    _ = os.MkdirAll(outputDir, 0750)

    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    // Configure builder with auto-save on close
    builder := filesql.NewBuilder().
        AddPath(csvPath).
        EnableAutoSave(outputDir, filesql.NewDumpOptions()) // Save to backup directory on close

    validatedBuilder, _ := builder.Build(ctx)
    db, _ := validatedBuilder.Open(ctx)

    // Modify data - this will be automatically saved when db.Close() is called
    _, _ = db.ExecContext(ctx, "INSERT INTO employees (name, department, salary) VALUES ('Charlie', 'Sales', 70000)")
    _, _ = db.ExecContext(ctx, "UPDATE employees SET salary = 85000 WHERE name = 'Alice'")

    // Close database - triggers automatic save to backup directory
    _ = db.Close()

    // Verify the backup file was created and contains our changes
    backupFile := filepath.Join(outputDir, "employees.csv")
    if _, err := os.Stat(backupFile); err == nil {
        fmt.Println("Auto-save completed successfully")
    }
}
Output: Auto-save completed successfully
func (*DBBuilder) EnableAutoSaveOnCommit ΒΆ added in v0.1.0
func (b *DBBuilder) EnableAutoSaveOnCommit(outputDir string, options ...DumpOptions) *DBBuilder
EnableAutoSaveOnCommit automatically saves changes after each transaction commit.
Use this for real-time persistence. Note: May impact performance.
Example:
builder.AddPath("data.csv"). EnableAutoSaveOnCommit("./output") // Save after each commit
Returns self for chaining.
Example ΒΆ
ExampleDBBuilder_EnableAutoSaveOnCommit demonstrates automatic saving on transaction commit. This provides more frequent saves but may impact performance for workloads with many commits.
package main

import (
    "context"
    "fmt"
    "os"
    "path/filepath"
    "time"

    "github.com/nao1215/filesql"
)

func main() {
    // Create temporary directory and test file
    tempDir, _ := os.MkdirTemp("", "filesql-commit-save-example")
    defer os.RemoveAll(tempDir)

    // Create sample CSV file
    csvPath := filepath.Join(tempDir, "transactions.csv")
    csvContent := "id,amount,status\n1,100.50,pending\n2,250.75,pending\n"
    _ = os.WriteFile(csvPath, []byte(csvContent), 0600)

    ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
    defer cancel()

    // Configure builder with auto-save on commit
    // Using temp directory to keep the example self-contained
    builder := filesql.NewBuilder().
        AddPath(csvPath).
        EnableAutoSaveOnCommit(tempDir, filesql.NewDumpOptions()) // Save to temp directory on each commit

    validatedBuilder, _ := builder.Build(ctx)
    db, _ := validatedBuilder.Open(ctx)
    defer func() { _ = db.Close() }()

    // Start transaction
    tx, _ := db.BeginTx(ctx, nil)

    // Process transactions within the transaction
    _, _ = tx.ExecContext(ctx, "UPDATE transactions SET status = 'completed' WHERE id = 1")
    _, _ = tx.ExecContext(ctx, "INSERT INTO transactions (id, amount, status) VALUES (3, 175.25, 'completed')")

    // Commit transaction - triggers automatic save
    _ = tx.Commit()

    fmt.Println("Transaction committed with auto-save")
}
Output: Transaction committed with auto-save
func (*DBBuilder) Open ΒΆ added in v0.1.0
Open creates and returns a database connection using the configured and validated inputs. This method can only be called after Build() has been successfully executed. It creates an in-memory SQLite database and loads all configured files as tables using streaming.
Table names are derived from file names without extensions:

- "users.csv" becomes table "users"
- "data.tsv.gz" becomes table "data"
The returned database connection supports the full SQLite3 SQL syntax. Auto-save functionality is supported for both file paths and reader inputs. The caller is responsible for closing the connection when done.
Returns a *sql.DB connection or an error if the database cannot be created.
Example ΒΆ
package main

import (
    "context"
    "fmt"
    "log"
    "os"
    "path/filepath"

    "github.com/nao1215/filesql"
)

func main() {
    // Create temporary CSV file
    tempDir, _ := os.MkdirTemp("", "filesql-example")
    defer os.RemoveAll(tempDir)

    csvFile := filepath.Join(tempDir, "employees.csv")
    content := "id,name,salary\n1,Alice,50000\n2,Bob,60000\n3,Charlie,55000\n"
    os.WriteFile(csvFile, []byte(content), 0644)

    // Complete builder workflow: AddPath -> Build -> Open
    builder := filesql.NewBuilder().AddPath(csvFile)
    ctx := context.Background()
    validatedBuilder, err := builder.Build(ctx)
    if err != nil {
        log.Fatal(err)
    }

    db, err := validatedBuilder.Open(ctx)
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Perform complex SQL query
    rows, err := db.Query(`
        SELECT
            name,
            salary,
            salary - (SELECT AVG(salary) FROM employees) as salary_diff
        FROM employees
        WHERE salary > (SELECT AVG(salary) FROM employees)
        ORDER BY salary DESC
    `)
    if err != nil {
        log.Fatal(err)
    }
    defer rows.Close()

    for rows.Next() {
        var name string
        var salary, diff float64
        rows.Scan(&name, &salary, &diff)
        fmt.Printf("%s: $%.0f (+$%.0f above average)\n", name, salary, diff)
    }
}
Output: Bob: $60000 (+$5000 above average)
func (*DBBuilder) SetDefaultChunkSize ΒΆ added in v0.2.0
SetDefaultChunkSize sets the chunk size (number of rows) used when processing large files.
Default: 1000 rows. Adjust based on available memory and processing needs.
Example:
builder.SetDefaultChunkSize(5000) // 5000 rows per chunk
Returns self for chaining.
type DumpOptions ΒΆ added in v0.0.4
type DumpOptions struct {
    // Format specifies the output file format
    Format OutputFormat
    // Compression specifies the compression type
    Compression CompressionType
}
DumpOptions configures how database tables are exported to files.
Example:
options := NewDumpOptions().
    WithFormat(OutputFormatTSV).
    WithCompression(CompressionGZ)

err := DumpDatabase(db, "./output", options)
func NewDumpOptions ΒΆ added in v0.0.4
func NewDumpOptions() DumpOptions
NewDumpOptions creates default export options (CSV, no compression).
Modify with:
- WithFormat(): Change file format (CSV, TSV, LTSV)
- WithCompression(): Add compression (GZ, BZ2, XZ, ZSTD)
func (DumpOptions) FileExtension ΒΆ added in v0.2.0
func (o DumpOptions) FileExtension() string
FileExtension returns the complete file extension including compression
Example ΒΆ
package main

import (
    "fmt"

    "github.com/nao1215/filesql"
)

func main() {
    // Show how file extensions are built
    examples := []struct {
        format      filesql.OutputFormat
        compression filesql.CompressionType
    }{
        {filesql.OutputFormatCSV, filesql.CompressionNone},
        {filesql.OutputFormatTSV, filesql.CompressionGZ},
        {filesql.OutputFormatLTSV, filesql.CompressionBZ2},
        {filesql.OutputFormatCSV, filesql.CompressionXZ},
        {filesql.OutputFormatTSV, filesql.CompressionZSTD},
    }

    for _, ex := range examples {
        options := filesql.DumpOptions{
            Format:      ex.format,
            Compression: ex.compression,
        }
        fmt.Printf("Format: %-4s, Compression: %-4s -> Extension: %s\n",
            ex.format.String(), ex.compression.String(), options.FileExtension())
    }
}
Output:

Format: csv , Compression: none -> Extension: .csv
Format: tsv , Compression: gz   -> Extension: .tsv.gz
Format: ltsv, Compression: bz2  -> Extension: .ltsv.bz2
Format: csv , Compression: xz   -> Extension: .csv.xz
Format: tsv , Compression: zstd -> Extension: .tsv.zst
func (DumpOptions) WithCompression ΒΆ added in v0.2.0
func (o DumpOptions) WithCompression(compression CompressionType) DumpOptions
WithCompression adds compression to output files.
Options:
- CompressionNone: No compression (default)
- CompressionGZ: Gzip compression (.gz)
- CompressionBZ2: Bzip2 compression (.bz2)
- CompressionXZ: XZ compression (.xz)
- CompressionZSTD: Zstandard compression (.zst)
func (DumpOptions) WithFormat ΒΆ added in v0.2.0
func (o DumpOptions) WithFormat(format OutputFormat) DumpOptions
WithFormat sets the output file format.
Options:
- OutputFormatCSV: Comma-separated values
- OutputFormatTSV: Tab-separated values
- OutputFormatLTSV: Labeled tab-separated values
- OutputFormatParquet: Apache Parquet columnar format
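For example, to export every table as zstd-compressed Parquet instead of the default CSV, the options chain with DumpDatabase like this (a sketch; db is an assumed open connection):

options := filesql.NewDumpOptions().
    WithFormat(filesql.OutputFormatParquet).
    WithCompression(filesql.CompressionZSTD)

if err := filesql.DumpDatabase(db, "./export", options); err != nil {
    log.Fatal(err)
}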
type ErrorContext ΒΆ added in v0.4.2
ErrorContext provides context for where an error occurred
func NewErrorContext ΒΆ added in v0.4.2
func NewErrorContext(operation, filePath string) *ErrorContext
NewErrorContext creates a new error context
func (*ErrorContext) Error ΒΆ added in v0.4.2
func (ec *ErrorContext) Error(baseErr error) error
Error creates a formatted error with context
func (*ErrorContext) WithDetails ΒΆ added in v0.4.2
func (ec *ErrorContext) WithDetails(details string) *ErrorContext
WithDetails adds details to the error context
func (*ErrorContext) WithTable ΒΆ added in v0.4.2
func (ec *ErrorContext) WithTable(tableName string) *ErrorContext
WithTable adds table context to the error
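The methods chain, so a typical pattern is to accumulate context before wrapping a base error. A minimal sketch (the operation, path, table, and detail strings are placeholders):

ec := filesql.NewErrorContext("load", "data/users.csv").
    WithTable("users").
    WithDetails("unexpected column count")
return ec.Error(baseErr) // baseErr: the underlying error being wrapped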
type FileType ΒΆ added in v0.2.0
type FileType int
FileType represents supported file types including compression variants
const (
    // FileTypeCSV represents CSV file type
    FileTypeCSV FileType = iota
    // FileTypeTSV represents TSV file type
    FileTypeTSV
    // FileTypeLTSV represents LTSV file type
    FileTypeLTSV
    // FileTypeParquet represents Parquet file type
    FileTypeParquet
    // FileTypeXLSX represents Excel XLSX file type
    FileTypeXLSX
    // FileTypeCSVGZ represents gzip-compressed CSV file type
    FileTypeCSVGZ
    // FileTypeTSVGZ represents gzip-compressed TSV file type
    FileTypeTSVGZ
    // FileTypeLTSVGZ represents gzip-compressed LTSV file type
    FileTypeLTSVGZ
    // FileTypeParquetGZ represents gzip-compressed Parquet file type
    FileTypeParquetGZ
    // FileTypeCSVBZ2 represents bzip2-compressed CSV file type
    FileTypeCSVBZ2
    // FileTypeTSVBZ2 represents bzip2-compressed TSV file type
    FileTypeTSVBZ2
    // FileTypeLTSVBZ2 represents bzip2-compressed LTSV file type
    FileTypeLTSVBZ2
    // FileTypeParquetBZ2 represents bzip2-compressed Parquet file type
    FileTypeParquetBZ2
    // FileTypeCSVXZ represents xz-compressed CSV file type
    FileTypeCSVXZ
    // FileTypeTSVXZ represents xz-compressed TSV file type
    FileTypeTSVXZ
    // FileTypeLTSVXZ represents xz-compressed LTSV file type
    FileTypeLTSVXZ
    // FileTypeParquetXZ represents xz-compressed Parquet file type
    FileTypeParquetXZ
    // FileTypeCSVZSTD represents zstd-compressed CSV file type
    FileTypeCSVZSTD
    // FileTypeTSVZSTD represents zstd-compressed TSV file type
    FileTypeTSVZSTD
    // FileTypeLTSVZSTD represents zstd-compressed LTSV file type
    FileTypeLTSVZSTD
    // FileTypeParquetZSTD represents zstd-compressed Parquet file type
    FileTypeParquetZSTD
    // FileTypeXLSXGZ represents gzip-compressed Excel XLSX file type
    FileTypeXLSXGZ
    // FileTypeXLSXBZ2 represents bzip2-compressed Excel XLSX file type
    FileTypeXLSXBZ2
    // FileTypeXLSXXZ represents xz-compressed Excel XLSX file type
    FileTypeXLSXXZ
    // FileTypeXLSXZSTD represents zstd-compressed Excel XLSX file type
    FileTypeXLSXZSTD
    // FileTypeUnsupported represents unsupported file type
    FileTypeUnsupported
)
type MemoryInfo ΒΆ added in v0.4.4
type MemoryInfo struct {
    CurrentMB int64        // Current memory usage in MB
    LimitMB   int64        // Memory limit in MB
    Usage     float64      // Usage percentage (0.0-1.0)
    Status    MemoryStatus // Current status
}
MemoryInfo contains detailed memory usage information
type MemoryLimit ΒΆ added in v0.4.4
type MemoryLimit struct {
// contains filtered or unexported fields
}
MemoryLimit provides configurable memory limits with graceful degradation for file processing operations. It monitors heap usage and can trigger memory management actions when thresholds are exceeded.
The system supports three states:
- OK: Memory usage is within acceptable limits
- WARNING: Memory usage approaches the limit, suggesting reduced chunk sizes
- EXCEEDED: Memory usage has exceeded the limit, processing should be halted
Usage example:
limit := NewMemoryLimit(512) // 512MB limit
if limit.CheckMemoryUsage() == MemoryStatusExceeded {
    return limit.CreateMemoryError("processing")
}
Performance Note: CheckMemoryUsage() calls runtime.ReadMemStats which can pause for milliseconds. Use sparingly in hot paths.
Thread Safety: All methods are safe for concurrent use by multiple goroutines.
func NewMemoryLimit ΒΆ added in v0.4.4
func NewMemoryLimit(maxMemoryMB int64) *MemoryLimit
NewMemoryLimit creates a new memory limit configuration
func (*MemoryLimit) CheckMemoryUsage ΒΆ added in v0.4.4
func (ml *MemoryLimit) CheckMemoryUsage() MemoryStatus
CheckMemoryUsage checks current memory usage against limits
func (*MemoryLimit) CreateMemoryError ΒΆ added in v0.4.4
func (ml *MemoryLimit) CreateMemoryError(operation string) error
CreateMemoryError creates a memory limit error with helpful context
func (*MemoryLimit) Disable ΒΆ added in v0.4.4
func (ml *MemoryLimit) Disable()
Disable disables memory limit checking
func (*MemoryLimit) Enable ΒΆ added in v0.4.4
func (ml *MemoryLimit) Enable()
Enable enables memory limit checking
func (*MemoryLimit) GetMemoryInfo ΒΆ added in v0.4.4
func (ml *MemoryLimit) GetMemoryInfo() MemoryInfo
GetMemoryInfo returns current memory usage information
func (*MemoryLimit) IsEnabled ΒΆ added in v0.4.4
func (ml *MemoryLimit) IsEnabled() bool
IsEnabled returns whether memory limits are enabled
func (*MemoryLimit) SetWarningThreshold ΒΆ added in v0.4.4
func (ml *MemoryLimit) SetWarningThreshold(threshold float64)
SetWarningThreshold sets the warning threshold (0.0-1.0)
func (*MemoryLimit) ShouldReduceChunkSize ΒΆ added in v0.4.4
func (ml *MemoryLimit) ShouldReduceChunkSize(chunkSize int) (bool, int)
ShouldReduceChunkSize returns true if chunk size should be reduced for memory management
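A sketch of a processing loop that adapts its chunk size under memory pressure, using only the methods listed above:

limit := filesql.NewMemoryLimit(512) // 512MB limit
chunkSize := 1000                    // rows per chunk

switch limit.CheckMemoryUsage() {
case filesql.MemoryStatusExceeded:
    log.Fatal(limit.CreateMemoryError("file import"))
case filesql.MemoryStatusWarning:
    if reduce, smaller := limit.ShouldReduceChunkSize(chunkSize); reduce {
        chunkSize = smaller // continue processing with a smaller chunk
    }
}

info := limit.GetMemoryInfo()
fmt.Printf("memory: %d/%d MB (%.0f%%)\n", info.CurrentMB, info.LimitMB, info.Usage*100)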
type MemoryPool ΒΆ added in v0.4.4
type MemoryPool struct {
// contains filtered or unexported fields
}
MemoryPool manages a pool of reusable byte slices, record slices, and string slices to reduce memory allocations during file processing operations.
The pool automatically manages object lifecycles and includes safeguards against memory leaks by limiting the maximum size of objects that can be returned to the pool. Objects that grow beyond maxSize are discarded rather than pooled.
Usage example:
pool := NewMemoryPool(1024 * 1024) // 1MB max buffer size
buffer := pool.GetByteBuffer()
defer pool.PutByteBuffer(buffer)
// Use buffer...
Thread Safety: All methods are safe for concurrent use by multiple goroutines.
func NewMemoryPool ΒΆ added in v0.4.4
func NewMemoryPool(maxSize int) *MemoryPool
NewMemoryPool creates a new memory pool with configurable max buffer size
func (*MemoryPool) ForceGC ΒΆ added in v0.4.4
func (mp *MemoryPool) ForceGC()
ForceGC forces garbage collection and clears pools if memory pressure is high
func (*MemoryPool) GetByteBuffer ΒΆ added in v0.4.4
func (mp *MemoryPool) GetByteBuffer() []byte
GetByteBuffer gets a byte buffer from the pool
func (*MemoryPool) GetRecordSlice ΒΆ added in v0.4.4
func (mp *MemoryPool) GetRecordSlice() []Record
GetRecordSlice gets a record slice from the pool
func (*MemoryPool) GetStringSlice ΒΆ added in v0.4.4
func (mp *MemoryPool) GetStringSlice() []string
GetStringSlice gets a string slice from the pool
func (*MemoryPool) PutByteBuffer ΒΆ added in v0.4.4
func (mp *MemoryPool) PutByteBuffer(buf []byte)
PutByteBuffer returns a byte buffer to the pool if it's not too large
func (*MemoryPool) PutRecordSlice ΒΆ added in v0.4.4
func (mp *MemoryPool) PutRecordSlice(slice []Record)
PutRecordSlice returns a record slice to the pool if it's not too large
func (*MemoryPool) PutStringSlice ΒΆ added in v0.4.4
func (mp *MemoryPool) PutStringSlice(slice []string)
PutStringSlice returns a string slice to the pool if it's not too large
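The record and string pools follow the same get/put discipline as byte buffers. A minimal sketch (process is a hypothetical consumer, not part of this package):

pool := filesql.NewMemoryPool(1 << 20) // 1MB max pooled object size

// Borrow a record slice for a batch, then return it to the pool
batch := pool.GetRecordSlice()
batch = append(batch, filesql.Record{"1", "Alice"}, filesql.Record{"2", "Bob"})
process(batch) // process: hypothetical consumer of the batch
pool.PutRecordSlice(batch)

// String slices work the same way
fields := pool.GetStringSlice()
fields = append(fields, "id", "name")
pool.PutStringSlice(fields)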
type MemoryStatus ΒΆ added in v0.4.4
type MemoryStatus int
MemoryStatus represents the current memory status
const (
    // MemoryStatusOK indicates memory usage is within acceptable limits
    MemoryStatusOK MemoryStatus = iota
    // MemoryStatusWarning indicates memory usage is approaching the limit
    MemoryStatusWarning
    // MemoryStatusExceeded indicates memory usage has exceeded the limit
    MemoryStatusExceeded
)
Memory status constants
func (MemoryStatus) String ΒΆ added in v0.4.4
func (ms MemoryStatus) String() string
String returns string representation of memory status
type OutputFormat ΒΆ added in v0.0.4
type OutputFormat int
OutputFormat represents the output file format
const (
    // OutputFormatCSV represents CSV output format
    OutputFormatCSV OutputFormat = iota
    // OutputFormatTSV represents TSV output format
    OutputFormatTSV
    // OutputFormatLTSV represents LTSV output format
    OutputFormatLTSV
    // OutputFormatParquet represents Parquet output format
    OutputFormatParquet
    // OutputFormatXLSX represents Excel XLSX output format
    OutputFormatXLSX
)
func (OutputFormat) Extension ΒΆ added in v0.2.0
func (f OutputFormat) Extension() string
Extension returns the file extension for the format
func (OutputFormat) String ΒΆ added in v0.2.0
func (f OutputFormat) String() string
String returns the string representation of OutputFormat
type Record ΒΆ added in v0.4.4
type Record []string
Record represents file records as a slice of string fields. This type was changed from unexported 'record' to exported 'Record' in v0.5.0 to fix lint issues with exported methods returning unexported types.
Breaking change: Code that previously imported and used the unexported 'record' type will need to be updated to use 'Record'.
type TableName ΒΆ added in v0.4.2
type TableName struct {
// contains filtered or unexported fields
}
TableName represents a table name with validation
func NewTableName ΒΆ added in v0.4.2
NewTableName creates a new TableName with validation