Documentation
Overview
Package filesql provides a file-based SQL driver implementation that enables querying CSV, TSV, LTSV, Parquet, and Excel (XLSX) files using SQLite3 SQL syntax.
filesql lets you treat structured data files as SQL databases without any import or transformation steps. It uses SQLite3 as an in-memory database engine, providing full SQL capabilities including JOINs, aggregations, window functions, and CTEs.
Features
- Query CSV, TSV, LTSV, Parquet, and Excel (XLSX) files using standard SQL
- Automatic handling of compressed files (gzip, bzip2, xz, zstandard)
- Support for multiple input sources (files, directories, io.Reader, embed.FS)
- Efficient streaming for large files with configurable chunk sizes
- Cross-platform compatibility (Linux, macOS, Windows)
- Optional auto-save functionality to persist changes
Basic Usage
The simplest way to use filesql is with the Open or OpenContext functions:
db, err := filesql.Open("data.csv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
rows, err := db.Query("SELECT * FROM data WHERE age > 25")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
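Iterating over the results uses the standard database/sql scanning pattern. A minimal continuation of the snippet above, assuming data.csv contains exactly the columns name and age:

for rows.Next() {
	var name string
	var age int
	if err := rows.Scan(&name, &age); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s is %d\n", name, age)
}
if err := rows.Err(); err != nil {
	log.Fatal(err)
}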
Advanced Usage
For more complex scenarios, use the Builder pattern:
ctx := context.Background()
builder := filesql.NewBuilder().
AddPath("users.csv").
AddPath("orders.tsv").
EnableAutoSave("./output")
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
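The builder also accepts non-file sources. Below is a minimal sketch of loading in-memory CSV data with AddReader; the FileTypeCSV constant name is an assumption here, so check the FileType values the package actually exports:

reader := strings.NewReader("id,name\n1,Alice\n2,Bob")
builder := filesql.NewBuilder().
	AddReader(reader, "people", filesql.FileTypeCSV) // data becomes table "people"; format constant assumed
validatedBuilder, err := builder.Build(ctx)
if err != nil {
	log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
	log.Fatal(err)
}
defer db.Close()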
Table Naming
Table names are automatically derived from file paths, as the sketch after this list illustrates:
- "users.csv" becomes table "users"
- "data.tsv.gz" becomes table "data"
- "/path/to/logs.ltsv" becomes table "logs"
- "sales.xlsx" with multiple sheets becomes tables "sales_Sheet1", "sales_Sheet2", etc.
Data Modifications
INSERT, UPDATE, and DELETE operations affect only the in-memory database. Original files remain unchanged unless auto-save is enabled. To persist changes manually, use the DumpDatabase function.
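A minimal sketch of the manual persistence flow, given a db opened with filesql.Open; the users table and its columns are assumptions for illustration:

// This UPDATE changes only the in-memory database.
if _, err := db.Exec("UPDATE users SET active = 0 WHERE last_login < '2023-01-01'"); err != nil {
	log.Fatal(err)
}
// Write every table out to ./output; the original input files stay untouched.
if err := filesql.DumpDatabase(db, "./output"); err != nil {
	log.Fatal(err)
}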
SQL Syntax
Since filesql uses SQLite3 as its underlying engine, all SQL syntax follows SQLite3's dialect (see the sketch after this list). This includes support for:
- Common Table Expressions (CTEs)
- Window functions
- JSON functions
- Date and time functions
- And all other SQLite3 features
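As a sketch of what this enables, a CTE feeding a window function runs unchanged against a loaded file; the orders table and its columns are assumptions:

rows, err := db.Query(`
	WITH regional AS (
		SELECT region, SUM(amount) AS total
		FROM orders
		GROUP BY region
	)
	SELECT region, total,
		RANK() OVER (ORDER BY total DESC) AS total_rank
	FROM regional
`)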
Column Name Handling
Duplicate column detection compares header names after trimming surrounding whitespace: headers that are identical after trimming (regardless of case differences) are considered duplicates and result in an error. This behavior is kept for backward compatibility.
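A minimal sketch of the duplicate-header behavior; the exact error value and message are not specified here, so inspect the returned error in practice:

// "name" and " name " collide once surrounding whitespace is trimmed.
path := filepath.Join(os.TempDir(), "dup.csv")
if err := os.WriteFile(path, []byte("id,name, name \n1,Alice,Bob"), 0600); err != nil {
	log.Fatal(err)
}
if _, err := filesql.Open(path); err != nil {
	fmt.Println("duplicate header rejected:", err) // Open is expected to fail here
}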
For complete SQL syntax documentation, see: https://www.sqlite.org/lang.html
Index
- Constants
- Variables
- func DumpDatabase(db *sql.DB, outputDir string, opts ...DumpOptions) error
- func Open(paths ...string) (*sql.DB, error)
- func OpenContext(ctx context.Context, paths ...string) (*sql.DB, error)
- type ChunkSize
- type CompressionFactory
- func (f *CompressionFactory) CreateHandlerForFile(path string) CompressionHandler
- func (f *CompressionFactory) CreateReaderForFile(path string) (io.Reader, func() error, error)
- func (f *CompressionFactory) CreateWriterForFile(path string, compressionType CompressionType) (io.Writer, func() error, error)
- func (f *CompressionFactory) DetectCompressionType(path string) CompressionType
- func (f *CompressionFactory) GetBaseFileType(path string) FileType
- func (f *CompressionFactory) RemoveCompressionExtension(path string) string
- type CompressionHandler
- type CompressionType
- type DBBuilder
- func (b *DBBuilder) AddFS(filesystem fs.FS) *DBBuilder
- func (b *DBBuilder) AddPath(path string) *DBBuilder
- func (b *DBBuilder) AddPaths(paths ...string) *DBBuilder
- func (b *DBBuilder) AddReader(reader io.Reader, tableName string, fileType FileType) *DBBuilder
- func (b *DBBuilder) Build(ctx context.Context) (*DBBuilder, error)
- func (b *DBBuilder) DisableAutoSave() *DBBuilder
- func (b *DBBuilder) EnableAutoSave(outputDir string, options ...DumpOptions) *DBBuilder
- func (b *DBBuilder) EnableAutoSaveOnCommit(outputDir string, options ...DumpOptions) *DBBuilder
- func (b *DBBuilder) Open(ctx context.Context) (*sql.DB, error)
- func (b *DBBuilder) SetDefaultChunkSize(size int) *DBBuilder
- type DumpOptions
- type ErrorContext
- type FileType
- type MemoryInfo
- type MemoryLimit
- func (ml *MemoryLimit) CheckMemoryUsage() MemoryStatus
- func (ml *MemoryLimit) CreateMemoryError(operation string) error
- func (ml *MemoryLimit) Disable()
- func (ml *MemoryLimit) Enable()
- func (ml *MemoryLimit) GetMemoryInfo() MemoryInfo
- func (ml *MemoryLimit) IsEnabled() bool
- func (ml *MemoryLimit) SetWarningThreshold(threshold float64)
- func (ml *MemoryLimit) ShouldReduceChunkSize(chunkSize int) (bool, int)
- type MemoryPool
- func (mp *MemoryPool) ForceGC()
- func (mp *MemoryPool) GetByteBuffer() []byte
- func (mp *MemoryPool) GetRecordSlice() []Record
- func (mp *MemoryPool) GetStringSlice() []string
- func (mp *MemoryPool) PutByteBuffer(buf []byte)
- func (mp *MemoryPool) PutRecordSlice(slice []Record)
- func (mp *MemoryPool) PutStringSlice(slice []string)
- type MemoryStatus
- type OutputFormat
- type Record
- type TableName
Examples
- DBBuilder (Chaining)
- DBBuilder (ErrorHandling)
- DBBuilder.AddFS
- DBBuilder.AddFS (EmbedFS)
- DBBuilder.AddPath
- DBBuilder.AddPaths
- DBBuilder.AddReader
- DBBuilder.AddReader (Compressed)
- DBBuilder.AddReader (Multiple)
- DBBuilder.Build
- DBBuilder.EnableAutoSave
- DBBuilder.EnableAutoSaveOnCommit
- DBBuilder.Open
- DumpDatabase
- DumpDatabase (DataProcessing)
- DumpDatabase (MultipleFormats)
- DumpDatabase (WithOptions)
- DumpOptions.FileExtension
- NewBuilder
- Open
- Open (AdvancedSQL)
- Open (CompressionSupport)
- Open (Constraints)
- Open (CustomerInsights)
- Open (ErrorHandling)
- Open (FinancialDataAnalysis)
- Open (MultipleFiles)
- Open (PerformanceOptimization)
- Open (SalesAnalysis)
- Open (WebLogAnalysis)
- OpenContext
Constants
const (
	// DefaultRowsPerChunk is the default number of rows per chunk
	DefaultRowsPerChunk = 1000
	// DefaultChunkSize is the default chunk size (rows); alias for clarity
	DefaultChunkSize = DefaultRowsPerChunk
	// MinChunkSize is the minimum allowed rows per chunk
	MinChunkSize = 1
	// ValidationPeekSize is the size used for validation peek operations
	ValidationPeekSize = 1
)
Processing constants (rows-based)
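These row-based constants pair with the builder's SetDefaultChunkSize control. A minimal sketch, where the multiplier is purely illustrative:

builder := filesql.NewBuilder().
	AddPath("big.csv").
	SetDefaultChunkSize(10 * filesql.DefaultChunkSize) // 10,000 rows per chunk instead of the default 1,000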
const (
	// MaxSampleSize limits how many values to sample for type inference
	MaxSampleSize = 1000
	// MinConfidenceThreshold is the minimum percentage of values that must match a type
	MinConfidenceThreshold = 0.8
	// EarlyTerminationThreshold is the percentage of text values that triggers early termination
	EarlyTerminationThreshold = 0.5
	// MinDatetimeLength is the minimum reasonable length for datetime values
	MinDatetimeLength = 4
	// MaxDatetimeLength is the maximum reasonable length for datetime values
	MaxDatetimeLength = 35
	// SamplingStratificationFactor determines when to use stratified vs simple sampling
	SamplingStratificationFactor = 3
	// MinRealThreshold is the minimum percentage of real values needed to classify as REAL
	MinRealThreshold = 0.1
)
Type inference constants
Variables
var (
	// ErrEmptyData indicates that the data source contains no records
	ErrEmptyData = errors.New("filesql: empty data source")
	// ErrUnsupportedFormat indicates an unsupported file format
	ErrUnsupportedFormat = errors.New("filesql: unsupported file format")
	// ErrInvalidData indicates malformed or invalid data
	ErrInvalidData = errors.New("filesql: invalid data format")
	// ErrNoTables indicates no tables found in database
	ErrNoTables = errors.New("filesql: no tables found in database")
	// ErrFileNotFound indicates file not found
	ErrFileNotFound = errors.New("filesql: file not found")
	// ErrPermissionDenied indicates permission denied
	ErrPermissionDenied = errors.New("filesql: permission denied")
	// ErrMemoryLimit indicates memory limit exceeded
	ErrMemoryLimit = errors.New("filesql: memory limit exceeded")
	// ErrContextCancelled indicates context was cancelled
	ErrContextCancelled = errors.New("filesql: context cancelled")
)
Standard error messages and error creation functions for consistency
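Since these are sentinel errors, callers can branch on them with errors.Is. A minimal sketch, assuming the loader wraps these sentinels (verify against the actual error chain; the ErrorHandling example below shows the literal message text):

db, err := filesql.Open("missing.csv")
if err != nil {
	switch {
	case errors.Is(err, filesql.ErrFileNotFound):
		log.Println("input file does not exist")
	case errors.Is(err, filesql.ErrUnsupportedFormat):
		log.Println("file format is not supported")
	default:
		log.Fatal(err)
	}
	return
}
defer db.Close()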
Functions
func DumpDatabase
func DumpDatabase(db *sql.DB, outputDir string, opts ...DumpOptions) error
DumpDatabase saves all database tables to files in the specified directory.
Basic usage:
err := filesql.DumpDatabase(db, "./output")
This will save all tables as CSV files in the output directory.
Advanced usage with options:
// Default: Export as CSV files
err := DumpDatabase(db, "./output")

// Export as TSV files with gzip compression
options := NewDumpOptions().
	WithFormat(OutputFormatTSV).
	WithCompression(CompressionGZ)
err = DumpDatabase(db, "./output", options)
Example
ExampleDumpDatabase demonstrates exporting modified data
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, filepath.Join(tmpDir, "employees.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Modify data in memory
_, err = db.ExecContext(ctx, `
UPDATE employees
SET salary = salary * 1.10
WHERE department_id = 1
`)
if err != nil {
log.Fatal(err)
}
// Add a new employee
_, err = db.ExecContext(ctx, `
INSERT INTO employees (id, name, department_id, salary, hire_date)
VALUES (99, 'New Employee', 2, 60000, '2024-01-01')
`)
if err != nil {
log.Fatal(err)
}
// Create output directory
outputDir := filepath.Join(tmpDir, "output")
err = os.MkdirAll(outputDir, 0750)
if err != nil {
log.Fatal(err)
}
// Export modified data
err = filesql.DumpDatabase(db, outputDir)
if err != nil {
log.Fatal(err)
}
// Verify export by reading the exported file
exportedFile := filepath.Join(outputDir, "employees.csv")
if _, err := os.Stat(exportedFile); err != nil {
log.Fatal("Exported file not found:", err)
}
// Count records in exported file
db2, err := filesql.OpenContext(ctx, exportedFile)
if err != nil {
log.Fatal(err)
}
defer db2.Close()
var count int
err = db2.QueryRowContext(ctx, "SELECT COUNT(*) FROM employees").Scan(&count)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Original file: 8 employees\n")
fmt.Printf("Modified and exported: %d employees\n", count)
// Extract just the filename for consistent output (normalize path separators for cross-platform compatibility)
exportPath := strings.Replace(exportedFile, tmpDir, "/tmp/filesql_example*", 1)
exportPath = strings.ReplaceAll(exportPath, "\\", "/") // Convert Windows backslashes to forward slashes
fmt.Printf("Export location: %s\n", exportPath)
}
Output:
Original file: 8 employees
Modified and exported: 9 employees
Export location: /tmp/filesql_example*/output/employees.csv
Example (DataProcessing)
package main
import (
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
tempDir := filepath.Join(os.TempDir(), "filesql_processing_example")
if err := os.MkdirAll(tempDir, 0750); err != nil {
log.Fatal(err)
}
defer os.RemoveAll(tempDir)
// Open CSV file
db, err := filesql.Open(filepath.Join("testdata", "sample.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Process data with SQL
_, err = db.Exec(`
UPDATE sample
SET age = age + 1
WHERE name LIKE '%John%'
`)
if err != nil {
log.Fatal(err)
}
// Add aggregated data
_, err = db.Exec(`
INSERT INTO sample (id, name, age, email)
SELECT 999, 'Summary: ' || COUNT(*), AVG(age), 'summary@example.com'
FROM sample
WHERE id < 999
`)
if err != nil {
log.Fatal(err)
}
// Export processed data in different formats for different use cases
// 1. TSV for spreadsheet import
options := filesql.NewDumpOptions().WithFormat(filesql.OutputFormatTSV)
spreadsheetDir := filepath.Join(tempDir, "for_spreadsheet")
if err := filesql.DumpDatabase(db, spreadsheetDir, options); err != nil {
log.Fatal(err)
}
fmt.Println("Exported TSV for spreadsheet import")
// 2. Compressed CSV for archival
options = filesql.NewDumpOptions().
WithFormat(filesql.OutputFormatCSV).
WithCompression(filesql.CompressionGZ)
archiveDir := filepath.Join(tempDir, "for_archive")
if err := filesql.DumpDatabase(db, archiveDir, options); err != nil {
log.Fatal(err)
}
fmt.Println("Exported compressed CSV for archival")
// 3. LTSV for log analysis
options = filesql.NewDumpOptions().WithFormat(filesql.OutputFormatLTSV)
logDir := filepath.Join(tempDir, "for_logs")
if err := filesql.DumpDatabase(db, logDir, options); err != nil {
log.Fatal(err)
}
fmt.Println("Exported LTSV for log analysis")
// Show what was created
dirs := []string{"for_spreadsheet", "for_archive", "for_logs"}
for _, dir := range dirs {
files, err := filepath.Glob(filepath.Join(tempDir, dir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files {
fmt.Printf("%s: %s\n", dir, filepath.Base(file))
}
}
}
Output:
Exported TSV for spreadsheet import
Exported compressed CSV for archival
Exported LTSV for log analysis
for_spreadsheet: sample.tsv
for_archive: sample.csv.gz
for_logs: sample.ltsv
Example (MultipleFormats)
package main
import (
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
tempDir := filepath.Join(os.TempDir(), "filesql_formats_example")
if err := os.MkdirAll(tempDir, 0750); err != nil {
log.Fatal(err)
}
defer os.RemoveAll(tempDir)
// Open CSV file and modify data
db, err := filesql.Open(filepath.Join("testdata", "sample.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Add some data to demonstrate functionality
_, err = db.Exec("INSERT INTO sample (id, name, age, email) VALUES (4, 'Alice Brown', 28, 'alice@example.com')")
if err != nil {
log.Fatal(err)
}
// Demonstrate different compression options
compressionTypes := []struct {
name string
compression filesql.CompressionType
extension string
}{
{"No compression", filesql.CompressionNone, ""},
{"Gzip compression", filesql.CompressionGZ, ".gz"},
{"XZ compression", filesql.CompressionXZ, ".xz"},
{"Zstd compression", filesql.CompressionZSTD, ".zst"},
}
for _, ct := range compressionTypes {
fmt.Printf("%s:\n", ct.name)
options := filesql.NewDumpOptions().
WithFormat(filesql.OutputFormatCSV).
WithCompression(ct.compression)
outputDir := filepath.Join(tempDir, "compression_"+ct.compression.String())
if err := filesql.DumpDatabase(db, outputDir, options); err != nil {
log.Fatal(err)
}
files, err := filepath.Glob(filepath.Join(outputDir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files {
fmt.Printf(" %s\n", filepath.Base(file))
}
}
}
Output:
No compression:
  sample.csv
Gzip compression:
  sample.csv.gz
XZ compression:
  sample.csv.xz
Zstd compression:
  sample.csv.zst
Example (WithOptions)
package main
import (
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create a temporary directory for output
tempDir := filepath.Join(os.TempDir(), "filesql_dump_example")
if err := os.MkdirAll(tempDir, 0750); err != nil {
log.Fatal(err)
}
defer os.RemoveAll(tempDir)
// Open CSV file
db, err := filesql.Open(filepath.Join("testdata", "sample.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Example 1: Default CSV output (no options)
fmt.Println("Example 1: Default CSV output")
csvDir := filepath.Join(tempDir, "csv_output")
if err := filesql.DumpDatabase(db, csvDir); err != nil {
log.Fatal(err)
}
// List output files
files1, err := filepath.Glob(filepath.Join(csvDir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files1 {
fmt.Printf("Created: %s\n", filepath.Base(file))
}
// Example 2: TSV output with gzip compression
fmt.Println("\nExample 2: TSV output with gzip compression")
tsvDir := filepath.Join(tempDir, "tsv_output")
options := filesql.NewDumpOptions().
WithFormat(filesql.OutputFormatTSV).
WithCompression(filesql.CompressionGZ)
if err := filesql.DumpDatabase(db, tsvDir, options); err != nil {
log.Fatal(err)
}
files2, err := filepath.Glob(filepath.Join(tsvDir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files2 {
fmt.Printf("Created: %s\n", filepath.Base(file))
}
// Example 3: LTSV output with zstd compression
fmt.Println("\nExample 3: LTSV output with zstd compression")
ltsvDir := filepath.Join(tempDir, "ltsv_output")
options3 := filesql.NewDumpOptions().
WithFormat(filesql.OutputFormatLTSV).
WithCompression(filesql.CompressionZSTD)
if err := filesql.DumpDatabase(db, ltsvDir, options3); err != nil {
log.Fatal(err)
}
files3, err := filepath.Glob(filepath.Join(ltsvDir, "*"))
if err != nil {
log.Fatal(err)
}
for _, file := range files3 {
fmt.Printf("Created: %s\n", filepath.Base(file))
}
}
Output:
Example 1: Default CSV output
Created: sample.csv

Example 2: TSV output with gzip compression
Created: sample.tsv.gz

Example 3: LTSV output with zstd compression
Created: sample.ltsv.zst
func Open
func Open(paths ...string) (*sql.DB, error)
Open creates an SQL database from CSV, TSV, or LTSV files.
Quick start:
db, err := filesql.Open("data.csv")
if err != nil {
return err
}
defer db.Close()
rows, err := db.Query("SELECT * FROM data WHERE age > 25")
Parameters:
- paths: One or more file paths or directories
- Files: "users.csv", "products.tsv", "logs.ltsv"
- Compressed: "data.csv.gz", "archive.tsv.bz2"
- Directories: "/data/" (loads all CSV/TSV/LTSV files recursively)
Table names:
- "users.csv" β table "users"
- "data.tsv.gz" β table "data"
- "/path/to/sales.csv" β table "sales"
Note: Original files are never modified; changes exist only in memory. To save changes, use the DumpDatabase function.
Example with a complex query:
// Open a single CSV file
db, err := filesql.Open("data/users.csv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Complex query with JOINs, aggregation, and window functions
rows, err := db.Query(`
SELECT
u.name,
u.department,
u.salary,
AVG(u.salary) OVER (PARTITION BY u.department) as dept_avg_salary,
RANK() OVER (PARTITION BY u.department ORDER BY u.salary DESC) as salary_rank,
COUNT(*) OVER (PARTITION BY u.department) as dept_size
FROM users u
WHERE u.salary > (
SELECT AVG(salary) * 0.8
FROM users
WHERE department = u.department
)
ORDER BY u.department, u.salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
// Process results
for rows.Next() {
var name, dept string
var salary, deptAvg float64
var rank, deptSize int
if err := rows.Scan(&name, &dept, &salary, &deptAvg, &rank, &deptSize); err != nil {
log.Fatal(err)
}
fmt.Printf("%s (%s): $%.2f (Rank: %d/%d, Dept Avg: $%.2f)\n",
name, dept, salary, rank, deptSize, deptAvg)
}
Example
ExampleOpen demonstrates how to use filesql.Open() with complex SQL queries. This example shows advanced SQL features including JOINs, window functions, subqueries, and aggregations on CSV data loaded into an in-memory SQLite database.
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary test data files
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
// Open the database with multiple files
db, err := filesql.Open(filepath.Join(tmpDir, "employees.csv"), filepath.Join(tmpDir, "departments.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Complex query demonstrating multiple SQL features:
// - JOINs between tables
// - Window functions (RANK, AVG, COUNT)
// - Subqueries with correlated conditions
// - CASE statements
// - Grouping and ordering
query := `
SELECT
e.name,
d.name as department_name,
e.salary,
d.budget,
RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) as salary_rank_in_dept,
AVG(e.salary) OVER (PARTITION BY e.department_id) as dept_avg_salary,
COUNT(*) OVER (PARTITION BY e.department_id) as dept_employee_count,
CASE
WHEN e.salary > (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) * 1.2
THEN 'High Performer'
WHEN e.salary < (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) * 0.8
THEN 'Below Average'
ELSE 'Average'
END as performance_category,
ROUND(e.salary / d.budget * 100, 2) as salary_budget_percentage
FROM employees e
JOIN departments d ON e.department_id = d.id
WHERE e.salary > (
SELECT AVG(salary) * 0.7
FROM employees e2
WHERE e2.department_id = e.department_id
)
AND d.budget > 500000
ORDER BY d.name, e.salary DESC
LIMIT 10
`
rows, err := db.QueryContext(context.Background(), query)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Employee Analysis Report:")
fmt.Println("========================")
for rows.Next() {
var name, deptName, perfCategory string
var salary, budget, deptAvgSalary, salaryBudgetPct float64
var salaryRank, deptEmpCount int
err := rows.Scan(&name, &deptName, &salary, &budget, &salaryRank,
&deptAvgSalary, &deptEmpCount, &perfCategory, &salaryBudgetPct)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s | %-12s | $%7.0f | Rank: %d/%d | %s\n",
name, deptName, salary, salaryRank, deptEmpCount, perfCategory)
}
if err = rows.Err(); err != nil {
log.Fatal(err)
}
}
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:
Employee Analysis Report:
========================
Alice Johnson   | Engineering  | $  95000 | Rank: 1/4 | Average
Bob Smith       | Engineering  | $  85000 | Rank: 2/4 | Average
Charlie Brown   | Engineering  | $  80000 | Rank: 3/4 | Average
David Wilson    | Engineering  | $  75000 | Rank: 4/4 | Average
Eve Davis       | Marketing    | $  70000 | Rank: 1/2 | Average
Frank Miller    | Marketing    | $  65000 | Rank: 2/2 | Average
Grace Lee       | Sales        | $  60000 | Rank: 1/2 | Average
Henry Taylor    | Sales        | $  55000 | Rank: 2/2 | Average
Example (AdvancedSQL)
ExampleOpen_advancedSQL demonstrates advanced SQL features available in SQLite3
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createAdvancedTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("=== Advanced SQL Features ===")
// Window functions with RANK() and ROW_NUMBER()
fmt.Println("\n1. Window Functions - Employee Rankings by Department:")
rows, err := db.QueryContext(ctx, `
SELECT
e.name,
d.name as department,
e.salary,
RANK() OVER (PARTITION BY e.department_id ORDER BY e.salary DESC) as salary_rank,
ROW_NUMBER() OVER (ORDER BY e.salary DESC) as overall_rank
FROM employees e
JOIN departments d ON e.department_id = d.id
ORDER BY e.department_id, salary_rank
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-12s %-8s %-10s %s\n", "Name", "Department", "Salary", "Dept Rank", "Overall Rank")
fmt.Println(strings.Repeat("-", 65))
for rows.Next() {
var name, department string
var salary float64
var salaryRank, overallRank int
if err := rows.Scan(&name, &department, &salary, &salaryRank, &overallRank); err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-12s $%-7.0f %-10d %d\n", name, department, salary, salaryRank, overallRank)
}
_ = rows.Close() // Ignore close error in test cleanup
// Common Table Expressions (CTE)
fmt.Println("\n2. Common Table Expressions - Department Analysis:")
rows, err = db.QueryContext(ctx, `
WITH dept_stats AS (
SELECT
d.name as department,
COUNT(e.id) as employee_count,
AVG(e.salary) as avg_salary,
MAX(e.salary) as max_salary,
MIN(e.salary) as min_salary
FROM departments d
LEFT JOIN employees e ON d.id = e.department_id
GROUP BY d.id, d.name
),
company_avg AS (
SELECT AVG(salary) as company_avg_salary
FROM employees
)
SELECT
ds.department,
ds.employee_count,
ds.avg_salary,
ca.company_avg_salary,
ds.avg_salary - ca.company_avg_salary as salary_diff,
CASE
WHEN ds.avg_salary > ca.company_avg_salary THEN 'Above Average'
WHEN ds.avg_salary < ca.company_avg_salary THEN 'Below Average'
ELSE 'At Average'
END as comparison
FROM dept_stats ds
CROSS JOIN company_avg ca
WHERE ds.employee_count > 0
ORDER BY ds.avg_salary DESC
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-5s %-10s %-12s %-10s %s\n", "Department", "Count", "Avg Salary", "Company Avg", "Difference", "Comparison")
fmt.Println(strings.Repeat("-", 75))
for rows.Next() {
var department, comparison string
var employeeCount int
var avgSalary, companyAvg, salaryDiff float64
if err := rows.Scan(&department, &employeeCount, &avgSalary, &companyAvg, &salaryDiff, &comparison); err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-5d $%-9.0f $%-11.0f $%-9.0f %s\n",
department, employeeCount, avgSalary, companyAvg, salaryDiff, comparison)
}
_ = rows.Close() // Ignore close error in test cleanup
// Text functions for name analysis
fmt.Println("\n3. Text Functions - Name Analysis:")
rows, err = db.QueryContext(ctx, `
SELECT
name,
LENGTH(name) as name_length,
UPPER(SUBSTR(name, 1, 1)) || LOWER(SUBSTR(name, 2)) as formatted_name,
INSTR(name, ' ') as space_position,
CASE
WHEN INSTR(name, ' ') > 0 THEN SUBSTR(name, 1, INSTR(name, ' ') - 1)
ELSE name
END as first_name
FROM employees
WHERE LENGTH(name) > 8
ORDER BY name_length DESC
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-6s %-15s %-8s %s\n", "Name", "Length", "Formatted", "Space@", "First Name")
fmt.Println(strings.Repeat("-", 60))
for rows.Next() {
var name, formattedName, firstName string
var nameLength, spacePos int
if err := rows.Scan(&name, &nameLength, &formattedName, &spacePos, &firstName); err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-6d %-15s %-8d %s\n", name, nameLength, formattedName, spacePos, firstName)
}
_ = rows.Close() // Ignore close error in test cleanup
}
// createAdvancedTestData creates test data for advanced SQL examples
func createAdvancedTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_advanced_example_*")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2023-01-15
2,Bob Smith,2,85000,2023-02-20
3,Charlie Brown,1,80000,2023-03-10
4,David Wilson,1,75000,2023-04-05
5,Eve Davis,2,65000,2023-05-15
6,Frank Miller,3,70000,2023-06-01`
employeesFile := filepath.Join(tmpDir, "employees.csv")
err = os.WriteFile(employeesFile, []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget
1,Engineering,500000
2,Sales,300000
3,Marketing,200000
4,HR,150000`
departmentsFile := filepath.Join(tmpDir, "departments.csv")
err = os.WriteFile(departmentsFile, []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:
=== Advanced SQL Features ===

1. Window Functions - Employee Rankings by Department:
Name            Department   Salary   Dept Rank  Overall Rank
-----------------------------------------------------------------
Alice Johnson   Engineering  $95000   1          1
Charlie Brown   Engineering  $80000   2          3
David Wilson    Engineering  $75000   3          4
Bob Smith       Sales        $85000   1          2
Eve Davis       Sales        $65000   2          6
Frank Miller    Marketing    $70000   1          5

2. Common Table Expressions - Department Analysis:
Department   Count Avg Salary Company Avg  Difference Comparison
---------------------------------------------------------------------------
Engineering  3     $83333     $78333      $5000      Above Average
Sales        2     $75000     $78333      $-3333     Below Average
Marketing    1     $70000     $78333      $-8333     Below Average

3. Text Functions - Name Analysis:
Name            Length Formatted       Space@   First Name
------------------------------------------------------------
Alice Johnson   13     Alice johnson   6        Alice
Charlie Brown   13     Charlie brown   8        Charlie
David Wilson    12     David wilson    6        David
Frank Miller    12     Frank miller    6        Frank
Bob Smith       9      Bob smith       4        Bob
Eve Davis       9      Eve davis       4        Eve
Example (CompressionSupport)
ExampleOpen_compressionSupport demonstrates working with compressed files
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createCompressedTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
// Open compressed files seamlessly
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("=== Compression Support Demo ===")
fmt.Println("Successfully loaded compressed files:")
// List all tables from compressed files
rows, err := db.QueryContext(ctx, `
SELECT name, sql
FROM sqlite_master
WHERE type='table'
ORDER BY name
`)
if err != nil {
log.Fatal(err)
}
for rows.Next() {
var tableName, createSQL string
if err := rows.Scan(&tableName, &createSQL); err != nil {
log.Fatal(err)
}
// Count records in each table
var count int
countQuery := "SELECT COUNT(*) FROM " + tableName
if err := db.QueryRowContext(ctx, countQuery).Scan(&count); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s: %d records\n", tableName, count)
}
_ = rows.Close() // Ignore close error in test cleanup
// Demonstrate querying across compressed files
fmt.Println("\nCross-file analysis from compressed data:")
analysisRows, err := db.QueryContext(ctx, `
SELECT
'logs' as source_table,
COUNT(*) as total_records,
MIN(timestamp) as earliest,
MAX(timestamp) as latest
FROM logs
UNION ALL
SELECT
'products' as source_table,
COUNT(*) as total_records,
'N/A' as earliest,
'N/A' as latest
FROM products
ORDER BY source_table
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-8s %-19s %s\n", "Table", "Records", "Earliest", "Latest")
fmt.Println(strings.Repeat("-", 60))
for analysisRows.Next() {
var sourceTable, earliest, latest string
var totalRecords int
if err := analysisRows.Scan(&sourceTable, &totalRecords, &earliest, &latest); err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-8d %-19s %s\n", sourceTable, totalRecords, earliest, latest)
}
_ = analysisRows.Close() // Ignore close error in test cleanup
}
// createCompressedTestData creates test data for the compression demo (written here as plain CSV; compressed variants load the same way)
func createCompressedTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_compressed_example_*")
if err != nil {
log.Fatal(err)
}
logsData := `timestamp,level,message,user_id
2024-01-01 10:00:00,INFO,User login,1001
2024-01-01 11:30:00,INFO,Order created,1002
2024-01-01 12:15:00,ERROR,Payment failed,1003
2024-01-01 13:45:00,INFO,User logout,1001
2024-01-01 14:00:00,INFO,System backup completed,0`
logsFile := filepath.Join(tmpDir, "logs.csv")
err = os.WriteFile(logsFile, []byte(logsData), 0600)
if err != nil {
log.Fatal(err)
}
productsData := `id,name,category,price,in_stock
1,Laptop Pro,Electronics,1299.99,true
2,Office Chair,Furniture,299.99,true
3,Wireless Mouse,Electronics,49.99,false`
productsFile := filepath.Join(tmpDir, "products.csv")
err = os.WriteFile(productsFile, []byte(productsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:
=== Compression Support Demo ===
Successfully loaded compressed files:
- logs: 5 records
- products: 3 records

Cross-file analysis from compressed data:
Table        Records  Earliest            Latest
------------------------------------------------------------
logs         5        2024-01-01 10:00:00 2024-01-01 14:00:00
products     3        N/A                 N/A
Example (Constraints)
ExampleOpen_constraints demonstrates the constraint that modifications don't affect original files
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
db, err := filesql.Open(filepath.Join(tmpDir, "employees.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Show original data count
var originalCount int
err = db.QueryRowContext(context.Background(), "SELECT COUNT(*) FROM employees").Scan(&originalCount)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Original employee count: %d\n", originalCount)
// Insert new data (only affects in-memory database)
_, err = db.ExecContext(context.Background(), "INSERT INTO employees (id, name, department_id, salary, hire_date) VALUES (99, 'Test User', 1, 50000, '2023-01-01')")
if err != nil {
log.Fatal(err)
}
// Show in-memory count
var memoryCount int
err = db.QueryRowContext(context.Background(), "SELECT COUNT(*) FROM employees").Scan(&memoryCount)
if err != nil {
log.Fatal(err)
}
fmt.Printf("In-memory count after INSERT: %d\n", memoryCount)
// Verify original file is unchanged by reopening
db2, err := filesql.Open(filepath.Join(tmpDir, "employees.csv"))
if err != nil {
log.Fatal(err)
}
defer db2.Close()
var fileCount int
err = db2.QueryRowContext(context.Background(), "SELECT COUNT(*) FROM employees").Scan(&fileCount)
if err != nil {
log.Fatal(err)
}
fmt.Printf("File-based count (unchanged): %d\n", fileCount)
}
Output:
Original employee count: 8
In-memory count after INSERT: 9
File-based count (unchanged): 8
Example (CustomerInsights)
ExampleOpen_customerInsights demonstrates customer behavior analysis
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
// createSalesTestData creates test data for sales analysis examples
func createSalesTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_sales_example")
if err != nil {
log.Fatal(err)
}
salesData := `order_id,customer_id,product_name,category,quantity,unit_price,order_date,region
1,101,Laptop Pro,Electronics,2,1299.99,2024-01-15,North
2,102,Wireless Mouse,Electronics,1,29.99,2024-01-16,South
3,103,Office Chair,Furniture,1,299.99,2024-01-17,East
4,101,USB Cable,Electronics,3,12.99,2024-01-18,North
5,104,Standing Desk,Furniture,1,599.99,2024-01-19,West
6,105,Bluetooth Speaker,Electronics,2,79.99,2024-01-20,South
7,106,Coffee Table,Furniture,1,199.99,2024-01-21,East
8,102,Keyboard,Electronics,1,89.99,2024-01-22,South
9,107,Monitor 24inch,Electronics,1,249.99,2024-01-23,North
10,103,Desk Lamp,Furniture,2,39.99,2024-01-24,East`
err = os.WriteFile(filepath.Join(tmpDir, "sales.csv"), []byte(salesData), 0600)
if err != nil {
log.Fatal(err)
}
customersData := `customer_id,name,email,city,registration_date
101,John Doe,john@example.com,New York,2023-06-01
102,Jane Smith,jane@example.com,Los Angeles,2023-07-15
103,Bob Johnson,bob@example.com,Chicago,2023-08-20
104,Alice Brown,alice@example.com,Houston,2023-09-10
105,Charlie Wilson,charlie@example.com,Phoenix,2023-10-05
106,Diana Lee,diana@example.com,Philadelphia,2023-11-12
107,Frank Miller,frank@example.com,San Antonio,2023-12-03`
err = os.WriteFile(filepath.Join(tmpDir, "customers.csv"), []byte(customersData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createSalesTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Customer lifetime value and behavior analysis
query := `
SELECT
c.name,
c.city,
COUNT(s.order_id) as total_orders,
SUM(s.quantity * s.unit_price) as lifetime_value,
AVG(s.quantity * s.unit_price) as avg_order_value,
MIN(s.order_date) as first_purchase,
MAX(s.order_date) as last_purchase,
julianday(MAX(s.order_date)) - julianday(MIN(s.order_date)) as days_active,
COUNT(DISTINCT s.category) as categories_purchased
FROM customers c
JOIN sales s ON c.customer_id = s.customer_id
GROUP BY c.customer_id, c.name, c.city
HAVING total_orders > 1
ORDER BY lifetime_value DESC
`
rows, err := db.QueryContext(ctx, query)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Customer Insights (Multi-Purchase Customers):")
fmt.Println("===========================================")
fmt.Printf("%-12s %-12s %-7s %-10s %-10s %-12s %-12s %-6s %s\n",
"Name", "City", "Orders", "LTV", "Avg Order", "First Buy", "Last Buy", "Days", "Categories")
fmt.Println(strings.Repeat("-", 100))
for rows.Next() {
var name, city, firstPurchase, lastPurchase string
var totalOrders, daysActive, categoriesPurchased int
var lifetimeValue, avgOrderValue float64
err := rows.Scan(&name, &city, &totalOrders, &lifetimeValue, &avgOrderValue,
&firstPurchase, &lastPurchase, &daysActive, &categoriesPurchased)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-12s %-7d $%-9.2f $%-9.2f %-12s %-12s %-6d %d\n",
name, city, totalOrders, lifetimeValue, avgOrderValue,
firstPurchase, lastPurchase, daysActive, categoriesPurchased)
}
}
Output:
Customer Insights (Multi-Purchase Customers):
===========================================
Name         City         Orders  LTV        Avg Order  First Buy    Last Buy     Days   Categories
----------------------------------------------------------------------------------------------------
John Doe     New York     2       $2638.95   $1319.47   2024-01-15   2024-01-18   3      1
Bob Johnson  Chicago      2       $379.97    $189.99    2024-01-17   2024-01-24   7      1
Jane Smith   Los Angeles  2       $119.98    $59.99     2024-01-16   2024-01-22   6      1
Example (ErrorHandling)
ExampleOpen_errorHandling demonstrates proper error handling patterns
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
// Example 1: Handling non-existent files gracefully
_, err := filesql.Open("nonexistent.csv")
if err != nil {
fmt.Printf("Expected error for non-existent file: %v\n", err)
}
// Example 2: Context timeout handling
ctx, cancel := context.WithTimeout(context.Background(), 1*time.Nanosecond) // Very short timeout
defer cancel()
// This will likely timeout
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
time.Sleep(10 * time.Millisecond) // Ensure timeout triggers
_, err = filesql.OpenContext(ctx, tmpDir)
if err != nil {
// Extract the core error message (context deadline exceeded)
errMsg := err.Error()
if strings.Contains(errMsg, "context deadline exceeded") {
fmt.Printf("Expected timeout error: %s\n", "context deadline exceeded")
} else {
fmt.Printf("Expected timeout error: %v\n", err)
}
}
// Example 3: Successful operation with proper error checking
ctx2, cancel2 := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel2()
db, err := filesql.OpenContext(ctx2, tmpDir)
if err != nil {
fmt.Printf("Unexpected error: %v\n", err)
return
}
defer db.Close()
// Test query with error handling
rows, err := db.QueryContext(ctx2, "SELECT COUNT(*) FROM employees")
if err != nil {
fmt.Printf("Query error: %v\n", err)
return
}
defer rows.Close()
var count int
if rows.Next() {
if err := rows.Scan(&count); err != nil {
fmt.Printf("Scan error: %v\n", err)
return
}
fmt.Printf("Successfully counted %d employees\n", count)
}
if err := rows.Err(); err != nil {
fmt.Printf("Rows iteration error: %v\n", err)
return
}
}
Output:
Expected error for non-existent file: failed to load file: path does not exist: nonexistent.csv
Expected timeout error: context deadline exceeded
Successfully counted 8 employees
Example (FinancialDataAnalysis)
ExampleOpen_financialDataAnalysis demonstrates financial data processing
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createFinancialTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("=== Financial Data Analysis ===")
// Monthly revenue trend
fmt.Println("\n1. Monthly Revenue Trend:")
rows, err := db.QueryContext(ctx, `
SELECT
strftime('%Y-%m', transaction_date) as month,
COUNT(*) as transaction_count,
SUM(amount) as total_revenue,
AVG(amount) as avg_transaction,
MAX(amount) as largest_transaction
FROM transactions
WHERE type = 'sale'
GROUP BY month
ORDER BY month
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-8s %-6s %-10s %-8s %s\n", "Month", "Count", "Revenue", "Average", "Largest")
fmt.Println(strings.Repeat("-", 50))
for rows.Next() {
var month string
var count int
var revenue, average, largest float64
if err := rows.Scan(&month, &count, &revenue, &average, &largest); err != nil {
log.Fatal(err)
}
fmt.Printf("%-8s %-6d $%-9.2f $%-7.2f $%.2f\n", month, count, revenue, average, largest)
}
_ = rows.Close() // Ignore close error in test cleanup
// Expense category breakdown
fmt.Println("\n2. Expense Categories:")
rows, err = db.QueryContext(ctx, `
SELECT
category,
COUNT(*) as transaction_count,
SUM(ABS(amount)) as total_expense,
ROUND(SUM(ABS(amount)) * 100.0 / (
SELECT SUM(ABS(amount)) FROM transactions WHERE type = 'expense'
), 2) as percentage
FROM transactions
WHERE type = 'expense'
GROUP BY category
ORDER BY total_expense DESC
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-6s %-12s %-10s\n", "Category", "Count", "Total", "Percentage")
fmt.Println(strings.Repeat("-", 45))
for rows.Next() {
var category string
var count int
var expense, percentage float64
if err := rows.Scan(&category, &count, &expense, &percentage); err != nil {
log.Fatal(err)
}
fmt.Printf("%-15s %-6d $%-11.2f %-10.2f%%\n", category, count, expense, percentage)
}
_ = rows.Close() // Ignore close error in test cleanup
// Cash flow summary
fmt.Println("\n3. Cash Flow Summary:")
var totalIncome, totalExpenses, netIncome float64
err = db.QueryRowContext(ctx, `
SELECT
SUM(CASE WHEN type = 'sale' THEN amount ELSE 0 END) as total_income,
SUM(CASE WHEN type = 'expense' THEN ABS(amount) ELSE 0 END) as total_expenses,
SUM(CASE WHEN type = 'sale' THEN amount ELSE -ABS(amount) END) as net_income
FROM transactions
`).Scan(&totalIncome, &totalExpenses, &netIncome)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Total Income: $%.2f\n", totalIncome)
fmt.Printf("Total Expenses: $%.2f\n", totalExpenses)
fmt.Printf("Net Income: $%.2f\n", netIncome)
fmt.Printf("Profit Margin: %.2f%%\n", (netIncome/totalIncome)*100)
}
// createFinancialTestData creates test data for financial analysis examples
func createFinancialTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_financial_example_*")
if err != nil {
log.Fatal(err)
}
transactionData := `transaction_id,transaction_date,type,category,amount,description
1,2024-01-15,sale,Product Sales,1500.00,Sale of premium product
2,2024-01-20,sale,Product Sales,850.00,Sale of standard product
3,2024-01-25,sale,Service,1200.00,Consulting service
4,2024-01-10,expense,Office Supplies,-150.00,Office equipment purchase
5,2024-01-18,expense,Marketing,-250.00,Social media advertising
6,2024-02-05,sale,Product Sales,1200.00,Sale of premium product
7,2024-02-15,sale,Service,1000.00,Training service
8,2024-02-08,expense,Office Supplies,-200.00,Stationery purchase`
transactionFile := filepath.Join(tmpDir, "transactions.csv")
err = os.WriteFile(transactionFile, []byte(transactionData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:
=== Financial Data Analysis ===

1. Monthly Revenue Trend:
Month    Count  Revenue    Average  Largest
--------------------------------------------------
2024-01  3      $3550.00   $1183.33 $1500.00
2024-02  2      $2200.00   $1100.00 $1200.00

2. Expense Categories:
Category        Count  Total        Percentage
---------------------------------------------
Office Supplies 2      $350.00      58.33     %
Marketing       1      $250.00      41.67     %

3. Cash Flow Summary:
Total Income: $5750.00
Total Expenses: $600.00
Net Income: $5150.00
Profit Margin: 89.57%
Example (MultipleFiles)
ExampleOpen_multipleFiles demonstrates opening multiple files and directories
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
// Open database with multiple paths (files and directories)
db, err := filesql.Open(tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query to show all available tables
rows, err := db.QueryContext(context.Background(), "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Available tables:")
for rows.Next() {
var tableName string
if err := rows.Scan(&tableName); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s\n", tableName)
}
}
Output:
Available tables:
- departments
- employees
Example (PerformanceOptimization)
ExampleOpen_performanceOptimization demonstrates techniques for handling large datasets efficiently
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createLargeTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Technique 1: Use LIMIT and OFFSET for pagination
fmt.Println("=== Performance Optimization Techniques ===")
fmt.Println("\n1. Pagination with LIMIT and OFFSET:")
pageSize := 3
offset := 0
for page := 1; page <= 2; page++ {
rows, err := db.QueryContext(ctx, `
SELECT customer_id, name, total_orders
FROM customer_summary
ORDER BY total_orders DESC
LIMIT ? OFFSET ?
`, pageSize, offset)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Page %d:\n", page)
for rows.Next() {
var customerID int
var name string
var totalOrders int
if err := rows.Scan(&customerID, &name, &totalOrders); err != nil {
log.Fatal(err)
}
fmt.Printf(" - %s (ID: %d, Orders: %d)\n", name, customerID, totalOrders)
}
_ = rows.Close() // Ignore close error in test cleanup
offset += pageSize
}
// Technique 2: Use indexes by querying with WHERE clauses on sorted columns
fmt.Println("\n2. Efficient filtering with indexes:")
rows, err := db.QueryContext(ctx, `
SELECT name, email, registration_date
FROM customer_summary
WHERE total_spent > 1000
ORDER BY total_spent DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("High-value customers:")
for rows.Next() {
var name, email, regDate string
if err := rows.Scan(&name, &email, &regDate); err != nil {
log.Fatal(err)
}
fmt.Printf(" - %s (%s) - Registered: %s\n", name, email, regDate)
}
// Technique 3: Aggregate queries for summary statistics
fmt.Println("\n3. Summary statistics:")
var totalCustomers int
var avgOrders, totalRevenue, avgSpent float64
err = db.QueryRowContext(ctx, `
SELECT
COUNT(*) as total_customers,
AVG(total_orders) as avg_orders,
SUM(total_spent) as total_revenue,
AVG(total_spent) as avg_spent
FROM customer_summary
`).Scan(&totalCustomers, &avgOrders, &totalRevenue, &avgSpent)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Total customers: %d\n", totalCustomers)
fmt.Printf("Average orders per customer: %.1f\n", avgOrders)
fmt.Printf("Total revenue: $%.2f\n", totalRevenue)
fmt.Printf("Average customer value: $%.2f\n", avgSpent)
}
// createLargeTestData creates test data for performance optimization examples
func createLargeTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_large_example_*")
if err != nil {
log.Fatal(err)
}
customerData := `customer_id,name,email,registration_date,total_orders,total_spent
1001,Premium Customer A,premium.a@example.com,2023-01-15,15,1500.00
1002,Premium Customer B,premium.b@example.com,2023-02-20,12,1200.00
1003,Premium Customer C,premium.c@example.com,2023-03-10,10,1000.00
1004,Regular Customer D,regular.d@example.com,2023-04-05,8,800.00
1005,Regular Customer E,regular.e@example.com,2023-05-15,6,600.00
1006,Regular Customer F,regular.f@example.com,2023-06-20,5,500.00
1007,Budget Customer G,budget.g@example.com,2023-07-10,3,300.00
1008,Budget Customer H,budget.h@example.com,2023-08-25,2,200.00
1009,New Customer I,new.i@example.com,2023-09-30,1,100.00
1010,New Customer J,new.j@example.com,2023-10-15,1,100.00`
customerFile := filepath.Join(tmpDir, "customer_summary.csv")
err = os.WriteFile(customerFile, []byte(customerData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:
=== Performance Optimization Techniques ===

1. Pagination with LIMIT and OFFSET:
Page 1:
  - Premium Customer A (ID: 1001, Orders: 15)
  - Premium Customer B (ID: 1002, Orders: 12)
  - Premium Customer C (ID: 1003, Orders: 10)
Page 2:
  - Regular Customer D (ID: 1004, Orders: 8)
  - Regular Customer E (ID: 1005, Orders: 6)
  - Regular Customer F (ID: 1006, Orders: 5)

2. Efficient filtering with indexes:
High-value customers:
  - Premium Customer A (premium.a@example.com) - Registered: 2023-01-15
  - Premium Customer B (premium.b@example.com) - Registered: 2023-02-20

3. Summary statistics:
Total customers: 10
Average orders per customer: 6.3
Total revenue: $6300.00
Average customer value: $630.00
Example (SalesAnalysis)
ExampleOpen_salesAnalysis demonstrates practical sales data analysis
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
// createSalesTestData creates test data for sales analysis examples
func createSalesTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_sales_example")
if err != nil {
log.Fatal(err)
}
salesData := `order_id,customer_id,product_name,category,quantity,unit_price,order_date,region
1,101,Laptop Pro,Electronics,2,1299.99,2024-01-15,North
2,102,Wireless Mouse,Electronics,1,29.99,2024-01-16,South
3,103,Office Chair,Furniture,1,299.99,2024-01-17,East
4,101,USB Cable,Electronics,3,12.99,2024-01-18,North
5,104,Standing Desk,Furniture,1,599.99,2024-01-19,West
6,105,Bluetooth Speaker,Electronics,2,79.99,2024-01-20,South
7,106,Coffee Table,Furniture,1,199.99,2024-01-21,East
8,102,Keyboard,Electronics,1,89.99,2024-01-22,South
9,107,Monitor 24inch,Electronics,1,249.99,2024-01-23,North
10,103,Desk Lamp,Furniture,2,39.99,2024-01-24,East`
err = os.WriteFile(filepath.Join(tmpDir, "sales.csv"), []byte(salesData), 0600)
if err != nil {
log.Fatal(err)
}
customersData := `customer_id,name,email,city,registration_date
101,John Doe,john@example.com,New York,2023-06-01
102,Jane Smith,jane@example.com,Los Angeles,2023-07-15
103,Bob Johnson,bob@example.com,Chicago,2023-08-20
104,Alice Brown,alice@example.com,Houston,2023-09-10
105,Charlie Wilson,charlie@example.com,Phoenix,2023-10-05
106,Diana Lee,diana@example.com,Philadelphia,2023-11-12
107,Frank Miller,frank@example.com,San Antonio,2023-12-03`
err = os.WriteFile(filepath.Join(tmpDir, "customers.csv"), []byte(customersData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createSalesTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Sales summary by category and region
query := `
SELECT
category,
region,
COUNT(*) as order_count,
SUM(quantity * unit_price) as total_revenue,
AVG(quantity * unit_price) as avg_order_value,
MIN(order_date) as first_order,
MAX(order_date) as last_order
FROM sales
GROUP BY category, region
ORDER BY total_revenue DESC
`
rows, err := db.QueryContext(ctx, query)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Sales Analysis by Category and Region:")
fmt.Println("=====================================")
fmt.Printf("%-12s %-8s %-6s %-10s %-12s %-12s %s\n",
"Category", "Region", "Orders", "Revenue", "Avg Order", "First Order", "Last Order")
fmt.Println(strings.Repeat("-", 80))
for rows.Next() {
var category, region, firstOrder, lastOrder string
var orderCount int
var totalRevenue, avgOrderValue float64
err := rows.Scan(&category, &region, &orderCount, &totalRevenue, &avgOrderValue, &firstOrder, &lastOrder)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-8s %-6d $%-9.2f $%-11.2f %-12s %s\n",
category, region, orderCount, totalRevenue, avgOrderValue, firstOrder, lastOrder)
}
}
Output:
Sales Analysis by Category and Region:
=====================================
Category     Region   Orders Revenue    Avg Order    First Order  Last Order
--------------------------------------------------------------------------------
Electronics  North    3      $2888.94   $962.98      2024-01-15   2024-01-23
Furniture    West     1      $599.99    $599.99      2024-01-19   2024-01-19
Furniture    East     3      $579.96    $193.32      2024-01-17   2024-01-24
Electronics  South    3      $279.96    $93.32       2024-01-16   2024-01-22
Example (WebLogAnalysis)
ExampleOpen_webLogAnalysis demonstrates analyzing web server logs
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"strings"
"time"
"github.com/nao1215/filesql"
)
func main() {
tmpDir := createWebLogTestData()
defer os.RemoveAll(tmpDir)
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, tmpDir)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("=== Web Log Analysis ===")
// Top pages by hits
fmt.Println("\n1. Top Pages by Hits:")
rows, err := db.QueryContext(ctx, `
SELECT
path,
COUNT(*) as hits,
COUNT(DISTINCT ip_address) as unique_visitors
FROM access_logs
WHERE status_code = 200
GROUP BY path
ORDER BY hits DESC
LIMIT 5
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-20s %-6s %s\n", "Path", "Hits", "Unique")
fmt.Println(strings.Repeat("-", 35))
for rows.Next() {
var path string
var hits, unique int
if err := rows.Scan(&path, &hits, &unique); err != nil {
log.Fatal(err)
}
fmt.Printf("%-20s %-6d %d\n", path, hits, unique)
}
_ = rows.Close() // Ignore close error in test cleanup
// Error analysis
fmt.Println("\n2. Error Analysis:")
rows, err = db.QueryContext(ctx, `
SELECT
status_code,
COUNT(*) as error_count,
ROUND(COUNT(*) * 100.0 / (SELECT COUNT(*) FROM access_logs), 2) as percentage
FROM access_logs
WHERE status_code >= 400
GROUP BY status_code
ORDER BY error_count DESC
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-12s %-6s %-10s\n", "Status Code", "Count", "Percentage")
fmt.Println(strings.Repeat("-", 30))
for rows.Next() {
var statusCode, errorCount int
var percentage float64
if err := rows.Scan(&statusCode, &errorCount, &percentage); err != nil {
log.Fatal(err)
}
fmt.Printf("%-12d %-6d %-10.2f%%\n", statusCode, errorCount, percentage)
}
_ = rows.Close() // Ignore close error in test cleanup
// Hourly traffic pattern
fmt.Println("\n3. Traffic by Hour:")
rows, err = db.QueryContext(ctx, `
SELECT
CAST(strftime('%H', timestamp) AS INTEGER) as hour,
COUNT(*) as requests,
AVG(response_time) as avg_response_time
FROM access_logs
GROUP BY hour
ORDER BY hour
`)
if err != nil {
log.Fatal(err)
}
fmt.Printf("%-5s %-9s %-12s\n", "Hour", "Requests", "Avg Response")
fmt.Println(strings.Repeat("-", 28))
for rows.Next() {
var hour, requests int
var avgResponseTime float64
if err := rows.Scan(&hour, &requests, &avgResponseTime); err != nil {
log.Fatal(err)
}
fmt.Printf("%-5d %-9d %-12.0fms\n", hour, requests, avgResponseTime)
}
_ = rows.Close() // Ignore close error in test cleanup
}
// createWebLogTestData creates test data for web log analysis examples
func createWebLogTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_weblog_example_*")
if err != nil {
log.Fatal(err)
}
accessLogsData := `timestamp,ip_address,method,path,status_code,response_time,user_agent
2024-01-01 09:15:30,192.168.1.100,GET,/,200,150,Mozilla/5.0
2024-01-01 09:30:45,192.168.1.101,GET,/products,200,200,Mozilla/5.0
2024-01-01 10:05:15,192.168.1.100,GET,/,200,120,Mozilla/5.0
2024-01-01 10:20:30,192.168.1.102,GET,/about,200,180,Mozilla/5.0
2024-01-01 10:35:45,192.168.1.101,GET,/products,200,160,Mozilla/5.0
2024-01-01 11:10:15,192.168.1.103,GET,/contact,200,140,Mozilla/5.0
2024-01-01 11:25:30,192.168.1.100,GET,/,200,200,Mozilla/5.0
2024-01-01 11:40:45,192.168.1.104,GET,/missing,404,50,Mozilla/5.0
2024-01-01 14:15:30,192.168.1.105,GET,/notfound,404,100,Mozilla/5.0`
accessLogsFile := filepath.Join(tmpDir, "access_logs.csv")
err = os.WriteFile(accessLogsFile, []byte(accessLogsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
Output:
=== Web Log Analysis ===

1. Top Pages by Hits:
Path                 Hits   Unique
-----------------------------------
/                    3      1
/products            2      1
/contact             1      1
/about               1      1

2. Error Analysis:
Status Code  Count  Percentage
------------------------------
404          2      22.22     %

3. Traffic by Hour:
Hour  Requests  Avg Response
----------------------------
9     2         175         ms
10    3         153         ms
11    3         130         ms
14    1         100         ms
func OpenContext ΒΆ added in v0.0.2
func OpenContext(ctx context.Context, paths ...string) (*sql.DB, error)
OpenContext is like Open but accepts a context for cancellation and timeout control.
Use this when you need to:
- Set timeouts for loading large files
- Support cancellation in server applications
- Integrate with context-aware code
Example with timeout:
// Open a single CSV file with timeout
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
db, err := filesql.OpenContext(ctx, "data/users.csv")
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Complex query with JOINs, aggregation, and window functions
rows, err := db.QueryContext(ctx, `
SELECT
u.name,
u.department,
u.salary,
AVG(u.salary) OVER (PARTITION BY u.department) as dept_avg_salary,
RANK() OVER (PARTITION BY u.department ORDER BY u.salary DESC) as salary_rank,
COUNT(*) OVER (PARTITION BY u.department) as dept_size
FROM users u
WHERE u.salary > (
SELECT AVG(salary) * 0.8
FROM users
WHERE department = u.department
)
ORDER BY u.department, u.salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
// Process results
for rows.Next() {
var name, dept string
var salary, deptAvg float64
var rank, deptSize int
if err := rows.Scan(&name, &dept, &salary, &deptAvg, &rank, &deptSize); err != nil {
log.Fatal(err)
}
fmt.Printf("%s (%s): $%.2f (Rank: %d/%d, Dept Avg: $%.2f)\n",
name, dept, salary, rank, deptSize, deptAvg)
}
Example ΒΆ
ExampleOpenContext demonstrates opening files with context support for timeout and cancellation
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"time"
"github.com/nao1215/filesql"
)
// createTempTestData creates temporary CSV files for the example
func createTempTestData() string {
tmpDir, err := os.MkdirTemp("", "filesql_example")
if err != nil {
log.Fatal(err)
}
employeesData := `id,name,department_id,salary,hire_date
1,Alice Johnson,1,95000,2020-01-15
2,Bob Smith,1,85000,2019-03-22
3,Charlie Brown,1,80000,2021-06-10
4,David Wilson,1,75000,2022-02-28
5,Eve Davis,2,70000,2020-09-15
6,Frank Miller,2,65000,2021-11-30
7,Grace Lee,3,60000,2019-12-05
8,Henry Taylor,3,55000,2022-04-18`
err = os.WriteFile(filepath.Join(tmpDir, "employees.csv"), []byte(employeesData), 0600)
if err != nil {
log.Fatal(err)
}
departmentsData := `id,name,budget,manager_id
1,Engineering,1000000,1
2,Marketing,800000,5
3,Sales,600000,7
4,HR,400000,9`
err = os.WriteFile(filepath.Join(tmpDir, "departments.csv"), []byte(departmentsData), 0600)
if err != nil {
log.Fatal(err)
}
return tmpDir
}
func main() {
tmpDir := createTempTestData()
defer os.RemoveAll(tmpDir)
// Create a context with timeout
ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel()
// Open database with context
db, err := filesql.OpenContext(ctx, filepath.Join(tmpDir, "employees.csv"))
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query with context support
rows, err := db.QueryContext(ctx, `
SELECT name, salary
FROM employees
WHERE salary > 70000
ORDER BY salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("High earners (>$70,000):")
for rows.Next() {
var name string
var salary float64
if err := rows.Scan(&name, &salary); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s: $%.0f\n", name, salary)
}
}
Output:
High earners (>$70,000):
- Alice Johnson: $95000
- Bob Smith: $85000
- Charlie Brown: $80000
- David Wilson: $75000
Types ΒΆ
type ChunkSize ΒΆ added in v0.4.2
type ChunkSize int
ChunkSize represents a chunk size with validation
func NewChunkSize ΒΆ added in v0.4.2
NewChunkSize creates a new ChunkSize with validation
type CompressionFactory ΒΆ added in v0.4.4
type CompressionFactory struct{}
CompressionFactory provides factory methods for compression handling
func NewCompressionFactory ΒΆ added in v0.4.4
func NewCompressionFactory() *CompressionFactory
NewCompressionFactory creates a new compression factory
func (*CompressionFactory) CreateHandlerForFile ΒΆ added in v0.4.4
func (f *CompressionFactory) CreateHandlerForFile(path string) CompressionHandler
CreateHandlerForFile creates an appropriate compression handler for a given file path
func (*CompressionFactory) CreateReaderForFile ΒΆ added in v0.4.4
func (f *CompressionFactory) CreateReaderForFile(path string) (io.Reader, func() error, error)
CreateReaderForFile opens a file and returns a reader that handles decompression
func (*CompressionFactory) CreateWriterForFile ΒΆ added in v0.4.4
func (f *CompressionFactory) CreateWriterForFile(path string, compressionType CompressionType) (io.Writer, func() error, error)
CreateWriterForFile creates a file and returns a writer that handles compression
func (*CompressionFactory) DetectCompressionType ΒΆ added in v0.4.4
func (f *CompressionFactory) DetectCompressionType(path string) CompressionType
DetectCompressionType detects the compression type from a file path
func (*CompressionFactory) GetBaseFileType ΒΆ added in v0.4.4
func (f *CompressionFactory) GetBaseFileType(path string) FileType
GetBaseFileType determines the base file type after removing compression extensions
func (*CompressionFactory) RemoveCompressionExtension ΒΆ added in v0.4.4
func (f *CompressionFactory) RemoveCompressionExtension(path string) string
RemoveCompressionExtension removes the compression extension from a file path if present
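A hedged sketch of combining these factory methods to read a compressed file; the path is illustrative and error handling is abbreviated:

factory := filesql.NewCompressionFactory()

// Inspect the extension chain of a hypothetical file.
compression := factory.DetectCompressionType("sales.csv.gz")   // CompressionGZ
baseName := factory.RemoveCompressionExtension("sales.csv.gz") // "sales.csv"

// Open a reader that transparently decompresses; the returned func releases it.
reader, closeFn, err := factory.CreateReaderForFile("sales.csv.gz")
if err != nil {
    log.Fatal(err)
}
defer closeFn()
data, _ := io.ReadAll(reader)
fmt.Printf("%s (%s): %d bytes decompressed\n", baseName, compression, len(data))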
type CompressionHandler ΒΆ added in v0.4.4
type CompressionHandler interface {
// CreateReader wraps an io.Reader with a decompression reader if needed
CreateReader(reader io.Reader) (io.Reader, func() error, error)
// CreateWriter wraps an io.Writer with a compression writer if needed
CreateWriter(writer io.Writer) (io.Writer, func() error, error)
// Extension returns the file extension for this compression type (e.g., ".gz")
Extension() string
}
CompressionHandler defines the interface for handling file compression/decompression
func NewCompressionHandler ΒΆ added in v0.4.4
func NewCompressionHandler(compressionType CompressionType) CompressionHandler
NewCompressionHandler creates a new compression handler for the given compression type
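A small sketch, assuming gzData holds gzip-compressed bytes, of wrapping an in-memory stream with a handler:

handler := filesql.NewCompressionHandler(filesql.CompressionGZ)

// Wrap a raw stream with a decompressing reader (gzData is assumed to exist).
r, closeReader, err := handler.CreateReader(bytes.NewReader(gzData))
if err != nil {
    log.Fatal(err)
}
defer closeReader()
plain, _ := io.ReadAll(r)
fmt.Println(string(plain))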
type CompressionType ΒΆ added in v0.0.4
type CompressionType int
CompressionType represents the compression type
const (
    // CompressionNone represents no compression
    CompressionNone CompressionType = iota
    // CompressionGZ represents gzip compression
    CompressionGZ
    // CompressionBZ2 represents bzip2 compression
    CompressionBZ2
    // CompressionXZ represents xz compression
    CompressionXZ
    // CompressionZSTD represents zstd compression
    CompressionZSTD
)
func (CompressionType) Extension ΒΆ added in v0.2.0
func (c CompressionType) Extension() string
Extension returns the file extension for the compression type
func (CompressionType) String ΒΆ added in v0.2.0
func (c CompressionType) String() string
String returns the string representation of CompressionType
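For example (return values inferred from the FileExtension example below):

fmt.Println(filesql.CompressionGZ.Extension()) // ".gz"
fmt.Println(filesql.CompressionZSTD.String())  // "zstd"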
type DBBuilder ΒΆ added in v0.1.0
type DBBuilder struct {
// contains filtered or unexported fields
}
DBBuilder configures and creates database connections from various data sources.
Basic usage:
builder := filesql.NewBuilder().
AddPath("data.csv").
AddPath("users.tsv")
validatedBuilder, err := builder.Build(ctx)
if err != nil {
return err
}
db, err := validatedBuilder.Open(ctx)
defer db.Close()
Supports:
- File paths (AddPath)
- Embedded filesystems (AddFS)
- io.Reader streams (AddReader)
- Auto-save functionality (EnableAutoSave)
Example (Chaining) ΒΆ
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"testing/fstest"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary files
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "data1.csv")
content1 := "id,name\n1,Alice\n2,Bob\n"
os.WriteFile(csvFile, []byte(content1), 0644)
tsvFile := filepath.Join(tempDir, "data2.tsv")
content2 := "id\tproduct\n1\tLaptop\n2\tPhone\n"
os.WriteFile(tsvFile, []byte(content2), 0644)
// Create mock filesystem
mockFS := fstest.MapFS{
"logs.ltsv": &fstest.MapFile{Data: []byte("time:2024-01-01T00:00:00Z\tlevel:info\n")},
}
// Demonstrate method chaining
ctx := context.Background()
db, err := filesql.NewBuilder().
AddPath(csvFile).
AddPaths(tsvFile).
AddFS(mockFS).
Build(ctx)
if err != nil {
log.Fatal(err)
}
connection, err := db.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer connection.Close()
// Count tables from different sources
rows, err := connection.Query("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
var tableCount int
for rows.Next() {
var tableName string
rows.Scan(&tableName)
tableCount++
}
fmt.Printf("Successfully loaded %d tables from mixed sources\n", tableCount)
}
Output: Successfully loaded 3 tables from mixed sources
Example (ErrorHandling) ΒΆ
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Example 1: Build without inputs should fail
builder := filesql.NewBuilder()
ctx := context.Background()
_, err := builder.Build(ctx)
if err != nil {
fmt.Printf("Expected error for no inputs: %v\n", err)
}
// Example 2: Open without Build should fail
builder2 := filesql.NewBuilder().AddPath("nonexistent.csv")
_, err = builder2.Open(ctx)
if err != nil {
fmt.Println("Expected error for Open without Build")
}
// Example 3: Non-existent file should fail during Build
builder3 := filesql.NewBuilder().AddPath(filepath.Join("nonexistent", "file.csv"))
_, err = builder3.Build(ctx)
if err != nil {
fmt.Println("Expected error for non-existent file")
}
// Example 4: Success case
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "valid.csv")
os.WriteFile(csvFile, []byte("id,name\n1,test\n"), 0644)
builder4 := filesql.NewBuilder().AddPath(csvFile)
validatedBuilder, err := builder4.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
fmt.Println("Success: Valid file loaded correctly")
}
Output:
Expected error for no inputs: at least one path must be provided
Expected error for Open without Build
Expected error for non-existent file
Success: Valid file loaded correctly
func NewBuilder ΒΆ added in v0.1.0
func NewBuilder() *DBBuilder
NewBuilder creates a new database builder.
Start here when you need:
- Multiple data sources (files, streams, embedded FS)
- Auto-save functionality
- Custom chunk sizes for large files
- More control than the simple Open() function
Example:
builder := filesql.NewBuilder().
AddPath("data.csv").
EnableAutoSave("./backup")
Example ΒΆ
ExampleNewBuilder demonstrates the basic usage of the Builder pattern. This is the recommended approach for most use cases, especially when working with embedded filesystems or when you need more control over the database creation process.
package main
import (
"fmt"
"github.com/nao1215/filesql"
)
func main() {
// Create a new builder - this is the starting point for all Builder pattern usage
builder := filesql.NewBuilder()
// The builder supports method chaining for a fluent API
// You can add individual paths or multiple paths at once
builder.AddPath("users.csv").AddPaths("orders.tsv", "products.ltsv")
fmt.Printf("Builder created successfully: %t\n", builder != nil)
// In real usage, you would continue with:
// ctx := context.Background()
// validatedBuilder, err := builder.Build(ctx)
// if err != nil { return err }
// db, err := validatedBuilder.Open(ctx)
// if err != nil { return err }
// defer db.Close()
}
Output: Builder created successfully: true
func (*DBBuilder) AddFS ΒΆ added in v0.1.0
func (b *DBBuilder) AddFS(filesystem fs.FS) *DBBuilder
AddFS adds files from an embedded filesystem (go:embed).
Automatically finds all CSV, TSV, and LTSV files in the filesystem.
Example:
//go:embed data/*.csv data/*.tsv
var dataFS embed.FS

builder.AddFS(dataFS)
Returns self for chaining.
Example ΒΆ
package main
import (
"context"
"fmt"
"log"
"testing/fstest"
"github.com/nao1215/filesql"
)
func main() {
// Create mock filesystem with test data
mockFS := fstest.MapFS{
"users.csv": &fstest.MapFile{Data: []byte("id,name,department\n1,Alice,Engineering\n2,Bob,Sales\n")},
"products.tsv": &fstest.MapFile{Data: []byte("id\tname\tprice\n1\tLaptop\t1000\n2\tPhone\t500\n")},
"logs.ltsv": &fstest.MapFile{Data: []byte("time:2024-01-01T00:00:00Z\tlevel:info\tmsg:started\n")},
"readme.txt": &fstest.MapFile{Data: []byte("This file will be ignored\n")}, // unsupported format
}
// Use builder with filesystem
builder := filesql.NewBuilder().AddFS(mockFS)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// List all tables that were created from the filesystem
rows, err := db.Query("SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
var tableCount int
for rows.Next() {
var tableName string
rows.Scan(&tableName)
tableCount++
}
fmt.Printf("Created %d tables from filesystem\n", tableCount)
}
Output: Created 3 tables from filesystem
Example (EmbedFS) ΒΆ
package main
import (
"context"
"embed"
"fmt"
"io/fs"
"log"
"github.com/nao1215/filesql"
)
//go:embed testdata/embed_test/*.csv testdata/embed_test/*.tsv
var builderExampleFS embed.FS
func main() {
// Use embedded test filesystem
subFS, err := fs.Sub(builderExampleFS, "testdata/embed_test")
if err != nil {
log.Fatal(err)
}
// Use builder with embedded filesystem
builder := filesql.NewBuilder().AddFS(subFS)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Count the number of tables created from embedded files
rows, err := db.Query("SELECT COUNT(*) FROM sqlite_master WHERE type='table'")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
var tableCount int
if rows.Next() {
rows.Scan(&tableCount)
}
fmt.Printf("Created %d tables from embedded files\n", tableCount)
}
Output: Created 3 tables from embedded files
func (*DBBuilder) AddPath ΒΆ added in v0.1.0
func (b *DBBuilder) AddPath(path string) *DBBuilder
AddPath adds a file or directory to load.
Examples:
- Single file: AddPath("users.csv")
- Compressed: AddPath("data.tsv.gz")
- Directory: AddPath("/data/") // loads all CSV/TSV/LTSV files
Returns self for chaining.
Example ΒΆ
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary CSV file for example
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "users.csv")
content := "id,name,age\n1,Alice,30\n2,Bob,25\n"
os.WriteFile(csvFile, []byte(content), 0644)
// Use builder to add a single file path
builder := filesql.NewBuilder().AddPath(csvFile)
// Build and open database
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query the data
rows, err := db.Query("SELECT COUNT(*) FROM users")
if err != nil {
log.Fatal(err)
}
defer rows.Close()
var count int
if rows.Next() {
rows.Scan(&count)
}
fmt.Printf("Number of users: %d\n", count)
}
Output: Number of users: 2
func (*DBBuilder) AddPaths ΒΆ added in v0.1.0
AddPaths adds multiple files or directories at once.
Example:
builder.AddPaths("users.csv", "products.tsv", "/data/logs/")
Returns self for chaining.
Example ΒΆ
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary files for example
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
// Create users.csv
usersFile := filepath.Join(tempDir, "users.csv")
usersContent := "id,name\n1,Alice\n2,Bob\n"
os.WriteFile(usersFile, []byte(usersContent), 0644)
// Create products.csv
productsFile := filepath.Join(tempDir, "products.csv")
productsContent := "id,product_name\n1,Laptop\n2,Phone\n"
os.WriteFile(productsFile, []byte(productsContent), 0644)
// Use builder to add multiple file paths
builder := filesql.NewBuilder().AddPaths(usersFile, productsFile)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query data from both tables
rows, err := db.Query(`
SELECT u.name, p.product_name
FROM users u
JOIN products p ON u.id = p.id
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
for rows.Next() {
var name, product string
rows.Scan(&name, &product)
fmt.Printf("%s has %s\n", name, product)
}
}
Output:
Alice has Laptop
Bob has Phone
func (*DBBuilder) AddReader ΒΆ added in v0.2.0
AddReader adds data from an io.Reader (file, network stream, etc.).
Parameters:
- reader: Any io.Reader (file, bytes.Buffer, http.Response.Body, etc.)
- tableName: Name for the SQL table (e.g., "users")
- fileType: Data format (FileTypeCSV, FileTypeTSV, FileTypeLTSV, etc.)
Example:
resp, _ := http.Get("https://example.com/data.csv")
builder.AddReader(resp.Body, "remote_data", FileTypeCSV)
Returns self for chaining.
Example ΒΆ
ExampleDBBuilder_AddReader demonstrates using io.Reader as a data source
package main
import (
"context"
"fmt"
"log"
"strings"
"github.com/nao1215/filesql"
)
func main() {
// CSV data from an io.Reader (could be from network, API response, etc.)
csvData := `id,name,department,salary
1,Alice,Engineering,95000
2,Bob,Sales,78000
3,Charlie,Engineering,102000
4,Diana,Marketing,85000`
// Create a reader from the CSV data
reader := strings.NewReader(csvData)
// Build database with Reader input
ctx := context.Background()
builder := filesql.NewBuilder().
AddReader(reader, "employees", filesql.FileTypeCSV). // Specify table name and type explicitly
SetDefaultChunkSize(5000) // Set 5000 rows per chunk for large data
// Build validates the input
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
// Open creates the database connection
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query the data
rows, err := db.Query(`
SELECT name, department, salary
FROM employees
WHERE salary > 80000
ORDER BY salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
// Process results
fmt.Println("High earners (salary > 80,000):")
for rows.Next() {
var name, dept string
var salary int
if err := rows.Scan(&name, &dept, &salary); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s (%s): $%d\n", name, dept, salary)
}
}
Output:
High earners (salary > 80,000):
- Charlie (Engineering): $102000
- Alice (Engineering): $95000
- Diana (Marketing): $85000
Example (Compressed) ΒΆ
ExampleDBBuilder_AddReader_compressed demonstrates using compressed data from io.Reader
package main
import (
"bytes"
"context"
"fmt"
"log"
"github.com/nao1215/filesql"
)
func main() {
// Simulate compressed TSV data (in practice, this would be actual compressed data)
tsvData := "product_id\tproduct_name\tprice\n1\tLaptop\t999\n2\tMouse\t25\n3\tKeyboard\t75"
reader := bytes.NewReader([]byte(tsvData))
ctx := context.Background()
builder := filesql.NewBuilder().
// Specify that this is TSV data (not actually compressed in this example)
AddReader(reader, "products", filesql.FileTypeTSV)
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Query the products
var count int
err = db.QueryRow("SELECT COUNT(*) FROM products WHERE price < 100").Scan(&count)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Products under $100: %d\n", count)
}
Output: Products under $100: 2
Example (Multiple) ΒΆ
ExampleDBBuilder_AddReader_multiple demonstrates combining multiple readers with files
package main
import (
"context"
"fmt"
"log"
"strings"
"github.com/nao1215/filesql"
)
func main() {
// First reader: Users data
usersCSV := `user_id,name,email
1,Alice,alice@example.com
2,Bob,bob@example.com`
// Second reader: Orders data
ordersCSV := `order_id,user_id,amount
101,1,250
102,2,180
103,1,320`
ctx := context.Background()
builder := filesql.NewBuilder().
AddReader(strings.NewReader(usersCSV), "users", filesql.FileTypeCSV).
AddReader(strings.NewReader(ordersCSV), "orders", filesql.FileTypeCSV).
SetDefaultChunkSize(2500) // 2500 rows per chunk
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Join query across both tables
rows, err := db.Query(`
SELECT u.name, SUM(o.amount) as total_spent
FROM users u
JOIN orders o ON u.user_id = o.user_id
GROUP BY u.user_id, u.name
ORDER BY total_spent DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
fmt.Println("Customer spending:")
for rows.Next() {
var name string
var total float64
if err := rows.Scan(&name, &total); err != nil {
log.Fatal(err)
}
fmt.Printf("- %s: $%.0f\n", name, total)
}
}
Output:
Customer spending:
- Alice: $570
- Bob: $180
func (*DBBuilder) Build ΒΆ added in v0.1.0
Build validates all configured inputs and prepares the builder for opening a database. This method must be called before Open(). It performs the following operations:
1. Validates that at least one input source is configured
2. Checks existence and format of all file paths
3. Processes embedded filesystems by converting files to streaming readers
4. Validates that all files have supported extensions
After successful validation, the builder is ready to create database connections with Open(). The context is used for file operations and can be used for cancellation.
Returns the same builder instance for method chaining, or an error if validation fails.
Example ΒΆ
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary CSV file
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "data.csv")
content := "name,value\ntest,123\n"
os.WriteFile(csvFile, []byte(content), 0644)
// Build validates inputs and prepares for opening
builder := filesql.NewBuilder().AddPath(csvFile)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
fmt.Printf("Builder validated successfully: %t\n", validatedBuilder != nil)
}
Output: Builder validated successfully: true
func (*DBBuilder) DisableAutoSave ΒΆ added in v0.1.0
DisableAutoSave disables automatic saving (default behavior). Returns the builder for method chaining.
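A brief sketch: calling it makes the default explicit when a builder is assembled conditionally:

builder := filesql.NewBuilder().
    AddPath("data.csv").
    DisableAutoSave() // changes stay in memory; persist manually with DumpDatabase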
func (*DBBuilder) EnableAutoSave ΒΆ added in v0.1.0
func (b *DBBuilder) EnableAutoSave(outputDir string, options ...DumpOptions) *DBBuilder
EnableAutoSave automatically saves changes when the database is closed.
Parameters:
- outputDir: Where to save files
- "" (empty): Overwrite original files
- "./backup": Save to backup directory
Example:
builder.AddPath("data.csv").
EnableAutoSave("") // Auto-save to original file on db.Close()
Returns self for chaining.
Example ΒΆ
ExampleDBBuilder_EnableAutoSave demonstrates automatic saving on database close. This feature automatically saves modified data when the database connection is closed, providing a convenient way to persist changes without manual intervention.
package main
import (
"context"
"fmt"
"os"
"path/filepath"
"time"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary directory and test file
tempDir, _ := os.MkdirTemp("", "filesql-autosave-example")
defer os.RemoveAll(tempDir)
// Create sample CSV file
csvPath := filepath.Join(tempDir, "employees.csv")
csvContent := "name,department,salary\nAlice,Engineering,80000\nBob,Marketing,65000\n"
_ = os.WriteFile(csvPath, []byte(csvContent), 0600)
// Create output directory
outputDir := filepath.Join(tempDir, "backup")
_ = os.MkdirAll(outputDir, 0750)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Configure builder with auto-save on close
builder := filesql.NewBuilder().
AddPath(csvPath).
EnableAutoSave(outputDir, filesql.NewDumpOptions()) // Save to backup directory on close
validatedBuilder, _ := builder.Build(ctx)
db, _ := validatedBuilder.Open(ctx)
// Modify data - this will be automatically saved when db.Close() is called
_, _ = db.ExecContext(ctx, "INSERT INTO employees (name, department, salary) VALUES ('Charlie', 'Sales', 70000)")
_, _ = db.ExecContext(ctx, "UPDATE employees SET salary = 85000 WHERE name = 'Alice'")
// Close database - triggers automatic save to backup directory
_ = db.Close()
// Verify the backup file was created and contains our changes
backupFile := filepath.Join(outputDir, "employees.csv")
if _, err := os.Stat(backupFile); err == nil {
fmt.Println("Auto-save completed successfully")
}
}
Output: Auto-save completed successfully
func (*DBBuilder) EnableAutoSaveOnCommit ΒΆ added in v0.1.0
func (b *DBBuilder) EnableAutoSaveOnCommit(outputDir string, options ...DumpOptions) *DBBuilder
EnableAutoSaveOnCommit automatically saves changes after each transaction commit.
Use this for real-time persistence. Note: May impact performance.
Example:
builder.AddPath("data.csv").
EnableAutoSaveOnCommit("./output") // Save after each commit
Returns self for chaining.
Example ΒΆ
ExampleDBBuilder_EnableAutoSaveOnCommit demonstrates automatic saving on transaction commit. This provides more frequent saves but may impact performance for workloads with many commits.
package main
import (
"context"
"fmt"
"os"
"path/filepath"
"time"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary directory and test file
tempDir, _ := os.MkdirTemp("", "filesql-commit-save-example")
defer os.RemoveAll(tempDir)
// Create sample CSV file
csvPath := filepath.Join(tempDir, "transactions.csv")
csvContent := "id,amount,status\n1,100.50,pending\n2,250.75,pending\n"
_ = os.WriteFile(csvPath, []byte(csvContent), 0600)
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Second)
defer cancel()
// Configure builder with auto-save on commit
// Using temp directory to keep the example self-contained
builder := filesql.NewBuilder().
AddPath(csvPath).
EnableAutoSaveOnCommit(tempDir, filesql.NewDumpOptions()) // Save to temp directory on each commit
validatedBuilder, _ := builder.Build(ctx)
db, _ := validatedBuilder.Open(ctx)
defer func() { _ = db.Close() }()
// Start transaction
tx, _ := db.BeginTx(ctx, nil)
// Process transactions within the transaction
_, _ = tx.ExecContext(ctx, "UPDATE transactions SET status = 'completed' WHERE id = 1")
_, _ = tx.ExecContext(ctx, "INSERT INTO transactions (id, amount, status) VALUES (3, 175.25, 'completed')")
// Commit transaction - triggers automatic save
_ = tx.Commit()
fmt.Println("Transaction committed with auto-save")
}
Output: Transaction committed with auto-save
func (*DBBuilder) Open ΒΆ added in v0.1.0
Open creates and returns a database connection using the configured and validated inputs. This method can only be called after Build() has been successfully executed. It creates an in-memory SQLite database and loads all configured files as tables using streaming.
Table names are derived from file names without extensions:

- "users.csv" becomes table "users"
- "data.tsv.gz" becomes table "data"
The returned database connection supports the full SQLite3 SQL syntax. Auto-save functionality is supported for both file paths and reader inputs. The caller is responsible for closing the connection when done.
Returns a *sql.DB connection or an error if the database cannot be created.
Example ΒΆ
package main
import (
"context"
"fmt"
"log"
"os"
"path/filepath"
"github.com/nao1215/filesql"
)
func main() {
// Create temporary CSV file
tempDir, _ := os.MkdirTemp("", "filesql-example")
defer os.RemoveAll(tempDir)
csvFile := filepath.Join(tempDir, "employees.csv")
content := "id,name,salary\n1,Alice,50000\n2,Bob,60000\n3,Charlie,55000\n"
os.WriteFile(csvFile, []byte(content), 0644)
// Complete builder workflow: AddPath -> Build -> Open
builder := filesql.NewBuilder().AddPath(csvFile)
ctx := context.Background()
validatedBuilder, err := builder.Build(ctx)
if err != nil {
log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
log.Fatal(err)
}
defer db.Close()
// Perform complex SQL query
rows, err := db.Query(`
SELECT name, salary,
salary - (SELECT AVG(salary) FROM employees) as salary_diff
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees)
ORDER BY salary DESC
`)
if err != nil {
log.Fatal(err)
}
defer rows.Close()
for rows.Next() {
var name string
var salary, diff float64
rows.Scan(&name, &salary, &diff)
fmt.Printf("%s: $%.0f (+$%.0f above average)\n", name, salary, diff)
}
}
Output: Bob: $60000 (+$5000 above average)
func (*DBBuilder) SetDefaultChunkSize ΒΆ added in v0.2.0
SetDefaultChunkSize sets chunk size (number of rows) for large file processing.
Default: 1000 rows. Adjust based on available memory and processing needs.
Example:
builder.SetDefaultChunkSize(5000) // 5000 rows per chunk
Returns self for chaining.
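A hedged sketch of tuning the chunk size before building; the file path is illustrative:

ctx := context.Background()
builder := filesql.NewBuilder().
    AddPath("large_events.csv.gz").
    SetDefaultChunkSize(10000) // larger chunks: fewer batches, more memory per batch
validatedBuilder, err := builder.Build(ctx)
if err != nil {
    log.Fatal(err)
}
db, err := validatedBuilder.Open(ctx)
if err != nil {
    log.Fatal(err)
}
defer db.Close()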
type DumpOptions ΒΆ added in v0.0.4
type DumpOptions struct {
// Format specifies the output file format
Format OutputFormat
// Compression specifies the compression type
Compression CompressionType
}
DumpOptions configures how database tables are exported to files.
Example:
options := NewDumpOptions().
    WithFormat(OutputFormatTSV).
    WithCompression(CompressionGZ)
err := DumpDatabase(db, "./output", options)
func NewDumpOptions ΒΆ added in v0.0.4
func NewDumpOptions() DumpOptions
NewDumpOptions creates default export options (CSV, no compression).
Modify with:
- WithFormat(): Change file format (CSV, TSV, LTSV)
- WithCompression(): Add compression (GZ, BZ2, XZ, ZSTD)
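A short sketch, assuming db is an open connection, of starting from the defaults and layering options:

opts := filesql.NewDumpOptions() // CSV, no compression

// Switch to gzip-compressed TSV before exporting.
opts = opts.WithFormat(filesql.OutputFormatTSV).
    WithCompression(filesql.CompressionGZ)
if err := filesql.DumpDatabase(db, "./output", opts); err != nil {
    log.Fatal(err)
}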
func (DumpOptions) FileExtension ΒΆ added in v0.2.0
func (o DumpOptions) FileExtension() string
FileExtension returns the complete file extension including compression
Example ΒΆ
package main
import (
"fmt"
"github.com/nao1215/filesql"
)
func main() {
// Show how file extensions are built
examples := []struct {
format filesql.OutputFormat
compression filesql.CompressionType
}{
{filesql.OutputFormatCSV, filesql.CompressionNone},
{filesql.OutputFormatTSV, filesql.CompressionGZ},
{filesql.OutputFormatLTSV, filesql.CompressionBZ2},
{filesql.OutputFormatCSV, filesql.CompressionXZ},
{filesql.OutputFormatTSV, filesql.CompressionZSTD},
}
for _, ex := range examples {
options := filesql.DumpOptions{
Format: ex.format,
Compression: ex.compression,
}
fmt.Printf("Format: %-4s, Compression: %-4s -> Extension: %s\n",
ex.format.String(),
ex.compression.String(),
options.FileExtension())
}
}
Output:
Format: csv , Compression: none -> Extension: .csv
Format: tsv , Compression: gz   -> Extension: .tsv.gz
Format: ltsv, Compression: bz2  -> Extension: .ltsv.bz2
Format: csv , Compression: xz   -> Extension: .csv.xz
Format: tsv , Compression: zstd -> Extension: .tsv.zst
func (DumpOptions) WithCompression ΒΆ added in v0.2.0
func (o DumpOptions) WithCompression(compression CompressionType) DumpOptions
WithCompression adds compression to output files.
Options:
- CompressionNone: No compression (default)
- CompressionGZ: Gzip compression (.gz)
- CompressionBZ2: Bzip2 compression (.bz2)
- CompressionXZ: XZ compression (.xz)
- CompressionZSTD: Zstandard compression (.zst)
func (DumpOptions) WithFormat ΒΆ added in v0.2.0
func (o DumpOptions) WithFormat(format OutputFormat) DumpOptions
WithFormat sets the output file format.
Options:
- OutputFormatCSV: Comma-separated values
- OutputFormatTSV: Tab-separated values
- OutputFormatLTSV: Labeled tab-separated values
- OutputFormatParquet: Apache Parquet columnar format
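For instance, a one-line Parquet export (db and the output directory are assumed):

parquetOpts := filesql.NewDumpOptions().WithFormat(filesql.OutputFormatParquet)
if err := filesql.DumpDatabase(db, "./parquet_out", parquetOpts); err != nil {
    log.Fatal(err)
}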
type ErrorContext ΒΆ added in v0.4.2
ErrorContext provides context for where an error occurred
func NewErrorContext ΒΆ added in v0.4.2
func NewErrorContext(operation, filePath string) *ErrorContext
NewErrorContext creates a new error context
func (*ErrorContext) Error ΒΆ added in v0.4.2
func (ec *ErrorContext) Error(baseErr error) error
Error creates a formatted error with context
func (*ErrorContext) WithDetails ΒΆ added in v0.4.2
func (ec *ErrorContext) WithDetails(details string) *ErrorContext
WithDetails adds details to the error context
func (*ErrorContext) WithTable ΒΆ added in v0.4.2
func (ec *ErrorContext) WithTable(tableName string) *ErrorContext
WithTable adds table context to the error
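A hedged sketch of wrapping a lower-level error with file and table context; the operation, path, and detail strings are illustrative:

ec := filesql.NewErrorContext("load", "data/users.csv").
    WithTable("users").
    WithDetails("duplicate header after trimming whitespace")
if err != nil {
    return ec.Error(err) // formatted error carrying the context above
}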
type FileType ΒΆ added in v0.2.0
type FileType int
FileType represents supported file types including compression variants
const (
    // FileTypeCSV represents CSV file type
    FileTypeCSV FileType = iota
    // FileTypeTSV represents TSV file type
    FileTypeTSV
    // FileTypeLTSV represents LTSV file type
    FileTypeLTSV
    // FileTypeParquet represents Parquet file type
    FileTypeParquet
    // FileTypeXLSX represents Excel XLSX file type
    FileTypeXLSX
    // FileTypeCSVGZ represents gzip-compressed CSV file type
    FileTypeCSVGZ
    // FileTypeTSVGZ represents gzip-compressed TSV file type
    FileTypeTSVGZ
    // FileTypeLTSVGZ represents gzip-compressed LTSV file type
    FileTypeLTSVGZ
    // FileTypeParquetGZ represents gzip-compressed Parquet file type
    FileTypeParquetGZ
    // FileTypeCSVBZ2 represents bzip2-compressed CSV file type
    FileTypeCSVBZ2
    // FileTypeTSVBZ2 represents bzip2-compressed TSV file type
    FileTypeTSVBZ2
    // FileTypeLTSVBZ2 represents bzip2-compressed LTSV file type
    FileTypeLTSVBZ2
    // FileTypeParquetBZ2 represents bzip2-compressed Parquet file type
    FileTypeParquetBZ2
    // FileTypeCSVXZ represents xz-compressed CSV file type
    FileTypeCSVXZ
    // FileTypeTSVXZ represents xz-compressed TSV file type
    FileTypeTSVXZ
    // FileTypeLTSVXZ represents xz-compressed LTSV file type
    FileTypeLTSVXZ
    // FileTypeParquetXZ represents xz-compressed Parquet file type
    FileTypeParquetXZ
    // FileTypeCSVZSTD represents zstd-compressed CSV file type
    FileTypeCSVZSTD
    // FileTypeTSVZSTD represents zstd-compressed TSV file type
    FileTypeTSVZSTD
    // FileTypeLTSVZSTD represents zstd-compressed LTSV file type
    FileTypeLTSVZSTD
    // FileTypeParquetZSTD represents zstd-compressed Parquet file type
    FileTypeParquetZSTD
    // FileTypeXLSXGZ represents gzip-compressed Excel XLSX file type
    FileTypeXLSXGZ
    // FileTypeXLSXBZ2 represents bzip2-compressed Excel XLSX file type
    FileTypeXLSXBZ2
    // FileTypeXLSXXZ represents xz-compressed Excel XLSX file type
    FileTypeXLSXXZ
    // FileTypeXLSXZSTD represents zstd-compressed Excel XLSX file type
    FileTypeXLSXZSTD
    // FileTypeUnsupported represents unsupported file type
    FileTypeUnsupported
)
type MemoryInfo ΒΆ added in v0.4.4
type MemoryInfo struct {
CurrentMB int64 // Current memory usage in MB
LimitMB int64 // Memory limit in MB
Usage float64 // Usage percentage (0.0-1.0)
Status MemoryStatus // Current status
}
MemoryInfo contains detailed memory usage information
type MemoryLimit ΒΆ added in v0.4.4
type MemoryLimit struct {
// contains filtered or unexported fields
}
MemoryLimit provides configurable memory limits with graceful degradation for file processing operations. It monitors heap usage and can trigger memory management actions when thresholds are exceeded.
The system supports three states:
- OK: Memory usage is within acceptable limits
- WARNING: Memory usage approaches the limit, suggesting reduced chunk sizes
- EXCEEDED: Memory usage has exceeded the limit, processing should be halted
Usage example:
limit := NewMemoryLimit(512) // 512MB limit
if limit.CheckMemoryUsage() == MemoryStatusExceeded {
return limit.CreateMemoryError("processing")
}
Performance Note: CheckMemoryUsage() calls runtime.ReadMemStats which can pause for milliseconds. Use sparingly in hot paths.
Thread Safety: All methods are safe for concurrent use by multiple goroutines.
func NewMemoryLimit ΒΆ added in v0.4.4
func NewMemoryLimit(maxMemoryMB int64) *MemoryLimit
NewMemoryLimit creates a new memory limit configuration
func (*MemoryLimit) CheckMemoryUsage ΒΆ added in v0.4.4
func (ml *MemoryLimit) CheckMemoryUsage() MemoryStatus
CheckMemoryUsage checks current memory usage against limits
func (*MemoryLimit) CreateMemoryError ΒΆ added in v0.4.4
func (ml *MemoryLimit) CreateMemoryError(operation string) error
CreateMemoryError creates a memory limit error with helpful context
func (*MemoryLimit) Disable ΒΆ added in v0.4.4
func (ml *MemoryLimit) Disable()
Disable disables memory limit checking
func (*MemoryLimit) Enable ΒΆ added in v0.4.4
func (ml *MemoryLimit) Enable()
Enable enables memory limit checking
func (*MemoryLimit) GetMemoryInfo ΒΆ added in v0.4.4
func (ml *MemoryLimit) GetMemoryInfo() MemoryInfo
GetMemoryInfo returns current memory usage information
func (*MemoryLimit) IsEnabled ΒΆ added in v0.4.4
func (ml *MemoryLimit) IsEnabled() bool
IsEnabled returns whether memory limits are enabled
func (*MemoryLimit) SetWarningThreshold ΒΆ added in v0.4.4
func (ml *MemoryLimit) SetWarningThreshold(threshold float64)
SetWarningThreshold sets the warning threshold (0.0-1.0)
func (*MemoryLimit) ShouldReduceChunkSize ΒΆ added in v0.4.4
func (ml *MemoryLimit) ShouldReduceChunkSize(chunkSize int) (bool, int)
ShouldReduceChunkSize returns true if chunk size should be reduced for memory management
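A sketch, under the signatures above, of a load loop consulting the limit; the numbers are illustrative:

limit := filesql.NewMemoryLimit(256) // 256MB budget
limit.SetWarningThreshold(0.8)       // warn at 80% of the limit

chunkSize := 10000
switch limit.CheckMemoryUsage() {
case filesql.MemoryStatusExceeded:
    log.Fatal(limit.CreateMemoryError("chunk load"))
case filesql.MemoryStatusWarning:
    if reduce, smaller := limit.ShouldReduceChunkSize(chunkSize); reduce {
        chunkSize = smaller // back off before the limit is hit
    }
}
info := limit.GetMemoryInfo()
fmt.Printf("memory: %d/%d MB (%.0f%%)\n", info.CurrentMB, info.LimitMB, info.Usage*100)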
type MemoryPool ΒΆ added in v0.4.4
type MemoryPool struct {
// contains filtered or unexported fields
}
MemoryPool manages a pool of reusable byte slices, record slices, and string slices to reduce memory allocations during file processing operations.
The pool automatically manages object lifecycles and includes safeguards against memory leaks by limiting the maximum size of objects that can be returned to the pool. Objects that grow beyond maxSize are discarded rather than pooled.
Usage example:
pool := NewMemoryPool(1024 * 1024) // 1MB max buffer size
buffer := pool.GetByteBuffer()
defer pool.PutByteBuffer(buffer)
// Use buffer...
Thread Safety: All methods are safe for concurrent use by multiple goroutines.
func NewMemoryPool ΒΆ added in v0.4.4
func NewMemoryPool(maxSize int) *MemoryPool
NewMemoryPool creates a new memory pool with configurable max buffer size
func (*MemoryPool) ForceGC ΒΆ added in v0.4.4
func (mp *MemoryPool) ForceGC()
ForceGC forces garbage collection and clears pools if memory pressure is high
func (*MemoryPool) GetByteBuffer ΒΆ added in v0.4.4
func (mp *MemoryPool) GetByteBuffer() []byte
GetByteBuffer gets a byte buffer from the pool
func (*MemoryPool) GetRecordSlice ΒΆ added in v0.4.4
func (mp *MemoryPool) GetRecordSlice() []Record
GetRecordSlice gets a record slice from the pool
func (*MemoryPool) GetStringSlice ΒΆ added in v0.4.4
func (mp *MemoryPool) GetStringSlice() []string
GetStringSlice gets a string slice from the pool
func (*MemoryPool) PutByteBuffer ΒΆ added in v0.4.4
func (mp *MemoryPool) PutByteBuffer(buf []byte)
PutByteBuffer returns a byte buffer to the pool if it's not too large
func (*MemoryPool) PutRecordSlice ΒΆ added in v0.4.4
func (mp *MemoryPool) PutRecordSlice(slice []Record)
PutRecordSlice returns a record slice to the pool if it's not too large
func (*MemoryPool) PutStringSlice ΒΆ added in v0.4.4
func (mp *MemoryPool) PutStringSlice(slice []string)
PutStringSlice returns a string slice to the pool if it's not too large
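A hedged sketch of reusing pooled slices across rows; the field values are illustrative:

pool := filesql.NewMemoryPool(1 << 20) // slices beyond 1MB are discarded, not pooled

// Borrow a string slice, reuse its backing array, and return it.
fields := pool.GetStringSlice()
fields = append(fields[:0], "1", "Laptop", "999")
// ... process fields ...
pool.PutStringSlice(fields)

// Under sustained pressure, drop pooled objects and trigger a GC cycle.
pool.ForceGC()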
type MemoryStatus ΒΆ added in v0.4.4
type MemoryStatus int
MemoryStatus represents the current memory status
const (
    // MemoryStatusOK indicates memory usage is within acceptable limits
    MemoryStatusOK MemoryStatus = iota
    // MemoryStatusWarning indicates memory usage is approaching the limit
    MemoryStatusWarning
    // MemoryStatusExceeded indicates memory usage has exceeded the limit
    MemoryStatusExceeded
)
Memory status constants
func (MemoryStatus) String ΒΆ added in v0.4.4
func (ms MemoryStatus) String() string
String returns string representation of memory status
type OutputFormat ΒΆ added in v0.0.4
type OutputFormat int
OutputFormat represents the output file format
const (
    // OutputFormatCSV represents CSV output format
    OutputFormatCSV OutputFormat = iota
    // OutputFormatTSV represents TSV output format
    OutputFormatTSV
    // OutputFormatLTSV represents LTSV output format
    OutputFormatLTSV
    // OutputFormatParquet represents Parquet output format
    OutputFormatParquet
    // OutputFormatXLSX represents Excel XLSX output format
    OutputFormatXLSX
)
func (OutputFormat) Extension ΒΆ added in v0.2.0
func (f OutputFormat) Extension() string
Extension returns the file extension for the format
func (OutputFormat) String ΒΆ added in v0.2.0
func (f OutputFormat) String() string
String returns the string representation of OutputFormat
type Record ΒΆ added in v0.4.4
type Record []string
Record represents file records as a slice of string fields. This type was changed from unexported 'record' to exported 'Record' in v0.5.0 to fix lint issues with exported methods returning unexported types.
Breaking change: Code that previously imported and used the unexported 'record' type will need to be updated to use 'Record'.
type TableName ΒΆ added in v0.4.2
type TableName struct {
// contains filtered or unexported fields
}
TableName represents a table name with validation
func NewTableName ΒΆ added in v0.4.2
NewTableName creates a new TableName with validation
