io

package
v0.1.0

This package is not in the latest version of its module.

Published: Apr 27, 2026 License: MIT Imports: 10 Imported by: 0

Documentation

Overview

Package io defines the I/O pipeline framework for Pulse: Reader/Writer interfaces, schema inference, and job types (ImportJob, ExportJob, ConvertJob).

Format-specific adapters live in sub-packages (csv, excel, ndjson, parquet, tsv).

Constants

This section is empty.

Variables

This section is empty.

Functions

func ErrStopIteration

func ErrStopIteration() error

ErrStopIteration returns the stop iteration sentinel for use by readers.
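
The comment above does not say how readers consume the sentinel. One common Go pattern, shown here as an assumption rather than as Pulse's actual behavior, is for a row callback to return the sentinel and for the reader to treat it as a clean early stop. The names errStop, stopIteration, readRows, and firstN are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
)

// errStop is a hypothetical stand-in for the package's sentinel value.
var errStop = errors.New("stop iteration")

// stopIteration mirrors the shape of ErrStopIteration: a function that
// returns the sentinel.
func stopIteration() error { return errStop }

// readRows calls fn for each row. Assumption: a callback that returns the
// sentinel ends iteration early and is not reported as a failure.
func readRows(rows [][]string, fn func(row []string) error) error {
	for _, row := range rows {
		if err := fn(row); err != nil {
			if errors.Is(err, errStop) {
				return nil
			}
			return err
		}
	}
	return nil
}

// firstN returns at most n rows by stopping iteration via the sentinel.
func firstN(rows [][]string, n int) ([][]string, error) {
	var out [][]string
	err := readRows(rows, func(row []string) error {
		if len(out) >= n {
			return stopIteration()
		}
		out = append(out, row)
		return nil
	})
	return out, err
}

func main() {
	rows := [][]string{{"a"}, {"b"}, {"c"}}
	got, err := firstN(rows, 2)
	fmt.Println(len(got), err)
}
```

Comparing with errors.Is rather than == keeps the check working if a caller wraps the sentinel with extra context.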

Types

type ConvertJob

type ConvertJob struct {
	Source      Reader
	Target      Writer
	Schema      *encoding.Schema
	KeepPulseAt string // optional: also write intermediate .pulse
	SampleRows  int
	FS          afero.Fs
}

ConvertJob chains import and export with no intermediate file on disk (unless KeepPulseAt is set).

func NewConvertJob

func NewConvertJob(source Reader, target Writer) *ConvertJob

NewConvertJob creates a ConvertJob with default settings.

func (*ConvertJob) Predict

func (j *ConvertJob) Predict(ctx context.Context) (*PredictReport, error)

Predict validates the conversion without writing.

func (*ConvertJob) Run

func (j *ConvertJob) Run(ctx context.Context) (*ConvertReport, error)

Run executes the convert job, streaming from source to target. If KeepPulseAt is set, also writes an intermediate .pulse file.

type ConvertReport

type ConvertReport struct {
	RowsConverted int
	Schema        *encoding.Schema
	RowErrors     []RowError
}

ConvertReport summarizes the result of a convert operation.

type ExportJob

type ExportJob struct {
	Source string // input .pulse path
	Target Writer
	FS     afero.Fs
}

ExportJob converts a .pulse file into tabular output.

func NewExportJob

func NewExportJob(source string, target Writer) *ExportJob

NewExportJob creates an ExportJob.

func (*ExportJob) Predict

func (j *ExportJob) Predict(ctx context.Context) (*PredictReport, error)

Predict validates the export without writing.

func (*ExportJob) Run

func (j *ExportJob) Run(ctx context.Context) (*ExportReport, error)

Run executes the export job, converting a .pulse file to a tabular target.

type ExportReport

type ExportReport struct {
	RowsExported int
	RowErrors    []RowError
}

ExportReport summarizes the result of an export operation.

type ImportJob

type ImportJob struct {
	Source     Reader
	Target     string // output .pulse path
	Schema     *encoding.Schema
	SampleRows int // default 500, min 50
	FS         afero.Fs
}

ImportJob converts tabular source data into a .pulse file.

func NewImportJob

func NewImportJob(source Reader, target string) *ImportJob

NewImportJob creates an ImportJob with default settings.

func (*ImportJob) Predict

func (j *ImportJob) Predict(ctx context.Context) (*PredictReport, error)

Predict validates the import without writing any output.

func (*ImportJob) Run

func (j *ImportJob) Run(ctx context.Context) (*ImportReport, error)

Run executes the import job, converting tabular source into a .pulse file.

type ImportReport

type ImportReport struct {
	RowsImported int
	Schema       *encoding.Schema
	RowErrors    []RowError
}

ImportReport summarizes the result of an import operation.

type InferenceWarning

type InferenceWarning struct {
	Column  string
	Message string
}

InferenceWarning records a non-fatal observation during inference.

func InferSchema

func InferSchema(reader Reader, sampleRows int) (*encoding.Schema, []InferenceWarning, error)

InferSchema samples up to sampleRows rows from reader and proposes a Schema. If sampleRows <= 0, defaultSampleRows is used; minSampleRows is the lower bound. The ImportJob.SampleRows field comment gives these values as 500 and 50.
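
The sample-size rule can be illustrated with a small helper. This is a sketch under stated assumptions: the constant values are taken from the ImportJob.SampleRows field comment (default 500, min 50), the helper name effectiveSampleRows is hypothetical, and raising small values to the minimum (rather than rejecting them) is a guess at the behavior.

```go
package main

import "fmt"

// Values taken from the ImportJob.SampleRows field comment; the real
// unexported constants may differ.
const (
	defaultSampleRows = 500
	minSampleRows     = 50
)

// effectiveSampleRows normalizes a requested sample size.
// Assumption: values below the minimum are raised to it, not rejected.
func effectiveSampleRows(n int) int {
	if n <= 0 {
		return defaultSampleRows
	}
	if n < minSampleRows {
		return minSampleRows
	}
	return n
}

func main() {
	for _, n := range []int{0, 10, 50, 2000} {
		fmt.Println(n, effectiveSampleRows(n))
	}
}
```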

type PredictReport

type PredictReport struct {
	Schema        *encoding.Schema
	EstimatedRows int
	Warnings      []InferenceWarning
}

PredictReport summarizes a validation-only run.

type Reader

type Reader interface {
	// ReadHeader returns column names from the source.
	ReadHeader() ([]string, error)
	// ReadRows streams rows; calls fn for each row.
	// The context controls cancellation.
	ReadRows(ctx context.Context, fn func(row []string) error) error
	// Close releases underlying resources.
	Close() error
}

Reader reads tabular data from a source format.

type ResetReader

type ResetReader interface {
	Reader
	// Reset rewinds the reader to the beginning.
	Reset() error
}

ResetReader is an optional interface for readers that can rewind to the beginning. Rewinding is needed when schema inference, which consumes sample rows, is followed by an import over the same reader.
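
Optional interfaces like this are usually detected with a type assertion. The sketch below assumes that approach; memReader, rewind, and drain are hypothetical names, and the interfaces are copied from this page so the example compiles alone.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Reader and ResetReader are copied from the declarations on this page.
type Reader interface {
	ReadHeader() ([]string, error)
	ReadRows(ctx context.Context, fn func(row []string) error) error
	Close() error
}

type ResetReader interface {
	Reader
	Reset() error
}

// memReader is a hypothetical reader that tracks its position, so a second
// pass yields nothing unless Reset is called first.
type memReader struct {
	rows [][]string
	pos  int
}

func (m *memReader) ReadHeader() ([]string, error) { return []string{"v"}, nil }
func (m *memReader) ReadRows(ctx context.Context, fn func(row []string) error) error {
	for ; m.pos < len(m.rows); m.pos++ {
		if err := fn(m.rows[m.pos]); err != nil {
			return err
		}
	}
	return nil
}
func (m *memReader) Reset() error { m.pos = 0; return nil }
func (m *memReader) Close() error { return nil }

// rewind resets r if it implements ResetReader and reports an error otherwise.
func rewind(r Reader) error {
	rr, ok := r.(ResetReader)
	if !ok {
		return errors.New("reader does not support Reset")
	}
	return rr.Reset()
}

// drain consumes all remaining rows and returns how many were read.
func drain(r Reader) int {
	n := 0
	_ = r.ReadRows(context.Background(), func([]string) error { n++; return nil })
	return n
}

func main() {
	r := &memReader{rows: [][]string{{"1"}, {"2"}}}
	first := drain(r) // stands in for the sampling pass
	if err := rewind(r); err != nil {
		fmt.Println("cannot rewind:", err)
		return
	}
	second := drain(r) // stands in for the import pass
	fmt.Println(first, second)
}
```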

type RowError

type RowError struct {
	Row int
	Err error
}

RowError records a per-row error during import or export.
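
A per-row error type lets a job record bad rows and keep going instead of aborting on the first failure. The sketch below illustrates that idea with a hypothetical parseInts helper; treating Row as a 1-based index is an assumption, since the page does not say how rows are numbered.

```go
package main

import (
	"fmt"
	"strconv"
)

// RowError is copied from the declaration above.
type RowError struct {
	Row int
	Err error
}

// parseInts converts one cell per row to an int. Rows that fail to parse
// are recorded as RowErrors and skipped; the remaining rows are kept.
func parseInts(cells []string) ([]int, []RowError) {
	var out []int
	var errs []RowError
	for i, c := range cells {
		v, err := strconv.Atoi(c)
		if err != nil {
			// Assumption: rows are numbered from 1.
			errs = append(errs, RowError{Row: i + 1, Err: err})
			continue
		}
		out = append(out, v)
	}
	return out, errs
}

func main() {
	vals, errs := parseInts([]string{"1", "x", "3"})
	fmt.Println(vals, len(errs))
	for _, e := range errs {
		fmt.Printf("row %d: %v\n", e.Row, e.Err)
	}
}
```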

type Writer

type Writer interface {
	// WriteHeader writes column names to the target.
	WriteHeader(columns []string) error
	// WriteRow writes a single row of values.
	WriteRow(values []any) error
	// Close flushes and releases resources.
	Close() error
}

Writer writes tabular data to a target format.
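
A small implementation shows how the three methods fit together, in particular that Close is where buffered output gets flushed. The interface is copied from the declaration above; tabWriter and render are hypothetical names, and the output format is a bare tab-separated layout with no escaping, so it is not a substitute for the tsv sub-package.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Writer is copied from the declaration above.
type Writer interface {
	WriteHeader(columns []string) error
	WriteRow(values []any) error
	Close() error
}

// tabWriter is a hypothetical Writer that emits tab-separated lines.
// It does not escape tabs or newlines inside values.
type tabWriter struct{ w *bufio.Writer }

func (t *tabWriter) WriteHeader(columns []string) error {
	_, err := t.w.WriteString(strings.Join(columns, "\t") + "\n")
	return err
}

func (t *tabWriter) WriteRow(values []any) error {
	cells := make([]string, len(values))
	for i, v := range values {
		cells[i] = fmt.Sprint(v)
	}
	_, err := t.w.WriteString(strings.Join(cells, "\t") + "\n")
	return err
}

// Close flushes any buffered output to the destination.
func (t *tabWriter) Close() error { return t.w.Flush() }

// render writes a header and rows through the Writer interface and returns
// the resulting text.
func render(header []string, rows [][]any) (string, error) {
	var sb strings.Builder
	var w Writer = &tabWriter{w: bufio.NewWriter(&sb)}
	if err := w.WriteHeader(header); err != nil {
		return "", err
	}
	for _, row := range rows {
		if err := w.WriteRow(row); err != nil {
			return "", err
		}
	}
	if err := w.Close(); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := render([]string{"id", "ok"}, [][]any{{1, true}, {2, false}})
	fmt.Printf("%q %v\n", out, err)
}
```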

Directories

Path     Synopsis
csv      Package csv provides CSV format adapters for the Pulse I/O pipeline.
excel    Package excel provides Excel import and export for the Pulse I/O pipeline.
ndjson   Package ndjson provides NDJSON (newline-delimited JSON) import and export for the Pulse I/O pipeline.
parquet  Package parquet provides Parquet import and export for the Pulse I/O pipeline.
tsv      Package tsv provides TSV import and export for the Pulse I/O pipeline.
