io

package

v0.11.0 Latest Latest Go to latest Published: May 27, 2026 License: MIT Imports: 12 Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/frankbardon/pulse

Links

Open Source Insights

Documentation ¶

Overview ¶

Package io defines the I/O pipeline framework for Pulse: Reader/Writer interfaces, schema inference, and job types (ImportJob, ExportJob, ConvertJob).

Format-specific adapters live in sub-packages (csv, tsv).

Index ¶

func ErrStopIteration() error
type ConvertJob
- func NewConvertJob(source Reader, target Writer) *ConvertJob
- func (j *ConvertJob) Predict(ctx context.Context) (*PredictReport, error)
- func (j *ConvertJob) Run(ctx context.Context) (*ConvertReport, error)
type ConvertReport
type ExportJob
- func NewExportJob(source string, target Writer) *ExportJob
- func (j *ExportJob) Predict(ctx context.Context) (*PredictReport, error)
- func (j *ExportJob) Run(ctx context.Context) (*ExportReport, error)
type ExportReport
type ImportJob
- func NewImportJob(source Reader, target string) *ImportJob
- func (j *ImportJob) Predict(ctx context.Context) (*PredictReport, error)
- func (j *ImportJob) Run(ctx context.Context) (*ImportReport, error)
type ImportReport
type InferenceWarning
- func InferSchema(reader Reader, sampleRows int) (*encoding.Schema, []InferenceWarning, error)
type LabelResolver
type LabelWarning
type PredictReport
type Reader
type ResetReader
type RowError
type SchemaAwareWriter
type Writer

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func ErrStopIteration ¶

func ErrStopIteration() error

ErrStopIteration returns the stop iteration sentinel for use by readers.

Types ¶

type ConvertJob ¶

type ConvertJob struct {
	Source      Reader
	Target      Writer
	Schema      *encoding.Schema
	KeepPulseAt string // optional: also write intermediate .pulse
	SampleRows  int
	FS          afero.Fs
	// Includes restricts the export half to the named schema fields,
	// in schema order. Nil / empty means write every field. The
	// intermediate .pulse file (when KeepPulseAt is set) always
	// carries the full schema — projection is an output-time overlay,
	// not an on-disk schema change. See ExportJob.Includes.
	Includes []string
	// Labels apply to the export phase only — the import side reads
	// raw source bytes and has no use for label translation. See
	// ExportJob.Labels.
	Labels []*types.LabelBinding
	// LabelResolver carries the runtime resolver. See
	// ExportJob.LabelResolver.
	LabelResolver LabelResolver
}

ConvertJob chains import and export with no intermediate file on disk (unless KeepPulseAt is set).

func NewConvertJob ¶

func NewConvertJob(source Reader, target Writer) *ConvertJob

NewConvertJob creates a ConvertJob with default settings.

func (*ConvertJob) Predict ¶

func (j *ConvertJob) Predict(ctx context.Context) (*PredictReport, error)

Predict validates the conversion without writing.

func (*ConvertJob) Run ¶

func (j *ConvertJob) Run(ctx context.Context) (*ConvertReport, error)

Run executes the convert job, streaming from source to target. If KeepPulseAt is set, also writes an intermediate .pulse file.

type ConvertReport ¶

type ConvertReport struct {
	RowsConverted int
	Schema        *encoding.Schema
	RowErrors     []RowError
	LabelWarnings []LabelWarning
}

ConvertReport summarizes the result of a convert operation.

type ExportJob ¶

type ExportJob struct {
	Source string // input .pulse path
	Target Writer
	FS     afero.Fs
	// Includes restricts the export to the named source-schema fields,
	// in source-schema order. Nil / empty means export every field
	// (prior behaviour). Names must match Schema.Fields[i].Name
	// exactly. Unknown names return PULSE_EXPORT_FIELD_UNKNOWN with
	// the offending name + list of known fields. Label augment
	// siblings ("<field>_label") are emitted only for included source
	// fields; replace mode applies to included fields as before.
	Includes []string
	// Labels rewrites or augments categorical column values during
	// export using embedder-registered label tables. Bindings name a
	// categorical field plus a label table; mode=replace overwrites
	// the column value, mode=augment emits a sibling "<field>_label"
	// column. See types.LabelBinding for semantics. Nil/empty means
	// the exporter writes raw resolved categorical values, matching
	// pre-label behaviour.
	Labels []*types.LabelBinding
	// LabelResolver carries the runtime resolver built from Labels +
	// the pulse Service's registered LabelTables. The pulse.Export
	// facade builds the resolver and sets this field; callers using
	// io.ExportJob directly without a Pulse instance must supply a
	// satisfying implementation (typically wrapping
	// processing.BuildLabelResolver). Nil means no label translation.
	LabelResolver LabelResolver
}

ExportJob converts a .pulse file into tabular output.

func (*ExportJob) Predict ¶

func (j *ExportJob) Predict(ctx context.Context) (*PredictReport, error)

Predict validates the export without writing.

func (*ExportJob) Run ¶

func (j *ExportJob) Run(ctx context.Context) (*ExportReport, error)

Run executes the export job, converting a .pulse file to a tabular target.

type ExportReport ¶

type ExportReport struct {
	RowsExported  int
	RowErrors     []RowError
	LabelWarnings []LabelWarning
}

ExportReport summarizes the result of an export operation.

type ImportJob ¶

type ImportJob struct {
	Source     Reader
	Target     string // output .pulse path
	Schema     *encoding.Schema
	SampleRows int // default 500, min 50
	FS         afero.Fs
}

ImportJob converts tabular source data into a .pulse file.

func NewImportJob ¶

func NewImportJob(source Reader, target string) *ImportJob

NewImportJob creates an ImportJob with default settings.

func (*ImportJob) Predict ¶

func (j *ImportJob) Predict(ctx context.Context) (*PredictReport, error)

Predict validates the import without writing any output.

func (*ImportJob) Run ¶

func (j *ImportJob) Run(ctx context.Context) (*ImportReport, error)

Run executes the import job, converting tabular source into a .pulse file.

type ImportReport ¶

type ImportReport struct {
	RowsImported int
	Schema       *encoding.Schema
	RowErrors    []RowError
}

ImportReport summarizes the result of an import operation.

type InferenceWarning ¶

type InferenceWarning struct {
	Column  string
	Message string
}

InferenceWarning records a non-fatal observation during inference.

func InferSchema ¶

func InferSchema(reader Reader, sampleRows int) (*encoding.Schema, []InferenceWarning, error)

InferSchema samples up to sampleRows rows from reader and proposes a Schema. If sampleRows <= 0, defaultSampleRows is used. The minimum is minSampleRows.

type LabelResolver ¶ added in v0.10.1

type LabelResolver interface {
	// Has reports whether the resolver carries any binding for the
	// named field.
	Has(field string) bool
	// Mode returns the binding mode for the named field. Empty when
	// no binding exists.
	Mode(field string) types.LabelMode
	// Apply translates a raw value through the binding for the named
	// field. See processing.LabelResolver.Apply for semantics.
	Apply(field, raw string) (out string, sibling string, ok bool)
	// FieldsWithAugment returns the field names that should emit a
	// "<field>_label" sibling column.
	FieldsWithAugment() []string
	// Warnings flushes the per-binding collision and miss summaries
	// once the job finishes.
	Warnings() []LabelWarning
}

LabelResolver is the io-package projection of the processing-layer label resolver. ExportJob / ConvertJob consume the interface so io stays free of the processing dependency.

processing.LabelResolver satisfies this interface as-is; the pulse-package facade builds the resolver from the user-facing LabelBinding slice + LabelTable registry and hands the value to the job via ExportJob.LabelResolver.

type LabelWarning ¶ added in v0.10.1

type LabelWarning struct {
	Code    string
	Message string
	Details map[string]any
}

LabelWarning is the io-package projection of one resolver diagnostic record. ExportReport surfaces these for the caller; the CLI / MCP envelope wrapper promotes them to envelope warnings.

type PredictReport ¶

type PredictReport struct {
	Schema        *encoding.Schema
	EstimatedRows int
	Warnings      []InferenceWarning
}

PredictReport summarizes a validation-only run.

type Reader ¶

type Reader interface {
	// ReadHeader returns column names from the source.
	ReadHeader() ([]string, error)
	// ReadRows streams rows; calls fn for each row.
	// The context controls cancellation.
	ReadRows(ctx context.Context, fn func(row []string) error) error
	// Close releases underlying resources.
	Close() error
}

Reader reads tabular data from a source format.

type ResetReader ¶

type ResetReader interface {
	Reader
	// Reset rewinds the reader to the beginning.
	Reset() error
}

ResetReader is an optional interface for readers that support rewinding to the beginning. This is needed for schema inference followed by import.

type RowError ¶

type RowError struct {
	Row int
	Err error
}

RowError records a per-row error during import or export.

type SchemaAwareWriter ¶ added in v0.2.0

type SchemaAwareWriter interface {
	Writer
	SetPulseSchema(s *encoding.Schema)
}

SchemaAwareWriter is an optional extension of Writer for targets that emit native typed columns (Arrow, Parquet, Excel) and want the source .pulse schema to drive column-type selection. ExportJob calls SetPulseSchema before WriteHeader on writers that implement this interface, then passes typed values through WriteRow:

encoding.Decimal128 for decimal128 / nullable_decimal128 columns
canonical strings for narrow types (current behavior)

Writers that do not implement SchemaAwareWriter receive only canonical strings, which is the prior text-only export contract.

type Writer ¶

type Writer interface {
	// WriteHeader writes column names to the target.
	WriteHeader(columns []string) error
	// WriteRow writes a single row of values.
	WriteRow(values []any) error
	// Close flushes and releases resources.
	Close() error
}

Writer writes tabular data to a target format.

Source Files ¶

View all Source files

Directories ¶

Path	Synopsis
arrow Package arrow provides Arrow IPC (Feather V2) import and export for the pulse I/O pipeline, plus shared Arrow<->Pulse type-mapping helpers used by both this package and io/parquet.	Package arrow provides Arrow IPC (Feather V2) import and export for the pulse I/O pipeline, plus shared Arrow<->Pulse type-mapping helpers used by both this package and io/parquet.
csv Package csv provides CSV format adapters for the Pulse I/O pipeline.	Package csv provides CSV format adapters for the Pulse I/O pipeline.
excel Package excel provides Excel import and export for the pulse I/O pipeline.	Package excel provides Excel import and export for the pulse I/O pipeline.
format Package format dispatches tabular Reader construction by format identifier, sitting between the io/ interface definitions and the per-format leaf packages (io/csv, io/tsv, io/ndjson, io/jsonarray, io/parquet, io/arrow, io/excel).	Package format dispatches tabular Reader construction by format identifier, sitting between the io/ interface definitions and the per-format leaf packages (io/csv, io/tsv, io/ndjson, io/jsonarray, io/parquet, io/arrow, io/excel).
jsonarray Package jsonarray provides JSON-array import and export for the pulse I/O pipeline.	Package jsonarray provides JSON-array import and export for the pulse I/O pipeline.
jsonshared Package jsonshared holds value coercion helpers shared by the ndjson and jsonarray packages.	Package jsonshared holds value coercion helpers shared by the ndjson and jsonarray packages.
ndjson Package ndjson provides NDJSON (newline-delimited JSON) import and export for the pulse I/O pipeline.	Package ndjson provides NDJSON (newline-delimited JSON) import and export for the pulse I/O pipeline.
parquet Package parquet provides Parquet import and export for the pulse I/O pipeline.	Package parquet provides Parquet import and export for the pulse I/O pipeline.
tsv Package tsv provides TSV import and export for the pulse I/O pipeline.	Package tsv provides TSV import and export for the pulse I/O pipeline.

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL