io

package
v0.3.0
Published: Jul 27, 2025 License: Apache-2.0, MIT Imports: 19 Imported by: 0

Documentation

Overview

Package io provides data input/output operations for reading and writing DataFrame data. It includes readers and writers for various data formats, with automatic type inference, schema handling, and configurable options. The primary implementations are CSV and Parquet I/O, with support for streaming large datasets.

Key components:

  • DataReader/DataWriter interfaces for pluggable I/O backends
  • CSVReader/CSVWriter for CSV file operations
  • ParquetReader/ParquetWriter for Parquet file operations (added in v0.2.0)
  • Type inference for automatic schema detection
  • Configurable options for delimiters, headers, compression, and batch sizes

Memory management: All I/O operations integrate with Apache Arrow's memory management system and require proper cleanup with defer patterns.
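A minimal end-to-end sketch of this read-and-release pattern is shown below. The NewCSVReader, DefaultCSVOptions, and Read signatures come from the reference that follows; the module import path and the DataFrame Release method are assumptions for illustration, and memory.NewGoAllocator is Apache Arrow's standard Go allocator.

package main

import (
	"fmt"
	"log"
	"strings"

	"github.com/apache/arrow/go/v14/arrow/memory" // Arrow allocator; the version suffix may differ

	dfio "example.com/project/io" // hypothetical import path for this package
)

func main() {
	data := "name,age\nalice,30\nbob,25\n"

	// NewCSVReader takes any io.Reader, a CSVOptions value, and an Arrow allocator.
	r := dfio.NewCSVReader(strings.NewReader(data), dfio.DefaultCSVOptions(), memory.NewGoAllocator())

	df, err := r.Read()
	if err != nil {
		log.Fatal(err)
	}
	// Assumption: DataFrame exposes Release for Arrow buffer cleanup,
	// per the defer-based cleanup guidance above.
	defer df.Release()

	fmt.Println(df)
}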

Index

Constants

const (
	// DefaultChunkSize is the default chunk size for parallel processing
	DefaultChunkSize = 1000
	// DefaultBatchSize is the default batch size for I/O operations
	DefaultBatchSize = 1000
)

Variables

This section is empty.

Functions

This section is empty.

Types

type CSVOptions

type CSVOptions struct {
	// Delimiter is the field delimiter (default: comma)
	Delimiter rune
	// Comment is the comment character (default: 0 = disabled)
	Comment rune
	// Header indicates whether the first row contains headers
	Header bool
	// SkipInitialSpace indicates whether to skip initial whitespace
	SkipInitialSpace bool
	// Parallel indicates whether to use parallel processing
	Parallel bool
	// ChunkSize is the size of chunks for parallel processing
	ChunkSize int
}

CSVOptions contains configuration options for CSV operations

func DefaultCSVOptions

func DefaultCSVOptions() CSVOptions

DefaultCSVOptions returns default CSV options
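Defaults can be overridden field by field. The field names below come from the CSVOptions definition above; dfio is the hypothetical import alias used in the overview example:

opts := dfio.DefaultCSVOptions() // start from the package defaults
opts.Delimiter = ';'             // parse semicolon-separated files
opts.Comment = '#'               // ignore lines starting with '#'
opts.Parallel = true             // enable chunked parallel processing
opts.ChunkSize = 5000            // override DefaultChunkSize (1000)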

type CSVReader

type CSVReader struct {
	// contains filtered or unexported fields
}

CSVReader reads CSV data and converts it to DataFrames

func NewCSVReader

func NewCSVReader(reader io.Reader, options CSVOptions, mem memory.Allocator) *CSVReader

NewCSVReader creates a new CSV reader with the specified options

func (*CSVReader) Read

func (r *CSVReader) Read() (*dataframe.DataFrame, error)

Read reads CSV data and returns a DataFrame
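Reading from a file rather than an in-memory string follows the same pattern. This sketch reuses the dfio alias and the assumed Release method from the overview example:

f, err := os.Open("data.csv")
if err != nil {
	log.Fatal(err)
}
defer f.Close()

r := dfio.NewCSVReader(f, dfio.DefaultCSVOptions(), memory.NewGoAllocator())
df, err := r.Read()
if err != nil {
	log.Fatal(err)
}
defer df.Release() // assumed cleanup, per the overview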

type CSVWriter

type CSVWriter struct {
	// contains filtered or unexported fields
}

CSVWriter writes DataFrames to CSV format

func NewCSVWriter

func NewCSVWriter(writer io.Writer, options CSVOptions) *CSVWriter

NewCSVWriter creates a new CSV writer with the specified options

func (*CSVWriter) Write

func (w *CSVWriter) Write(df *dataframe.DataFrame) error

Write writes the DataFrame to CSV format
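A corresponding write sketch; df is a previously obtained *dataframe.DataFrame, and the dfio import alias is the same assumption as above:

out, err := os.Create("out.csv")
if err != nil {
	log.Fatal(err)
}
defer out.Close()

w := dfio.NewCSVWriter(out, dfio.DefaultCSVOptions())
if err := w.Write(df); err != nil {
	log.Fatal(err)
}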

type DataReader

type DataReader interface {
	// Read reads data from the source and returns a DataFrame
	Read() (*dataframe.DataFrame, error)
}

DataReader defines the interface for reading data from various sources

type DataWriter

type DataWriter interface {
	// Write writes the DataFrame to the destination
	Write(df *dataframe.DataFrame) error
}

DataWriter defines the interface for writing data to various destinations
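Because every reader and writer in this package satisfies these two interfaces, format conversion can be written once against them. A minimal sketch, assuming the Release method from the overview:

// Convert pipes one DataFrame from any DataReader to any DataWriter,
// e.g. CSV in, Parquet out.
func Convert(r dfio.DataReader, w dfio.DataWriter) error {
	df, err := r.Read()
	if err != nil {
		return err
	}
	defer df.Release() // assumed Arrow cleanup
	return w.Write(df)
}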

type ParquetOptions

type ParquetOptions struct {
	// Compression type for Parquet files
	Compression string
	// BatchSize for reading/writing operations
	BatchSize int
}

ParquetOptions contains configuration options for Parquet operations

func DefaultParquetOptions

func DefaultParquetOptions() ParquetOptions

DefaultParquetOptions returns default Parquet options
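Overriding the Parquet defaults looks the same as for CSV. The accepted Compression strings are not listed on this page, so "snappy" below is an assumption based on common Parquet usage:

popts := dfio.DefaultParquetOptions()
popts.Compression = "snappy" // assumed value; consult the source for accepted strings
popts.BatchSize = 4096       // override DefaultBatchSize (1000)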

type ParquetReader added in v0.2.0

type ParquetReader struct {
	// contains filtered or unexported fields
}

ParquetReader reads Parquet data and converts it to DataFrames

func NewParquetReader added in v0.2.0

func NewParquetReader(reader io.Reader, options ParquetOptions, mem memory.Allocator) *ParquetReader

NewParquetReader creates a new Parquet reader with the specified options

func (*ParquetReader) Read added in v0.2.0

func (r *ParquetReader) Read() (*dataframe.DataFrame, error)

Read reads Parquet data and returns a DataFrame

type ParquetWriter added in v0.2.0

type ParquetWriter struct {
	// contains filtered or unexported fields
}

ParquetWriter writes DataFrames to Parquet format

func NewParquetWriter added in v0.2.0

func NewParquetWriter(writer io.Writer, options ParquetOptions) *ParquetWriter

NewParquetWriter creates a new Parquet writer with the specified options

func (*ParquetWriter) Write added in v0.2.0

func (w *ParquetWriter) Write(df *dataframe.DataFrame) error

Write writes the DataFrame to Parquet format
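Combining the Parquet writer with the Convert helper sketched under DataWriter gives a CSV-to-Parquet conversion:

in, err := os.Open("data.csv")
if err != nil {
	log.Fatal(err)
}
defer in.Close()

out, err := os.Create("data.parquet")
if err != nil {
	log.Fatal(err)
}
defer out.Close()

r := dfio.NewCSVReader(in, dfio.DefaultCSVOptions(), memory.NewGoAllocator())
w := dfio.NewParquetWriter(out, dfio.DefaultParquetOptions())
if err := Convert(r, w); err != nil {
	log.Fatal(err)
}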
