Documentation
¶
Overview ¶
Package csv provides CSV reading and writing functionality for DataFrames.
Features:
- Automatic type inference (int64, float64, bool, string)
- Configurable NA value detection
- Support for custom delimiters
- Header row handling
- Streaming support for large files (phase 5)
Performance targets:
- Phase 1: >500MB/s baseline reading on NVMe SSD
- Phase 5: >2GB/s with parallel chunks and SIMD optimizations
Example:
df, err := csv.ReadCSV("data.csv",
csv.WithDelimiter(','),
csv.WithHeader(true),
csv.WithNA([]string{"NA", "NULL"}),
)
err = csv.WriteCSV(df, "output.csv")
Package csv provides CSV reading and writing functionality for DataFrames.
Index ¶
- func ReadCSV(path string, opts ...CSVOption) (*dataframe.DataFrame, error)
- func ReadCSVToSeries(path string, colName string, opts ...CSVOption) (*series.Series[any], error)
- func ToCSV(df *dataframe.DataFrame, path string) error
- func WriteCSV(df *dataframe.DataFrame, path string, opts ...CSVOption) error
- type CSVOption
- func WithChunkSize(size int) CSVOption
- func WithDelimiter(delim rune) CSVOption
- func WithDtypes(dtypes map[string]core.Dtype) CSVOption
- func WithHeader(header bool) CSVOption
- func WithNA(naValues []string) CSVOption
- func WithNAValue(naValue string) CSVOption
- func WithParallel(n int) CSVOption
- func WithWriteDelimiter(delim rune) CSVOption
- func WithWriteHeader(header bool) CSVOption
- type CSVReader
- type CSVWriter
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ReadCSVToSeries ¶
ReadCSVToSeries reads a single column from a CSV file as a Series.
Types ¶
type CSVOption ¶
CSVOption is a functional option for configuring CSVReader.
func WithChunkSize ¶
WithChunkSize sets the chunk size for reading large files.
func WithDelimiter ¶
WithDelimiter sets the field delimiter (default: comma).
func WithDtypes ¶
WithDtypes sets explicit data types for columns.
func WithHeader ¶
WithHeader specifies if the first row contains column names (default: true).
func WithNAValue ¶
WithNAValue sets the string to use for null values when writing.
func WithParallel ¶
WithParallel sets the number of parallel workers (default: runtime.NumCPU()).
func WithWriteDelimiter ¶
WithWriteDelimiter sets the field delimiter for writing.
func WithWriteHeader ¶
WithWriteHeader specifies whether to write column names as the first row.