parquet

package
v0.0.8 Latest
Warning

This package is not in the latest version of its module.

Published: May 24, 2026 License: MIT Imports: 5 Imported by: 0

Documentation

Overview

Package parquet provides Parquet reading and writing for the dataset package.

This package is a thin API facade with no heavy dependencies of its own. All Parquet processing is delegated to the engine through the dataset.ParquetReader and dataset.ParquetWriter interfaces:

  • Memory engine: uses parquet-go for struct-based row I/O (dataset/memory/parquet.go)
  • Arrow engine: uses pqarrow for zero-copy columnar I/O (dataset/arrow/parquet.go)

Usage:

    ds, err := parquet.Read(ctx, file, size, eng)
    err = parquet.Write(ctx, file, ds, eng)

Constants

This section is empty.

Variables

var ErrUnsupportedType = errors.New("parquet: unsupported column type")

ErrUnsupportedType is returned for unsupported column types.

Functions

func Read

func Read(ctx context.Context, r io.ReaderAt, size int64, eng dataset.Engine, opts ...Option) (dataset.Dataset, error)

Read reads a Parquet file from r (which must support random access) using the given engine. The engine must have a registered ParquetReader.
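
Random access is needed because a Parquet file keeps its metadata in a footer at the end of the file. The sketch below shows two common ways to obtain the io.ReaderAt and size pair that Read takes: from an *os.File via Stat, and from a byte slice via bytes.Reader. The helpers openForRead and fromBytes are hypothetical, and it is assumed that size means the total length of the file in bytes. The call to parquet.Read is left as a comment because the module import path is not given here.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// openForRead is a hypothetical helper: it opens path and returns the file
// (which implements io.ReaderAt) together with its size in bytes.
func openForRead(path string) (*os.File, int64, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, 0, err
	}
	info, err := f.Stat()
	if err != nil {
		f.Close()
		return nil, 0, err
	}
	return f, info.Size(), nil
}

// fromBytes is a hypothetical helper for data already held in memory.
func fromBytes(b []byte) (io.ReaderAt, int64) {
	return bytes.NewReader(b), int64(len(b))
}

func main() {
	// Create a small temporary file so the sketch runs anywhere.
	tmp, err := os.CreateTemp("", "example-*.parquet")
	if err != nil {
		panic(err)
	}
	defer os.Remove(tmp.Name())
	if _, err := tmp.Write([]byte("PAR1")); err != nil {
		panic(err)
	}
	if err := tmp.Close(); err != nil {
		panic(err)
	}

	f, size, err := openForRead(tmp.Name())
	if err != nil {
		panic(err)
	}
	defer f.Close()
	fmt.Println("file size:", size)
	// ds, err := parquet.Read(ctx, f, size, eng)

	_, n := fromBytes([]byte("PAR1"))
	fmt.Println("in-memory size:", n)
}
```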

func Write

func Write(ctx context.Context, w io.Writer, ds dataset.Dataset, eng dataset.Engine, opts ...Option) error

Write writes a Dataset as Parquet to w using the given engine. The engine must have a registered ParquetWriter.
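
Since Write takes an io.Writer while Read takes an io.ReaderAt, an in-memory round trip (for example in tests) needs a small bridge between the two. The sketch below shows only that plumbing. writeStub is a hypothetical stand-in for parquet.Write that emits just the 4-byte "PAR1" magic, which is not a valid Parquet file, and roundTrip is likewise a hypothetical helper.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// writeStub stands in for parquet.Write; it writes only the Parquet magic bytes.
func writeStub(w io.Writer) error {
	_, err := w.Write([]byte("PAR1"))
	return err
}

// roundTrip captures a writer's output in memory and exposes it as the
// io.ReaderAt and size pair that a Read call would need.
func roundTrip(write func(io.Writer) error) (io.ReaderAt, int64, error) {
	var buf bytes.Buffer
	if err := write(&buf); err != nil {
		return nil, 0, err
	}
	return bytes.NewReader(buf.Bytes()), int64(buf.Len()), nil
}

func main() {
	r, size, err := roundTrip(writeStub)
	if err != nil {
		panic(err)
	}
	head := make([]byte, 4)
	if _, err := r.ReadAt(head, 0); err != nil {
		panic(err)
	}
	fmt.Println(size, string(head))
}
```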

Types

type Option

type Option func(*dataset.ParquetConfig)

Option is a functional option for Parquet read/write.

func WithCompression

func WithCompression(codec string) Option

WithCompression sets the compression codec ("snappy", "gzip", "zstd", "lz4", "none"). Default is "snappy".
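
The functional-option pattern behind Option and WithCompression can be sketched as follows. ParquetConfig is a local stand-in for dataset.ParquetConfig, and its Compression field name is an assumption, since the real struct's fields are not listed here. newConfig is a hypothetical helper showing how a default of "snappy" might be set before the options are applied in order.

```go
package main

import "fmt"

// ParquetConfig is a stand-in for dataset.ParquetConfig (field name assumed).
type ParquetConfig struct {
	Compression string
}

// Option mutates a config in place, matching the documented signature shape.
type Option func(*ParquetConfig)

// WithCompression returns an Option that overrides the codec.
func WithCompression(codec string) Option {
	return func(c *ParquetConfig) { c.Compression = codec }
}

// newConfig is a hypothetical helper: start from defaults, then apply each
// option in order so that later options win.
func newConfig(opts ...Option) ParquetConfig {
	cfg := ParquetConfig{Compression: "snappy"}
	for _, opt := range opts {
		opt(&cfg)
	}
	return cfg
}

func main() {
	fmt.Println(newConfig().Compression)
	fmt.Println(newConfig(WithCompression("zstd")).Compression)
}
```

Because the options are variadic, callers that are happy with the defaults pass nothing, and new settings can be added later without changing the Read or Write signatures.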
