Documentation ¶
Overview ¶
Package geoarrow provides zero-allocation WKB→GeoArrow conversion for Apache Arrow record batches.
Usage:
	converter := geoarrow.NewConverter(reader, geoarrow.WithBufferSize(2))
	defer converter.Release()
	// use converter as array.RecordReader
Index ¶
- Constants
- Variables
- func ArrowTypeForGeo(gt GeoType) (arrow.DataType, string)
- func GeoArrowField(name string, gt GeoType, extensionMeta string) (arrow.Field, error)
- func NewProjector(source array.RecordReader, cols map[string]bool) array.RecordReader
- type Converter
- type GeoType
- type GeometryColumn
- type Option
- type Projector
- type StringReplacer
Constants ¶
const (
	ExtMultiPoint      = "geoarrow.multipoint"
	ExtMultiLineString = "geoarrow.multilinestring"
	ExtMultiPolygon    = "geoarrow.multipolygon"
)
GeoArrow extension type names.
Variables ¶
var (
	// MultiPoint: List<FixedSizeList<Float64>[2]>
	MultiPointType = arrow.ListOf(CoordType)
	// MultiLineString: List<List<FixedSizeList<Float64>[2]>>
	MultiLineStringType = arrow.ListOf(arrow.ListOf(CoordType))
	// MultiPolygon: List<List<List<FixedSizeList<Float64>[2]>>>
	MultiPolygonType = arrow.ListOf(arrow.ListOf(arrow.ListOf(CoordType)))
)
GeoArrow nested types — always Multi* for schema consistency.
var CoordType = arrow.FixedSizeListOf(2, arrow.PrimitiveTypes.Float64)
CoordType is FixedSizeList(2, Float64) — interleaved XY coordinates.
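The nested layouts in this section can be pictured with plain Go slices. The following is a minimal sketch that uses only the standard library, not the Arrow API; the multiPointColumn type and its methods are hypothetical names introduced for illustration. It models a MultiPoint column as one flat, interleaved XY buffer plus list offsets.

```go
package main

import "fmt"

// multiPointColumn is a hypothetical stand-in for List<FixedSizeList<Float64>[2]>:
// one flat, interleaved XY buffer plus list offsets counted in coordinates.
type multiPointColumn struct {
	offsets []int32   // len = rows+1; row i spans offsets[i]..offsets[i+1]
	xy      []float64 // x0, y0, x1, y1, ...
}

// appendRow adds one MultiPoint made of the given (x, y) pairs.
func (c *multiPointColumn) appendRow(points ...[2]float64) {
	if len(c.offsets) == 0 {
		c.offsets = append(c.offsets, 0)
	}
	for _, p := range points {
		c.xy = append(c.xy, p[0], p[1])
	}
	c.offsets = append(c.offsets, int32(len(c.xy)/2))
}

// row returns the points of row i by slicing the interleaved buffer.
func (c *multiPointColumn) row(i int) [][2]float64 {
	start, end := c.offsets[i], c.offsets[i+1]
	out := make([][2]float64, 0, end-start)
	for j := start; j < end; j++ {
		out = append(out, [2]float64{c.xy[2*j], c.xy[2*j+1]})
	}
	return out
}

func main() {
	var col multiPointColumn
	col.appendRow([2]float64{1, 2})
	col.appendRow([2]float64{3, 4}, [2]float64{5, 6})
	fmt.Println(col.offsets, col.xy, col.row(1))
}
```

MultiLineString and MultiPolygon add one and two further levels of offsets over the same interleaved coordinate buffer.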
Functions ¶
func ArrowTypeForGeo ¶
func ArrowTypeForGeo(gt GeoType) (arrow.DataType, string)
ArrowTypeForGeo returns the Arrow DataType and extension name for a GeoType.
func GeoArrowField ¶
func GeoArrowField(name string, gt GeoType, extensionMeta string) (arrow.Field, error)
GeoArrowField creates a new Arrow Field with GeoArrow extension metadata. extensionMeta is the original ARROW:extension:metadata value (passed through as-is). If empty, defaults to {"srid":4326}.
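The metadata defaulting described for GeoArrowField might look like the following sketch. It uses a plain map rather than Arrow's metadata type, and fieldMetadata and defaultExtensionMeta are hypothetical names; only the two metadata keys and the {"srid":4326} default come from the text above.

```go
package main

import "fmt"

// defaultExtensionMeta is the fallback named in the documentation.
const defaultExtensionMeta = `{"srid":4326}`

// fieldMetadata is a hypothetical helper: it builds the key/value pairs a
// GeoArrow field would carry, passing extensionMeta through unchanged
// unless it is empty.
func fieldMetadata(extName, extensionMeta string) map[string]string {
	if extensionMeta == "" {
		extensionMeta = defaultExtensionMeta
	}
	return map[string]string{
		"ARROW:extension:name":     extName,
		"ARROW:extension:metadata": extensionMeta,
	}
}

func main() {
	fmt.Println(fieldMetadata("geoarrow.multipoint", ""))
	fmt.Println(fieldMetadata("geoarrow.multipolygon", `{"srid":3857}`))
}
```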
func NewProjector ¶
func NewProjector(source array.RecordReader, cols map[string]bool) array.RecordReader
NewProjector creates a RecordReader that outputs only the named columns. Columns not found in the schema are silently ignored. If all columns are present or cols is empty, returns source unchanged (no wrapper).
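The selection rules above can be sketched with column names alone. This is an illustration under assumptions, not the package's code: projectIndices is a hypothetical helper, the schema is reduced to a slice of names, and "all columns are present" is read as "every schema column was requested".

```go
package main

import "fmt"

// projectIndices is a hypothetical helper: it returns the schema positions of
// the requested columns, and ok=false when no projection wrapper is needed.
func projectIndices(schema []string, cols map[string]bool) (idx []int, ok bool) {
	if len(cols) == 0 {
		return nil, false // empty selection: pass the source through
	}
	for i, name := range schema {
		if cols[name] {
			idx = append(idx, i)
		}
	}
	if len(idx) == len(schema) {
		return nil, false // every column was requested: nothing to drop
	}
	return idx, true // names missing from the schema were simply never matched
}

func main() {
	schema := []string{"id", "geom", "name"}
	fmt.Println(projectIndices(schema, map[string]bool{"geom": true, "missing": true}))
	fmt.Println(projectIndices(schema, nil))
}
```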
Types ¶
type Converter ¶
type Converter struct {
// contains filtered or unexported fields
}
Converter wraps an array.RecordReader and converts WKB geometry columns to native GeoArrow format on-the-fly. It implements array.RecordReader.
The source schema should already be flattened (Struct/Map/Union expanded to top-level columns) before passing to the Converter. This ensures that geometry columns nested inside structs are accessible by column index.
Multiple geometry columns per schema are supported — each is detected and converted independently (e.g., one column can be points, another polygons).
Conversion is pipelined: up to bufferSize batches are converted ahead in parallel goroutines while the consumer reads previous results. All goroutines respect the provided context for cancellation.
func NewConverter ¶
func NewConverter(source array.RecordReader, opts ...Option) *Converter
NewConverter creates a new WKB→GeoArrow converting RecordReader.
It reads batches from source, detects WKB geometry columns (by ARROW:extension:name metadata), and converts them to native GeoArrow (MultiPoint/MultiLineString/MultiPolygon) columns.
The source should already be flattened — if geometry is inside a struct, flatten first so the Converter can find it by column index.
Multiple geometry columns are supported; each is converted independently.
The conversion runs in a background pipeline with bufferSize goroutines. The output schema is determined after the first batch is converted (geometry column types are auto-detected from WKB content).
func (*Converter) Record ¶
func (c *Converter) Record() arrow.RecordBatch
Record is a deprecated alias for RecordBatch.
func (*Converter) RecordBatch ¶
func (c *Converter) RecordBatch() arrow.RecordBatch
type GeoType ¶
type GeoType int
GeoType represents the canonical geometry type for a column.
func ConvertBatch ¶
func ConvertBatch(
	rec arrow.RecordBatch,
	col GeometryColumn,
	geoType GeoType,
	mem memory.Allocator,
) (arrow.RecordBatch, GeoType, error)
ConvertBatch replaces WKB binary column(s) with native GeoArrow column(s). It returns a new RecordBatch, which the caller must Release. The geoType is auto-detected on the first non-null geometry and returned; subsequent calls should pass the same geoType.
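Type detection from WKB content can be sketched by reading the 5-byte WKB header of the first non-null value. The helper names below are hypothetical, and the mapping of single geometries (Point, LineString, Polygon) onto the Multi* extension names is an assumption based on the "always Multi*" note in this package; Z/M and EWKB flag bits are not handled.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// wkbGeometryType reads the WKB header: a byte-order flag
// (0 = big endian, 1 = little endian) followed by a uint32 type code.
func wkbGeometryType(wkb []byte) (uint32, error) {
	if len(wkb) < 5 {
		return 0, errors.New("wkb: header too short")
	}
	var order binary.ByteOrder
	switch wkb[0] {
	case 0:
		order = binary.BigEndian
	case 1:
		order = binary.LittleEndian
	default:
		return 0, fmt.Errorf("wkb: invalid byte order %d", wkb[0])
	}
	return order.Uint32(wkb[1:5]), nil
}

// canonicalName maps 2D WKB type codes onto Multi* extension names
// (assumed promotion: 1 Point, 2 LineString, 3 Polygon join their Multi* form).
func canonicalName(code uint32) (string, bool) {
	switch code {
	case 1, 4:
		return "geoarrow.multipoint", true
	case 2, 5:
		return "geoarrow.multilinestring", true
	case 3, 6:
		return "geoarrow.multipolygon", true
	}
	return "", false
}

// detect returns the canonical name for the first non-null (non-nil) value.
func detect(values [][]byte) (string, error) {
	for _, v := range values {
		if v == nil {
			continue // null geometry: keep looking
		}
		code, err := wkbGeometryType(v)
		if err != nil {
			return "", err
		}
		name, ok := canonicalName(code)
		if !ok {
			return "", fmt.Errorf("wkb: unsupported type code %d", code)
		}
		return name, nil
	}
	return "", errors.New("wkb: no non-null geometry")
}

func main() {
	// Little-endian POINT(0 0): order byte, type code 1, two float64 zeros.
	point := append([]byte{1, 1, 0, 0, 0}, make([]byte, 16)...)
	name, err := detect([][]byte{nil, point})
	fmt.Println(name, err)
}
```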
type GeometryColumn ¶
type GeometryColumn struct {
Name string
Index int
SRID int
Format string // "WKB", "GeoJSON", "H3Cell"
ExtensionMeta string // original ARROW:extension:metadata (passed through)
}
GeometryColumn describes a geometry column to convert.
func DetectGeometryColumns ¶
func DetectGeometryColumns(schema *arrow.Schema) []GeometryColumn
DetectGeometryColumns finds geometry columns in an Arrow schema by checking ARROW:extension:name metadata.
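Metadata-based detection can be illustrated without Arrow types. In this sketch the field and geometryColumn structs are simplified, hypothetical stand-ins; the accepted extension names (geoarrow.wkb, ogc.wkb) and the best-effort SRID parsing from {"srid":...} metadata are assumptions, not confirmed behavior of DetectGeometryColumns.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// field is a simplified stand-in for an Arrow schema field.
type field struct {
	Name     string
	Metadata map[string]string
}

// geometryColumn keeps a subset of the documented GeometryColumn fields.
type geometryColumn struct {
	Name  string
	Index int
	SRID  int
}

// detectGeometryColumns scans field metadata for a WKB extension name.
func detectGeometryColumns(fields []field) []geometryColumn {
	var cols []geometryColumn
	for i, f := range fields {
		switch f.Metadata["ARROW:extension:name"] {
		case "geoarrow.wkb", "ogc.wkb": // assumed set of WKB extension names
		default:
			continue
		}
		col := geometryColumn{Name: f.Name, Index: i}
		var meta struct {
			SRID int `json:"srid"`
		}
		// Best effort: missing or malformed metadata leaves SRID at zero.
		if err := json.Unmarshal([]byte(f.Metadata["ARROW:extension:metadata"]), &meta); err == nil {
			col.SRID = meta.SRID
		}
		cols = append(cols, col)
	}
	return cols
}

func main() {
	fields := []field{
		{Name: "id"},
		{Name: "geom", Metadata: map[string]string{
			"ARROW:extension:name":     "geoarrow.wkb",
			"ARROW:extension:metadata": `{"srid":4326}`,
		}},
	}
	fmt.Printf("%+v\n", detectGeometryColumns(fields))
}
```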
type Option ¶
type Option func(*converterConfig)
Option configures the Converter.
func WithAllocator ¶
WithAllocator sets the memory allocator for Arrow arrays.
func WithBufferSize ¶
WithBufferSize sets the number of batches to buffer ahead. This equals the number of goroutines doing conversion in parallel. Default: 1 (no parallelism, but still pipelined).
func WithColumns ¶
func WithColumns(cols []GeometryColumn) Option
WithColumns overrides auto-detection and specifies which columns to convert.
func WithContext ¶
WithContext sets the context for cancellation of the background pipeline.
type Projector ¶
type Projector struct {
// contains filtered or unexported fields
}
Projector wraps a RecordReader and projects (selects) only the requested columns. Implements array.RecordReader.
func (*Projector) Record ¶
func (p *Projector) Record() arrow.RecordBatch
func (*Projector) RecordBatch ¶
func (p *Projector) RecordBatch() arrow.RecordBatch
type StringReplacer ¶
type StringReplacer struct {
// contains filtered or unexported fields
}
StringReplacer wraps a RecordReader and replaces geometry columns (WKB binary or native GeoArrow) with a Utf8 string column containing "{geometry}" for display in viewers that don't support binary/nested types.
Implements array.RecordReader.
func NewStringReplacer ¶
func NewStringReplacer(source array.RecordReader) *StringReplacer
NewStringReplacer creates a RecordReader that replaces geometry columns with "{geometry}". It auto-detects geometry columns by ARROW:extension:name metadata (geoarrow.wkb, ogc.wkb, geoarrow.point, geoarrow.multi*, etc.).
func (*StringReplacer) Err ¶
func (r *StringReplacer) Err() error
func (*StringReplacer) Next ¶
func (r *StringReplacer) Next() bool
func (*StringReplacer) Record ¶
func (r *StringReplacer) Record() arrow.RecordBatch
func (*StringReplacer) RecordBatch ¶
func (r *StringReplacer) RecordBatch() arrow.RecordBatch
func (*StringReplacer) Release ¶
func (r *StringReplacer) Release()
func (*StringReplacer) Retain ¶
func (r *StringReplacer) Retain()
func (*StringReplacer) Schema ¶
func (r *StringReplacer) Schema() *arrow.Schema