Documentation
¶
Overview ¶
Package datafusion provides a database/sql driver backed by Apache DataFusion.
The driver registers as "datafusion". It is intended for in-process analytic SQL over DataFusion's memory/session catalog and Arrow execution engine. Standard database/sql row scanning is supported for scalar Arrow types, and QueryArrowContext exposes native Arrow record batches for callers that need exact Arrow schemas or complex values. RegisterArrowReader registers Go Arrow record readers as DataFusion in-memory tables.
Index ¶
- Constants
- Variables
- func ExecStatements(ctx context.Context, execer SQLExecerContext, statements []string) error
- func RegisterArrowReader(ctx context.Context, sqlConn *sql.Conn, tableName string, ...) error
- func RegisterArrowReaderZeroCopy(ctx context.Context, sqlConn *sql.Conn, tableName string, ...) error
- type ArrowReader
- type Conn
- func (conn *Conn) Begin() (driver.Tx, error)
- func (conn *Conn) BeginTx(ctx context.Context, opts driver.TxOptions) (driver.Tx, error)
- func (conn *Conn) CheckNamedValue(nv *driver.NamedValue) error
- func (conn *Conn) Close() error
- func (conn *Conn) ExecContext(ctx context.Context, query string, args []driver.NamedValue) (driver.Result, error)
- func (conn *Conn) IsValid() bool
- func (conn *Conn) Ping(ctx context.Context) error
- func (conn *Conn) Prepare(query string) (driver.Stmt, error)
- func (conn *Conn) PrepareContext(ctx context.Context, query string) (driver.Stmt, error)
- func (conn *Conn) QueryArrowContext(ctx context.Context, query string, args []driver.NamedValue) (ArrowReader, error)
- func (conn *Conn) QueryContext(ctx context.Context, query string, args []driver.NamedValue) (driver.Rows, error)
- func (conn *Conn) ResetSession(ctx context.Context) error
- type Connector
- type ConnectorOption
- type Date
- type Decimal
- type Driver
- type Duration
- type Error
- type ErrorType
- type NativeErrorKind
- type Null
- type ParameterType
- type SQLExecerContext
- type Stmt
- func (s *Stmt) CheckNamedValue(nv *driver.NamedValue) error
- func (s *Stmt) Close() error
- func (s *Stmt) Exec(args []driver.Value) (driver.Result, error)
- func (s *Stmt) ExecContext(ctx context.Context, args []driver.NamedValue) (driver.Result, error)
- func (s *Stmt) NumInput() int
- func (s *Stmt) Query(args []driver.Value) (driver.Rows, error)
- func (s *Stmt) QueryContext(ctx context.Context, args []driver.NamedValue) (driver.Rows, error)
- type Time
- type Timestamp
- type UInt64
Constants ¶
const ( // DataFusionVersion is the exact Rust datafusion crate version pinned by this release. DataFusionVersion = "53.1.0" // DataFusionVersionEncoded is used in Go module tags: v<major>.<encoded-datafusion-version>.<patch>. DataFusionVersionEncoded = "530100" // DataFusionGoMajor is the major component of datafusion-go release tags. DataFusionGoMajor = 0 // DataFusionGoPatch is the patch component of datafusion-go release tags. DataFusionGoPatch = 1 // DataFusionGoVersion is the full datafusion-go module version without the leading v. DataFusionGoVersion = "0.530100.1" )
Variables ¶
var ( // ErrNativeCancelled matches errors caused by native query cancellation. ErrNativeCancelled = errors.New("datafusion native query canceled") // ErrNativeInvalidArgument matches native invalid-argument errors. ErrNativeInvalidArgument = errors.New("datafusion native invalid argument") // ErrNativeFailure matches uncategorized native DataFusion failures. ErrNativeFailure = errors.New("datafusion native failure") // ErrNativePanic matches panics caught on the Rust side of the FFI boundary. ErrNativePanic = errors.New("datafusion native panic") )
Functions ¶
func ExecStatements ¶
func ExecStatements(ctx context.Context, execer SQLExecerContext, statements []string) error
ExecStatements executes already-split SQL statements in order.
DataFusion prepares exactly one SQL statement at a time, so callers handling migration files should split the script before calling this helper.
func RegisterArrowReader ¶
func RegisterArrowReader(ctx context.Context, sqlConn *sql.Conn, tableName string, reader array.RecordReader) error
RegisterArrowReader registers the remaining batches in reader as a DataFusion in-memory table visible to SQL executed on sqlConn.
This safe path serializes the reader to an Arrow IPC stream and lets the native side decode that stream into Rust-owned Arrow batches. That copy is intentional: the registered table can outlive this call, while ordinary Go Arrow arrays may use Go-owned buffers that must not be retained by native code after a cgo call returns.
func RegisterArrowReaderZeroCopy ¶
func RegisterArrowReaderZeroCopy(ctx context.Context, sqlConn *sql.Conn, tableName string, reader array.RecordReader) error
RegisterArrowReaderZeroCopy registers the remaining batches in reader as a DataFusion in-memory table by exporting reader through the Arrow C Stream Interface.
Unlike RegisterArrowReader, this path does not copy buffers into Rust-owned memory. Use it only when every exported Arrow buffer is safe for native code to retain until the registered table is dropped or the owning DataFusion session/connector is closed, for example buffers allocated with Arrow Go's mallocator or another C/foreign allocator. Passing ordinary Go-allocated Arrow buffers can violate cgo pointer lifetime rules because DataFusion keeps table batches after this call returns.
Types ¶
type ArrowReader ¶
ArrowReader streams Arrow record batches returned by DataFusion.
Callers must close the reader when they are done with it. Close cancels any in-flight native execution and releases native Arrow stream resources.
func QueryArrowContext ¶
func QueryArrowContext(ctx context.Context, sqlConn *sql.Conn, query string, args ...any) (ArrowReader, error)
QueryArrowContext runs query on a DataFusion *sql.Conn and returns Arrow record batches without converting them through database/sql values.
type Conn ¶
type Conn struct {
// contains filtered or unexported fields
}
Conn is a single database/sql driver connection to a DataFusion SessionContext.
func (*Conn) Begin ¶
Begin returns an unsupported error because DataFusion transactions are not supported.
func (*Conn) BeginTx ¶
BeginTx returns an unsupported error after honoring an already-canceled context.
func (*Conn) CheckNamedValue ¶
func (conn *Conn) CheckNamedValue(nv *driver.NamedValue) error
CheckNamedValue normalizes DataFusion-specific parameter wrapper types.
func (*Conn) ExecContext ¶
func (conn *Conn) ExecContext(ctx context.Context, query string, args []driver.NamedValue) (driver.Result, error)
ExecContext executes a statement and returns a database/sql result.
func (*Conn) PrepareContext ¶
PrepareContext validates and prepares query.
func (*Conn) QueryArrowContext ¶
func (conn *Conn) QueryArrowContext(ctx context.Context, query string, args []driver.NamedValue) (ArrowReader, error)
QueryArrowContext executes a query and returns Arrow record batches.
type Connector ¶
type Connector struct {
// contains filtered or unexported fields
}
Connector owns the native DataFusion database handle used to open pooled connections.
func NewConnector ¶
NewConnector creates a DataFusion connector for database/sql.
func NewConnectorWithInitContext ¶
func NewConnectorWithInitContext(dsn string, initFn func(context.Context, driver.ExecerContext) error, options ...ConnectorOption) (*Connector, error)
NewConnectorWithInitContext creates a DataFusion connector with an optional initialization callback and connector options.
type ConnectorOption ¶
type ConnectorOption func(*connectorOptions)
ConnectorOption configures a DataFusion connector.
func WithSharedSession ¶
func WithSharedSession(shared bool) ConnectorOption
WithSharedSession controls whether connections from one Connector share a DataFusion SessionContext.
type Date ¶
type Date struct {
// contains filtered or unexported fields
}
Date binds a parameter as an Arrow Date32 value.
func DateFromTime ¶
DateFromTime returns a Date using t's calendar date in t's location.
type Decimal ¶
type Decimal struct {
// contains filtered or unexported fields
}
Decimal binds a parameter as an Arrow Decimal128 value.
func DecimalString ¶
DecimalString returns a Decimal parameter from a base-10 string, precision, and scale.
func NewDecimalString ¶
NewDecimalString validates and returns a Decimal parameter from a base-10 string, precision, and scale.
type Duration ¶
type Duration struct {
// contains filtered or unexported fields
}
Duration binds a parameter as an Arrow duration with nanosecond precision.
func DurationFromTime ¶
DurationFromTime returns a Duration from a Go time.Duration.
func DurationNanos ¶
DurationNanos returns a Duration from nanoseconds.
func (Duration) Nanoseconds ¶
Nanoseconds returns the duration as nanoseconds.
type Error ¶
type Error struct {
// Type identifies the driver operation that failed.
Type ErrorType
// NativeKind identifies native DataFusion failures when available.
NativeKind NativeErrorKind
// Message is the driver-level error message.
Message string
// Cause is the wrapped native or lower-level error.
Cause error
}
Error is the structured error type returned by this driver.
type ErrorType ¶
type ErrorType string
ErrorType identifies the operation that failed.
const ( // ErrorConnect marks failures while opening a database or connection. ErrorConnect ErrorType = "connect" // ErrorPrepare marks failures while parsing or preparing SQL. ErrorPrepare ErrorType = "prepare" // ErrorBind marks failures while binding SQL parameters. ErrorBind ErrorType = "bind" // ErrorExecute marks failures while executing SQL or reading result batches. ErrorExecute ErrorType = "execute" // ErrorScan marks failures while adapting Arrow values to database/sql rows. ErrorScan ErrorType = "scan" // ErrorClosed marks use of a closed connector, connection, statement, or reader. ErrorClosed ErrorType = "closed" // ErrorUnsupported marks operations DataFusion does not support through this driver. ErrorUnsupported ErrorType = "unsupported" // ErrorNative marks uncategorized native FFI failures. ErrorNative ErrorType = "native" )
type NativeErrorKind ¶
type NativeErrorKind string
NativeErrorKind is the stable public classification for native DataFusion errors.
const ( // NativeErrorKindCancelled indicates a query canceled through context cancellation. NativeErrorKindCancelled NativeErrorKind = "cancelled" // NativeErrorKindInvalidArgument indicates invalid SQL, parameters, or API input. NativeErrorKindInvalidArgument NativeErrorKind = "invalid_argument" // NativeErrorKindNative indicates an uncategorized native DataFusion failure. NativeErrorKindNative NativeErrorKind = "native" // NativeErrorKindPanic indicates a panic caught on the Rust side of the FFI boundary. NativeErrorKindPanic NativeErrorKind = "panic" )
type Null ¶
type Null struct {
// contains filtered or unexported fields
}
Null binds a typed null parameter.
func NullDecimal ¶
NullDecimal returns a typed decimal null.
func NullOf ¶
func NullOf(typ ParameterType) Null
NullOf returns a typed null for non-decimal parameter types.
func NullTimestamp ¶
NullTimestamp returns a typed timestamp null with an explicit Arrow time zone string.
func (Null) TimeZone ¶
TimeZone returns the Arrow timestamp timezone string for timestamp typed nulls.
func (Null) Type ¶
func (value Null) Type() ParameterType
Type returns the typed-null parameter type.
type ParameterType ¶
type ParameterType int
ParameterType names a DataFusion scalar type for typed null parameters.
const ( // ParameterBool identifies a boolean parameter. ParameterBool ParameterType = iota + 1 // ParameterInt64 identifies a signed 64-bit integer parameter. ParameterInt64 // ParameterUInt64 identifies an unsigned 64-bit integer parameter. ParameterUInt64 // ParameterFloat64 identifies a 64-bit floating point parameter. ParameterFloat64 // ParameterString identifies a UTF-8 string parameter. ParameterString // ParameterBinary identifies a binary parameter. ParameterBinary // ParameterDate identifies an Arrow Date32 parameter. ParameterDate // ParameterTime identifies an Arrow Time64 nanosecond parameter. ParameterTime // ParameterTimestamp identifies an Arrow timestamp parameter. ParameterTimestamp // ParameterDuration identifies an Arrow duration nanosecond parameter. ParameterDuration // ParameterDecimal identifies an Arrow Decimal128 parameter. ParameterDecimal )
type SQLExecerContext ¶
type SQLExecerContext interface {
ExecContext(ctx context.Context, query string, args ...any) (sql.Result, error)
}
SQLExecerContext is implemented by *sql.DB, *sql.Conn, and *sql.Tx.
type Stmt ¶
type Stmt struct {
// contains filtered or unexported fields
}
Stmt is a prepared DataFusion statement.
func (*Stmt) CheckNamedValue ¶
func (s *Stmt) CheckNamedValue(nv *driver.NamedValue) error
CheckNamedValue normalizes DataFusion-specific parameter wrapper types.
func (*Stmt) ExecContext ¶
ExecContext executes the statement with normalized named values.
func (*Stmt) NumInput ¶
NumInput returns the number of SQL parameters found while preparing the statement.
func (*Stmt) QueryContext ¶
QueryContext executes the statement with normalized named values.
type Time ¶
type Time struct {
// contains filtered or unexported fields
}
Time binds a parameter as an Arrow Time64 nanosecond value.
func NewTimeNanos ¶
NewTimeNanos validates and returns a Time from nanoseconds since midnight.
func TimeFromTime ¶
TimeFromTime returns a Time using only t's clock fields in t's location.
func (Time) Nanoseconds ¶
Nanoseconds returns the nanoseconds since midnight.
type Timestamp ¶
type Timestamp struct {
// contains filtered or unexported fields
}
Timestamp binds a parameter as an Arrow timestamp with nanosecond precision.
func TimestampFromTime ¶
TimestampFromTime returns a UTC timestamp parameter for t.
func TimestampWithTimeZone ¶
TimestampWithTimeZone returns a timestamp parameter with an explicit Arrow time zone string.
Source Files
¶
Directories
¶
| Path | Synopsis |
|---|---|
|
examples
|
|
|
arrow
command
|
|
|
parameters
command
|
|
|
simple
command
|
|
|
internal
|
|
|
tools/genversions
command
|