experiments

package
v0.1.1
Published: Aug 31, 2024 License: MIT Imports: 28 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func AppendToDefaultStream

func AppendToDefaultStream(w io.Writer, projectID, datasetID, tableID string) error

AppendToDefaultStream demonstrates using the managedwriter package to write example data to a table's default stream.

func AppendToPendingStream

func AppendToPendingStream(w io.Writer, projectID, datasetID, tableID string) error

AppendToPendingStream demonstrates using the managedwriter package to write example data to a pending-type stream, which buffers rows until the stream is finalized and committed.
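
A minimal sketch of invoking both demo writers. The import path, project, dataset, and table identifiers below are hypothetical placeholders; the target table must already exist with a schema compatible with the sample rows the helpers write.

	package main

	import (
		"log"
		"os"

		// Hypothetical import path; substitute this module's real path.
		experiments "example.com/experiments"
	)

	func main() {
		// Placeholder identifiers; substitute real resources.
		const project, dataset, table = "my-project", "my_dataset", "my_table"
		if err := experiments.AppendToDefaultStream(os.Stdout, project, dataset, table); err != nil {
			log.Fatal(err)
		}
		if err := experiments.AppendToPendingStream(os.Stdout, project, dataset, table); err != nil {
			log.Fatal(err)
		}
	}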

func BuildAppendRowsRequest

func BuildAppendRowsRequest(data [][]byte) *storagepb.AppendRowsRequest

BuildAppendRowsRequest constructs an AppendRowsRequest from a slice of serialized row data.
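
A sketch of preparing the serialized rows, assuming each row is a protocol buffer message matching the destination table's schema. The buildRequest helper and msgs slice are hypothetical, written as if inside this package:

	import (
		"cloud.google.com/go/bigquery/storage/apiv1/storagepb"
		"google.golang.org/protobuf/proto"
	)

	// buildRequest marshals each message and hands the serialized bytes
	// to BuildAppendRowsRequest.
	func buildRequest(msgs []proto.Message) (*storagepb.AppendRowsRequest, error) {
		data := make([][]byte, 0, len(msgs))
		for _, m := range msgs {
			b, err := proto.Marshal(m)
			if err != nil {
				return nil, err
			}
			data = append(data, b)
		}
		return BuildAppendRowsRequest(data), nil
	}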

func GoTypeToArrowType

func GoTypeToArrowType(goType reflect.Type) arrow.DataType

GoTypeToArrowType maps a Go reflect.Type to a corresponding Apache Arrow DataType.
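
The call shape, as a sketch; the exact mapping is package-defined, so the commented result is an assumption:

	dt := GoTypeToArrowType(reflect.TypeOf(int64(0)))
	fmt.Println(dt) // presumably an Arrow int64 type such as arrow.PrimitiveTypes.Int64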

func NormalizeDescriptor

func NormalizeDescriptor(in protoreflect.MessageDescriptor) (*descriptorpb.DescriptorProto, error)

NormalizeDescriptor builds a self-contained DescriptorProto suitable for communicating schema information with the BigQuery Storage write API. It's primarily used for cases where users want to send data using a predefined protocol buffer message.

The storage API accepts a single DescriptorProto for decoding message data. In many cases, a message is composed of multiple independent messages, from the same .proto file or from multiple sources. Rather than requiring all these messages to be communicated independently, this function rewrites the DescriptorProto to inline all of them as nested submessages. Because the backend only cares about types, not namespaces, when decoding, this is sufficient for the needs of the API's representation.

In addition to nesting messages, this method also handles some encapsulation of enum types to avoid possible conflicts due to ambiguities, and clears oneof indices as oneof isn't a concept that maps into BigQuery schemas.

To enable proto3 usage, this function will also rewrite proto3 descriptors into equivalent proto2 form. Such rewrites include setting the appropriate default values for proto3 fields.
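
A sketch, assuming the signature above and a hypothetical generated message type samplepb.Row whose fields match the destination table:

	import (
		"cloud.google.com/go/bigquery/storage/apiv1/storagepb"

		// Hypothetical generated package for the row message.
		samplepb "example.com/samplepb"
	)

	row := &samplepb.Row{}
	dp, err := NormalizeDescriptor(row.ProtoReflect().Descriptor())
	if err != nil {
		// handle error
	}
	// The normalized descriptor can then be sent as the writer schema
	// in an AppendRowsRequest.
	schema := &storagepb.ProtoSchema{ProtoDescriptor: dp}
	_ = schema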

func StorageSchemaToProto2Descriptor

func StorageSchemaToProto2Descriptor(inSchema *storagepb.TableSchema, scope string) (protoreflect.Descriptor, error)

StorageSchemaToProto2Descriptor builds a protoreflect.Descriptor for a given table schema using proto2 syntax.

func StorageSchemaToProto3Descriptor

func StorageSchemaToProto3Descriptor(inSchema *storagepb.TableSchema, scope string) (protoreflect.Descriptor, error)

StorageSchemaToProto3Descriptor builds a protoreflect.Descriptor for a given table schema using proto3 syntax.

NOTE: The write API doesn't yet support proto3 behaviors (default values, wrapper types, etc.), but this is provided for completeness.
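
A sketch of converting a storage TableSchema into a proto2 descriptor (the proto3 variant is called the same way); the "root" scope names the resulting root message:

	schema := &storagepb.TableSchema{
		Fields: []*storagepb.TableFieldSchema{
			{Name: "username", Type: storagepb.TableFieldSchema_STRING, Mode: storagepb.TableFieldSchema_NULLABLE},
			{Name: "visits", Type: storagepb.TableFieldSchema_INT64, Mode: storagepb.TableFieldSchema_REQUIRED},
		},
	}
	desc, err := StorageSchemaToProto2Descriptor(schema, "root")
	if err != nil {
		// handle error
	}
	// The descriptor is typically asserted to a message descriptor for
	// use with dynamic messages.
	md, ok := desc.(protoreflect.MessageDescriptor)
	_, _ = md, ok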

func UniqueBQName

func UniqueBQName(prefix string) (string, error)

UniqueBQName returns a unique name for a BigQuery resource, derived from the given prefix.

func UniqueBucketName

func UniqueBucketName(prefix, projectID string) (string, error)

UniqueBucketName returns a unique name for a Cloud Storage bucket, derived from the given prefix and project ID.
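
A sketch of generating names; the exact uniquifying scheme (for example, a random or timestamp suffix) is package-defined:

	dataset, err := UniqueBQName("demo_dataset")
	if err != nil {
		// handle error
	}
	bucket, err := UniqueBucketName("demo-bucket", "my-project")
	if err != nil {
		// handle error
	}
	fmt.Println(dataset, bucket)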

Types

type ParquetRows

type ParquetRows struct {
	// contains filtered or unexported fields
}

ParquetRows represents a result set that reads from a Parquet file using Apache Arrow.

func NewParquetRowsReader

func NewParquetRowsReader(ctx context.Context, filePath string) (*ParquetRows, error)

NewParquetRowsReader initializes a ParquetRows reader for the Parquet file at the given path.

func (*ParquetRows) Close

func (p *ParquetRows) Close() error

Close releases all resources associated with the reader.

func (*ParquetRows) ColumnTypeDatabaseTypeName

func (p *ParquetRows) ColumnTypeDatabaseTypeName(index int) string

ColumnTypeDatabaseTypeName returns the database type name of the column at the specified index.

func (*ParquetRows) ColumnTypeNullable

func (p *ParquetRows) ColumnTypeNullable(index int) (nullable, ok bool)

ColumnTypeNullable returns whether the column at the specified index is nullable.

func (*ParquetRows) ColumnTypePrecisionScale

func (p *ParquetRows) ColumnTypePrecisionScale(index int) (precision, scale int64, ok bool)

ColumnTypePrecisionScale returns the precision and scale for the column at the specified index.

func (*ParquetRows) ColumnTypeScanType

func (p *ParquetRows) ColumnTypeScanType(index int) reflect.Type

ColumnTypeScanType returns the Go type suitable for scanning values from the column at the specified index.

func (*ParquetRows) Columns

func (p *ParquetRows) Columns() []string

Columns returns the column names of the Parquet file.

func (*ParquetRows) Next

func (p *ParquetRows) Next(dest []driver.Value) error

Next reads the next record from the Parquet file and stores the values in the dest slice.
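
A sketch of reading a file through ParquetRows. The import path and file name are hypothetical placeholders, and the io.EOF end-of-rows convention is assumed from the database/sql/driver.Rows contract these methods mirror:

	package main

	import (
		"context"
		"database/sql/driver"
		"fmt"
		"io"
		"log"

		// Hypothetical import path; substitute this module's real path.
		experiments "example.com/experiments"
	)

	func main() {
		rows, err := experiments.NewParquetRowsReader(context.Background(), "testdata/example.parquet")
		if err != nil {
			log.Fatal(err)
		}
		defer rows.Close()

		// dest is reused across calls to Next, one slot per column.
		dest := make([]driver.Value, len(rows.Columns()))
		for {
			err := rows.Next(dest)
			if err == io.EOF {
				break
			}
			if err != nil {
				log.Fatal(err)
			}
			fmt.Println(dest...)
		}
	}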
