go-csvpp

v0.0.1 · Published: Jan 25, 2026 · License: MIT · Imports: 8 · Imported by: 0

A Go implementation of the IETF CSV++ specification (draft-mscaldas-csvpp-00).

CSV++ extends traditional CSV to support arrays and structured fields within cells, enabling complex data representation while maintaining CSV's simplicity.

Features

  • Full IETF CSV++ specification compliance
  • Wraps encoding/csv for RFC 4180 compatibility
  • Four field types: Simple, Array, Structured, ArrayStructured
  • Struct mapping with csvpp tags (Marshal/Unmarshal)
  • Configurable delimiters
  • Security-conscious design (nesting depth limits)

Requirements

  • Go 1.24 or later

Installation

go get github.com/osamingo/go-csvpp

Quick Start

Reading CSV++ Data
package main

import (
    "fmt"
    "io"
    "strings"

    "github.com/osamingo/go-csvpp"
)

func main() {
    input := `name,phone[],geo(lat^lon)
Alice,555-1234~555-5678,34.0522^-118.2437
Bob,555-9999,40.7128^-74.0060
`

    reader := csvpp.NewReader(strings.NewReader(input))

    for {
        record, err := reader.Read()
        if err == io.EOF {
            break
        }
        if err != nil {
            panic(err)
        }

        name := record[0].Value
        phones := record[1].Values
        lat := record[2].Components[0].Value
        lon := record[2].Components[1].Value

        fmt.Printf("%s: phones=%v, location=(%s, %s)\n", name, phones, lat, lon)
    }
}

Output:

Alice: phones=[555-1234 555-5678], location=(34.0522, -118.2437)
Bob: phones=[555-9999], location=(40.7128, -74.0060)

Writing CSV++ Data
package main

import (
    "bytes"
    "fmt"

    "github.com/osamingo/go-csvpp"
)

func main() {
    var buf bytes.Buffer
    writer := csvpp.NewWriter(&buf)

    headers := []*csvpp.ColumnHeader{
        {Name: "name", Kind: csvpp.SimpleField},
        {Name: "tags", Kind: csvpp.ArrayField, ArrayDelimiter: '~'},
    }
    writer.SetHeaders(headers)

    if err := writer.WriteHeader(); err != nil {
        panic(err)
    }
    if err := writer.Write([]*csvpp.Field{
        {Value: "Alice"},
        {Values: []string{"go", "rust", "python"}},
    }); err != nil {
        panic(err)
    }
    writer.Flush()
    if err := writer.Error(); err != nil {
        panic(err)
    }

    fmt.Print(buf.String())
}

Output:

name,tags[]
Alice,go~rust~python

Struct Mapping
package main

import (
    "fmt"
    "strings"

    "github.com/osamingo/go-csvpp"
)

type Person struct {
    Name   string   `csvpp:"name"`
    Phones []string `csvpp:"phone[]"`
    Geo    struct {
        Lat string
        Lon string
    } `csvpp:"geo(lat^lon)"`
}

func main() {
    input := `name,phone[],geo(lat^lon)
Alice,555-1234~555-5678,34.0522^-118.2437
`

    var people []Person
    if err := csvpp.Unmarshal(strings.NewReader(input), &people); err != nil {
        panic(err)
    }

    for _, p := range people {
        fmt.Printf("%s: phones=%v, geo=(%s, %s)\n",
            p.Name, p.Phones, p.Geo.Lat, p.Geo.Lon)
    }
}

Output:

Alice: phones=[555-1234 555-5678], geo=(34.0522, -118.2437)

Field Types

CSV++ supports four field types in headers:

Type             Header Syntax      Example Data        Description
Simple           name               Alice               Plain text value
Array            tags[]             go~rust~python      Multiple values with delimiter
Structured       geo(lat^lon)       34.05^-118.24       Named components
ArrayStructured  addr[](city^zip)   LA^90210~NY^10001   Array of structures

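The table above gives no worked example of the ArrayStructured type, so as a rough illustration: a cell such as LA^90210~NY^10001 decodes by splitting on the array delimiter first, then splitting each element on the component delimiter. A self-contained stdlib sketch of that decomposition (illustrative only, not the package's parser):

```go
package main

import (
	"fmt"
	"strings"
)

// decodeArrayStruct splits an ArrayStructured cell into elements and
// components using the default delimiters '~' (array) and '^' (component).
func decodeArrayStruct(cell string) [][]string {
	var out [][]string
	for _, elem := range strings.Split(cell, "~") {
		out = append(out, strings.Split(elem, "^"))
	}
	return out
}

func main() {
	// addr[](city^zip) with cell data "LA^90210~NY^10001"
	for _, rec := range decodeArrayStruct("LA^90210~NY^10001") {
		fmt.Printf("city=%s zip=%s\n", rec[0], rec[1])
	}
}
```

This prints city=LA zip=90210 and city=NY zip=10001; the real Reader additionally handles per-header custom delimiters and RFC 4180 quoting.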
Default Delimiters
  • Array delimiter: ~ (tilde)
  • Component delimiter: ^ (caret)

Custom delimiters can be specified in the header:

  • phone[|] - uses | as array delimiter
  • geo;(lat;lon) - uses ; as component delimiter
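As an illustration of the bracket syntax (a sketch of the grammar, not the package's actual header parser), an array-field token like phone[|] can be picked apart with plain string operations:

```go
package main

import (
	"fmt"
	"strings"
)

// splitArrayHeader separates a well-formed array-field token such as
// "phone[|]" into its name and array delimiter, falling back to the
// default '~' when the brackets are empty, as in "tags[]".
func splitArrayHeader(tok string) (string, rune) {
	open := strings.IndexByte(tok, '[')
	end := strings.IndexByte(tok, ']')
	delim := '~' // IETF default array delimiter
	if inner := tok[open+1 : end]; inner != "" {
		delim = []rune(inner)[0]
	}
	return tok[:open], delim
}

func main() {
	for _, tok := range []string{"tags[]", "phone[|]"} {
		name, delim := splitArrayHeader(tok)
		fmt.Printf("%s -> array delimiter %q\n", name, delim)
	}
}
```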

Delimiter Progression

For nested structures, the IETF specification recommends:

Level Delimiter
1 (arrays) ~
2 (components) ^
3 ;
4 :

API Reference

Reader
reader := csvpp.NewReader(r) // r is io.Reader

// Configuration (same as encoding/csv)
reader.Comma = ','           // Field delimiter
reader.Comment = '#'         // Comment character
reader.LazyQuotes = false    // Relaxed quote handling
reader.TrimLeadingSpace = false
reader.MaxNestingDepth = 10  // Nesting limit (security)

// Methods
headers, err := reader.Headers()  // Get parsed headers
record, err := reader.Read()      // Read one record
records, err := reader.ReadAll()  // Read all records

Writer
writer := csvpp.NewWriter(w) // w is io.Writer

// Configuration
writer.Comma = ','      // Field delimiter
writer.UseCRLF = false  // Use \r\n line endings

// Methods
writer.SetHeaders(headers)  // Set column headers
writer.WriteHeader()        // Write header row
writer.Write(record)        // Write one record
writer.WriteAll(records)    // Write all records
writer.Flush()              // Flush buffer

Marshal/Unmarshal
// Unmarshal CSV++ data into structs
var people []Person
err := csvpp.Unmarshal(reader, &people)

// Marshal structs to CSV++ data
err = csvpp.Marshal(writer, people)

Struct Tags

Use csvpp struct tags to map fields:

type Record struct {
    Name     string   `csvpp:"name"`           // Simple field
    Tags     []string `csvpp:"tags[]"`         // Array field
    Location struct {                          // Structured field
        Lat string
        Lon string
    } `csvpp:"geo(lat^lon)"`
    Addresses []Address `csvpp:"addr[](street^city)"` // Array structured
}

Compatibility

This package wraps encoding/csv and inherits:

  • Full RFC 4180 compliance
  • Quoted field handling
  • Configurable field/line delimiters
  • Comment support

Security

  • MaxNestingDepth: Limits nested structure depth (default: 10) to prevent stack overflow from malicious input
  • Header names are restricted to ASCII letters, digits, "_", and "-" per the IETF ABNF (field-char)

Specification

This implementation follows the IETF CSV++ specification: https://datatracker.ietf.org/doc/draft-mscaldas-csvpp/

License

MIT License - see LICENSE for details.

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

Documentation

Overview

Package csvpp implements the IETF CSV++ specification (draft-mscaldas-csvpp-00).

CSV++ extends traditional CSV to support arrays and structured fields within cells, enabling complex data representation while maintaining CSV's simplicity. This package wraps encoding/csv and is fully compatible with RFC 4180.

Overview

CSV++ introduces four field types beyond simple text values:

  • Simple: "name" - plain text value
  • Array: "tags[]" - multiple values separated by a delimiter (default: ~)
  • Structured: "geo(lat^lon)" - named components separated by a delimiter (default: ^)
  • ArrayStructured: "addresses[](street^city)" - array of structured values

These field types are represented by the FieldKind constants: SimpleField, ArrayField, StructuredField, and ArrayStructuredField.

Basic Usage

Reading CSV++ data:

r := csvpp.NewReader(file)

// Get parsed headers
headers, err := r.Headers()
if err != nil {
    log.Fatal(err)
}

// Read records
for {
    record, err := r.Read()
    if err == io.EOF {
        break
    }
    if err != nil {
        log.Fatal(err)
    }
    // process record
}

Writing CSV++ data:

w := csvpp.NewWriter(file)
w.SetHeaders(headers)

if err := w.WriteHeader(); err != nil {
    log.Fatal(err)
}

for _, record := range records {
    if err := w.Write(record); err != nil {
        log.Fatal(err)
    }
}
w.Flush()
if err := w.Error(); err != nil {
    log.Fatal(err)
}

Struct Mapping

Use Marshal and Unmarshal for automatic struct mapping with struct tags:

type Person struct {
    Name   string   `csvpp:"name"`
    Phones []string `csvpp:"phone[]"`
    Geo    struct {
        Lat string
        Lon string
    } `csvpp:"geo(lat^lon)"`
}

// Read into structs
var people []Person
if err := csvpp.Unmarshal(file, &people); err != nil {
    log.Fatal(err)
}

// Write from structs
var buf bytes.Buffer
if err := csvpp.Marshal(&buf, people); err != nil {
    log.Fatal(err)
}

Delimiter Conventions

The IETF CSV++ specification recommends using specific delimiters for nested structures to avoid conflicts. The recommended progression is:

  • Level 1 (arrays): ~ (tilde)
  • Level 2 (components): ^ (caret)
  • Level 3: ; (semicolon)
  • Level 4: : (colon)

This package uses ~ and ^ as defaults, matching the IETF recommendation.

Compatibility with encoding/csv

This package wraps encoding/csv and inherits its RFC 4180 compliance. The Reader and Writer types expose the same configuration options:

  • Comma: field delimiter (default: ',')
  • Comment: comment character (Reader only)
  • LazyQuotes: relaxed quote handling (Reader only)
  • TrimLeadingSpace: trim leading whitespace (Reader only)
  • UseCRLF: use \r\n line endings (Writer only)

Security Considerations

The MaxNestingDepth option (default: 10) limits the depth of nested structures to prevent stack overflow attacks from maliciously crafted input.

Errors

The package defines the sentinel errors ErrNoHeader, ErrInvalidHeader, and ErrNestingTooDeep (listed under Variables below).

Parse errors are wrapped in ParseError, which provides line and column information.

Constants

Default delimiters follow the IETF recommendations: DefaultArrayDelimiter ('~') and DefaultComponentDelimiter ('^'). See Constants below.

Specification Reference

For the complete IETF CSV++ specification, see: https://datatracker.ietf.org/doc/draft-mscaldas-csvpp/

Example
input := `name,phone[],geo(lat^lon)
Alice,555-1234~555-5678,34.0522^-118.2437
Bob,555-9999,40.7128^-74.0060
`

reader := csvpp.NewReader(strings.NewReader(input))

// Get headers
headers, err := reader.Headers()
if err != nil {
	log.Fatal(err)
}

fmt.Printf("Headers: %s, %s, %s\n", headers[0].Name, headers[1].Name, headers[2].Name)

// Read all records
for {
	record, err := reader.Read()
	if err == io.EOF {
		break
	}
	if err != nil {
		log.Fatal(err)
	}

	name := record[0].Value
	phones := record[1].Values
	lat := record[2].Components[0].Value
	lon := record[2].Components[1].Value

	fmt.Printf("%s: phones=%v, location=(%s, %s)\n", name, phones, lat, lon)
}
Output:

Headers: name, phone, geo
Alice: phones=[555-1234 555-5678], location=(34.0522, -118.2437)
Bob: phones=[555-9999], location=(40.7128, -74.0060)


Constants

const (
	DefaultArrayDelimiter     = '~' // IETF Section 2.3.2: recommended for array fields
	DefaultComponentDelimiter = '^' // IETF Section 2.3.2: recommended for structured fields
)

Default delimiters as recommended in IETF CSV++ Section 2.3.2. The specification suggests delimiter progression: ~ → ^ → ; → : for nested structures.

const DefaultMaxNestingDepth = 10

DefaultMaxNestingDepth is the default maximum nesting depth. IETF Section 5 (Security Considerations) recommends limiting nesting depth to prevent stack overflow attacks from maliciously crafted input.

Variables

var (
	ErrNoHeader       = errors.New("csvpp: header record is required")
	ErrInvalidHeader  = errors.New("csvpp: invalid column header format")
	ErrNestingTooDeep = errors.New("csvpp: nesting level exceeds limit")
)

Error definitions.

Functions

func Marshal

func Marshal(w io.Writer, src any) error

Marshal encodes a slice of structs to CSV++ data.

Example
people := []Person{
	{Name: "Alice", Phones: []string{"555-1234", "555-5678"}},
	{Name: "Bob", Phones: []string{"555-9999"}},
}

var buf bytes.Buffer
if err := csvpp.Marshal(&buf, people); err != nil {
	log.Fatal(err)
}

fmt.Print(buf.String())
Output:

name,phone[]
Alice,555-1234~555-5678
Bob,555-9999

func MarshalWriter

func MarshalWriter(w *Writer, src any) error

MarshalWriter encodes a slice of structs to a Writer.

func Unmarshal

func Unmarshal(r io.Reader, dst any) error

Unmarshal decodes CSV++ data into a slice of structs. dst must be a pointer to a slice of structs.

Example
input := `name,phone[]
Alice,555-1234~555-5678
Bob,555-9999
`

var people []Person
if err := csvpp.Unmarshal(strings.NewReader(input), &people); err != nil {
	log.Fatal(err)
}

for _, p := range people {
	fmt.Printf("%s: %v\n", p.Name, p.Phones)
}
Output:

Alice: [555-1234 555-5678]
Bob: [555-9999]

Example (Structured)
input := `name,geo(lat^lon)
Los Angeles,34.0522^-118.2437
New York,40.7128^-74.0060
`

var locations []Location
if err := csvpp.Unmarshal(strings.NewReader(input), &locations); err != nil {
	log.Fatal(err)
}

for _, loc := range locations {
	fmt.Printf("%s: (%s, %s)\n", loc.Name, loc.Geo.Lat, loc.Geo.Lon)
}
Output:

Los Angeles: (34.0522, -118.2437)
New York: (40.7128, -74.0060)

func UnmarshalReader

func UnmarshalReader(r *Reader, dst any) error

UnmarshalReader decodes from a Reader into a slice of structs.

Types

type ColumnHeader

type ColumnHeader struct {
	Name               string          // Field name (ABNF: name = 1*field-char)
	Kind               FieldKind       // Field type (IETF Section 2.2)
	ArrayDelimiter     rune            // Array delimiter (ABNF: delimiter)
	ComponentDelimiter rune            // Component delimiter (ABNF: component-delim)
	Components         []*ColumnHeader // Component list (ABNF: component-list)
}

ColumnHeader represents the declaration information for an individual field. It corresponds to the ABNF "field" rule in IETF CSV++ Section 2.2:

field = simple-field / array-field / struct-field / array-struct-field
name  = 1*field-char
field-char = ALPHA / DIGIT / "_" / "-"

type Field

type Field struct {
	Value      string   // Value for SimpleField
	Values     []string // Values for ArrayField (IETF Section 2.2.2)
	Components []*Field // Components for StructuredField/ArrayStructuredField (IETF Section 2.2.3/2.2.4)
}

Field represents a parsed field value from a data row. The populated fields depend on the corresponding ColumnHeader.Kind:

  • SimpleField: Value is set
  • ArrayField: Values is set
  • StructuredField: Components is set (each component is a Field)
  • ArrayStructuredField: Components is set (each is a Field with its own Components)
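To make the ArrayStructured case concrete, here is how a cell like LA^90210~NY^10001 for addr[](city^zip) would populate the tree, using a local struct with the same shape as Field (illustrative, not the package type):

```go
package main

import "fmt"

// field mirrors the documented shape of csvpp.Field.
type field struct {
	Value      string
	Values     []string
	Components []*field
}

// newStructElem builds one array element holding one component per value.
func newStructElem(values ...string) *field {
	e := &field{}
	for _, v := range values {
		e.Components = append(e.Components, &field{Value: v})
	}
	return e
}

func main() {
	// One outer element per array entry; each element holds one
	// component field per named part (city, zip).
	addr := &field{Components: []*field{
		newStructElem("LA", "90210"),
		newStructElem("NY", "10001"),
	}}
	for i, elem := range addr.Components {
		fmt.Printf("elem %d: city=%s zip=%s\n", i, elem.Components[0].Value, elem.Components[1].Value)
	}
}
```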

type FieldKind

type FieldKind int

FieldKind represents the type of field as defined in IETF CSV++ Section 2.2. See: https://datatracker.ietf.org/doc/draft-mscaldas-csvpp/

const (
	SimpleField          FieldKind = iota // IETF Section 2.2.1: simple-field = name
	ArrayField                            // IETF Section 2.2.2: array-field = name "[" [delimiter] "]"
	StructuredField                       // IETF Section 2.2.3: struct-field = name [component-delim] "(" component-list ")"
	ArrayStructuredField                  // IETF Section 2.2.4: array-struct-field = name "[" [delimiter] "]" [component-delim] "(" component-list ")"
)

func (FieldKind) String

func (k FieldKind) String() string

String returns the string representation of FieldKind.

type ParseError

type ParseError struct {
	Line   int    // Line number where the error occurred (1-based)
	Column int    // Column number where the error occurred (1-based)
	Field  string // Field name (if available)
	Err    error  // Original error
}

ParseError holds detailed information about an error that occurred during parsing.

func (*ParseError) Error

func (e *ParseError) Error() string

Error returns the error message for ParseError.

func (*ParseError) Unwrap

func (e *ParseError) Unwrap() error

Unwrap returns the original error.

type Reader

type Reader struct {
	// Comma is the field delimiter (default: ',').
	Comma rune
	// Comment is the comment character (disabled if 0).
	Comment rune
	// LazyQuotes relaxes strict quote checking if true.
	LazyQuotes bool
	// TrimLeadingSpace trims leading whitespace from fields if true.
	TrimLeadingSpace bool
	// MaxNestingDepth is the maximum nesting depth for structured fields (default: 10).
	// This limit prevents stack overflow from deeply nested input (IETF Section 5).
	// If 0, DefaultMaxNestingDepth is used.
	MaxNestingDepth int
	// contains filtered or unexported fields
}

Reader reads CSV++ files according to the IETF CSV++ specification. It wraps encoding/csv.Reader and provides CSV++ header parsing and field parsing. The first row is always treated as the header row (IETF Section 2.1).

func NewReader

func NewReader(r io.Reader) *Reader

NewReader creates a new Reader.

Example (CustomDelimiter)
// Using semicolon as field delimiter (common in European locales)
input := `name;age
Alice;30
Bob;25
`

reader := csvpp.NewReader(strings.NewReader(input))
reader.Comma = ';'

records, err := reader.ReadAll()
if err != nil {
	log.Fatal(err)
}

for _, record := range records {
	fmt.Printf("%s is %s\n", record[0].Value, record[1].Value)
}
Output:

Alice is 30
Bob is 25

func (*Reader) Headers

func (r *Reader) Headers() ([]*ColumnHeader, error)

Headers returns the parsed header information. If headers have not been parsed yet, the first row is read and parsed.

Example
input := `id,name,tags[],address(street^city^zip)
1,Alice,go~rust,123 Main^LA^90210
`

reader := csvpp.NewReader(strings.NewReader(input))
headers, err := reader.Headers()
if err != nil {
	log.Fatal(err)
}

for _, h := range headers {
	fmt.Printf("%s: %s\n", h.Name, h.Kind)
}
Output:

id: SimpleField
name: SimpleField
tags: ArrayField
address: StructuredField

func (*Reader) Read

func (r *Reader) Read() ([]*Field, error)

Read reads and returns one record's worth of fields. The header row is automatically parsed on the first call. Returns io.EOF when the end of file is reached.

Example
input := `name,scores[]
Alice,100~95~88
Bob,77~82
`

reader := csvpp.NewReader(strings.NewReader(input))

for {
	record, err := reader.Read()
	if err == io.EOF {
		break
	}
	if err != nil {
		log.Fatal(err)
	}

	fmt.Printf("%s: %v\n", record[0].Value, record[1].Values)
}
Output:

Alice: [100 95 88]
Bob: [77 82]

func (*Reader) ReadAll

func (r *Reader) ReadAll() ([][]*Field, error)

ReadAll reads and returns all records. The header row is automatically parsed on the first call.

Example
input := `name,age
Alice,30
Bob,25
Charlie,35
`

reader := csvpp.NewReader(strings.NewReader(input))
records, err := reader.ReadAll()
if err != nil {
	log.Fatal(err)
}

fmt.Printf("Read %d records\n", len(records))
for _, record := range records {
	fmt.Printf("%s is %s years old\n", record[0].Value, record[1].Value)
}
Output:

Read 3 records
Alice is 30 years old
Bob is 25 years old
Charlie is 35 years old

type Writer

type Writer struct {
	// Comma is the field delimiter (default: ',').
	Comma rune
	// UseCRLF uses \r\n as the line terminator if true.
	UseCRLF bool
	// contains filtered or unexported fields
}

Writer writes CSV++ files according to the IETF CSV++ specification. It wraps encoding/csv.Writer and serializes CSV++ fields using the delimiters defined in the headers. The output is RFC 4180 compliant.

Example
var buf bytes.Buffer
writer := csvpp.NewWriter(&buf)

headers := []*csvpp.ColumnHeader{
	{Name: "name", Kind: csvpp.SimpleField},
	{Name: "tags", Kind: csvpp.ArrayField, ArrayDelimiter: '~'},
}
writer.SetHeaders(headers)

if err := writer.WriteHeader(); err != nil {
	log.Fatal(err)
}

records := [][]*csvpp.Field{
	{{Value: "Alice"}, {Values: []string{"go", "rust"}}},
	{{Value: "Bob"}, {Values: []string{"python"}}},
}

for _, record := range records {
	if err := writer.Write(record); err != nil {
		log.Fatal(err)
	}
}
writer.Flush()

fmt.Print(buf.String())
Output:

name,tags[]
Alice,go~rust
Bob,python

func NewWriter

func NewWriter(w io.Writer) *Writer

NewWriter creates a new Writer.

func (*Writer) Error

func (w *Writer) Error() error

Error returns any error that occurred during writing.

func (*Writer) Flush

func (w *Writer) Flush()

Flush flushes the buffer.

func (*Writer) SetHeaders

func (w *Writer) SetHeaders(headers []*ColumnHeader)

SetHeaders sets the header information. This must be called before WriteHeader or Write.

func (*Writer) Write

func (w *Writer) Write(record []*Field) error

Write writes one record's worth of fields.

func (*Writer) WriteAll

func (w *Writer) WriteAll(records [][]*Field) error

WriteAll writes all records. The header row is also written automatically.

Example
var buf bytes.Buffer
writer := csvpp.NewWriter(&buf)

headers := []*csvpp.ColumnHeader{
	{Name: "name", Kind: csvpp.SimpleField},
	{Name: "score", Kind: csvpp.SimpleField},
}
writer.SetHeaders(headers)

records := [][]*csvpp.Field{
	{{Value: "Alice"}, {Value: "100"}},
	{{Value: "Bob"}, {Value: "95"}},
}

if err := writer.WriteAll(records); err != nil {
	log.Fatal(err)
}

fmt.Print(buf.String())
Output:

name,score
Alice,100
Bob,95

func (*Writer) WriteHeader

func (w *Writer) WriteHeader() error

WriteHeader writes the header row.
