Documentation
Overview ¶
Package csvpp implements the IETF CSV++ specification (draft-mscaldas-csvpp-00).
CSV++ extends traditional CSV to support arrays and structured fields within cells, enabling complex data representation while maintaining CSV's simplicity. This package wraps encoding/csv and is fully compatible with RFC 4180.
CSV++ introduces four field types beyond simple text values:
- Simple: "name" - plain text value
- Array: "tags[]" - multiple values separated by a delimiter (default: ~)
- Structured: "geo(lat^lon)" - named components separated by a delimiter (default: ^)
- ArrayStructured: "addresses[](street^city)" - array of structured values
These field types are represented by the FieldKind constants: SimpleField, ArrayField, StructuredField, and ArrayStructuredField.
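A single header row can mix all four kinds. For example (column names are illustrative):

```
name,tags[],geo(lat^lon),addresses[](street^city)
```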
Basic Usage ¶
Reading CSV++ data:
r := csvpp.NewReader(file)
// Get parsed headers
headers, err := r.Headers()
if err != nil {
log.Fatal(err)
}
// Read records
for {
record, err := r.Read()
if err == io.EOF {
break
}
if err != nil {
log.Fatal(err)
}
// process record
}
Writing CSV++ data:
w := csvpp.NewWriter(file)
w.SetHeaders(headers)
if err := w.WriteHeader(); err != nil {
log.Fatal(err)
}
for _, record := range records {
if err := w.Write(record); err != nil {
log.Fatal(err)
}
}
w.Flush()
if err := w.Error(); err != nil {
log.Fatal(err)
}
Struct Mapping ¶
Use Marshal and Unmarshal for automatic struct mapping with struct tags:
type Person struct {
Name string `csvpp:"name"`
Phones []string `csvpp:"phone[]"`
Geo struct {
Lat string
Lon string
} `csvpp:"geo(lat^lon)"`
}
// Read into structs
var people []Person
if err := csvpp.Unmarshal(file, &people); err != nil {
log.Fatal(err)
}
// Write from structs
var buf bytes.Buffer
if err := csvpp.Marshal(&buf, people); err != nil {
log.Fatal(err)
}
Delimiter Conventions ¶
The IETF CSV++ specification recommends using specific delimiters for nested structures to avoid conflicts. The recommended progression is:
- Level 1 (arrays): ~ (tilde)
- Level 2 (components): ^ (caret)
- Level 3: ; (semicolon)
- Level 4: : (colon)
This package uses ~ and ^ as defaults, matching the IETF recommendation.
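Following the progression, a header with three levels of nesting might look like this (the column names are illustrative, not from the specification):

```
contacts[](name^phones[;])
```

Here `contacts` is an array of structured values (level 1: `~`), each with components `name` and `phones` (level 2: `^`), where `phones` is itself an array using the level-3 delimiter `;`.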
Compatibility with encoding/csv ¶
This package wraps encoding/csv and inherits its RFC 4180 compliance. The Reader and Writer types expose the same configuration options:
- Comma: field delimiter (default: ',')
- Comment: comment character (Reader only)
- LazyQuotes: relaxed quote handling (Reader only)
- TrimLeadingSpace: trim leading whitespace (Reader only)
- UseCRLF: use \r\n line endings (Writer only)
Security Considerations ¶
The MaxNestingDepth option (default: 10) limits the depth of nested structures to prevent stack overflow attacks from maliciously crafted input.
Errors ¶
The package defines the following sentinel errors:
- ErrNoHeader: returned when attempting to read without a header row
- ErrInvalidHeader: returned when header format is invalid
- ErrNestingTooDeep: returned when nesting exceeds MaxNestingDepth
Parse errors are wrapped in ParseError, which provides line/column information.
Constants ¶
Default delimiters follow IETF recommendations:
- DefaultArrayDelimiter: ~ (tilde) for array fields
- DefaultComponentDelimiter: ^ (caret) for structured fields
- DefaultMaxNestingDepth: 10 (IETF recommended limit)
Specification Reference ¶
For the complete IETF CSV++ specification, see: https://datatracker.ietf.org/doc/draft-mscaldas-csvpp/
Example ¶
input := `name,phone[],geo(lat^lon)
Alice,555-1234~555-5678,34.0522^-118.2437
Bob,555-9999,40.7128^-74.0060
`
reader := csvpp.NewReader(strings.NewReader(input))
// Get headers
headers, err := reader.Headers()
if err != nil {
log.Fatal(err)
}
fmt.Printf("Headers: %s, %s, %s\n", headers[0].Name, headers[1].Name, headers[2].Name)
// Read all records
for {
record, err := reader.Read()
if err == io.EOF {
break
}
if err != nil {
log.Fatal(err)
}
name := record[0].Value
phones := record[1].Values
lat := record[2].Components[0].Value
lon := record[2].Components[1].Value
fmt.Printf("%s: phones=%v, location=(%s, %s)\n", name, phones, lat, lon)
}
Output:

Headers: name, phone, geo
Alice: phones=[555-1234 555-5678], location=(34.0522, -118.2437)
Bob: phones=[555-9999], location=(40.7128, -74.0060)
Constants ¶
const (
	DefaultArrayDelimiter     = '~' // IETF Section 2.3.2: recommended for array fields
	DefaultComponentDelimiter = '^' // IETF Section 2.3.2: recommended for structured fields
)
Default delimiters as recommended in IETF CSV++ Section 2.3.2. The specification suggests delimiter progression: ~ → ^ → ; → : for nested structures.
const DefaultMaxNestingDepth = 10
DefaultMaxNestingDepth is the default maximum nesting depth. IETF Section 5 (Security Considerations) recommends limiting nesting depth to prevent stack overflow attacks from maliciously crafted input.
Variables ¶
var (
	ErrNoHeader       = errors.New("csvpp: header record is required")
	ErrInvalidHeader  = errors.New("csvpp: invalid column header format")
	ErrNestingTooDeep = errors.New("csvpp: nesting level exceeds limit")
)
Error definitions.
Functions ¶
func Marshal ¶
Marshal encodes a slice of structs to CSV++ data.
Example ¶
people := []Person{
{Name: "Alice", Phones: []string{"555-1234", "555-5678"}},
{Name: "Bob", Phones: []string{"555-9999"}},
}
var buf bytes.Buffer
if err := csvpp.Marshal(&buf, people); err != nil {
log.Fatal(err)
}
fmt.Print(buf.String())
Output:

name,phone[]
Alice,555-1234~555-5678
Bob,555-9999
func MarshalWriter ¶
MarshalWriter encodes a slice of structs to a Writer.
func Unmarshal ¶
Unmarshal decodes CSV++ data into a slice of structs. dst must be a pointer to a slice of structs.
Example ¶
input := `name,phone[]
Alice,555-1234~555-5678
Bob,555-9999
`
var people []Person
if err := csvpp.Unmarshal(strings.NewReader(input), &people); err != nil {
log.Fatal(err)
}
for _, p := range people {
fmt.Printf("%s: %v\n", p.Name, p.Phones)
}
Output:

Alice: [555-1234 555-5678]
Bob: [555-9999]
Example (Structured) ¶
input := `name,geo(lat^lon)
Los Angeles,34.0522^-118.2437
New York,40.7128^-74.0060
`
var locations []Location
if err := csvpp.Unmarshal(strings.NewReader(input), &locations); err != nil {
log.Fatal(err)
}
for _, loc := range locations {
fmt.Printf("%s: (%s, %s)\n", loc.Name, loc.Geo.Lat, loc.Geo.Lon)
}
Output:

Los Angeles: (34.0522, -118.2437)
New York: (40.7128, -74.0060)
func UnmarshalReader ¶
UnmarshalReader decodes from a Reader into a slice of structs.
Types ¶
type ColumnHeader ¶
type ColumnHeader struct {
Name string // Field name (ABNF: name = 1*field-char)
Kind FieldKind // Field type (IETF Section 2.2)
ArrayDelimiter rune // Array delimiter (ABNF: delimiter)
ComponentDelimiter rune // Component delimiter (ABNF: component-delim)
Components []*ColumnHeader // Component list (ABNF: component-list)
}
ColumnHeader represents the declaration information for an individual field. It corresponds to the ABNF "field" rule in IETF CSV++ Section 2.2:
field      = simple-field / array-field / struct-field / array-struct-field
name       = 1*field-char
field-char = ALPHA / DIGIT / "_" / "-"
type Field ¶
type Field struct {
Value string // Value for SimpleField
Values []string // Values for ArrayField (IETF Section 2.2.2)
Components []*Field // Components for StructuredField/ArrayStructuredField (IETF Section 2.2.3/2.2.4)
}
Field represents a parsed field value from a data row. The populated fields depend on the corresponding ColumnHeader.Kind:
- SimpleField: Value is set
- ArrayField: Values is set
- StructuredField: Components is set (each component is a Field)
- ArrayStructuredField: Components is set (each is a Field with its own Components)
type FieldKind ¶
type FieldKind int
FieldKind represents the type of field as defined in IETF CSV++ Section 2.2. See: https://datatracker.ietf.org/doc/draft-mscaldas-csvpp/
const (
	SimpleField          FieldKind = iota // IETF Section 2.2.1: simple-field = name
	ArrayField                            // IETF Section 2.2.2: array-field = name "[" [delimiter] "]"
	StructuredField                       // IETF Section 2.2.3: struct-field = name [component-delim] "(" component-list ")"
	ArrayStructuredField                  // IETF Section 2.2.4: array-struct-field = name "[" [delimiter] "]" [component-delim] "(" component-list ")"
)
type ParseError ¶
type ParseError struct {
Line int // Line number where the error occurred (1-based)
Column int // Column number where the error occurred (1-based)
Field string // Field name (if available)
Err error // Original error
}
ParseError holds detailed information about an error that occurred during parsing.
func (*ParseError) Error ¶
func (e *ParseError) Error() string
Error returns the error message for ParseError.
type Reader ¶
type Reader struct {
// Comma is the field delimiter (default: ',').
Comma rune
// Comment is the comment character (disabled if 0).
Comment rune
// LazyQuotes relaxes strict quote checking if true.
LazyQuotes bool
// TrimLeadingSpace trims leading whitespace from fields if true.
TrimLeadingSpace bool
// MaxNestingDepth is the maximum nesting depth for structured fields (default: 10).
// This limit prevents stack overflow from deeply nested input (IETF Section 5).
// If 0, DefaultMaxNestingDepth is used.
MaxNestingDepth int
// contains filtered or unexported fields
}
Reader reads CSV++ files according to the IETF CSV++ specification. It wraps encoding/csv.Reader and provides CSV++ header parsing and field parsing. The first row is always treated as the header row (IETF Section 2.1).
func NewReader ¶
NewReader creates a new Reader.
Example (CustomDelimiter) ¶
// Using semicolon as field delimiter (common in European locales)
input := `name;age
Alice;30
Bob;25
`
reader := csvpp.NewReader(strings.NewReader(input))
reader.Comma = ';'
records, err := reader.ReadAll()
if err != nil {
log.Fatal(err)
}
for _, record := range records {
fmt.Printf("%s is %s\n", record[0].Value, record[1].Value)
}
Output:

Alice is 30
Bob is 25
func (*Reader) Headers ¶
func (r *Reader) Headers() ([]*ColumnHeader, error)
Headers returns the parsed header information. If headers have not been parsed yet, the first row is read and parsed.
Example ¶
input := `id,name,tags[],address(street^city^zip)
1,Alice,go~rust,123 Main^LA^90210
`
reader := csvpp.NewReader(strings.NewReader(input))
headers, err := reader.Headers()
if err != nil {
log.Fatal(err)
}
for _, h := range headers {
fmt.Printf("%s: %s\n", h.Name, h.Kind)
}
Output:

id: SimpleField
name: SimpleField
tags: ArrayField
address: StructuredField
func (*Reader) Read ¶
Read reads and returns one record's worth of fields. The header row is automatically parsed on the first call. Returns io.EOF when the end of file is reached.
Example ¶
input := `name,scores[]
Alice,100~95~88
Bob,77~82
`
reader := csvpp.NewReader(strings.NewReader(input))
for {
record, err := reader.Read()
if err == io.EOF {
break
}
if err != nil {
log.Fatal(err)
}
fmt.Printf("%s: %v\n", record[0].Value, record[1].Values)
}
Output:

Alice: [100 95 88]
Bob: [77 82]
func (*Reader) ReadAll ¶
ReadAll reads and returns all records. The header row is automatically parsed on the first call.
Example ¶
input := `name,age
Alice,30
Bob,25
Charlie,35
`
reader := csvpp.NewReader(strings.NewReader(input))
records, err := reader.ReadAll()
if err != nil {
log.Fatal(err)
}
fmt.Printf("Read %d records\n", len(records))
for _, record := range records {
fmt.Printf("%s is %s years old\n", record[0].Value, record[1].Value)
}
Output:

Read 3 records
Alice is 30 years old
Bob is 25 years old
Charlie is 35 years old
type Writer ¶
type Writer struct {
// Comma is the field delimiter (default: ',').
Comma rune
// UseCRLF uses \r\n as the line terminator if true.
UseCRLF bool
// contains filtered or unexported fields
}
Writer writes CSV++ files according to the IETF CSV++ specification. It wraps encoding/csv.Writer and serializes CSV++ fields using the delimiters defined in the headers. The output is RFC 4180 compliant.
Example ¶
var buf bytes.Buffer
writer := csvpp.NewWriter(&buf)
headers := []*csvpp.ColumnHeader{
{Name: "name", Kind: csvpp.SimpleField},
{Name: "tags", Kind: csvpp.ArrayField, ArrayDelimiter: '~'},
}
writer.SetHeaders(headers)
if err := writer.WriteHeader(); err != nil {
log.Fatal(err)
}
records := [][]*csvpp.Field{
{{Value: "Alice"}, {Values: []string{"go", "rust"}}},
{{Value: "Bob"}, {Values: []string{"python"}}},
}
for _, record := range records {
if err := writer.Write(record); err != nil {
log.Fatal(err)
}
}
writer.Flush()
fmt.Print(buf.String())
Output:

name,tags[]
Alice,go~rust
Bob,python
func (*Writer) SetHeaders ¶
func (w *Writer) SetHeaders(headers []*ColumnHeader)
SetHeaders sets the header information. This must be called before WriteHeader or Write.
func (*Writer) WriteAll ¶
WriteAll writes all records. The header row is also written automatically.
Example ¶
var buf bytes.Buffer
writer := csvpp.NewWriter(&buf)
headers := []*csvpp.ColumnHeader{
{Name: "name", Kind: csvpp.SimpleField},
{Name: "score", Kind: csvpp.SimpleField},
}
writer.SetHeaders(headers)
records := [][]*csvpp.Field{
{{Value: "Alice"}, {Value: "100"}},
{{Value: "Bob"}, {Value: "95"}},
}
if err := writer.WriteAll(records); err != nil {
log.Fatal(err)
}
fmt.Print(buf.String())
Output:

name,score
Alice,100
Bob,95
func (*Writer) WriteHeader ¶
WriteHeader writes the header row.