psv

package module

Version: v0.3.0-alpha (latest)
Published: Jan 22, 2025
License: MIT
Imports: 1
Imported by: 0

Repository

codeberg.org/japh/psv

README ¶

PSV - Pipe Separated Values

Introduction

PSV (Pipe Separated Values) is a file format for encoding simple, tabular data in human-readable, plain text.

PSV is similar in concept to Comma-Separated Values (CSV), Tab-Separated Values (TSV) or Delimiter-Separated Values (DSV), with the distinction that additional spaces are added so that:

- all rows have the same number of columns
- all columns align vertically

e.g.:

CSV:

name,score
Alexander,3
Tim,5
Johannes,17

PSV:

| name      | score |
| --------- | ----: |
| Alexander |     3 |
| Tim       |     5 |
| Johannes  |    17 |

PSV tables are also used by Markdown, with some minor restrictions.
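The padding that turns the CSV example into the PSV example can be sketched with the Go standard library alone. This is only an illustration of the idea, not how the psv package is implemented: `csvToPSV` is a hypothetical helper, it always left-aligns, and it does not emit a ruler.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
	"unicode/utf8"
)

// csvToPSV is a hypothetical helper (not part of the psv package): it reads
// CSV text and renders it as left-aligned, pipe-separated rows.
func csvToPSV(in string) (string, error) {
	r := csv.NewReader(strings.NewReader(in))
	r.FieldsPerRecord = -1 // tolerate ragged rows; they are padded below
	rows, err := r.ReadAll()
	if err != nil {
		return "", err
	}

	// Find the widest cell of every column, counted in runes.
	var widths []int
	for _, row := range rows {
		for c, cell := range row {
			if c == len(widths) {
				widths = append(widths, 0)
			}
			if n := utf8.RuneCountInString(cell); n > widths[c] {
				widths[c] = n
			}
		}
	}

	// Emit every row with the same number of padded columns.
	var b strings.Builder
	for _, row := range rows {
		b.WriteString("|")
		for c, w := range widths {
			cell := ""
			if c < len(row) {
				cell = row[c]
			}
			pad := w - utf8.RuneCountInString(cell)
			b.WriteString(" " + cell + strings.Repeat(" ", pad) + " |")
		}
		b.WriteString("\n")
	}
	return b.String(), nil
}

func main() {
	out, err := csvToPSV("name,score\nAlexander,3\nTim,5\nJohannes,17\n")
	if err != nil {
		panic(err)
	}
	fmt.Print(out)
}
```

Counting runes rather than bytes keeps columns aligned for non-ASCII names, although wide or combining characters would still need extra care.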

Intended Use Cases

Three basic scenarios are supported:

1. Generating PSV tables from code:

doc := &psv.Document{}
doc.AppendRow([]string{"name", "score"})
doc.AppendRow([]string{"Alice", "3"})
doc.AppendRow([]string{"Bob", "2"})
output, _ := doc.MarshalText()
fmt.Println(string(output))

| name  | score |
| Alice | 3     |
| Bob   | 2     |

2. Reading data from PSV tables into code:

input := `
	| name  | score |
	| Alice | 3     |
	| Bob   | 2     |
	`

// read the input
doc := &psv.Document{}
doc.UnmarshalText([]byte(input))

// the input may contain any number of tables
// loop through each table looking for 'interesting' data
for _, table := range doc.Tables() {

	// ColumnByNameFunc provides a convenient name => column mapping
	value := table.ColumnByNameFunc()

	// DataRows() returns all rows except the first row (assumed to be a header)
	// AllRows() returns all rows including the first row.
	for _, row := range table.DataRows() {
		fmt.Printf("%s points were awarded to %q\n",
			value(row, "score"),
			value(row, "name"),
		)
	}
}

// Output:
// 3 points were awarded to "Alice"
// 2 points were awarded to "Bob"

3. Re-formatting existing PSV tables via the psv command:

% cat input.txt
| name | score
| Alice | 3
| Bob | 2
% psv < input.txt > output.txt
% cat output.txt
| name  | score |
| Alice | 3     |
| Bob   | 2     |

PSV is very closely related to comma-separated values (CSV) or delimiter-separated values (DSV), however, PSV explicitly advocates the use of additional whitespace and decoration to improve readability. PSV should be easy for humans to write and easy for humans to read.

The psv unix command line utility and golang package in this project help automate the process of (re-)formatting PSV data and also the use of PSV as a source of data in your programs.

For example, boolean logic is often presented as a table of inputs and outcomes:

| A     | B     | A and B |                     [][]string{
| ----- | ----- | ------- |                               {"A", "B", "A and B" },
| false | false | false   |                               {"false", "false", "false" },
| false | true  | false   |      <== psv ==>              {"false", "true", "false" },
| true  | false | false   |                               {"true", "false", "false" },
| true  | true  | true    |                               {"true", "true", "true" },
                                                          }
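Once a table is available as a `[][]string`, it can drive ordinary Go code. The sketch below starts from the slice form shown above and checks every data row against Go's own `&&` operator. `checkAnd` is a hypothetical helper written for this illustration; it does not use the psv package at all.

```go
package main

import (
	"fmt"
	"strconv"
)

// checkAnd verifies that the third cell of every data row equals the logical
// AND of the first two cells. rows[0] is treated as the header and skipped.
// It returns the index of the first offending row, or -1 if all rows agree.
func checkAnd(rows [][]string) (int, error) {
	for i := 1; i < len(rows); i++ {
		row := rows[i]
		if len(row) < 3 {
			return i, fmt.Errorf("row %d has only %d cells", i, len(row))
		}
		var vals [3]bool
		for c := range vals {
			v, err := strconv.ParseBool(row[c])
			if err != nil {
				return i, err
			}
			vals[c] = v
		}
		if (vals[0] && vals[1]) != vals[2] {
			return i, nil
		}
	}
	return -1, nil
}

func main() {
	rows := [][]string{
		{"A", "B", "A and B"},
		{"false", "false", "false"},
		{"false", "true", "false"},
		{"true", "false", "false"},
		{"true", "true", "true"},
	}
	bad, err := checkAnd(rows)
	fmt.Println(bad, err)
}
```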

Document Structure

parsing always returns a document
all tables in a document may be aligned with each other by enabling the align_all option
a ruler after the first row of data in a table is special; it can:
- specify left, right, center or numeric data alignment per column
- specify that a column should be sorted before joining (actually, before encoding!)
all other rulers are for decoration purposes only, and any additional markers within them will be ignored

Basic Formatting Rules

PSV is encoded in UTF-8
data rows
- data rows must begin with a | (ASCII 0x7c, Unicode U+007c)
- new columns are introduced by further | characters (one | per column)
  - a trailing | at the end of a data row is optional
  - empty columns at the end of a line are always truncated
- empty columns inside a table may be removed by enabling the squash-empty option
- UTF-8 whitespace surrounding |s is ignored
- any other UTF-8 characters are considered data
  - whitespace within data is retained verbatim
  - whitespace and | can be included as data by preceding them with a \ (ASCII 0x5c, Unicode U+005c)
- \n (ASCII 0x0a, Unicode U+000a) separates data rows
  - \r (ASCII 0x0d, Unicode U+000d) is included as whitespace, and is thus ignored
  - a trailing \n at the end of a file is not required
any text lines which do not begin with a | are retained verbatim, but are not part of a PSV table

These rules are enough to produce simple PSV tables. Horizontal rulers are also available; however, they are "somewhat more complicated" and are thus explained in ruler formatting or psv_format.md.

Introductory examples

Creating PSV Tables Manually

To write a PSV table, simply start a line with a | and some text. Don't worry about spacing or indentation, the psv tool will fix that in a minute. For example, the following, deliberately sloppily entered table:

    |A| B     |     A   and B
| -
  | false | false | false
|false| true        | false ||||||
  |true       |       false | false
    |true   | true  | true    | yay

will be turned into this:

    | A     | B     | A   and B |     |
    | ----- | ----- | --------- | --- |
    | false | false | false     |     |
    | false | true  | false     |     |
    | true  | false | false     |     |
    | true  | true  | true      | yay |

with a single call to psv (in this case, the vim [^1] command: vip!psv [^2]).

Some things of note:

all table rows are indented to align with the first row
all rows have been trimmed to the same number of columns
all columns are vertically aligned
a trailing | is always included on every data row
the horizontal ruler has been resized to match the width of each column
the contents of the table have not changed
- e.g. the extra spacing between A and B was retained

(see ruler formatting)
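The "same number of columns" step from the list above amounts to padding short rows with empty cells. A minimal sketch follows, assuming the rows have already been split into cells and trailing empty cells have been dropped; `normalise` is a hypothetical name, not a function of the psv package.

```go
package main

import "fmt"

// normalise pads every row with empty cells so that all rows are as long as
// the longest row. The input rows are left untouched.
func normalise(rows [][]string) [][]string {
	width := 0
	for _, row := range rows {
		if len(row) > width {
			width = len(row)
		}
	}
	out := make([][]string, len(rows))
	for i, row := range rows {
		padded := make([]string, width) // new cells default to ""
		copy(padded, row)
		out[i] = padded
	}
	return out
}

func main() {
	rows := [][]string{
		{"A", "B", "A   and B"},
		{"false", "false", "false"},
		{"true", "true", "true", "yay"},
	}
	for _, row := range normalise(rows) {
		fmt.Printf("%d cells: %q\n", len(row), row)
	}
}
```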

[^1]: You don't have to use vim! psv can be used from any editor or shell script that lets you pipe text through shell commands.

[^2]: which translates to:
- v start a visual selection ...
- i select everything in ...
- p the current paragraph
- !psv and replace the current selection with whatever psv makes of it

Using psv Tables Programmatically

psv tables can also help improve the readability of test data.

Here is an example of an actual test suite (containing 14 individual unit tests) from psv's own unit testing code (sort_test.go):

func TestSingleSectionSorting(t *testing.T) {

    testTable := psv.TableFromString(`
        | 0 | b | 3  | partial
        | 1 | D
        | 2 | E | 5
        | 3 | a | 4  | unequal
        | 4 | c | 20
        | 5 | C | 10 | row | lengths
        | 6 | e | 5
        | 7 | d | 7
        `)

    testCases := sortingTestCasesFromTable(`
	| name                         | sort  | columns | exp-col | exp-rows        |
	| ---------------------------- | ----- | ------- | ------- | --------------- |
	| no sort                      | false |         |         | 0 1 2 3 4 5 6 7 |
	| default sort                 |       |         |         | 0 1 2 3 4 5 6 7 |
	| sort only when asked to      | false | 2       |         | 0 1 2 3 4 5 6 7 |
	| reverse default sort         |       | ~       |         | 7 6 5 4 3 2 1 0 |
	| reverse reverse default sort |       | ~~      |         | 0 1 2 3 4 5 6 7 |
	| indexed column sort          |       | 2       |         | 3 0 4 5 7 1 6 2 |
	| indexed column sort          |       | 2       | 2       | a b c C d D e E |
	| reverse column sort          |       | ~2      |         | 2 6 1 7 5 4 0 3 |
	| third column sort            |       | 3       |         | 1 5 4 0 3 2 6 7 |
	| numeric sort                 |       | #3      |         | 1 0 3 2 6 7 5 4 |
	| reverse numeric sort         |       | ~#3     |         | 4 5 7 6 2 3 0 1 |
	| numeric reverse sort         |       | #~3     |         | 4 5 7 6 2 3 0 1 |
	| reverse reverse column sort  |       | ~ #~3   |         | 1 0 3 2 6 7 5 4 |
	| partial column sort          |       | 4 2     |         | 4 7 1 6 2 0 5 3 |
	| non-existent column sort     |       | 9       |         | 0 1 2 3 4 5 6 7 |
	`)

    runSortingTestCases(t, testTable.AllRows(), testCases.DataRows())
}

In the example above, two tables are defined:

testTable is the reference table to be tested
- it simply contains a few rows of data, in various forms suitable for testing some features of psv
- testTable.AllRows() is used to get a [][]string containing all of the rows in the table.
testCases then defines a series of individual unit tests to be run on testTable
- the first row (|name|...) is used as a header for the table
  - psv always refers to columns by the value in their first row
    - but the first row is treated the same as all other rows
  - testCases.DataRows() is used to get all of the rows except the first row
  - the second row in the table is a ruler
    - rulers are decorative in nature and may be used to influence column alignment and sorting preferences, but they do not appear in the [][]string array of data!

Detailed Description

psv reads, formats and writes simple tables of data in text files.

In doing so, psv focuses on human readability and ease of use, rather than trying to provide a lossless, ubiquitous, machine-readable data transfer format.

The same could be said of markdown, and indeed, psv can be used to generate github-style markdown tables that look nice in their markdown source code, and not just after they have been converted to HTML by the markdown renderer.

Another intended use case is data tables in Gherkin files, which are a central component of Behaviour Driven Development (BDD).

However, the real reason for creating psv was to be able to use text tables as the source of data for running automated tests. Hence the go package.

Main Features

normalisation of rows and columns, so that every row has the same number of cells
automatic table indentation and column alignment
the ability to automatically draw horizontal separation lines, called rulers
the ability to re-format existing tables, while leaving lines which "do not look like table rows" unchanged
a simple way to read data from tables into go programs via the psv go package
the (limited) ability to sort table data
- without interfering with the rest of the table's formatting
and more ...

Not Supported

psv is not intended to replace spreadsheets etc 😄

Among a myriad of other non-features, the following are definitely not supported by psv:

the inclusion of | characters in a cell's data
multi-line cell data
any kind of cell merging or splitting
sorting of complex data formats, including:
- date and/or timestamps (unless they are in ISO-8601 format, which sorts nicely)
- signed numbers (+ and - signs confuse go's collators 😦)
- floating point numbers
- scientific notation
- hexadecimal notation
...
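The remark about ISO-8601 can be demonstrated with a plain string sort. The sketch uses `sort.Strings`, which compares byte-wise; that is an assumption made for illustration and not necessarily the collation psv itself applies. `sortedCopy` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedCopy returns a byte-wise sorted copy of in, leaving in unchanged.
func sortedCopy(in []string) []string {
	out := append([]string(nil), in...)
	sort.Strings(out)
	return out
}

func main() {
	// ISO-8601 dates: textual order equals chronological order.
	fmt.Println(sortedCopy([]string{"2025-01-22", "2024-12-31", "2025-01-03"}))

	// Signed numbers: textual order is not numeric order.
	fmt.Println(sortedCopy([]string{"10", "-5", "+3", "2"}))
}
```

The second call shows why signed numbers are listed as unsupported: a text comparison places "+3" before "-5" and "10" before "2".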

Design Principles

self contained
- psv is a single go binary with no external dependencies
- the psv go package is a single package, also with no external dependencies other than go's standard packages
  - exception: I do include another package of mine to provide simplified testing with meaningful success and error messages.
- all psv actions occur locally (no network access required)
non-destructive
- if psv doesn't know how to interpret a line of text, the text remains unchanged
  - only data rows (lines beginning with a |) and rulers are re-formatted, all other lines remain unchanged
idempotent
- any table generated by psv can also be read by psv
- running a formatted table through psv again must not change the table in any way
ease of use
- normal use should not require any configuration or additional parameters
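The idempotence principle lends itself to a simple property check: formatting the output of a formatter must give the same output again. The sketch below is generic and uses a toy formatter as a stand-in, since wiring in the real psv formatter is outside the scope of this example. Both `idempotent` and `collapse` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// idempotent reports whether applying format a second time leaves the
// result of the first application unchanged.
func idempotent(format func(string) string, in string) bool {
	once := format(in)
	return format(once) == once
}

// collapse is a toy formatter: it squeezes runs of whitespace into one space.
func collapse(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

func main() {
	fmt.Println(idempotent(collapse, "  a   b \t c "))

	// A formatter that keeps appending something is not idempotent.
	grow := func(s string) string { return s + "|" }
	fmt.Println(idempotent(grow, "| a | b"))
}
```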

Markdown Support

Markdown's table format is a subset of the formatting options provided by psv

Specifically:

Markdown tables MUST begin with a row of column names
Markdown tables MUST have exactly one ruler as their second line
Markdown rulers MAY contain the alignment hints :- (left-aligned), -: (right-aligned) or :-: (centered)
Markdown tables MUST NOT have embedded rulers anywhere else
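The alignment hints can be read from a single ruler cell with a few string checks. The following is a minimal sketch of that idea; `alignment` is a hypothetical function, and the label "default" for a cell without colons is an assumption of this example.

```go
package main

import (
	"fmt"
	"strings"
)

// alignment classifies one cell of a Markdown ruler such as "---", ":--",
// "--:" or ":-:". The second result is false if the cell is not a ruler cell.
func alignment(cell string) (string, bool) {
	cell = strings.TrimSpace(cell)
	left := strings.HasPrefix(cell, ":")
	right := strings.HasSuffix(cell, ":")

	// Strip at most one colon from each end; the rest must be dashes only.
	dashes := strings.TrimSuffix(strings.TrimPrefix(cell, ":"), ":")
	if dashes == "" || strings.Trim(dashes, "-") != "" {
		return "", false
	}

	switch {
	case left && right:
		return "center", true
	case right:
		return "right", true
	case left:
		return "left", true
	}
	return "default", true
}

func main() {
	for _, cell := range []string{" --------- ", " ----: ", ":-", ":-:", "Alice"} {
		a, ok := alignment(cell)
		fmt.Printf("%q => %q %v\n", cell, a, ok)
	}
}
```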

TODO's

add ability to configure the scanner
- allow auto-indent detection
  - -I detect indent by capturing the indent before the first | encountered
- explicitly specify ruler characters (for cli)
  - default autodetect
  - explicit rulers
    - turns off autodetection
    - allows the use of + and - as data
    - options:
      - -rh '-' horizontal ruler
      - -ro '|' outer ruler
      - -ri ':' inner ruler
      - -rc '+' corners
      - -rp 'ophi'
        - o outer vertical ruler
        - p padding character
        - h horizontal ruler (default: same as padding character)
        - i inner vertical ruler (default: same as outer ruler)
Replace table.Data with table.DataRows

Documentation Links

Installation

psv consists of two components: the psv command and the psv go package.

To use the psv command, you only need the psv binary in your PATH, e.g. ~/bin/psv (see binary installation below).

If you don't want to install "a binary, downloaded from the 'net", you can download the source, (inspect it 😄), and build your own version.

Source Installation

Prerequisites

go 1.18 or later
make (optional, but recommended)

Build Steps

Clone the psv git repository and use make to build, test and install psv in your $GOBIN directory (typically $GOPATH/bin or ~/Go/bin)

git clone -o codeberg https://codeberg.org/japh/psv
cd psv
make install
psv -v

Binary Installation

Note: currently only available for darwin amd64 (64-bit Intel Macs)

download the latest psv.gz from https://codeberg.org/japh/psv/releases
verify psv.gz with gpg --verify psv.gz.asc
compare psv.gz's checksums against those provided with shasum -c psv.gz.sha256
unpack psv.gz with gunzip psv.gz
copy psv to any directory in your $PATH, or use it directly via ./psv
don't forget to check that it is executable, e.g. chmod +x psv

Now you can use the psv command...

Using The `psv` Package In Go Projects

Prerequisites

go 1.18 or later

To use psv in your go project, simply import codeberg.org/japh/psv and go mod tidy will download it, build it and make it available for your project.

See the psv package documentation for the API and code examples.

Alternatives

csv, tsv and delimiter-separated values tables (Wikipedia)
- generally, psv tables are just a single type of delimiter-separated values format
ASCII Table Writer
- go package for creating tables of almost any form
- more traditional table.SetHeader, table.SetFooter() interface
- more features (incl. colors)
- does not read tables
  - no good for defining test cases etc in code
psv-spec
- an attempt to standardize a CSV replacement using pipes as the delimiter
- focuses on electronic data transfers
- does not provide a tabular layout
- escaping just |, \, \n and \r is nice
  - but does not allow for whitespace quoting
  - future: | " " | could be used by psv to represent a space

References

"There's no such thing as 'plain text'"

Copyright

Documentation ¶

Overview ¶

psv provides methods for handling tables of Pipe-Separated-Values (PSV)

Three basic use cases are supported:

1. Generating PSV tables from code:

doc := &psv.Document{}
doc.AppendRow([]string{"name","score"})
doc.AppendRow([]string{"Alice","3"})
doc.AppendRow([]string{"Bob","2"})
output, _ := doc.MarshalText()
fmt.Println(string(output))

| name  | score |
| Alice | 3     |
| Bob   | 2     |

2. Reading data from PSV tables into code:

input := `
	| name  | score |
	| Alice | 3     |
	| Bob   | 2     |
	`

// read the input
doc := &psv.Document{}
doc.UnmarshalText([]byte(input))

// the input may contain any number of tables
// loop through each table looking for 'interesting' data
for _, table := range doc.Tables() {

	// ColumnByNameFunc provides a convenient name => column mapping
	value := table.ColumnByNameFunc()

	// DataRows() returns all rows except the first row (assumed to be a header)
	// AllRows() returns all rows including the first row.
	for _, row := range table.DataRows() {
		fmt.Printf("%s points were awarded to %q\n",
			value(row,"score"),
			value(row,"name"),
		)
	}
}

// Output:
// 3 points were awarded to "Alice"
// 2 points were awarded to "Bob"

3. Re-formatting existing PSV tables via the `psv` command:

% cat input.txt
| name | score
| Alice | 3
| Bob | 2
% psv < input.txt > output.txt
% cat output.txt
| name  | score |
| Alice | 3     |
| Bob   | 2     |

Usage ¶

Document is the main aggregate for building or accessing PSV data. The Document type fulfills the encoding.TextMarshaler and encoding.TextUnmarshaler interfaces for conversion to and from the document's text form.

Documents are built incrementally via Append methods and may be read as a slice of rows. The ability to edit data or randomly access data is not provided.

Internally, a Document may contain any number of Table objects which can be accessed via the Document.Tables method.

Each table then has its own set of column names, prefix etc.

Ruler objects may be used to add separation lines to a table and may be placed anywhere within a table.

The [Markdown] formatter, however, will ignore all but the ruler that appears directly after the first row of data in the table, thus conforming to markdown's requirements.

e.g.

+---------+-------+
| name    | score |
| ------- | ----- |
| Alice   | 25    |
| Bob     | 17    |
| Charlie | 10    |
| ------- - ----- |
| Dave    | 9     |
+---------+-------+

When re-formatted for Markdown would become:

| name    | score |
| ------- | ----- |
| Alice   | 25    |
| Bob     | 17    |
| Charlie | 10    |
| Dave    | 9     |

Index ¶

type Document
type DocumentItem
type MarshalOption
type Prefix
- func (p *Prefix) SplitLine(line string) (*Prefix, string)
type Ruler
type Table
type Text

Examples ¶

Prefix
Table.AllRows
Table.ColumnNames
Table.DataRows

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

This section is empty.

Types ¶

type Document ¶

type Document struct {
	*Prefix // prefix to apply to all tables within the document
	// contains filtered or unexported fields
}

Document represents a single text file which may contain any number of tables.

func (*Document) AppendRow ¶

func (doc *Document) AppendRow(row []string)

AppendRow appends a row to the end of the current table.

A new table will be added to the document if necessary.

func (*Document) AppendRuler ¶

func (doc *Document) AppendRuler(ruler *Ruler)

AppendRuler appends a ruler to the end of the current table.

A new table will be added to the document if necessary.

func (*Document) AppendText ¶

func (doc *Document) AppendText(line string)

AppendText appends a line of text to the end of the document.

Text lines are in no way modified and are reproduced verbatim.

Text lines separate multiple tables within a document.

func (*Document) MarshalText ¶

func (doc *Document) MarshalText() (text []byte, err error)

MarshalText returns a text representation of a document, which can be parsed by Document.UnmarshalText

Each table will be re-aligned according to the rules defined via Document.SetMarshalOptions

func (*Document) SetMarshalOptions ¶

func (doc *Document) SetMarshalOptions(opts ...MarshalOption)

TODO: 2025-01-22 define how marshaling / unmarshaling should be configured

func (*Document) SetTablePrefixOnce ¶

func (doc *Document) SetTablePrefixOnce(prefix *Prefix)

SetTablePrefixOnce sets the prefix of the current table if it has not already been set. This is intended for use when parsing tables line-by-line, in which case the prefix of the first row should be used for all rows of the table.

A new table will be added to the document if necessary.

func (*Document) Tables ¶

func (doc *Document) Tables() []*Table

Tables returns the slice of tables within the document

func (*Document) UnmarshalText ¶

func (doc *Document) UnmarshalText(text []byte) error

UnmarshalText parses PSV text into the receiving Document.

type DocumentItem ¶

type DocumentItem struct {
	*Text
	*Table
}

DocumentItem represents a single block of text or a table within a document.

If both Text and Table are not nil, then the Text is positioned before the table.

type MarshalOption ¶

type MarshalOption struct{}

TODO: 2025-01-22 define how marshaling / unmarshaling should be configured

type Prefix ¶

type Prefix struct {
	Pattern string
	// contains filtered or unexported fields
}

Prefix can be used to remove or add a prefix from / to each row of a table.

This is useful for re-formatting tables which are embedded in e.g. code comments.

Example ¶

package main

import (
	"fmt"

	"codeberg.org/japh/psv"
)

func main() {
	text := `
        // verified scores
        | name  | score |
        | Alice | 12    |


        // unverified scores:
        // | name | score |
        // | Adam | 6     |
    `

	doc := &psv.Document{
		Prefix: &psv.Prefix{Pattern: `//`},
	}

	// TODO(Steve): 2025-01-22 UnmarshalText not yet implemented
	doc.UnmarshalText([]byte(text))

	for tn, tbl := range doc.Tables() {
		fmt.Printf("%d: table with indent %q\n", tn, tbl.Prefix.Pattern)
	}

	// TODO: 2025-01-22 Output:
	// 0: table with indent "        "
	// 1: table with indent "     // "
}

func (*Prefix) SplitLine ¶

func (p *Prefix) SplitLine(line string) (*Prefix, string)

SplitLine attempts to match a line with a specific prefix.

If the prefix does not match, then the prefix returned will be nil and the line will be returned as-is.
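The matching rules of SplitLine are not spelled out here, so the following is only a sketch of the general idea using a standalone function. `splitPrefix` is hypothetical and does not mirror the package's signature. It assumes that a prefix is everything before the first | and that it matches when, ignoring surrounding whitespace, it equals the pattern.

```go
package main

import (
	"fmt"
	"strings"
)

// splitPrefix looks for pattern (e.g. "//") in front of the first | of a
// line. On a match it returns the text before the | as the prefix and the
// remainder as the row. Otherwise the line is returned as-is with ok=false.
func splitPrefix(pattern, line string) (prefix, rest string, ok bool) {
	bar := strings.Index(line, "|")
	if bar < 0 {
		return "", line, false
	}
	lead := line[:bar]
	if strings.TrimSpace(lead) != pattern {
		return "", line, false
	}
	return lead, line[bar:], true
}

func main() {
	for _, line := range []string{
		"        // | name | score |",
		"        | name  | score |",
		"        // verified scores",
	} {
		prefix, rest, ok := splitPrefix("//", line)
		fmt.Printf("%q %q %v\n", prefix, rest, ok)
	}
}
```

Returning the complete leading text, indentation included, allows a formatter to put the same prefix back in front of every re-aligned row.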

type Ruler ¶

type Ruler struct {
	Line int
}

Ruler represents a horizontal separator line within a table.

Rulers grow and shrink depending on the width of the data in the column.

Rulers may also include formatting hints for each column individually, e.g. whether the column's data should be aligned to the left, right or center, and whether or not the rows should be sorted by the data in a column.

type Table ¶

type Table struct {
	*Prefix
	// contains filtered or unexported fields
}

Table represents a single table of data.

func (*Table) AllRows ¶

func (tbl *Table) AllRows() [][]string

Example ¶

package main

import (
	"fmt"

	"codeberg.org/japh/psv"
)

func main() {
	tbl := &psv.Table{}
	tbl.AppendRow([]string{"name", "score"})
	tbl.AppendRow([]string{"Adam", "6"})

	for r, row := range tbl.AllRows() {
		fmt.Printf("%d: %v\n", r, row)
	}

}

Output:

0: [name score]
1: [Adam 6]

func (*Table) AppendRow ¶

func (tbl *Table) AppendRow(row []string)

func (*Table) AppendRuler ¶ added in v0.1.1

func (tbl *Table) AppendRuler(ruler *Ruler)

func (*Table) ColumnByNameFunc ¶

func (tbl *Table) ColumnByNameFunc() func([]string, string) string

ColumnByNameFunc returns a function which returns a column's value from a row of data, indexed by the column name.
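A function with this shape can be built as a closure over a name-to-index map. The sketch below is a stand-in written for illustration, not the package's implementation. `columnByName` is hypothetical, and returning an empty string for unknown names or short rows is an assumption of this example.

```go
package main

import "fmt"

// columnByName builds a lookup function from a header row. If a column name
// occurs more than once, the first occurrence wins.
func columnByName(header []string) func(row []string, name string) string {
	index := make(map[string]int, len(header))
	for i, name := range header {
		if _, seen := index[name]; !seen {
			index[name] = i
		}
	}
	return func(row []string, name string) string {
		i, ok := index[name]
		if !ok || i >= len(row) {
			return "" // unknown column, or the row is too short
		}
		return row[i]
	}
}

func main() {
	rows := [][]string{
		{"name", "score"},
		{"Alice", "3"},
		{"Bob", "2"},
	}
	value := columnByName(rows[0])
	for _, row := range rows[1:] {
		fmt.Printf("%s points were awarded to %q\n", value(row, "score"), value(row, "name"))
	}
}
```

Building the map once and reusing the closure avoids searching the header again for every cell that is read.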

func (*Table) ColumnNames ¶ added in v0.1.1

func (tbl *Table) ColumnNames() []string

Example ¶

package main

import (
	"fmt"

	"codeberg.org/japh/psv"
)

func main() {
	tbl := &psv.Table{}
	tbl.AppendRow([]string{"name", "score"})
	tbl.AppendRow([]string{"Adam", "6"})

	names := tbl.ColumnNames()
	fmt.Printf("column names: %v\n", names)

}

Output:

column names: [name score]

func (*Table) DataRows ¶ added in v0.1.1

func (tbl *Table) DataRows() [][]string

Example ¶

package main

import (
	"fmt"

	"codeberg.org/japh/psv"
)

func main() {
	tbl := &psv.Table{}
	tbl.AppendRow([]string{"name", "score"})
	tbl.AppendRow([]string{"Adam", "6"})

	for r, row := range tbl.DataRows() {
		fmt.Printf("%d: %v\n", r, row)
	}

}

Output:

0: [Adam 6]

func (*Table) SetPrefixOnce ¶

func (tbl *Table) SetPrefixOnce(prefix *Prefix)

SetPrefixOnce sets the prefix for a table.

type Text ¶

type Text struct {
	Lines []string
}

Text is a collection of lines which appear between tables within a document.

Text lines are never modified in any way and are reproduced verbatim when marshaling a document.

Directories ¶

Path	Synopsis
cmd/psv	psv command
