gobls

v1.3.0
Published: Aug 1, 2019 License: MIT Imports: 3 Imported by: 4

README

gobls

Gobls is a buffered line scanner for Go.

Description

Similar to bufio.Scanner, but wraps bufio.Reader.ReadLine so lines of arbitrary length can be scanned. It uses a hybrid approach so that in most cases, when lines are not unusually long, the fast code path is taken. When lines are unusually long, it uses the per-scanner pre-allocated byte slice to reassemble the fragments into a single slice of bytes.
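The hybrid approach described above can be sketched with the standard library alone. This is an illustration of the idea, not the package's actual source: the helper names readFullLine and lineLengths are hypothetical, and the 16-byte reader buffer is chosen only so that a 40-byte line is forced through the slow path.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readFullLine (hypothetical helper) returns the next line from br without
// its line terminator. scratch is a reusable buffer that is written to only
// when ReadLine delivers the line in more than one fragment.
func readFullLine(br *bufio.Reader, scratch []byte) (line, grown []byte, err error) {
	frag, isPrefix, err := br.ReadLine()
	if err != nil {
		return nil, scratch, err
	}
	if !isPrefix {
		// Fast path: the whole line fit in the reader's own buffer.
		// The returned slice is only valid until the next read.
		return frag, scratch, nil
	}
	// Slow path: copy each fragment into the reusable scratch buffer.
	scratch = append(scratch[:0], frag...)
	for isPrefix {
		frag, isPrefix, err = br.ReadLine()
		if err == io.EOF {
			break // input ended right after a fragment; return what we have
		}
		if err != nil {
			return nil, scratch, err
		}
		scratch = append(scratch, frag...)
	}
	return scratch, scratch, nil
}

// lineLengths (hypothetical helper) reports the length of every line in input,
// reading through a bufio.Reader whose buffer holds bufSize bytes.
func lineLengths(input string, bufSize int) []int {
	br := bufio.NewReaderSize(strings.NewReader(input), bufSize)
	var scratch []byte
	var lengths []int
	for {
		line, s, err := readFullLine(br, scratch)
		if err != nil {
			return lengths // io.EOF or a read error ends the loop
		}
		scratch = s
		lengths = append(lengths, len(line))
	}
}

func main() {
	input := "ab\n" + strings.Repeat("x", 40) + "\nz"
	fmt.Println(lineLengths(input, 16)) // prints [2 40 1]
}
```

Keeping the scratch buffer between calls means a run of long lines pays for allocation only when a line exceeds the largest one seen so far.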

Example

    var lines, characters int
    ls := gobls.NewScanner(os.Stdin)
    for ls.Scan() {
        lines++
        characters += len(ls.Bytes())
    }
    if err := ls.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "cannot scan:", err)
    }
    fmt.Println("Counted", lines, "lines and", characters, "characters.")

Performance

On my test system, gobls scanner takes from 2% to nearly 40% longer than bufio scanner, depending on the length of the lines to be scanned. The 40% longer times were only observed when line lengths were bufio.MaxScanTokenSize bytes long. Usually the performance penalty is 2% to 15% of bufio measurements.

Run go test -bench=. on your system for comparison. I'm sure the testing method could be improved. Suggestions are welcome.

I recommend using the standard library's bufio scanner unless a specific program must be able to parse lines that exceed a very large constant, bufio.MaxScanTokenSize. In that case, the additional delay due to extremely long lines may be an acceptable tradeoff compared to the errors that would be returned by bufio.Scanner.
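The error that bufio.Scanner returns for an over-long line can be reproduced with the standard library alone. The sketch below uses a default bufio.Scanner with no call to its Buffer method; the helper names scanAll and demo are hypothetical.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// scanAll (hypothetical helper) counts the lines a default bufio.Scanner
// delivers from input and returns the error that stopped it, if any.
func scanAll(input string) (int, error) {
	s := bufio.NewScanner(strings.NewReader(input))
	n := 0
	for s.Scan() {
		n++
	}
	return n, s.Err()
}

// demo (hypothetical helper) feeds the scanner one short line followed by a
// line that is one byte longer than bufio.MaxScanTokenSize.
func demo() (lines int, tooLong bool) {
	long := strings.Repeat("x", bufio.MaxScanTokenSize+1)
	n, err := scanAll("first\n" + long + "\nlast\n")
	return n, errors.Is(err, bufio.ErrTooLong)
}

func main() {
	fmt.Println(demo()) // prints 1 true
}
```

Only the first line is delivered: scanning stops at the long line with bufio.ErrTooLong, and the line after it is never reached.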

Documentation

Constants

const DefaultBufferSize = 16 * 1024

DefaultBufferSize specifies the initial size, in bytes, of the buffer that each gobls scanner allocates for reassembling line fragments.

Types

type BufferScanner added in v1.3.0

type BufferScanner struct {
	// contains filtered or unexported fields
}

BufferScanner enumerates newline-terminated strings from a provided slice of bytes faster than bufio.Scanner and gobls.Scanner. This is particularly useful when a program already has the entire buffer in a slice of bytes. This structure uses newline as the line terminator, but returns neither the newline nor an optional carriage return from each discovered string.
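The line-splitting rule described above (split on newline, drop the newline and one optional preceding carriage return) can be sketched as follows. This is not BufferScanner's implementation. The helper splitLines is hypothetical, and returning a final unterminated line is an assumption made to match bufio.Scanner's behavior.

```go
package main

import (
	"bytes"
	"fmt"
)

// splitLines (hypothetical helper) returns the lines in buf as subslices of
// buf, without copying. Each line excludes its terminating newline and one
// optional carriage return immediately before that newline.
func splitLines(buf []byte) [][]byte {
	var lines [][]byte
	for len(buf) > 0 {
		var line []byte
		if i := bytes.IndexByte(buf, '\n'); i >= 0 {
			line, buf = buf[:i], buf[i+1:]
		} else {
			// Assumption: a final line without a newline is still returned.
			line, buf = buf, nil
		}
		if n := len(line); n > 0 && line[n-1] == '\r' {
			line = line[:n-1]
		}
		lines = append(lines, line)
	}
	return lines
}

func main() {
	for _, line := range splitLines([]byte("one\r\ntwo\n\nthree")) {
		fmt.Printf("%q\n", line)
	}
}
```

Because every line is a subslice of the input, no data is copied and no read can fail, which is consistent with a scanner of this kind never reporting an error.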

func (*BufferScanner) Bytes added in v1.3.0

func (b *BufferScanner) Bytes() []byte

Bytes returns the byte slice that was just scanned. It does not return the terminating newline character, nor any optional preceding carriage return character.

func (*BufferScanner) Err added in v1.3.0

func (b *BufferScanner) Err() error

Err returns nil because scanning from a slice of bytes will never cause an error.

func (*BufferScanner) Scan added in v1.3.0

func (b *BufferScanner) Scan() bool

Scan advances to the next line in the original slice of bytes. It returns true if a line is available and scanning ought to continue, or false once the end of the slice has been reached.

func (*BufferScanner) Text added in v1.3.0

func (b *BufferScanner) Text() string

Text returns the string representation of the byte slice returned by the most recent Scan call. It does not return the terminating newline character, nor any optional preceding carriage return character.

type Scanner

type Scanner interface {
	Bytes() []byte
	Err() error
	Scan() bool
	Text() string
}

Scanner provides an interface for reading newline-delimited lines of text. It is similar to bufio.Scanner, but wraps the ReadLine method of bufio.Reader so lines of arbitrary length can be scanned. Successive calls to the Scan method will step through the lines of a file, skipping the newline whitespace between lines.

Scanning stops unrecoverably at EOF, or at the first I/O error. Unlike bufio.Scanner, however, attempting to scan a line longer than bufio.MaxScanTokenSize will not result in an error, but will return the long line.

Also like bufio.Scanner, it is not necessary to check for errors by calling the Err method until after scanning stops, when the Scan method returns false.

This Scanner ought to behave exactly like bufio.Scanner. All methods ought to have the exact same return values while stepping through the provided io.Reader.
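Because the interface consists of the same four methods that bufio.Scanner provides, code written against it can accept either scanner. In the sketch below, lineScanner is a local stand-in with the same method set as gobls.Scanner so that the example runs without importing the package; countLines and countString are hypothetical helpers.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// lineScanner is a local stand-in that mirrors the method set documented
// for gobls.Scanner. *bufio.Scanner satisfies it as well.
type lineScanner interface {
	Bytes() []byte
	Err() error
	Scan() bool
	Text() string
}

// countLines (hypothetical helper) drains s and checks Err only once,
// after Scan has returned false.
func countLines(s lineScanner) (lines, chars int, err error) {
	for s.Scan() {
		lines++
		chars += len(s.Bytes())
	}
	return lines, chars, s.Err()
}

// countString (hypothetical helper) runs countLines over a string using a
// standard library scanner.
func countString(input string) (int, int, error) {
	return countLines(bufio.NewScanner(strings.NewReader(input)))
}

func main() {
	fmt.Println(countString("alpha\nbeta\n")) // prints 2 9 <nil>
}
```

A value returned by gobls.NewScanner should be usable in place of the bufio.Scanner here, given the documented method set. This sketch does not verify that.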

func NewBufferScanner added in v1.3.0

func NewBufferScanner(buf []byte) Scanner

NewBufferScanner returns a BufferScanner that enumerates newline terminated strings from buf.

func NewScanner

func NewScanner(r io.Reader) Scanner

NewScanner returns a scanner that reads from the specified io.Reader. It allocates a scanning buffer with the default buffer size. This per-scanner buffer will grow to accommodate extremely long lines.

    var lines, characters int
    ls := gobls.NewScanner(os.Stdin)
    for ls.Scan() {
        lines++
        characters += len(ls.Bytes())
    }
    if err := ls.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "cannot scan:", err)
    }
    fmt.Println("Counted", lines, "lines and", characters, "characters.")
