gobls

Version: v1.1.0
Published: Mar 31, 2018 License: MIT Imports: 2 Imported by: 4

README

gobls

Gobls is a buffered line scanner for Go.

Description

Gobls is similar to bufio.Scanner, but wraps bufio.Reader.ReadLine so that lines of arbitrary length can be scanned. It uses a hybrid approach: in most cases, when lines are not unusually long, the fast code path is taken. When lines are unusually long, it uses a per-scanner pre-allocated byte slice to reassemble the fragments into a single slice of bytes.
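The hybrid approach described above can be sketched with the standard library alone. This is a minimal illustration of the technique, not the gobls source: the helper names `readLine` and `lineLengths` are hypothetical, and the 16-byte reader buffer is chosen only to force the slow path on a short input.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// readLine returns the next complete line from br without its line ending.
// A line that fits in the reader's buffer is returned directly (fast path).
// A longer line arrives as fragments, which are appended to the reusable
// scratch slice until ReadLine reports isPrefix == false (slow path).
// The returned line is only valid until the next call.
func readLine(br *bufio.Reader, scratch []byte) (line, newScratch []byte, err error) {
	frag, isPrefix, err := br.ReadLine()
	if err != nil {
		return nil, scratch, err
	}
	if !isPrefix {
		return frag, scratch, nil // fast path: no copying
	}
	scratch = append(scratch[:0], frag...)
	for isPrefix {
		frag, isPrefix, err = br.ReadLine()
		if err == io.EOF {
			break // input ended exactly at a buffer boundary
		}
		if err != nil {
			return nil, scratch, err
		}
		scratch = append(scratch, frag...)
	}
	return scratch, scratch, nil
}

// lineLengths reads every line of input through a reader with the given
// buffer size and reports the length of each line.
func lineLengths(input string, bufSize int) ([]int, error) {
	br := bufio.NewReaderSize(strings.NewReader(input), bufSize)
	scratch := make([]byte, 0, 64)
	var lengths []int
	for {
		var line []byte
		var err error
		line, scratch, err = readLine(br, scratch)
		if err == io.EOF {
			return lengths, nil
		}
		if err != nil {
			return lengths, err
		}
		lengths = append(lengths, len(line))
	}
}

func main() {
	long := strings.Repeat("x", 40)
	lengths, err := lineLengths("ab\n"+long+"\nxyz", 16)
	if err != nil {
		fmt.Println("cannot read:", err)
		return
	}
	fmt.Println(lengths)
}
```

In this sketch the scratch slice is reused across calls, so a long line costs one copy per fragment but no new allocation once the slice has grown large enough.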

Example

    var lines, characters int
    ls := gobls.NewScanner(os.Stdin)
    for ls.Scan() {
        lines++
        characters += len(ls.Bytes())
    }
    if err := ls.Err(); err != nil {
        fmt.Fprintln(os.Stderr, "cannot scan:", err)
    }
    fmt.Println("Counted", lines, "lines and", characters, "characters.")

Performance

On my test system, the gobls scanner takes from 2% to nearly 40% longer than the bufio scanner, depending on the length of the lines being scanned. The 40% figure was only observed when lines were bufio.MaxScanTokenSize bytes long. Usually the performance penalty is 2% to 15% over the bufio measurements.

Run go test -bench=. on your system for comparison. I'm sure the testing method could be improved. Suggestions are welcome.

I recommend using the standard library's bufio scanner unless a specific program must be able to parse lines longer than bufio.MaxScanTokenSize (64 KiB). In that case, the additional delay due to extremely long lines may be an acceptable tradeoff compared to the errors that would be returned by bufio.Scanner.
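To make the tradeoff concrete, the following sketch uses only the standard library to show the failure mode mentioned above: with its default buffer, bufio.Scanner stops with bufio.ErrTooLong when a line does not fit within bufio.MaxScanTokenSize. The helper names `scanLineOfLength` and `tooLong` are hypothetical and exist only for this illustration.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"strings"
)

// scanLineOfLength feeds bufio.Scanner a single line of n bytes followed by
// a newline, and reports how many lines were scanned and the final error.
func scanLineOfLength(n int) (lines int, err error) {
	input := strings.Repeat("x", n) + "\n"
	s := bufio.NewScanner(strings.NewReader(input))
	for s.Scan() {
		lines++
	}
	return lines, s.Err()
}

// tooLong reports whether scanning a line of n bytes fails with
// bufio.ErrTooLong.
func tooLong(n int) bool {
	_, err := scanLineOfLength(n)
	return errors.Is(err, bufio.ErrTooLong)
}

func main() {
	for _, n := range []int{1000, 2 * bufio.MaxScanTokenSize} {
		lines, err := scanLineOfLength(n)
		fmt.Printf("line of %d bytes: scanned %d line(s), err = %v\n", n, lines, err)
	}
}
```

A program that hits this error must either raise the limit with the scanner's Buffer method or switch to a reader that has no fixed limit, which is the case gobls is meant for.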

Documentation

Constants

const DefaultBufferSize = 16 * 1024

DefaultBufferSize specifies the initial size, in bytes, of the buffer that each gobls scanner allocates for aggregating line fragments.

Types

type Scanner

type Scanner interface {
	Bytes() []byte
	Err() error
	Scan() bool
	Text() string
}

Scanner provides an interface for reading newline-delimited lines of text. It is similar to bufio.Scanner, but wraps the ReadLine method of bufio.Reader so lines of arbitrary length can be scanned. Successive calls to the Scan method will step through the lines of a file, skipping the newline whitespace between lines.

Scanning stops unrecoverably at EOF, or at the first I/O error. Unlike bufio.Scanner, however, attempting to scan a line longer than bufio.MaxScanTokenSize will not result in an error, but will return the long line.

Also like bufio.Scanner, it is not necessary to check for errors by calling the Err method until after scanning stops, when the Scan method returns false.

This Scanner ought to behave exactly like bufio.Scanner. All methods ought to have the exact same return values while stepping through the provided io.Reader.
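Because Scanner is an interface whose four methods also exist on *bufio.Scanner, line-counting code can be written once against that method set. The sketch below assumes nothing beyond the standard library: `lineScanner` is a hypothetical local interface that mirrors the method set documented above, and `count` and `countString` are illustrative helpers. A value returned by gobls.NewScanner has the same method set, so it should be usable wherever `lineScanner` is accepted.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// lineScanner is a hypothetical local interface with the same method set as
// the Scanner interface documented above.
type lineScanner interface {
	Bytes() []byte
	Err() error
	Scan() bool
	Text() string
}

// Compile-time check that *bufio.Scanner provides this method set.
var _ lineScanner = (*bufio.Scanner)(nil)

// count drains s and reports the number of lines and the total number of
// bytes in those lines, excluding line endings.
func count(s lineScanner) (lines, chars int, err error) {
	for s.Scan() {
		lines++
		chars += len(s.Bytes())
	}
	return lines, chars, s.Err()
}

// countString runs count over an in-memory string using bufio.Scanner.
func countString(input string) (lines, chars int, err error) {
	return count(bufio.NewScanner(strings.NewReader(input)))
}

func main() {
	lines, chars, err := countString("one\ntwo\nthree\n")
	if err != nil {
		fmt.Println("cannot scan:", err)
		return
	}
	fmt.Println("Counted", lines, "lines and", chars, "characters.")
}
```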

func NewScanner

func NewScanner(r io.Reader) Scanner

NewScanner returns a scanner that reads from the specified io.Reader. It allocates a scanning buffer with the default buffer size. This per-scanner buffer will grow to accommodate extremely long lines.

var lines, characters int
ls := gobls.NewScanner(os.Stdin)
for ls.Scan() {
    lines++
    characters += len(ls.Bytes())
}
if err := ls.Err(); err != nil {
    fmt.Fprintln(os.Stderr, "cannot scan:", err)
}
fmt.Println("Counted", lines, "lines and", characters, "characters.")
