jsonextract

v1.0.0
Published: Feb 17, 2021 | License: MIT | Imports: 6 | Imported by: 3

README

jsonextract

jsonextract is a Go library for extracting JSON objects and arrays from any source that can be read through an io.Reader. It can be used for data extraction tasks like web scraping.

Examples

Here is an example program that extracts all JSON objects from a file and prints them to the console:

package main

import (
	"fmt"
	"log"
	"os"

	"github.com/xarantolus/jsonextract"
)

func main() {
	file, err := os.Open("file.html")
	if err != nil {
		log.Fatalln(err.Error())
	}
	defer file.Close()

	// Print all JSON objects and arrays found in file.html
	err = jsonextract.Reader(file, func(b []byte) error {
		fmt.Println(string(b))

		return nil
	})
	if err != nil {
		log.Fatalln(err.Error())
	}
}

Extractor program

There's a small extractor program that uses this library to get data from URLs and files.

If you want to give it a try, you can just go-get it:

go get -u github.com/xarantolus/jsonextract/cmd/jsonx

You can use it on files like this:

jsonx reader_test.go

or on URLs like this:

jsonx "https://stackoverflow.com/users/5728357/xarantolus?tab=topactivity"

Other examples

There are also examples in the examples subdirectory.

The string example shows how to use the package to quickly get all JSON objects and arrays in a string. It uses a strings.Reader for that.

The stackoverflow-chart example shows how to extract the reputation chart data of a StackOverflow user. Extracted data is then used to draw the same chart using Go:

[Image: comparison of the chart from StackOverflow with the scraped and redrawn result]

Notes

After passing an io.Reader to a function of this package, you should no longer use it. It might have been read to the end, but if processing stopped early (using ErrStop), some data might remain in the reader.
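If the same input is needed more than once, one option is to buffer it first and hand out a fresh reader for each pass. The sketch below uses only the standard library; buffered, partialConsumer and demo are hypothetical names, and partialConsumer merely stands in for any function that may stop before the end of its input. It holds the whole input in memory, so it only suits inputs of moderate size.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// buffered reads src fully and returns a factory that yields a fresh,
// independent reader over the same bytes on every call.
func buffered(src io.Reader) (func() io.Reader, error) {
	data, err := io.ReadAll(src)
	if err != nil {
		return nil, err
	}
	return func() io.Reader { return bytes.NewReader(data) }, nil
}

// partialConsumer is a hypothetical stand-in for a function that stops early
// and leaves its reader at a position the caller cannot rely on.
func partialConsumer(r io.Reader) string {
	buf := make([]byte, 5)
	n, _ := io.ReadFull(r, buf)
	return string(buf[:n])
}

// demo consumes part of the input once, then reads it again from the start
// through a second reader created by the factory.
func demo() (first string, restLen int, err error) {
	open, err := buffered(strings.NewReader(`{"a":1} trailing text`))
	if err != nil {
		return "", 0, err
	}
	first = partialConsumer(open())
	rest, err := io.ReadAll(open())
	if err != nil {
		return "", 0, err
	}
	return first, len(rest), nil
}

func main() {
	first, restLen, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("first pass read %q, second pass saw %d bytes\n", first, restLen)
}
```

Each reader returned by the factory can be passed to a function of this package and discarded afterwards, which matches the advice above.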

License

This is free as in freedom software. Do whatever you like with it.

Documentation

Index

Constants

This section is empty.

Variables

var (
	// ErrStop can be returned from a JSONCallback function to signal that processing should stop
	// at this object
	ErrStop = errors.New("stop processing json")
)

Functions

func Reader

func Reader(reader io.Reader, callback JSONCallback) (err error)

Reader reads all JSON objects from the input and calls callback for each of them. If callback returns an error, Reader will stop processing and return the error. If the returned error is ErrStop, Reader will return nil instead of the error.
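The error rules above can be illustrated with a callback that keeps the first few matches and then asks to stop. To stay runnable without the module, this sketch does not call the library: errStop mirrors ErrStop, and drive is a hypothetical stand-in that only applies the error handling described above and performs no extraction. In real code the callback would be passed to Reader instead.

```go
package main

import (
	"errors"
	"fmt"
)

// errStop plays the role of jsonextract.ErrStop in this stand-alone sketch.
var errStop = errors.New("stop processing json")

// drive is a hypothetical stand-in for Reader. It hands each given match to
// callback and applies the documented error rules; it does not parse input.
func drive(matches []string, callback func([]byte) error) error {
	for _, m := range matches {
		if err := callback([]byte(m)); err != nil {
			if err == errStop {
				return nil // a requested stop is not reported as a failure
			}
			return err // any other error ends processing and is returned
		}
	}
	return nil
}

// firstN returns a callback that keeps matches until n of them have been
// collected (n >= 1) and then signals that processing should stop.
func firstN(n int, dst *[]string) func([]byte) error {
	return func(b []byte) error {
		*dst = append(*dst, string(b))
		if len(*dst) >= n {
			return errStop
		}
		return nil
	}
}

func main() {
	var kept []string
	err := drive([]string{`{"a":1}`, `[2]`, `{"c":3}`}, firstN(2, &kept))
	fmt.Println(kept, err)
}
```

The third match is never delivered to the callback, and the caller sees a nil error because the stop was requested.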

func ReaderObjects

func ReaderObjects(reader io.Reader) (objects []json.RawMessage, err error)

ReaderObjects takes the given io.Reader and reads all possible JSON objects it can find in it. It assumes the stream consists of UTF-8 bytes.
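Since the result is a slice of json.RawMessage, each element still has to be decoded by the caller. A minimal sketch of one way to do that follows, keeping only values that are JSON objects. The names sample and onlyObjects are hypothetical, and sample returns hand-written data in the shape of the return value rather than real output of ReaderObjects.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// sample returns hand-written data in the shape ReaderObjects returns.
// In real code the slice would come from a call to ReaderObjects.
func sample() []json.RawMessage {
	return []json.RawMessage{
		json.RawMessage(`{"title":"hello"}`),
		json.RawMessage(`[1,2,3]`),
	}
}

// onlyObjects decodes the raw messages that are JSON objects into generic
// maps and skips arrays and any other kind of value.
func onlyObjects(raws []json.RawMessage) ([]map[string]interface{}, error) {
	var out []map[string]interface{}
	for _, raw := range raws {
		trimmed := bytes.TrimSpace(raw)
		if len(trimmed) == 0 || trimmed[0] != '{' {
			continue
		}
		var obj map[string]interface{}
		if err := json.Unmarshal(trimmed, &obj); err != nil {
			return nil, fmt.Errorf("decoding %s: %w", trimmed, err)
		}
		out = append(out, obj)
	}
	return out, nil
}

func main() {
	objs, err := onlyObjects(sample())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, obj := range objs {
		fmt.Println(obj["title"])
	}
}
```

Decoding into a concrete struct type instead of a map works the same way when the expected shape is known in advance.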

Types

type JSONCallback

type JSONCallback func([]byte) error

JSONCallback is the callback function passed to Reader. Found JSON objects will be passed to it as bytes. If this function returns an error, processing will stop and return that error. If the returned error is ErrStop, processing will stop without an error.
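A callback often needs to decode the bytes it receives and ignore values that do not have the expected shape. The sketch below builds a function with the JSONCallback signature using only encoding/json. User and collectUsers are hypothetical names, and main calls the callback directly with sample byte slices in place of the values Reader would pass.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a hypothetical shape we hope to find among the extracted values.
type User struct {
	Name string `json:"name"`
	ID   int    `json:"id"`
}

// collectUsers returns a function with the JSONCallback signature
// (func([]byte) error) that appends every decodable User to dst.
func collectUsers(dst *[]User) func([]byte) error {
	return func(b []byte) error {
		var u User
		// Arrays and objects without a name are skipped instead of being
		// reported as errors, so processing continues with the next match.
		if err := json.Unmarshal(b, &u); err != nil || u.Name == "" {
			return nil
		}
		*dst = append(*dst, u)
		return nil
	}
}

func main() {
	var users []User
	callback := collectUsers(&users)

	// Stand-in for the byte slices that would be passed to the callback.
	samples := []string{`{"name":"gopher","id":1}`, `[1,2,3]`, `{"other":true}`}
	for _, s := range samples {
		if err := callback([]byte(s)); err != nil {
			fmt.Println("callback error:", err)
			return
		}
	}
	for _, u := range users {
		fmt.Println(u.ID, u.Name)
	}
}
```

Returning nil for unwanted values keeps processing going; returning a non-nil error from the callback would end it, as described above.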

Directories

Path             Kind
cmd/jsonx        command
examples/string  command
