Documentation ¶
Index ¶
- Constants
- Variables
- func FileReader(file string, key x.Sensitive) (*bufio.Reader, func())
- func IsJSONData(r *bufio.Reader) (bool, error)
- func ParseJSON(b []byte, op int) ([]*api.NQuad, *pb.Metadata, error)
- func ParseRDF(line string, l *lex.Lexer) (api.NQuad, error)
- func ParseRDFs(b []byte) ([]*api.NQuad, *pb.Metadata, error)
- func StreamReader(file string, key x.Sensitive, f io.ReadCloser) (rd *bufio.Reader, cleanup func())
- type Chunker
- func NewChunker(inputFormat InputFormat, batchSize int) Chunker
- type InputFormat
- func DataFormat(filename string, format string) InputFormat
- type NQuadBuffer
- func NewNQuadBuffer(batchSize int) *NQuadBuffer
- func (buf *NQuadBuffer) Ch() <-chan []*api.NQuad
- func (buf *NQuadBuffer) FastParseJSON(b []byte, op int) error
- func (buf *NQuadBuffer) Flush()
- func (buf *NQuadBuffer) Metadata() *pb.Metadata
- func (buf *NQuadBuffer) ParseJSON(b []byte, op int) error
- func (buf *NQuadBuffer) Push(nqs ...*api.NQuad)
- func (buf *NQuadBuffer) PushPredHint(pred string, hint pb.Metadata_HintType)
Constants ¶
const (
	// SetNquads is the constant used to indicate that the parsed NQuads are meant to be added.
	SetNquads = iota
	// DeleteNquads is the constant used to indicate that the parsed NQuads are meant to be
	// deleted.
	DeleteNquads
)
Variables ¶
var (
	// ErrEmpty indicates that the parser encountered a harmless error (e.g. empty line or comment).
	ErrEmpty = errors.New("RDF: harmless error, e.g. comment line")
)
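A sentinel like ErrEmpty is usually checked by the caller so that comments and blank lines are skipped instead of aborting the parse. The following is a minimal sketch of that pattern using only the standard library; errEmpty, parseLine, and parseAll are hypothetical stand-ins and not part of this package.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errEmpty is a local stand-in for the package's ErrEmpty sentinel.
var errEmpty = errors.New("harmless error, e.g. comment line")

// parseLine is a hypothetical line parser: blank lines and '#' comments
// yield errEmpty, anything else is returned trimmed.
func parseLine(line string) (string, error) {
	line = strings.TrimSpace(line)
	if line == "" || strings.HasPrefix(line, "#") {
		return "", errEmpty
	}
	return line, nil
}

// parseAll skips harmless errors and stops on any other error.
func parseAll(input string) ([]string, error) {
	var out []string
	for _, line := range strings.Split(input, "\n") {
		stmt, err := parseLine(line)
		if errors.Is(err, errEmpty) {
			continue // comment or empty line: nothing to record
		}
		if err != nil {
			return nil, err
		}
		out = append(out, stmt)
	}
	return out, nil
}

func main() {
	stmts, err := parseAll("# comment\n\n<a> <b> <c> .\n")
	fmt.Println(len(stmts), err)
}
```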
Functions ¶
func FileReader ¶
FileReader returns an open reader on the given file. Gzip-compressed input is detected and decompressed automatically even without the gz extension. The key, if non-nil, is used to decrypt the file. The caller is responsible for calling the returned cleanup function when done with the reader.
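One common way to detect gzip input without relying on the file extension is to peek at the two-byte gzip magic number (0x1f 0x8b). The sketch below illustrates that idea with the standard library only; maybeGunzip, gzipBytes, and readAll are hypothetical helpers, decryption is left out, and whether FileReader uses this exact mechanism is an assumption.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// maybeGunzip (hypothetical) wraps r in a gzip reader when the stream starts
// with the gzip magic bytes, and returns a cleanup function for the caller.
func maybeGunzip(r io.Reader) (*bufio.Reader, func(), error) {
	br := bufio.NewReader(r)
	magic, err := br.Peek(2) // Peek does not consume the bytes
	if err != nil && err != io.EOF {
		return nil, nil, err
	}
	if len(magic) == 2 && magic[0] == 0x1f && magic[1] == 0x8b {
		gz, err := gzip.NewReader(br)
		if err != nil {
			return nil, nil, err
		}
		return bufio.NewReader(gz), func() { gz.Close() }, nil
	}
	return br, func() {}, nil
}

// gzipBytes compresses s so the demo has gzip input to read.
func gzipBytes(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s))
	zw.Close()
	return buf.Bytes()
}

// readAll reads data through maybeGunzip and returns the decoded text.
func readAll(data []byte) (string, error) {
	rd, cleanup, err := maybeGunzip(bytes.NewReader(data))
	if err != nil {
		return "", err
	}
	defer cleanup()
	out, err := io.ReadAll(rd)
	return string(out), err
}

func main() {
	plain, _ := readAll([]byte("plain"))
	zipped, _ := readAll(gzipBytes("zipped"))
	fmt.Println(plain, zipped)
}
```

As with FileReader, the caller of the sketch is expected to invoke the returned cleanup function once it is done reading.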
func IsJSONData ¶
IsJSONData returns true if the reader, which should be at the start of the stream, is reading a JSON stream, false otherwise.
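A check like this can be done without consuming the stream by peeking at the buffered bytes. The sketch below assumes a simple heuristic (the first non-whitespace byte is '{' or '['); looksLikeJSON and sniff are hypothetical names, and the heuristic is not necessarily the one IsJSONData applies.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

// looksLikeJSON (hypothetical) reports whether the first non-whitespace byte
// of the stream opens a JSON object or array. It only peeks, so the reader
// still yields the full input afterwards.
func looksLikeJSON(r *bufio.Reader) (bool, error) {
	buf, err := r.Peek(512)
	// Short inputs return io.EOF and small buffers return ErrBufferFull;
	// in both cases the bytes that were returned are still usable.
	if err != nil && err != io.EOF && err != bufio.ErrBufferFull {
		return false, err
	}
	for _, c := range buf {
		switch c {
		case ' ', '\t', '\r', '\n':
			continue
		case '{', '[':
			return true, nil
		default:
			return false, nil
		}
	}
	return false, nil
}

// sniff is a small convenience wrapper for string input.
func sniff(s string) bool {
	ok, _ := looksLikeJSON(bufio.NewReader(strings.NewReader(s)))
	return ok
}

func main() {
	fmt.Println(sniff(`  [{"uid": "0x1"}]`))
	fmt.Println(sniff(`<0x1> <name> "Alice" .`))
}
```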
func ParseJSON ¶
ParseJSON is a convenience wrapper function to get all NQuads in one call. This can, however, lead to high memory usage, so use it with care.
func ParseRDF ¶
ParseRDF parses a mutation string and returns the N-Quad representation for it. It parses N-Quad statements based on http://www.w3.org/TR/n-quads/.
func ParseRDFs ¶
ParseRDFs is a convenience wrapper function to get all NQuads in one call. This can, however, lead to high memory usage, so use it with care.
func StreamReader ¶
func StreamReader(file string, key x.Sensitive, f io.ReadCloser) (rd *bufio.Reader, cleanup func())
StreamReader returns a bufio.Reader given a ReadCloser. The file name is passed only to check for the .gz extension.
Types ¶
type Chunker ¶
type Chunker interface {
	Chunk(r *bufio.Reader) (*bytes.Buffer, error)
	Parse(chunkBuf *bytes.Buffer) error
	NQuads() *NQuadBuffer
}
Chunker describes the interface to parse and process the input to the live and bulk loaders.
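The method set suggests a loop in which the caller alternates Chunk and Parse until the input is exhausted. The following toy sketch shows that shape with the standard library only; the lowercase chunker interface, lineChunker, drive, and countLines are hypothetical, NQuads() is omitted, and the assumption that Chunk may return io.EOF together with a final chunk is not confirmed by this page.

```go
package main

import (
	"bufio"
	"bytes"
	"fmt"
	"io"
	"strings"
)

// chunker mirrors the Chunk and Parse methods of Chunker (NQuads is omitted).
type chunker interface {
	Chunk(r *bufio.Reader) (*bytes.Buffer, error)
	Parse(chunkBuf *bytes.Buffer) error
}

// lineChunker is a toy implementation: each chunk holds up to n lines and
// Parse records every non-empty line it sees.
type lineChunker struct {
	n     int
	lines []string
}

func (c *lineChunker) Chunk(r *bufio.Reader) (*bytes.Buffer, error) {
	buf := new(bytes.Buffer)
	for i := 0; i < c.n; i++ {
		line, err := r.ReadString('\n')
		buf.WriteString(line)
		if err != nil {
			return buf, err // io.EOF may arrive with a final partial chunk
		}
	}
	return buf, nil
}

func (c *lineChunker) Parse(chunkBuf *bytes.Buffer) error {
	for _, line := range strings.Split(chunkBuf.String(), "\n") {
		if line = strings.TrimSpace(line); line != "" {
			c.lines = append(c.lines, line)
		}
	}
	return nil
}

// drive runs the Chunk/Parse loop until EOF, parsing the last chunk as well.
func drive(c chunker, r *bufio.Reader) error {
	for {
		chunk, err := c.Chunk(r)
		if err != nil && err != io.EOF {
			return err
		}
		if perr := c.Parse(chunk); perr != nil {
			return perr
		}
		if err == io.EOF {
			return nil
		}
	}
}

// countLines drives a lineChunker over input and reports the parsed lines.
func countLines(input string, n int) int {
	c := &lineChunker{n: n}
	if err := drive(c, bufio.NewReader(strings.NewReader(input))); err != nil {
		return -1
	}
	return len(c.lines)
}

func main() {
	fmt.Println(countLines("a\nb\nc\n", 2))
}
```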
func NewChunker ¶
func NewChunker(inputFormat InputFormat, batchSize int) Chunker
NewChunker returns a new chunker for the specified format.
type InputFormat ¶
type InputFormat byte
InputFormat represents the multiple formats supported by Chunker.
const (
	// UnknownFormat is a constant to denote a format not supported by the bulk/live loaders.
	UnknownFormat InputFormat = iota
	// RdfFormat is a constant to denote the input to the live/bulk loader is in the RDF format.
	RdfFormat
	// JsonFormat is a constant to denote the input to the live/bulk loader is in the JSON format.
	JsonFormat
)
func DataFormat ¶
func DataFormat(filename string, format string) InputFormat
DataFormat returns a file's data format (RDF, JSON, or unknown) based on the filename or the user-provided format option. The file extension has precedence.
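The precedence rule can be illustrated with a small stand-alone sketch: the extension is consulted first, and the user-provided option is used only when the extension is inconclusive. The names detectFormat and the format constants are hypothetical, and the case-insensitive matching and stripping of a trailing .gz are assumptions rather than documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// format is a local stand-in for InputFormat.
type format byte

const (
	formatUnknown format = iota
	formatRDF
	formatJSON
)

// detectFormat (hypothetical) checks the file extension first and falls back
// to the user-provided option only when the extension is not recognized.
func detectFormat(filename, option string) format {
	// Assumption: a trailing ".gz" is ignored and matching is case-insensitive.
	name := strings.TrimSuffix(strings.ToLower(filename), ".gz")
	switch {
	case strings.HasSuffix(name, ".rdf"):
		return formatRDF
	case strings.HasSuffix(name, ".json"):
		return formatJSON
	}
	switch strings.ToLower(option) {
	case "rdf":
		return formatRDF
	case "json":
		return formatJSON
	}
	return formatUnknown
}

func main() {
	fmt.Println(detectFormat("data.rdf.gz", "") == formatRDF)
	fmt.Println(detectFormat("data.rdf", "json") == formatRDF)
	fmt.Println(detectFormat("data.txt", "json") == formatJSON)
}
```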
type NQuadBuffer ¶
type NQuadBuffer struct {
// contains filtered or unexported fields
}
NQuadBuffer batches NQuads and pushes them, batchSize at a time, to a channel accessible via Ch(). If batchSize is negative, it pushes to Ch() only once, during Flush.
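The batching behavior described here can be sketched with a plain channel. The example below is a simplified illustration over strings, not the actual NQuadBuffer; batchBuffer and its methods are hypothetical, and closing the channel in flush and treating a zero batch size like a negative one are assumptions.

```go
package main

import "fmt"

// batchBuffer is a hypothetical, simplified analogue of NQuadBuffer.
type batchBuffer struct {
	batchSize int
	items     []string
	ch        chan []string
}

func newBatchBuffer(batchSize int) *batchBuffer {
	return &batchBuffer{batchSize: batchSize, ch: make(chan []string, 16)}
}

// push appends items and emits a batch each time batchSize is reached.
// With a non-positive batchSize nothing is emitted until flush.
func (b *batchBuffer) push(items ...string) {
	for _, it := range items {
		b.items = append(b.items, it)
		if b.batchSize > 0 && len(b.items) >= b.batchSize {
			b.ch <- b.items
			b.items = nil
		}
	}
}

// flush emits whatever is still buffered and closes the channel, after
// which the buffer must not be used again.
func (b *batchBuffer) flush() {
	if len(b.items) > 0 {
		b.ch <- b.items
		b.items = nil
	}
	close(b.ch)
}

// batchSizes pushes n items from a producer goroutine and returns the size
// of every batch the consumer receives.
func batchSizes(batchSize, n int) []int {
	b := newBatchBuffer(batchSize)
	go func() {
		for i := 0; i < n; i++ {
			b.push(fmt.Sprint("nq", i))
		}
		b.flush()
	}()
	var sizes []int
	for batch := range b.ch {
		sizes = append(sizes, len(batch))
	}
	return sizes
}

func main() {
	fmt.Println(batchSizes(2, 5))
	fmt.Println(batchSizes(-1, 5))
}
```

Producing in one goroutine and draining the channel in another keeps the producer from blocking once the channel's capacity is reached.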
func NewNQuadBuffer ¶
func NewNQuadBuffer(batchSize int) *NQuadBuffer
NewNQuadBuffer returns a new NQuadBuffer instance with the specified batch size.
func (*NQuadBuffer) Ch ¶
func (buf *NQuadBuffer) Ch() <-chan []*api.NQuad
Ch returns a channel containing slices of NQuads which can be consumed by the caller.
func (*NQuadBuffer) FastParseJSON ¶
func (buf *NQuadBuffer) FastParseJSON(b []byte, op int) error
FastParseJSON currently parses NQuads about 30% faster than ParseJSON.
This function is very similar to buf.ParseJSON, except that encoding/json is replaced with simdjson-go.
func (*NQuadBuffer) Flush ¶
func (buf *NQuadBuffer) Flush()
Flush must be called at the end to push out all the buffered NQuads to the channel. Once Flush is called, this instance of NQuadBuffer should no longer be used.
func (*NQuadBuffer) Metadata ¶
func (buf *NQuadBuffer) Metadata() *pb.Metadata
Metadata returns the parse metadata that has been aggregated so far.
func (*NQuadBuffer) ParseJSON ¶
func (buf *NQuadBuffer) ParseJSON(b []byte, op int) error
ParseJSON parses the given byte slice and pushes the parsed NQuads into the buffer.
func (*NQuadBuffer) Push ¶
func (buf *NQuadBuffer) Push(nqs ...*api.NQuad)
Push can be passed one or more NQuad pointers, which get pushed to the buffer.
func (*NQuadBuffer) PushPredHint ¶
func (buf *NQuadBuffer) PushPredHint(pred string, hint pb.Metadata_HintType)
PushPredHint pushes and aggregates hints about the type of the predicate derived during parsing. This metadata is expected to be much smaller than the set of NQuads, so it is not necessary to send it in batches.