document

package
v1.1.8 Latest
Warning

This package is not in the latest version of its module.

Published: Feb 25, 2025 License: MIT Imports: 10 Imported by: 0

Documentation

Overview

Package document provides Document structs and Parsers that prepare content for RAG.

Constants

This section is empty.

Variables

var ErrReading = errors.New("document is reading")
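
ErrReading is a sentinel error, so callers can test for it with errors.Is even when it has been wrapped. The sketch below uses a local stand-in (errReading) and a hypothetical readAll function so that it compiles on its own. The page does not state when the package returns this error; the message suggests it signals a read that is already in progress, which is the assumption made here.

```go
package main

import (
	"errors"
	"fmt"
)

// errReading is a local stand-in for document.ErrReading so the sketch is self-contained.
var errReading = errors.New("document is reading")

// readAll is a hypothetical reader that fails because a read is assumed to be in progress.
func readAll() error {
	return fmt.Errorf("readAll: %w", errReading)
}

// isBusy reports whether err is, or wraps, the sentinel error.
func isBusy(err error) bool {
	return errors.Is(err, errReading)
}

func main() {
	if err := readAll(); isBusy(err) {
		fmt.Println("document is busy, retry later:", err)
	}
}
```

Comparing with errors.Is rather than == keeps the check working if an intermediate layer adds context with %w.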

Functions

func EscapeMarkdown added in v1.1.8

func EscapeMarkdown(s string) string

EscapeMarkdown escapes special characters in a string for Markdown.
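
To illustrate what this kind of escaping usually involves, here is a minimal sketch that prefixes each special character with a backslash. The function name escapeMarkdownSketch and the character set in markdownSpecials are assumptions for illustration; the page does not list which characters the package actually escapes.

```go
package main

import (
	"fmt"
	"strings"
)

// markdownSpecials is an assumed set of characters to escape,
// not the set used by the package.
const markdownSpecials = "\\`*_{}[]()#+-.!|>"

// escapeMarkdownSketch puts a backslash before every rune found in markdownSpecials.
func escapeMarkdownSketch(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	for _, r := range s {
		if strings.ContainsRune(markdownSpecials, r) {
			b.WriteByte('\\')
		}
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	fmt.Println(escapeMarkdownSketch("*bold* and [link](url)"))
}
```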

func StripUnprintable added in v1.1.8

func StripUnprintable(s string) string
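
The page gives no description for StripUnprintable. Going by the name alone, one plausible behaviour is to drop runes that are not printable. The sketch below shows that idea with strings.Map and unicode.IsPrint; keeping newlines and tabs is an assumption of this sketch, and stripUnprintableSketch is a hypothetical name, not the package's implementation.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// stripUnprintableSketch removes runes that unicode.IsPrint rejects,
// except newline and tab, which this sketch chooses to keep.
func stripUnprintableSketch(s string) string {
	return strings.Map(func(r rune) rune {
		if unicode.IsPrint(r) || r == '\n' || r == '\t' {
			return r
		}
		return -1 // a negative value tells strings.Map to drop the rune
	}, s)
}

func main() {
	fmt.Printf("%q\n", stripUnprintableSketch("a\x00b\u200bc\n"))
}
```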

Types

type ClosableDocument

type ClosableDocument interface {
	Close() error
}

type Document

type Document struct {
	// contains filtered or unexported fields
}

Document is a document container with metadata.

func (*Document) Content added in v1.1.8

func (d *Document) Content() string

func (*Document) Meta

func (d *Document) Meta() map[string]string

func (*Document) Reader

func (d *Document) Reader() *bytes.Reader

type File

type File struct {
	Document
	// contains filtered or unexported fields
}

func NewFile

func NewFile(fname string) (*File, error)

func (*File) Close

func (d *File) Close() error

func (*File) Read

func (d *File) Read() (chan<- []byte, error)

func (*File) ReadAll

func (d *File) ReadAll() error

func (*File) ReadStatus

func (d *File) ReadStatus() ReadStatus

type Http

type Http struct {
	Document
	// contains filtered or unexported fields
}

func NewHttp

func NewHttp(opts ...HttpOption) (*Http, error)

func (*Http) Read

func (h *Http) Read() (chan<- []byte, error)

func (*Http) ReadAll

func (h *Http) ReadAll() error

func (*Http) ReadStatus

func (h *Http) ReadStatus() ReadStatus

type HttpConfig

type HttpConfig struct {
	// contains filtered or unexported fields
}

type HttpOption

type HttpOption func(*HttpConfig)

func WithHttpClient

func WithHttpClient(client *http.Client) HttpOption

func WithHttpMethod

func WithHttpMethod(method string) HttpOption

func WithHttpURL

func WithHttpURL(link string) HttpOption

func WithPayload

func WithPayload(payload io.Reader) HttpOption
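
NewHttp and the With* functions follow the functional options pattern: each option is a function that mutates a config, and the constructor applies them in order. With the real package, a call would presumably look like document.NewHttp(document.WithHttpURL(link), document.WithHttpMethod("GET")). Because HttpConfig has only unexported fields, the sketch below uses local stand-ins (config, option, withURL, withMethod, newConfig). The defaults and the "url is required" check are assumptions, not documented behaviour.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// config is a hypothetical stand-in for HttpConfig, whose real fields are unexported.
type config struct {
	method string
	url    string
	client *http.Client
}

// option mirrors the shape of HttpOption: a function that mutates the config.
type option func(*config)

func withMethod(m string) option { return func(c *config) { c.method = m } }
func withURL(u string) option    { return func(c *config) { c.url = u } }

// newConfig applies the options over assumed defaults (GET, http.DefaultClient).
func newConfig(opts ...option) (*config, error) {
	c := &config{method: http.MethodGet, client: http.DefaultClient}
	for _, opt := range opts {
		opt(c)
	}
	if c.url == "" {
		return nil, errors.New("url is required")
	}
	return c, nil
}

func main() {
	c, err := newConfig(withURL("https://example.com/doc.html"), withMethod(http.MethodPost))
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println(c.method, c.url)
}
```

The pattern lets new settings be added later without changing the constructor signature, which is likely why the package exposes options instead of a public config struct.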

type IDocument added in v1.1.8

type IDocument interface {
	Content() string
	Meta() map[string]string
	Reader() *bytes.Reader
}
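
Any type with these three methods satisfies IDocument, which is convenient for tests. The following is a minimal sketch of an in-memory implementation; memDoc and newMemDoc are hypothetical names, and the interface is redeclared locally so the example compiles without the package.

```go
package main

import (
	"bytes"
	"fmt"
)

// IDocument mirrors the interface listed above.
type IDocument interface {
	Content() string
	Meta() map[string]string
	Reader() *bytes.Reader
}

// memDoc is a hypothetical in-memory document.
type memDoc struct {
	data []byte
	meta map[string]string
}

func newMemDoc(content string, meta map[string]string) *memDoc {
	return &memDoc{data: []byte(content), meta: meta}
}

func (d *memDoc) Content() string         { return string(d.data) }
func (d *memDoc) Meta() map[string]string { return d.meta }

// Reader returns a fresh reader each time, so callers do not share a read offset.
func (d *memDoc) Reader() *bytes.Reader { return bytes.NewReader(d.data) }

// Compile-time check that *memDoc satisfies IDocument.
var _ IDocument = (*memDoc)(nil)

func main() {
	var doc IDocument = newMemDoc("hello", map[string]string{"source": "memory"})
	fmt.Println(doc.Content(), doc.Meta()["source"], doc.Reader().Len())
}
```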

type Parser

type Parser interface {
	Parse(context.Context, *bytes.Reader, io.Writer) error
}
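
A Parser reads raw bytes from the *bytes.Reader and writes the extracted text to the io.Writer. As a sketch of how an implementation might look, the hypothetical lineParser below trims each line and drops blank ones, checking the context between lines. The interface is redeclared locally, and parseString is an illustrative helper; none of these names come from the package.

```go
package main

import (
	"bufio"
	"bytes"
	"context"
	"fmt"
	"io"
	"strings"
)

// Parser mirrors the interface listed above.
type Parser interface {
	Parse(context.Context, *bytes.Reader, io.Writer) error
}

// lineParser is a hypothetical Parser that trims lines and skips empty ones.
type lineParser struct{}

func (lineParser) Parse(ctx context.Context, r *bytes.Reader, w io.Writer) error {
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		// Stop early if the caller cancelled the context.
		if err := ctx.Err(); err != nil {
			return err
		}
		line := strings.TrimSpace(sc.Text())
		if line == "" {
			continue
		}
		if _, err := fmt.Fprintln(w, line); err != nil {
			return err
		}
	}
	return sc.Err()
}

// parseString runs a Parser over a string and returns the collected output.
func parseString(p Parser, s string) (string, error) {
	var out bytes.Buffer
	err := p.Parse(context.Background(), bytes.NewReader([]byte(s)), &out)
	return out.String(), err
}

func main() {
	out, err := parseString(lineParser{}, "  first \n\n second\n")
	fmt.Printf("%q %v\n", out, err)
}
```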

type ReadStatus

type ReadStatus = int32
const (
	Unread ReadStatus = iota
	Reading
	ReadCompleted
)
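
Because ReadStatus is an alias for int32, its values can be passed directly to the sync/atomic functions. The sketch below shows one way such a status could guard against concurrent reads with a compare-and-swap. Whether the package does this internally is an assumption; source, begin, finish and readStatus are hypothetical names, and errReading is a local stand-in for ErrReading.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// ReadStatus and its constants mirror the declarations listed above.
type ReadStatus = int32

const (
	Unread ReadStatus = iota
	Reading
	ReadCompleted
)

// errReading is a local stand-in for document.ErrReading.
var errReading = errors.New("document is reading")

// source is a hypothetical document source that tracks its read state.
type source struct {
	status ReadStatus
}

// begin atomically moves Unread to Reading. For simplicity, this sketch
// rejects every other starting state with errReading.
func (s *source) begin() error {
	if !atomic.CompareAndSwapInt32(&s.status, Unread, Reading) {
		return errReading
	}
	return nil
}

// finish marks the read as completed.
func (s *source) finish() { atomic.StoreInt32(&s.status, ReadCompleted) }

// readStatus returns the current state without a data race.
func (s *source) readStatus() ReadStatus { return atomic.LoadInt32(&s.status) }

func main() {
	s := &source{}
	fmt.Println(s.begin()) // the first caller wins
	fmt.Println(s.begin()) // a second caller is rejected
	s.finish()
	fmt.Println(s.readStatus() == ReadCompleted)
}
```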

type ReadableDocument

type ReadableDocument interface {
	ReadAll() error
	Read() (chan<- []byte, error)
}

Directories

Path Synopsis
Package parsers includes different parser implementations
docx
Package docx is a parser for docx
html
Package html is a parser for html
pdf
Package pdf is a parser for PDF
pptx
Package pptx is a parser for pptx
xlsx
Package xlsx is an xlsx parser
