Documentation ¶
Index ¶
- type ContentExtractor
- type GenericStreamWrapper
- func NewGenericStreamWrapper(name string, types []string, extractor ContentExtractor) *GenericStreamWrapper
- func (w *GenericStreamWrapper) GetSupportedTypes() []string
- func (w *GenericStreamWrapper) Parse(ctx context.Context, content []byte, metadata map[string]any) (*core.Document, error)
- func (w *GenericStreamWrapper) ParseStream(ctx context.Context, r io.Reader, metadata map[string]any) (<-chan *core.Document, error)
- func (w *GenericStreamWrapper) Supports(contentType string) bool
- type TriplesExtractor
- func NewTriplesExtractor(llm chat.Client) *TriplesExtractor
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type ContentExtractor ¶
ContentExtractor is a function type that takes an io.Reader and returns the full text. It is intended for legacy parsers and for file types that cannot easily be streamed (such as DOCX or PDF).
type GenericStreamWrapper ¶
type GenericStreamWrapper struct {
// contains filtered or unexported fields
}
GenericStreamWrapper adapts a full-read ContentExtractor to the streaming core.Parser interface. It serves as a bridge for legacy parsers.
func NewGenericStreamWrapper ¶
func NewGenericStreamWrapper(name string, types []string, extractor ContentExtractor) *GenericStreamWrapper
func (*GenericStreamWrapper) GetSupportedTypes ¶
func (w *GenericStreamWrapper) GetSupportedTypes() []string
func (*GenericStreamWrapper) Parse ¶ added in v1.1.3
func (w *GenericStreamWrapper) Parse(ctx context.Context, content []byte, metadata map[string]any) (*core.Document, error)
Parse implements core.Parser using the stream parser internally.
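func (*GenericStreamWrapper) ParseStream ¶
func (w *GenericStreamWrapper) ParseStream(ctx context.Context, r io.Reader, metadata map[string]any) (<-chan *core.Document, error)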
func (*GenericStreamWrapper) Supports ¶ added in v1.1.3
func (w *GenericStreamWrapper) Supports(contentType string) bool
Supports checks if the content type is supported.
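The matching rules used by Supports are not documented here. As a rough sketch, a check against the list of registered types could look like this. The helper supportsType is hypothetical, and both the case-insensitive comparison and the stripping of parameters such as "; charset=utf-8" are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// supportsType reports whether contentType matches one of the registered
// types. Case-insensitive matching and parameter stripping are assumptions.
func supportsType(registered []string, contentType string) bool {
	// Keep only the part before any ";" parameter section.
	base, _, _ := strings.Cut(contentType, ";")
	base = strings.TrimSpace(base)
	for _, t := range registered {
		if strings.EqualFold(t, base) {
			return true
		}
	}
	return false
}

func main() {
	types := []string{"application/pdf", "text/plain"}
	fmt.Println(supportsType(types, "text/plain; charset=utf-8"))
	fmt.Println(supportsType(types, "image/png"))
}
```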
type TriplesExtractor ¶
type TriplesExtractor struct {
// contains filtered or unexported fields
}
TriplesExtractor uses an LLM to extract knowledge triples from text.
func NewTriplesExtractor ¶
func NewTriplesExtractor(llm chat.Client) *TriplesExtractor