Documentation ¶
Index ¶
- Constants
- func NormalizeEntityKey(label, text string) string
- func NormalizeRelationKey(label, headEntityKey, tailEntityKey string) string
- type Enricher
- type Entity
- type EntityRecord
- type ExtractOptions
- type Extraction
- type Extractor
- type Option
- func WithBatchSize(batchSize int) Option
- func WithEntityLabels(labels []string) Option
- func WithEntityThreshold(threshold float32) Option
- func WithRelationLabels(labels []string) Option
- func WithRelationThreshold(threshold float32) Option
- func WithTextBuilder(fn func(docsaf.DocumentSection) string) Option
- type Relation
- type RelationRecord
- type Result
Constants ¶
const DefaultBatchSize = 32
Variables ¶
This section is empty.
Functions ¶
func NormalizeEntityKey ¶
func NormalizeEntityKey(label, text string) string
NormalizeEntityKey creates a stable, sortable key from an entity label and text.
func NormalizeRelationKey ¶
func NormalizeRelationKey(label, headEntityKey, tailEntityKey string) string
NormalizeRelationKey creates a stable key for a relation between two entity keys.
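The exact normalization rules are not shown on this page. As a rough illustration only, the sketch below assumes that keys are built by lowercasing, collapsing whitespace, and joining the parts with a separator. The names sketchEntityKey and sketchRelationKey, and the ":" and "|" separators, are hypothetical and are not part of the package.

```go
package main

import (
	"fmt"
	"strings"
)

// sketchEntityKey is a hypothetical stand-in for NormalizeEntityKey.
// Assumption: both parts are lowercased, whitespace is collapsed, and the
// label comes first so that keys sort by label, then by text.
func sketchEntityKey(label, text string) string {
	norm := func(s string) string {
		return strings.Join(strings.Fields(strings.ToLower(s)), " ")
	}
	return norm(label) + ":" + norm(text)
}

// sketchRelationKey is a hypothetical stand-in for NormalizeRelationKey.
// Assumption: the relation label is lowercased and placed between the
// head and tail entity keys.
func sketchRelationKey(label, headEntityKey, tailEntityKey string) string {
	label = strings.Join(strings.Fields(strings.ToLower(label)), "_")
	return headEntityKey + "|" + label + "|" + tailEntityKey
}

func main() {
	head := sketchEntityKey("Person", "  Ada   Lovelace ")
	tail := sketchEntityKey("Org", "Analytical Engine Co")
	fmt.Println(head)
	fmt.Println(sketchRelationKey("works at", head, tail))
}
```

Whatever the real rules are, the useful property is that two mentions differing only in case or spacing map to the same key.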
Types ¶
type Enricher ¶
type Enricher struct {
	// contains filtered or unexported fields
}
Enricher runs batched entity extraction over docsaf sections.
func NewEnricher ¶
NewEnricher creates a new entity enricher.
type EntityRecord ¶
EntityRecord tracks a canonical entity node and how often it was mentioned.
func (EntityRecord) ToDocument ¶
func (r EntityRecord) ToDocument() map[string]any
ToDocument converts an EntityRecord to a storage-ready document map.
type ExtractOptions ¶
ExtractOptions configures an Extractor request.
type Extraction ¶
Extraction contains the entities and relations extracted from one input text.
type Extractor ¶
type Extractor interface {
	Extract(ctx context.Context, texts []string, opts ExtractOptions) ([]Extraction, error)
}
Extractor extracts entities and optionally relations from a batch of texts.
type Option ¶
type Option func(*Enricher)
Option configures an Enricher.
func WithBatchSize ¶
func WithBatchSize(batchSize int) Option
WithBatchSize sets the number of sections per extractor request.
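As an illustration of what "sections per extractor request" implies, the sketch below splits a slice into consecutive batches. The helper name batches is hypothetical, and the fallback to 32 (the value of DefaultBatchSize) for non-positive sizes is an assumption, not documented behavior.

```go
package main

import "fmt"

// batches splits items into consecutive slices of at most size elements.
// Assumption: a non-positive size falls back to 32, mirroring DefaultBatchSize.
func batches[T any](items []T, size int) [][]T {
	if size <= 0 {
		size = 32
	}
	var out [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	sections := make([]string, 70) // 70 placeholder section texts
	for i, b := range batches(sections, 32) {
		fmt.Printf("request %d carries %d sections\n", i, len(b))
	}
}
```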
func WithEntityLabels ¶
func WithEntityLabels(labels []string) Option
WithEntityLabels sets entity labels used by the extractor.
func WithEntityThreshold ¶
func WithEntityThreshold(threshold float32) Option
WithEntityThreshold sets the minimum entity score to keep.
func WithRelationLabels ¶
func WithRelationLabels(labels []string) Option
WithRelationLabels sets relation labels used by the extractor.
func WithRelationThreshold ¶
func WithRelationThreshold(threshold float32) Option
WithRelationThreshold sets the minimum relation score to keep.
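A minimum-score threshold can be pictured as a simple filter. The sketch below is an assumption about the behavior, not the package's code: it treats the comparison as inclusive (score >= threshold), and the scored type and keepAtLeast helper are hypothetical.

```go
package main

import "fmt"

// scored is a hypothetical stand-in for an extracted entity or relation.
type scored struct {
	Label string
	Score float32
}

// keepAtLeast returns the items whose score is at or above threshold.
// Assumption: the comparison is inclusive.
func keepAtLeast(items []scored, threshold float32) []scored {
	kept := make([]scored, 0, len(items))
	for _, it := range items {
		if it.Score >= threshold {
			kept = append(kept, it)
		}
	}
	return kept
}

func main() {
	items := []scored{{"works_at", 0.9}, {"located_in", 0.5}, {"founded", 0.2}}
	for _, it := range keepAtLeast(items, 0.5) {
		fmt.Println(it.Label, it.Score)
	}
}
```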
func WithTextBuilder ¶
func WithTextBuilder(fn func(docsaf.DocumentSection) string) Option
WithTextBuilder overrides how section text is prepared for extraction.
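The With* functions follow Go's functional options pattern: each returns a function that mutates the value under construction. Because the real Enricher's fields are unexported and the NewEnricher signature is not shown here, the sketch below uses a hypothetical lowercase enricher type, hypothetical field names, and a hypothetical constructor to show how such options are typically applied. Ignoring non-positive batch sizes is also an assumption.

```go
package main

import "fmt"

const defaultBatchSize = 32 // mirrors DefaultBatchSize

// enricher is a hypothetical stand-in; the real Enricher's fields are unexported.
type enricher struct {
	batchSize       int
	entityLabels    []string
	entityThreshold float32
}

// option mirrors the documented shape `type Option func(*Enricher)`.
type option func(*enricher)

func withBatchSize(n int) option {
	return func(e *enricher) {
		if n > 0 { // assumption: non-positive sizes are ignored
			e.batchSize = n
		}
	}
}

func withEntityLabels(labels []string) option {
	return func(e *enricher) {
		// Copy so later changes to the caller's slice do not leak in.
		e.entityLabels = append([]string(nil), labels...)
	}
}

func withEntityThreshold(t float32) option {
	return func(e *enricher) { e.entityThreshold = t }
}

// newEnricher starts from defaults, then applies options in order.
func newEnricher(opts ...option) *enricher {
	e := &enricher{batchSize: defaultBatchSize}
	for _, opt := range opts {
		opt(e)
	}
	return e
}

func main() {
	e := newEnricher(
		withBatchSize(8),
		withEntityLabels([]string{"person", "org"}),
		withEntityThreshold(0.5),
	)
	fmt.Println(e.batchSize, e.entityLabels, e.entityThreshold)
}
```

The pattern keeps the constructor signature stable as new settings are added, since each setting is just another option function.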
type RelationRecord ¶
type RelationRecord struct {
	ID           string
	Label        string
	HeadEntity   string
	TailEntity   string
	HeadName     string
	TailName     string
	HeadLabel    string
	TailLabel    string
	Weight       float64
	MentionCount int
}
RelationRecord tracks a canonical relation node linking a head and a tail entity, together with its weight and mention count.
func (RelationRecord) ToDocument ¶
func (r RelationRecord) ToDocument() map[string]any
ToDocument converts a RelationRecord to a storage-ready document map.
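The keys of the resulting map are not documented on this page. The sketch below copies the RelationRecord fields listed above and maps them to snake_case keys. Those key names, and the free function toDocument, are hypothetical and chosen only to illustrate the conversion.

```go
package main

import "fmt"

// RelationRecord repeats the struct shown in the documentation.
type RelationRecord struct {
	ID           string
	Label        string
	HeadEntity   string
	TailEntity   string
	HeadName     string
	TailName     string
	HeadLabel    string
	TailLabel    string
	Weight       float64
	MentionCount int
}

// toDocument is a hypothetical equivalent of RelationRecord.ToDocument.
// Assumption: every field is exported under a snake_case key.
func toDocument(r RelationRecord) map[string]any {
	return map[string]any{
		"id":            r.ID,
		"label":         r.Label,
		"head_entity":   r.HeadEntity,
		"tail_entity":   r.TailEntity,
		"head_name":     r.HeadName,
		"tail_name":     r.TailName,
		"head_label":    r.HeadLabel,
		"tail_label":    r.TailLabel,
		"weight":        r.Weight,
		"mention_count": r.MentionCount,
	}
}

func main() {
	doc := toDocument(RelationRecord{
		ID:           "rel-1",
		Label:        "works_at",
		HeadName:     "Ada",
		TailName:     "Acme",
		Weight:       0.8,
		MentionCount: 3,
	})
	fmt.Println(doc["label"], doc["head_name"], doc["tail_name"], doc["mention_count"])
}
```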
type Result ¶
type Result struct {
	EntityRecords       map[string]EntityRecord
	SectionEntityKeys   map[string][]string
	RelationRecords     map[string]RelationRecord
	SectionRelationKeys map[string][]string
}
Result contains extracted entities and relations grouped by section ID.
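One plausible reading of this layout is that the Section*Keys maps are indexed by section ID and hold keys into the corresponding record maps. Under that assumption, the relations for a section can be resolved as sketched below. The trimmed types and the relationsForSection helper are hypothetical.

```go
package main

import "fmt"

// Trimmed copies of the documented types, keeping only what the sketch needs.
type RelationRecord struct {
	ID           string
	Label        string
	MentionCount int
}

type Result struct {
	RelationRecords     map[string]RelationRecord
	SectionRelationKeys map[string][]string
}

// relationsForSection resolves a section's relation keys to records.
// Assumption: SectionRelationKeys is keyed by section ID and its values
// are keys into RelationRecords.
func relationsForSection(res *Result, sectionID string) []RelationRecord {
	var out []RelationRecord
	for _, key := range res.SectionRelationKeys[sectionID] {
		if rec, ok := res.RelationRecords[key]; ok {
			out = append(out, rec)
		}
	}
	return out
}

func main() {
	res := &Result{
		RelationRecords: map[string]RelationRecord{
			"k1": {ID: "k1", Label: "works_at", MentionCount: 2},
			"k2": {ID: "k2", Label: "located_in", MentionCount: 1},
		},
		SectionRelationKeys: map[string][]string{
			"sec-1": {"k1", "k2"},
			"sec-2": {"k1"},
		},
	}
	for _, rec := range relationsForSection(res, "sec-1") {
		fmt.Println(rec.Label, rec.MentionCount)
	}
}
```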
func (*Result) EntityLabelCounts ¶
EntityLabelCounts summarizes unique entities by label.
func (*Result) RelationLabelCounts ¶
RelationLabelCounts summarizes unique relations by label.
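The return types of these two methods are not shown on this page. Assuming they return a map from label to the number of unique records, the counting step could look like the sketch below. The trimmed record type and the labelCounts helper are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// relationRecord is a trimmed, hypothetical copy of RelationRecord.
type relationRecord struct {
	Label string
}

// labelCounts counts unique records per label.
// Assumption: each map entry is one unique record, so mention counts are
// deliberately not summed here.
func labelCounts(records map[string]relationRecord) map[string]int {
	counts := make(map[string]int)
	for _, r := range records {
		counts[r.Label]++
	}
	return counts
}

func main() {
	counts := labelCounts(map[string]relationRecord{
		"k1": {Label: "works_at"},
		"k2": {Label: "works_at"},
		"k3": {Label: "located_in"},
	})

	// Sort labels so the printed order is deterministic.
	labels := make([]string, 0, len(counts))
	for l := range counts {
		labels = append(labels, l)
	}
	sort.Strings(labels)
	for _, l := range labels {
		fmt.Println(l, counts[l])
	}
}
```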