Documentation ¶
Index ¶
- func BOW(doc Document) []int
- func CalculateBM25Scores(query string, documents []string, avgdl float64, k1 float64, b float64) []float64
- func Cosine(a []float64, b []float64) float64
- func MakeCorpus(a []string) (map[string]int, []string)
- func TF(doc Document) []float64
- type Doc
- type DocScore
- type DocScores
- type Document
- type ScoreFn
- type TFIDF
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func BOW ¶
func BOW(doc Document) []int
BOW turns a document into a bag of words: it deduplicates the words of the document and returns the list of unique word IDs.
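The deduplication that BOW describes can be illustrated with a minimal sketch. It is not the package's implementation: the Document interface with an IDs method, the intDoc type, and the bowSketch function are all hypothetical names introduced here, and returning IDs in first-seen order is an assumption, since the documentation does not state an ordering.

```go
package main

import "fmt"

// Document is a hypothetical stand-in for the package's Document type.
// Assumption: a document can report the IDs of the words it contains.
type Document interface {
	IDs() []int
}

// intDoc is a hypothetical Document backed by a slice of word IDs.
type intDoc []int

// IDs returns the word IDs of the document, duplicates included.
func (d intDoc) IDs() []int { return d }

// bowSketch returns the unique word IDs of doc in first-seen order.
func bowSketch(doc Document) []int {
	seen := make(map[int]struct{})
	var unique []int
	for _, id := range doc.IDs() {
		if _, ok := seen[id]; ok {
			continue // already emitted this word ID
		}
		seen[id] = struct{}{}
		unique = append(unique, id)
	}
	return unique
}

func main() {
	doc := intDoc{3, 1, 3, 2, 1}
	fmt.Println(bowSketch(doc)) // prints [3 1 2]
}
```

A map used as a set keeps the lookup cheap, while the separate slice preserves a deterministic order, which iterating over the map alone would not.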
Types ¶
type DocScores ¶
type DocScores []DocScore
DocScores is a list of DocScore values.
type TFIDF ¶
type TFIDF struct {
	// Term Frequency
	TF map[int]float64

	// Inverse Document Frequency
	IDF map[int]float64

	// Docs is the count of documents
	Docs int

	// Len is the total length of docs
	Len int

	sync.Mutex
}
TFIDF is a structure holding the relevant state information about TF/IDF.
func (*TFIDF) CalculateIDF ¶
func (tf *TFIDF) CalculateIDF()
CalculateIDF calculates the inverse document frequency of the terms tracked by tf.