segment

package
v0.0.0-...-173d032 Latest
Warning

This package is not in the latest version of its module.

Published: Jan 8, 2026 License: AGPL-3.0 Imports: 11 Imported by: 1

README

Register drivers

import (
	_ "github.com/coscms/webfront/library/search/segment/gojieba"
	_ "github.com/coscms/webfront/library/search/segment/jiebago"
	_ "github.com/coscms/webfront/library/search/segment/sego"
)

Documentation

Index

Constants

This section is empty.

Variables

var (
	DefaultEngine = atomic.NewString(`sego`) // gojieba / sego / jiebago

	Filters []Filter
)

var (
	// DictFile is the word-segmentation dictionary file.
	DictFile string
)

Functions

func AddFilter

func AddFilter(filter Filter)

AddFilter appends a new filter to the global Filters collection.

func ApplySegmentConfig

func ApplySegmentConfig(c *config.Config)

ApplySegmentConfig applies segment configuration from the given config object. It initializes or updates the segment engine based on the configuration. If the engine is 'api', it will register a new API segment with the provided URL and key. The function handles engine switching and cleanup of previous segment instances.

func CleanStopWords

func CleanStopWords(v string) string

CleanStopWords replaces every stop word in the input string with a space and returns the resulting string.

func CleanStopWordsFromSlice

func CleanStopWordsFromSlice(v []string) (r []string)

CleanStopWordsFromSlice removes stop words from the input string slice and returns a new slice containing only non-stop words. Stop words are loaded from StopWords() if not already initialized.
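The slice filtering described above can be sketched as follows. This is a minimal illustration, not the package's implementation: the `stopWords` set and the `cleanStopWordsFromSlice` helper are hypothetical local stand-ins, and exact-match lookup is an assumption.

```go
package main

import "fmt"

// stopWords is a hypothetical in-memory stop-word set used only for this sketch.
var stopWords = map[string]struct{}{
	"the": {},
	"a":   {},
	"of":  {},
}

// cleanStopWordsFromSlice returns a new slice holding only the words of v
// that are not in stopWords. Matching is exact (an assumption of this sketch).
func cleanStopWordsFromSlice(v []string) []string {
	r := make([]string, 0, len(v))
	for _, w := range v {
		if _, isStop := stopWords[w]; isStop {
			continue
		}
		r = append(r, w)
	}
	return r
}

func main() {
	words := []string{"the", "art", "of", "go"}
	fmt.Println(cleanStopWordsFromSlice(words)) // prints [art go]
}
```

The input slice is left untouched, which matches the documented behavior of returning a new slice.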

func DoFilter

func DoFilter(v string) bool

DoFilter checks if the given string passes all registered filters. Returns true if all filters accept the string, false otherwise.

func Has

func Has(name string) bool

Has checks if a segment with the given name exists in the segments collection.

func IsInitialized

func IsInitialized() bool

IsInitialized reports whether the default segment has been initialized.

func IsNop

func IsNop(segment Segment) bool

IsNop reports whether the given segment is a no-operation segment.

func LoadStopWordsDict

func LoadStopWordsDict(stopWordsFile string, args ...bool)

LoadStopWordsDict loads stop words from the specified file into memory. If the optional rebuild flag (passed through args) is true, it clears the existing stop words before loading the new ones. Each non-empty line in the file is treated as one stop word after surrounding whitespace is trimmed.

func NewAPI

func NewAPI(apiURL string, apiKey string) *apiSegment

NewAPI creates a new apiSegment instance with the provided API URL and API key.

func Register

func Register(name string, c func() Segment)

Register adds a new segment type with the given name and constructor function.
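A name-to-constructor registry along these lines is sketched below, together with the lookup behavior documented for Has, Unregister, and Get. Everything here is a local stand-in: the trimmed `Segment` interface, `nopSegment`, `wholeSegment`, and the lowercase functions are hypothetical, and the error logging that Get performs is omitted.

```go
package main

import (
	"fmt"
	"sync"
)

// Segment is a trimmed-down stand-in for the package's Segment interface.
type Segment interface {
	Segment(text string, args ...string) []string
}

// nopSegment is a hypothetical no-op engine used as the fallback.
type nopSegment struct{}

func (nopSegment) Segment(string, ...string) []string { return nil }

// wholeSegment is a hypothetical engine that returns the text as one word.
type wholeSegment struct{}

func (wholeSegment) Segment(text string, _ ...string) []string { return []string{text} }

var (
	mu           sync.RWMutex
	constructors = map[string]func() Segment{}
)

// register stores a constructor under name, replacing any previous entry.
func register(name string, c func() Segment) {
	mu.Lock()
	defer mu.Unlock()
	constructors[name] = c
}

// unregister removes the constructor stored under name, if any.
func unregister(name string) {
	mu.Lock()
	defer mu.Unlock()
	delete(constructors, name)
}

// has reports whether a constructor is registered under name.
func has(name string) bool {
	mu.RLock()
	defer mu.RUnlock()
	_, ok := constructors[name]
	return ok
}

// get builds the engine registered under name, or returns the no-op engine
// when the name is unknown.
func get(name string) Segment {
	mu.RLock()
	c, ok := constructors[name]
	mu.RUnlock()
	if !ok {
		return nopSegment{}
	}
	return c()
}

func main() {
	register("whole", func() Segment { return wholeSegment{} })
	fmt.Println(has("whole"), get("whole").Segment("hello"))
	unregister("whole")
	fmt.Println(has("whole"), get("whole").Segment("hello"))
}
```

In the real package, driver subpackages are pulled in with blank imports as shown in the README, which suggests they register themselves at import time; that reading is an inference from the README, not something this page states.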

func ReloadDict

func ReloadDict(dictFiles ...string) error

ReloadDict reloads the dictionary.

func ResetSegment

func ResetSegment()

ResetSegment closes the current segment if it exists and resets the segment initialization state. This allows for a new segment to be created on next use.

func ResetStopwords

func ResetStopwords()

ResetStopwords resets the stopwords initialization state, allowing stopwords to be reloaded on next use.

func SplitWords

func SplitWords(b []byte, args ...string) []string

SplitWords splits the input into words.

func SplitWordsAsString

func SplitWordsAsString(b []byte, args ...string) string

SplitWordsAsString returns the segmentation result as a string.

func SplitWordsBy

func SplitWordsBy(b []byte, mode string, args ...string) []string

SplitWordsBy splits the input into words using the given mode.

func StopWords

func StopWords() []string

StopWords returns the list of stop words. The list is loaded on the first call and served from the cache on subsequent calls.

func Unregister

func Unregister(name string)

Unregister removes the segment with the specified name from the registry.

Types

type Filter

type Filter func(string) bool

type Segment

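type Segment interface {
	// LoadDict loads a dictionary (dictionary path, dictionary type).
	LoadDict(string, ...string) error

	// Segment splits text into words (text, part of speech).
	Segment(string, ...string) []string

	// SegmentBy splits text into words (text, segmentation mode, part of speech).
	SegmentBy(string, string, ...string) []string

	// Close closes the segment or releases its resources.
	Close() error
}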

Segment is the interface implemented by segmentation engines.
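To illustrate what satisfying this interface involves, here is a minimal sketch of a whitespace-based engine. The interface is copied locally so the example is self-contained; `fieldsSegment` is hypothetical, and ignoring the mode and part-of-speech arguments is a simplification that a real engine would likely not make.

```go
package main

import (
	"fmt"
	"strings"
)

// Segment is a local copy of the documented interface.
type Segment interface {
	LoadDict(string, ...string) error
	Segment(string, ...string) []string
	SegmentBy(string, string, ...string) []string
	Close() error
}

// fieldsSegment is a hypothetical engine that splits on whitespace only.
type fieldsSegment struct{}

// LoadDict is a no-op because this engine needs no dictionary.
func (fieldsSegment) LoadDict(string, ...string) error { return nil }

// Segment splits text on runs of whitespace; extra arguments are ignored.
func (fieldsSegment) Segment(text string, _ ...string) []string {
	return strings.Fields(text)
}

// SegmentBy ignores the mode and delegates to Segment.
func (s fieldsSegment) SegmentBy(text string, _ string, args ...string) []string {
	return s.Segment(text, args...)
}

// Close has nothing to release.
func (fieldsSegment) Close() error { return nil }

// Compile-time check that fieldsSegment satisfies Segment.
var _ Segment = fieldsSegment{}

func main() {
	var s Segment = fieldsSegment{}
	fmt.Println(s.Segment("full text search"))
}
```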

func Default

func Default() Segment

Default returns the default Segment instance, initializing it if necessary. The initialization is thread-safe and will only occur once.
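The lazy, thread-safe initialization described here, combined with the reset behavior documented for ResetSegment, can be sketched as follows. This sketch uses a mutex-guarded pointer rather than `sync.Once`, because a `sync.Once` cannot be re-armed in place; how the package itself does it is not stated on this page. The `engine` type and the lowercase helpers are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// engine is a hypothetical stand-in for a segment instance.
type engine struct {
	id     int
	closed bool
}

var (
	mu      sync.Mutex
	current *engine
	created int // number of engines constructed so far
)

// defaultEngine returns the shared engine, creating it on first use.
func defaultEngine() *engine {
	mu.Lock()
	defer mu.Unlock()
	if current == nil {
		created++
		current = &engine{id: created}
	}
	return current
}

// resetEngine closes the current engine, if any, and clears it so that the
// next defaultEngine call builds a fresh one.
func resetEngine() {
	mu.Lock()
	defer mu.Unlock()
	if current != nil {
		current.closed = true
		current = nil
	}
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			defaultEngine()
		}()
	}
	wg.Wait()
	fmt.Println("engines created:", created)

	resetEngine()
	fmt.Println("id after reset:", defaultEngine().id)
}
```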

func Get

func Get(name string) Segment

Get returns the segment engine with the given name. If the specified segment engine is not found, it returns a default no-op segment and logs an error message.
