textutil

package

Version: v0.3.1 (latest) | Published: Jun 1, 2026 | License: MIT | Imports: 7 | Imported by: 0

Repository

github.com/TrebuchetDynamics/goncho

Documentation ¶

Index ¶

func ApproxTokens(content string) int
func CloneStrings(in []string) []string
func CollapseWhitespace(value string) string
func CompactWhitespace(value string, limit int, empty string) string
func ContainsAllSubstringsFold(value string, markers []string) bool
func ContainsAnySubstring(value string, markers []string) bool
func ContainsAnySubstringFold(value string, markers []string) bool
func ContainsEitherSubstring(a, b string) bool
func ContainsEitherSubstringFold(a, b string) bool
func ContainsEqualFoldTrimmed(values []string, want string) bool
func ContainsTrimmed(values []string, want string) bool
func CutAnyPrefixFold(value string, prefixes []string) (tail string, ok bool)
func CutAroundAnySubstringFold(value string, markers []string) (before, after string, ok bool)
func CutAroundAnySubstringFoldMatch(value string, markers []string) (before, marker, after string, ok bool)
func CutBeforeAnySubstringFold(value string, markers ...string) (string, bool)
func EqualFoldTrimmed(a, b string) bool
func EqualTrimmed(a, b string) bool
func FirstNonBlank(values ...string) string
func FirstWords(content string, n int) string
func FitsTokenBudget(used, cost, budget int, allowFirstOverBudget bool) bool
func HasAnyPrefixFold(value string, prefixes ...string) bool
func IsBlank(value string) bool
func LowerTrimmed(value string) string
func LowerTrimmedSet(values []string) map[string]struct{}
func MatchesOptionalTrimmed(value, filter string) bool
func MatchesOptionalTrimmedOrEmpty(value, filter string) bool
func NonBlank(value string) bool
func NormalizeUnique(values []string, normalize Normalizer, sortOutput bool) []string
func Set(values []string, normalize Normalizer) map[string]struct{}
func SortedSetValues(values map[string]struct{}, normalize Normalizer) []string
func TrimQuestionPhraseBoundary(value string) string
func TrimQuestionPunctuation(value string) string
func TrimSentenceBoundary(value string) string
func TrimSpaceAndQuotes(value string) string
func TrimmedSet(values []string) map[string]struct{}
func TruncateUTF8Bytes(value string, limit int) string
func UniqueLowerTrimmed(values []string, sortOutput bool) []string
func UniqueTrimmed(values []string, sortOutput bool) []string
func UpperTrimmed(value string) string
func WordCount(content string) int
type Normalizer

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func ApproxTokens ¶

func ApproxTokens(content string) int

ApproxTokens returns Goncho's stable, low-cost token estimate for budgeting. Blank content is treated as one token so callers never undercount an empty but present field.

func CloneStrings ¶

func CloneStrings(in []string) []string

CloneStrings returns a shallow copy of a string slice.

func CollapseWhitespace ¶

func CollapseWhitespace(value string) string

CollapseWhitespace trims leading/trailing whitespace and converts any run of Unicode whitespace to a single ASCII space.
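
The described behavior can be reproduced with the standard library alone. The following is a minimal sketch, not the package's source; `collapseWhitespace` is a hypothetical local stand-in that assumes "Unicode whitespace" means the set recognized by `unicode.IsSpace`.

```go
package main

import (
	"fmt"
	"strings"
)

// collapseWhitespace is a hypothetical stand-in for CollapseWhitespace.
// strings.Fields splits on runs of unicode.IsSpace characters and drops
// leading and trailing runs, so joining the fields with a single ASCII
// space gives the documented shape.
func collapseWhitespace(value string) string {
	return strings.Join(strings.Fields(value), " ")
}

func main() {
	// Tabs, newlines, and a no-break space (U+00A0) all collapse.
	fmt.Printf("%q\n", collapseWhitespace("  hello\t\n world\u00a0again  ")) // prints "hello world again"
}
```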

func CompactWhitespace ¶

func CompactWhitespace(value string, limit int, empty string) string

CompactWhitespace collapses whitespace and limits the result to limit bytes, trimming a partial trailing word/space boundary the same way existing preview callers historically did.

func ContainsAllSubstringsFold ¶

func ContainsAllSubstringsFold(value string, markers []string) bool

ContainsAllSubstringsFold reports whether value contains every non-blank marker after trimming markers and applying the same simple case-fold policy used by Goncho text filters. Blank markers are ignored.
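
As an illustration of the "every non-blank marker" rule, here is a sketch under two stated assumptions: the "simple case-fold policy" is approximated with `strings.ToLower` on both sides, and a marker list with no non-blank entries matches vacuously. Neither assumption is confirmed by the source, and `containsAllSubstringsFold` is a hypothetical local function.

```go
package main

import (
	"fmt"
	"strings"
)

// containsAllSubstringsFold is a hypothetical stand-in. It assumes the
// case-fold policy is plain lower-casing of both value and marker.
func containsAllSubstringsFold(value string, markers []string) bool {
	folded := strings.ToLower(value)
	for _, m := range markers {
		m = strings.ToLower(strings.TrimSpace(m))
		if m == "" {
			continue // blank markers are ignored, as documented
		}
		if !strings.Contains(folded, m) {
			return false
		}
	}
	// Assumption: with no non-blank markers the check passes vacuously.
	return true
}

func main() {
	fmt.Println(containsAllSubstringsFold("Remember: Alice likes TEA", []string{" alice ", "", "tea"})) // prints true
	fmt.Println(containsAllSubstringsFold("Remember: Alice likes TEA", []string{"alice", "coffee"}))    // prints false
}
```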

func ContainsAnySubstring ¶

func ContainsAnySubstring(value string, markers []string) bool

ContainsAnySubstring reports whether value contains at least one marker.

func ContainsAnySubstringFold ¶

func ContainsAnySubstringFold(value string, markers []string) bool

ContainsAnySubstringFold reports whether value contains at least one marker, comparing with the same simple case-fold policy used by Goncho text filters.

func ContainsEitherSubstring ¶

func ContainsEitherSubstring(a, b string) bool

ContainsEitherSubstring reports whether either value contains the other.

func ContainsEitherSubstringFold ¶

func ContainsEitherSubstringFold(a, b string) bool

ContainsEitherSubstringFold reports whether either value contains the other after applying the same simple case-fold policy used by Goncho text filters.

func ContainsEqualFoldTrimmed ¶

func ContainsEqualFoldTrimmed(values []string, want string) bool

ContainsEqualFoldTrimmed reports whether values contains want after trimming ASCII/Unicode whitespace and applying Unicode case-folding.

func ContainsTrimmed ¶

func ContainsTrimmed(values []string, want string) bool

ContainsTrimmed reports whether values contains want after trimming ASCII/Unicode whitespace on both sides.

func CutAnyPrefixFold ¶

func CutAnyPrefixFold(value string, prefixes []string) (tail string, ok bool)

CutAnyPrefixFold removes the first matching prefix using the same simple case-fold policy as Goncho text classifiers. The returned tail preserves the original casing and spacing from value.
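
One way to keep the tail's original casing is to compare only the leading bytes of value and then slice the untouched input. The sketch below assumes that "first matching prefix" means first in list order, that empty prefixes are skipped, and that value is returned unchanged when nothing matches; `cutAnyPrefixFold` is a hypothetical local function, not the package's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// cutAnyPrefixFold is a hypothetical stand-in. It compares the first
// len(prefix) bytes of value with strings.EqualFold, which is reliable when
// a prefix and its case variants have the same byte length (true for ASCII).
func cutAnyPrefixFold(value string, prefixes []string) (tail string, ok bool) {
	for _, p := range prefixes {
		if p == "" || len(value) < len(p) {
			continue
		}
		if strings.EqualFold(value[:len(p)], p) {
			// Slice the original string so casing and spacing survive.
			return value[len(p):], true
		}
	}
	return value, false
}

func main() {
	tail, ok := cutAnyPrefixFold("Remember That  Tea is HOT", []string{"note:", "remember that"})
	fmt.Printf("%q %v\n", tail, ok) // prints "  Tea is HOT" true
}
```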

func CutAroundAnySubstringFold ¶

func CutAroundAnySubstringFold(value string, markers []string) (before, after string, ok bool)

CutAroundAnySubstringFold splits value around the first matching marker using simple case-folding. The returned parts preserve the original casing and spacing from value.

func CutAroundAnySubstringFoldMatch ¶

func CutAroundAnySubstringFoldMatch(value string, markers []string) (before, marker, after string, ok bool)

CutAroundAnySubstringFoldMatch is like CutAroundAnySubstringFold and also returns the matching policy marker.

func CutBeforeAnySubstringFold ¶

func CutBeforeAnySubstringFold(value string, markers ...string) (string, bool)

CutBeforeAnySubstringFold returns value before the first matching marker, using case-insensitive matching. Empty markers are ignored.

func EqualFoldTrimmed ¶

func EqualFoldTrimmed(a, b string) bool

EqualFoldTrimmed reports whether two strings are equal after trimming ASCII/Unicode whitespace and applying Unicode case-folding.
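
The trim-then-fold comparison maps directly onto two standard library calls. This is a sketch of equivalent behavior with a hypothetical local name, `equalFoldTrimmed`; whether the package is written this way is not stated in the source.

```go
package main

import (
	"fmt"
	"strings"
)

// equalFoldTrimmed is a hypothetical stand-in: strings.TrimSpace removes
// surrounding Unicode whitespace and strings.EqualFold compares under
// simple Unicode case-folding.
func equalFoldTrimmed(a, b string) bool {
	return strings.EqualFold(strings.TrimSpace(a), strings.TrimSpace(b))
}

func main() {
	fmt.Println(equalFoldTrimmed("  Hello\n", "hELLO")) // prints true
	fmt.Println(equalFoldTrimmed("hello", "hello!"))    // prints false
}
```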

func EqualTrimmed ¶

func EqualTrimmed(a, b string) bool

EqualTrimmed reports whether two strings are equal after trimming ASCII/Unicode whitespace.

func FirstNonBlank ¶

func FirstNonBlank(values ...string) string

func FirstWords ¶

func FirstWords(content string, n int) string

FirstWords returns the first n whitespace-delimited words from content. When content has n or fewer words, it preserves the caller-visible trimmed text instead of rebuilding spacing between words.
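
The two branches described above (preserve the trimmed text when it is short enough, otherwise keep only the first n words) can be sketched as follows. `firstWords` is a hypothetical local function; joining the kept words with single spaces and returning an empty string for n <= 0 are assumptions, not documented behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// firstWords is a hypothetical stand-in for FirstWords.
func firstWords(content string, n int) string {
	if n <= 0 {
		return "" // assumption: a non-positive n yields no words
	}
	fields := strings.Fields(content)
	if len(fields) <= n {
		// Short input: keep the caller-visible text, including inner spacing.
		return strings.TrimSpace(content)
	}
	// Assumption: truncated output is rebuilt with single spaces.
	return strings.Join(fields[:n], " ")
}

func main() {
	fmt.Printf("%q\n", firstWords("  one   two three ", 2)) // prints "one two"
	fmt.Printf("%q\n", firstWords("  one   two ", 5))       // prints "one   two"
}
```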

func FitsTokenBudget ¶

func FitsTokenBudget(used, cost, budget int, allowFirstOverBudget bool) bool

FitsTokenBudget reports whether an item with cost can be added after used. When allowFirstOverBudget is true, the first item is admitted even when it exceeds the budget so callers can return at least one relevant result.
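
A budget check of this shape might look like the sketch below. It assumes that "the first item" is detected by `used == 0`, which the source does not spell out; `fitsTokenBudget` is a hypothetical local function.

```go
package main

import "fmt"

// fitsTokenBudget is a hypothetical stand-in for FitsTokenBudget.
func fitsTokenBudget(used, cost, budget int, allowFirstOverBudget bool) bool {
	if used+cost <= budget {
		return true
	}
	// Assumption: nothing has been admitted yet when used is zero.
	return allowFirstOverBudget && used == 0
}

func main() {
	// Greedy packing: skip any item that does not fit in a budget of 100.
	used := 0
	for _, cost := range []int{40, 50, 30, 10} {
		if fitsTokenBudget(used, cost, 100, true) {
			used += cost
		}
	}
	fmt.Println(used) // prints 100

	// An oversized first item is admitted only when the flag is set.
	fmt.Println(fitsTokenBudget(0, 120, 100, true), fitsTokenBudget(0, 120, 100, false)) // prints true false
}
```

Whether a caller skips an item that does not fit or stops at the first failure is a policy choice outside this function; the loop above skips.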

func HasAnyPrefixFold ¶

func HasAnyPrefixFold(value string, prefixes ...string) bool

HasAnyPrefixFold reports whether value starts with any prefix, case-insensitively. Empty prefixes are ignored.

func IsBlank ¶

func IsBlank(value string) bool

IsBlank reports whether value is empty after trimming Unicode whitespace.

func LowerTrimmed ¶

func LowerTrimmed(value string) string

LowerTrimmed trims surrounding whitespace and applies simple lower-casing.

func LowerTrimmedSet ¶

func LowerTrimmedSet(values []string) map[string]struct{}

LowerTrimmedSet returns distinct non-empty strings after trimming and lower-casing.

func MatchesOptionalTrimmed ¶

func MatchesOptionalTrimmed(value, filter string) bool

MatchesOptionalTrimmed reports whether value satisfies an optional exact-match filter after trimming the filter. An empty filter matches every value.

func MatchesOptionalTrimmedOrEmpty ¶

func MatchesOptionalTrimmedOrEmpty(value, filter string) bool

MatchesOptionalTrimmedOrEmpty reports whether value satisfies an optional exact-match filter after trimming the filter, treating an empty value as legacy unscoped data that should not be excluded by the filter.
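
The difference between the two optional-filter helpers is easiest to see side by side. In this sketch both local functions are hypothetical, value is compared without trimming (the descriptions mention trimming only the filter), and "empty value" is taken to mean the empty string exactly; all of these are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesOptionalTrimmed is a hypothetical stand-in: an empty filter
// matches everything, otherwise value must equal the trimmed filter.
func matchesOptionalTrimmed(value, filter string) bool {
	filter = strings.TrimSpace(filter)
	return filter == "" || value == filter
}

// matchesOptionalTrimmedOrEmpty additionally lets empty (legacy, unscoped)
// values through regardless of the filter.
func matchesOptionalTrimmedOrEmpty(value, filter string) bool {
	return value == "" || matchesOptionalTrimmed(value, filter)
}

func main() {
	fmt.Println(matchesOptionalTrimmed("team-a", " team-a ")) // prints true
	fmt.Println(matchesOptionalTrimmed("", "team-a"))         // prints false
	fmt.Println(matchesOptionalTrimmedOrEmpty("", "team-a"))  // prints true
}
```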

func NonBlank ¶

func NonBlank(value string) bool

NonBlank reports whether value has non-whitespace content.

func NormalizeUnique ¶

func NormalizeUnique(values []string, normalize Normalizer, sortOutput bool) []string

NormalizeUnique returns non-empty normalized strings, preserving first-seen order unless sortOutput is true.
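
A first-seen de-duplication pass with optional sorting can be sketched like this. The shape of `Normalizer` is not shown on this page, so the sketch assumes a `func(string) string`; the local names `normalizer`, `normalizeUnique`, and `lowerTrim` are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// normalizer is an assumed shape for the package's Normalizer type.
type normalizer func(string) string

// lowerTrim is an example normalizer used only for this demonstration.
func lowerTrim(s string) string { return strings.ToLower(strings.TrimSpace(s)) }

// normalizeUnique is a hypothetical stand-in for NormalizeUnique.
func normalizeUnique(values []string, normalize normalizer, sortOutput bool) []string {
	seen := make(map[string]struct{}, len(values))
	var out []string
	for _, v := range values {
		if normalize != nil {
			v = normalize(v) // a nil normalizer keeps values as-is
		}
		if v == "" {
			continue
		}
		if _, dup := seen[v]; dup {
			continue
		}
		seen[v] = struct{}{}
		out = append(out, v)
	}
	if sortOutput {
		sort.Strings(out)
	}
	return out
}

func main() {
	in := []string{"rust", " Go ", "go", "", "Rust"}
	fmt.Println(normalizeUnique(in, lowerTrim, false)) // prints [rust go]
	fmt.Println(normalizeUnique(in, lowerTrim, true))  // prints [go rust]
}
```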

func Set ¶

func Set(values []string, normalize Normalizer) map[string]struct{}

Set returns normalized non-empty strings as a set. It preserves nil for empty input or when every normalized value is empty.
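
Preserving nil for empty results falls out naturally if the map is allocated lazily. The sketch below again assumes `Normalizer` behaves like a `func(string) string`; `normalizer`, `set`, and `trimSpace` are hypothetical local names.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizer is an assumed shape for the package's Normalizer type.
type normalizer func(string) string

// trimSpace is an example normalizer used only for this demonstration.
func trimSpace(s string) string { return strings.TrimSpace(s) }

// set is a hypothetical stand-in for Set. The map is created only when the
// first non-empty value appears, so empty results stay nil.
func set(values []string, normalize normalizer) map[string]struct{} {
	var out map[string]struct{}
	for _, v := range values {
		if normalize != nil {
			v = normalize(v)
		}
		if v == "" {
			continue
		}
		if out == nil {
			out = make(map[string]struct{})
		}
		out[v] = struct{}{}
	}
	return out
}

func main() {
	fmt.Println(set(nil, nil) == nil)                         // prints true
	fmt.Println(set([]string{" ", ""}, trimSpace) == nil)     // prints true
	fmt.Println(len(set([]string{" a ", "a", "b"}, trimSpace))) // prints 2
}
```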

func SortedSetValues ¶

func SortedSetValues(values map[string]struct{}, normalize Normalizer) []string

SortedSetValues returns the sorted non-empty keys in values after optional normalization.

func TrimQuestionPhraseBoundary ¶

func TrimQuestionPhraseBoundary(value string) string

TrimQuestionPhraseBoundary removes question punctuation, dots, and spaces as boundary characters, matching classifiers that accept loosely spaced prompts.

func TrimQuestionPunctuation ¶

func TrimQuestionPunctuation(value string) string

TrimQuestionPunctuation removes leading/trailing question punctuation before trimming whitespace, matching the policy used by fact-question classifiers.

func TrimSentenceBoundary ¶

func TrimSentenceBoundary(value string) string

TrimSentenceBoundary removes the sentence punctuation policy used by recall fact classifiers, then trims surrounding whitespace. It intentionally keeps punctuation inside the value unchanged.

func TrimSpaceAndQuotes ¶

func TrimSpaceAndQuotes(value string) string

TrimSpaceAndQuotes trims surrounding whitespace, then removes quote-like boundary characters used by fact extraction and prompt classifiers.

func TrimmedSet ¶

func TrimmedSet(values []string) map[string]struct{}

TrimmedSet returns distinct non-empty strings after whitespace trimming.

func TruncateUTF8Bytes ¶

func TruncateUTF8Bytes(value string, limit int) string

TruncateUTF8Bytes returns value truncated to at most limit bytes without splitting a UTF-8 encoded rune.
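
A common way to truncate on a byte budget without splitting a rune is to step back from the cut point until it lands on a rune start. The following sketch uses `utf8.RuneStart` for that; `truncateUTF8Bytes` is a hypothetical local function, and returning the empty string for a non-positive limit is an assumption.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateUTF8Bytes is a hypothetical stand-in for TruncateUTF8Bytes.
func truncateUTF8Bytes(value string, limit int) string {
	if limit <= 0 {
		return "" // assumption: no bytes allowed means an empty result
	}
	if len(value) <= limit {
		return value
	}
	cut := limit
	// value[cut] is the first byte that would be dropped. If it is a
	// continuation byte, the cut falls inside a rune, so move left.
	for cut > 0 && !utf8.RuneStart(value[cut]) {
		cut--
	}
	return value[:cut]
}

func main() {
	// "é" occupies two bytes, so a limit of 2 cannot include it.
	fmt.Printf("%q\n", truncateUTF8Bytes("héllo", 2)) // prints "h"
	fmt.Printf("%q\n", truncateUTF8Bytes("héllo", 3)) // prints "hé"
	// Each of these characters occupies three bytes.
	fmt.Printf("%q\n", truncateUTF8Bytes("日本語", 4)) // prints "日"
}
```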

func UniqueLowerTrimmed ¶

func UniqueLowerTrimmed(values []string, sortOutput bool) []string

UniqueLowerTrimmed returns distinct non-empty strings after trimming and lower-casing.

func UniqueTrimmed ¶

func UniqueTrimmed(values []string, sortOutput bool) []string

UniqueTrimmed returns distinct non-empty strings after whitespace trimming.

func UpperTrimmed ¶

func UpperTrimmed(value string) string

UpperTrimmed trims surrounding whitespace and converts the value to upper case.

func WordCount ¶

func WordCount(content string) int

WordCount returns the number of whitespace-delimited words in content.

Types ¶

type Normalizer ¶

type Normalizer = stringnorm.Normalizer

Normalizer is the shared contract for callers that canonicalize string values before set/unique operations. A nil Normalizer preserves values as-is.

Directories ¶

Path	Synopsis
foldcase
foldmatch
stringlist
stringnorm
substrmatch
trimmed
utf8limit
wordspace
