mdtext

package
v0.21.0
Published: May 19, 2026 License: MIT Imports: 8 Imported by: 0

Documentation

Constants

This section is empty.

Variables

This section is empty.

Functions

func CountCharacters

func CountCharacters(text string) int

CountCharacters counts letters and digits in text (no spaces or punctuation).
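
The description above can be sketched as follows. This is a minimal illustration, not the package's source: countCharacters is a hypothetical stand-in, and it assumes "letters and digits" means unicode.IsLetter and unicode.IsDigit, which the documentation does not confirm.

```go
package main

import (
	"fmt"
	"unicode"
)

// countCharacters is a hypothetical stand-in for CountCharacters.
// Assumption: a "character" is any rune that unicode.IsLetter or
// unicode.IsDigit accepts; spaces and punctuation are skipped.
func countCharacters(text string) int {
	n := 0
	for _, r := range text {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			n++
		}
	}
	return n
}

func main() {
	fmt.Println(countCharacters("Hello, world 42!")) // prints 12
}
```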

func CountSentences

func CountSentences(text string) int

CountSentences counts sentences by splitting on sentence-ending punctuation (., !, ?) followed by whitespace or end of text. It returns at least 1 for non-empty text.
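
One way to read this rule is sketched below. countSentences is a hypothetical stand-in, and several details are assumptions rather than documented behavior: it counts terminators instead of materializing split segments, it does not count trailing text after the last terminator as an extra sentence, and it returns 0 for the empty string.

```go
package main

import (
	"fmt"
	"unicode"
)

// countSentences is a hypothetical stand-in for CountSentences.
// It counts each '.', '!' or '?' that is followed by whitespace or
// by the end of the text. Assumptions: empty text yields 0, and text
// with no qualifying terminator yields 1.
func countSentences(text string) int {
	if text == "" {
		return 0
	}
	runes := []rune(text)
	n := 0
	for i, r := range runes {
		if r != '.' && r != '!' && r != '?' {
			continue
		}
		if i == len(runes)-1 || unicode.IsSpace(runes[i+1]) {
			n++
		}
	}
	if n == 0 {
		return 1 // non-empty text always counts as at least one sentence
	}
	return n
}

func main() {
	fmt.Println(countSentences("Hello world. How are you? Fine!")) // prints 3
	fmt.Println(countSentences("3.14 is pi."))                     // prints 1
}
```

Because the terminator must be followed by whitespace or the end of text, the dot inside 3.14 is not counted.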

func CountWords

func CountWords(text string) int

CountWords counts whitespace-delimited words in text. The result is exactly len(strings.Fields(text)): a word is a maximal run of non-space runes, where a space is any rune for which IsSpace (equivalent to unicode.IsSpace) reports true. It counts in a single rune scan instead of allocating the []string. CountWords is called per sentence, per paragraph, and per file; the slices that strings.Fields built only to be discarded came to ~0.48 GB over the 600-file check gate (plan 175 profiling).
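
The single-scan idea can be sketched as below. countWords is a hypothetical stand-in written against unicode.IsSpace directly; the package's own implementation may differ in detail. A word is counted at each transition from space (or start of text) to non-space, so no slice is allocated.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// countWords is a hypothetical stand-in for CountWords. It increments
// the count each time the scan enters a run of non-space runes, which
// matches len(strings.Fields(text)) without building the []string.
func countWords(text string) int {
	n := 0
	inWord := false
	for _, r := range text {
		if unicode.IsSpace(r) {
			inWord = false
		} else if !inWord {
			inWord = true
			n++
		}
	}
	return n
}

func main() {
	s := "  the quick\tbrown\nfox  "
	fmt.Println(countWords(s), len(strings.Fields(s))) // prints 4 4
}
```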

func ExtractPlainText

func ExtractPlainText(node ast.Node, source []byte) string

ExtractPlainText extracts readable text from a goldmark AST node, stripping markdown syntax. It keeps text content, link display text, emphasis inner text, image alt text, and code span text.

func IsSpace added in v0.21.0

func IsSpace(r rune) bool

IsSpace reports whether r is a Unicode space. It returns exactly what unicode.IsSpace returns, but through an inlinable ASCII fast path: for r < utf8.RuneSelf the only spaces are ' ' and '\t'..'\r', so two integer comparisons decide it, and only genuine non-ASCII runes pay for unicode.IsSpace's table lookup. It is called per rune of every word of every file on the check hot path, where unicode.IsSpace alone was ~5.5% of CPU (plan 175 profiling).
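
A minimal sketch of such a fast path follows. isSpace is a hypothetical stand-in, and the unsigned range check is one possible way to get the two comparisons; the package may write it differently. The main function checks the sketch against unicode.IsSpace over every rune value.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isSpace is a hypothetical stand-in for IsSpace.
// For ASCII runes the only spaces are ' ' and the range '\t'..'\r'.
// The unsigned subtraction folds the range test into one comparison:
// values below '\t' (and negative runes) wrap around to large numbers.
func isSpace(r rune) bool {
	if r < utf8.RuneSelf {
		return r == ' ' || uint32(r)-'\t' <= '\r'-'\t'
	}
	return unicode.IsSpace(r) // non-ASCII runes take the table lookup
}

func main() {
	mismatches := 0
	for r := rune(0); r <= unicode.MaxRune; r++ {
		if isSpace(r) != unicode.IsSpace(r) {
			mismatches++
		}
	}
	fmt.Println("mismatches:", mismatches) // prints mismatches: 0
}
```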

func Slugify added in v0.6.0

func Slugify(s string) string

Slugify converts heading text to a GitHub-compatible URL anchor slug. The result is lowercased, letters and digits are preserved, and spaces and hyphens become a single dash.
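
The rules above might be sketched like this. slugify is a hypothetical stand-in, and it makes assumptions the documentation does not state: runs of spaces and hyphens collapse into one dash, leading and trailing dashes are dropped, and every other character (punctuation, for instance) is discarded.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// slugify is a hypothetical stand-in for Slugify.
// Assumptions: separator runs collapse to one dash, dashes at either
// end are dropped, and characters other than letters, digits, spaces
// and hyphens are removed.
func slugify(s string) string {
	var b strings.Builder
	pendingDash := false
	for _, r := range strings.ToLower(s) {
		switch {
		case unicode.IsLetter(r) || unicode.IsDigit(r):
			if pendingDash && b.Len() > 0 {
				b.WriteByte('-')
			}
			pendingDash = false
			b.WriteRune(r)
		case r == ' ' || r == '-':
			pendingDash = true // emit at most one dash before the next letter or digit
		}
	}
	return b.String()
}

func main() {
	fmt.Println(slugify("Hello, World - Again!")) // prints hello-world-again
}
```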

func SplitSentences

func SplitSentences(text string) []string

SplitSentences splits text into individual sentences using a Punkt sentence tokenizer. Handles abbreviations, decimals, and ellipses.

Types

type TOCItem added in v0.6.0

type TOCItem struct {
	Level  int
	Text   string
	Anchor string
}

TOCItem represents a single heading entry for table-of-contents generation.

func CollectTOCItems added in v0.6.0

func CollectTOCItems(root ast.Node, source []byte) []TOCItem

CollectTOCItems returns all headings from the AST as TOC items, in document order. Anchors are disambiguated by insertion order: the first occurrence keeps the plain slug and subsequent duplicates get -1, -2, … suffixes, matching the anchor computation in crossfilereferenceintegrity. The function tracks used anchors (not just base slugs) to guarantee unique anchors even when a later heading's base slug matches an earlier heading's disambiguated anchor.
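
The disambiguation step can be sketched in isolation, on slugs rather than AST nodes. uniqueAnchors is a hypothetical helper, and probing suffixes from 1 until a free anchor is found is an assumption about how the suffix is chosen. The point it illustrates is that the used set holds final anchors, so a heading whose own slug is setup-1 cannot collide with an earlier duplicate of setup.

```go
package main

import "fmt"

// uniqueAnchors is a hypothetical helper that assigns a unique anchor
// to each slug, in order. The used set records every anchor handed
// out, not just base slugs. Assumption: suffixes are probed from 1.
func uniqueAnchors(slugs []string) []string {
	used := make(map[string]bool, len(slugs))
	out := make([]string, 0, len(slugs))
	for _, slug := range slugs {
		anchor := slug
		for i := 1; used[anchor]; i++ {
			anchor = fmt.Sprintf("%s-%d", slug, i)
		}
		used[anchor] = true
		out = append(out, anchor)
	}
	return out
}

func main() {
	got := uniqueAnchors([]string{"setup", "setup", "setup-1", "setup"})
	fmt.Println(got) // prints [setup setup-1 setup-1-1 setup-2]
}
```

Tracking only base slugs would give the third heading the anchor setup-1, which the second heading already holds.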
