mdtext

package

Version: v0.21.0 (latest) | Published: May 19, 2026 | License: MIT | Imports: 8 | Imported by: 0

Repository

github.com/jeduden/mdsmith

Documentation ¶

Index ¶

func CountCharacters(text string) int
func CountSentences(text string) int
func CountWords(text string) int
func ExtractPlainText(node ast.Node, source []byte) string
func IsSpace(r rune) bool
func Slugify(s string) string
func SplitSentences(text string) []string
type TOCItem
- func CollectTOCItems(root ast.Node, source []byte) []TOCItem

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func CountCharacters ¶

func CountCharacters(text string) int

CountCharacters counts letters and digits in text (no spaces or punctuation).
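The counting rule above can be sketched with the standard library. This is a minimal illustration, not the package's source: `countCharacters` is a hypothetical stand-in, and it assumes that "letters and digits" means `unicode.IsLetter` and `unicode.IsDigit`.

```go
package main

import (
	"fmt"
	"unicode"
)

// countCharacters is a hypothetical stand-in for mdtext.CountCharacters.
// Assumption: "letters and digits" maps to unicode.IsLetter / unicode.IsDigit.
func countCharacters(text string) int {
	n := 0
	for _, r := range text {
		if unicode.IsLetter(r) || unicode.IsDigit(r) {
			n++
		}
	}
	return n
}

func main() {
	// Spaces and punctuation are skipped: 5 + 5 letters and 2 digits.
	fmt.Println(countCharacters("Hello, world! 42")) // 12
}
```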

func CountSentences ¶

func CountSentences(text string) int

CountSentences counts sentences by splitting on sentence-ending punctuation (., !, ?) followed by whitespace or end of text. Returns at least 1 for non-empty text.
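A literal reading of this rule might look like the sketch below. `countSentences` is a hypothetical stand-in, not the package's source. It assumes that empty text yields 0 and that trailing text without a terminator is not counted separately; the description above does not settle either point.

```go
package main

import (
	"fmt"
	"unicode"
)

// countSentences is a hypothetical sketch of the rule described above.
// Assumptions: empty text yields 0; unterminated trailing text adds nothing.
func countSentences(text string) int {
	if text == "" {
		return 0
	}
	runes := []rune(text)
	count := 0
	for i, r := range runes {
		if r != '.' && r != '!' && r != '?' {
			continue
		}
		// A terminator only counts when followed by whitespace or end of text.
		if i == len(runes)-1 || unicode.IsSpace(runes[i+1]) {
			count++
		}
	}
	if count == 0 {
		return 1 // non-empty text always counts as at least one sentence
	}
	return count
}

func main() {
	fmt.Println(countSentences("Hello world. How are you? Fine!")) // 3
	fmt.Println(countSentences("v1.2 is out"))                     // 1
}
```

The second call shows why the lookahead matters: the period in `v1.2` is followed by a digit, so it is not treated as a sentence boundary.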

func CountWords ¶

func CountWords(text string) int

CountWords counts whitespace-delimited words in text. It returns exactly len(strings.Fields(text)): a word is a maximal run of non-space runes, where space is defined by IsSpace (which matches unicode.IsSpace exactly). It counts in a single rune scan instead of allocating the []string. CountWords is called per sentence, per paragraph, and per file; in plan 175 profiling, the slices that strings.Fields built only to be discarded amounted to ~0.48 GB over the 600-file check gate.
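The single-scan idea can be sketched as follows. `countWords` is a hypothetical stand-in, not the package's source, and it uses `unicode.IsSpace` directly where the real function is described as using `IsSpace`.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// countWords is a hypothetical stand-in for mdtext.CountWords. It counts
// maximal runs of non-space runes in one pass, without building a slice.
func countWords(text string) int {
	n := 0
	inWord := false
	for _, r := range text {
		if unicode.IsSpace(r) {
			inWord = false
			continue
		}
		if !inWord {
			n++ // first rune of a new word
			inWord = true
		}
	}
	return n
}

func main() {
	for _, s := range []string{"", "  one  ", "two words", "tabs\tand\nnewlines here"} {
		// The count should always agree with len(strings.Fields(s)).
		fmt.Println(countWords(s), countWords(s) == len(strings.Fields(s)))
	}
}
```

The state flag `inWord` replaces the slice: a word is counted once, at its first rune, so no per-word allocation is needed.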

func ExtractPlainText ¶

func ExtractPlainText(node ast.Node, source []byte) string

ExtractPlainText extracts readable text from a goldmark AST node, stripping markdown syntax. Keeps: text content, link display text, emphasis inner text, image alt text, code span text.

func IsSpace ¶ added in v0.21.0

func IsSpace(r rune) bool

IsSpace reports whether r is a Unicode space. It returns exactly what unicode.IsSpace returns but adds an inlinable ASCII fast path: for r < utf8.RuneSelf the only spaces are ' ' and '\t'..'\r', so two integer comparisons decide the result and only genuine non-ASCII runes pay for unicode.IsSpace's table lookup. It is called for every rune of every word of every file on the check hot path, where unicode.IsSpace alone accounted for ~5.5% of CPU (plan 175 profiling).
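A minimal sketch of such a fast path is shown below. `isSpace` is a hypothetical stand-in, not the package's source; the real function may express the ASCII range check differently. The `main` function checks the equivalence with `unicode.IsSpace` over every code point.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// isSpace is a hypothetical sketch of the ASCII fast path described above.
func isSpace(r rune) bool {
	if r < utf8.RuneSelf {
		// ASCII: only ' ' and the range '\t'..'\r' are spaces.
		return r == ' ' || (r >= '\t' && r <= '\r')
	}
	// Non-ASCII runes fall back to the table lookup.
	return unicode.IsSpace(r)
}

func main() {
	// Exhaustively compare against unicode.IsSpace.
	mismatches := 0
	for r := rune(0); r <= unicode.MaxRune; r++ {
		if isSpace(r) != unicode.IsSpace(r) {
			mismatches++
		}
	}
	fmt.Println("mismatches:", mismatches) // mismatches: 0
}
```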

func Slugify ¶ added in v0.6.0

func Slugify(s string) string

Slugify converts heading text to a GitHub-compatible URL anchor slug. The result is lowercased, letters and digits are preserved, and spaces and hyphens become a single dash.

func SplitSentences ¶

func SplitSentences(text string) []string

SplitSentences splits text into individual sentences using a Punkt sentence tokenizer. Handles abbreviations, decimals, and ellipses.

Types ¶

type TOCItem ¶ added in v0.6.0

type TOCItem struct {
	Level  int
	Text   string
	Anchor string
}

TOCItem represents a single heading entry for table-of-contents generation.
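One possible way to consume these items is to render a nested Markdown list. The sketch below is an illustration only: `renderTOC` is a hypothetical helper that is not part of this package, the struct is redeclared locally so the example is self-contained, and `Level` is assumed to be the heading level starting at 1.

```go
package main

import (
	"fmt"
	"strings"
)

// TOCItem mirrors the struct documented above so the sketch is self-contained.
type TOCItem struct {
	Level  int
	Text   string
	Anchor string
}

// renderTOC is a hypothetical helper, not part of mdtext.
// Assumption: Level is the heading level, starting at 1.
func renderTOC(items []TOCItem) string {
	var b strings.Builder
	for _, it := range items {
		depth := it.Level - 1
		if depth < 0 {
			depth = 0 // guard against unexpected levels
		}
		fmt.Fprintf(&b, "%s- [%s](#%s)\n", strings.Repeat("  ", depth), it.Text, it.Anchor)
	}
	return b.String()
}

func main() {
	items := []TOCItem{
		{Level: 1, Text: "Setup", Anchor: "setup"},
		{Level: 2, Text: "Install", Anchor: "install"},
	}
	fmt.Print(renderTOC(items))
}
```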

func CollectTOCItems ¶ added in v0.6.0

func CollectTOCItems(root ast.Node, source []byte) []TOCItem

CollectTOCItems returns all headings from the AST as TOC items, in document order. Anchors are disambiguated by insertion order: the first occurrence keeps the plain slug, and subsequent duplicates get -1, -2, … suffixes, matching the anchor computation in crossfilereferenceintegrity. The function tracks used anchors (not just base slugs) to guarantee unique anchors even when a later heading's base slug matches an earlier heading's disambiguated anchor.
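The disambiguation rule can be sketched on plain slugs, without an AST. `uniqueAnchors` is a hypothetical helper, not the package's source, and the exact suffix search is an assumption. It illustrates why tracking used anchors, rather than a counter per base slug, avoids collisions.

```go
package main

import "fmt"

// uniqueAnchors is a hypothetical helper: given base slugs in document
// order, it returns one unique anchor per slug.
func uniqueAnchors(slugs []string) []string {
	used := make(map[string]bool, len(slugs))
	out := make([]string, 0, len(slugs))
	for _, slug := range slugs {
		anchor := slug
		// Try slug, slug-1, slug-2, ... until an unused anchor is found.
		for n := 1; used[anchor]; n++ {
			anchor = fmt.Sprintf("%s-%d", slug, n)
		}
		used[anchor] = true
		out = append(out, anchor)
	}
	return out
}

func main() {
	// The third heading's base slug collides with the second heading's
	// disambiguated anchor, so it needs its own suffix.
	fmt.Println(uniqueAnchors([]string{"setup", "setup", "setup-1", "setup"}))
	// [setup setup-1 setup-1-1 setup-2]
}
```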

Source Files ¶

mdtext.go
