tgmd

v0.2.0 Latest
Published: Mar 26, 2026 License: MIT Imports: 10 Imported by: 0

README

telegramify-markdown-go

A Go library that converts Markdown to Telegram-compatible plain text with MessageEntity objects.

Uses goldmark for Markdown parsing.

Why entities instead of MarkdownV2?

The Telegram Bot API offers two ways to format messages:

1. ParseMode (MarkdownV2 or HTML) — you embed formatting markers in the text:

Text: "*bold* and _italic_"
ParseMode: "MarkdownV2"

This requires escaping 18 special characters (_ * [ ] ( ) ~ ` > # + - = | { } . !) with context-dependent rules. Different rules apply inside code blocks, URLs, and normal text. A single unescaped . or ! will make Telegram reject the entire message. Escaping mistakes are a common source of bugs in Telegram formatting libraries.

2. Entities — you send plain text plus an array of entity objects with numeric offsets:

Text: "bold and italic"
Entities: [
  {"type": "bold", "offset": 0, "length": 4},
  {"type": "italic", "offset": 9, "length": 6}
]

No markers in the text. No escaping. No parse failures. Telegram applies formatting based on UTF-16 offsets. The complexity shifts to computing offsets correctly — which is deterministic math, not fragile string manipulation.

This library uses the entities approach exclusively.
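
To make the offset arithmetic concrete, here is a minimal sketch that computes entity offsets by hand using only the Go standard library. The helper names utf16Len and utf16Offset are illustrative and are not part of this package; the point is that a character outside the Basic Multilingual Plane shifts every later offset by 2 units rather than 1.

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf16"
)

// utf16Len reports how many UTF-16 code units s occupies.
func utf16Len(s string) int {
	return len(utf16.Encode([]rune(s)))
}

// utf16Offset returns the UTF-16 offset of the first occurrence of sub in
// text, or -1 if sub is not present. (Illustrative helper, not library API.)
func utf16Offset(text, sub string) int {
	i := strings.Index(text, sub) // byte index into the UTF-8 string
	if i < 0 {
		return -1
	}
	return utf16Len(text[:i])
}

func main() {
	// U+1F44D (thumbs up) needs a surrogate pair, so it counts as 2 units.
	text := "\U0001F44D bold and italic"

	fmt.Println(utf16Offset(text, "bold"), utf16Len("bold"))     // 3 4
	fmt.Println(utf16Offset(text, "italic"), utf16Len("italic")) // 12 6
}
```

Counting bytes or runes instead of UTF-16 units would give 5 or 2 for the offset of "bold", and Telegram would format the wrong span.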

Install

go get github.com/eekstunt/telegramify-markdown-go

Usage

import tgmd "github.com/eekstunt/telegramify-markdown-go"

// Convert markdown to plain text + entities
msg := tgmd.Convert("**bold** and *italic*")
// msg.Text     = "bold and italic"
// msg.Entities = [{Bold, 0, 4}, {Italic, 9, 6}]

// Convert and split into <=4096 UTF-16 unit messages
msgs := tgmd.ConvertAndSplit(longMarkdown)

Example with go-telegram/bot

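import (
    "log"

    "github.com/go-telegram/bot"
    "github.com/go-telegram/bot/models"
    tgmd "github.com/eekstunt/telegramify-markdown-go"
)

// Convert and send. b is a *bot.Bot; ctx and chatID come from the caller.
msgs := tgmd.ConvertAndSplit(markdown)
for _, msg := range msgs {
    _, err := b.SendMessage(ctx, &bot.SendMessageParams{
        ChatID:   chatID,
        Text:     msg.Text,
        Entities: toEntities(msg.Entities),
    })
    if err != nil {
        log.Printf("send message: %v", err)
        break // stop so later chunks are not sent without the earlier ones
    }
}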

// tgmd.Entity fields map 1:1 to Telegram's MessageEntity,
// so the conversion is a straightforward struct copy:
func toEntities(ents []tgmd.Entity) []models.MessageEntity {
    out := make([]models.MessageEntity, len(ents))
    for i, e := range ents {
        out[i] = models.MessageEntity{
            Type:     models.MessageEntityType(e.Type),
            Offset:   e.Offset,
            Length:   e.Length,
            URL:      e.URL,
            Language: e.Language,
        }
    }
    return out
}

Supported Markdown elements

  • Bold, italic, strikethrough
  • Inline code and fenced code blocks (with language)
  • Links and images (rendered as links)
  • Headings (bold/underline/italic depending on level, with configurable emoji prefix)
  • Ordered and unordered lists (with nesting)
  • Blockquotes
  • GFM tables (rendered as monospace pre blocks)
  • Task lists (with unicode checkmarks)
  • Horizontal rules

Configuration

msg := tgmd.Convert(markdown,
    tgmd.WithHeadingSymbols([6]string{"📌", "✏", "📚", "🔖", "", ""}),
    tgmd.WithTaskMarkers("✅", "☐"),
    tgmd.WithBulletMarker("⦁"),
    tgmd.WithMaxMessageLen(4096),
)

License

MIT

Acknowledgments

This library is inspired by the excellent telegramify-markdown Python library by @sudoskys. Their pioneering work on the entity-based approach (moving away from MarkdownV2 strings in v1.0.0) directly influenced the architecture of this Go implementation. Thank you for building such a well-thought-out solution and sharing it with the community.

Documentation

Overview

Package tgmd converts Markdown to Telegram-compatible plain text with MessageEntity objects.

Instead of producing MarkdownV2 strings (which require escaping 18 special characters with context-dependent rules), tgmd outputs plain text paired with entity objects that use UTF-16 offsets — matching what the Telegram Bot API expects natively.

All functions are safe for concurrent use.

Functions

func SplitAtUTF16

func SplitAtUTF16(s string, offset int) (prefix, suffix string)

SplitAtUTF16 splits s at the given UTF-16 offset. Returns the prefix (up to the offset) and the suffix (from the offset onward). If offset is beyond the string length, returns (s, "").
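
The following is a rough sketch of how a split at a UTF-16 offset can be done on a Go (UTF-8) string. It is not the package's implementation: splitAtUTF16 is a hypothetical stand-in, and its handling of an offset that falls inside a surrogate pair (the whole character stays in the prefix) is an assumption, since the documentation above does not specify that case.

```go
package main

import "fmt"

// splitAtUTF16 walks s rune by rune, counting UTF-16 code units, and cuts at
// the first rune boundary where the count reaches offset.
// Assumption: if offset lands inside a surrogate pair, the character is kept
// whole in the prefix.
func splitAtUTF16(s string, offset int) (prefix, suffix string) {
	units := 0
	for i, r := range s {
		if units >= offset {
			return s[:i], s[i:]
		}
		if r >= 0x10000 {
			units += 2 // supplementary character: surrogate pair
		} else {
			units++
		}
	}
	return s, "" // offset is at or beyond the end of s
}

func main() {
	p, s := splitAtUTF16("a\U0001F44Db", 3)
	fmt.Printf("%q %q\n", p, s) // "a👍" "b"

	p, s = splitAtUTF16("abc", 10)
	fmt.Printf("%q %q\n", p, s) // "abc" ""
}
```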

func UTF16Len

func UTF16Len(s string) int

UTF16Len returns the number of UTF-16 code units needed to encode s. Characters in the Basic Multilingual Plane (below U+10000) take 1 unit; supplementary characters (U+10000 and above, which includes most emoji) take 2.

func UTF16RuneLen

func UTF16RuneLen(r rune) int

UTF16RuneLen returns the number of UTF-16 code units for a single rune.
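
As an illustration of the counting rule, the sketch below re-implements both length helpers locally (lowercase names, so they are clearly not the package's own functions) and compares a BMP symbol with a supplementary one. It assumes valid UTF-8 input.

```go
package main

import "fmt"

// utf16RuneLen mirrors the documented rule: 1 unit below U+10000, else 2.
func utf16RuneLen(r rune) int {
	if r >= 0x10000 {
		return 2
	}
	return 1
}

// utf16Len sums the per-rune unit counts over the whole string.
func utf16Len(s string) int {
	n := 0
	for _, r := range s {
		n += utf16RuneLen(r)
	}
	return n
}

func main() {
	fmt.Println(utf16Len("hello"))      // 5
	fmt.Println(utf16Len("\u270f"))     // 1: U+270F (pencil) is in the BMP
	fmt.Println(utf16Len("\U0001F4CC")) // 2: U+1F4CC (pushpin) is supplementary
	fmt.Println(len("\U0001F4CC"))      // 4: byte length in UTF-8, for contrast
}
```

Note that not every emoji is supplementary: the pencil above is a single unit, which is why counting must be done per code point rather than by guessing from the character's appearance.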

Types

type Entity

type Entity struct {
	Type     EntityType
	Offset   int    // UTF-16 offset from start of text
	Length   int    // length in UTF-16 code units
	URL      string // only for TextLink
	Language string // only for Pre (fenced code block language)
}

Entity represents a Telegram MessageEntity. Offset and Length are in UTF-16 code units (Telegram's native encoding).

type EntityType

type EntityType string

EntityType represents the type of a Telegram message entity. Values match Telegram Bot API MessageEntity type strings.

const (
	Bold          EntityType = "bold"
	Italic        EntityType = "italic"
	Underline     EntityType = "underline"
	Strikethrough EntityType = "strikethrough"
	Code          EntityType = "code"
	Pre           EntityType = "pre"
	TextLink      EntityType = "text_link"
	Blockquote    EntityType = "blockquote"
)

type Message

type Message struct {
	Text     string
	Entities []Entity
}

Message is a single Telegram message: plain text plus formatting entities.

func Convert

func Convert(markdown string, opts ...Option) Message

Convert parses Markdown text and returns plain text with Telegram entities.

Example

package main

import (
	"fmt"

	tgmd "github.com/eekstunt/telegramify-markdown-go"
)

func main() {
	msg := tgmd.Convert("**bold** and *italic*")
	fmt.Println(msg.Text)
	for _, e := range msg.Entities {
		fmt.Printf("%s at %d len %d\n", e.Type, e.Offset, e.Length)
	}
}

Output:
bold and italic
bold at 0 len 4
italic at 9 len 6

Example (CodeBlock)

package main

import (
	"fmt"

	tgmd "github.com/eekstunt/telegramify-markdown-go"
)

func main() {
	msg := tgmd.Convert("```go\nfmt.Println(\"hello\")\n```")
	fmt.Println(msg.Text)
	for _, e := range msg.Entities {
		fmt.Printf("%s lang=%q at %d len %d\n", e.Type, e.Language, e.Offset, e.Length)
	}
}

Output:
fmt.Println("hello")
pre lang="go" at 0 len 20

Example (Options)

package main

import (
	"fmt"

	tgmd "github.com/eekstunt/telegramify-markdown-go"
)

func main() {
	msg := tgmd.Convert("# Hello",
		tgmd.WithHeadingSymbols([6]string{">>", "", "", "", "", ""}),
	)
	fmt.Println(msg.Text)
}

Output:
>> Hello

func ConvertAndSplit

func ConvertAndSplit(markdown string, opts ...Option) []Message

ConvertAndSplit parses Markdown and splits the result into messages that each fit within the configured max message length (UTF-16 units).

Example

package main

import (
	"fmt"

	tgmd "github.com/eekstunt/telegramify-markdown-go"
)

func main() {
	msgs := tgmd.ConvertAndSplit("short message")
	fmt.Println(len(msgs))
	fmt.Println(msgs[0].Text)
}

Output:
1
short message

func Split

func Split(msg Message, maxLen int) []Message

Split divides a Message into multiple Messages, each within maxLen UTF-16 code units. Entities spanning a split boundary are clipped into both chunks.
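
One way to picture the clipping step is the sketch below. It uses a local entity type and a hypothetical clip helper rather than the package's internals, and it assumes that each clipped piece has its Offset rebased to the start of its own chunk, which Telegram requires because offsets are relative to the text of the message being sent.

```go
package main

import "fmt"

// entity mirrors the Type, Offset, and Length fields of tgmd.Entity.
type entity struct {
	Type   string
	Offset int // UTF-16 offset from the start of the text
	Length int // length in UTF-16 code units
}

// clip returns the part of e inside the chunk [start, end), with Offset
// rebased to the start of the chunk. ok is false when e does not overlap it.
func clip(e entity, start, end int) (out entity, ok bool) {
	lo, hi := e.Offset, e.Offset+e.Length
	if lo < start {
		lo = start
	}
	if hi > end {
		hi = end
	}
	if lo >= hi {
		return entity{}, false
	}
	return entity{Type: e.Type, Offset: lo - start, Length: hi - lo}, true
}

func main() {
	bold := entity{Type: "bold", Offset: 8, Length: 6} // covers units 8..13

	first, _ := clip(bold, 0, 10)   // chunk 1 holds units 0..9
	second, _ := clip(bold, 10, 20) // chunk 2 holds units 10..19

	fmt.Printf("%+v\n", first)  // {Type:bold Offset:8 Length:2}
	fmt.Printf("%+v\n", second) // {Type:bold Offset:0 Length:4}
}
```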

type Option

type Option func(*config)

Option configures the converter.
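
The Option type follows the common Go functional-options pattern. The sketch below shows how such options are typically applied; the config fields, the newConfig helper, and the default bullet marker are assumptions made for illustration, since the package's config struct is unexported and not documented here. Only the 4096 limit comes from the README.

```go
package main

import "fmt"

// config holds converter settings. Field names are hypothetical.
type config struct {
	bulletMarker  string
	maxMessageLen int
}

// Option mutates a config, matching the documented signature.
type Option func(*config)

// WithBulletMarker returns an Option that overrides the bullet marker.
func WithBulletMarker(s string) Option {
	return func(c *config) { c.bulletMarker = s }
}

// WithMaxMessageLen returns an Option that overrides the split length.
func WithMaxMessageLen(n int) Option {
	return func(c *config) { c.maxMessageLen = n }
}

// newConfig starts from defaults and applies the options in order, so a
// later option wins over an earlier one for the same field.
func newConfig(opts ...Option) config {
	c := config{bulletMarker: "•", maxMessageLen: 4096} // bullet default is assumed
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newConfig(WithBulletMarker("-"))
	fmt.Println(c.bulletMarker, c.maxMessageLen) // - 4096
}
```

Because options are plain functions, callers only name the settings they want to change, and new settings can be added later without breaking existing call sites.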

func WithBulletMarker

func WithBulletMarker(s string) Option

WithBulletMarker sets the bullet character for unordered lists.

func WithHeadingSymbols

func WithHeadingSymbols(symbols [6]string) Option

WithHeadingSymbols sets the emoji prefix for each heading level (h1 through h6).

func WithMaxMessageLen

func WithMaxMessageLen(n int) Option

WithMaxMessageLen sets the maximum message length in UTF-16 code units for splitting.

func WithTaskMarkers

func WithTaskMarkers(checked, unchecked string) Option

WithTaskMarkers sets the unicode markers for checked/unchecked task list items.
