splitter

package
v0.2.3 Latest
Published: Jun 3, 2026 License: MIT Imports: 20 Imported by: 0

Documentation

Variables

var ExtToLanguage = map[string]string{
	".go":    "go",
	".py":    "python",
	".js":    "javascript",
	".jsx":   "javascript",
	".ts":    "typescript",
	".tsx":   "typescript",
	".java":  "java",
	".c":     "c",
	".h":     "c",
	".cpp":   "cpp",
	".hpp":   "cpp",
	".cc":    "cpp",
	".cs":    "csharp",
	".rs":    "rust",
	".rb":    "ruby",
	".php":   "php",
	".swift": "swift",
	".kt":    "kotlin",
	".scala": "scala",
	".m":     "objc",
	".mm":    "objc",
	".lua":   "lua",
	".sh":    "bash",
	".bash":  "bash",
	".md":    "markdown",
}

ExtToLanguage maps file extensions to language names.

Functions

func DetectLanguage

func DetectLanguage(filePath string) string

DetectLanguage detects the programming language of a file from its extension.
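The lookup described above can be sketched as follows. This is a minimal illustration, not the package's source: `detectLanguage` and the shortened `extToLanguage` table are hypothetical stand-ins, and two behaviors are assumptions that this page does not confirm: that the extension is lowercased before lookup, and that an unmapped extension yields the empty string.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// extToLanguage is a shortened, hypothetical copy of the ExtToLanguage table.
var extToLanguage = map[string]string{
	".go": "go",
	".py": "python",
	".ts": "typescript",
	".h":  "c",
}

// detectLanguage is a hypothetical stand-in for DetectLanguage.
// Assumptions: the extension is lowercased before the lookup, and an
// extension that is not in the table yields the empty string.
func detectLanguage(filePath string) string {
	ext := strings.ToLower(filepath.Ext(filePath))
	return extToLanguage[ext]
}

func main() {
	for _, p := range []string{"cmd/main.go", "lib/util.PY", "README"} {
		fmt.Printf("%q -> %q\n", p, detectLanguage(p))
	}
}
```

Note that the table maps `.h` to `c`, so a C++ header with a `.h` extension would be reported as C under this scheme.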

Types

type Splitter

type Splitter interface {
	// Split divides the file content into chunks along semantic boundaries.
	Split(content string, filePath string) ([]model.CodeChunk, error)

	// SupportedLanguages returns the list of languages this splitter supports.
	SupportedLanguages() []string
}

Splitter is the interface for code chunkers.
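To show what satisfying this interface involves, here is a minimal sketch of an alternative implementation. Everything in it is hypothetical: `CodeChunk` stands in for `model.CodeChunk` (whose fields are not shown on this page), and `blankLineSplitter` is an illustrative type that splits on blank lines rather than on AST boundaries.

```go
package main

import (
	"fmt"
	"strings"
)

// CodeChunk is a hypothetical stand-in for model.CodeChunk; the real
// type's fields are not shown in this documentation.
type CodeChunk struct {
	Content   string
	StartLine int // 1-based, inclusive
	EndLine   int // 1-based, inclusive
}

// Splitter mirrors the documented interface, with CodeChunk swapped in.
type Splitter interface {
	Split(content string, filePath string) ([]CodeChunk, error)
	SupportedLanguages() []string
}

// blankLineSplitter is a hypothetical implementation that treats each run
// of non-blank lines as one chunk. It ignores filePath.
type blankLineSplitter struct{}

func (blankLineSplitter) Split(content string, filePath string) ([]CodeChunk, error) {
	var chunks []CodeChunk
	lines := strings.Split(content, "\n")
	start := -1 // index of the first line of the open chunk, or -1 if none

	// flush closes the open chunk, which ends just before line index end.
	flush := func(end int) {
		if start >= 0 {
			chunks = append(chunks, CodeChunk{
				Content:   strings.Join(lines[start:end], "\n"),
				StartLine: start + 1,
				EndLine:   end,
			})
			start = -1
		}
	}

	for i, line := range lines {
		if strings.TrimSpace(line) == "" {
			flush(i)
		} else if start < 0 {
			start = i
		}
	}
	flush(len(lines))
	return chunks, nil
}

func (blankLineSplitter) SupportedLanguages() []string { return []string{"text"} }

func main() {
	var s Splitter = blankLineSplitter{}
	chunks, _ := s.Split("a\nb\n\nc\n", "notes.txt")
	for _, c := range chunks {
		fmt.Printf("%d-%d: %q\n", c.StartLine, c.EndLine, c.Content)
	}
}
```

Because callers depend only on the interface, a fallback like this could in principle be used for files whose language has no grammar available, though whether the package does so is not stated here.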

type TreeSitterSplitter

type TreeSitterSplitter struct {
	// contains filtered or unexported fields
}

TreeSitterSplitter is a multi-language code chunker based on the tree-sitter AST.

func NewTreeSitterSplitter

func NewTreeSitterSplitter(maxChunkSize, overlap int) *TreeSitterSplitter

NewTreeSitterSplitter creates a tree-sitter splitter.
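This page does not document the units or exact semantics of `maxChunkSize` and `overlap`. The sketch below illustrates one common interpretation only: fixed-size windows that share a few units with their predecessor. `slidingWindows` and `window` are hypothetical names, measuring in lines is an assumption, and the real splitter works on AST boundaries rather than fixed windows.

```go
package main

import "fmt"

// window is a half-open range [Start, End) of line indexes.
type window struct{ Start, End int }

// slidingWindows is a hypothetical helper, not part of the package.
// It covers n lines with windows of at most maxChunkSize lines, where each
// window after the first shares `overlap` lines with the one before it.
func slidingWindows(n, maxChunkSize, overlap int) []window {
	if n <= 0 || maxChunkSize <= 0 {
		return nil
	}
	if overlap < 0 {
		overlap = 0
	}
	if overlap >= maxChunkSize {
		overlap = maxChunkSize - 1 // keep the step positive so the loop terminates
	}
	step := maxChunkSize - overlap

	var out []window
	for start := 0; start < n; start += step {
		end := start + maxChunkSize
		if end >= n {
			// The last window is clipped to the end of the input.
			out = append(out, window{start, n})
			break
		}
		out = append(out, window{start, end})
	}
	return out
}

func main() {
	// 10 lines, windows of 4 lines, 1 line shared between neighbors.
	fmt.Println(slidingWindows(10, 4, 1))
}
```

Under this interpretation, an `overlap` equal to or larger than `maxChunkSize` would stall the window, which is why the sketch clamps it. How the package itself handles such arguments is not stated.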

func (*TreeSitterSplitter) Split

func (s *TreeSitterSplitter) Split(content string, filePath string) ([]model.CodeChunk, error)

Split divides the file content into chunks along AST semantic boundaries.

func (*TreeSitterSplitter) SupportedLanguages

func (s *TreeSitterSplitter) SupportedLanguages() []string

SupportedLanguages returns the list of supported languages.