lexer

package
v0.0.21
Published: Mar 23, 2026 License: MIT Imports: 4 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func IsKeyword

func IsKeyword(s string) bool

IsKeyword reports whether s is a keyword

func Keywords added in v0.0.9

func Keywords() []string

Keywords returns all keyword strings from the canonical keywords map. This is the single source of truth for keyword completion in the LSP. The result is computed once and cached for subsequent calls.
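
The compute-once-and-cache behavior described above can be sketched with sync.Once. The keyword map below is a stand-in with illustrative entries — the package's canonical map and its contents are unexported:

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// keywords is a stand-in for the package's canonical keyword map;
// these entries are illustrative only.
var keywords = map[string]int{
	"if":     1,
	"else":   2,
	"for":    3,
	"return": 4,
}

var (
	keywordsOnce   sync.Once
	keywordsCached []string
)

// Keywords computes the sorted keyword list on the first call and
// returns the cached slice on every subsequent call.
func Keywords() []string {
	keywordsOnce.Do(func() {
		for k := range keywords {
			keywordsCached = append(keywordsCached, k)
		}
		sort.Strings(keywordsCached)
	})
	return keywordsCached
}

func main() {
	fmt.Println(Keywords()) // [else for if return]
}
```

Because the same slice is handed out on every call, an LSP completion provider can call Keywords on each request without recomputing.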

Types

type Lexer

type Lexer struct {
	// contains filtered or unexported fields
}

Lexer tokenizes Kukicha source code.

ARCHITECTURE NOTE: Kukicha uses Python-style indentation-based blocks. The lexer converts 4-space indentation changes into INDENT and DEDENT tokens, which the parser then uses to determine block structure.

The indentStack tracks nesting levels. When indentation increases by 4 spaces, an INDENT token is emitted and the level is pushed. When it decreases, DEDENT tokens are emitted (possibly multiple) until the stack matches the new level.

Why 4 spaces only (no tabs)?

  • Consistency: Eliminates "tabs vs spaces" debates
  • Beginner-friendly: One clear rule, no configuration needed
  • Error prevention: Mixed tabs/spaces is a common Python mistake
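
The indent-stack scheme described above can be sketched as follows. This is an illustrative reimplementation, not the package's actual code; it emits token names as strings given a line's leading-space count:

```go
package main

import "fmt"

// emitIndentTokens converts a line's leading-space count into INDENT/DEDENT
// pseudo-tokens using an indent stack. When indentation increases, the new
// level is pushed and one INDENT is emitted; when it decreases, levels are
// popped (one DEDENT each) until the stack top matches the new level.
func emitIndentTokens(stack *[]int, spaces int) []string {
	var toks []string
	top := (*stack)[len(*stack)-1]
	switch {
	case spaces > top:
		*stack = append(*stack, spaces)
		toks = append(toks, "INDENT")
	case spaces < top:
		for len(*stack) > 1 && (*stack)[len(*stack)-1] > spaces {
			*stack = (*stack)[:len(*stack)-1]
			toks = append(toks, "DEDENT")
		}
	}
	return toks
}

func main() {
	stack := []int{0} // level 0 is always the bottom of the stack
	for _, spaces := range []int{0, 4, 8, 0} {
		fmt.Println(spaces, emitIndentTokens(&stack, spaces))
	}
	// 0 []
	// 4 [INDENT]
	// 8 [INDENT]
	// 0 [DEDENT DEDENT]
}
```

Note how dedenting from 8 spaces straight to 0 emits two DEDENT tokens in a row, closing both nested blocks at once.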

func NewLexer

func NewLexer(source string, filename string) *Lexer

NewLexer creates a new lexer for the given source code and filename

func (*Lexer) ScanTokens

func (l *Lexer) ScanTokens() ([]Token, error)

ScanTokens scans all tokens from the source

type Token

type Token struct {
	Type   TokenType
	Lexeme string
	Line   int
	Column int
	File   string
}

Token represents a single token in the source code

func (Token) String

func (t Token) String() string

String returns a string representation of the token
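
A token's fields pair naturally with a position-prefixed debug format. The sketch below is a plausible String implementation — the package's actual output format is not documented and may differ, and the `.kuki` filename is an assumption:

```go
package main

import "fmt"

// TokenType and Token mirror the exported shapes documented above.
type TokenType int

const TOKEN_IDENTIFIER TokenType = 0

type Token struct {
	Type   TokenType
	Lexeme string
	Line   int
	Column int
	File   string
}

// String renders the token with its source position, which is handy in
// error messages and debug logs. Illustrative format only.
func (t Token) String() string {
	return fmt.Sprintf("%s:%d:%d %q", t.File, t.Line, t.Column, t.Lexeme)
}

func main() {
	t := Token{Type: TOKEN_IDENTIFIER, Lexeme: "count", Line: 3, Column: 5, File: "main.kuki"}
	fmt.Println(t) // main.kuki:3:5 "count"
}
```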

type TokenType

type TokenType int

TokenType represents the type of a token

const (
	// Literals
	TOKEN_IDENTIFIER TokenType = iota
	TOKEN_INTEGER
	TOKEN_FLOAT
	TOKEN_STRING
	TOKEN_STRING_HEAD // Leading literal of an interpolated string (before first {expr})
	TOKEN_STRING_MID  // Middle literal between two interpolations (between }...{)
	TOKEN_STRING_TAIL // Trailing literal after last interpolation (after last })
	TOKEN_TRUE
	TOKEN_FALSE

	// Keywords
	TOKEN_PETIOLE
	TOKEN_IMPORT
	TOKEN_TYPE
	TOKEN_INTERFACE
	TOKEN_VAR
	TOKEN_FUNC
	TOKEN_RETURN
	TOKEN_IF
	TOKEN_ELSE
	TOKEN_FOR
	TOKEN_CONTINUE
	TOKEN_BREAK
	TOKEN_IN
	TOKEN_FROM
	TOKEN_TO
	TOKEN_THROUGH
	TOKEN_SWITCH
	TOKEN_CASE
	TOKEN_DEFAULT
	TOKEN_GO
	TOKEN_DEFER
	TOKEN_MAKE
	TOKEN_LIST
	TOKEN_MAP
	TOKEN_CHANNEL
	TOKEN_SEND
	TOKEN_RECEIVE
	TOKEN_CLOSE
	TOKEN_PANIC
	TOKEN_RECOVER
	TOKEN_ERROR
	TOKEN_EMPTY
	TOKEN_REFERENCE
	TOKEN_DEREFERENCE
	TOKEN_ON
	TOKEN_DISCARD
	TOKEN_OF
	TOKEN_AS
	TOKEN_SKILL
	TOKEN_SELECT

	// Variadic keyword
	TOKEN_MANY

	// Const keyword
	TOKEN_CONST

	// Operators
	TOKEN_WALRUS         // :=
	TOKEN_ASSIGN         // =
	TOKEN_EQUALS         // equals
	TOKEN_DOUBLE_EQUALS  // ==
	TOKEN_NOT_EQUALS     // !=
	TOKEN_LT             // <
	TOKEN_GT             // >
	TOKEN_LTE            // <=
	TOKEN_GTE            // >=
	TOKEN_PLUS           // +
	TOKEN_PLUS_PLUS      // ++
	TOKEN_MINUS          // -
	TOKEN_MINUS_MINUS    // --
	TOKEN_STAR           // *
	TOKEN_SLASH          // /
	TOKEN_PERCENT        // %
	TOKEN_AND            // and
	TOKEN_AND_AND        // &&
	TOKEN_BIT_AND        // &
	TOKEN_BIT_AND_ASSIGN // &=
	TOKEN_OR             // or
	TOKEN_OR_OR          // ||
	TOKEN_BIT_OR         // | (for Go flag combinations like os.O_APPEND | os.O_CREATE)
	TOKEN_RUNE           // 'a' (character/rune literal)
	TOKEN_ONERR          // onerr
	TOKEN_EXPLAIN        // explain
	TOKEN_NOT            // not
	TOKEN_BANG           // !
	TOKEN_PIPE           // |>
	TOKEN_FAT_ARROW      // =>
	TOKEN_ARROW_LEFT     // <-

	// Delimiters
	TOKEN_LPAREN   // (
	TOKEN_RPAREN   // )
	TOKEN_LBRACKET // [
	TOKEN_RBRACKET // ]
	TOKEN_LBRACE   // {
	TOKEN_RBRACE   // }
	TOKEN_COMMA    // ,
	TOKEN_DOT      // .
	TOKEN_COLON    // :

	// Special
	TOKEN_NEWLINE
	TOKEN_INDENT
	TOKEN_DEDENT
	TOKEN_EOF
	TOKEN_COMMENT   // # comment (standalone on its own line)
	TOKEN_DIRECTIVE // # kuki:deprecated "msg" or # kuki:fix inline
	TOKEN_SEMICOLON // ; (for Go-style syntax support)
)
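
The STRING_HEAD/STRING_MID/STRING_TAIL split can be illustrated with a toy splitter. This sketch only labels the literal pieces around `{expr}` holes as the comments above describe; it ignores escape sequences and nested braces, which the real lexer must handle, and the input syntax shown is assumed:

```go
package main

import (
	"fmt"
	"strings"
)

// splitInterpolated labels the literal pieces of an interpolated string:
// STRING_HEAD before the first {expr}, STRING_MID between interpolations,
// STRING_TAIL after the last one. A string with no interpolation is a
// plain STRING. Illustrative sketch only.
func splitInterpolated(s string) [][2]string {
	var lits []string
	rest := s
	for {
		open := strings.IndexByte(rest, '{')
		if open < 0 {
			lits = append(lits, rest)
			break
		}
		lits = append(lits, rest[:open])
		closeIdx := strings.IndexByte(rest[open:], '}')
		if closeIdx < 0 {
			break // unterminated interpolation; a real lexer reports an error
		}
		rest = rest[open+closeIdx+1:]
	}
	if len(lits) == 1 {
		return [][2]string{{"STRING", lits[0]}}
	}
	out := make([][2]string, len(lits))
	for i, lit := range lits {
		kind := "STRING_MID"
		switch i {
		case 0:
			kind = "STRING_HEAD"
		case len(lits) - 1:
			kind = "STRING_TAIL"
		}
		out[i] = [2]string{kind, lit}
	}
	return out
}

func main() {
	for _, p := range splitInterpolated("Hi {name}, you have {n} messages") {
		fmt.Printf("%s %q\n", p[0], p[1])
	}
	// STRING_HEAD "Hi "
	// STRING_MID ", you have "
	// STRING_TAIL " messages"
}
```

The parser can then interleave the literal pieces with the tokenized `{expr}` contents to rebuild the full interpolated string expression.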

func LookupKeyword

func LookupKeyword(identifier string) TokenType

LookupKeyword returns the token type for a keyword, or TOKEN_IDENTIFIER if not a keyword
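
The documented fallback behavior can be sketched with a map lookup. The map entries below are an illustrative subset, not the package's real keyword table:

```go
package main

import "fmt"

type TokenType int

const (
	TOKEN_IDENTIFIER TokenType = iota
	TOKEN_IF
	TOKEN_FOR
)

// keywords is an illustrative subset of the canonical keyword map.
var keywords = map[string]TokenType{
	"if":  TOKEN_IF,
	"for": TOKEN_FOR,
}

// LookupKeyword returns the keyword's token type, falling back to
// TOKEN_IDENTIFIER for ordinary names — a sketch of the documented behavior.
func LookupKeyword(identifier string) TokenType {
	if tt, ok := keywords[identifier]; ok {
		return tt
	}
	return TOKEN_IDENTIFIER
}

func main() {
	fmt.Println(LookupKeyword("if") == TOKEN_IF)             // true
	fmt.Println(LookupKeyword("widget") == TOKEN_IDENTIFIER) // true
}
```

IsKeyword (above) is then simply a check that this lookup did not fall back to TOKEN_IDENTIFIER.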

func (TokenType) String

func (t TokenType) String() string

String returns a string representation of the token type
