onig

v1.2.0
Published: Dec 6, 2024 License: MIT, Unlicense Imports: 5 Imported by: 0

README

Oniguruma Go

Go bindings for the Oniguruma regex library, a powerful and mature regular expression library with support for a wide range of character sets and language syntaxes. Oniguruma is written in C.

Dual-licensed under MIT or the UNLICENSE.

Installation

Prerequisites:

To install onig-go, you first need the Oniguruma library installed on your system.

You can install it using Homebrew:

brew install oniguruma

Or on Ubuntu you can install it using apt:

sudo apt install libonig-dev libonig5

Installation:

To install onig-go, use the following command:

go get github.com/tmikus/onig-go

Example Usage

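package main

import (
    "fmt"
    "log"

    "github.com/tmikus/onig-go"
)

func main() {
    // NewRegex returns an error if the pattern cannot be compiled.
    regex, err := onig.NewRegex("e(l+)")
    if err != nil {
        log.Fatal(err)
    }
    captures, err := regex.Captures("hello")
    if err != nil {
        log.Fatal(err)
    }
    // Captures returns nil when the text contains no match.
    if captures == nil {
        return
    }
    // Group 0 is the entire match; later groups follow in pattern order.
    for text := range captures.All() {
        fmt.Println(text)
    }
}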

For more examples, see the regex_test.go file.

Documentation

The API documentation is available at https://pkg.go.dev/github.com/tmikus/onig-go.

Documentation

Constants

const REGEX_OPTION_CALLBACK_EACH_MATCH = (REGEX_OPTION_NOT_BEGIN_POSITION << 1)

const REGEX_OPTION_CAPTURE_GROUP = (REGEX_OPTION_DONT_CAPTURE_GROUP << 1)

REGEX_OPTION_CAPTURE_GROUP captures both named and unnamed groups.

const REGEX_OPTION_CHECK_VALIDITY_OF_STRING = (REGEX_OPTION_POSIX_REGION << 1)

REGEX_OPTION_CHECK_VALIDITY_OF_STRING

const REGEX_OPTION_DIGIT_IS_ASCII = (REGEX_OPTION_WORD_IS_ASCII << 1)

REGEX_OPTION_DIGIT_IS_ASCII restricts digit classes (\d, \p{Digit}, [[:digit:]]) to ASCII.

const REGEX_OPTION_DONT_CAPTURE_GROUP = (REGEX_OPTION_NEGATE_SINGLELINE << 1)

REGEX_OPTION_DONT_CAPTURE_GROUP captures only named groups.

const REGEX_OPTION_EXTEND = (REGEX_OPTION_IGNORECASE << 1)

REGEX_OPTION_EXTEND enables the extended pattern form.

const REGEX_OPTION_FIND_LONGEST = (REGEX_OPTION_SINGLELINE << 1)

REGEX_OPTION_FIND_LONGEST finds the longest match. This option doesn't work properly during backward search.

const REGEX_OPTION_FIND_NOT_EMPTY = (REGEX_OPTION_FIND_LONGEST << 1)

REGEX_OPTION_FIND_NOT_EMPTY ignores empty matches.

const REGEX_OPTION_IGNORECASE_IS_ASCII = (REGEX_OPTION_CHECK_VALIDITY_OF_STRING << 3)

REGEX_OPTION_IGNORECASE_IS_ASCII limits IGNORECASE ((?i)) to the range of ASCII characters.

const REGEX_OPTION_MATCH_WHOLE_STRING = (REGEX_OPTION_CALLBACK_EACH_MATCH << 1)

const REGEX_OPTION_MAXBIT = REGEX_OPTION_MATCH_WHOLE_STRING

const REGEX_OPTION_MULTILINE = (REGEX_OPTION_EXTEND << 1)

REGEX_OPTION_MULTILINE makes '.' match a newline as well.

const REGEX_OPTION_NEGATE_SINGLELINE = (REGEX_OPTION_FIND_NOT_EMPTY << 1)

REGEX_OPTION_NEGATE_SINGLELINE clears REGEX_OPTION_SINGLELINE, which is enabled on SyntaxPosixBasic, SyntaxPosixExtended, SyntaxPerl, SyntaxPerlNG, SyntaxPython, and SyntaxJava.

const REGEX_OPTION_NOTBOL = (REGEX_OPTION_CAPTURE_GROUP << 1)

REGEX_OPTION_NOTBOL

const REGEX_OPTION_NOTEOL = (REGEX_OPTION_NOTBOL << 1)

REGEX_OPTION_NOTEOL

const REGEX_OPTION_NOT_BEGIN_POSITION = (REGEX_OPTION_NOT_END_STRING << 1)

const REGEX_OPTION_NOT_BEGIN_STRING = (REGEX_OPTION_TEXT_SEGMENT_WORD << 1)

const REGEX_OPTION_NOT_END_STRING = (REGEX_OPTION_NOT_BEGIN_STRING << 1)

const REGEX_OPTION_POSIX_IS_ASCII = (REGEX_OPTION_SPACE_IS_ASCII << 1)

REGEX_OPTION_POSIX_IS_ASCII restricts all POSIX properties to ASCII, including word, digit, and space (alnum, alpha, blank, cntrl, digit, graph, lower, print, punct, space, upper, xdigit, word).

const REGEX_OPTION_POSIX_REGION = (REGEX_OPTION_NOTEOL << 1)

REGEX_OPTION_POSIX_REGION

const REGEX_OPTION_SINGLELINE = (REGEX_OPTION_MULTILINE << 1)

REGEX_OPTION_SINGLELINE makes '^' behave as '\A' and '$' as '\Z'.

const REGEX_OPTION_SPACE_IS_ASCII = (REGEX_OPTION_DIGIT_IS_ASCII << 1)

REGEX_OPTION_SPACE_IS_ASCII restricts space classes (\s, \p{Space}, [[:space:]]) to ASCII.

const REGEX_OPTION_TEXT_SEGMENT_EXTENDED_GRAPHEME_CLUSTER = (REGEX_OPTION_POSIX_IS_ASCII << 1)

REGEX_OPTION_TEXT_SEGMENT_EXTENDED_GRAPHEME_CLUSTER selects Extended Grapheme Cluster mode.

const REGEX_OPTION_TEXT_SEGMENT_WORD = (REGEX_OPTION_TEXT_SEGMENT_EXTENDED_GRAPHEME_CLUSTER << 1)

REGEX_OPTION_TEXT_SEGMENT_WORD selects Word mode.

const REGEX_OPTION_WORD_IS_ASCII = (REGEX_OPTION_IGNORECASE_IS_ASCII << 1)

REGEX_OPTION_WORD_IS_ASCII restricts word classes (\w, \p{Word}, [[:word:]]) and the word boundary (\b) to ASCII.
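
Each constant is defined by shifting another one, and the list is sorted alphabetically, so the actual bit order is hard to read off. The sketch below re-declares part of the chain under local, hypothetical names (optIgnoreCase and so on are not part of the package), starting from REGEX_OPTION_IGNORECASE = 1 as declared under RegexOptions, and reports the bit index of a few entries. Note the shift by three between CHECK_VALIDITY_OF_STRING and IGNORECASE_IS_ASCII, which leaves two bits unused.

```go
package main

import (
	"fmt"
	"math/bits"
)

// Local mirror of the documented shift chain. The names are illustrative
// only; the real constants live in the onig package.
const (
	optIgnoreCase            uint = 1
	optExtend                     = optIgnoreCase << 1
	optMultiline                  = optExtend << 1
	optSingleline                 = optMultiline << 1
	optFindLongest                = optSingleline << 1
	optFindNotEmpty               = optFindLongest << 1
	optNegateSingleline           = optFindNotEmpty << 1
	optDontCaptureGroup           = optNegateSingleline << 1
	optCaptureGroup               = optDontCaptureGroup << 1
	optNotBOL                     = optCaptureGroup << 1
	optNotEOL                     = optNotBOL << 1
	optPosixRegion                = optNotEOL << 1
	optCheckValidityOfString      = optPosixRegion << 1
	optIgnoreCaseIsASCII          = optCheckValidityOfString << 3 // skips two bits
	optWordIsASCII                = optIgnoreCaseIsASCII << 1
)

// bitIndex returns the position of the single set bit in v.
func bitIndex(v uint) int {
	return bits.TrailingZeros(v)
}

func main() {
	fmt.Println("IGNORECASE:", bitIndex(optIgnoreCase))
	fmt.Println("MULTILINE:", bitIndex(optMultiline))
	fmt.Println("CHECK_VALIDITY_OF_STRING:", bitIndex(optCheckValidityOfString))
	fmt.Println("IGNORECASE_IS_ASCII:", bitIndex(optIgnoreCaseIsASCII))
	fmt.Println("WORD_IS_ASCII:", bitIndex(optWordIsASCII))
}
```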

Variables

var SyntaxAsis = &Syntax{
	raw: C.ONIG_SYNTAX_ASIS,
}

SyntaxAsis is the plain text syntax.

var SyntaxDefault = &Syntax{
	raw: C.ONIG_SYNTAX_DEFAULT,
}

SyntaxDefault is the default syntax (Ruby syntax).

var SyntaxEmacs = &Syntax{
	raw: C.ONIG_SYNTAX_EMACS,
}

SyntaxEmacs is the Emacs regular expression syntax.

var SyntaxGnuRegex = &Syntax{
	raw: C.ONIG_SYNTAX_GNU_REGEX,
}

SyntaxGnuRegex is the GNU regex regular expression syntax.

var SyntaxGrep = &Syntax{
	raw: C.ONIG_SYNTAX_GREP,
}

SyntaxGrep is the grep regular expression syntax.

var SyntaxJava = &Syntax{
	raw: C.ONIG_SYNTAX_JAVA,
}

SyntaxJava is the Java (Sun java.util.regex) regular expression syntax.

var SyntaxOniguruma = &Syntax{
	raw: C.ONIG_SYNTAX_ONIGURUMA,
}

SyntaxOniguruma is the Oniguruma regular expression syntax.

var SyntaxPerl = &Syntax{
	raw: C.ONIG_SYNTAX_PERL,
}

SyntaxPerl is the Perl regular expression syntax.

var SyntaxPerlNG = &Syntax{
	raw: C.ONIG_SYNTAX_PERL_NG,
}

SyntaxPerlNG is the Perl + named group regular expression syntax.

var SyntaxPosixBasic = &Syntax{
	raw: C.ONIG_SYNTAX_POSIX_BASIC,
}

SyntaxPosixBasic is the POSIX Basic regular expression syntax.

var SyntaxPosixExtended = &Syntax{
	raw: C.ONIG_SYNTAX_POSIX_EXTENDED,
}

SyntaxPosixExtended is the POSIX Extended regular expression syntax.

var SyntaxPython = &Syntax{
	raw: C.ONIG_SYNTAX_PYTHON,
}

SyntaxPython is the Python regular expression syntax.

var SyntaxRuby = &Syntax{
	raw: C.ONIG_SYNTAX_RUBY,
}

SyntaxRuby is the Ruby regular expression syntax.

Types

type Captures

type Captures struct {
	Offset uint
	Region *Region
	Text   string
}

Captures represents a group of captured strings for a single match.

The 0th capture always corresponds to the entire match. Each subsequent index corresponds to the next capture group in the regex. Positions returned from a capture group are always byte indices.
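
Byte indices differ from character (rune) positions as soon as the text contains multi-byte UTF-8 characters. The sketch below uses a local span type that mirrors the From and To fields documented for Range (it is not the package's type, and the offsets are written by hand rather than produced by a regex) to show that byte offsets slice a Go string directly, and how to convert one to a rune offset if needed.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// span is a local stand-in for the documented Range struct.
type span struct {
	From int // start byte index
	To   int // end byte index (exclusive)
}

// slice extracts the bytes covered by r; Go strings are indexed by byte,
// so byte offsets can be used without conversion.
func slice(text string, r span) string {
	return text[r.From:r.To]
}

// runeOffset converts a byte offset into a count of runes preceding it.
func runeOffset(text string, byteOffset int) int {
	return utf8.RuneCountInString(text[:byteOffset])
}

func main() {
	text := "na\u00efve caf\u00e9" // "naïve café"; ï and é take two bytes each
	r := span{From: 7, To: 12}     // hand-written byte offsets of "café"

	fmt.Println(slice(text, r))            // the substring covered by the span
	fmt.Println(runeOffset(text, r.From))  // rune position of the span start
	fmt.Println(len(text), utf8.RuneCountInString(text))
}
```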

func (*Captures) All

func (c *Captures) All() iter.Seq[string]

All returns all the capture groups in order of appearance in the regular expression.

func (*Captures) AllPos

func (c *Captures) AllPos() iter.Seq[*Range]

AllPos returns all the capture group positions in order of appearance in the regular expression. Positions are byte indices in terms of the original string matched.

func (*Captures) AllPosWithIndex added in v1.1.0

func (c *Captures) AllPosWithIndex() iter.Seq2[int, *Range]

AllPosWithIndex returns all the capture group positions in order of appearance in the regular expression. Positions are byte indices in terms of the original string matched.

func (*Captures) AllWithIndex added in v1.1.0

func (c *Captures) AllWithIndex() iter.Seq2[int, string]

AllWithIndex returns all the capture groups in order of appearance in the regular expression.

func (*Captures) At

func (c *Captures) At(i int) string

At returns the matched string for the capture group i. If it isn’t a valid capture group or didn’t match anything, then an empty string is returned.

func (*Captures) IsEmpty

func (c *Captures) IsEmpty() bool

IsEmpty returns true if and only if there are no captured groups.

func (*Captures) Len

func (c *Captures) Len() int

Len returns the number of captured groups.

func (*Captures) Pos

func (c *Captures) Pos(i int) *Range

Pos returns the start and end positions of the Nth capture group. Returns nil if it is not a valid capture group or if the capture group did not match anything. The positions returned are always byte indices with respect to the original string matched.

type CapturesIterator added in v1.1.0

type CapturesIterator struct {
	// contains filtered or unexported fields
}

CapturesIterator is an iterator over non-overlapping capture groups matched in text.

func (*CapturesIterator) All added in v1.1.0

func (c *CapturesIterator) All() iter.Seq[*Captures]

All iterates over all non-overlapping capture groups matched in text.

func (*CapturesIterator) AllWithIndex added in v1.1.0

func (c *CapturesIterator) AllWithIndex() iter.Seq2[int, *Captures]

AllWithIndex iterates over all non-overlapping capture groups matched in text.

func (*CapturesIterator) Err added in v1.1.0

func (c *CapturesIterator) Err() error

Err returns the error, if any, that occurred during iteration.

type FindMatchesIterator added in v1.1.0

type FindMatchesIterator struct {
	// contains filtered or unexported fields
}

FindMatchesIterator is an iterator over each non-overlapping match in text.

func (*FindMatchesIterator) All added in v1.1.0

func (c *FindMatchesIterator) All() iter.Seq[*Range]

All iterates over each non-overlapping match in text.

func (*FindMatchesIterator) AllWithIndex added in v1.1.0

func (c *FindMatchesIterator) AllWithIndex() iter.Seq2[int, *Range]

AllWithIndex iterates over each non-overlapping match in text.

func (*FindMatchesIterator) Err added in v1.1.0

func (c *FindMatchesIterator) Err() error

Err returns the error, if any, that occurred during iteration.

type MatchParam

type MatchParam struct {
	// contains filtered or unexported fields
}

MatchParam contains parameters for a Match or Search.

func NewMatchParam

func NewMatchParam() *MatchParam

NewMatchParam creates a new MatchParam.

func (*MatchParam) SetMatchStackLimit

func (p *MatchParam) SetMatchStackLimit(limit uint32)

SetMatchStackLimit sets the match stack limit.

func (*MatchParam) SetRetryLimitInMatch

func (p *MatchParam) SetRetryLimitInMatch(limit uint32)

SetRetryLimitInMatch sets the retry limit in match.

type Range

type Range struct {
	From int // the start index of the match
	To   int // the end index of the match
}

Range is a struct that contains the regex match start and end indices.

func NewRange

func NewRange(from int, to int) *Range

NewRange creates a new Range given the from and to indices.

type Regex

type Regex struct {
	// contains filtered or unexported fields
}

Regex represents a regular expression. It is a wrapper around the Oniguruma regex library.

func NewRegex

func NewRegex(pattern string) (*Regex, error)

NewRegex creates a new Regex object.

func NewRegexWithOptions

func NewRegexWithOptions(
	pattern string,
	options RegexOptions,
) (*Regex, error)

NewRegexWithOptions creates a new Regex object with the given options.

func NewRegexWithOptionsAndSyntax added in v1.1.0

func NewRegexWithOptionsAndSyntax(
	pattern string,
	options RegexOptions,
	syntax *Syntax,
) (*Regex, error)

NewRegexWithOptionsAndSyntax creates a new Regex object with the given options and syntax.

func NewRegexWithSyntax added in v1.1.0

func NewRegexWithSyntax(pattern string, syntax *Syntax) (*Regex, error)

NewRegexWithSyntax creates a new Regex object with the given syntax.

func (*Regex) AllCaptures

func (r *Regex) AllCaptures(text string) ([]Captures, error)

AllCaptures returns a list of all non-overlapping capture groups matched in text. This is operationally the same as FindMatches, except it yields information about submatches.

func (*Regex) AllCapturesIter added in v1.1.0

func (r *Regex) AllCapturesIter(text string) *CapturesIterator

AllCapturesIter returns an iterator of all non-overlapping capture groups matched in text. This is operationally the same as FindMatches, except it yields information about submatches.

func (*Regex) CaptureNames

func (r *Regex) CaptureNames() []string

CaptureNames returns a list of the names of all capture groups in the regular expression.

func (*Regex) Captures

func (r *Regex) Captures(text string) (*Captures, error)

Captures returns the capture groups corresponding to the leftmost-first match in text. Capture group 0 always corresponds to the entire match. If no match is found, then nil is returned.

func (*Regex) FindMatches

func (r *Regex) FindMatches(text string) ([]*Range, error)

FindMatches returns a list containing each non-overlapping match in text, returning the start and end byte indices with respect to text.

func (*Regex) FindMatchesIter added in v1.1.0

func (r *Regex) FindMatchesIter(text string) *FindMatchesIterator

FindMatchesIter returns an iterator containing each non-overlapping match in text, returning the start and end byte indices with respect to text.

func (*Regex) MustAllCaptures added in v1.1.0

func (r *Regex) MustAllCaptures(text string) []Captures

MustAllCaptures returns a list of all non-overlapping capture groups matched in text. This is operationally the same as FindMatches, except it yields information about submatches. Compared to AllCaptures, this method panics on error.

func (*Regex) MustCaptures added in v1.1.0

func (r *Regex) MustCaptures(text string) *Captures

MustCaptures returns the capture groups corresponding to the leftmost-first match in text. Capture group 0 always corresponds to the entire match. If no match is found, then nil is returned. Compared to Captures, this method panics on error.

func (*Regex) MustFindMatches added in v1.1.0

func (r *Regex) MustFindMatches(text string) []*Range

MustFindMatches returns a list containing each non-overlapping match in text, returning the start and end byte indices with respect to text. Compared to FindMatches, this method panics on error.

func (*Regex) MustReplace added in v1.1.0

func (r *Regex) MustReplace(text string, replacement string) string

MustReplace replaces the leftmost-first match with the replacement provided. If no match is found, then a copy of the string is returned unchanged. Compared to Replace, this method panics on error.

func (*Regex) MustReplaceAll added in v1.1.0

func (r *Regex) MustReplaceAll(text string, replacement string) string

MustReplaceAll replaces all non-overlapping matches in text with the replacement provided. This is the same as calling ReplaceN with limit set to 0. See the documentation for Replace for details on how to access submatches in the replacement string. Compared to ReplaceAll, this method panics on error.

func (*Regex) MustReplaceAllFunc added in v1.1.0

func (r *Regex) MustReplaceAllFunc(text string, replacement ReplacementFunc) string

MustReplaceAllFunc replaces all non-overlapping matches in text with the replacement function provided. This is the same as calling ReplaceNFunc with limit set to 0. See the documentation for Replace for details on how to access submatches in the replacement string. Compared to ReplaceAllFunc, this method panics on error.

func (*Regex) MustReplaceFunc added in v1.1.0

func (r *Regex) MustReplaceFunc(text string, replacement ReplacementFunc) string

MustReplaceFunc replaces the leftmost-first match with the replacement provided. The replacement is a function that takes the match’s Captures and returns the replaced string. If no match is found, then a copy of the string is returned unchanged. Compared to ReplaceFunc, this method panics on error.

func (*Regex) MustReplaceN added in v1.1.0

func (r *Regex) MustReplaceN(text string, replacement string, limit int) string

MustReplaceN replaces at most limit non-overlapping matches in text with the replacement provided. If limit is 0, then all non-overlapping matches are replaced. See the documentation for Replace for details on how to access submatches in the replacement string. Compared to ReplaceN, this method panics on error.

func (*Regex) MustReplaceNFunc added in v1.1.0

func (r *Regex) MustReplaceNFunc(text string, replacement ReplacementFunc, limit int) string

MustReplaceNFunc replaces at most limit non-overlapping matches in text with the replacement provided. If limit is 0, then all non-overlapping matches are replaced. See the documentation for Replace for details on how to access submatches in the replacement string. Compared to ReplaceNFunc, this method panics on error.

func (*Regex) MustSearchWithParam added in v1.1.0

func (r *Regex) MustSearchWithParam(
	text string,
	from uint,
	to uint,
	options RegexOptions,
	region *Region,
	matchParam *MatchParam,
) *uint

MustSearchWithParam searches for the pattern in text using the given match parameters.

It returns the index of the first match of the regex within the string, if there is one. If from is less than to, the search is performed in forward order; otherwise it is performed in backward order.

For more information see [Match vs Search](https://docs.rs/onig/latest/onig/index.html#match-vs-search)

The encoding of the buffer passed to search in must match the encoding of the regex. Compared to SearchWithParam, this method panics on error.

func (*Regex) MustSplit added in v1.1.0

func (r *Regex) MustSplit(text string) []string

MustSplit returns a list of substrings of text delimited by matches of the regular expression. Namely, each element of the list corresponds to text that isn’t matched by the regular expression. Compared to Split, this method panics on error.

func (*Regex) MustSplitN added in v1.1.0

func (r *Regex) MustSplitN(text string, limit int) []string

MustSplitN returns a list of at most limit substrings of text delimited by matches of the regular expression. A limit of 0 returns no substrings. Each element of the list corresponds to text that isn’t matched by the regular expression; the remainder of the string that is not split is the last element of the list. Compared to SplitN, this method panics on error.

func (*Regex) Replace

func (r *Regex) Replace(text string, replacement string) (string, error)

Replace replaces the leftmost-first match with the replacement provided. If no match is found, then a copy of the string is returned unchanged.

func (*Regex) ReplaceAll

func (r *Regex) ReplaceAll(text string, replacement string) (string, error)

ReplaceAll replaces all non-overlapping matches in text with the replacement provided. This is the same as calling ReplaceN with limit set to 0. See the documentation for Replace for details on how to access submatches in the replacement string.

func (*Regex) ReplaceAllFunc

func (r *Regex) ReplaceAllFunc(text string, replacement ReplacementFunc) (string, error)

ReplaceAllFunc replaces all non-overlapping matches in text with the replacement function provided. This is the same as calling ReplaceNFunc with limit set to 0. See the documentation for Replace for details on how to access submatches in the replacement string.

func (*Regex) ReplaceFunc

func (r *Regex) ReplaceFunc(text string, replacement ReplacementFunc) (string, error)

ReplaceFunc replaces the leftmost-first match with the replacement provided. The replacement is a function that takes the match’s Captures and returns the replaced string. If no match is found, then a copy of the string is returned unchanged.

func (*Regex) ReplaceN

func (r *Regex) ReplaceN(text string, replacement string, limit int) (string, error)

ReplaceN replaces at most limit non-overlapping matches in text with the replacement provided. If limit is 0, then all non-overlapping matches are replaced. See the documentation for Replace for details on how to access submatches in the replacement string.

func (*Regex) ReplaceNFunc

func (r *Regex) ReplaceNFunc(text string, replacement ReplacementFunc, limit int) (string, error)

ReplaceNFunc replaces at most limit non-overlapping matches in text with the replacement provided. If limit is 0, then all non-overlapping matches are replaced. See the documentation for Replace for details on how to access submatches in the replacement string.

func (*Regex) SearchWithParam

func (r *Regex) SearchWithParam(
	text string,
	from uint,
	to uint,
	options RegexOptions,
	region *Region,
	matchParam *MatchParam,
) (*uint, error)

SearchWithParam searches for the pattern in text using the given match parameters.

It returns the index of the first match of the regex within the string, if there is one. If from is less than to, the search is performed in forward order; otherwise it is performed in backward order.

For more information see [Match vs Search](https://docs.rs/onig/latest/onig/index.html#match-vs-search)

The encoding of the buffer passed to search in must match the encoding of the regex.

func (*Regex) Split

func (r *Regex) Split(text string) ([]string, error)

Split returns a list of substrings of text delimited by matches of the regular expression. Namely, each element of the list corresponds to text that isn’t matched by the regular expression.

func (*Regex) SplitN

func (r *Regex) SplitN(text string, limit int) ([]string, error)

SplitN returns a list of at most limit substrings of text delimited by matches of the regular expression. A limit of 0 returns no substrings. Each element of the list corresponds to text that isn’t matched by the regular expression; the remainder of the string that is not split is the last element of the list.
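
Go's standard regexp.Split documents the same two rules for non-negative limits (0 yields no substrings, and the last element holds the unsplit remainder), so it can serve as a stand-in to illustrate them. The splitN wrapper below is hypothetical and uses the standard library engine, not Oniguruma; behaviour in other edge cases may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// separator matches a comma followed by optional whitespace.
var separator = regexp.MustCompile(`,\s*`)

// splitN splits text on separator into at most limit substrings.
func splitN(text string, limit int) []string {
	return separator.Split(text, limit)
}

func main() {
	text := "a, b,c, d"
	fmt.Printf("%q\n", splitN(text, 0)) // no substrings
	fmt.Printf("%q\n", splitN(text, 2)) // ["a" "b,c, d"]
	fmt.Printf("%q\n", splitN(text, 4)) // ["a" "b" "c" "d"]
}
```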

type RegexOptions

type RegexOptions uint

const REGEX_OPTION_IGNORECASE RegexOptions = 1

REGEX_OPTION_IGNORECASE enables case-insensitive matching.

const REGEX_OPTION_NONE RegexOptions = 0

REGEX_OPTION_NONE no option
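
Since RegexOptions is an unsigned integer and every option occupies its own bit, options can presumably be combined with bitwise OR. That is an assumption drawn from the shift-based definitions; the package documentation does not state it explicitly. The sketch below uses a local flags type and local constants rather than the package's own.

```go
package main

import "fmt"

// flags is a local stand-in for the package's RegexOptions type (a uint).
type flags uint

// Values mirror the documented chain: IGNORECASE = 1, EXTEND = 2, MULTILINE = 4.
const (
	none       flags = 0
	ignoreCase flags = 1
	extend     flags = ignoreCase << 1
	multiline  flags = extend << 1
)

// has reports whether every bit of want is set in set.
func has(set, want flags) bool {
	return set&want == want
}

func main() {
	opts := ignoreCase | multiline // combine with bitwise OR
	fmt.Println(has(opts, ignoreCase), has(opts, extend)) // true false

	opts &^= ignoreCase // clear a single flag
	fmt.Println(opts == multiline) // true
}
```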

type Region

type Region struct {
	// contains filtered or unexported fields
}

Region represents a set of capture groups found in a search or match.

func NewRegion

func NewRegion() *Region

NewRegion creates a new empty Region.

func (*Region) Clear

func (r *Region) Clear()

Clear can be used to clear out a region so it can be used again. See [onig_sys::onig_region_clear](https://docs.rs/onig/latest/onig/onig_sys/fn.onig_region_clear.html)

func (*Region) Len

func (r *Region) Len() int

Len returns the number of registers in the region.

func (*Region) Pos

func (r *Region) Pos(index int) *Range

Pos returns the start and end positions of the Nth capture group. Returns nil if index is not a valid capture group or if the capture group did not match anything. The positions returned are always byte indices with respect to the original string matched.

type ReplacementFunc

type ReplacementFunc func(capture *Captures) string

ReplacementFunc is a function that takes the match’s Captures and returns the replaced string.

type Syntax

type Syntax struct {
	// contains filtered or unexported fields
}

Syntax is a wrapper for an Oniguruma syntax definition.

Each syntax defines a flavour of regex syntax. The built-in syntaxes are available through the package-level variables (SyntaxEmacs, SyntaxDefault, etc.).
