pi

package
v0.5.10 Latest
Published: Dec 30, 2019 License: BSD-3-Clause Imports: 16 Imported by: 6

Documentation

Overview

Package pi provides the overall integration and coordination of the several sub-packages that comprise the interactive parser (GoPi) system. It exposes the end-user, high-level functions in the pi.Parser type for processing files through the Lexer and Parser stages, and specific custom functions for processing supported languages, via a common interface.

The basic machinery of lexing and parsing is implemented in the lex and parse sub-packages, each of which works in a completely general-purpose manner across all supported languages and can be re-used independently for any other such purpose outside of the full pi system. Each depends on the token.Tokens constants defined in the token package, which provides a "universal language" of lexical tokens used across all supported languages and syntax-highlighting cases, based originally on pygments via chroma and since expanded and systematized from there.

The parse package produces an abstract syntax tree (AST) representation of the source, and lists (maps) of symbols that can be used for name lookup and completion (types, variables, functions, etc). Those symbol structures are defined in the syms sub-package.

To more effectively manage and organize the symbols from parsing, language-specific logic is required, and this is supported by the Lang interface, which is implemented for each of the supported languages (see lang.go and langs/* e.g., golang/golang.go).

The LangSupport variable provides the hub for accessing interfaces for supported languages, using the StdLangProps map, which provides a lookup from the filecat.Supported language name to its associated Lang interface and pi.Parser parser. Thus you can go from the GoGi giv.FileInfo.Sup field to its associated GoPi methods using this map (and the associated LangSupport methods). The map is extensible, and other supported languages can be added in other packages. This requires a dependency on the gi/filecat sub-module in GoGi, which defines a broad set of supported file categories and associated mime types that are generally supported within the GoGi gui framework -- a subset of these are the languages and file formats supported by GoPi parsing / lexing.
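The lookup-hub pattern described above can be sketched in a minimal, self-contained form. The types below are simplified stand-ins for the real pi.LangProps and filecat.Supported (the real structs carry more fields), and propsByName mirrors the case-insensitive fallback that LangSupporter.PropsByName documents:

```go
package main

import (
	"fmt"
	"strings"
)

// Supported is a stand-in for filecat.Supported (the language/file-type enum).
type Supported string

// LangProps here models only the lookup pattern, not the full real struct.
type LangProps struct {
	Sup       Supported
	CommentLn string // single-line comment prefix
}

// stdLangProps models the StdLangProps hub: language -> its properties.
var stdLangProps = map[Supported]*LangProps{
	"Go":     {"Go", "// "},
	"Python": {"Python", "# "},
}

// propsByName mimics LangSupporter.PropsByName: exact lookup first,
// then a case-insensitive fallback, with an error if unsupported.
func propsByName(lang string) (*LangProps, error) {
	if lp, ok := stdLangProps[Supported(lang)]; ok {
		return lp, nil
	}
	for sup, lp := range stdLangProps {
		if strings.EqualFold(string(sup), lang) {
			return lp, nil
		}
	}
	return nil, fmt.Errorf("language %q not supported", lang)
}

func main() {
	lp, err := propsByName("go") // case-insensitive fallback finds "Go"
	if err != nil {
		panic(err)
	}
	fmt.Println(lp.CommentLn) // prints "// "
}
```

Extending the hub from another package then amounts to inserting a new entry into the map, which is why the doc calls it extensible.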

The piv sub-package provides the GUI for constructing and testing a lexer and parser interactively. It is the only sub-package with significant dependencies, especially on GoGi and Gide.

Index

Constants

View Source
const (
	Version     = "v0.5.10"
	GitCommit   = "3bbcf69"          // the commit JUST BEFORE the release
	VersionDate = "2019-12-30 08:45" // UTC
)

Variables

View Source
var KiT_LangFlags = kit.Enums.AddEnum(LangFlagsN, kit.NotBitFlag, nil)
View Source
var LangSupport = LangSupporter{}

LangSupport is the main language support hub for accessing GoPi support interfaces for each supported language

View Source
var StdLangProps = map[filecat.Supported]*LangProps{
	filecat.Ada:        {filecat.Ada, "--", "", "", nil, nil, nil},
	filecat.Bash:       {filecat.Bash, "# ", "", "", nil, nil, nil},
	filecat.Csh:        {filecat.Csh, "# ", "", "", nil, nil, nil},
	filecat.C:          {filecat.C, "// ", "/* ", " */", nil, nil, nil},
	filecat.CSharp:     {filecat.CSharp, "// ", "/* ", " */", nil, nil, nil},
	filecat.D:          {filecat.D, "// ", "/* ", " */", nil, nil, nil},
	filecat.ObjC:       {filecat.ObjC, "// ", "/* ", " */", nil, nil, nil},
	filecat.Go:         {filecat.Go, "// ", "/* ", " */", []LangFlags{IndentTab}, nil, nil},
	filecat.Java:       {filecat.Java, "// ", "/* ", " */", nil, nil, nil},
	filecat.JavaScript: {filecat.JavaScript, "// ", "/* ", " */", nil, nil, nil},
	filecat.Eiffel:     {filecat.Eiffel, "--", "", "", nil, nil, nil},
	filecat.Haskell:    {filecat.Haskell, "--", "{- ", "-}", nil, nil, nil},
	filecat.Lisp:       {filecat.Lisp, "; ", "", "", nil, nil, nil},
	filecat.Lua:        {filecat.Lua, "--", "---[[ ", "--]]", nil, nil, nil},
	filecat.Makefile:   {filecat.Makefile, "# ", "", "", []LangFlags{IndentTab}, nil, nil},
	filecat.Matlab:     {filecat.Matlab, "% ", "%{ ", " %}", nil, nil, nil},
	filecat.OCaml:      {filecat.OCaml, "", "(* ", " *)", nil, nil, nil},
	filecat.Pascal:     {filecat.Pascal, "// ", " ", " }", nil, nil, nil},
	filecat.Perl:       {filecat.Perl, "# ", "", "", nil, nil, nil},
	filecat.Python:     {filecat.Python, "# ", "", "", []LangFlags{IndentSpace}, nil, nil},
	filecat.Php:        {filecat.Php, "// ", "/* ", " */", nil, nil, nil},
	filecat.R:          {filecat.R, "# ", "", "", nil, nil, nil},
	filecat.Ruby:       {filecat.Ruby, "# ", "", "", nil, nil, nil},
	filecat.Rust:       {filecat.Rust, "// ", "/* ", " */", nil, nil, nil},
	filecat.Scala:      {filecat.Scala, "// ", "/* ", " */", nil, nil, nil},
	filecat.Html:       {filecat.Html, "", "<!-- ", " -->", nil, nil, nil},
	filecat.TeX:        {filecat.TeX, "% ", "", "", nil, nil, nil},
	filecat.Markdown:   {filecat.Markdown, "", "<!--- ", " -->", []LangFlags{IndentSpace}, nil, nil},
	filecat.Yaml:       {filecat.Yaml, "#", "", "", []LangFlags{IndentSpace}, nil, nil},
}

StdLangProps is the standard compiled-in set of language properties

Functions

func VersionInfo

func VersionInfo() string

VersionInfo returns Pi version information
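A plausible shape for VersionInfo is simple string assembly from the package's version constants shown above; the exact output format of the real function is not shown in this doc, so the layout below is an assumption:

```go
package main

import "fmt"

// Version constants as published in this release (from the Constants section).
const (
	Version     = "v0.5.10"
	GitCommit   = "3bbcf69"          // the commit JUST BEFORE the release
	VersionDate = "2019-12-30 08:45" // UTC
)

// versionInfo sketches what VersionInfo likely does: format the version
// constants into one human-readable line. The field order is hypothetical.
func versionInfo() string {
	return fmt.Sprintf("Pi version %s, date: %s UTC, git commit: %s",
		Version, VersionDate, GitCommit)
}

func main() {
	fmt.Println(versionInfo())
}
```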

Types

type FileState

type FileState struct {
	Src        lex.File     `json:"-" xml:"-" desc:"the source to be parsed -- also holds the full lexed tokens"`
	LexState   lex.State    `json:"-" xml:"-" desc:"state for lexing"`
	TwoState   lex.TwoState `json:"-" xml:"-" desc:"state for second pass nesting depth and EOS matching"`
	ParseState parse.State  `json:"-" xml:"-" desc:"state for parsing"`
	Ast        parse.Ast    `json:"-" xml:"-" desc:"ast output tree from parsing"`
	Syms       syms.SymMap  `` /* 234-byte string literal not displayed */
	ExtSyms    syms.SymMap  `` /* 227-byte string literal not displayed */
	SymsMu     sync.RWMutex `json:"-" xml:"-" desc:"mutex protecting updates / reading of Syms symbols"`
}

FileState contains the full lexing and parsing state information for a given file. It is the master state record for everything that happens in GoPi. One of these should be maintained for each file -- giv.TextBuf has one as its PiState field.

Separate State structs are maintained for each stage (Lexing, PassTwo, Parsing) and the final output of Parsing goes into the Ast and Syms fields.

The Src lex.File field maintains all the info about the source file, and the basic tokenized version of the source produced initially by lexing and updated by the remaining passes. It has everything that is maintained at a line-by-line level.

func NewFileState

func NewFileState() *FileState

NewFileState returns a new initialized file state

func (*FileState) FindAnyChildren

func (fs *FileState) FindAnyChildren(sym *syms.Symbol, seed string, scope syms.SymMap, kids *syms.SymMap) bool

FindAnyChildren fills out map with either direct children of given symbol or those of the type of this symbol -- useful for completion. If seed is non-empty it is used as a prefix for filtering children names. Returns false if no children were found.

func (*FileState) FindChildren added in v0.5.7

func (fs *FileState) FindChildren(sym *syms.Symbol, seed string, scope syms.SymMap, kids *syms.SymMap) bool

FindChildren fills out the map with direct children of the given symbol. If seed is non-empty it is used as a prefix for filtering children names. Returns false if no children were found.

func (*FileState) FindNamePrefixScoped

func (fs *FileState) FindNamePrefixScoped(seed string, scope syms.SymMap, matches *syms.SymMap)

FindNamePrefixScoped looks for the given symbol name prefix within the given map first (if non-nil), and then in the fs.Syms and ExtSyms maps and any children of those global maps that are of subcategory token.NameScope (i.e., namespace, module, package, library). It adds results to the given matches map (which can be nil) for more efficient recursive use.
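The prefix-filtered lookup at the core of FindNamePrefixScoped can be sketched with a toy symbol map; symMap here is a simplified stand-in for syms.SymMap (which maps names to full Symbol records):

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// symMap is a toy stand-in for syms.SymMap: name -> symbol kind (string here).
type symMap map[string]string

// findNamePrefix models the filtering step of FindNamePrefixScoped:
// collect every symbol in scope whose name starts with seed into matches.
func findNamePrefix(seed string, scope symMap, matches symMap) {
	for nm, sy := range scope {
		if strings.HasPrefix(nm, seed) {
			matches[nm] = sy
		}
	}
}

func main() {
	scope := symMap{"Print": "func", "Println": "func", "Scan": "func"}
	matches := symMap{}
	findNamePrefix("Pri", scope, matches)

	// Sort for deterministic display (map iteration order is random in Go).
	var names []string
	for nm := range matches {
		names = append(names, nm)
	}
	sort.Strings(names)
	fmt.Println(names) // prints "[Print Println]"
}
```

The real method additionally recurses into scope-like children (namespaces, packages, etc.), reusing the same matches map across calls to avoid repeated allocation.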

func (*FileState) FindNameScoped

func (fs *FileState) FindNameScoped(nm string, scope syms.SymMap) (*syms.Symbol, bool)

FindNameScoped looks for the given symbol name within the given map first (if non-nil), and then in the fs.Syms and ExtSyms maps and any children of those global maps that are of subcategory token.NameScope (i.e., namespace, module, package, library)

func (*FileState) Init

func (fs *FileState) Init()

Init initializes the file state

func (*FileState) LexAtEnd

func (fs *FileState) LexAtEnd() bool

LexAtEnd returns true if lexing state is now at end of source

func (*FileState) LexErrReport

func (fs *FileState) LexErrReport() string

LexErrReport returns a report of all the lexing errors -- these should only occur during development of the lexer, so we use a detailed report format

func (*FileState) LexHasErrs

func (fs *FileState) LexHasErrs() bool

LexHasErrs returns true if there were errors from lexing

func (*FileState) LexLine

func (fs *FileState) LexLine(ln int) lex.Line

LexLine returns the lexing output for given line, combining comments and all other tokens and allocating new memory using clone

func (*FileState) LexLineString

func (fs *FileState) LexLineString() string

LexLineString returns a string rep of the current lexing output for the current line

func (*FileState) LexNextSrcLine

func (fs *FileState) LexNextSrcLine() string

LexNextSrcLine returns the next line of source that the lexer is currently at

func (*FileState) ParseAtEnd

func (fs *FileState) ParseAtEnd() bool

ParseAtEnd returns true if parsing state is now at end of source

func (*FileState) ParseErrReport

func (fs *FileState) ParseErrReport() string

ParseErrReport returns at most 10 parsing errors in end-user format, sorted

func (*FileState) ParseErrReportAll

func (fs *FileState) ParseErrReportAll() string

ParseErrReportAll returns all parsing errors in end-user format, sorted

func (*FileState) ParseErrReportDetailed

func (fs *FileState) ParseErrReportDetailed() string

ParseErrReportDetailed returns at most 10 parsing errors in detailed format, sorted

func (*FileState) ParseHasErrs

func (fs *FileState) ParseHasErrs() bool

ParseHasErrs returns true if there were errors from parsing

func (*FileState) ParseNextSrcLine

func (fs *FileState) ParseNextSrcLine() string

ParseNextSrcLine returns the next line of source that the parser is currently at

func (*FileState) ParseRuleString

func (fs *FileState) ParseRuleString(full bool) string

ParseRuleString returns the rule info for the entire source -- if full is true, it includes the full stack at each point; otherwise just the top of the stack

func (*FileState) PassTwoErrReport

func (fs *FileState) PassTwoErrReport() string

PassTwoErrReport returns all the pass-two errors as a string -- these should only occur during development, so we use a detailed report format

func (*FileState) PassTwoHasErrs

func (fs *FileState) PassTwoHasErrs() bool

PassTwoHasErrs returns true if there were errors from pass two processing

func (*FileState) SetSrc

func (fs *FileState) SetSrc(src *[][]rune, fname string, sup filecat.Supported)

SetSrc sets the source to be parsed, the filename it came from, and the supported language type for the file

type Lang

type Lang interface {
	// Parser returns the pi.Parser for this language
	Parser() *Parser

	// ParseFile does the complete processing of a given single file, as appropriate
	// for the language -- e.g., runs the lexer followed by the parser, and
	// manages any symbol output from parsing as appropriate for the language / format.
	ParseFile(fs *FileState)

	// LexLine does just the lexing of a given line of the file, using existing context
	// if available from prior lexing / parsing. Line is in 0-indexed "internal" line indexes.
	// The rune source information is assumed to have already been updated in FileState.
	// languages can run the parser on the line to augment the lex token output as appropriate.
	LexLine(fs *FileState, line int) lex.Line

	// ParseLine does complete parser processing of a single line from given file, and returns
	// the FileState for just that line.  Line is in 0-indexed "internal" line indexes.
	// The rune source information is assumed to have already been updated in FileState
	// Existing context information from full-file parsing is used as appropriate, but
	// the results will NOT be used to update any existing full-file Ast representation --
	// should call ParseFile to update that as appropriate.
	ParseLine(fs *FileState, line int) *FileState

	// HiLine does the lexing and potentially parsing of a given line of the file,
	// for purposes of syntax highlighting -- uses existing context
	// if available from prior lexing / parsing. Line is in 0-indexed "internal" line indexes.
	// The rune source information is assumed to have already been updated in FileState.
	// languages can run the parser on the line to augment the lex token output as appropriate.
	HiLine(fs *FileState, line int) lex.Line

	// CompleteLine provides the list of relevant completions for given text
	// which is at given position within the file.
	// Typically the language will call ParseLine on that line, and use the Ast
	// to guide the selection of relevant symbols that can complete the code at
	// the given point.  A stack (slice) of symbols is returned so that the completer
	// can control the order of items presented, as compared to the SymMap.
	CompleteLine(fs *FileState, text string, pos lex.Pos) complete.MatchData

	// CompleteEdit returns the completion edit data for integrating the selected completion
	// into the source
	CompleteEdit(fs *FileState, text string, cp int, comp complete.Completion, seed string) (ed complete.EditData)

	// ParseDir does the complete processing of a given directory, optionally including
	// subdirectories, and optionally forcing the re-processing of the directory(s),
	// instead of using cached symbols.  Typically the cache will be used unless files
	// have a more recent modification date than the cache file.  This returns the
	// language-appropriate set of symbols for the directory(s), which could then provide
	// the symbols for a given package, library, or module at that path.
	ParseDir(path string, opts LangDirOpts) *syms.Symbol
}

Lang provides a general interface for language-specific management of the lexing, parsing, and symbol lookup process. The GoPi lexer and parser machinery is entirely language-general but specific languages may need specific ways of managing these processes, and processing their outputs, to best support the features of those languages. That is what this interface provides.

Each language defines a type supporting this interface, which is in turn registered with the StdLangProps map. Each supported language has its own .go file in this pi package that defines its own implementation of the interface and any other associated functionality.

The Lang is responsible for accessing the appropriate pi.Parser for this language (initialized and managed via LangSupport.OpenStd() etc) and the pi.FileState structure contains all the input and output state information for a given file.

This interface is likely to evolve as we expand the range of supported languages.
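The shape of this design -- each language hiding its own processing behind a common interface -- can be sketched with a stripped-down model. Both types below are hypothetical stand-ins: fileState keeps only raw lines (the real pi.FileState holds full lex/parse state), and toyLang "lexes" by splitting on whitespace rather than running real rules:

```go
package main

import (
	"fmt"
	"strings"
)

// fileState is a toy stand-in for pi.FileState: just the source lines.
type fileState struct {
	lines []string
}

// lang models the shape of the pi.Lang interface: file-level and
// line-level processing behind one language-general interface.
type lang interface {
	ParseFile(fs *fileState)
	LexLine(fs *fileState, line int) []string
}

// toyLang is a hypothetical implementation for illustration only.
type toyLang struct{}

// ParseFile "processes" the whole file: here, just normalizes whitespace.
func (toyLang) ParseFile(fs *fileState) {
	for i := range fs.lines {
		fs.lines[i] = strings.TrimSpace(fs.lines[i])
	}
}

// LexLine "lexes" a single 0-indexed line into whitespace-separated tokens.
func (toyLang) LexLine(fs *fileState, line int) []string {
	return strings.Fields(fs.lines[line])
}

func main() {
	var l lang = toyLang{} // callers hold the interface, not the concrete type
	fs := &fileState{lines: []string{"  x := 1  "}}
	l.ParseFile(fs)
	fmt.Println(l.LexLine(fs, 0)) // prints "[x := 1]"
}
```

In the real system, the caller obtains the concrete implementation through the LangSupport / StdLangProps lookup rather than constructing it directly, which is what lets GoGi drive any supported language through the same code path.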

type LangDirOpts

type LangDirOpts struct {
	Subdirs bool `desc:"process subdirectories -- otherwise not"`
	Rebuild bool `desc:"rebuild the symbols by reprocessing from scratch instead of using cache"`
	Nocache bool `desc:"do not update the cache with results from processing"`
}

LangDirOpts provides options for Lang ParseDir method

type LangFlags

type LangFlags int

LangFlags are special properties of a given language

const (
	// NoFlags = nothing special
	NoFlags LangFlags = iota

	// IndentSpace means that spaces must be used for this language
	IndentSpace

	// IndentTab means that tabs must be used for this language
	IndentTab

	LangFlagsN
)

LangFlags

func (LangFlags) MarshalJSON

func (ev LangFlags) MarshalJSON() ([]byte, error)

func (*LangFlags) UnmarshalJSON

func (ev *LangFlags) UnmarshalJSON(b []byte) error

type LangProps

type LangProps struct {
	Sup       filecat.Supported `desc:"language -- must be a supported one from Supported list"`
	CommentLn string            `desc:"character(s) that start a single-line comment -- if empty then multi-line comment syntax will be used"`
	CommentSt string            `desc:"character(s) that start a multi-line comment or one that requires both start and end"`
	CommentEd string            `desc:"character(s) that end a multi-line comment or one that requires both start and end"`
	Flags     []LangFlags       `desc:"special properties for this language"`
	Lang      Lang              `json:"-" xml:"-" desc:"Lang interface for this language"`
	Parser    *Parser           `json:"-" xml:"-" desc:"parser for this language -- initialized in OpenStd"`
}

LangProps contains properties of languages supported by the Pi parser framework
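The comment-delimiter rule documented in the struct tags above (an empty CommentLn means the multi-line CommentSt/CommentEd pair is used) can be shown concretely. This langProps is a pared-down model of pi.LangProps, and commentOut is a hypothetical helper illustrating the selection logic, not a function from the package:

```go
package main

import "fmt"

// langProps models just the comment fields of pi.LangProps.
type langProps struct {
	CommentLn string // single-line comment start; empty means use multi-line
	CommentSt string // multi-line comment start
	CommentEd string // multi-line comment end
}

// commentOut wraps text in the language's comment syntax, following the
// documented rule: prefer CommentLn, else fall back to CommentSt/CommentEd.
func commentOut(lp langProps, text string) string {
	if lp.CommentLn != "" {
		return lp.CommentLn + text
	}
	return lp.CommentSt + text + lp.CommentEd
}

func main() {
	goProps := langProps{CommentLn: "// ", CommentSt: "/* ", CommentEd: " */"}
	htmlProps := langProps{CommentSt: "<!-- ", CommentEd: " -->"} // no single-line form
	fmt.Println(commentOut(goProps, "TODO"))   // prints "// TODO"
	fmt.Println(commentOut(htmlProps, "TODO")) // prints "<!-- TODO -->"
}
```

This is why entries like Html and OCaml in StdLangProps leave the first comment field empty: those languages have no single-line comment syntax, so only the paired delimiters are populated.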

type LangSupporter

type LangSupporter struct {
}

LangSupporter provides general support for supported languages. e.g., looking up lexers and parsers by name. Also implements the lex.LangLexer interface to provide access to other Guest Lexers

func (*LangSupporter) LexerByName

func (ll *LangSupporter) LexerByName(lang string) *lex.Rule

LexerByName looks up Lexer for given language by name (with case-insensitive fallback). Returns nil if not supported.

func (*LangSupporter) OpenStd

func (ll *LangSupporter) OpenStd() error

OpenStd opens all the standard parsers for languages, from the langs/ directory

func (*LangSupporter) Props

func (ll *LangSupporter) Props(sup filecat.Supported) (*LangProps, error)

Props looks up language properties by filecat.Supported const int type

func (*LangSupporter) PropsByName

func (ll *LangSupporter) PropsByName(lang string) (*LangProps, error)

PropsByName looks up language properties by string name of language (with case-insensitive fallback). Returns error if not supported.

type Parser

type Parser struct {
	Lexer      lex.Rule    `desc:"lexer rules for first pass of lexing file"`
	PassTwo    lex.PassTwo `desc:"second pass after lexing -- computes nesting depth and EOS finding"`
	Parser     parse.Rule  `desc:"parser rules for parsing lexed tokens"`
	Filename   string      `desc:"file name for overall parser (not file being parsed!)"`
	ReportErrs bool        `desc:"if true, reports errors after parsing, to stdout"`
	ModTime    time.Time   `` /* 149-byte string literal not displayed */
}

Parser is the overall parser for managing the parsing

func NewParser

func NewParser() *Parser

NewParser returns a new initialized parser

func (*Parser) DoPassTwo

func (pr *Parser) DoPassTwo(fs *FileState)

DoPassTwo does the second pass after lexing

func (*Parser) Init

func (pr *Parser) Init()

Init initializes the parser -- must be called after creation

func (*Parser) InitAll

func (pr *Parser) InitAll()

InitAll initializes everything about the parser -- call this when setting up a new parser after it has been loaded etc

func (*Parser) LexAll

func (pr *Parser) LexAll(fs *FileState)

LexAll runs a complete pass of the lexer and pass two, on current state

func (*Parser) LexInit

func (pr *Parser) LexInit(fs *FileState)

LexInit gets the lexer ready to start lexing

func (*Parser) LexLine

func (pr *Parser) LexLine(fs *FileState, ln int) lex.Line

LexLine runs lexer for given single line of source, returns merged regular and token comment lines, cloned and ready for use

func (*Parser) LexNext

func (pr *Parser) LexNext(fs *FileState) *lex.Rule

LexNext does the next step of lexing -- returns the lowest-level rule that matched, and nil upon a no-match error or at the end of the source input

func (*Parser) LexNextLine

func (pr *Parser) LexNextLine(fs *FileState) *lex.Rule

LexNextLine does the next line of lexing -- returns the lowest-level rule that matched at the end, and nil upon a no-match error or at the end of the source input

func (*Parser) LexRun

func (pr *Parser) LexRun(fs *FileState)

LexRun keeps running LexNext until it stops

func (*Parser) OpenJSON

func (pr *Parser) OpenJSON(filename string) error

OpenJSON opens lexer and parser rules to current filename, in a standard JSON-formatted file

func (*Parser) ParseAll

func (pr *Parser) ParseAll(fs *FileState)

ParseAll does full parsing, including ParseInit and ParseRun, assuming LexAll has been done already

func (*Parser) ParseLine

func (pr *Parser) ParseLine(fs *FileState, ln int) *FileState

ParseLine runs the parser for a given single line of source. It does the parsing in a separate FileState and returns that with the Ast etc (or nil if nothing was parsed). Assumes LexLine has already been run on the given line.

func (*Parser) ParseNext

func (pr *Parser) ParseNext(fs *FileState) *parse.Rule

ParseNext does next step of parsing -- returns lowest-level rule that matched or nil if no match error or at end

func (*Parser) ParseRun

func (pr *Parser) ParseRun(fs *FileState)

ParseRun continues running the parser until the end of the file

func (*Parser) ParseString

func (pr *Parser) ParseString(str string, fname string, sup filecat.Supported) *FileState

ParseString runs the lexer and parser on a given string of text, returning a FileState of results (which can be nil if the string is empty or contains no lexical tokens). Also takes supporting contextual info for the file / language that this string is associated with (used only for reference)

func (*Parser) ParserInit

func (pr *Parser) ParserInit(fs *FileState) bool

ParserInit initializes the parser prior to running

func (*Parser) SaveGrammar

func (pr *Parser) SaveGrammar(filename string) error

SaveGrammar saves lexer and parser grammar rules to BNF-like .pig file

func (*Parser) SaveJSON

func (pr *Parser) SaveJSON(filename string) error

SaveJSON saves lexer and parser rules, in a standard JSON-formatted file
