Documentation ¶
Overview ¶
Package pi provides the overall integration and coordination of the several sub-packages that comprise the interactive parser (GoPi) system. It provides the end-user, high-level functions in the pi.Parser type for processing files through the Lexer and Parser stages, and custom functions for processing each supported language, via a common interface.
The basic machinery of lexing and parsing is implemented in the lex and parse sub-packages, each of which works in a completely general-purpose manner across all supported languages, and can be reused independently for other purposes outside of the full pi system. Each of them depends on the token.Tokens constants defined in the token package, which provides a "universal language" of lexical tokens used across all supported languages and syntax-highlighting cases, based originally on pygments via chroma and since expanded and systematized from there.
The parse package produces an abstract syntax tree (AST) representation of the source, and lists (maps) of symbols that can be used for name lookup and completion (types, variables, functions, etc). Those symbol structures are defined in the syms sub-package.
To more effectively manage and organize the symbols from parsing, language-specific logic is required, and this is supported by the Lang interface, which is implemented for each of the supported languages (see lang.go and langs/* e.g., golang/golang.go).
The LangSupport variable provides the hub for accessing the interfaces for supported languages, using the StdLangProps map, which provides a lookup from the filecat.Supported language name to its associated Lang interface and pi.Parser parser. Thus you can go from the GoGi giv.FileInfo.Sup field to its associated GoPi methods using this map (and the associated LangSupport methods). This map is extensible, and other supported languages can be added in other packages. This requires a dependency on the gi/filecat sub-module in GoGi, which defines a broad set of supported file categories and associated mime types that are generally supported within the GoGi gui framework -- a subset of these are the languages and file formats supported by GoPi parsing / lexing.
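The hub pattern described above -- a central map from language identity to its support record, with a case-insensitive fallback by name -- can be sketched in a self-contained way. The types and names below (langProps, propsByName, stdProps) are illustrative stand-ins, not the actual pi API; the real lookup goes through LangSupport.Props and LangSupport.PropsByName over the StdLangProps map.

```go
package main

import (
	"fmt"
	"strings"
)

// langProps is a simplified stand-in for pi.LangProps, keyed here by a
// plain string name instead of filecat.Supported.
type langProps struct {
	Name      string
	CommentLn string
}

// stdProps plays the role of StdLangProps: a compiled-in registry that
// other packages could extend with additional languages.
var stdProps = map[string]*langProps{
	"Go":     {"Go", "// "},
	"Python": {"Python", "# "},
}

// propsByName mirrors the case-insensitive fallback documented for
// LangSupporter.PropsByName: exact match first, then case-folded.
func propsByName(lang string) (*langProps, error) {
	if lp, ok := stdProps[lang]; ok {
		return lp, nil
	}
	for nm, lp := range stdProps {
		if strings.EqualFold(nm, lang) {
			return lp, nil
		}
	}
	return nil, fmt.Errorf("language %q not supported", lang)
}

func main() {
	lp, err := propsByName("go") // lowercase still resolves via fallback
	if err != nil {
		panic(err)
	}
	fmt.Println(lp.Name, lp.CommentLn)
}
```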
The piv sub-package provides the GUI for constructing and testing a lexer and parser interactively. It is the only sub-package with significant dependencies, especially on GoGi and Gide.
Index ¶
- Constants
- Variables
- func VersionInfo() string
- type FileState
- func (fs *FileState) FindAnyChildren(sym *syms.Symbol, seed string, scope syms.SymMap, kids *syms.SymMap) bool
- func (fs *FileState) FindChildren(sym *syms.Symbol, seed string, scope syms.SymMap, kids *syms.SymMap) bool
- func (fs *FileState) FindNamePrefixScoped(seed string, scope syms.SymMap, matches *syms.SymMap)
- func (fs *FileState) FindNameScoped(nm string, scope syms.SymMap) (*syms.Symbol, bool)
- func (fs *FileState) Init()
- func (fs *FileState) LexAtEnd() bool
- func (fs *FileState) LexErrReport() string
- func (fs *FileState) LexHasErrs() bool
- func (fs *FileState) LexLine(ln int) lex.Line
- func (fs *FileState) LexLineString() string
- func (fs *FileState) LexNextSrcLine() string
- func (fs *FileState) ParseAtEnd() bool
- func (fs *FileState) ParseErrReport() string
- func (fs *FileState) ParseErrReportAll() string
- func (fs *FileState) ParseErrReportDetailed() string
- func (fs *FileState) ParseHasErrs() bool
- func (fs *FileState) ParseNextSrcLine() string
- func (fs *FileState) ParseRuleString(full bool) string
- func (fs *FileState) PassTwoErrReport() string
- func (fs *FileState) PassTwoHasErrs() bool
- func (fs *FileState) SetSrc(src *[][]rune, fname string, sup filecat.Supported)
- type Lang
- type LangDirOpts
- type LangFlags
- type LangProps
- type LangSupporter
- type Parser
- func (pr *Parser) DoPassTwo(fs *FileState)
- func (pr *Parser) Init()
- func (pr *Parser) InitAll()
- func (pr *Parser) LexAll(fs *FileState)
- func (pr *Parser) LexInit(fs *FileState)
- func (pr *Parser) LexLine(fs *FileState, ln int) lex.Line
- func (pr *Parser) LexNext(fs *FileState) *lex.Rule
- func (pr *Parser) LexNextLine(fs *FileState) *lex.Rule
- func (pr *Parser) LexRun(fs *FileState)
- func (pr *Parser) OpenJSON(filename string) error
- func (pr *Parser) ParseAll(fs *FileState)
- func (pr *Parser) ParseLine(fs *FileState, ln int) *FileState
- func (pr *Parser) ParseNext(fs *FileState) *parse.Rule
- func (pr *Parser) ParseRun(fs *FileState)
- func (pr *Parser) ParseString(str string, fname string, sup filecat.Supported) *FileState
- func (pr *Parser) ParserInit(fs *FileState) bool
- func (pr *Parser) SaveGrammar(filename string) error
- func (pr *Parser) SaveJSON(filename string) error
Constants ¶
const (
	Version     = "v0.5.10"
	GitCommit   = "3bbcf69"          // the commit JUST BEFORE the release
	VersionDate = "2019-12-30 08:45" // UTC
)
Variables ¶
var KiT_LangFlags = kit.Enums.AddEnum(LangFlagsN, kit.NotBitFlag, nil)
var LangSupport = LangSupporter{}
LangSupport is the main language support hub for accessing GoPi support interfaces for each supported language
var StdLangProps = map[filecat.Supported]*LangProps{
	filecat.Ada:        {filecat.Ada, "--", "", "", nil, nil, nil},
	filecat.Bash:       {filecat.Bash, "# ", "", "", nil, nil, nil},
	filecat.Csh:        {filecat.Csh, "# ", "", "", nil, nil, nil},
	filecat.C:          {filecat.C, "// ", "/* ", " */", nil, nil, nil},
	filecat.CSharp:     {filecat.CSharp, "// ", "/* ", " */", nil, nil, nil},
	filecat.D:          {filecat.D, "// ", "/* ", " */", nil, nil, nil},
	filecat.ObjC:       {filecat.ObjC, "// ", "/* ", " */", nil, nil, nil},
	filecat.Go:         {filecat.Go, "// ", "/* ", " */", []LangFlags{IndentTab}, nil, nil},
	filecat.Java:       {filecat.Java, "// ", "/* ", " */", nil, nil, nil},
	filecat.JavaScript: {filecat.JavaScript, "// ", "/* ", " */", nil, nil, nil},
	filecat.Eiffel:     {filecat.Eiffel, "--", "", "", nil, nil, nil},
	filecat.Haskell:    {filecat.Haskell, "--", "{- ", "-}", nil, nil, nil},
	filecat.Lisp:       {filecat.Lisp, "; ", "", "", nil, nil, nil},
	filecat.Lua:        {filecat.Lua, "--", "---[[ ", "--]]", nil, nil, nil},
	filecat.Makefile:   {filecat.Makefile, "# ", "", "", []LangFlags{IndentTab}, nil, nil},
	filecat.Matlab:     {filecat.Matlab, "% ", "%{ ", " %}", nil, nil, nil},
	filecat.OCaml:      {filecat.OCaml, "", "(* ", " *)", nil, nil, nil},
	filecat.Pascal:     {filecat.Pascal, "// ", "{ ", " }", nil, nil, nil},
	filecat.Perl:       {filecat.Perl, "# ", "", "", nil, nil, nil},
	filecat.Python:     {filecat.Python, "# ", "", "", []LangFlags{IndentSpace}, nil, nil},
	filecat.Php:        {filecat.Php, "// ", "/* ", " */", nil, nil, nil},
	filecat.R:          {filecat.R, "# ", "", "", nil, nil, nil},
	filecat.Ruby:       {filecat.Ruby, "# ", "", "", nil, nil, nil},
	filecat.Rust:       {filecat.Rust, "// ", "/* ", " */", nil, nil, nil},
	filecat.Scala:      {filecat.Scala, "// ", "/* ", " */", nil, nil, nil},
	filecat.Html:       {filecat.Html, "", "<!-- ", " -->", nil, nil, nil},
	filecat.TeX:        {filecat.TeX, "% ", "", "", nil, nil, nil},
	filecat.Markdown:   {filecat.Markdown, "", "<!--- ", " -->", []LangFlags{IndentSpace}, nil, nil},
	filecat.Yaml:       {filecat.Yaml, "#", "", "", []LangFlags{IndentSpace}, nil, nil},
}
StdLangProps is the standard compiled-in set of language properties
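The comment fields in each entry drive actions such as commenting a line out in an editor: when CommentLn is non-empty the single-line form is used, otherwise the CommentSt / CommentEd pair wraps the text (this matches the field docs on LangProps below). A minimal sketch, where the props struct and commentOut helper are local stand-ins rather than the actual pi API:

```go
package main

import "fmt"

// props mirrors just the comment-related fields of pi.LangProps.
type props struct {
	CommentLn, CommentSt, CommentEd string
}

// commentOut wraps a source line in the language's comment syntax,
// preferring the single-line form when one is defined.
func commentOut(p props, line string) string {
	if p.CommentLn != "" {
		return p.CommentLn + line
	}
	return p.CommentSt + line + p.CommentEd
}

func main() {
	goProps := props{CommentLn: "// ", CommentSt: "/* ", CommentEd: " */"}
	htmlProps := props{CommentSt: "<!-- ", CommentEd: " -->"} // no line comment in HTML
	fmt.Println(commentOut(goProps, "x := 1"))
	fmt.Println(commentOut(htmlProps, "<b>hi</b>"))
}
```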
Functions ¶
Types ¶
type FileState ¶
type FileState struct {
Src lex.File `json:"-" xml:"-" desc:"the source to be parsed -- also holds the full lexed tokens"`
LexState lex.State `json:"-" xml:"-" desc:"state for lexing"`
TwoState lex.TwoState `json:"-" xml:"-" desc:"state for second pass nesting depth and EOS matching"`
ParseState parse.State `json:"-" xml:"-" desc:"state for parsing"`
Ast parse.Ast `json:"-" xml:"-" desc:"ast output tree from parsing"`
Syms syms.SymMap `` /* 234-byte string literal not displayed */
ExtSyms syms.SymMap `` /* 227-byte string literal not displayed */
SymsMu sync.RWMutex `json:"-" xml:"-" desc:"mutex protecting updates / reading of Syms symbols"`
}
FileState contains the full lexing and parsing state information for a given file. It is the master state record for everything that happens in GoPi. One of these should be maintained for each file -- giv.TextBuf has one as its PiState field.
Separate State structs are maintained for each stage (Lexing, PassTwo, Parsing) and the final output of Parsing goes into the Ast and Syms fields.
The Src lex.File field maintains all the info about the source file, and the basic tokenized version of the source produced initially by lexing and updated by the remaining passes. It has everything that is maintained at a line-by-line level.
func NewFileState ¶
func NewFileState() *FileState
NewFileState returns a new initialized file state
func (*FileState) FindAnyChildren ¶
func (fs *FileState) FindAnyChildren(sym *syms.Symbol, seed string, scope syms.SymMap, kids *syms.SymMap) bool
FindAnyChildren fills out map with either direct children of given symbol or those of the type of this symbol -- useful for completion. If seed is non-empty it is used as a prefix for filtering children names. Returns false if no children were found.
func (*FileState) FindChildren ¶ added in v0.5.7
func (fs *FileState) FindChildren(sym *syms.Symbol, seed string, scope syms.SymMap, kids *syms.SymMap) bool
FindChildren fills out map with direct children of given symbol If seed is non-empty it is used as a prefix for filtering children names. Returns false if no children were found.
func (*FileState) FindNamePrefixScoped ¶
FindNamePrefixScoped looks for the given symbol name prefix first within the given map (if non-nil), and then in the fs.Syms and ExtSyms maps, including any children of those global maps that are of subcategory token.NameScope (i.e., namespace, module, package, library). It adds results to the given matches map (which can be nil), for more efficient recursive use.
func (*FileState) FindNameScoped ¶
FindNameScoped looks for given symbol name within given map first (if non nil) and then in fs.Syms and ExtSyms maps, and any children on those global maps that are of subcategory token.NameScope (i.e., namespace, module, package, library)
func (*FileState) LexErrReport ¶
LexErrReport returns a report of all the lexing errors -- these should only occur during development of the lexer, so we use a detailed report format
func (*FileState) LexHasErrs ¶
LexHasErrs returns true if there were errors from lexing
func (*FileState) LexLine ¶
LexLine returns the lexing output for given line, combining comments and all other tokens and allocating new memory using clone
func (*FileState) LexLineString ¶
LexLineString returns a string rep of the current lexing output for the current line
func (*FileState) LexNextSrcLine ¶
LexNextSrcLine returns the next line of source that the lexer is currently at
func (*FileState) ParseAtEnd ¶
ParseAtEnd returns true if parsing state is now at end of source
func (*FileState) ParseErrReport ¶
ParseErrReport returns at most 10 parsing errors in end-user format, sorted
func (*FileState) ParseErrReportAll ¶
ParseErrReportAll returns all parsing errors in end-user format, sorted
func (*FileState) ParseErrReportDetailed ¶
ParseErrReportDetailed returns at most 10 parsing errors in detailed format, sorted
func (*FileState) ParseHasErrs ¶
ParseHasErrs returns true if there were errors from parsing
func (*FileState) ParseNextSrcLine ¶
ParseNextSrcLine returns the next line of source that the parser is currently at
func (*FileState) ParseRuleString ¶
ParseRuleString returns the rule info for the entire source -- if full, it includes the full stack at each point; otherwise just the top of the stack
func (*FileState) PassTwoErrReport ¶
PassTwoErrReport returns all the pass-two errors as a string -- these should only occur during development, so we use a detailed report format
func (*FileState) PassTwoHasErrs ¶
PassTwoHasErrs returns true if there were errors from pass two processing
type Lang ¶
type Lang interface {
// Parser returns the pi.Parser for this language
Parser() *Parser
// ParseFile does the complete processing of a given single file, as appropriate
// for the language -- e.g., runs the lexer followed by the parser, and
// manages any symbol output from parsing as appropriate for the language / format.
ParseFile(fs *FileState)
// LexLine does just the lexing of a given line of the file, using existing context
// if available from prior lexing / parsing. Line is in 0-indexed "internal" line indexes.
// The rune source information is assumed to have already been updated in FileState.
// Languages can run the parser on the line to augment the lex token output as appropriate.
LexLine(fs *FileState, line int) lex.Line
// ParseLine does complete parser processing of a single line from given file, and returns
// the FileState for just that line. Line is in 0-indexed "internal" line indexes.
// The rune source information is assumed to have already been updated in FileState.
// Existing context information from full-file parsing is used as appropriate, but
// the results will NOT be used to update any existing full-file Ast representation --
// should call ParseFile to update that as appropriate.
ParseLine(fs *FileState, line int) *FileState
// HiLine does the lexing and potentially parsing of a given line of the file,
// for purposes of syntax highlighting -- uses existing context
// if available from prior lexing / parsing. Line is in 0-indexed "internal" line indexes.
// The rune source information is assumed to have already been updated in FileState.
// Languages can run the parser on the line to augment the lex token output as appropriate.
HiLine(fs *FileState, line int) lex.Line
// CompleteLine provides the list of relevant completions for given text
// which is at given position within the file.
// Typically the language will call ParseLine on that line, and use the Ast
// to guide the selection of relevant symbols that can complete the code at
// the given point. A stack (slice) of symbols is returned so that the completer
// can control the order of items presented, as compared to the SymMap.
CompleteLine(fs *FileState, text string, pos lex.Pos) complete.MatchData
// CompleteEdit returns the completion edit data for integrating the selected completion
// into the source
CompleteEdit(fs *FileState, text string, cp int, comp complete.Completion, seed string) (ed complete.EditData)
// ParseDir does the complete processing of a given directory, optionally including
// subdirectories, and optionally forcing the re-processing of the directory(s),
// instead of using cached symbols. Typically the cache will be used unless files
// have a more recent modification date than the cache file. This returns the
// language-appropriate set of symbols for the directory(s), which could then provide
// the symbols for a given package, library, or module at that path.
ParseDir(path string, opts LangDirOpts) *syms.Symbol
}
Lang provides a general interface for language-specific management of the lexing, parsing, and symbol lookup process. The GoPi lexer and parser machinery is entirely language-general but specific languages may need specific ways of managing these processes, and processing their outputs, to best support the features of those languages. That is what this interface provides.
Each language defines a type supporting this interface, which is in turn registered with the StdLangProps map. Each supported language has its own .go file in this pi package that defines its own implementation of the interface and any other associated functionality.
The Lang is responsible for accessing the appropriate pi.Parser for this language (initialized and managed via LangSupport.OpenStd() etc) and the pi.FileState structure contains all the input and output state information for a given file.
This interface is likely to evolve as we expand the range of supported languages.
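The registration pattern described above -- each language defines a type satisfying the interface and registers it in a central, extensible map -- can be sketched in miniature. The lang interface, goLang type, and registry below are illustrative stand-ins for pi.Lang, the langs/golang implementation, and StdLangProps, reduced to one method for clarity:

```go
package main

import "fmt"

// lang is a drastically reduced stand-in for the pi.Lang interface.
type lang interface {
	Name() string
	ParseFile(src string) string
}

// goLang is an illustrative implementation, analogous to the Go
// support in langs/golang/golang.go.
type goLang struct{}

func (goLang) Name() string { return "Go" }

func (goLang) ParseFile(src string) string {
	// a real implementation runs the lexer followed by the parser and
	// manages symbol output; here we just tag the source
	return "parsed(Go): " + src
}

// registry plays the role of the StdLangProps hub: languages register
// their interface implementation, and other packages can extend it.
var registry = map[string]lang{}

func register(l lang) { registry[l.Name()] = l }

func main() {
	register(goLang{})
	fmt.Println(registry["Go"].ParseFile("package main"))
}
```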
type LangDirOpts ¶
type LangDirOpts struct {
Subdirs bool `desc:"process subdirectories -- otherwise not"`
Rebuild bool `desc:"rebuild the symbols by reprocessing from scratch instead of using cache"`
Nocache bool `desc:"do not update the cache with results from processing"`
}
LangDirOpts provides options for Lang ParseDir method
type LangFlags ¶
type LangFlags int
LangFlags are special properties of a given language
const (
	// NoFlags = nothing special
	NoFlags LangFlags = iota

	// IndentSpace means that spaces must be used for this language
	IndentSpace

	// IndentTab means that tabs must be used for this language
	IndentTab

	LangFlagsN
)
LangFlags
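A language's special properties are carried as a slice of these flags in LangProps.Flags (e.g., IndentTab for Go in StdLangProps), so consumers check for membership. A sketch of such a check; hasFlag is an illustrative helper, not part of the pi API:

```go
package main

import "fmt"

// LangFlags mirrors the enum above.
type LangFlags int

const (
	NoFlags LangFlags = iota
	IndentSpace
	IndentTab
	LangFlagsN
)

// hasFlag reports whether a flag is present in a LangProps-style
// Flags slice (a nil slice simply has no flags set).
func hasFlag(flags []LangFlags, f LangFlags) bool {
	for _, fl := range flags {
		if fl == f {
			return true
		}
	}
	return false
}

func main() {
	goFlags := []LangFlags{IndentTab} // as in the StdLangProps entry for Go
	fmt.Println(hasFlag(goFlags, IndentTab), hasFlag(goFlags, IndentSpace))
}
```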
func (LangFlags) MarshalJSON ¶
func (*LangFlags) UnmarshalJSON ¶
type LangProps ¶
type LangProps struct {
Sup filecat.Supported `desc:"language -- must be a supported one from Supported list"`
CommentLn string `desc:"character(s) that start a single-line comment -- if empty then multi-line comment syntax will be used"`
CommentSt string `desc:"character(s) that start a multi-line comment or one that requires both start and end"`
CommentEd string `desc:"character(s) that end a multi-line comment or one that requires both start and end"`
Flags []LangFlags `desc:"special properties for this language"`
Lang Lang `json:"-" xml:"-" desc:"Lang interface for this language"`
Parser *Parser `json:"-" xml:"-" desc:"parser for this language -- initialized in OpenStd"`
}
LangProps contains properties of languages supported by the Pi parser framework
type LangSupporter ¶
type LangSupporter struct {
}
LangSupporter provides general support for supported languages, e.g., looking up lexers and parsers by name. It also implements the lex.LangLexer interface to provide access to other Guest Lexers
func (*LangSupporter) LexerByName ¶
func (ll *LangSupporter) LexerByName(lang string) *lex.Rule
LexerByName looks up Lexer for given language by name (with case-insensitive fallback). Returns nil if not supported.
func (*LangSupporter) OpenStd ¶
func (ll *LangSupporter) OpenStd() error
OpenStd opens all the standard parsers for languages, from the langs/ directory
func (*LangSupporter) Props ¶
func (ll *LangSupporter) Props(sup filecat.Supported) (*LangProps, error)
Props looks up language properties by filecat.Supported const int type
func (*LangSupporter) PropsByName ¶
func (ll *LangSupporter) PropsByName(lang string) (*LangProps, error)
PropsByName looks up language properties by string name of language (with case-insensitive fallback). Returns error if not supported.
type Parser ¶
type Parser struct {
Lexer lex.Rule `desc:"lexer rules for first pass of lexing file"`
PassTwo lex.PassTwo `desc:"second pass after lexing -- computes nesting depth and EOS finding"`
Parser parse.Rule `desc:"parser rules for parsing lexed tokens"`
Filename string `desc:"file name for overall parser (not file being parsed!)"`
ReportErrs bool `desc:"if true, reports errors after parsing, to stdout"`
ModTime time.Time `` /* 149-byte string literal not displayed */
}
Parser is the overall parser for managing the parsing
func (*Parser) Init ¶
func (pr *Parser) Init()
Init initializes the parser -- must be called after creation
func (*Parser) InitAll ¶
func (pr *Parser) InitAll()
InitAll initializes everything about the parser -- call this when setting up a new parser after it has been loaded etc
func (*Parser) LexLine ¶
LexLine runs lexer for given single line of source, returns merged regular and token comment lines, cloned and ready for use
func (*Parser) LexNext ¶
LexNext does the next step of lexing -- returns the lowest-level rule that matched, and nil upon a no-match error or at the end of the source input
func (*Parser) LexNextLine ¶
LexNextLine does the next line of lexing -- returns the lowest-level rule that matched at the end, and nil upon a no-match error or at the end of the source input
func (*Parser) OpenJSON ¶
OpenJSON opens lexer and parser rules from the current filename, in a standard JSON-formatted file
func (*Parser) ParseAll ¶
ParseAll does full parsing, including ParseInit and ParseRun, assuming LexAll has been done already
func (*Parser) ParseLine ¶
ParseLine runs the parser on a given single line of source, doing the parsing in a separate FileState, and returns that FileState with the Ast etc. (or nil if nothing was parsed). Assumes LexLine has already been run on the given line.
func (*Parser) ParseNext ¶
ParseNext does next step of parsing -- returns lowest-level rule that matched or nil if no match error or at end
func (*Parser) ParseString ¶
ParseString runs the lexer and parser on the given string of text, returning a FileState with the results (nil if the string is empty or contains no lexical tokens). Also takes supporting contextual info for the file / language that this string is associated with (used only for reference)
func (*Parser) ParserInit ¶
ParserInit initializes the parser prior to running
func (*Parser) SaveGrammar ¶
SaveGrammar saves lexer and parser grammar rules to BNF-like .pig file