Documentation ¶
Overview ¶
Package parser provides HTML and JavaScript parsing for the crawler.
Index ¶
- func ExtractURLsFromText(text string) []string
- type APIEndpoint
- type APIInfo
- type APIParser
- type APIType
- type AnalyzeResult
- type ButtonInfo
- type Endpoint
- type Form
- type FormAnalyzer
- type FormInfo
- type FormInput
- type FormType
- type FunctionInfo
- type HTMLParser
- type InputInfo
- type JSParseResult
- type JSParser
- type Link
- type Parameter
- type ParseResult
- type PotentialSecret
- type Route
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func ExtractURLsFromText ¶
func ExtractURLsFromText(text string) []string
ExtractURLsFromText extracts URLs from plain text.
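The matching rules used by ExtractURLsFromText are not documented here. As a rough illustration only, the sketch below shows one way a function with the same signature could behave. The name extractURLs, the regular expression, and the choice to deduplicate and trim trailing punctuation are all assumptions, not the package's implementation.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// urlPattern is an illustrative pattern (assumption), not the package's own.
var urlPattern = regexp.MustCompile(`https?://[^\s"'<>]+`)

// extractURLs is a hypothetical stand-in with the same signature as
// ExtractURLsFromText. It returns each distinct http(s) URL in order of
// first appearance, trimming trailing sentence punctuation.
func extractURLs(text string) []string {
	seen := make(map[string]bool)
	var urls []string
	for _, m := range urlPattern.FindAllString(text, -1) {
		u := strings.TrimRight(m, ".,;:!?)")
		if seen[u] {
			continue
		}
		seen[u] = true
		urls = append(urls, u)
	}
	return urls
}

func main() {
	text := "See https://example.com/docs, then http://example.org/a?b=1. Again: https://example.com/docs"
	for _, u := range extractURLs(text) {
		fmt.Println(u)
	}
}
```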
Types ¶
type APIEndpoint ¶
type APIEndpoint struct {
URL string
Method string
Parameters []string
SourceLine int
Context string
}
APIEndpoint represents a discovered API endpoint.
type APIInfo ¶
type APIInfo struct {
Endpoints []Endpoint
BaseURL string
Version string
Type APIType // REST, GraphQL, SOAP, etc.
}
APIInfo represents discovered API information.
type APIParser ¶
type APIParser struct{}
APIParser extracts API information from various sources.
func (*APIParser) ParseFromResponse ¶
ParseFromResponse extracts API information from a response.
func (*APIParser) ParseGraphQLSchema ¶
ParseGraphQLSchema parses a GraphQL schema.
type AnalyzeResult ¶
type AnalyzeResult struct {
Form Form
FormType FormType
HasCSRF bool
CSRFField string
HasCaptcha bool
IsLogin bool
IsSignup bool
IsSearch bool
IsContact bool
IsPayment bool
IsUpload bool
Complexity int // 1-10 complexity score
}
AnalyzeResult contains detailed form analysis results.
type ButtonInfo ¶
ButtonInfo represents a form button.
type Endpoint ¶
type Endpoint struct {
URL string
Method string
Source string
Depth int
Parameters []Parameter
Headers map[string]string
DiscoveredFrom string
StatusCode int
ContentType string
ResponseSize int64
Timestamp time.Time
}
Endpoint represents a discovered API endpoint.
type Form ¶
type Form struct {
URL string
Action string
Method string
Enctype string
Inputs []FormInput
HasCSRF bool
Depth int
Timestamp time.Time
}
Form represents an HTML form discovered during crawling.
type FormAnalyzer ¶
type FormAnalyzer struct{}
FormAnalyzer provides comprehensive form analysis.
func NewFormAnalyzer ¶
func NewFormAnalyzer() *FormAnalyzer
NewFormAnalyzer creates a new form analyzer.
func (*FormAnalyzer) Analyze ¶
func (a *FormAnalyzer) Analyze(form FormInfo, pageURL string) *AnalyzeResult
Analyze performs comprehensive analysis of a form.
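How Analyze decides on a FormType or detects a CSRF field is not specified on this page. The sketch below is a hypothetical heuristic that classifies a form by its input types and looks for a hidden field whose name suggests a CSRF token. The input type, the classify function, and the rules themselves are assumptions for illustration; the returned labels simply reuse the FormType string values.

```go
package main

import (
	"fmt"
	"strings"
)

// input mirrors only the InputInfo fields this sketch needs.
type input struct {
	Name string
	Type string
}

// classify is a hypothetical heuristic, not the package's actual rules.
// It returns a form type label and the name of a likely CSRF field
// (empty if none was found).
func classify(inputs []input) (formType string, csrfField string) {
	passwords, files, searches := 0, 0, 0
	for _, in := range inputs {
		name := strings.ToLower(in.Name)
		switch strings.ToLower(in.Type) {
		case "password":
			passwords++
		case "file":
			files++
		case "search":
			searches++
		case "hidden":
			// Keep the first hidden field whose name looks like a token.
			if csrfField == "" && (strings.Contains(name, "csrf") || strings.Contains(name, "xsrf")) {
				csrfField = in.Name
			}
		}
	}
	switch {
	case passwords >= 2: // password plus confirmation
		return "signup", csrfField
	case passwords == 1:
		return "login", csrfField
	case files > 0:
		return "upload", csrfField
	case searches > 0:
		return "search", csrfField
	}
	return "generic", csrfField
}

func main() {
	ft, csrf := classify([]input{
		{Name: "username", Type: "text"},
		{Name: "password", Type: "password"},
		{Name: "csrf_token", Type: "hidden"},
	})
	fmt.Println(ft, csrf)
}
```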
func (*FormAnalyzer) GeneratePayload ¶
func (a *FormAnalyzer) GeneratePayload(form FormInfo) map[string]string
GeneratePayload generates test payloads for a form.
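The values GeneratePayload produces are not described here. The sketch below assumes the returned map is keyed by input name and shows one plausible way to fill it: hidden values are preserved, a few input types get type-appropriate placeholders, and buttons and file inputs are skipped. The input type, the generatePayload function, and every placeholder value are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// input mirrors only the InputInfo fields this sketch needs.
type input struct {
	Name  string
	Type  string
	Value string
}

// generatePayload is a hypothetical stand-in, not the package's logic.
// It builds a name-to-value map suitable for a test submission.
func generatePayload(inputs []input) map[string]string {
	payload := make(map[string]string)
	for _, in := range inputs {
		if in.Name == "" {
			continue // unnamed controls are not submitted with a form
		}
		switch strings.ToLower(in.Type) {
		case "hidden":
			payload[in.Name] = in.Value // keep server-supplied values such as tokens
		case "email":
			payload[in.Name] = "test@example.com"
		case "number":
			payload[in.Name] = "1"
		case "submit", "button", "reset", "file":
			// Buttons and file inputs are left out of this simple payload.
		default:
			payload[in.Name] = "test"
		}
	}
	return payload
}

func main() {
	p := generatePayload([]input{
		{Name: "email", Type: "email"},
		{Name: "token", Type: "hidden", Value: "abc"},
		{Name: "go", Type: "submit"},
	})
	fmt.Println(p)
}
```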
type FormInfo ¶
type FormInfo struct {
Action string
Method string
Enctype string
ID string
Name string
Class string
Inputs []InputInfo
Buttons []ButtonInfo
}
FormInfo represents a parsed form.
type FormInput ¶
type FormInput struct {
Name string
Type string
Value string
Required bool
Placeholder string
Pattern string
MaxLength int
MinLength int
}
FormInput represents an input field in a form.
type FormType ¶
type FormType string
FormType represents the type of form.
const (
FormTypeLogin FormType = "login"
FormTypeSignup FormType = "signup"
FormTypeSearch FormType = "search"
FormTypeContact FormType = "contact"
FormTypePayment FormType = "payment"
FormTypeUpload FormType = "upload"
FormTypeComment FormType = "comment"
FormTypeSettings FormType = "settings"
FormTypeGeneric FormType = "generic"
)
type FunctionInfo ¶
FunctionInfo represents a function signature.
type HTMLParser ¶
type HTMLParser struct {
// contains filtered or unexported fields
}
HTMLParser parses HTML documents to extract links and other elements.
func NewHTMLParser ¶
func NewHTMLParser(baseURL string) (*HTMLParser, error)
NewHTMLParser creates a new HTML parser.
func (*HTMLParser) Parse ¶
func (p *HTMLParser) Parse(html string) (*ParseResult, error)
Parse parses an HTML document.
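NewHTMLParser takes a base URL and can return an error, which suggests that the base URL is parsed and used to resolve relative links, although this page does not confirm it. The sketch below shows that resolution step on its own using the standard net/url package. The resolveLinks helper and its skip-on-error behavior are assumptions, not part of this package.

```go
package main

import (
	"fmt"
	"net/url"
)

// resolveLinks is a hypothetical helper. It resolves each href against
// baseURL and returns absolute URLs, skipping hrefs that fail to parse.
func resolveLinks(baseURL string, hrefs []string) ([]string, error) {
	base, err := url.Parse(baseURL)
	if err != nil {
		return nil, fmt.Errorf("parse base URL: %w", err)
	}
	var out []string
	for _, h := range hrefs {
		ref, err := url.Parse(h)
		if err != nil {
			continue
		}
		out = append(out, base.ResolveReference(ref).String())
	}
	return out, nil
}

func main() {
	links, err := resolveLinks("https://example.com/a/b.html",
		[]string{"../c", "/d?x=1", "e.html", "https://other.test/"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```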
type InputInfo ¶
type InputInfo struct {
Name string
Type string
Value string
ID string
Class string
Required bool
Disabled bool
Readonly bool
Placeholder string
Pattern string
MinLength int
MaxLength int
Min string
Max string
Step string
Multiple bool
Accept string
Autocomplete string
}
InputInfo represents a form input.
type JSParseResult ¶
type JSParseResult struct {
URLs []string
APIEndpoints []APIEndpoint
WebSockets []string
Secrets []PotentialSecret
Routes []Route
Functions []FunctionInfo
}
JSParseResult contains the result of JavaScript analysis.
type JSParser ¶
type JSParser struct{}
JSParser performs static analysis on JavaScript code.
func (*JSParser) Parse ¶
func (p *JSParser) Parse(js string) *JSParseResult
Parse analyzes JavaScript code.
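The analysis techniques behind JSParser.Parse are not documented on this page. As a rough sketch of pattern-based static analysis, the example below pulls WebSocket URLs and string-literal fetch targets out of JavaScript source. The jsResult type, the parseJS function, and both regular expressions are hypothetical and cover far fewer cases than a real analyzer would.

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative patterns (assumptions), not the package's own.
var (
	wsPattern    = regexp.MustCompile(`\bwss?://[^\s"']+`)
	fetchPattern = regexp.MustCompile(`fetch\(\s*["']([^"']+)["']`)
)

// jsResult holds a small subset of what JSParseResult exposes.
type jsResult struct {
	WebSockets []string
	Endpoints  []string
}

// parseJS is a hypothetical stand-in that scans source text with
// regular expressions; it does not build a syntax tree.
func parseJS(js string) *jsResult {
	res := &jsResult{}
	res.WebSockets = wsPattern.FindAllString(js, -1)
	for _, m := range fetchPattern.FindAllStringSubmatch(js, -1) {
		res.Endpoints = append(res.Endpoints, m[1]) // m[1] is the quoted argument
	}
	return res
}

func main() {
	js := `const s = new WebSocket("wss://example.com/live");
fetch('/api/users').then(r => r.json());`
	res := parseJS(js)
	fmt.Println(res.WebSockets, res.Endpoints)
}
```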
type ParseResult ¶
type ParseResult struct {
Links []Link
Forms []FormInfo
Scripts []string
Stylesheets []string
Images []string
Iframes []string
Meta map[string]string
Comments []string
}
ParseResult contains the result of parsing an HTML document.
type PotentialSecret ¶
PotentialSecret represents a potential secret in code.