Documentation ¶
Index ¶
- Constants
- Variables
- func ConvertGoquerySelectionToFormField(item *goquery.Selection) interface{}
- func ExtractBodyEndpoints(data string) []string
- func ExtractParentPaths(rawurl string) []string
- func ExtractRelativeEndpoints(data string) []string
- func FingerprintURL(rawURL string, trie *PathTrie) string
- func FlattenHeaders(headers map[string][]string) map[string]string
- func FormFillSuggestions(formFields []interface{}) mapsutil.OrderedMap[string, string]
- func FormInputFillSuggestions(inputs []FormInput) mapsutil.OrderedMap[string, string]
- func FormSelectFill(inputs []FormSelect) mapsutil.OrderedMap[string, string]
- func FormTextAreaFill(inputs []FormTextArea) mapsutil.OrderedMap[string, string]
- func IsPathCommonJSLibraryFile(path string) bool
- func IsURL(url string) bool
- func MergeDataMaps(dataMap1 *mapsutil.OrderedMap[string, string], ...)
- func ParseFormFields(document *goquery.Document) []navigation.Form
- func ParseLinkTag(value string) []string
- func ParseRefreshTag(value string) string
- func ParseSRCSetTag(value string) []string
- func ReplaceAllQueryParam(reqUrl, val string) string
- func WebUserAgent() string
- type FormFillData
- type FormInput
- type FormSelect
- type FormTextArea
- type JSLuiceEndpoint
- type PathTrie
- type SelectOption
Constants ¶
const DefaultMaxHosts = 10000
DefaultMaxHosts is the maximum number of hosts tracked by the LRU cache. When exceeded, the least recently used host's trie is evicted.
const DefaultPromotionThreshold = 10
DefaultPromotionThreshold is the number of distinct children a trie node must accumulate before being promoted to a parameter node.
Variables ¶
var (
	BodyA0 = `(?:`
	BodyB0 = `(`
	BodyC0 = `(?:[\.]{1,2}/[A-Za-z0-9\-_/\\?&@\.?=%]+)`
	BodyC1 = `|(https?://[A-Za-z0-9_\-\.]+([\.]{0,2})?\/[A-Za-z0-9\-_/\\?&@\.?=%]+)`
	BodyC2 = `|(/[A-Za-z0-9\-_/\\?&@\.%]+\.(aspx?|action|cfm|cgi|do|pl|css|x?html?|js(p|on)?|pdf|php5?|py|rss))`
	BodyC3 = `|([A-Za-z0-9\-_?&@\.%]+/[A-Za-z0-9/\\\-_?&@\.%]+\.(aspx?|action|cfm|cgi|do|pl|css|x?html?|js(p|on)?|pdf|php5?|py|rss))`
	BodyB1 = `)`
	BodyA1 = `)`
	JsA0 = `(?:"|'|\s)`
	JsB0 = `(`
	JsC0 = `((https?://[A-Za-z0-9_\-.]+(?:\:\d{1,5})?)+([\.]{1,2})?/[A-Za-z0-9/\-_\\.%]+(?:[\?|#][^"']+)?)`
	JsC1 = `|((\.{1,2}/)?[a-zA-Z0-9\-_/\\%]+\.(aspx?|js(?:on|p)?|html|php5?|action|do)(?:[\?|#][^"']+)?)`
	JsC2 = `|((\.{0,2}/)[a-zA-Z0-9\-_/\\%]+(?:/|\\)[a-zA-Z0-9\-_]{3,}(?:[\?|#][^"']+)?)`
	JsC3 = `|((\.{0,2})[a-zA-Z0-9\-_/\\%]{3,}/)`
	JsB1 = `)`
	JsA1 = `(?:"|'|\s)`
)
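The Body fragments read as pieces of one larger pattern. The following is a minimal sketch of how they could be combined, assuming they are concatenated in the order listed; how the package actually compiles them is not shown in this documentation. The names `bodyPattern` and `findBodyEndpoints` are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// Fragments copied verbatim from the Body* variables above.
const (
	bodyA0 = `(?:`
	bodyB0 = `(`
	bodyC0 = `(?:[\.]{1,2}/[A-Za-z0-9\-_/\\?&@\.?=%]+)`
	bodyC1 = `|(https?://[A-Za-z0-9_\-\.]+([\.]{0,2})?\/[A-Za-z0-9\-_/\\?&@\.?=%]+)`
	bodyC2 = `|(/[A-Za-z0-9\-_/\\?&@\.%]+\.(aspx?|action|cfm|cgi|do|pl|css|x?html?|js(p|on)?|pdf|php5?|py|rss))`
	bodyC3 = `|([A-Za-z0-9\-_?&@\.%]+/[A-Za-z0-9/\\\-_?&@\.%]+\.(aspx?|action|cfm|cgi|do|pl|css|x?html?|js(p|on)?|pdf|php5?|py|rss))`
	bodyB1 = `)`
	bodyA1 = `)`
)

// bodyPattern joins the fragments in their listed order.
// Assumption: this ordering mirrors how the package assembles them.
var bodyPattern = regexp.MustCompile(
	bodyA0 + bodyB0 + bodyC0 + bodyC1 + bodyC2 + bodyC3 + bodyB1 + bodyA1,
)

// findBodyEndpoints is a hypothetical helper that returns every
// non-overlapping match of the combined pattern in data.
func findBodyEndpoints(data string) []string {
	return bodyPattern.FindAllString(data, -1)
}

func main() {
	fmt.Println(findBodyEndpoints(`<a href="/api/users.php">users</a>`))
}
```

Here the third alternative (an absolute path ending in a known extension) is the one that matches the `href` value.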
var (
	// CommonJSLibraryFileRegex is a regex to match common js library files.
	CommonJSLibraryFileRegex = `` /* 2139-byte string literal not displayed */
)
var DefaultFormFillData = FormFillData{
	Email: fmt.Sprintf("%s@example.org", xid.New().String()),
	Color: "#e66465",
	Password: "katanaP@assw0rd1",
	PhoneNumber: "2124567890",
	Placeholder: "katana",
}
Functions ¶
func ConvertGoquerySelectionToFormField ¶ added in v1.1.1
func ConvertGoquerySelectionToFormField(item *goquery.Selection) interface{}
ConvertGoquerySelectionToFormField converts a goquery.Selection to a form field by dispatching on the element type: an input is converted with ConvertGoquerySelectionToFormInput, a select with ConvertGoquerySelectionToFormSelect, and a textarea with ConvertGoquerySelectionToFormTextArea. Any other element yields nil.
func ExtractBodyEndpoints ¶
func ExtractBodyEndpoints(data string) []string
ExtractBodyEndpoints extracts the endpoints found in the given body data and returns them as a slice of strings.
func ExtractParentPaths ¶ added in v1.2.0
func ExtractParentPaths(rawurl string) []string
ExtractParentPaths returns all parent directory paths of the given URL.
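As an illustration of what extracting parent directories involves, here is a minimal sketch built on the standard library. The helper name `parentPaths`, the shallowest-first ordering, and the trailing-slash output format are all assumptions; the exact values returned by ExtractParentPaths are not specified above.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parentPaths is a hypothetical helper that returns each ancestor
// directory of the URL's path, from the shallowest to the deepest.
// The final segment (the leaf) is not included.
func parentPaths(rawurl string) []string {
	u, err := url.Parse(rawurl)
	if err != nil {
		return nil
	}
	segments := strings.Split(strings.Trim(u.Path, "/"), "/")
	var paths []string
	// Stop before the last segment so only directories are emitted.
	for i := 1; i < len(segments); i++ {
		paths = append(paths, "/"+strings.Join(segments[:i], "/")+"/")
	}
	return paths
}

func main() {
	// Prints: [/a/ /a/b/]
	fmt.Println(parentPaths("https://example.com/a/b/c.html"))
}
```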
func ExtractRelativeEndpoints ¶
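func ExtractRelativeEndpoints(data string) []string
ExtractRelativeEndpoints extracts the relative endpoints found in the given data and returns them as a slice of strings.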
func FingerprintURL ¶ added in v1.5.0
func FingerprintURL(rawURL string, trie *PathTrie) string
FingerprintURL produces a structural fingerprint of the given URL by:
1. Replacing variable path segments (IDs, UUIDs, hashes, dates) with placeholders
2. Using the adaptive trie (if provided) to detect learned parameter positions
3. Dropping query parameter values, keeping only sorted keys
When trie is nil, the trie-based detection of step 2 is skipped and only the Layer 1 regex-based normalization is applied.
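The regex-based normalization and the query-key handling can be sketched as follows. This is a simplified illustration, not the package's implementation: it covers only numeric IDs and UUIDs (hashes and dates are omitted), and the placeholder tokens `{id}` and `{uuid}`, the output format, and the function name `fingerprint` are assumptions.

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
	"sort"
	"strings"
)

var (
	numericSeg = regexp.MustCompile(`^\d+$`)
	uuidSeg    = regexp.MustCompile(`^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$`)
)

// fingerprint is a hypothetical helper: it replaces variable-looking path
// segments with placeholders and keeps only the sorted query keys.
func fingerprint(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil {
		return rawURL
	}
	segments := strings.Split(u.Path, "/")
	for i, s := range segments {
		switch {
		case numericSeg.MatchString(s):
			segments[i] = "{id}" // assumed placeholder token
		case uuidSeg.MatchString(s):
			segments[i] = "{uuid}" // assumed placeholder token
		}
	}
	// Drop query values; keep keys in sorted order.
	q := u.Query()
	keys := make([]string, 0, len(q))
	for k := range q {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	fp := u.Host + strings.Join(segments, "/")
	if len(keys) > 0 {
		fp += "?" + strings.Join(keys, "&")
	}
	return fp
}

func main() {
	// Prints: example.com/users/{id}/posts?page&sort
	fmt.Println(fingerprint("https://example.com/users/42/posts?sort=asc&page=2"))
}
```

Two URLs that differ only in the ID segment, the query values, or the query key order produce the same fingerprint, which is the property that makes this kind of normalization useful for deduplication.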
func FlattenHeaders ¶ added in v1.0.0
func FlattenHeaders(headers map[string][]string) map[string]string
func FormFillSuggestions ¶ added in v1.1.1
func FormFillSuggestions(formFields []interface{}) mapsutil.OrderedMap[string, string]
FormFillSuggestions takes a slice of form fields and returns an ordered map containing suggestions for filling those form fields. The function iterates over each form field and based on its type, calls the corresponding fill function to generate suggestions. The suggestions are then merged into a single ordered map and returned.
Parameters:
- formFields: A slice of form fields.
Returns: An ordered map containing suggestions for filling the form fields.
func FormInputFillSuggestions ¶
func FormInputFillSuggestions(inputs []FormInput) mapsutil.OrderedMap[string, string]
FormInputFillSuggestions returns an ordered map of form-filling suggestions for the given inputs, populated with the recommended values.
func FormSelectFill ¶ added in v1.1.1
func FormSelectFill(inputs []FormSelect) mapsutil.OrderedMap[string, string]
FormSelectFill fills a map with selected values from a slice of FormSelect structs. It iterates over each FormSelect struct in the inputs slice and checks for a selected option. If a selected option is found, it adds the corresponding value to the map using the input's name as the key. If no option is selected, it selects the first option and adds its value to the map. The function returns the filled map.
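The selection rule described above can be sketched with simplified local types. This is an illustration under stated assumptions: it uses a plain map instead of mapsutil.OrderedMap, treats any non-empty Selected string as "selected", and skips selects that have no options. The lowercase type and function names are hypothetical stand-ins, not the package's own.

```go
package main

import "fmt"

// selectOption and formSelect are simplified stand-ins for
// SelectOption and FormSelect.
type selectOption struct {
	Value    string
	Selected string
}

type formSelect struct {
	Name          string
	SelectOptions []selectOption
}

// selectFill picks the selected option for each select,
// falling back to the first option when none is selected.
func selectFill(inputs []formSelect) map[string]string {
	out := make(map[string]string)
	for _, in := range inputs {
		if len(in.SelectOptions) == 0 {
			continue // assumption: nothing to fill without options
		}
		chosen := in.SelectOptions[0].Value
		for _, opt := range in.SelectOptions {
			if opt.Selected != "" { // assumption: non-empty means selected
				chosen = opt.Value
				break
			}
		}
		out[in.Name] = chosen
	}
	return out
}

func main() {
	inputs := []formSelect{
		{Name: "country", SelectOptions: []selectOption{{Value: "us"}, {Value: "de", Selected: "selected"}}},
		{Name: "lang", SelectOptions: []selectOption{{Value: "en"}, {Value: "fr"}}},
	}
	fmt.Println(selectFill(inputs)) // map[country:de lang:en]
}
```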
func FormTextAreaFill ¶ added in v1.1.1
func FormTextAreaFill(inputs []FormTextArea) mapsutil.OrderedMap[string, string]
FormTextAreaFill fills the form text areas with placeholder values. It takes a slice of FormTextArea structs as input and returns an OrderedMap containing the form field names as keys and the placeholder values as values.
func IsPathCommonJSLibraryFile ¶ added in v1.0.3
func IsPathCommonJSLibraryFile(path string) bool
IsPathCommonJSLibraryFile reports whether the given path is a common JS library file.
func MergeDataMaps ¶ added in v1.1.1
func MergeDataMaps(dataMap1 *mapsutil.OrderedMap[string, string], dataMap2 mapsutil.OrderedMap[string, string])
func ParseFormFields ¶ added in v1.0.3
func ParseFormFields(document *goquery.Document) []navigation.Form
ParseFormFields parses the form, input, textarea, and select elements of the given document.
func ParseLinkTag ¶
func ParseLinkTag(value string) []string
ParseLinkTag parses a Link tag value and returns the URLs found.
Inspired by: https://github.com/tomnomnom/linkheader
func ParseRefreshTag ¶
func ParseRefreshTag(value string) string
ParseRefreshTag parses a refresh tag value and returns the URL found.
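A refresh value typically has the form `<seconds>; url=<target>`. The sketch below shows one way to pull the target out of such a value. The helper name `refreshURL` and the choice to return an empty string when no URL part is present are assumptions, not documented behavior of ParseRefreshTag.

```go
package main

import (
	"fmt"
	"strings"
)

// refreshURL is a hypothetical helper that extracts the target from a
// refresh value such as "5; url=https://example.com/next".
func refreshURL(value string) string {
	// Split the delay from the rest at the first semicolon.
	parts := strings.SplitN(value, ";", 2)
	if len(parts) < 2 {
		return ""
	}
	rest := strings.TrimSpace(parts[1])
	// The "url=" prefix is matched case-insensitively.
	if len(rest) < 4 || !strings.EqualFold(rest[:4], "url=") {
		return ""
	}
	// Strip surrounding whitespace and optional quotes.
	return strings.Trim(strings.TrimSpace(rest[4:]), `"'`)
}

func main() {
	fmt.Println(refreshURL("5; url=https://example.com/next")) // https://example.com/next
	fmt.Println(refreshURL("0;URL='/home'"))                   // /home
}
```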
func ParseSRCSetTag ¶
func ParseSRCSetTag(value string) []string
ParseSRCSetTag parses a srcset tag value and returns the URLs found.
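A srcset value is a comma-separated list of candidates, each a URL optionally followed by a width or density descriptor. A minimal sketch of extracting the URLs, assuming a naive comma split (which would mis-split URLs that themselves contain commas); `srcsetURLs` is a hypothetical name:

```go
package main

import (
	"fmt"
	"strings"
)

// srcsetURLs is a hypothetical helper that returns the URL part of each
// srcset candidate, discarding descriptors such as "480w" or "2x".
func srcsetURLs(value string) []string {
	var urls []string
	for _, candidate := range strings.Split(value, ",") {
		// The URL is the first whitespace-separated field of a candidate.
		fields := strings.Fields(candidate)
		if len(fields) > 0 {
			urls = append(urls, fields[0])
		}
	}
	return urls
}

func main() {
	// Prints: [small.jpg large.jpg]
	fmt.Println(srcsetURLs("small.jpg 480w, large.jpg 2x"))
}
```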
func ReplaceAllQueryParam ¶ added in v1.0.1
func ReplaceAllQueryParam(reqUrl, val string) string
ReplaceAllQueryParam replaces the value of every query parameter in reqUrl with val.
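The effect can be sketched with the standard library's net/url package. This is an illustration only: `replaceAllQueryValues` is a hypothetical name, and because `url.Values.Encode` sorts keys and re-escapes values, the parameter order here may differ from what the package's own function produces.

```go
package main

import (
	"fmt"
	"net/url"
)

// replaceAllQueryValues is a hypothetical helper that sets every query
// parameter in reqURL to val and returns the rebuilt URL.
func replaceAllQueryValues(reqURL, val string) string {
	u, err := url.Parse(reqURL)
	if err != nil {
		return reqURL
	}
	q := u.Query()
	for k := range q {
		q.Set(k, val) // overwrites all existing values for the key
	}
	u.RawQuery = q.Encode() // note: Encode sorts keys alphabetically
	return u.String()
}

func main() {
	// Prints: https://example.com/search?page=katana&q=katana
	fmt.Println(replaceAllQueryValues("https://example.com/search?q=go&page=2", "katana"))
}
```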
Types ¶
type FormFillData ¶
type FormFillData struct {
Email string `yaml:"email"`
Color string `yaml:"color"`
Password string `yaml:"password"`
PhoneNumber string `yaml:"phone"`
Placeholder string `yaml:"placeholder"`
}
FormFillData contains suggestions for form filling
var FormData FormFillData
FormData is the global form fill data instance
type FormInput ¶
type FormInput struct {
Type string
Name string
Value string
Attributes mapsutil.OrderedMap[string, string]
}
FormInput is an input for a form field
func ConvertGoquerySelectionToFormInput ¶
ConvertGoquerySelectionToFormInput converts goquery selection to form input
type FormSelect ¶ added in v1.1.1
type FormSelect struct {
Name string
Attributes mapsutil.OrderedMap[string, string]
SelectOptions []SelectOption
}
FormSelect is a select input for a form field
func ConvertGoquerySelectionToFormSelect ¶ added in v1.1.1
func ConvertGoquerySelectionToFormSelect(item *goquery.Selection) FormSelect
ConvertGoquerySelectionToFormSelect converts a goquery.Selection object to a FormSelect object. It extracts the attributes and form options from the goquery.Selection and populates them in the FormSelect object. The converted FormSelect object is then returned.
type FormTextArea ¶ added in v1.1.1
type FormTextArea struct {
Name string
Attributes mapsutil.OrderedMap[string, string]
}
func ConvertGoquerySelectionToFormTextArea ¶ added in v1.1.1
func ConvertGoquerySelectionToFormTextArea(item *goquery.Selection) FormTextArea
ConvertGoquerySelectionToFormTextArea converts a goquery.Selection object to a FormTextArea struct. It extracts the attributes from the first node of the selection and populates a FormTextArea object with the extracted data. The "name" attribute is assigned to the Name field of the FormTextArea, while other attributes are added to the Attributes map.
type JSLuiceEndpoint ¶ added in v1.0.3
func ExtractJsluiceEndpoints ¶ added in v1.0.3
func ExtractJsluiceEndpoints(data string) []JSLuiceEndpoint
ExtractJsluiceEndpoints extracts jsluice endpoints from a given string.
We use tomnomnom and bishopfox's jsluice to extract endpoints from javascript files.
We apply several optimizations before running jsluice:
- We skip common js library files.
- We skip lines that are excessively long.
type PathTrie ¶ added in v1.5.0
type PathTrie struct {
// contains filtered or unexported fields
}
PathTrie is a per-host adaptive trie that tracks unique path segments at each position. When a node accumulates more distinct children than the promotion threshold, it is promoted to a parameter node and all future values at that position are collapsed.
Host tracking is bounded by an LRU cache to prevent unbounded memory growth during large crawls.
func NewPathTrie ¶ added in v1.5.0
NewPathTrie creates a new PathTrie with the given promotion threshold. If threshold is <= 0, DefaultPromotionThreshold is used.
func (*PathTrie) Fingerprint ¶ added in v1.5.0
Fingerprint walks the trie for the given host and segments, returning a new slice where promoted positions are replaced with "{param}". Non-promoted segments are registered in the trie for future cardinality tracking.
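The promotion mechanism can be sketched as a small trie keyed by host. This is a simplified illustration, not the package's implementation: the LRU bound on hosts is omitted, the subtrees under a node are discarded when it is promoted, and the names `trie`, `node`, `newTrie`, and `fingerprint` are hypothetical.

```go
package main

import "fmt"

// node tracks the distinct segments seen at one path position.
type node struct {
	children map[string]*node
	promoted bool
	param    *node // single shared child once the node is promoted
}

func newNode() *node { return &node{children: map[string]*node{}} }

// trie holds one root node per host (no eviction in this sketch).
type trie struct {
	threshold int
	hosts     map[string]*node
}

func newTrie(threshold int) *trie {
	return &trie{threshold: threshold, hosts: map[string]*node{}}
}

// fingerprint registers the segments and returns a copy in which
// promoted positions are replaced with "{param}".
func (t *trie) fingerprint(host string, segments []string) []string {
	cur, ok := t.hosts[host]
	if !ok {
		cur = newNode()
		t.hosts[host] = cur
	}
	out := make([]string, len(segments))
	for i, seg := range segments {
		if !cur.promoted {
			if _, seen := cur.children[seg]; !seen {
				cur.children[seg] = newNode()
			}
			// Promote once distinct children exceed the threshold.
			if len(cur.children) > t.threshold {
				cur.promoted = true
				cur.param = newNode()
				cur.children = nil // simplification: drop old subtrees
			}
		}
		if cur.promoted {
			out[i] = "{param}"
			cur = cur.param
			continue
		}
		out[i] = seg
		cur = cur.children[seg]
	}
	return out
}

func main() {
	t := newTrie(2)
	for _, id := range []string{"1", "2", "3", "1"} {
		fmt.Println(t.fingerprint("example.com", []string{"users", id}))
	}
}
```

With a threshold of 2, the first two calls return the segments unchanged. The third distinct value under `users` exceeds the threshold, so that call and every later one report `{param}` at that position.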
type SelectOption ¶ added in v1.1.1
type SelectOption struct {
Value string
Selected string
Attributes mapsutil.OrderedMap[string, string]
}
SelectOption is an option for a select input
func ConvertGoquerySelectionToSelectOption ¶ added in v1.1.1
func ConvertGoquerySelectionToSelectOption(item *goquery.Selection) SelectOption
ConvertGoquerySelectionToSelectOption converts a goquery.Selection object to a SelectOption object. It extracts the attributes from the goquery.Selection object and populates a SelectOption object with the extracted values.