Documentation ¶
Overview ¶
Package subscraping contains the logic of scraping agents
Index ¶
- Constants
- func CreateApiKeys[T any](keys []string, provider func(k, v string) T) []T
- func PickRandom[T any](v []T, sourceName string) T
- type BasicAuth
- type CtxArg
- type CustomRateLimit
- type KeyRequirement
- type RegexSubdomainExtractor
- type Result
- type ResultType
- type Session
- func (s *Session) Close()
- func (s *Session) DiscardHTTPResponse(response *http.Response)
- func (s *Session) Get(ctx context.Context, getURL, cookies string, headers map[string]string) (*http.Response, error)
- func (s *Session) HTTPRequest(ctx context.Context, method, requestURL, cookies string, ...) (*http.Response, error)
- func (s *Session) Post(ctx context.Context, postURL, cookies string, headers map[string]string, ...) (*http.Response, error)
- func (s *Session) SimpleGet(ctx context.Context, getURL string) (*http.Response, error)
- func (s *Session) SimplePost(ctx context.Context, postURL, contentType string, body io.Reader) (*http.Response, error)
- type Source
- type Statistics
- type SubdomainExtractor
Constants ¶
const MultipleKeyPartsLength = 2
Variables ¶
This section is empty.
Functions ¶
func CreateApiKeys ¶
func CreateApiKeys[T any](keys []string, provider func(k, v string) T) []T
func PickRandom ¶
func PickRandom[T any](v []T, sourceName string) T
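The signature of CreateApiKeys, together with the constant MultipleKeyPartsLength = 2, suggests that each raw key is split into two parts and handed to the provider callback. The following is a minimal local sketch of that idea, not the package's implementation. The ":" separator, the skipping of malformed entries, and the names createAPIKeys and apiKey are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// multipleKeyPartsLength mirrors the documented constant MultipleKeyPartsLength.
const multipleKeyPartsLength = 2

// apiKey is a hypothetical credential pair used only in this sketch.
type apiKey struct {
	ID     string
	Secret string
}

// createAPIKeys has the same shape as CreateApiKeys.
// Assumption: keys are split on ":" and entries without exactly
// two parts are skipped.
func createAPIKeys[T any](keys []string, provider func(k, v string) T) []T {
	var out []T
	for _, key := range keys {
		parts := strings.Split(key, ":")
		if len(parts) != multipleKeyPartsLength {
			continue
		}
		out = append(out, provider(parts[0], parts[1]))
	}
	return out
}

func main() {
	raw := []string{"id1:secret1", "malformed", "id2:secret2"}
	keys := createAPIKeys(raw, func(k, v string) apiKey {
		return apiKey{ID: k, Secret: v}
	})
	for _, k := range keys {
		fmt.Println(k.ID, k.Secret)
	}
}
```

Keeping the provider generic lets each source build its own credential type from the same two-part input.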
Types ¶
type CustomRateLimit ¶
type CustomRateLimit struct {
    Custom mapsutil.SyncLockMap[string, uint]
}
type KeyRequirement ¶ added in v2.12.0
type KeyRequirement int
KeyRequirement represents the API key requirement level for a source
const (
    NoKey KeyRequirement = iota
    OptionalKey
    RequiredKey
)
type RegexSubdomainExtractor ¶
type RegexSubdomainExtractor struct {
    // contains filtered or unexported fields
}
RegexSubdomainExtractor is a concrete implementation of the SubdomainExtractor interface, using regex for extraction.
func NewSubdomainExtractor ¶
func NewSubdomainExtractor(domain string) (*RegexSubdomainExtractor, error)
NewSubdomainExtractor creates a new regular expression to extract subdomains from text based on the given domain.
func (*RegexSubdomainExtractor) Extract ¶
func (re *RegexSubdomainExtractor) Extract(text string) []string
Extract implements the SubdomainExtractor interface, using the regex to find subdomains in the given text.
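A regex-based extractor of this kind can be sketched with the standard regexp package. This is a local stand-in, not the package's code. The pattern, the extractor type, and newExtractor are assumptions for illustration, and the real pattern may differ.

```go
package main

import (
	"fmt"
	"regexp"
)

// extractor is a local stand-in for RegexSubdomainExtractor.
type extractor struct {
	re *regexp.Regexp
}

// newExtractor compiles a case-insensitive pattern that matches names
// ending in ".<domain>". The pattern is illustrative only.
func newExtractor(domain string) (*extractor, error) {
	re, err := regexp.Compile(`(?i)[a-z0-9_.-]+\.` + regexp.QuoteMeta(domain))
	if err != nil {
		return nil, err
	}
	return &extractor{re: re}, nil
}

// Extract returns every match of the compiled pattern in text.
func (e *extractor) Extract(text string) []string {
	return e.re.FindAllString(text, -1)
}

func main() {
	ex, err := newExtractor("example.com")
	if err != nil {
		fmt.Println("compile failed:", err)
		return
	}
	text := "see api.example.com and WWW.example.com, not example.org"
	for _, sub := range ex.Extract(text) {
		fmt.Println(sub)
	}
}
```

Quoting the domain with regexp.QuoteMeta keeps the dots in the domain from matching arbitrary characters.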
type Result ¶
type Result struct {
    Type   ResultType
    Source string
    Value  string
    Error  error
}
Result is a result structure returned by a source
type ResultType ¶
type ResultType int
ResultType is the type of result returned by the source
const (
    Subdomain ResultType = iota
    Error
)
Types of results returned by the source
type Session ¶
type Session struct {
    // SubdomainExtractor
    Extractor SubdomainExtractor
    // Client is the current http client
    Client *http.Client
    // Rate limit instance
    MultiRateLimiter *ratelimit.MultiLimiter
}
Session is the option passed to the source; a separate one is created for each source.
func NewSession ¶
func NewSession(domain string, proxy string, multiRateLimiter *ratelimit.MultiLimiter, timeout int) (*Session, error)
NewSession creates a new session object for a domain
func (*Session) DiscardHTTPResponse ¶
func (s *Session) DiscardHTTPResponse(response *http.Response)
DiscardHTTPResponse discards the response content on demand
func (*Session) Get ¶
func (s *Session) Get(ctx context.Context, getURL, cookies string, headers map[string]string) (*http.Response, error)
Get makes a GET request to a URL with extended parameters
func (*Session) HTTPRequest ¶
func (s *Session) HTTPRequest(ctx context.Context, method, requestURL, cookies string, headers map[string]string, body io.Reader, basicAuth BasicAuth) (*http.Response, error)
HTTPRequest makes any HTTP request to a URL with extended parameters
func (*Session) Post ¶
func (s *Session) Post(ctx context.Context, postURL, cookies string, headers map[string]string, body io.Reader) (*http.Response, error)
Post makes a POST request to a URL with extended parameters
type Source ¶
type Source interface {
    // Run takes a domain as argument and a session object
    // which contains the extractor for subdomains, http client
    // and other stuff.
    Run(context.Context, string, *Session) <-chan Result
    // Name returns the name of the source. It is preferred to use lower case names.
    Name() string
    // IsDefault returns true if the current source should be
    // used as part of the default execution.
    IsDefault() bool
    // HasRecursiveSupport returns true if the current source
    // accepts subdomains (e.g. subdomain.domain.tld),
    // not just root domains.
    HasRecursiveSupport() bool
    // KeyRequirement returns the API key requirement level for this source
    KeyRequirement() KeyRequirement
    // NeedsKey returns true if the source requires an API key.
    // Deprecated: Use KeyRequirement() instead for more granular control.
    NeedsKey() bool
    AddApiKeys([]string)
    // Statistics returns the scraping statistics for the source
    Statistics() Statistics
}
Source is an interface implemented by each passive source
type Statistics ¶
Statistics contains statistics about the scraping process
type SubdomainExtractor ¶
SubdomainExtractor is an interface that defines the contract for subdomain extraction.
Directories ¶
| Path | Synopsis |
|---|---|
| sources | |
| sources/alienvault | Package alienvault logic |
| sources/anubis | Package anubis logic |
| sources/bevigil | Package bevigil logic |
| sources/bufferover | Package bufferover is a bufferover Scraping Engine in Golang |
| sources/builtwith | Package builtwith logic |
| sources/c99 | Package c99 logic |
| sources/censys | Package censys logic |
| sources/certspotter | Package certspotter logic |
| sources/chaos | Package chaos logic |
| sources/commoncrawl | Package commoncrawl logic |
| sources/crtsh | Package crtsh logic |
| sources/digitorus | Package waybackarchive logic |
| sources/dnsdb | Package dnsdb logic |
| sources/dnsdumpster | Package dnsdumpster logic |
| sources/domainsproject | Package domainsproject logic |
| sources/fofa | Package fofa logic |
| sources/github | Package github GitHub search package Based on gwen001's https://github.com/gwen001/github-search github-subdomains |
| sources/hackertarget | Package hackertarget logic |
| sources/hudsonrock | Package hudsonrock logic |
| sources/intelx | Package intelx logic |
| sources/leakix | Package leakix logic |
| sources/merklemap | Package merklemap logic |
| sources/netlas | Package netlas logic |
| sources/onyphe | Package onyphe logic |
| sources/profundis | Package profundis logic |
| sources/pugrecon | Package pugrecon logic |
| sources/quake | Package quake logic |
| sources/rapiddns | Package rapiddns is a RapidDNS Scraping Engine in Golang |
| sources/reconcloud | Package reconcloud logic |
| sources/redhuntlabs | Package redhuntlabs logic |
| sources/riddler | Package riddler logic |
| sources/robtex | Package robtex logic |
| sources/securitytrails | Package securitytrails logic |
| sources/shodan | Package shodan logic |
| sources/sitedossier | Package sitedossier logic |
| sources/thc | Package thc logic |
| sources/threatbook | Package threatbook logic |
| sources/threatminer | Package threatminer logic |
| sources/virustotal | Package virustotal logic |
| sources/waybackarchive | Package waybackarchive logic |
| sources/whoisxmlapi | Package whoisxmlapi logic |
| sources/windvane | Package windvane logic |