Documentation
Index
Constants
This section is empty.
Variables
This section is empty.
Functions
This section is empty.
Types
type ActiveURL
type ActiveURL struct {
	URL       string    `json:"url"`
	StartTime time.Time `json:"start_time"`
	Duration  string    `json:"duration"`
	Depth     int       `json:"depth"`
}
ActiveURL represents a URL currently being processed.
type CrawlDebugger
type CrawlDebugger struct {
	// contains filtered or unexported fields
}
CrawlDebugger tracks active URLs for debugging.
func NewCrawlDebugger
func NewCrawlDebugger(httpPort int) *CrawlDebugger
NewCrawlDebugger creates a new debugger instance.
func (*CrawlDebugger) Close
func (cd *CrawlDebugger) Close()
Close shuts down the debugger.
func (*CrawlDebugger) EndURL
func (cd *CrawlDebugger) EndURL(url string)
EndURL marks a URL as finished processing.
func (*CrawlDebugger) GetActiveURLs
func (cd *CrawlDebugger) GetActiveURLs() []ActiveURL
GetActiveURLs returns currently active URLs with durations.
func (*CrawlDebugger) StartURL
func (cd *CrawlDebugger) StartURL(url string, depth int)
StartURL marks a URL as being processed.
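As a usage illustration, the sketch below exercises the API above against a minimal self-contained stand-in (the map-based internals, the mutex, and the ignored httpPort are assumptions made so the sketch runs on its own; the real debugger presumably serves this data over HTTP on httpPort):

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// ActiveURL mirrors the exported type documented above.
type ActiveURL struct {
	URL       string    `json:"url"`
	StartTime time.Time `json:"start_time"`
	Duration  string    `json:"duration"`
	Depth     int       `json:"depth"`
}

// CrawlDebugger is a minimal stand-in consistent with the documented
// signatures; its unexported fields here are illustrative only.
type CrawlDebugger struct {
	mu     sync.Mutex
	active map[string]ActiveURL
}

// NewCrawlDebugger creates a new debugger instance. The stand-in
// ignores httpPort; the real package presumably listens on it.
func NewCrawlDebugger(httpPort int) *CrawlDebugger {
	return &CrawlDebugger{active: make(map[string]ActiveURL)}
}

// StartURL marks a URL as being processed.
func (cd *CrawlDebugger) StartURL(url string, depth int) {
	cd.mu.Lock()
	defer cd.mu.Unlock()
	cd.active[url] = ActiveURL{URL: url, StartTime: time.Now(), Depth: depth}
}

// EndURL marks a URL as finished processing.
func (cd *CrawlDebugger) EndURL(url string) {
	cd.mu.Lock()
	defer cd.mu.Unlock()
	delete(cd.active, url)
}

// GetActiveURLs returns currently active URLs with durations.
func (cd *CrawlDebugger) GetActiveURLs() []ActiveURL {
	cd.mu.Lock()
	defer cd.mu.Unlock()
	out := make([]ActiveURL, 0, len(cd.active))
	for _, a := range cd.active {
		a.Duration = time.Since(a.StartTime).String()
		out = append(out, a)
	}
	return out
}

// Close shuts down the debugger (a no-op in this stand-in).
func (cd *CrawlDebugger) Close() {}

func main() {
	cd := NewCrawlDebugger(8080)
	defer cd.Close()

	cd.StartURL("https://example.com/", 0)
	cd.StartURL("https://example.com/about", 1)
	cd.EndURL("https://example.com/")

	for _, a := range cd.GetActiveURLs() {
		fmt.Printf("depth=%d url=%s running=%s\n", a.Depth, a.URL, a.Duration)
	}
}
```

The typical pattern is a StartURL/EndURL pair around each page fetch, with GetActiveURLs polled (or served over HTTP) to see which URLs have been in flight the longest.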
Directories

| Path | Synopsis |
|---|---|
| cookie | Package cookie implements cookie consent handling for the crawler. |
| stealth | from stealth go-rod |
| normalizer/simhash | Package simhash implements SimHash algorithm for near-duplicate detection. |
| | Package graph implements a Directed Graph for storing state information during crawling of a Web Application. |
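The simhash synopsis mentions near-duplicate detection. As an illustration only (not the package's actual implementation), a minimal 64-bit SimHash can be sketched as: hash each token, let each token vote +1/-1 per bit position, and take the sign of each column as the fingerprint bit; near-duplicate texts then differ in few bits:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/bits"
	"strings"
)

// simhash computes a 64-bit SimHash over whitespace-separated tokens.
// Each token's FNV-1a hash contributes a +1/-1 vote per bit position;
// the sign of each column's total becomes that bit of the fingerprint.
func simhash(text string) uint64 {
	var counts [64]int
	for _, tok := range strings.Fields(text) {
		h := fnv.New64a()
		h.Write([]byte(tok))
		v := h.Sum64()
		for i := 0; i < 64; i++ {
			if v&(1<<uint(i)) != 0 {
				counts[i]++
			} else {
				counts[i]--
			}
		}
	}
	var fp uint64
	for i := 0; i < 64; i++ {
		if counts[i] > 0 {
			fp |= 1 << uint(i)
		}
	}
	return fp
}

// hammingDistance counts differing bits between two fingerprints;
// small distances indicate near-duplicate inputs.
func hammingDistance(a, b uint64) int {
	return bits.OnesCount64(a ^ b)
}

func main() {
	a := simhash("the quick brown fox jumps over the lazy dog")
	b := simhash("the quick brown fox jumped over the lazy dog")
	c := simhash("completely different sentence about crawling web pages")
	fmt.Println("near-duplicate distance:", hammingDistance(a, b))
	fmt.Println("unrelated distance:     ", hammingDistance(a, c))
}
```

Real implementations typically weight tokens (e.g. by frequency) and use shingles rather than single words; this sketch omits both for brevity.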