Overview ¶
Package crawler provides website crawling with single-page application (SPA) detection.
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
This section is empty.
Types ¶
type Config ¶
type Config struct {
	// Maximum crawl depth
	MaxDepth int
	// Maximum pages to crawl
	MaxPages int
	// Include patterns (glob)
	IncludePatterns []string
	// Exclude patterns (glob)
	ExcludePatterns []string
	// Wait for SPA hydration
	WaitForSPA bool
	// SPA framework indicators
	SPAIndicators []string
	// Respect robots.txt
	RespectRobots bool
	// Use sitemap.xml
	UseSitemap bool
	// Delay between requests
	Delay time.Duration
	// Request timeout
	Timeout time.Duration
	// Concurrency limit
	Concurrency int
}
Config configures the crawler.
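The IncludePatterns and ExcludePatterns fields are documented only as globs, so the exact matching rules are not stated here. As a rough sketch of how such filtering could work, the example below uses the standard library's path.Match and lets exclusions take precedence over inclusions. The helper name allowed, the precedence rule, and the choice of path.Match syntax are assumptions for illustration, not the package's confirmed behavior.

```go
package main

import (
	"fmt"
	"path"
)

// allowed reports whether urlPath passes the include and exclude globs.
// Assumption: patterns follow path.Match syntax and exclusions win over
// inclusions; the crawler package may use a different glob dialect.
func allowed(urlPath string, include, exclude []string) bool {
	for _, pat := range exclude {
		// A malformed pattern yields an error and ok == false, so it never matches.
		if ok, _ := path.Match(pat, urlPath); ok {
			return false
		}
	}
	// An empty include list is treated as "include everything".
	if len(include) == 0 {
		return true
	}
	for _, pat := range include {
		if ok, _ := path.Match(pat, urlPath); ok {
			return true
		}
	}
	return false
}

func main() {
	include := []string{"/docs/*"}
	exclude := []string{"/docs/private"}
	for _, p := range []string{"/docs/intro", "/docs/private", "/blog/post"} {
		fmt.Printf("%-15s allowed=%v\n", p, allowed(p, include, exclude))
	}
}
```

Note that in path.Match the `*` wildcard does not cross `/` separators, so `/docs/*` matches `/docs/intro` but not `/docs/guide/setup`.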
func DefaultConfig ¶
func DefaultConfig() Config
DefaultConfig returns default crawler configuration.
type Crawler ¶
type Crawler struct {
	// contains filtered or unexported fields
}
Crawler crawls websites, discovering pages for accessibility auditing.
func NewCrawler ¶
NewCrawler creates a new crawler.
func (*Crawler) GetSPAFramework ¶
GetSPAFramework returns the detected SPA framework for a page.
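The signature and detection logic of GetSPAFramework are not shown on this page. One plausible approach, sketched below, is to scan a page's HTML for framework-specific marker strings, in the spirit of the SPAIndicators field. The detectFramework helper and the marker list are hypothetical and chosen for illustration; they are not the package's actual indicators.

```go
package main

import (
	"fmt"
	"strings"
)

// indicators pairs a marker substring with a framework name.
// These markers are illustrative assumptions, not the package's
// SPAIndicators defaults.
var indicators = []struct{ marker, framework string }{
	{"__NEXT_DATA__", "Next.js"},
	{"ng-version", "Angular"},
	{"data-reactroot", "React"},
	{"data-v-app", "Vue"},
}

// detectFramework returns the first framework whose marker appears in
// html, or "" if no marker is found.
func detectFramework(html string) string {
	for _, ind := range indicators {
		if strings.Contains(html, ind.marker) {
			return ind.framework
		}
	}
	return ""
}

func main() {
	html := `<app-root ng-version="17.0.0"></app-root>`
	fw := detectFramework(html)
	fmt.Printf("isSPA=%v framework=%q\n", fw != "", fw)
}
```

Substring matching of this kind is cheap but can yield false positives, for example when a marker appears inside page text, so a real detector would likely inspect the rendered DOM instead.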
type Page ¶
type Page struct {
	URL            string        `json:"url"`
	Title          string        `json:"title"`
	Depth          int           `json:"depth"`
	DiscoveredFrom string        `json:"discoveredFrom"`
	IsSPA          bool          `json:"isSPA"`
	SPAFramework   string        `json:"spaFramework,omitempty"`
	LoadTime       time.Duration `json:"loadTime"`
	Links          []string      `json:"links"`
	Error          string        `json:"error,omitempty"`
}
Page represents a discovered page.
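The struct tags determine how a Page looks when serialized with encoding/json. The sketch below copies the Page declaration locally so it runs on its own; the encodePage helper and the sample values are made up for illustration. It shows two details that follow from the tags and from the standard library: fields tagged omitempty disappear when empty, and a time.Duration is encoded as an integer count of nanoseconds.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Page is a local copy of the documented struct so the example is
// self-contained.
type Page struct {
	URL            string        `json:"url"`
	Title          string        `json:"title"`
	Depth          int           `json:"depth"`
	DiscoveredFrom string        `json:"discoveredFrom"`
	IsSPA          bool          `json:"isSPA"`
	SPAFramework   string        `json:"spaFramework,omitempty"`
	LoadTime       time.Duration `json:"loadTime"`
	Links          []string      `json:"links"`
	Error          string        `json:"error,omitempty"`
}

// encodePage is a hypothetical helper that returns the JSON form of p,
// or "" if marshaling fails.
func encodePage(p Page) string {
	b, err := json.Marshal(p)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	p := Page{
		URL:      "https://example.com/",
		Title:    "Home",
		LoadTime: 1500 * time.Millisecond, // encoded as 1500000000 (nanoseconds)
		Links:    []string{"https://example.com/about"},
	}
	// spaFramework and error are omitted because they are empty.
	fmt.Println(encodePage(p))
}
```

Because links has no omitempty option, a Page with a nil Links slice encodes that field as null rather than dropping it.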
type Result ¶
type Result struct {
	StartURL    string        `json:"startUrl"`
	Pages       []Page        `json:"pages"`
	TotalPages  int           `json:"totalPages"`
	Duration    time.Duration `json:"duration"`
	RobotsTxt   string        `json:"robotsTxt,omitempty"`
	SitemapURLs []string      `json:"sitemapUrls,omitempty"`
}
Result contains the crawl results.
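A consumer of a saved crawl might decode a Result from JSON and summarize it. The sketch below copies the Result and Page declarations locally so it runs on its own; the decodeResult and summarize helpers, and the sample JSON, are hypothetical and not part of the package.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Page and Result are local copies of the documented structs.
type Page struct {
	URL            string        `json:"url"`
	Title          string        `json:"title"`
	Depth          int           `json:"depth"`
	DiscoveredFrom string        `json:"discoveredFrom"`
	IsSPA          bool          `json:"isSPA"`
	SPAFramework   string        `json:"spaFramework,omitempty"`
	LoadTime       time.Duration `json:"loadTime"`
	Links          []string      `json:"links"`
	Error          string        `json:"error,omitempty"`
}

type Result struct {
	StartURL    string        `json:"startUrl"`
	Pages       []Page        `json:"pages"`
	TotalPages  int           `json:"totalPages"`
	Duration    time.Duration `json:"duration"`
	RobotsTxt   string        `json:"robotsTxt,omitempty"`
	SitemapURLs []string      `json:"sitemapUrls,omitempty"`
}

// decodeResult parses JSON that was produced from a Result value.
func decodeResult(data string) (Result, error) {
	var r Result
	err := json.Unmarshal([]byte(data), &r)
	return r, err
}

// summarize counts pages flagged as SPAs and pages with a non-empty Error.
func summarize(r Result) (spa, failed int) {
	for _, p := range r.Pages {
		if p.IsSPA {
			spa++
		}
		if p.Error != "" {
			failed++
		}
	}
	return spa, failed
}

func main() {
	data := `{"startUrl":"https://example.com/","totalPages":2,"duration":2000000000,
"pages":[{"url":"https://example.com/","isSPA":true,"spaFramework":"React"},
{"url":"https://example.com/broken","error":"timeout"}]}`
	r, err := decodeResult(data)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	spa, failed := summarize(r)
	fmt.Printf("%d pages in %s: %d SPA, %d failed\n", r.TotalPages, r.Duration, spa, failed)
}
```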