Documentation ¶
Index ¶
Constants ¶
const (
	// PageExtension is the file extension that downloaded pages get.
	PageExtension = ".html"
	// PageDirIndex is the file name of the index file for every dir.
	PageDirIndex = "index" + PageExtension
)
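The following is a minimal sketch of how these two constants might be combined when turning a URL path into a local file name. The helper localPageName is hypothetical and is not part of the package API; the mapping rules it applies are assumptions made for illustration, not the scraper's confirmed behavior. The constants are repeated locally so the example compiles on its own.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// Local copies of the documented constants, repeated so this sketch is self-contained.
const (
	PageExtension = ".html"
	PageDirIndex  = "index" + PageExtension
)

// localPageName is a hypothetical helper (an assumption, not the package's API).
// Directory-like paths get the index file name, paths without an extension
// get PageExtension appended, and all other paths are returned unchanged.
func localPageName(urlPath string) string {
	if urlPath == "" || strings.HasSuffix(urlPath, "/") {
		return path.Join(urlPath, PageDirIndex)
	}
	if path.Ext(urlPath) == "" {
		return urlPath + PageExtension
	}
	return urlPath
}

func main() {
	fmt.Println(localPageName("/docs/"))     // /docs/index.html
	fmt.Println(localPageName("/about"))     // /about.html
	fmt.Println(localPageName("/style.css")) // /style.css
}
```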
Variables ¶
This section is empty.
Functions ¶
Types ¶
type Config ¶ added in v0.1.1
type Config struct {
URL string
Includes []string
Excludes []string
ImageQuality uint // image quality from 0 to 100%, 0 to disable reencoding
MaxDepth uint // download depth, 0 for unlimited
Timeout uint // time limit in seconds to process each http request
OutputDirectory string
Username string
Password string
Cookies []Cookie
Header http.Header
Proxy string
UserAgent string
}
Config contains the scraper configuration.
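As an illustration of the field semantics noted in the comments above, the sketch below fills in a Config and applies two small checks. The Config and Cookie types are local copies of the documented structs so the example runs without the package. The helpers validate and requestTimeout, the example URL, and the output directory are hypothetical and are not part of the documented API.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"time"
)

// Cookie and Config are local copies of the documented types.
type Cookie struct {
	Name    string     `json:"name"`
	Value   string     `json:"value,omitempty"`
	Expires *time.Time `json:"expires,omitempty"`
}

type Config struct {
	URL             string
	Includes        []string
	Excludes        []string
	ImageQuality    uint // image quality from 0 to 100%, 0 to disable reencoding
	MaxDepth        uint // download depth, 0 for unlimited
	Timeout         uint // time limit in seconds to process each http request
	OutputDirectory string
	Username        string
	Password        string
	Cookies         []Cookie
	Header          http.Header
	Proxy           string
	UserAgent       string
}

// validate is a hypothetical check based only on the documented field comments.
func validate(cfg Config) error {
	if cfg.URL == "" {
		return errors.New("URL must be set")
	}
	if cfg.ImageQuality > 100 {
		return fmt.Errorf("image quality %d exceeds 100", cfg.ImageQuality)
	}
	return nil
}

// requestTimeout converts the Timeout field, given in seconds, to a time.Duration.
func requestTimeout(cfg Config) time.Duration {
	return time.Duration(cfg.Timeout) * time.Second
}

func main() {
	header := http.Header{}
	header.Set("Accept-Language", "en")

	cfg := Config{
		URL:             "https://example.org", // placeholder URL
		ImageQuality:    0,                     // 0 disables reencoding
		MaxDepth:        3,                     // 0 would mean unlimited depth
		Timeout:         30,                    // seconds per HTTP request
		OutputDirectory: "out",                 // placeholder directory
		Cookies:         []Cookie{{Name: "session", Value: "abc"}},
		Header:          header,
	}
	if err := validate(cfg); err != nil {
		fmt.Println("invalid config:", err)
		return
	}
	fmt.Println("timeout per request:", requestTimeout(cfg))
}
```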
type Cookie ¶ added in v0.2.0
type Cookie struct {
Name string `json:"name"`
Value string `json:"value,omitempty"`
Expires *time.Time `json:"expires,omitempty"`
}
Cookie represents a cookie. It copies parts of the http.Cookie struct but changes the JSON marshaling to exclude empty fields.
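The effect of the omitempty tags can be seen with a small sketch using only encoding/json. The Cookie type below is a local copy of the documented struct, and marshalCookie is a hypothetical helper added for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Cookie is a local copy of the documented type, including its JSON tags.
type Cookie struct {
	Name    string     `json:"name"`
	Value   string     `json:"value,omitempty"`
	Expires *time.Time `json:"expires,omitempty"`
}

// marshalCookie is a hypothetical helper that returns the JSON form of a cookie.
func marshalCookie(c Cookie) string {
	data, err := json.Marshal(c)
	if err != nil {
		return ""
	}
	return string(data)
}

func main() {
	// An empty Value and a nil Expires are left out of the output.
	fmt.Println(marshalCookie(Cookie{Name: "session"}))
	// prints {"name":"session"}

	// A non-empty Value is included; Expires is still omitted because it is nil.
	fmt.Println(marshalCookie(Cookie{Name: "session", Value: "abc"}))
	// prints {"name":"session","value":"abc"}
}
```

Because Name carries no omitempty option, it is always written, even when empty.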
type Scraper ¶
type Scraper struct {
URL *url.URL // contains the main URL to parse, will be modified in case of a redirect
// contains filtered or unexported fields
}
Scraper contains all scraping data.