Documentation ¶
Overview ¶
Package paginate of the Dataflow kit defines the Paginator interface, which retrieves the next page from the current one.
The next page can be obtained in one of the following ways:
BySelector returns a Paginator that extracts the next page from a document by querying a given CSS selector and extracting the given HTML attribute from the resulting element.
ByQueryParam returns a Paginator that returns the next page from a document by incrementing a given query parameter. Note that this will paginate infinitely, so you probably want to specify a maximum number of pages to scrape using the MaxPages parameter of ScrapeOptions.
Types ¶
type Paginator ¶
type Paginator interface {
	// NextPage controls the progress of the scrape. It is called for each input
	// page, starting with the origin URL, and is expected to return the URL of
	// the next page to process. Note that order matters - calling 'NextPage' on
	// page 1 should return page 2, not page 3. The function should return an
	// empty string when there are no more pages to process.
	NextPage(url string, document *goquery.Selection) (string, error)
}
The Paginator interface should be implemented by any type that can determine the next page from the current one.
func ByQueryParam ¶
ByQueryParam returns a Paginator that returns the next page from a document by incrementing a given query parameter. Note that this will paginate infinitely, so you probably want to specify a maximum number of pages to scrape using the MaxPages parameter of ScrapeOptions.
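The increment step can be sketched with the standard library alone. This is not the package's own code: `nextByQueryParam` is a hypothetical helper, and its handling of a missing parameter (stop) and of a non-integer value (error) are assumptions made for the example.

```go
package main

import (
	"fmt"
	"net/url"
	"strconv"
)

// nextByQueryParam is a hypothetical helper. It returns rawURL with the
// integer query parameter named param incremented by one.
func nextByQueryParam(rawURL, param string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	q := u.Query()
	v := q.Get(param)
	if v == "" {
		// Assumption: without the parameter there is nothing to increment,
		// so report "no more pages" with an empty string.
		return "", nil
	}
	current, err := strconv.Atoi(v)
	if err != nil {
		// Assumption: a non-integer value is treated as an error.
		return "", fmt.Errorf("parameter %q is not an integer: %w", param, err)
	}
	q.Set(param, strconv.Itoa(current+1))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

func main() {
	page := "https://example.com/list?page=1"
	// The increment never stops by itself, so the caller caps the loop.
	// This plays the role that MaxPages plays in ScrapeOptions.
	for i := 0; i < 3; i++ {
		next, err := nextByQueryParam(page, "page")
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(next)
		page = next
	}
}
```

Because the function never inspects the document, it cannot tell when the site runs out of pages, which is why an external page limit is needed.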
func BySelector ¶
BySelector returns a Paginator that extracts the next page from a document by querying a given CSS selector and extracting the given HTML attribute from the resulting element.