Documentation ¶
Index ¶
- Constants
- func DisableImages() func(*jsOptions)
- func Headfull() func(*jsOptions)
- func WithCache(cacheType, cachePath string) func(*Config) error
- func WithConcurrency(concurrency int) func(*Config) error
- func WithExitOnInactivity(duration time.Duration) func(*Config) error
- func WithInitJob(job scrapemate.IJob) func(*Config) error
- func WithJS(opts ...func(*jsOptions)) func(*Config) error
- func WithProvider(provider scrapemate.JobProvider) func(*Config) error
- type Config
- type ScrapemateApp
Constants ¶
const (
	DefaultConcurrency = 1
	DefaultProvider    = "memory"
)
Variables ¶
This section is empty.
Functions ¶
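func DisableImages ¶
func DisableImages() func(*jsOptions)
DisableImages is a helper function that disables image loading in the browser. Use it as a parameter to WithJS.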
func Headfull ¶
func Headfull() func(*jsOptions)
Headfull is a helper function that runs the browser in headful (non-headless) mode, with a visible window. Use it as a parameter to WithJS.
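The signatures above suggest a nested functional-options pattern: WithJS takes browser options and returns a Config option. The following is a minimal, self-contained sketch of that pattern. The types are local stand-ins, and the jsOptions field names and the headless default are assumptions, not the package's actual implementation.

```go
package main

import "fmt"

// jsOptions is a local stand-in for the package's unexported type.
// The field names are assumptions made for this sketch.
type jsOptions struct {
	headless      bool
	disableImages bool
}

// Config is a trimmed stand-in that keeps only the JS-related fields.
type Config struct {
	UseJS  bool
	JSOpts jsOptions
}

// Headfull returns an option that turns headless mode off.
func Headfull() func(*jsOptions) {
	return func(o *jsOptions) { o.headless = false }
}

// DisableImages returns an option that turns image loading off.
func DisableImages() func(*jsOptions) {
	return func(o *jsOptions) { o.disableImages = true }
}

// WithJS enables JavaScript rendering and applies each browser option in order.
func WithJS(opts ...func(*jsOptions)) func(*Config) error {
	return func(c *Config) error {
		c.UseJS = true
		c.JSOpts = jsOptions{headless: true} // assumed default: headless
		for _, opt := range opts {
			opt(&c.JSOpts)
		}
		return nil
	}
}

func main() {
	var cfg Config
	if err := WithJS(Headfull(), DisableImages())(&cfg); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", cfg)
}
```

Keeping the browser options in their own function type means they can only be passed to WithJS, so a caller cannot apply Headfull to a Config that has JavaScript rendering disabled.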
func WithConcurrency ¶
func WithConcurrency(concurrency int) func(*Config) error
WithConcurrency sets the concurrency of the app.
func WithExitOnInactivity ¶
func WithExitOnInactivity(duration time.Duration) func(*Config) error
WithExitOnInactivity sets the duration after which the app will exit if there are no more jobs to run.
func WithInitJob ¶
func WithInitJob(job scrapemate.IJob) func(*Config) error
WithInitJob sets the initial job of the app.
func WithProvider ¶
func WithProvider(provider scrapemate.JobProvider) func(*Config) error
WithProvider sets the provider of the app.
Types ¶
type Config ¶
type Config struct {
// Concurrency is the number of concurrent scrapers to run.
// If not set, it defaults to 1.
Concurrency int `validate:"required,gte=1"`
// CacheType is the type of cache to use for storing scraped data.
// If left empty, no caching is used.
// Otherwise it must be one of file or leveldb.
CacheType string `validate:"omitempty,oneof=file leveldb"`
// CachePath is the path to the cache file or directory.
// It is required to be a valid path if CacheType is set.
CachePath string `validate:"required_with=CacheType"`
// UseJS is whether to use JavaScript to render the page.
UseJS bool `validate:"omitempty"`
// JSOpts are the options for the JavaScript renderer.
JSOpts jsOptions
// Provider is the job provider to use.
// If not set, the memory provider is used.
Provider scrapemate.JobProvider
// Writers are the writers to use for writing the results.
// At least one writer must be provided.
Writers []scrapemate.ResultWriter `validate:"required,gt=0"`
// InitJob is the job to initialize the app with.
InitJob scrapemate.IJob
// ExitOnInactivityDuration is how long the app waits with no jobs to run before it exits.
ExitOnInactivityDuration time.Duration
}
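The validate tags on Config encode a few constraints: Concurrency must be at least 1, CacheType must be empty or one of file or leveldb, CachePath is required when CacheType is set, and at least one writer must be present. The sketch below spells those rules out by hand on a trimmed stand-in struct. The validate function and the ResultWriter stand-in are hypothetical and only illustrate what the tags express; the package itself may enforce them differently.

```go
package main

import (
	"errors"
	"fmt"
)

// ResultWriter is a local stand-in for scrapemate.ResultWriter.
type ResultWriter interface{}

// Config is a trimmed stand-in that keeps only the validated fields.
type Config struct {
	Concurrency int
	CacheType   string
	CachePath   string
	Writers     []ResultWriter
}

// validate is a hypothetical helper that mirrors the struct tags by hand.
func validate(c *Config) error {
	// required,gte=1
	if c.Concurrency < 1 {
		return errors.New("concurrency must be >= 1")
	}
	// omitempty,oneof=file leveldb
	switch c.CacheType {
	case "", "file", "leveldb":
	default:
		return fmt.Errorf("cache type %q must be file or leveldb", c.CacheType)
	}
	// required_with=CacheType
	if c.CacheType != "" && c.CachePath == "" {
		return errors.New("cache path is required when cache type is set")
	}
	// required,gt=0
	if len(c.Writers) == 0 {
		return errors.New("at least one writer is required")
	}
	return nil
}

func main() {
	cfg := &Config{
		Concurrency: 2,
		CacheType:   "leveldb", // CachePath left empty on purpose
		Writers:     []ResultWriter{"stdout"},
	}
	if err := validate(cfg); err != nil {
		fmt.Println("invalid config:", err)
	}
}
```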
func NewConfig ¶
func NewConfig(writers []scrapemate.ResultWriter, options ...func(*Config) error) (*Config, error)
NewConfig creates a new config with default values.
type ScrapemateApp ¶
type ScrapemateApp struct {
// contains filtered or unexported fields
}
func NewScrapeMateApp ¶
func NewScrapeMateApp(cfg *Config) (*ScrapemateApp, error)
NewScrapeMateApp creates a new ScrapemateApp.
func (*ScrapemateApp) Start ¶
func (app *ScrapemateApp) Start(ctx context.Context, seedJobs ...scrapemate.IJob) error
Start starts the app.