Documentation
¶
Overview ¶
Package distribute provides distributed task scheduling and master-slave node communication.
Index ¶
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func MasterAPI ¶ added in v1.4.0
func MasterAPI(n Distributor) teleport.API
MasterAPI creates the master node API.
func SlaveAPI ¶ added in v1.4.0
func SlaveAPI(n Distributor) teleport.API
SlaveAPI creates the slave node API.
Types ¶
type Distributor ¶ added in v1.4.0
type Distributor interface {
// Send sends a task from the master to the jar.
Send(clientNum int) Task
// Receive receives a task into the jar on a slave node.
Receive(task *Task)
// CountNodes returns the number of connected nodes.
CountNodes() int
}
Distributor is the distributed interface.
type Task ¶
type Task struct {
ID int
Spiders []map[string]string // Spider rule name and keyin, format: map[string]string{"name":"baidu","keyin":"henry"}
ThreadNum int // Global max concurrency
Pausetime int64 // Pause duration in ms (random: Pausetime/2 ~ Pausetime*2)
OutType string // Output method
BatchCap int // Batch output capacity per flush
BatchQueueCap int // Batch output pool capacity, >= 2
SuccessInherit bool // Inherit historical success records
FailureInherit bool // Inherit historical failure records
Limit int64 // Collection limit, 0=unlimited; if rule sets LIMIT then custom limit
ProxyMinute int64 // Proxy IP rotation interval in minutes
Keyins string // Custom input, later split into Keyin config for multiple tasks
}
Task is used for distributed task dispatch.
type TaskJar ¶
type TaskJar struct {
Tasks chan *Task
}
TaskJar is the task storage.
func (*TaskJar) CountNodes ¶ added in v1.4.0
CountNodes returns 0; TaskJar does not track connected nodes.
Click to show internal directories.
Click to hide internal directories.