scheduler

package
v1.4.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 3, 2026 License: Apache-2.0 Imports: 11 Imported by: 0

Documentation

Overview

Package scheduler provides crawl task scheduling and resource allocation.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func Init

func Init(threadNum int, proxyMinute int64)

Init initializes the scheduler with the given concurrency and proxy settings.

func PauseRecover

func PauseRecover()

PauseRecover toggles pause/resume for all crawl tasks.

func ReloadProxyLib

func ReloadProxyLib()

ReloadProxyLib reloads the proxy IP list from the config file.

func Stop

func Stop()

Stop terminates all crawl tasks.

Types

type Matrix

type Matrix struct {
	sync.Mutex
	// contains filtered or unexported fields
}

Matrix is the request queue for a single Spider instance.

func AddMatrix

func AddMatrix(spiderName, spiderSubName string, maxPage int64) *Matrix

AddMatrix registers a resource queue for the given spider and returns its Matrix.

func (*Matrix) CanStop

func (m *Matrix) CanStop() bool

CanStop reports whether this Matrix can stop (no pending work).

func (*Matrix) DoHistory

func (m *Matrix) DoHistory(req *request.Request, ok bool) bool

DoHistory records success/failure and returns true if the request was requeued as a new failure.

func (*Matrix) Free

func (m *Matrix) Free()

Free releases a resource slot.

func (*Matrix) Len

func (m *Matrix) Len() int

Len returns the number of queued requests.

func (*Matrix) Pull

func (m *Matrix) Pull() (req *request.Request)

Pull removes and returns a request from the queue, or nil if empty. Concurrency-safe.

func (*Matrix) Push

func (m *Matrix) Push(req *request.Request)

Push adds a request to the queue. Concurrency-safe.

func (*Matrix) TryFlushFailure

func (m *Matrix) TryFlushFailure()

TryFlushFailure flushes failure history in non-server mode.

func (*Matrix) TryFlushSuccess

func (m *Matrix) TryFlushSuccess()

TryFlushSuccess flushes success history in non-server mode.

func (*Matrix) Use

func (m *Matrix) Use()

Use acquires a resource slot for this Matrix.

func (*Matrix) Wait

func (m *Matrix) Wait()

Wait blocks until all in-flight requests complete.

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL