crawler

package
v4.1.0 Latest
Warning

This package is not in the latest version of its module.

Published: Jun 3, 2024 License: AGPL-3.0 Imports: 20 Imported by: 0

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type Crawler

type Crawler struct {
	DryRun bool

	Index string
	// contains filtered or unexported fields
}

Crawler is a helper type representing a crawler.

func NewCrawler

func NewCrawler(dryRun bool) *Crawler

NewCrawler initializes a new Crawler object and connects to Elasticsearch (if dryRun == false).
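
The docs only state that the constructor skips the Elasticsearch connection when dryRun is true. The sketch below is a hypothetical, stdlib-only mock of that pattern (the real Crawler has unexported fields and a real backend connection; the "publiccode" index name here is invented for illustration):

```go
package main

import "fmt"

// Crawler is a hypothetical stand-in for the real type, which
// contains unexported fields and an Elasticsearch client.
type Crawler struct {
	DryRun bool
	Index  string
}

// NewCrawler mirrors the documented behavior: when dryRun is true,
// no backend connection is made and writes will be skipped.
func NewCrawler(dryRun bool) *Crawler {
	c := &Crawler{DryRun: dryRun}
	if !c.DryRun {
		// The real implementation connects to Elasticsearch here.
		c.Index = "publiccode" // assumed index name, for illustration only
	}
	return c
}

func main() {
	c := NewCrawler(true)
	fmt.Println(c.DryRun, c.Index == "") // dry run: no index configured
}
```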

func (*Crawler) CrawlPublishers

func (c *Crawler) CrawlPublishers(publishers []common.Publisher) error

CrawlPublishers processes a list of publishers.
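
The signature returns a single error for the whole list. How the real method handles per-publisher failures is not stated in the docs; the hypothetical sketch below shows one plausible shape, walking the list and folding failures together with errors.Join (the Publisher type and the empty-name check are invented for illustration):

```go
package main

import (
	"errors"
	"fmt"
)

// Publisher is a hypothetical stand-in for common.Publisher.
type Publisher struct{ Name string }

// crawlPublishers sketches the documented shape: process each
// publisher in the list and surface any failures as one error.
// Whether the real method aggregates errors or stops at the first
// one is not documented; errors.Join is just one plausible choice.
func crawlPublishers(publishers []Publisher) error {
	var errs []error
	for _, p := range publishers {
		if p.Name == "" {
			errs = append(errs, fmt.Errorf("publisher with empty name"))
			continue
		}
		// The real method would scan this publisher's repositories here.
	}
	return errors.Join(errs...)
}

func main() {
	err := crawlPublishers([]Publisher{{Name: "example-org"}, {Name: ""}})
	fmt.Println(err != nil) // one bad entry: an error is reported
}
```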

func (*Crawler) CrawlRepo

func (c *Crawler) CrawlRepo(repoURL url.URL, publisher common.Publisher) error

CrawlRepo crawls a single repository (only used by the 'one' command).

func (*Crawler) ProcessRepo

func (c *Crawler) ProcessRepo(repository common.Repository)

ProcessRepo looks for a publiccode.yml file in a repository and, if found, processes it.

func (*Crawler) ProcessRepositories

func (c *Crawler) ProcessRepositories(repos chan common.Repository)

ProcessRepositories processes the repositories channel, checks each repository's publiccode.yml and sends new data to the API if the publiccode.yml file is valid.

func (*Crawler) ScanPublisher

func (c *Crawler) ScanPublisher(publisher common.Publisher)

ScanPublisher scans all the publisher's repositories and sends the ones with a valid publiccode.yml to the repositories channel.
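
Together, ScanPublisher and ProcessRepositories form a producer/consumer pipeline over the repositories channel. The stdlib-only sketch below mocks that pattern under stated assumptions: the Publisher and Repository types, the HasYAML flag, and the "publish by recording the URL" step are all hypothetical stand-ins for the real common.Publisher, common.Repository, and API call:

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// Hypothetical stand-ins for common.Publisher and common.Repository.
type Publisher struct {
	Name  string
	Repos []Repository
}

type Repository struct {
	URL     string
	HasYAML bool // stands in for "carries a valid publiccode.yml"
}

// scanPublisher mimics ScanPublisher: it sends the publisher's
// repositories with a valid publiccode.yml to the channel.
func scanPublisher(p Publisher, repos chan<- Repository) {
	for _, r := range p.Repos {
		if r.HasYAML {
			repos <- r
		}
	}
}

// processRepositories mimics ProcessRepositories: it drains the
// channel and "publishes" each repository (here, records its URL
// instead of calling the real API).
func processRepositories(repos <-chan Repository, published *[]string) {
	for r := range repos {
		*published = append(*published, r.URL)
	}
}

func main() {
	pub := Publisher{
		Name: "example-org",
		Repos: []Repository{
			{URL: "https://example.org/a", HasYAML: true},
			{URL: "https://example.org/b", HasYAML: false},
		},
	}

	repos := make(chan Repository)
	var published []string
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		processRepositories(repos, &published)
	}()

	scanPublisher(pub, repos)
	close(repos) // no more producers: let the consumer finish
	wg.Wait()

	fmt.Println(strings.Join(published, ","))
}
```

Closing the channel after all scanners finish is what lets the consumer's range loop terminate cleanly.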
