crawler

command module
v1.0.0
Published: Aug 16, 2025 License: MIT Imports: 5 Imported by: 0

Web crawler

This is a web crawler I wrote to gather links to articles, research papers, and books on topics I find interesting. Although I built it for personal use, it gradually evolved into a reasonably portable piece of software. If you find a web crawler useful, the installation instructions are below.
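At its core, a crawler like this fetches a page and collects the hyperlinks on it. As a rough sketch of that one step (not the actual implementation, which lives in the repository), link extraction with only the standard library might look like this; the regexp-based matching and the function name are illustrative assumptions:

```go
package main

import (
	"fmt"
	"net/url"
	"regexp"
)

// hrefPattern matches href="..." attributes. A production crawler would
// use a real HTML parser; a regexp keeps this sketch dependency-free.
var hrefPattern = regexp.MustCompile(`href="([^"]+)"`)

// extractLinks pulls every href out of an HTML body and resolves it
// against the page's own URL, so relative links become absolute.
func extractLinks(pageURL, body string) ([]string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return nil, err
	}
	var links []string
	for _, m := range hrefPattern.FindAllStringSubmatch(body, -1) {
		ref, err := url.Parse(m[1])
		if err != nil {
			continue // skip malformed URLs rather than abort the page
		}
		links = append(links, base.ResolveReference(ref).String())
	}
	return links, nil
}

func main() {
	html := `<a href="/papers/1">Paper</a> <a href="https://example.org/book">Book</a>`
	links, err := extractLinks("https://example.com/index.html", html)
	if err != nil {
		panic(err)
	}
	for _, l := range links {
		fmt.Println(l) // prints each discovered absolute URL
	}
}
```

Resolving against the page URL is what turns a relative link like /papers/1 into https://example.com/papers/1 so it can be queued for a later fetch.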

Requirements
  • A Go toolchain, for go install.
  • A MongoDB Atlas cluster and its connection URI (the crawler creates Atlas Search indexes, which require Atlas).

Installation

Run go install github.com/junwei890/crawler@latest in your terminal.

Usage

You need a crawler.txt file in your home directory; it should contain your MongoDB URI and the links you would like to scrape. See crawler.txt.example for how it should be formatted.
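The authoritative layout is whatever crawler.txt.example in the repository shows. Purely as an illustration of the kind of content involved (the line layout, URI, and URLs below are placeholders, not the confirmed format):

```
mongodb+srv://user:password@cluster0.example.mongodb.net/
https://example.com/blog
https://example.org/papers
```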

Run crawler in your terminal to start scraping.

Tips
  • You don't have to create a MongoDB database and collection before scraping; the crawler takes care of that for you.
  • You don't have to index the collection for Atlas Search after scraping is done; the crawler takes care of that too.
  • Logs will print in your terminal when you run the program.

