crawler

command module

Version: v1.0.0 (latest) | Published: Aug 16, 2025 | License: MIT | Imports: 5 | Imported by: 0

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version

Repository

github.com/junwei890/crawler

README

Web crawler

This is a web crawler I wrote to gather links to articles, research papers and books on topics I find interesting. Although I wrote it for personal use, it gradually became portable enough for other people to run. If you have a use for a web crawler, follow the installation instructions below.

Requirements

Go installed.
A MongoDB Atlas cluster.

Installation

Run go install github.com/junwei890/crawler@latest in your terminal.

Usage

You need a crawler.txt file in your home directory. It should contain your MongoDB URI and the links you would like to scrape. See crawler.txt.example for how it should be formatted.

Run crawler in your terminal to start scraping.

Tips

You don't have to create a MongoDB database and collection before scraping websites; the crawler takes care of that for you.
You don't have to index the collection for Atlas Search after scraping is done; the crawler takes care of that for you too.
Logs will print in your terminal when you run the program.

Documentation

There is no documentation for this package.

Source Files

main.go

Directories

Path	Synopsis
src
utils
