A CLI to scrape some really useful UTD data, parse it, and upload it to the Nebula API for community use.
Project maintained by Nebula Labs.
## Design
- The `grade-data` directory contains `.csv` files of UTD grade data.
  - Files are named by year and semester, with a suffix of S, U, or F denoting the Spring, Summer, and Fall semesters, respectively.
  - For example, `22F.csv` corresponds to the 2022 Fall semester, whereas `18U.csv` corresponds to the 2018 Summer semester.
  - This grade data is collected independently of the scrapers and is used during the parsing process.
- The `scrapers` directory contains the scrapers for the various UTD data sources. This is where the data pipeline begins.
  - The scrapers are concerned solely with data collection, not with validating or processing that data; those responsibilities are left to the parsing stage.
- The `parser` directory contains the files and methods that parse the scraped data. This is the 'middle man' of the data pipeline.
  - The parsing stage is responsible for 'making sense' of the scraped data; this consists of reading, validating, and merging the various data sources.
  - The input data is considered immutable by the parsing stage, meaning the parsers should never modify the data fed into them.
- The `uploader` directory contains the uploader that sends the parsed data to the Nebula API MongoDB database. This is the final stage of the data pipeline.
  - The uploader is concerned solely with pushing parsed data to the database; at this point, the data is assumed to be valid and ready for use.
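Taken together, the three stages form a single pipeline. A minimal end-to-end run might look like the sketch below; the exact flags are covered in the Usage section, and the bare `-parse` and `-upload` invocations here are assumptions rather than documented commands.

```bash
# Hypothetical end-to-end run of the pipeline described above
./api-tools -scrape -coursebook -term 24F   # 1. collect raw data (written to ./data by default)
./api-tools -parse                          # 2. read, validate, and merge the scraped data
./api-tools -upload                         # 3. push the parsed data to the Nebula API database
```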
## Contributing
Please visit our Discord and talk to us if you'd like to contribute!
## Prerequisites
## Build
To build the project, simply clone the repository and then either:
- Run `make` in the root (top-level) directory (for systems with `make` installed, i.e. most Linux distros and macOS)
- Run `build.bat` on Windows systems (unless you want to deal with getting `make` to work on Windows :P)
The build process will output an executable file named `api-tools`; this executable is the CLI and can be run in your terminal!
Additionally, you can run `build` (on Windows) or `make` (on macOS/Linux) with the following arguments:
- `setup`: Installs the required dependencies for the tools.
- `check`: Verifies prerequisites and ensures the executable can be built.
- `test`: Runs a test to check that the executable works after building.
- `build`: Builds the executable and makes it ready for use.
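On a Linux or macOS system, a full build might look like the following sketch (the clone URL is an assumption; substitute the actual repository location):

```bash
# Clone the repository (URL assumed) and build the CLI
git clone https://github.com/UTDNebula/api-tools.git
cd api-tools
make setup   # install required dependencies
make check   # verify prerequisites
make build   # produce the api-tools executable
./api-tools  # confirm the CLI runs
```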
## Usage
The `api-tools` command line interface supports three main modes: scraping, parsing, and uploading data to the Nebula API.
### Environment Variables
Before using the tool, configure the `.env` file by following these steps:
- Find the `.env.template` file and rename it to `.env`.
- Specify the required credentials for your use case as a string (`""`) following the variable name. Example: `LOGIN_NETID="ABC123456"`
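A minimal `.env` might look like the sketch below. `LOGIN_NETID` is the only variable named in this guide; any other entries depend on your use case (see `.env.template` for the variable names).

```bash
# .env — one credential per line, values quoted as strings
LOGIN_NETID="ABC123456"
# Other required credentials follow the same VARIABLE_NAME="value" pattern
```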
### Basic Usage
Run the tool by changing into the `api-tools` directory with `cd` and running the executable with the appropriate flags. To see all available options, run `./api-tools` with no flags. To enable logging for debugging, use the verbose flag: `./api-tools -verbose`. The available flags for each mode are listed in the tables below.
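For example, a first session might look like this (the flags combined on the last line come from the Scraping Mode table below):

```bash
cd api-tools                                          # directory containing the built executable
./api-tools                                           # with no flags, lists all available options
./api-tools -verbose -scrape -coursebook -term 24F    # a scrape with debug logging enabled
```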
### Scraping Mode

| Command | Description |
| --- | --- |
| `./api-tools -scrape -astra` | Scrapes Astra data. |
| `./api-tools -scrape -calendar` | Scrapes calendar data. |
| `./api-tools -scrape -coursebook -term 24F` | Scrapes coursebook data for Fall 2024. Use `-resume` to continue from the last prefix, or `-startprefix [prefix]` to begin at a specific course prefix. |
| `./api-tools -scrape -map` | Scrapes UTD Map data. |
| `./api-tools -scrape -mazevo` | Scrapes Mazevo data. |
| `./api-tools -scrape -organizations` | Scrapes SOC organizations. |
| `./api-tools -scrape -profiles` | Scrapes UTD professor profiles. |
| `./api-tools -scrape -headless` | Runs ChromeDP in headless mode. |
| `./api-tools -o [directory]` | Sets the output directory (default: `./data`). |
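As an example, a coursebook scrape for Fall 2024 that resumes from a previous run, uses headless ChromeDP, and writes to a custom output directory might be invoked as follows (this particular flag combination is an assumption based on the table above):

```bash
./api-tools -scrape -coursebook -term 24F -resume -headless -o ./data/fall24
```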
### Parsing Mode

| Command | Description |
| --- | --- |
| `./api-tools -parse -astra` | Parses Astra data. |
| `./api-tools -parse -calendar` | Parses calendar data. |
| `./api-tools -parse -csv [directory]` | Outputs grade data CSVs (default: `./grade-data`). |
| `./api-tools -parse -map` | Parses UTD Map data. |
| `./api-tools -parse -mazevo` | Parses Mazevo data. |
| `./api-tools -parse -skipv` | Skips post-parse validation (use with caution). |
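For instance, a parse that also outputs the grade-data CSVs from the default directory, with debug logging, might look like this (flag combination assumed from the table above):

```bash
./api-tools -parse -csv ./grade-data -verbose
```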
### Upload Mode

| Command | Description |
| --- | --- |
| `./api-tools -upload -events` | Uploads Astra and Mazevo data. |
| `./api-tools -upload -map` | Uploads UTD Map data. |
| `./api-tools -upload -replace` | Replaces old data instead of merging. |
| `./api-tools -upload -static` | Uploads only static aggregations. |
Additionally, you can use the `-i [directory]` flag to specify where to read data from (default: `./data`) and the `-l [directory]` flag to specify where logs should be written (default: `./logs`).
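For example, an upload that replaces old data instead of merging, reads parsed data from a custom directory, and writes logs to the default location might look like this (flag combination assumed from the table above):

```bash
./api-tools -upload -replace -i ./parsed-data -l ./logs
```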
## Docker
Docker is used for automated running on Google Cloud Platform. More info here.
To build the container for local testing, first make sure all scripts in the `runners` folder have LF line endings, then run:
```bash
docker build --target local -t my-runner:local .
docker run --rm -e ENVIRONMENT=local -e RUNNER_SCRIPT_NAME=daily.sh my-runner:local
```
## Questions?
Reach out to the team on Discord with any questions you may have!