hapiq

command module
v0.0.0-...-622207a Latest
Published: Sep 23, 2025 License: GPL-3.0 Imports: 3 Imported by: 0

README

Hapiq

Hapiq is a CLI tool for extracting and inspecting dataset links from scientific papers.

It verifies and analyzes each data source it finds and estimates the likelihood that the link points to a valid dataset.

Hapiq can also be used to directly download datasets into local folders.

"Hapiq" means "the one who fetches" in Quechua.


Features

  • ✅ Validate URLs and identifiers (e.g. Zenodo, Figshare, Dryad)
  • 🔍 Support for DOI resolution and repository classification
  • 📊 Estimate likelihood of dataset validity
  • 🌐 HTTP status and metadata inspection
  • 📝 JSON or human-readable output formats
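
To make the "HTTP status and metadata inspection" feature concrete, here is a minimal Go sketch of what such a check could look like. It is not Hapiq's implementation: the `Inspection` type, the `inspect` function, and the stub transport are all hypothetical names introduced for illustration, and the stub stands in for a real server so the example runs without network access.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// Inspection holds what a single HEAD request reveals about a target.
// Hypothetical type; it only loosely follows the fields in the README output.
type Inspection struct {
	Status        int
	ContentType   string
	ContentLength int64
	Elapsed       time.Duration
}

// inspect sends a HEAD request and records status, headers, and timing.
func inspect(client *http.Client, target string) (Inspection, error) {
	start := time.Now()
	resp, err := client.Head(target)
	if err != nil {
		return Inspection{}, err
	}
	defer resp.Body.Close()
	return Inspection{
		Status:        resp.StatusCode,
		ContentType:   resp.Header.Get("Content-Type"),
		ContentLength: resp.ContentLength,
		Elapsed:       time.Since(start),
	}, nil
}

// stubTransport answers every request with a canned response, so the
// sketch never touches the network. Values mirror the README sample.
type stubTransport struct{}

func (stubTransport) RoundTrip(req *http.Request) (*http.Response, error) {
	return &http.Response{
		StatusCode:    http.StatusOK,
		Header:        http.Header{"Content-Type": {"text/html"}},
		ContentLength: 15234,
		Body:          http.NoBody,
		Request:       req,
	}, nil
}

// inspectStub runs inspect against the stub transport.
func inspectStub() (Inspection, error) {
	client := &http.Client{Transport: stubTransport{}, Timeout: 5 * time.Second}
	return inspect(client, "http://example.invalid/dataset")
}

func main() {
	got, err := inspectStub()
	if err != nil {
		fmt.Println("inspect failed:", err)
		return
	}
	fmt.Printf("status=%d type=%s size=%d bytes\n", got.Status, got.ContentType, got.ContentLength)
}
```

Against a real repository, the same `inspect` function would be called with a plain `&http.Client{Timeout: ...}`. A HEAD request keeps the check cheap because no response body is transferred.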

Installation

From Source

    git clone https://github.com/btraven00/hapiq.git
    cd hapiq
    make install

Using Go Install

    go install github.com/btraven00/hapiq@latest

Download Binary

Download pre-built binaries from the releases page.

Usage

Basic Usage

    hapiq check <url-or-identifier>

Examples

Check a Zenodo record:

    hapiq check https://zenodo.org/record/1234567

Check using DOI:

    hapiq check 10.5281/zenodo.1234567

Check with quiet output (suppress verbose messages):

    hapiq check https://figshare.com/articles/dataset/example/123456 --quiet

Output as JSON:

    hapiq check "10.5061/dryad.example" --output json

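
The examples above accept either a URL or a bare DOI. As a rough illustration of how a bare DOI can be told apart from a URL and turned into a resolvable link, here is a small Go sketch. The `doiToURL` function and the regular expression are assumptions made for this example, not Hapiq's actual parsing logic, and the pattern covers only the common modern DOI shape.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// doiPattern matches the common modern DOI shape: "10.", a 4-9 digit
// registrant code, a slash, and a non-empty suffix without whitespace.
// Assumption: this is a simplification, not a full DOI grammar.
var doiPattern = regexp.MustCompile(`^10\.\d{4,9}/\S+$`)

// doiToURL returns the doi.org resolver URL for a bare DOI.
// The second result is false when the input does not look like a DOI.
func doiToURL(identifier string) (string, bool) {
	id := strings.TrimSpace(identifier)
	if !doiPattern.MatchString(id) {
		return "", false
	}
	return "https://doi.org/" + id, true
}

func main() {
	for _, id := range []string{
		"10.5281/zenodo.1234567",
		"https://zenodo.org/record/1234567",
	} {
		if u, ok := doiToURL(id); ok {
			fmt.Println(id, "->", u)
		} else {
			fmt.Println(id, "is not a bare DOI")
		}
	}
}
```

Inputs that fail the pattern, such as full URLs, would be handled by a separate URL path in a real tool.
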
Supported Repositories
  • Zenodo - zenodo.org
  • Figshare - figshare.com
  • Dryad - datadryad.org
  • OSF - osf.io
  • GitHub - github.com (releases)
  • Dataverse - Various Dataverse instances
  • DOI Resolution - doi.org
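
One simple way to classify a target against the list above is to look at the URL's hostname. The Go sketch below shows the idea. The `classifyHost` function and the short labels in `repoHosts` are hypothetical and are not taken from Hapiq's source; Dataverse is left out because its instances live on many different hosts.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// repoHosts maps hostnames from the list above to short labels.
// The labels are illustrative, not Hapiq's internal identifiers.
var repoHosts = map[string]string{
	"zenodo.org":    "zenodo",
	"figshare.com":  "figshare",
	"datadryad.org": "dryad",
	"osf.io":        "osf",
	"github.com":    "github",
	"doi.org":       "doi",
}

// classifyHost returns the label for rawURL's host, or "unknown" when
// the input has no host or the host is not in repoHosts.
func classifyHost(rawURL string) string {
	u, err := url.Parse(rawURL)
	if err != nil || u.Hostname() == "" {
		return "unknown"
	}
	host := strings.TrimPrefix(strings.ToLower(u.Hostname()), "www.")
	if label, ok := repoHosts[host]; ok {
		return label
	}
	return "unknown"
}

func main() {
	for _, target := range []string{
		"https://zenodo.org/record/1234567",
		"https://figshare.com/articles/dataset/example/123456",
		"https://example.org/data",
	} {
		fmt.Printf("%s -> %s\n", target, classifyHost(target))
	}
}
```

A hostname lookup is only a first pass: a bare DOI has no host at all, so it would need to be resolved before it can be classified this way.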

Output Format

Human-readable Output

    Target: https://zenodo.org/record/1234567
    ✅ Status: Valid (HTTP 200)
    📂 Dataset Type: zenodo_record
    🔗 Content Type: text/html
    📏 Size: 15234 bytes
    ⏱️  Response Time: 245ms
    🧠 Dataset Likelihood: 0.95

JSON Output

    {
      "target": "https://zenodo.org/record/1234567",
      "valid": true,
      "http_status": 200,
      "content_type": "text/html",
      "content_length": 15234,
      "response_time": "245ms",
      "dataset_type": "zenodo_record",
      "likelihood_score": 0.95,
      "metadata": {
        "server": "nginx/1.18.0",
        "last-modified": "Wed, 15 Mar 2023 10:30:00 GMT"
      }
    }
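
For scripts that consume the JSON output, the sample above can be decoded into a Go struct. The sketch below is one way to do that. The JSON keys come from the sample, while the `CheckResult` type, its field names, and `parseResult` are hypothetical; it also assumes `metadata` always maps strings to strings, as in the sample.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// CheckResult mirrors the keys of the JSON sample above.
// The Go type and field names are hypothetical.
type CheckResult struct {
	Target          string            `json:"target"`
	Valid           bool              `json:"valid"`
	HTTPStatus      int               `json:"http_status"`
	ContentType     string            `json:"content_type"`
	ContentLength   int64             `json:"content_length"`
	ResponseTime    string            `json:"response_time"`
	DatasetType     string            `json:"dataset_type"`
	LikelihoodScore float64           `json:"likelihood_score"`
	Metadata        map[string]string `json:"metadata"`
}

// parseResult decodes one JSON document in the shape shown above.
func parseResult(data []byte) (CheckResult, error) {
	var r CheckResult
	err := json.Unmarshal(data, &r)
	return r, err
}

// sample is the JSON output from the README, copied verbatim.
const sample = `{
  "target": "https://zenodo.org/record/1234567",
  "valid": true,
  "http_status": 200,
  "content_type": "text/html",
  "content_length": 15234,
  "response_time": "245ms",
  "dataset_type": "zenodo_record",
  "likelihood_score": 0.95,
  "metadata": {
    "server": "nginx/1.18.0",
    "last-modified": "Wed, 15 Mar 2023 10:30:00 GMT"
  }
}`

func main() {
	r, err := parseResult([]byte(sample))
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%s valid=%t score=%.2f\n", r.Target, r.Valid, r.LikelihoodScore)
	// "245ms" happens to be a valid Go duration string.
	if d, err := time.ParseDuration(r.ResponseTime); err == nil {
		fmt.Println("response time:", d)
	}
}
```

Keeping `response_time` as a string and parsing it separately avoids a decode failure if the tool ever emits a value that is not a Go duration.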

Development

Prerequisites
  • Go 1.21 or later
  • Make (optional, for convenience)

Contributing

We welcome contributions! Please see our Contributing Guidelines for details.

License

GPL-3-or-later © 2025 btraven

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Documentation

There is no documentation for this package.

Directories

Path: Synopsis

Package cmd provides command-line interface commands for the hapiq tool.
internal
  checker: Package checker provides functionality for validating and checking dataset URLs and identifiers.
pkg
  downloaders: Package downloaders provides a pluggable interface for downloading datasets from various scientific data repositories with comprehensive metadata tracking and provenance information.
  downloaders/common: Package common provides shared utilities for downloader implementations including filesystem operations, progress tracking, and user interaction.
  downloaders/figshare: Package figshare provides download functionality for different Figshare dataset types including articles, collections, and projects with comprehensive file handling.
  downloaders/geo: Package geo provides download functionality for different GEO dataset types using NCBI E-utilities for metadata discovery and FTP for file downloads.
  downloaders/zenodo: Package zenodo provides download functionality for Zenodo datasets with progress tracking, file management, and error handling.
