π° Tap
Tap into any website from your terminal.
A Go library and CLI toolkit that runs JavaScript scripts against real websites β fast via QuickJS, with full browser fallback when needed. Also extracts clean content from any URL via go-defuddle.
Install
CLI
go install github.com/vaayne/tap/cmd/tap@latest
Library
go get github.com/vaayne/tap
Requires Go 1.22+ and Google Chrome (or Chromium) for browser fallback.
CLI Usage
Site Scripts
# List all available scripts
tap site list
# Run a script (QuickJS first, browser fallback)
tap site v2ex/hot
tap site twitter/search query=claude
tap site bilibili/search keyword=ηΌη¨ order=click
# Pipe to jq
tap site hackernews/top | jq '.stories[:3]'
Fetch Content
# Extract clean markdown from any URL
tap fetch https://example.com/article
# Output as JSON with full metadata
tap fetch --json https://example.com/article
Library Usage
package main
import (
"context"
"fmt"
"log"
"github.com/vaayne/tap"
"github.com/vaayne/tap/fetch"
)
func main() {
client, err := tap.New(
tap.WithSitesDir("./sites"),
)
if err != nil {
log.Fatal(err)
}
defer client.Close()
// Run a site script
result, err := client.RunScript(context.Background(), "v2ex/hot", nil)
if err != nil {
log.Fatal(err)
}
fmt.Println(result)
// Fetch clean content from a URL
content, err := client.Fetch(context.Background(), "https://example.com", &fetch.Options{
Markdown: true,
})
if err != nil {
log.Fatal(err)
}
fmt.Println(content.Markdown)
}
How It Works
Both tap site and tap fetch share a common transport layer with two-tier network access:
βββββββββββββββββββββββββββββββββββ
β Shared Transport Layer β
β Level 1: HTTP β Level 2: CDP β
ββββββββββ¬βββββββββ΄ββββββββ¬βββββββββ
β β
ββββββββββββββββ΄βββ ββββββββββ΄βββββββββββ
β tap site β β tap fetch β
β QuickJS β CDP β β HTTP β CDP β
β β structured β β β defuddle β
β JSON β β β markdown/HTML β
βββββββββββββββββββ βββββββββββββββββββββ
Transport layer β Shared HTTP client and headless Chrome (CDP) browser, configured once and used by all consumers.
Site scripts β Predefined recipes that know the optimal path to fetch structured data. Tries QuickJS (fast, Go-backed fetch()) first, falls back to Chrome via CDP for pages needing cookies, DOM, or auth.
Fetch β Generic content extraction from any URL. Tries direct HTTP first, falls back to browser for JS-rendered pages. Parses with go-defuddle to extract clean HTML/Markdown.
Configuration
All config via environment variables, .env file, or CLI flags:
| Variable |
Flag |
Description |
Default |
TAP_SITES_DIR |
--sites-dir |
Directory containing site scripts |
./sites |
TAP_WS_URL |
--ws-url |
Remote CDP WebSocket URL |
(local Chrome) |
TAP_PROFILE_DIR |
--profile-dir |
Chrome profile for persistent cookies |
~/.cache/tap/chrome-profile-$USER |
Browser Modes
Local Chrome (default) β launches headless Chrome with a persistent profile so cookies survive across runs.
Remote browser β connect to a remote CDP endpoint:
export TAP_WS_URL=wss://your-remote-browser/ws
tap site v2ex/hot
Writing Scripts
Scripts live in sites/ organized by site name:
/* @meta
{
"name": "site/action",
"description": "What this script does",
"domain": "example.com",
"args": {
"query": {"required": true, "description": "Search query"}
}
}
*/
async function(args) {
const resp = await fetch('https://api.example.com?q=' + args.query);
return await resp.json();
}
Available Sites
100+ scripts across 30+ sites:
- Search: Google, Bing, Baidu, DuckDuckGo
- Social: Twitter, Weibo, Reddit, ε°ηΊ’δΉ¦, ε³ε»
- Video: YouTube, Bilibili
- News: Hacker News, BBC, Reuters, δ»ζ₯倴ζ‘, 36ζ°ͺ
- Dev: GitHub, Stack Overflow, Dev.to, npm, PyPI
- Finance: ιͺη, δΈζΉθ΄’ε―, Yahoo Finance
- Knowledge: Wikipedia, η₯δΉ, Douban, arXiv
Run tap site list for the full list.
Project Structure
github.com/vaayne/tap/
βββ tap.go # Client API β unified entry point
βββ options.go # Functional options (WithSitesDir, WithWSURL, ...)
βββ transport/
β βββ transport.go # Shared network layer (HTTP + CDP browser)
βββ engine/
β βββ engine.go # Engine interface + fallback orchestrator
β βββ quickjs.go # QuickJS engine with Go fetch() polyfill
β βββ browser.go # Chrome CDP engine (delegates to transport)
βββ fetch/
β βββ fetch.go # URL β clean content via go-defuddle (HTTP β browser fallback)
βββ script/
β βββ parser.go # Script @meta parser
β βββ registry.go # Script directory scanner + index
βββ cmd/tap/
β βββ main.go # CLI binary (urfave/cli)
βββ sites/ # 100+ community site scripts
Roadmap
- Site scripts with QuickJS + browser fallback
-
tap fetch <url> β clean content extraction
-
tap screenshot <url> β page screenshots
-
tap pdf <url> β save as PDF
-
tap eval <js> --url <url> β run arbitrary JS on a page
-
tap fill <script> β form automation
License
MIT