tap

Version: v0.3.0
Published: Apr 3, 2026 License: MIT Imports: 9 Imported by: 0

README

🚰 Tap

Tap into any website from your terminal.

Tap is a Go library, CLI, and web app for running JavaScript scripts against real websites. It uses QuickJS first for speed, falls back to Chrome CDP when needed, and can extract clean content from any URL via go-defuddle.

Repository

This monorepo contains:

  • cmd/tap/: Go CLI
  • web/: web app for browsing scripts (web/README.md)
  • script/: script registry and parser
  • engine/: QuickJS and browser engines
  • transport/: shared HTTP and CDP transport
  • tap.go: library entry point

Install

CLI

go install github.com/vaayne/tap/cmd/tap@latest

Library

go get github.com/vaayne/tap

Requires Go 1.22+. The browser fallback also requires Google Chrome or Chromium.

Web

Browse scripts at tap.vaayne.com.

CLI usage

Site scripts

Scripts are downloaded from tap.vaayne.com and cached in $XDG_CACHE_HOME/tap/sites/ (default: ~/.cache/tap/sites/). The cache refreshes every 24 hours.
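The cache location and refresh rule above can be sketched in Go. This is only an illustration, not Tap's actual code: the helper names sitesCacheDir and isStale are hypothetical, and the staleness check assumes the 24-hour interval is measured from the last sync time.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cacheTTL mirrors the 24-hour refresh interval described above.
const cacheTTL = 24 * time.Hour

// sitesCacheDir returns $XDG_CACHE_HOME/tap/sites, or ~/.cache/tap/sites
// when the variable is empty. Both values are passed in so the function
// is easy to test. (Hypothetical helper, not part of the tap API.)
func sitesCacheDir(xdgCacheHome, home string) string {
	if xdgCacheHome != "" {
		return filepath.Join(xdgCacheHome, "tap", "sites")
	}
	return filepath.Join(home, ".cache", "tap", "sites")
}

// isStale reports whether a cache last synced at lastSync is due for a
// refresh at time now. (Assumed rule: age >= 24 hours.)
func isStale(lastSync, now time.Time) bool {
	return now.Sub(lastSync) >= cacheTTL
}

func main() {
	home, err := os.UserHomeDir()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("cache dir:", sitesCacheDir(os.Getenv("XDG_CACHE_HOME"), home))

	lastSync := time.Now().Add(-25 * time.Hour)
	fmt.Println("stale:", isStale(lastSync, time.Now()))
}
```

Passing the environment value and home directory as parameters keeps the path logic free of global state.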

# List scripts
tap site list

# Run a script
tap site v2ex/hot
tap site twitter/search query=claude
tap site bilibili/search keyword=编程 order=click

# Pipe to jq
tap site hackernews/top | jq '.stories[:3]'

# Search scripts online
tap site search bilibili

# Manually sync the cache
tap site sync

# Use a local override at ~/.config/tap/sites/{site}/{script}.js
tap site github/repo vaayne/tap

# Skip the remote cache entirely
tap --local-only site list
tap --local-only site github/repo vaayne/tap

Fetch content

# Extract clean markdown
tap fetch https://example.com/article

# Output JSON with metadata
tap fetch --json https://example.com/article

Browser automation

Tap also supports persistent browser sessions:

tap browser session new work
tap browser tab new main --url https://example.com
tap browser navigate https://httpbin.org/html
tap browser evaluate 'document.title'
tap browser screenshot
tap browser session close work

See docs/browser.md for details.

Library usage

package main

import (
    "context"
    "fmt"
    "log"

    "github.com/vaayne/tap"
    "github.com/vaayne/tap/fetch"
)

func main() {
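    // New takes a context as its first argument (see the signature in the API docs).
    client, err := tap.New(context.Background(),
        tap.WithSitesDir("./sites"),
    )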
    if err != nil {
        log.Fatal(err)
    }
    defer client.Close()

    result, err := client.RunScript(context.Background(), "v2ex/hot", nil)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(result)

    content, err := client.Fetch(context.Background(), "https://example.com", &fetch.Options{
        Markdown: true,
    })
    if err != nil {
        log.Fatal(err)
    }
    fmt.Println(content.Markdown)
}

How it works

Both tap site and tap fetch share the same transport layer:

                    ┌──────────────────────────────────┐
                    │       Shared Transport Layer     │
                    │  Level 1: HTTP  │  Level 2: CDP  │
                    └────────┬────────┴───────┬────────┘
                             │                │
              ┌──────────────┴──┐    ┌────────┴──────────┐
              │    tap site     │    │    tap fetch      │
              │  QuickJS → CDP  │    │  HTTP → CDP       │
              │  → structured   │    │  → defuddle       │
              │    JSON         │    │  → markdown/HTML  │
              └─────────────────┘    └───────────────────┘

  • Transport: shared HTTP client and headless Chrome via CDP
  • Site scripts: QuickJS first, browser fallback for cookies, DOM, or auth
  • Fetch: direct HTTP first, browser fallback for JS-rendered pages
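The "fast path first, browser as fallback" idea can be sketched as follows. This is a minimal illustration under assumptions: the engine type, the errNeedsBrowser sentinel, and runWithFallback are hypothetical names, and how Tap actually decides that a script needs the browser is not documented here.

```go
package main

import (
	"errors"
	"fmt"
)

// errNeedsBrowser is a hypothetical sentinel that a lightweight engine
// could return when a script needs cookies, DOM access, or auth.
var errNeedsBrowser = errors.New("script needs a browser")

// engine is a hypothetical execution backend; Tap's real engine API may differ.
type engine func(script string) (string, error)

// runWithFallback tries the fast engine first and falls back to the
// browser engine only when the fast engine reports errNeedsBrowser.
// Any other error is returned unchanged.
func runWithFallback(fast, browser engine, script string) (string, error) {
	out, err := fast(script)
	if err == nil {
		return out, nil
	}
	if errors.Is(err, errNeedsBrowser) {
		return browser(script)
	}
	return "", err
}

func main() {
	fast := func(script string) (string, error) {
		if script == "needs-dom" {
			return "", errNeedsBrowser
		}
		return "fast:" + script, nil
	}
	browser := func(script string) (string, error) {
		return "browser:" + script, nil
	}

	for _, s := range []string{"plain", "needs-dom"} {
		out, err := runWithFallback(fast, browser, s)
		fmt.Println(out, err)
	}
}
```

Using a sentinel error keeps genuine failures (network errors, script bugs) from silently triggering a slower browser run.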

Configuration

Config can be set with environment variables, .env, or CLI flags:

| Variable | Flag | Description | Default |
| --- | --- | --- | --- |
| TAP_SITES_DIR | --sites-dir | Directory containing site scripts | ~/.config/tap/sites |
| TAP_WS_URL | --ws-url | Remote CDP WebSocket URL | (local Chrome) |
| TAP_BROWSER | --browser, -b | Force browser execution | false |
| TAP_PROFILE_DIR | --profile-dir | Chrome profile for persistent cookies | ~/.cache/tap/chrome-profile-$USER |
| | --pause | Pause after navigation for manual interaction | false |
| | --delay | Wait a fixed duration after navigation | 0s |
| | --wait-selector | Wait until a CSS selector becomes visible | "" |
| | --wait-js | Wait until a JavaScript expression becomes truthy | "" |
| | --no-headless | Run browser in visible mode | false |
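One common way to combine environment variables and flags is to use the environment value as the flag's default, so an explicit flag wins. The sketch below assumes that precedence (the README does not state it), ignores .env files, and uses hypothetical helper names envOr and parseSitesDir.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// envOr returns the value of key from getenv, or def when it is empty.
func envOr(getenv func(string) string, key, def string) string {
	if v := getenv(key); v != "" {
		return v
	}
	return def
}

// parseSitesDir resolves the sites directory from CLI args and the
// environment. Assumed precedence: flag > TAP_SITES_DIR > built-in default.
func parseSitesDir(args []string, getenv func(string) string) (string, error) {
	fs := flag.NewFlagSet("tap", flag.ContinueOnError)
	// The environment value becomes the flag default, so an explicit flag wins.
	dir := fs.String("sites-dir",
		envOr(getenv, "TAP_SITES_DIR", "~/.config/tap/sites"),
		"directory containing site scripts")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	return *dir, nil
}

func main() {
	dir, err := parseSitesDir(os.Args[1:], os.Getenv)
	if err != nil {
		os.Exit(2)
	}
	fmt.Println("sites dir:", dir)
}
```

Injecting getenv as a function makes the resolution logic testable without touching the process environment.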

Browser modes

Local Chrome is the default and uses a persistent profile so cookies survive across runs.

Remote browser uses a CDP WebSocket endpoint:

export TAP_WS_URL=wss://your-remote-browser/ws
tap site v2ex/hot

Login and interactive browser

Use tap login to log in once and reuse saved cookies later:

tap login https://github.com/login
tap site -b github/notifications

Interactive wait modes are also supported:

tap site --pause twitter/search query=claude
tap fetch https://example.com --delay 5s
tap fetch https://example.com --wait-selector '.redeem-code'
tap fetch https://example.com --wait-js 'document.body.innerText.includes("Code")'

--pause, --delay, --wait-selector, and --wait-js imply visible browser mode and browser execution.

Writing scripts

Scripts live in the separate tap-scripts repository.

/* @meta
{
  "name": "site/action",
  "description": "What this script does",
  "domain": "example.com",
  "args": {
    "query": {"required": true, "description": "Search query"}
  }
}
*/

async function(args) {
  // Encode the query so spaces and reserved characters survive in the URL.
  const resp = await fetch('https://api.example.com?q=' + encodeURIComponent(args.query))
  return await resp.json()
}
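
To show how a header like the one above could be read, here is a rough Go sketch that extracts the JSON between `/* @meta` and the next `*/`. It is not Tap's actual parser (that lives in the script package and may work differently); the Meta and Arg types and parseMeta are hypothetical and cover only the fields in the example.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"strings"
)

// Arg and Meta mirror the fields shown in the example header (hypothetical types).
type Arg struct {
	Required    bool   `json:"required"`
	Description string `json:"description"`
}

type Meta struct {
	Name        string         `json:"name"`
	Description string         `json:"description"`
	Domain      string         `json:"domain"`
	Args        map[string]Arg `json:"args"`
}

// parseMeta decodes the JSON found between "/* @meta" and the next "*/".
// Limitation of this sketch: a "*/" inside a JSON string would end the
// header early.
func parseMeta(src string) (Meta, error) {
	var m Meta
	const open = "/* @meta"
	start := strings.Index(src, open)
	if start < 0 {
		return m, errors.New("no @meta header")
	}
	rest := src[start+len(open):]
	end := strings.Index(rest, "*/")
	if end < 0 {
		return m, errors.New("unterminated @meta header")
	}
	err := json.Unmarshal([]byte(rest[:end]), &m)
	return m, err
}

const sample = `/* @meta
{
  "name": "site/action",
  "description": "What this script does",
  "domain": "example.com",
  "args": {
    "query": {"required": true, "description": "Search query"}
  }
}
*/
async function(args) { return args }
`

func main() {
	m, err := parseMeta(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(m.Name, m.Domain, m.Args["query"].Required)
}
```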

Available sites

100+ scripts across 30+ sites, including:

  • Search: Google, Bing, Baidu, DuckDuckGo
  • Social: Twitter, Weibo, Reddit, 小红书 (Xiaohongshu), 即刻 (Jike)
  • Video: YouTube, Bilibili
  • News: Hacker News, BBC, Reuters, 今日头条 (Toutiao), 36氪 (36Kr)
  • Dev: GitHub, Stack Overflow, Dev.to, npm, PyPI
  • Finance: 雪球 (Xueqiu), 东方财富 (East Money), Yahoo Finance
  • Knowledge: Wikipedia, 知乎 (Zhihu), Douban, arXiv

Run tap site list for the full list.

Docs

Roadmap

  • Site scripts with QuickJS + browser fallback
  • tap fetch <url>
  • tap login <url>
  • --pause
  • tap browser
  • tap browser forms / tap browser fill
  • tap pdf <url>

License

MIT

Documentation

Overview

Package tap provides a unified API for interacting with web pages.

Tap can run site scripts (with QuickJS → browser fallback) and fetch clean content from URLs via go-defuddle. Both share a common transport layer for HTTP and browser-based network access.

Basic usage:

client, err := tap.New(ctx, tap.WithSitesDir("./sites"))
if err != nil {
    log.Fatal(err)
}
defer client.Close()

// Run a site script
result, err := client.RunScript(ctx, "v2ex/hot", nil)

// Fetch clean content
content, err := client.Fetch(ctx, "https://example.com", nil)

Types

type Client

type Client struct {
	// contains filtered or unexported fields
}

Client is the main entry point for the tap library.

func New

func New(ctx context.Context, optFns ...Option) (*Client, error)

New creates a new Client with the given options. The context is used for any startup work (e.g. downloading a browser binary).

func (*Client) Close

func (c *Client) Close() error

Close releases all resources.

func (*Client) Fetch

func (c *Client) Fetch(ctx context.Context, url string, opts *fetch.Options) (*fetch.Result, error)

Fetch retrieves a URL and extracts clean content using go-defuddle.

func (*Client) GetScript

func (c *Client) GetScript(name string) (*script.Script, bool)

GetScript returns a script by name.

func (*Client) ListScripts

func (c *Client) ListScripts() []*script.Script

ListScripts returns all available scripts sorted by name.

func (*Client) ListScriptsLocalOnly (added in v0.3.0)

func (c *Client) ListScriptsLocalOnly() []*script.Script

ListScriptsLocalOnly returns only scripts loaded from the local override directory.

func (*Client) Login (added in v0.1.5)

func (c *Client) Login(ctx context.Context, url string, pauseFn transport.PauseFunc) error

Login opens a browser to the given URL and keeps it open until pauseFn returns. Cookies are persisted in the Chrome profile directory so that subsequent script runs are authenticated.

func (*Client) RunScript

func (c *Client) RunScript(ctx context.Context, name string, args map[string]string) (any, error)

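RunScript executes a site script by name with the given arguments. It tries QuickJS first and falls back to the browser when needed. When browser execution is forced (WithForceBrowser, or --browser in the CLI), QuickJS is skipped and the script runs directly in the browser.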

type Option

type Option func(*options)

Option configures a Client.
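
The Option type follows Go's functional options pattern. The sketch below is a stand-alone illustration of that pattern, not the library's source: the fields of options, the apply helper, and the 30-second default timeout are assumptions, and only three of the documented options are mirrored.

```go
package main

import (
	"fmt"
	"time"
)

// options holds hypothetical fields; the real struct is unexported and may differ.
type options struct {
	sitesDir string
	headless bool
	timeout  time.Duration
}

// Option has the same shape as the documented type.
type Option func(*options)

// Each constructor returns a closure that sets exactly one field.
func WithSitesDir(dir string) Option      { return func(o *options) { o.sitesDir = dir } }
func WithHeadless(headless bool) Option   { return func(o *options) { o.headless = headless } }
func WithTimeout(d time.Duration) Option  { return func(o *options) { o.timeout = d } }

// apply starts from defaults and runs each option in order, so later
// options override earlier ones. The 30s timeout is an assumed default;
// headless defaults to true as documented for WithHeadless.
func apply(optFns ...Option) options {
	o := options{headless: true, timeout: 30 * time.Second}
	for _, fn := range optFns {
		fn(&o)
	}
	return o
}

func main() {
	o := apply(WithSitesDir("./sites"), WithHeadless(false))
	fmt.Printf("%+v\n", o)
}
```

The pattern lets New grow new settings without changing its signature or breaking existing callers.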

func WithBrowserType (added in v0.1.6)

func WithBrowserType(bt transport.BrowserType) Option

WithBrowserType selects the browser backend ("chrome" or "lightpanda").

func WithForceBrowser (added in v0.1.1)

func WithForceBrowser(force bool) Option

WithForceBrowser skips QuickJS and runs scripts directly in Chrome.

func WithHeadless (added in v0.1.1)

func WithHeadless(headless bool) Option

WithHeadless sets whether Chrome runs in headless mode (default: true).

func WithLocalOverrideDir (added in v0.3.0)

func WithLocalOverrideDir(dir string) Option

WithLocalOverrideDir sets a directory that is checked before the main sites cache. Scripts found here shadow cached versions and are flagged as local overrides. Mirrors the path structure: {dir}/{site}/{script}.js
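
The lookup order described above (override directory first, then the cache) might look roughly like this. It is a sketch under assumptions: resolveScript, writeScript, and setup are hypothetical helpers, and the real registry may load scripts eagerly instead of checking the file system per lookup.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveScript returns the file for name (for example "github/repo"),
// checking overrideDir before cacheDir, and reports whether the match
// is a local override. Both directories use {dir}/{site}/{script}.js.
func resolveScript(overrideDir, cacheDir, name string) (path string, local bool, err error) {
	rel := filepath.FromSlash(name) + ".js"
	for i, dir := range []string{overrideDir, cacheDir} {
		if dir == "" {
			continue
		}
		p := filepath.Join(dir, rel)
		if _, statErr := os.Stat(p); statErr == nil {
			return p, i == 0, nil
		}
	}
	return "", false, fmt.Errorf("script %q not found", name)
}

// writeScript creates {dir}/{name}.js with placeholder content.
func writeScript(dir, name string) error {
	p := filepath.Join(dir, filepath.FromSlash(name)+".js")
	if err := os.MkdirAll(filepath.Dir(p), 0o755); err != nil {
		return err
	}
	return os.WriteFile(p, []byte("async function(args) {}\n"), 0o644)
}

// setup builds throwaway directories: github/repo exists in both,
// v2ex/hot exists only in the cache.
func setup() (override, cache string, cleanup func(), err error) {
	root, err := os.MkdirTemp("", "tap-example-")
	if err != nil {
		return "", "", nil, err
	}
	cleanup = func() { os.RemoveAll(root) }
	override = filepath.Join(root, "override")
	cache = filepath.Join(root, "cache")
	for _, s := range []struct{ dir, name string }{
		{override, "github/repo"},
		{cache, "github/repo"},
		{cache, "v2ex/hot"},
	} {
		if err := writeScript(s.dir, s.name); err != nil {
			cleanup()
			return "", "", nil, err
		}
	}
	return override, cache, cleanup, nil
}

func main() {
	override, cache, cleanup, err := setup()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	defer cleanup()

	for _, name := range []string{"github/repo", "v2ex/hot"} {
		_, local, err := resolveScript(override, cache, name)
		fmt.Println(name, "local override:", local, "err:", err)
	}
}
```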

func WithPause (added in v0.1.5)

func WithPause(fn transport.PauseFunc) Option

WithPause sets a function that is called after browser navigation, allowing the user to interact (login, solve CAPTCHAs) before script execution.

func WithProfileDir

func WithProfileDir(dir string) Option

WithProfileDir sets the Chrome user data directory for persistent cookies/storage. Defaults to ~/.cache/tap/chrome-profile-$USER.

func WithSitesDir

func WithSitesDir(dir string) Option

WithSitesDir sets the directory containing site scripts.

func WithTimeout (added in v0.1.1)

func WithTimeout(d time.Duration) Option

WithTimeout sets the execution timeout for scripts and fetches.

func WithWSURL

func WithWSURL(url string) Option

WithWSURL sets the remote CDP WebSocket URL. If empty, a local Chrome is launched.

type ScriptNotFoundError (added in v0.1.1)

type ScriptNotFoundError struct {
	Name      string
	Available []string
}

ScriptNotFoundError is returned when a script name doesn't match any registered script.

func (*ScriptNotFoundError) Error (added in v0.1.1)

func (e *ScriptNotFoundError) Error() string

func (*ScriptNotFoundError) Suggestions (added in v0.1.1)

func (e *ScriptNotFoundError) Suggestions(max int) []string

Suggestions returns script names similar to the requested name, ranked by relevance.
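
Callers would typically detect this error with errors.As. The sketch below uses a local stand-in type with the two documented fields so it runs without the library; the lookup and missingName helpers and the error message format are hypothetical, and Suggestions is left out because its ranking method is not documented here.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// ScriptNotFoundError is a local stand-in with the two documented fields.
type ScriptNotFoundError struct {
	Name      string
	Available []string
}

// Error implements the error interface (message format is assumed).
func (e *ScriptNotFoundError) Error() string {
	return fmt.Sprintf("script %q not found", e.Name)
}

// lookup is a hypothetical registry lookup that returns the typed error
// together with the sorted list of known script names.
func lookup(registry map[string]string, name string) (string, error) {
	if src, ok := registry[name]; ok {
		return src, nil
	}
	avail := make([]string, 0, len(registry))
	for k := range registry {
		avail = append(avail, k)
	}
	sort.Strings(avail)
	return "", &ScriptNotFoundError{Name: name, Available: avail}
}

// missingName reports the requested name if err is, or wraps, a
// *ScriptNotFoundError.
func missingName(err error) (string, bool) {
	var nf *ScriptNotFoundError
	if errors.As(err, &nf) {
		return nf.Name, true
	}
	return "", false
}

func main() {
	registry := map[string]string{"v2ex/hot": "...", "hackernews/top": "..."}
	_, err := lookup(registry, "v2ex/hto")
	err = fmt.Errorf("run script: %w", err) // errors.As still sees through the wrap
	if name, ok := missingName(err); ok {
		fmt.Println("unknown script:", name)
	}
}
```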

Directories

| Path | Synopsis |
| --- | --- |
| browser | Package browser defines the persistent browser session metadata model used by the planned `tap browser ...` workflow. |
| cmd/tap | (command) |
| engine | Package engine provides execution engines for running site scripts. |
| fetch | Package fetch provides URL content extraction using go-defuddle. |
| script | Package script handles parsing and discovery of site scripts. |
| transport | Package transport provides a shared network layer for fetching web content. |