portcullis

v0.0.0-...-6745259
Published: May 21, 2026 License: Apache-2.0 Imports: 4 Imported by: 0

README

A tiny Go library that detects and redacts API tokens, cloud credentials, and other secret material in arbitrary text.

import "github.com/docker/portcullis"

clean := portcullis.Redact("Run this with token=ghp_1234567890abcdef1234567890abcdef1234 please.")
// → "Run this with token=[REDACTED] please."

portcullis.Contains("not a secret")                                   // false
portcullis.Contains("token=ghp_1234567890abcdef1234567890abcdef1234") // true

Why

LLM agents, log pipelines, error reporters, and anything else that echoes user-controlled or tool-produced text to a third party need to scrub credentials before they leak. portcullis is the extracted, dependency-free core of the redactor used by docker-agent, built around two design constraints:

  • Cheap on clean input. A single Aho–Corasick pass over the input yields a bitset of every keyword present, after which each rule's keyword check collapses to two AND instructions. Most messages never pay for a regex.
  • Idempotent. The default marker [REDACTED] is chosen so that it does not match any built-in rule. Redacting an already-redacted string is therefore a no-op, and pipelines that scrub at multiple stages do not compound redactions.
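
The keyword gate described in the first constraint can be sketched as follows. This is a minimal illustration, not the library's implementation: the rule type, the keyword list, and both regex patterns are hypothetical, and one strings.Contains call per keyword stands in for the single Aho–Corasick pass.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// rule pairs a regex with the keywords that must be present before the regex
// is worth running. The type and its fields are hypothetical.
type rule struct {
	keywords uint64 // bit i set means keyword i can trigger this rule
	re       *regexp.Regexp
}

// keywords is a toy vocabulary; index i corresponds to bit i of the mask.
var keywords = []string{"ghp_", "AKIA"}

// rules uses illustrative patterns, not the library's catalogue.
var rules = []rule{
	{keywords: 1 << 0, re: regexp.MustCompile(`ghp_[0-9A-Za-z]{36}`)},
	{keywords: 1 << 1, re: regexp.MustCompile(`AKIA[0-9A-Z]{16}`)},
}

// keywordMask reports which keywords occur in text. An Aho-Corasick automaton
// would find them all in one pass; this sketch scans once per keyword instead.
func keywordMask(text string) uint64 {
	var mask uint64
	for i, kw := range keywords {
		if strings.Contains(text, kw) {
			mask |= 1 << uint(i)
		}
	}
	return mask
}

// containsSketch runs a rule's regex only when one of its keywords is present.
func containsSketch(text string) bool {
	mask := keywordMask(text)
	if mask == 0 {
		return false // clean input: no regex is executed
	}
	for _, r := range rules {
		if mask&r.keywords != 0 && r.re.MatchString(text) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(containsSketch("hello world"))
	fmt.Println(containsSketch("token=ghp_1234567890abcdef1234567890abcdef1234"))
}
```

A keyword hit alone is not a detection: the input "ghp_short" sets a mask bit, but the regex still has to confirm the match.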

Install

go get github.com/docker/portcullis

Requires Go 1.22+.

API

The public surface is intentionally tiny:

const Marker = "[REDACTED]"

func Contains(text string) bool   // detect
func Redact(text string) string   // scrub

Both functions are safe for concurrent use. The compiled rule set and its Aho–Corasick automaton are built once on first call and shared across goroutines.
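
One common way to get build-once, share-everywhere behaviour in Go is sync.OnceValue (Go 1.21+). The sketch below assumes a hypothetical compiled type and an illustrative pattern; how portcullis actually initialises its rule set is not shown in this document.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

// compiled stands in for the shared rule set; its shape is hypothetical.
type compiled struct {
	rules []*regexp.Regexp
}

// buildCount records how many times the builder ran (for the demo only).
var buildCount int

// loadRules builds the rule set on the first call and returns the same
// pointer on every later call, including calls from other goroutines.
var loadRules = sync.OnceValue(func() *compiled {
	buildCount++
	return &compiled{rules: []*regexp.Regexp{
		regexp.MustCompile(`ghp_[0-9A-Za-z]{36}`), // illustrative pattern
	}}
})

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = loadRules() // every goroutine sees the same instance
		}()
	}
	wg.Wait()
	fmt.Println("builds:", buildCount)
}
```

Because *regexp.Regexp is itself safe for concurrent use, a rule set built this way can be read by any number of goroutines without further locking.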

What it detects

The built-in catalogue covers ~220 patterns spanning:

  • Cloud providers — AWS (incl. STS ABIA / context-specific ACCA prefixes), GCP service accounts, Azure Storage, Azure DevOps, Azure AD client secrets, DigitalOcean, Tencent, Alibaba, Yandex, Akamai, Cloudflare Origin CA.
  • Source forges & CI — GitHub (PAT / OAuth / app / fine-grained / refresh), GitLab (full token family incl. glimt- / glagent- / glsoat- / routable variants), Bitbucket, Docker Hub (PAT / OAT), JFrog (key + reference token), Sonar, Buildkite, CircleCI, Harness (pat. / sat.), Authress.
  • LLM / AI providers — OpenAI, Anthropic, DeepSeek, Google (AIza), xAI / Grok, Cohere, Groq, Perplexity, Replicate, OpenRouter, Hugging Face (user hf_ + organisation api_org_), AssemblyAI, Deepgram.
  • Payment processors — Stripe (publishable / secret / restricted / webhook), Razorpay, Adyen, Plaid, Square, Braintree.
  • Communication & ops — Slack (legacy, rotating, webhooks), Discord (bot & webhook), Telegram, Twilio, SendGrid, Mailgun, Mailchimp, Sendinblue, Microsoft Teams webhooks.
  • SaaS & developer tools — Figma, Contentful, HubSpot, LaunchDarkly (incl. sdk- keys), Doppler (full family), 1Password, Vercel, Netlify, Render, Notion, Linear, Trello, ClickUp, Okta, ngrok, Cisco Meraki, SettleMint, Fly.io macaroons, Heroku v1/v2, OpenShift sha256~ tokens.
  • Infra, web3 & databases — HashiCorp Vault (service / batch / recovery), Terraform Cloud, Tailscale, PlanetScale, Supabase, MongoDB / Postgres / MySQL / Redis / AMQP connection-string passwords, Sidekiq Pro/Enterprise gem-server URLs, Alchemy / Etherscan / Moralis (web3), Logz.io, PEM private keys, JWTs, and more.

Connection-string rules (MongoDB, Postgres, MySQL, Redis, AMQP, Azure Storage) redact only the password / key span so log readers can still tell which host or account was being addressed.

Development

The project is driven by gogo, a small task runner. Install it once:

go install github.com/dgageot/gogo@latest

Then, from the repository root:

gogo            # default: lint + test
gogo test       # go test ./...
gogo test-race  # go test -race ./...
gogo bench      # go test -bench=. -benchmem -run=^$ ./...
gogo lint       # golangci-lint run + go.mod tidy check
gogo format     # golangci-lint fmt
gogo tidy       # go mod tidy
gogo -l         # list every task with its description

The linter configuration lives in .golangci.yml. The matching CI workflow runs lint + race-enabled tests against the go.mod floor (Go 1.22) and the latest stable Go release on every PR — see .github/workflows/ci.yml.

Performance

On a typical clean input the cost is dominated by a single linear scan over the bytes; the regex engine is never invoked. When keywords are present, only the rules that reference them run. Redact allocates only when the text actually changes, and Contains does not allocate in any of the benchmarks below.

On an Apple M4 Max scrubbing a 9000-byte clean payload and a secret-bearing 1.5 KB payload:

BenchmarkAhoScanCleanInput-16        257198      4823 ns/op  1865.95 MB/s     0 B/op    0 allocs/op
BenchmarkAhoScanWithKeyword-16      1251084       956.0 ns/op 1636.96 MB/s     0 B/op    0 allocs/op
BenchmarkRedactCleanInput-16         228036      4803 ns/op                    0 B/op    0 allocs/op
BenchmarkContainsCleanInput-16       233336      4782 ns/op                    0 B/op    0 allocs/op
BenchmarkRedactWithSecret-16         565590      2094 ns/op                 1585 B/op    2 allocs/op
BenchmarkContainsWithSecret-16       912943      1312 ns/op                    0 B/op    0 allocs/op

The AC scan dominates the clean-input path — Redact and Contains add no measurable overhead on top of it because the rule loop short-circuits on an empty keyword mask. Contains allocates zero bytes even on a secret-bearing input: it only needs the regex MatchString, which uses pooled state machines internally.
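
As a reading aid for the table above, the payload size behind a benchmark line can be recovered from its ns/op and MB/s columns, since Go reports MB/s as bytes processed per second divided by 1e6. The helper name below is hypothetical.

```go
package main

import "fmt"

// payloadBytes recovers the bytes processed per operation from a benchmark
// line: bytes = (MB/s * 1e6) * (ns/op / 1e9).
func payloadBytes(nsPerOp, mbPerSec float64) float64 {
	return mbPerSec * 1e6 * nsPerOp / 1e9
}

func main() {
	// BenchmarkAhoScanCleanInput: close to the 9000-byte clean payload.
	fmt.Printf("%.0f\n", payloadBytes(4823, 1865.95))
	// BenchmarkAhoScanWithKeyword: close to the 1.5 KB secret-bearing payload.
	fmt.Printf("%.0f\n", payloadBytes(956.0, 1636.96))
}
```

The results differ slightly from the exact payload sizes because the ns/op column is rounded.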

Provenance

The default ruleset is derived from the MIT-licensed github.com/docker/mcp-gateway/pkg/secretsscan package, which adapted it from github.com/aquasecurity/trivy/pkg/fanal/secret, extended with additional patterns for modern AI providers, payment processors, and infrastructure tokens.

License

Apache-2.0. See LICENSE.

Documentation

Overview

Package portcullis detects and redacts API tokens, cloud credentials, and other secret material in arbitrary text.

Contains reports whether any rule matches the input; Redact replaces every detected secret span with Marker while preserving the surrounding text. Both are safe for concurrent use, and Redact is idempotent.

Performance

Detection runs in O(len(text)) per rule via Go's RE2-based regexp engine, gated by an Aho–Corasick keyword pre-filter so clean inputs typically don't compile or run any regex. Memory allocations are zero on a clean input and small on a secret-bearing one: the README benchmark reports 1585 B/op in 2 allocations for Redact on a 1.5 KB payload.

Caller responsibilities

portcullis intentionally does not cap input size: callers process inputs of widely different shapes (a chat message, a tool's stdout, a multi-megabyte log buffer) and can pick the right upper bound for their context. If the input is attacker-controlled and unbounded — e.g. an HTTP request body relayed through an untrusted intermediary — wrap the call site with the appropriate size limit before invoking Redact / Contains.
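
A caller-side bound might look like the sketch below. The helper names and the 1 MiB cap are hypothetical; the bounded text would then be handed to portcullis.Redact or portcullis.Contains.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// maxScrubBytes is a hypothetical cap; pick a bound that fits the call site.
const maxScrubBytes = 1 << 20 // 1 MiB

// readBounded reads at most limit bytes from r and reports whether the input
// was longer than that. It reads one extra byte to detect truncation.
func readBounded(r io.Reader, limit int64) (string, bool, error) {
	data, err := io.ReadAll(io.LimitReader(r, limit+1))
	if err != nil {
		return "", false, err
	}
	if int64(len(data)) > limit {
		return string(data[:limit]), true, nil
	}
	return string(data), false, nil
}

// boundString applies the same cap to text that is already in memory.
// Slicing by bytes may split a multi-byte UTF-8 character at the boundary.
func boundString(s string, limit int) (string, bool) {
	if len(s) > limit {
		return s[:limit], true
	}
	return s, false
}

func main() {
	body := strings.NewReader(strings.Repeat("x", 32))
	text, truncated, err := readBounded(body, 16)
	if err != nil {
		panic(err)
	}
	// text would now be passed to portcullis.Redact or portcullis.Contains.
	fmt.Println(len(text), truncated)
}
```

Truncation can cut a secret in half so that no rule matches the remaining fragment. Where that matters, rejecting oversized input outright may be the safer choice.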

Constants

const Marker = "[REDACTED]"

Marker replaces every detected secret span. Chosen so it doesn't match any rule's keyword pre-filter — see TestMarkerIsNotASecret for the safety property that makes Redact idempotent.
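
The safety property can be sketched as a small check. The code below uses toy rules with illustrative patterns and hypothetical function names; it is not the library's TestMarkerIsNotASecret, only a check in the same spirit.

```go
package main

import (
	"fmt"
	"regexp"
)

const marker = "[REDACTED]"

// toyRules are illustrative stand-ins for the built-in catalogue.
var toyRules = []*regexp.Regexp{
	regexp.MustCompile(`ghp_[0-9A-Za-z]{36}`),
	regexp.MustCompile(`AKIA[0-9A-Z]{16}`),
}

// redactToy replaces every match of every toy rule with marker.
func redactToy(text string) string {
	for _, re := range toyRules {
		text = re.ReplaceAllLiteralString(text, marker)
	}
	return text
}

// markerIsInert reports whether no toy rule matches the marker itself, a
// precondition for a second redaction pass being a no-op.
func markerIsInert() bool {
	for _, re := range toyRules {
		if re.MatchString(marker) {
			return false
		}
	}
	return true
}

func main() {
	once := redactToy("key ghp_1234567890abcdef1234567890abcdef1234 end")
	twice := redactToy(once)
	fmt.Println(markerIsInert(), once == twice)
}
```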

Functions

func Contains

func Contains(text string) bool

Contains reports whether text matches any built-in secret rule. It is safe for concurrent use.

Example
package main

import (
	"fmt"

	"github.com/docker/portcullis"
)

func main() {
	fmt.Println(portcullis.Contains("hello world"))
	fmt.Println(portcullis.Contains("token=ghp_cxLeRrvbJfmYdUtr70xnNE3Q7Gvli43s19PD"))
}
Output:
false
true

func Redact

func Redact(text string) string

Redact returns a copy of text with every detected secret span replaced by Marker. When a rule defines a (?P<secret>…) named subgroup, only that span is replaced (so callers still see "AWS_SECRET_ACCESS_KEY=[REDACTED]"); otherwise the whole match is replaced.

Idempotent: Marker does not match any rule, so calling Redact twice yields the same result. Safe for concurrent use.
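
The named-subgroup behaviour can be approximated with the standard regexp package. In this sketch the connRule pattern and the redactGroup helper are hypothetical; only the group name secret and the marker text come from the documentation above.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

const marker = "[REDACTED]"

// connRule is an illustrative pattern, not one of the library's rules. Only
// the named group "secret" (the password) is meant to be replaced.
var connRule = regexp.MustCompile(`postgres(?:ql)?://[^:/\s]+:(?P<secret>[^@\s]+)@`)

// redactGroup replaces the "secret" group of each match with marker, or the
// whole match when the pattern defines no such group.
func redactGroup(re *regexp.Regexp, text string) string {
	gi := re.SubexpIndex("secret") // -1 when the group does not exist
	var b strings.Builder
	last := 0
	for _, loc := range re.FindAllStringSubmatchIndex(text, -1) {
		start, end := loc[0], loc[1]
		if gi >= 0 && loc[2*gi] >= 0 {
			start, end = loc[2*gi], loc[2*gi+1]
		}
		b.WriteString(text[last:start]) // keep text before the secret span
		b.WriteString(marker)
		last = end
	}
	b.WriteString(text[last:]) // keep the tail after the final match
	return b.String()
}

func main() {
	uri := "postgresql://app:hunter2supersecret@db.internal:5432/orders"
	fmt.Println(redactGroup(connRule, uri))
}
```

Narrowing the replacement to the group keeps the scheme, user, and host readable, which is the same trade-off the connection-string rules make.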

Example
package main

import (
	"fmt"

	"github.com/docker/portcullis"
)

func main() {
	log := "Run this with token=ghp_cxLeRrvbJfmYdUtr70xnNE3Q7Gvli43s19PD please."

	fmt.Println(portcullis.Redact(log))
}
Output:
Run this with token=[REDACTED] please.

Example (ConnectionString)
package main

import (
	"fmt"

	"github.com/docker/portcullis"
)

func main() {
	// Connection-string rules redact only the password span so the
	// surrounding URL stays useful for log readers.
	uri := "postgresql://app:hunter2supersecret@db.internal:5432/orders"

	fmt.Println(portcullis.Redact(uri))
}
Output:
postgresql://app:[REDACTED]@db.internal:5432/orders

Example (MultipleSecrets)
package main

import (
	"fmt"

	"github.com/docker/portcullis"
)

func main() {
	in := "first ghp_cxLeRrvbJfmYdUtr70xnNE3Q7Gvli43s19PD " +
		"and second ghp_AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA end"

	fmt.Println(portcullis.Redact(in))
}
Output:
first [REDACTED] and second [REDACTED] end
