redact

package
v0.1.0
Published: Jun 2, 2026 | License: MIT | Imports: 21 | Imported by: 0

Documentation

Constants

const RedactedPlaceholder = "REDACTED"

RedactedPlaceholder is the replacement text used for redacted secrets.

const RedactorsDirName = "redactors"

RedactorsDirName is the .trace subdirectory used for user-defined rule packs.

Variables

This section is empty.

Functions

func Bytes

func Bytes(b []byte) []byte

Bytes is a convenience wrapper around String for []byte content.

func ConfigureCustomRules

func ConfigureCustomRules(cfg CustomRulesConfig)

ConfigureCustomRules compiles user-defined redaction rules and stores the result for use by redact.String(). Sample-validation runs here too, so failures surface the next time any process initializes redaction.

Call once at process startup after loading settings. Thread-safe.

func ConfigurePII

func ConfigurePII(cfg PIIConfig)

ConfigurePII sets the global PII redaction configuration. It pre-compiles patterns so the hot path (String → detectPII) does no compilation. Call once at startup after loading settings. Thread-safe. It warns if called after PII redaction has already been used, since earlier output may have been produced with the old configuration.
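
The compile-once, read-many behavior described for ConfigurePII can be sketched with the standard library alone. This is an illustration under stated assumptions, not this package's implementation: piiState, configure, and redactPII are hypothetical names, and the plain "REDACTED" replacement is a stand-in.

```go
package main

import (
	"fmt"
	"log/slog"
	"regexp"
	"sync/atomic"
)

// piiState holds patterns compiled ahead of time (hypothetical type).
type piiState struct {
	patterns []*regexp.Regexp
}

var (
	current atomic.Pointer[piiState] // swapped atomically, so readers never lock
	used    atomic.Bool              // set once the hot path has run
)

// configure compiles every pattern up front and publishes the result.
func configure(patterns []string) {
	if used.Load() {
		slog.Warn("PII config changed after redaction was already used")
	}
	st := &piiState{}
	for _, p := range patterns {
		re, err := regexp.Compile(p)
		if err != nil {
			slog.Warn("skipping invalid PII pattern", "pattern", p, "error", err)
			continue
		}
		st.patterns = append(st.patterns, re)
	}
	current.Store(st)
}

// redactPII is the hot path: it only loads and applies compiled patterns.
func redactPII(s string) string {
	used.Store(true)
	st := current.Load()
	if st == nil {
		return s // nothing configured yet
	}
	for _, re := range st.patterns {
		s = re.ReplaceAllString(s, "REDACTED")
	}
	return s
}

func main() {
	configure([]string{`\d{3}-\d{4}`})
	fmt.Println(redactPII("call 555-1234"))
}
```

Publishing an immutable snapshot through an atomic pointer is one common way to make a "configure once, read often" API safe for concurrent use; a mutex would work as well.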

func JSONLContent

func JSONLContent(content string) (string, error)

JSONLContent parses each line as JSON to determine which string values need redaction, then performs targeted replacements on the raw JSON bytes. Lines with no secrets are returned unchanged, preserving original formatting.

For multi-line JSON content (e.g., pretty-printed single JSON objects like OpenCode export), the function first attempts to parse the trace content as a single JSON value. This ensures field-aware redaction (which skips ID fields) is used instead of falling back to entropy-based detection on raw text lines, which would corrupt high-entropy identifiers.
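
A rough sketch of the "parse, then replace in the raw text" idea follows. Everything here is an assumption made for illustration: redactLine, collectStrings, and isIDKey are hypothetical names, the ID-key rule is invented, and isSecret stands in for the package's real detection layers.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// isIDKey is a hypothetical rule for "this field holds an identifier".
func isIDKey(key string) bool {
	return key == "id" || strings.HasSuffix(key, "_id") || strings.HasSuffix(key, "Id")
}

// collectStrings walks a decoded JSON value and gathers string values,
// skipping values stored under keys that look like identifiers.
func collectStrings(v any, key string, out *[]string) {
	switch t := v.(type) {
	case map[string]any:
		for k, child := range t {
			collectStrings(child, k, out)
		}
	case []any:
		for _, child := range t {
			collectStrings(child, key, out)
		}
	case string:
		if !isIDKey(key) {
			*out = append(*out, t)
		}
	}
}

// redactLine replaces flagged string values inside the raw line, so a line
// with no flagged values comes back byte-for-byte unchanged.
func redactLine(line string, isSecret func(string) bool) string {
	var v any
	if err := json.Unmarshal([]byte(line), &v); err != nil {
		return line // not JSON: left alone in this sketch
	}
	var values []string
	collectStrings(v, "", &values)
	out := line
	for _, s := range values {
		if !isSecret(s) {
			continue
		}
		// Match the JSON-encoded form (with quotes) in the raw text.
		enc, err := json.Marshal(s)
		if err != nil {
			continue
		}
		out = strings.ReplaceAll(out, string(enc), `"REDACTED"`)
	}
	return out
}

func main() {
	isSecret := func(s string) bool { return s == "hunter2" }
	input := `{"id":"hunter2-session","token":"hunter2"}` + "\n" + `{ "note" :  "nothing here" }`
	for _, line := range strings.Split(input, "\n") {
		fmt.Println(redactLine(line, isSecret))
	}
}
```

The sketch has known gaps: json.Marshal escapes some characters differently from how they may appear in the raw line, and a flagged value that also appears verbatim under an ID key would be replaced there too.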

func String

func String(s string) string

String replaces secrets and PII in s using layered detection:

 1. Entropy-based: high-entropy alphanumeric sequences (threshold 4.5)
 2. Pattern-based: betterleaks regex rules (260+ known secret formats)
 3. Credentialed URIs: URLs containing userinfo passwords
 4. Database connection strings: JDBC, keyword DSNs, and semicolon strings
 5. Bounded credential key/value pairs: DB_PASSWORD=...
 6. PII detection: email, phone, address patterns (only when configured via ConfigurePII)

A string is redacted if ANY method flags it.
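
To make the entropy layer concrete, here is a minimal Shannon-entropy sketch. The 4.5 threshold comes from the description above; the function names, the per-byte counting, and the strict "greater than" comparison are assumptions, and the package may tokenize or compare differently.

```go
package main

import (
	"fmt"
	"math"
)

// entropyThreshold mirrors the 4.5 figure quoted in the String docs.
const entropyThreshold = 4.5

// shannonEntropy returns the Shannon entropy of s in bits per byte.
func shannonEntropy(s string) float64 {
	if len(s) == 0 {
		return 0
	}
	var counts [256]int
	for i := 0; i < len(s); i++ {
		counts[s[i]]++
	}
	n := float64(len(s))
	var h float64
	for _, c := range counts {
		if c == 0 {
			continue
		}
		p := float64(c) / n
		h -= p * math.Log2(p)
	}
	return h
}

// looksHighEntropy reports whether a token exceeds the threshold
// (the comparison direction is an assumption of this sketch).
func looksHighEntropy(token string) bool {
	return shannonEntropy(token) > entropyThreshold
}

func main() {
	for _, tok := range []string{"aaaaaaaaaaaaaaaa", "configuration", "q7Zp2Lx9Vt4Rk8Wm3Bn6Yc1Hd5Fs0Gj"} {
		fmt.Printf("%-34s %.2f %v\n", tok, shannonEntropy(tok), looksHighEntropy(tok))
	}
}
```

A token needs at least 23 distinct byte values to exceed 4.5 bits per byte, which is why ordinary words stay below the threshold while long random tokens do not.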

Types

type CustomRulesConfig

type CustomRulesConfig struct {
	// Inline maps a label (used only in logs/diagnostics) to a Go RE2 regex
	// string. Failed compilations are logged via slog.Warn and dropped.
	Inline map[string]string

	// Packs are pre-parsed rule packs (see LoadPacks). Per-rule regex
	// compilation failures are logged and dropped; sample mismatches are
	// logged but do not drop the rule.
	Packs []*Pack
}

CustomRulesConfig configures inline custom_redactions and parsed rule packs.
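
The "log and drop" handling of inline rules can be sketched as follows. compileInline is a hypothetical helper written for this example; only regexp.Compile and slog.Warn are real standard-library calls, and the log message wording is invented.

```go
package main

import (
	"fmt"
	"log/slog"
	"regexp"
)

// compileInline compiles label -> regex pairs, dropping any that fail.
func compileInline(inline map[string]string) map[string]*regexp.Regexp {
	compiled := make(map[string]*regexp.Regexp, len(inline))
	for label, pattern := range inline {
		re, err := regexp.Compile(pattern)
		if err != nil {
			// A bad rule is reported but never stops the others from loading.
			slog.Warn("dropping custom redaction rule", "label", label, "error", err)
			continue
		}
		compiled[label] = re
	}
	return compiled
}

func main() {
	rules := compileInline(map[string]string{
		"ticket": `TICKET-[0-9]+`,
		"broken": `([a-z`, // invalid: unterminated character class
	})
	fmt.Println(len(rules), "rule(s) compiled")
}
```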

type Finding

type Finding struct {
	// Type is a stable identifier for the kind of secret, e.g. "github-token",
	// "jwt", "pem-private-key", "aws-access-key", "high-entropy", "pattern",
	// or "credential".
	Type string
	// Secret is the matched substring.
	Secret string
	// Start and End are byte offsets into the scanned string ([Start, End)).
	Start int
	End   int
	// Verification grades confidence the match is a live credential.
	Verification Verification
}

Finding is a single detected secret candidate with its location, the kind of secret it appears to be, and how confidently it was verified.

Redaction (String/Bytes) still masks every non-rejected candidate — Detect is the reporting/triage surface that lets callers surface the real leaks first.

func Detect

func Detect(s string) []Finding

Detect scans s for secret candidates and returns structured findings, each graded by how confidently it was verified offline. Findings are sorted by start offset; overlapping matches are de-duplicated keeping the most confident, most specific finding.

Detect never performs network I/O: "verification" means checksum/structural proof, so secrets are never transmitted to confirm them.
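
The ordering and de-duplication rule can be illustrated with a small sketch. The finding type, the integer Confidence field, and the tie-break on span length are assumptions standing in for Finding and Verification; the documentation does not define how "most specific" is decided.

```go
package main

import (
	"fmt"
	"sort"
)

// finding is a simplified stand-in for the package's Finding type.
type finding struct {
	Type       string
	Start, End int // byte offsets, [Start, End)
	Confidence int // higher means more confident
}

// better reports whether a should win over b when they overlap.
// Ties on confidence fall back to the longer span (an assumption).
func better(a, b finding) bool {
	if a.Confidence != b.Confidence {
		return a.Confidence > b.Confidence
	}
	return a.End-a.Start > b.End-b.Start
}

// dedupe sorts by start offset and keeps one finding per overlapping run.
func dedupe(in []finding) []finding {
	fs := append([]finding(nil), in...) // copy so the caller's slice is untouched
	sort.SliceStable(fs, func(i, j int) bool { return fs[i].Start < fs[j].Start })
	var out []finding
	for _, f := range fs {
		if n := len(out); n > 0 && f.Start < out[n-1].End {
			if better(f, out[n-1]) {
				out[n-1] = f
			}
			continue
		}
		out = append(out, f)
	}
	return out
}

func main() {
	got := dedupe([]finding{
		{Type: "pattern", Start: 50, End: 60, Confidence: 1},
		{Type: "high-entropy", Start: 10, End: 40, Confidence: 1},
		{Type: "github-token", Start: 10, End: 40, Confidence: 2},
	})
	for _, f := range got {
		fmt.Println(f.Type, f.Start, f.End)
	}
}
```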

type PIICategory

type PIICategory string

PIICategory identifies a category of personally identifiable information.

const (
	PIIEmail   PIICategory = "email"
	PIIPhone   PIICategory = "phone"
	PIIAddress PIICategory = "address"
)

type PIIConfig

type PIIConfig struct {
	// Enabled globally enables/disables PII redaction.
	// When false, no PII patterns are checked (secrets still redacted).
	Enabled bool

	// Categories maps each PII category to whether it is enabled.
	// Missing keys default to false (disabled).
	Categories map[PIICategory]bool

	// CustomPatterns allows teams to define additional regex patterns.
	// Each key is a label used in the replacement token (uppercased),
	// and each value is a regex pattern string.
	// Example: {"employee_id": `EMP-\d{6}`} produces [REDACTED_EMPLOYEE_ID].
	CustomPatterns map[string]string
	// contains filtered or unexported fields
}

PIIConfig controls which PII categories are detected and redacted.
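
The CustomPatterns comment above gives one concrete mapping: the label employee_id with the pattern `EMP-\d{6}` yields [REDACTED_EMPLOYEE_ID]. A minimal sketch of that label-to-token step, using hypothetical helper names and returning an error on a bad pattern (the package's own error handling is not documented here):

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

// tokenFor builds the replacement token from an uppercased label.
func tokenFor(label string) string {
	return "[REDACTED_" + strings.ToUpper(label) + "]"
}

// applyCustomPatterns replaces every match of each pattern with its token.
func applyCustomPatterns(s string, patterns map[string]string) (string, error) {
	labels := make([]string, 0, len(patterns))
	for label := range patterns {
		labels = append(labels, label)
	}
	sort.Strings(labels) // map order is random; sort for repeatable output
	for _, label := range labels {
		re, err := regexp.Compile(patterns[label])
		if err != nil {
			return "", fmt.Errorf("pattern %q: %w", label, err)
		}
		s = re.ReplaceAllLiteralString(s, tokenFor(label))
	}
	return s, nil
}

func main() {
	out, err := applyCustomPatterns("badge EMP-123456 issued", map[string]string{
		"employee_id": `EMP-\d{6}`,
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```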

type Pack

type Pack struct {
	Name        string `json:"name"                  yaml:"name"`
	Version     string `json:"version"               yaml:"version"`
	Description string `json:"description,omitempty" yaml:"description"`
	Rules       []Rule `json:"rules"                 yaml:"rules"`
	// contains filtered or unexported fields
}

Pack is a versioned bundle of redaction rules loaded from a single file under .trace/redactors/. Both YAML and JSON encodings are accepted; the schema is identical.

func LoadPacks

func LoadPacks(dir string) ([]*Pack, error)

LoadPacks discovers and parses all rule packs in dir, including any subdirectories (so the conventional .trace/redactors/local/ path for personal/uncommitted rules is picked up automatically). Files with the extensions .yaml, .yml, and .json are considered packs; other files are ignored. A missing directory is treated as "no packs configured" and returns no error. Per-file parse errors are slog.Warn'd and the file is skipped — never fatal — so one bad file does not silence the rest.

Soft caps: files larger than maxPackFileBytes are skipped with a warning, and discovery stops after maxPackFiles parsed packs. The trust boundary is "user owns repo," so these are runaway-input guards, not security limits.
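
The discovery step can be approximated with filepath.WalkDir. This sketch covers only the recursive walk, the extension filter, and the "missing directory is not an error" rule; discoverPackFiles is a hypothetical name, and the size and count caps (maxPackFileBytes, maxPackFiles) are left out because their values are not given here.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"path/filepath"
)

// discoverPackFiles returns candidate pack files under dir, recursively.
func discoverPackFiles(dir string) ([]string, error) {
	var files []string
	err := filepath.WalkDir(dir, func(path string, d fs.DirEntry, err error) error {
		if err != nil {
			return err
		}
		if d.IsDir() {
			return nil // keep descending, e.g. into local/
		}
		switch filepath.Ext(path) {
		case ".yaml", ".yml", ".json":
			files = append(files, path)
		}
		return nil
	})
	if errors.Is(err, fs.ErrNotExist) {
		return nil, nil // missing directory: no packs configured
	}
	return files, err
}

func main() {
	files, err := discoverPackFiles(filepath.Join(".trace", "redactors"))
	if err != nil {
		fmt.Println("walk failed:", err)
		return
	}
	fmt.Println(len(files), "candidate pack file(s)")
}
```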

func ParsePack

func ParsePack(data []byte, sourcePath string) (*Pack, error)

ParsePack decodes a single pack file. sourcePath is used both to pick the encoding (YAML by default; JSON only when the extension is .json) and to enforce that the pack's `name` matches the filename stem.

Precondition: sourcePath must be a vetted local file path (the production caller is LoadPacks, which only invokes ParsePack with paths produced by WalkDir under the configured .trace/redactors/ directory). Callers passing arbitrary or remote paths must enforce their own trust model — ParsePack does not sanitize sourcePath beyond reading its extension.
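
The two roles of sourcePath, choosing the encoding and checking the pack name, can be sketched as below. The helper names and the pack struct are hypothetical, and the sketch decodes JSON only, because YAML decoding needs a third-party library that this documentation does not name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"path/filepath"
	"strings"
)

// pack is a trimmed stand-in for the package's Pack type.
type pack struct {
	Name    string `json:"name"`
	Version string `json:"version"`
}

// encodingFor picks JSON only for a .json extension; YAML otherwise.
func encodingFor(sourcePath string) string {
	if filepath.Ext(sourcePath) == ".json" {
		return "json"
	}
	return "yaml"
}

// stem returns the file name without its directory or extension.
func stem(sourcePath string) string {
	base := filepath.Base(sourcePath)
	return strings.TrimSuffix(base, filepath.Ext(base))
}

// parseJSONPack decodes a JSON pack and enforces name == filename stem.
func parseJSONPack(data []byte, sourcePath string) (*pack, error) {
	if encodingFor(sourcePath) != "json" {
		return nil, fmt.Errorf("%s: this sketch only decodes JSON", sourcePath)
	}
	var p pack
	if err := json.Unmarshal(data, &p); err != nil {
		return nil, fmt.Errorf("%s: %w", sourcePath, err)
	}
	if want := stem(sourcePath); p.Name != want {
		return nil, fmt.Errorf("%s: pack name %q does not match filename stem %q", sourcePath, p.Name, want)
	}
	return &p, nil
}

func main() {
	p, err := parseJSONPack([]byte(`{"name":"acme","version":"1"}`), ".trace/redactors/acme.json")
	fmt.Println(p, err)
}
```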

type RedactedBytes

type RedactedBytes struct {
	// contains filtered or unexported fields
}

RedactedBytes represents transcript data that has been through secret redaction. Consumers that require pre-redacted input (e.g., compact.Compact, checkpoint stores) accept this type to enforce the contract at compile time.

Produced by JSONLBytes (primary constructor) or trusted wrappers for data previously persisted by checkpoint writers.

func AlreadyRedacted

func AlreadyRedacted(data []byte) RedactedBytes

AlreadyRedacted wraps transcript bytes known to already be redacted by a prior write path. Use this ONLY for trusted sources such as persisted checkpoint transcripts or controlled test fixtures. For fresh transcript input, use JSONLBytes.

func JSONLBytes

func JSONLBytes(b []byte) (RedactedBytes, error)

JSONLBytes redacts secrets in JSONL-formatted byte content and returns the result as RedactedBytes, certifying the output has been through redaction.

func (RedactedBytes) Bytes

func (r RedactedBytes) Bytes() []byte

Bytes returns the underlying byte slice.

func (RedactedBytes) Len

func (r RedactedBytes) Len() int

Len returns the number of bytes in the redacted payload.
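
The compile-time contract behind RedactedBytes is a wrapper type with an unexported field, so code outside the package can build a non-empty value only through the package's constructors. A lowercase sketch of the pattern, with hypothetical names and a stand-in scrub function in place of real redaction:

```go
package main

import (
	"bytes"
	"fmt"
)

// redactedBytes mirrors the shape of RedactedBytes: the payload is hidden.
type redactedBytes struct {
	data []byte
}

// redactJSONL plays the role of the primary constructor; scrub is a
// placeholder for the real redaction step.
func redactJSONL(b []byte, scrub func([]byte) []byte) redactedBytes {
	return redactedBytes{data: scrub(b)}
}

// alreadyRedacted wraps bytes the caller vouches for, without scrubbing.
func alreadyRedacted(b []byte) redactedBytes {
	return redactedBytes{data: b}
}

func (r redactedBytes) Bytes() []byte { return r.data }
func (r redactedBytes) Len() int      { return len(r.data) }

// store stands in for a consumer that requires pre-redacted input.
func store(r redactedBytes) int {
	return r.Len()
}

func main() {
	scrub := func(b []byte) []byte {
		return bytes.ReplaceAll(b, []byte("hunter2"), []byte("REDACTED"))
	}
	r := redactJSONL([]byte(`{"password":"hunter2"}`), scrub)
	fmt.Println(string(r.Bytes()), store(r))
	// store([]byte("raw")) would not compile: []byte is not redactedBytes.
}
```

The type system cannot verify what a trusted wrapper is given, which is why AlreadyRedacted is documented as being for trusted sources only.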

type Rule

type Rule struct {
	ID          string   `json:"id"                    yaml:"id"`
	Description string   `json:"description,omitempty" yaml:"description"`
	Regex       string   `json:"regex"                 yaml:"regex"`
	Samples     []Sample `json:"samples,omitempty"     yaml:"samples"`
}

Rule is a single redaction rule within a Pack.

type Sample

type Sample struct {
	Input    string `json:"input"    yaml:"input"`
	Redacted bool   `json:"redacted" yaml:"redacted"`
}

Sample is a self-test entry for a Rule. The runner asserts that the rule's regex matches `Input` exactly when `Redacted` is true.
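
A sample check of this kind reduces to comparing regexp.MatchString against the expectation. The sketch below uses hypothetical names (sample, checkSamples) and reports mismatches as strings; how the package reports them beyond logging is not specified here.

```go
package main

import (
	"fmt"
	"regexp"
)

// sample mirrors the Sample fields used by the self-test.
type sample struct {
	Input    string
	Redacted bool
}

// checkSamples returns one message per sample whose match result
// disagrees with its Redacted expectation.
func checkSamples(pattern string, samples []sample) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var mismatches []string
	for i, s := range samples {
		if got := re.MatchString(s.Input); got != s.Redacted {
			mismatches = append(mismatches,
				fmt.Sprintf("sample %d (%q): matched=%v, want redacted=%v", i, s.Input, got, s.Redacted))
		}
	}
	return mismatches, nil
}

func main() {
	mismatches, err := checkSamples(`ACME-[0-9]{8}`, []sample{
		{Input: "key ACME-12345678", Redacted: true},
		{Input: "ACME-12", Redacted: false},
		{Input: "hello", Redacted: true}, // deliberately wrong expectation
	})
	fmt.Println(mismatches, err)
}
```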

type Verification

type Verification int

Verification grades the confidence that a detected candidate is a real, usable credential. TruffleHog draws the same distinction between a pattern match and a *verified* secret; we do it offline (no network egress, no secret exfiltration) using checksums and structural decoding.

const (
	// Rejected: the candidate looks like a placeholder, example, or mask
	// ("REDACTED", "<password>", "xxxx", "${VAR}") — almost certainly not a
	// live secret.
	Rejected Verification = iota
	// Unverified: matches a known secret pattern or is high-entropy, but no
	// offline proof is available. Redacted, but lower triage priority.
	Unverified
	// Verified: structurally or cryptographically confirmed — a GitHub token
	// whose CRC32 checksum matches, a JWT whose header decodes, a PEM block
	// that parses, or a card number that passes Luhn.
	Verified
)
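
As one example of the offline, structural proof mentioned for Verified, a Luhn check needs no network access at all. The function below is a generic sketch of the Luhn algorithm, not this package's code; luhnValid and its tolerance for spaces and dashes are assumptions.

```go
package main

import "fmt"

// luhnValid reports whether number passes the Luhn checksum.
// Spaces and dashes are skipped; any other non-digit fails the check.
func luhnValid(number string) bool {
	sum, digits := 0, 0
	double := false // every second digit from the right is doubled
	for i := len(number) - 1; i >= 0; i-- {
		c := number[i]
		if c == ' ' || c == '-' {
			continue
		}
		if c < '0' || c > '9' {
			return false
		}
		d := int(c - '0')
		if double {
			d *= 2
			if d > 9 {
				d -= 9
			}
		}
		sum += d
		double = !double
		digits++
	}
	return digits >= 2 && sum%10 == 0
}

func main() {
	for _, n := range []string{"4111 1111 1111 1111", "4111 1111 1111 1112"} {
		fmt.Println(n, luhnValid(n))
	}
}
```

Passing such a check shows that a candidate is well formed, not that it is still active, which fits the documented promise that Detect never performs network I/O.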

func (Verification) String

func (v Verification) String() string
