redact

package
v0.6.2
Published: May 18, 2026 License: MIT Imports: 17 Imported by: 0

Documentation

Constants

const RedactedPlaceholder = "REDACTED"

RedactedPlaceholder is the replacement text used for redacted secrets.

const RedactorsDirName = "redactors"

RedactorsDirName is the .entire subdirectory used for user-defined rule packs.

Variables

This section is empty.

Functions

func Bytes

func Bytes(b []byte) []byte

Bytes is a convenience wrapper around String for []byte content.

func ConfigureCustomRules added in v0.6.2

func ConfigureCustomRules(cfg CustomRulesConfig)

ConfigureCustomRules compiles user-defined redaction rules and stores the result for use by redact.String(). Sample-validation runs here too, so failures surface the next time any process initializes redaction.

Call once at process startup after loading settings. Thread-safe.

func ConfigurePII added in v0.5.1

func ConfigurePII(cfg PIIConfig)

ConfigurePII sets the global PII redaction configuration. Pre-compiles patterns so the hot path (String → detectPII) does no compilation. Call once at startup after loading settings. Thread-safe.

func JSONLContent

func JSONLContent(content string) (string, error)

JSONLContent parses each line as JSON to determine which string values need redaction, then performs targeted replacements on the raw JSON bytes. Lines with no secrets are returned unchanged, preserving original formatting.

For multi-line JSON content (e.g., pretty-printed single JSON objects like OpenCode export), the function first attempts to parse the entire content as a single JSON value. This ensures field-aware redaction (which skips ID fields) is used instead of falling back to entropy-based detection on raw text lines, which would corrupt high-entropy identifiers.
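The field-aware approach described above can be sketched roughly as follows. This is an illustration using only the standard library, not the package's implementation: the helpers `collect`, `redactLine`, and `hasSKPrefix`, the rule for what counts as an ID field, and the choice to leave non-JSON lines untouched are all assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// hasSKPrefix is a stand-in detector for this sketch (hypothetical).
func hasSKPrefix(s string) bool { return strings.HasPrefix(s, "sk-") }

// collect gathers string values flagged by isSecret, skipping values stored
// under keys that look like identifiers. The ID-key rule is an assumption.
func collect(v any, isSecret func(string) bool, out *[]string) {
	switch t := v.(type) {
	case map[string]any:
		for k, child := range t {
			if k == "id" || strings.HasSuffix(k, "_id") {
				continue
			}
			collect(child, isSecret, out)
		}
	case []any:
		for _, child := range t {
			collect(child, isSecret, out)
		}
	case string:
		if isSecret(t) {
			*out = append(*out, t)
		}
	}
}

// redactLine parses one line to find secrets, then edits the raw text so
// that untouched lines keep their original formatting.
func redactLine(line string, isSecret func(string) bool) string {
	var v any
	if err := json.Unmarshal([]byte(line), &v); err != nil {
		return line // assumption: this sketch leaves non-JSON lines alone
	}
	var hits []string
	collect(v, isSecret, &hits)
	for _, h := range hits {
		enc, err := json.Marshal(h) // quoted form as it would appear in JSON
		if err != nil {
			continue
		}
		// Simplification: this also replaces an ID field holding the
		// identical value, and misses values escaped differently in the raw text.
		line = strings.ReplaceAll(line, string(enc), `"REDACTED"`)
	}
	return line
}

func main() {
	fmt.Println(redactLine(`{"id":"sk-abc123","token":"sk-secret"}`, hasSKPrefix))
	fmt.Println(redactLine(`{ "msg" :  "hello" }`, hasSKPrefix))
}
```

Parsing decides what to redact while the replacement happens on the raw text, which is why a line with no hits comes back byte-for-byte identical.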

func String

func String(s string) string

String replaces secrets and PII in s using layered detection:

 1. Entropy-based: high-entropy alphanumeric sequences (threshold 4.5)
 2. Pattern-based: betterleaks regex rules (260+ known secret formats)
 3. Credentialed URIs: URLs containing userinfo passwords
 4. Database connection strings: JDBC, keyword DSNs, and semicolon strings
 5. User-defined custom rules: configured via ConfigureCustomRules
 6. Bounded credential key/value pairs: DB_PASSWORD=...
 7. PII detection: email, phone, address patterns (only when configured via ConfigurePII)

A string is redacted if ANY method flags it.

Types

type CustomRulesConfig added in v0.6.2

type CustomRulesConfig struct {
	// Inline maps a label (used only in logs/diagnostics) to a Go RE2 regex
	// string. Failed compilations are logged via slog.Warn and dropped.
	Inline map[string]string

	// Packs are pre-parsed rule packs (see LoadPacks). Per-rule regex
	// compilation failures are logged and dropped; sample mismatches are
	// logged but do not drop the rule.
	Packs []*Pack
}

CustomRulesConfig configures inline custom_redactions and parsed rule packs.

type PIICategory added in v0.5.1

type PIICategory string

PIICategory identifies a category of personally identifiable information.

const (
	PIIEmail   PIICategory = "email"
	PIIPhone   PIICategory = "phone"
	PIIAddress PIICategory = "address"
)

type PIIConfig added in v0.5.1

type PIIConfig struct {
	// Enabled globally enables/disables PII redaction.
	// When false, no PII patterns are checked (secrets still redacted).
	Enabled bool

	// Categories maps each PII category to whether it is enabled.
	// Missing keys default to false (disabled).
	Categories map[PIICategory]bool

	// CustomPatterns allows teams to define additional regex patterns.
	// Each key is a label used in the replacement token (uppercased),
	// and each value is a regex pattern string.
	// Example: {"employee_id": `EMP-\d{6}`} produces [REDACTED_EMPLOYEE_ID].
	CustomPatterns map[string]string
	// contains filtered or unexported fields
}

PIIConfig controls which PII categories are detected and redacted.
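The CustomPatterns comment says the label is uppercased into the replacement token. A minimal sketch of that mapping, using only the standard library, might look like the following. `redactCustom` is a hypothetical helper, and silently skipping patterns that fail to compile is an assumption rather than documented behavior.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// redactCustom replaces every match of each pattern with
// [REDACTED_<LABEL>], where <LABEL> is the uppercased map key.
// Map iteration order is unspecified, so overlapping patterns
// could yield different results from run to run in this sketch.
func redactCustom(s string, patterns map[string]string) string {
	for label, expr := range patterns {
		re, err := regexp.Compile(expr)
		if err != nil {
			continue // assumption: invalid patterns are skipped
		}
		token := "[REDACTED_" + strings.ToUpper(label) + "]"
		s = re.ReplaceAllLiteralString(s, token)
	}
	return s
}

func main() {
	patterns := map[string]string{"employee_id": `EMP-\d{6}`}
	fmt.Println(redactCustom("badge EMP-123456 issued", patterns))
}
```

With the example pattern from the field comment, the input above becomes `badge [REDACTED_EMPLOYEE_ID] issued`.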

type Pack added in v0.6.2

type Pack struct {
	Name        string `json:"name"                  yaml:"name"`
	Version     string `json:"version"               yaml:"version"`
	Description string `json:"description,omitempty" yaml:"description"`
	Rules       []Rule `json:"rules"                 yaml:"rules"`
	// contains filtered or unexported fields
}

Pack is a versioned bundle of redaction rules loaded from a single file under .entire/redactors/. Both YAML and JSON encodings are accepted; the schema is identical.

func LoadPacks added in v0.6.2

func LoadPacks(dir string) ([]*Pack, error)

LoadPacks discovers and parses all rule packs in dir, including any subdirectories (so the conventional .entire/redactors/local/ path for personal/uncommitted rules is picked up automatically). Files with the extensions .yaml, .yml, and .json are considered packs; other files are ignored. A missing directory is treated as "no packs configured" and returns no error. Per-file parse errors are slog.Warn'd and the file is skipped — never fatal — so one bad file does not silence the rest.

Soft caps: files larger than maxPackFileBytes are skipped with a warning, and discovery stops after maxPackFiles parsed packs. The trust boundary is "user owns repo," so these are runaway-input guards, not security limits.

func ParsePack added in v0.6.2

func ParsePack(data []byte, sourcePath string) (*Pack, error)

ParsePack decodes a single pack file. sourcePath is used both to pick the encoding (YAML by default; JSON only when the extension is .json) and to enforce that the pack's `name` matches the filename stem.

Precondition: sourcePath must be a vetted local file path (the production caller is LoadPacks, which only invokes ParsePack with paths produced by WalkDir under the configured .entire/redactors/ directory). Callers passing arbitrary or remote paths must enforce their own trust model — ParsePack does not sanitize sourcePath beyond reading its extension.
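The two roles sourcePath plays, choosing the encoding and checking the pack name against the filename stem, can be illustrated like this. The helper names `encodingFor`, `stem`, and `checkName` are hypothetical, and the exact, case-sensitive comparison with ".json" is an assumption of the sketch.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// encodingFor picks JSON only for a .json extension; everything else
// is treated as YAML, matching the default described above.
func encodingFor(sourcePath string) string {
	if filepath.Ext(sourcePath) == ".json" {
		return "json"
	}
	return "yaml"
}

// stem returns the filename without its directory or extension.
func stem(sourcePath string) string {
	base := filepath.Base(sourcePath)
	return strings.TrimSuffix(base, filepath.Ext(base))
}

// checkName reports an error when the pack's name field does not
// match the filename stem of sourcePath.
func checkName(packName, sourcePath string) error {
	if s := stem(sourcePath); packName != s {
		return fmt.Errorf("pack name %q does not match filename stem %q", packName, s)
	}
	return nil
}

func main() {
	path := filepath.Join(".entire", "redactors", "acme.yaml")
	fmt.Println(encodingFor(path), checkName("acme", path), checkName("other", path))
}
```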

type RedactedBytes added in v0.5.5

type RedactedBytes struct {
	// contains filtered or unexported fields
}

RedactedBytes represents transcript data that has been through secret redaction. Consumers that require pre-redacted input (e.g., compact.Compact, checkpoint stores) accept this type to enforce the contract at compile time.

Produced by JSONLBytes (primary constructor) or trusted wrappers for data previously persisted by checkpoint writers.

func AlreadyRedacted added in v0.5.5

func AlreadyRedacted(data []byte) RedactedBytes

AlreadyRedacted wraps transcript bytes known to already be redacted by a prior write path. Use this ONLY for trusted sources such as persisted checkpoint transcripts or controlled test fixtures. For fresh transcript input, use JSONLBytes.

func JSONLBytes

func JSONLBytes(b []byte) (RedactedBytes, error)

JSONLBytes redacts secrets in JSONL-formatted byte content and returns the result as RedactedBytes, certifying the output has been through redaction.

func (RedactedBytes) Bytes added in v0.5.5

func (r RedactedBytes) Bytes() []byte

Bytes returns the underlying byte slice.

func (RedactedBytes) Len added in v0.5.5

func (r RedactedBytes) Len() int

Len returns the number of bytes in the redacted payload.

type Rule added in v0.6.2

type Rule struct {
	ID          string   `json:"id"                    yaml:"id"`
	Description string   `json:"description,omitempty" yaml:"description"`
	Regex       string   `json:"regex"                 yaml:"regex"`
	Samples     []Sample `json:"samples,omitempty"     yaml:"samples"`
}

Rule is a single redaction rule within a Pack.

type Sample added in v0.6.2

type Sample struct {
	Input    string `json:"input"    yaml:"input"`
	Redacted bool   `json:"redacted" yaml:"redacted"`
}

Sample is a self-test entry for a Rule. The runner checks that the rule's regex matches `Input` exactly when `Redacted` is true, and does not match it when `Redacted` is false.
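A sample check of this kind can be sketched in a few lines. The `sample` type below is a local stand-in for Sample, and `failingSamples` is a hypothetical runner, not the package's own; how the real runner reports mismatches beyond logging them is not shown here.

```go
package main

import (
	"fmt"
	"regexp"
)

// sample is a local stand-in for the package's Sample type.
type sample struct {
	Input    string
	Redacted bool
}

// failingSamples returns the samples whose expectation disagrees with
// whether the compiled regex matches their input.
func failingSamples(expr string, samples []sample) ([]sample, error) {
	re, err := regexp.Compile(expr)
	if err != nil {
		return nil, err
	}
	var failed []sample
	for _, s := range samples {
		if re.MatchString(s.Input) != s.Redacted {
			failed = append(failed, s)
		}
	}
	return failed, nil
}

func main() {
	failed, err := failingSamples(`ACME-[0-9]{4}`, []sample{
		{Input: "token ACME-1234", Redacted: true},
		{Input: "plain text", Redacted: false},
		{Input: "ACME-12", Redacted: true}, // too short, so this expectation fails
	})
	fmt.Println(failed, err)
}
```

Including negative samples (`Redacted: false`) is what catches an overly broad regex, which positive samples alone would never reveal.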
