
# trustwatch
Kubernetes trust surface monitoring. Discovers expiring certificates on admission webhooks, API aggregation endpoints, service mesh issuers, cert-manager renewals, SPIFFE trust bundles, cloud provider certs, and external dependencies — then reports only the ones that matter. Supports policy-driven rules, multi-cluster federation, and historical trend tracking.
## Quick Start

```sh
# Homebrew
brew install ppiankov/tap/trustwatch

# Or install from source
go install github.com/ppiankov/trustwatch/cmd/trustwatch@latest

# kubectl plugin (also installed by Homebrew)
kubectl trustwatch now --context prod

# Scan current cluster
trustwatch now --context prod

# Run as in-cluster service
trustwatch serve --config /etc/trustwatch/config.yaml
```
## Agent Integration

trustwatch is designed to be used by autonomous agents without plugins or SDKs: single binary, deterministic output, structured JSON, bounded jobs.

Agents: read `SKILL.md` for commands, flags, JSON output structure, and parsing examples.

Key pattern for agents: run `trustwatch now --context prod --output json`, then parse `.snapshot.findings[]` for certificate issues.
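A minimal parsing step, using only the fields shown in this README's own examples; treat any other field names as assumptions to verify against your output:

```sh
# List critical findings from a prod scan; the exit code still gates CI
trustwatch now --context prod --output json \
  | jq '.snapshot.findings[] | select(.severity == "critical")'
```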
## Container Image

```sh
# Pull from GitHub Container Registry
docker pull ghcr.io/ppiankov/trustwatch:latest

# Or build locally
make docker-build IMAGE=my-registry.io/trustwatch
docker push my-registry.io/trustwatch:v0.2.0
```

Multi-arch images (linux/amd64, linux/arm64) are published automatically on each release. The image is built `FROM scratch` with only the static binary and CA certificates (~15 MB). It doubles as its own tunnel relay in air-gapped clusters via `trustwatch socks5` (see tunnel docs).
## Verification

Container images and binary checksums are signed with Sigstore keyless signing.

```sh
# Verify container image
cosign verify \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate-identity-regexp "https://github.com/ppiankov/trustwatch/" \
  ghcr.io/ppiankov/trustwatch:latest

# Verify SBOM attestation
cosign verify-attestation --type spdxjson \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate-identity-regexp "https://github.com/ppiankov/trustwatch/" \
  ghcr.io/ppiankov/trustwatch:latest

# Verify binary checksums
cosign verify-blob --certificate checksums.txt.pem --signature checksums.txt.sig \
  --certificate-oidc-issuer https://token.actions.githubusercontent.com \
  --certificate-identity-regexp "https://github.com/ppiankov/trustwatch/" \
  checksums.txt
```
## Exit Codes

| Code | Meaning |
|------|---------|
| 0 | No problems |
| 1 | Warnings (certs expiring within warn threshold) |
| 2 | Critical (certs expiring within crit threshold or expired) |
| 3 | Discovery or probe errors |
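Because the codes are stable, a CI step can branch on them directly; for example, to tolerate warnings but fail on criticals or scan errors (a sketch using the documented `--quiet` flag):

```sh
trustwatch now --quiet
case $? in
  0) echo "no cert problems" ;;
  1) echo "warnings only; not blocking" ;;
  2) echo "critical cert findings" >&2; exit 1 ;;
  3) echo "discovery or probe errors" >&2; exit 1 ;;
esac
```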
## What is trustwatch?

- Discovers trust surfaces that Kubernetes depends on (webhooks, apiservices, mesh issuers)
- Probes TLS endpoints and reads TLS Secrets for certificate expiry
- Annotation-driven: teams declare what matters with `trustwatch.dev/*` annotations
- Accepts external targets via ConfigMap (vault, IdP, databases, anything with TLS)
- TrustPolicy CRD for declarative policy rules (min key size, no SHA-1, required issuer, no self-signed)
- Multi-cluster federation: aggregate findings from remote trustwatch instances
- Historical snapshots via SQLite with trend API for UI sparklines
- cert-manager renewal health: detects stuck renewals, failed challenges, pending certificates
- SPIFFE/SPIRE trust bundle monitoring via workload API
- Cloud provider certs: AWS ACM, GCP Certificate Manager, Azure Key Vault (build-tagged)
- OpenTelemetry tracing with OTLP export
- Two modes: `now` (ad-hoc TUI) and `serve` (always-on web UI + Prometheus metrics)
- Deterministic, rule-based severity (no ML, no anomaly detection)
- Reports only problems — healthy surfaces stay quiet
## What trustwatch is NOT
- Not a port scanner — discovery is API-driven and annotation-driven
- Not a cert-manager replacement — it monitors, not manages
- Not a mesh leaf-cert alarm — ignores short-lived workload certs by design
- Not a compliance auditor — it reports operational risk, not regulatory posture
- Not a trust graph visualizer — it shows problems, not topology
## Discovery Sources

### Auto-Critical (always discovered)
| Source | What | Why Critical |
|--------|------|--------------|
| API server | `kubernetes.default.svc:443` | Everything depends on it |
| Admission webhooks | `failurePolicy: Fail` webhooks | Expiry bricks deployments |
| API aggregation | APIService backends | Expiry breaks APIs |
| Linkerd | Trust roots + issuer Secret | Expiry breaks mesh identity |
| Istio | CA/root/intermediate materials | Expiry breaks mesh identity |
| Gateway API | Gateway listener TLS certificate refs | Expiry breaks gateway routing |
| cert-manager | Certificate CR expiry via dynamic client | Expiry breaks managed certs |
| cert-manager renewal | Stuck CertificateRequest, failed Challenge, not-ready Certificate | Stalled renewals lead to silent expiry |
| SPIFFE/SPIRE | Trust bundle root CAs via workload API | Expiry breaks SPIFFE identity |
### Opt-In (annotation-driven)
Annotate any Service or Deployment:
```yaml
metadata:
  annotations:
    trustwatch.dev/enabled: "true"
    trustwatch.dev/severity: "critical"
    trustwatch.dev/ports: "443,8443"
    trustwatch.dev/sni: "api.internal"
```
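In context, a complete annotated Service might look like this (name, namespace, and port are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: payments-api       # illustrative
  namespace: payments      # illustrative
  annotations:
    trustwatch.dev/enabled: "true"
    trustwatch.dev/severity: "critical"
    trustwatch.dev/ports: "8443"
    trustwatch.dev/sni: "payments.internal"
spec:
  selector:
    app: payments-api
  ports:
    - port: 8443
      targetPort: 8443
```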
Declare external dependencies:
```yaml
metadata:
  annotations:
    trustwatch.dev/external-targets: |
      https://vault.internal:8200
      tcp://idp.company.com:443?sni=idp.company.com
```
### Cloud Provider Certs (build-tagged)
Cloud provider certificate discovery is available when built with the corresponding tags:
```sh
# Build with all cloud providers
make build-cloud   # or: go build -tags "aws,gcp,azure" ./cmd/trustwatch

# Build with specific providers
go build -tags aws ./cmd/trustwatch
```
| Provider | Build Tag | Source |
|----------|-----------|--------|
| AWS ACM | `aws` | Lists certificates via `acm:ListCertificates` |
| GCP Certificate Manager | `gcp` | Lists certificates via Certificate Manager API |
| Azure Key Vault | `azure` | Lists certificates via Key Vault API |
All providers use ambient authentication (IAM roles, workload identity, managed identity).
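On EKS, for example, ambient auth usually means IRSA, assuming trustwatch runs under the annotated ServiceAccount; a minimal sketch (role ARN illustrative; the IAM policy needs at least the documented `acm:ListCertificates`, and in practice ACM reads often also need `acm:DescribeCertificate`):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: trustwatch
  namespace: trustwatch
  annotations:
    # IRSA: bind an IAM role whose policy allows acm:ListCertificates
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/trustwatch-acm-read
```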
### ConfigMap Externals

```yaml
# trustwatch config
external:
  - url: "https://vault.company.internal:8200"
  - url: "tcp://10.0.8.10:9443?sni=api.internal"
```
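In `serve` mode this config typically ships as a ConfigMap mounted at the `--config` path shown in Quick Start; a sketch (the ConfigMap name and mount are assumptions; the Helm chart normally manages this for you):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: trustwatch-config   # assumed name; the Helm chart renders its own
  namespace: trustwatch
data:
  config.yaml: |
    external:
      - url: "https://vault.company.internal:8200"
      - url: "tcp://10.0.8.10:9443?sni=api.internal"
```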
## Modes

### `trustwatch now` — Ad-hoc TUI
Run from your laptop. Discovers trust surfaces, probes endpoints, displays problems in a terminal UI.

```sh
trustwatch now --context prod --warn-before 720h --crit-before 336h
```

By default, `now` shows a TUI when stdout is a terminal and a plain table when piped. Use `--output` to force a specific format, or `--quiet` for CI gates that only need the exit code:

```sh
# JSON output for automation
trustwatch now -o json

# Force table output even in a terminal
trustwatch now -o table

# CI gate: exit code only, no output
trustwatch now --quiet && echo "All certs OK"

# JSON piped to jq
trustwatch now -o json | jq '.snapshot.findings[] | select(.severity == "critical")'
```
| Flag | Short | Default | Description |
|------|-------|---------|-------------|
| `--output` | `-o` | (auto) | Output format: `json`, `table` (default: TUI if TTY, table if piped) |
| `--quiet` | `-q` | `false` | Suppress all output, exit code only |
### `--tunnel`: In-cluster DNS resolution

By default, `now` runs from your laptop and can't resolve in-cluster DNS names (e.g. `webhook-svc.ns.svc:443`). The `--tunnel` flag deploys a temporary SOCKS5 proxy pod inside the cluster and routes all probe traffic through it via port-forwarding:

```sh
trustwatch now --tunnel
```
Flags:
| Flag | Default | Description |
|------|---------|-------------|
| `--tunnel` | `false` | Enable the in-cluster SOCKS5 relay |
| `--tunnel-ns` | `default` | Namespace for the relay pod |
| `--tunnel-image` | `serjs/go-socks5-proxy:latest` | SOCKS5 proxy image |
| `--tunnel-pull-secret` | (empty) | imagePullSecret name for private registries |
| `--tunnel-command` | (empty) | Override container entrypoint (comma-separated) |
**Private registries / air-gapped clusters:**

If your cluster can't pull from Docker Hub, mirror the image and use `--tunnel-image`:

```sh
# Mirror with crane or skopeo
crane copy serjs/go-socks5-proxy:latest my-registry.io/socks5-proxy:latest
# or: skopeo copy docker://serjs/go-socks5-proxy:latest docker://my-registry.io/socks5-proxy:latest

trustwatch now --tunnel --tunnel-image my-registry.io/socks5-proxy:latest
```

If the registry requires authentication, create an imagePullSecret and pass it:

```sh
trustwatch now --tunnel \
  --tunnel-image my-registry.io/socks5-proxy:latest \
  --tunnel-pull-secret my-registry-creds
```
**Air-gapped clusters (self-relay):**

trustwatch includes a built-in SOCKS5 server. If the trustwatch image is already in your registry, use it as its own relay — no extra images needed:

```sh
trustwatch now --tunnel \
  --tunnel-image my-registry.io/trustwatch:v0.2.0 \
  --tunnel-command /trustwatch,socks5
```
**Custom SOCKS5 image:**

If you use a custom image that doesn't run a SOCKS5 server by default, use `--tunnel-command` to supply the entrypoint. The server must listen on port 1080:

```sh
trustwatch now --tunnel \
  --tunnel-image nicolaka/netshoot:latest \
  --tunnel-command microsocks,-p,1080
```
**Relay pod lifecycle:**

The relay pod is cleaned up automatically when trustwatch exits. A 5-minute `activeDeadlineSeconds` safety net ensures the pod is terminated even if trustwatch crashes or the connection drops.
## Multi-Cluster Federation

Aggregate findings from multiple trustwatch instances:

```sh
# Scan local cluster and two remote instances
trustwatch now --cluster-name prod \
  --remote staging=http://trustwatch.staging:8080 \
  --remote dev=http://trustwatch.dev:8080
```
In `serve` mode, configure remotes via the config file:

```yaml
clusterName: prod
remotes:
  - name: staging
    url: http://trustwatch.staging.svc:8080
  - name: dev
    url: http://trustwatch.dev.svc:8080
```

All findings are labeled with their cluster name, and the `cluster` label appears on Prometheus metrics.
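With that label in place, a single PromQL expression can span clusters; for example (a sketch against the metrics documented below):

```promql
# Certs across federated clusters expiring within 14 days
trustwatch_cert_expires_in_seconds{cluster=~"prod|staging|dev"} < 14 * 24 * 3600
```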
### `trustwatch serve` — In-Cluster Service

```sh
helm install trustwatch charts/trustwatch \
  --namespace trustwatch --create-namespace \
  --set image.repository=harbor.example.com/trustwatch/trustwatch
```

Override config values:

```sh
helm install trustwatch charts/trustwatch \
  --namespace trustwatch --create-namespace \
  --set config.warnBefore=360h \
  --set config.critBefore=168h
```

Enable Prometheus ServiceMonitor:

```sh
helm install trustwatch charts/trustwatch \
  --namespace trustwatch --create-namespace \
  --set serviceMonitor.enabled=true
```

Enable PrometheusRule alerts:

```sh
helm install trustwatch charts/trustwatch \
  --namespace trustwatch --create-namespace \
  --set prometheusRule.enabled=true
```

Or generate PrometheusRule YAML without Helm:

```sh
trustwatch rules --namespace monitoring --name trustwatch-alerts
```

Enable Grafana dashboard (auto-imported via sidecar):

```sh
helm install trustwatch charts/trustwatch \
  --namespace trustwatch --create-namespace \
  --set grafanaDashboard.enabled=true
```
`serve` exposes the web UI, Prometheus metrics, and a JSON API:

| Endpoint | Purpose |
|----------|---------|
| `/` | Problems web UI (filterable, with detail panels and trend sparklines) |
| `/metrics` | Prometheus scrape |
| `/healthz` | Liveness/readiness (503 if no scan or stale) |
| `/api/v1/snapshot` | JSON findings |
| `/api/v1/history` | Historical snapshot summaries (requires `--history-db`) |
| `/api/v1/trend` | Severity trend for a specific finding (requires `--history-db`) |
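To poke the JSON API by hand (service name and port assume the Helm defaults above, and the response shape is assumed to match the CLI's `.snapshot.findings[]` structure; verify against your deployment):

```sh
kubectl -n trustwatch port-forward svc/trustwatch 8080:8080 &
# Count critical findings in the latest snapshot
curl -s localhost:8080/api/v1/snapshot \
  | jq '[.snapshot.findings[] | select(.severity == "critical")] | length'
```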
## Prometheus Metrics

```
trustwatch_cert_not_after_timestamp{source, namespace, name, severity, cluster}
trustwatch_cert_expires_in_seconds{source, namespace, name, severity, cluster}
trustwatch_probe_success{source, namespace, name, cluster}
trustwatch_scan_duration_seconds
trustwatch_findings_total{severity}
trustwatch_discovery_errors_total{source}
trustwatch_chain_errors_total{source}
```
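For reference, a hand-rolled alert on these metrics might look like the sketch below (threshold and labels illustrative; `prometheusRule.enabled=true` or `trustwatch rules` generate maintained rules for you):

```yaml
groups:
  - name: trustwatch
    rules:
      - alert: TrustwatchCertExpiringSoon
        # Fires for certs within 14 days of expiry; already-expired certs go negative and also match
        expr: trustwatch_cert_expires_in_seconds < 14 * 24 * 3600
        for: 1h
        labels:
          severity: critical
        annotations:
          summary: '{{ $labels.source }}/{{ $labels.namespace }}/{{ $labels.name }} expires soon'
```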
## TrustPolicy CRD

Declarative policy rules via `trustwatch.dev/v1alpha1` TrustPolicy resources:

```sh
# Install the CRD
trustwatch apply

# List active policies
trustwatch policy
```
Example TrustPolicy:

```yaml
apiVersion: trustwatch.dev/v1alpha1
kind: TrustPolicy
metadata:
  name: production-standards
spec:
  targets:
    - kind: Namespace
      names: ["production", "kube-system"]
  thresholds:
    warnBefore: 720h
    critBefore: 336h
  rules:
    - type: minKeySize
      params:
        bits: "2048"
    - type: noSHA1
    - type: requiredIssuer
      params:
        issuer: "CN=My CA"
    - type: noSelfSigned
```

Policy violations appear as findings with source `policy` and finding type `POLICY_VIOLATION`.
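To pull just the policy findings out of a scan (a sketch; `.snapshot.findings[]` and the type value come from this README, but the exact JSON field carrying the finding type is an assumption to check against your output):

```sh
trustwatch now -o json \
  | jq '.snapshot.findings[] | select(.type == "POLICY_VIOLATION")'
```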
## Configuration

```yaml
listenAddr: ":8080"
metricsPath: "/metrics"
refreshEvery: "2m"
warnBefore: "720h"   # 30 days
critBefore: "336h"   # 14 days
namespaces: []       # empty = all
historyDB: ""        # path to SQLite DB (enables /api/v1/history, /api/v1/trend)
spiffeSocket: ""     # path to SPIFFE workload API socket
otelEndpoint: ""     # OTLP gRPC endpoint (e.g. localhost:4317)
clusterName: ""      # label for this cluster in federated views
external:
  - url: "https://vault.internal:8200"
remotes:             # remote trustwatch instances for federation
  - name: staging
    url: http://trustwatch.staging.svc:8080
notifications:
  enabled: false
  webhooks:
    - url: "https://hooks.slack.com/services/T/B/x"
      type: slack
    - url: "https://alerts.example.com/trustwatch"
      type: generic
  severities: ["critical", "warn"]
  cooldown: "1h"
```
## Architecture

```
trustwatch
├── Discovery (Kubernetes API)
│   ├── Webhooks (Validating + Mutating)
│   ├── APIService aggregation
│   ├── TLS Secrets
│   ├── Ingress TLS refs
│   ├── Linkerd identity (trust roots + issuer)
│   ├── Istio CA materials
│   ├── Gateway API TLS refs
│   ├── cert-manager Certificates + renewal health
│   ├── SPIFFE/SPIRE trust bundles
│   ├── Cloud providers (AWS ACM, GCP, Azure KV)
│   └── Annotations (trustwatch.dev/*)
├── Probing (TLS handshake)
│   ├── In-cluster endpoints
│   ├── External targets (ConfigMap)
│   └── SOCKS5 tunnel (--tunnel)
├── Policy Engine
│   ├── TrustPolicy CRD (trustwatch.dev/v1alpha1)
│   └── Rules: min key size, no SHA-1, required issuer, no self-signed
├── Federation
│   ├── Remote snapshot aggregation (--remote name=url)
│   └── Cluster labels on metrics and UI
├── Storage
│   ├── SQLite history (--history-db)
│   └── Trend API (/api/v1/trend)
├── Output
│   ├── TUI (now mode)
│   ├── Web UI (serve mode, filterable with detail panels + sparklines)
│   ├── Prometheus metrics
│   ├── JSON API
│   ├── Notifications (Slack, generic webhook)
│   └── OpenTelemetry traces (--otel-endpoint)
└── Severity
    ├── Critical: expired, webhook Fail, within crit threshold
    ├── Warn: within warn threshold, webhook Ignore (capped¹), insecureSkipTLSVerify
    └── Info: inventory (metrics only)
```

¹ Webhooks with `failurePolicy=Ignore` are capped at Warn because they do not block admission.
## Security Model

### RBAC Requirements

trustwatch needs read-only cluster-wide access. The Helm chart creates a ClusterRole with these rules:
| API Group | Resources | Verbs |
|-----------|-----------|-------|
| `""` (core) | secrets, services, configmaps, namespaces | list, watch |
| `admissionregistration.k8s.io` | validatingwebhookconfigurations, mutatingwebhookconfigurations | list, watch |
| `apiregistration.k8s.io` | apiservices | list, watch |
| `apiextensions.k8s.io` | customresourcedefinitions | get, create, update |
| `apps` | deployments | list, watch |
| `networking.k8s.io` | ingresses | list, watch |
| `gateway.networking.k8s.io` | gateways | list, watch |
| `cert-manager.io` | certificates, certificaterequests, challenges | list, watch |
| `trustwatch.dev` | trustpolicies | list, watch |
| `authorization.k8s.io` | selfsubjectaccessreviews | create |
When `--namespace` is used, trustwatch probes its own permissions via SelfSubjectAccessReview and silently skips namespaces where it lacks access. This allows namespace-scoped RBAC without 403 errors in the output.
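A minimal namespace-scoped grant might look like the sketch below (ServiceAccount name, namespace, and the trimmed-down resource list are assumptions; add more of the API groups above depending on which discovery sources you need):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: trustwatch-reader
  namespace: payments        # illustrative
rules:
  - apiGroups: [""]
    resources: ["secrets", "services", "configmaps"]
    verbs: ["list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: trustwatch-reader
  namespace: payments
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: trustwatch-reader
subjects:
  - kind: ServiceAccount
    name: trustwatch          # illustrative
    namespace: trustwatch
```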
### Secret Access

trustwatch reads `kubernetes.io/tls` Secrets to extract certificate expiry dates. It reads the `tls.crt` PEM data only — private keys (`tls.key`) are never accessed, logged, or stored. If you want to avoid Secret access entirely, remove the `secrets` permission and trustwatch will fall back to probe-only mode (TLS handshake) for all endpoints.
### External Targets

External targets are configured via a ConfigMap (in `serve` mode) or CLI config file (in `now` mode). They contain hostnames and ports, never credentials. If your external targets require authentication context, use annotations on Services instead.
### Data Retention

- `now` mode: the snapshot exists only in memory for the duration of the TUI session. Nothing is written to disk unless `--history-db` is set.
- `serve` mode: the latest snapshot is held in memory and served via `/api/v1/snapshot`. When `--history-db` is configured, snapshots are persisted to a local SQLite database for trend analysis.
- No PII: trustwatch stores certificate metadata (subject, issuer, SANs, serial, expiry). It does not store certificate private keys, request bodies, or user data.
## Stability

- Metric names (`trustwatch_*`) may change before v1.0
- Annotation keys (`trustwatch.dev/*`) are stable from v0.2+
- Exit codes (0/1/2/3) are stable from v0.1+
- JSON API (`/api/v1/snapshot`) schema may gain fields but will not remove them before v1.0
## Known Limitations

- Does not detect certs served via Envoy SDS that aren't backed by Kubernetes Secrets
- Cannot probe endpoints blocked by NetworkPolicy from trustwatch's namespace
- Mesh leaf/workload certs (24h default) are intentionally ignored to avoid noise
- Cloud provider discovery requires build tags (`aws`, `gcp`, `azure`) — not included in the default binary
- SPIFFE discovery requires a reachable workload API socket
- Historical storage uses local SQLite — not suitable for HA deployments with multiple replicas
- Requires RBAC read access to secrets, webhooks, apiservices, ingresses, services, gateways, certificates, trustpolicies
- `--tunnel` mode may log `connection reset by peer` errors from the Kubernetes port-forward layer — these are cosmetic, caused by unreachable probe targets closing the SOCKS5 connection; probe results are unaffected
## Roadmap

- `now` mode with BubbleTea TUI
- `serve` mode with web UI + Prometheus metrics
- Webhook + APIService auto-discovery
- TLS Secret parsing
- Ingress TLS discovery
- Linkerd issuer/trust-roots discovery
- Istio CA material discovery
- Annotation-based target discovery
- External targets from config
- `--tunnel` SOCKS5 relay for laptop-to-cluster probing
- Helm chart
- Structured logging (`--log-level`, `--log-format`)
- JSON/table output formats (`--output json|table`, `--quiet`)
- Gateway API TLS discovery
- Namespace-scoped RBAC with access probing
- Grafana dashboard (Helm chart)
- `rules` command (generate PrometheusRule YAML)
- cert-manager Certificate CR discovery
- Webhook and Slack notifications
- Certificate chain validation (broken chains, wrong SANs, self-signed leaves)
- Signed container images + SBOM attestation (Cosign/Sigstore)
- TrustPolicy CRD with policy rules engine
- cert-manager renewal health monitoring
- Historical snapshot storage (SQLite) with trend API
- SPIFFE/SPIRE trust bundle discovery
- Cloud provider certs (AWS ACM, GCP, Azure Key Vault)
- OpenTelemetry tracing
- Multi-cluster federation
- Web UI filtering, detail panels, and trend sparklines
## License
MIT — see LICENSE for details.