Documentation ¶
Overview ¶
Package lighthouse implements the sampler, scheduler, and runner used to capture Lighthouse performance audits for a small subset of pages during each crawl. See docs/plans/lighthouse-performance-reports.md for the full design.
Constants ¶
This section is empty.
Variables ¶
var ErrMemoryShed = errors.New("lighthouse runner shed audit due to low memory")
ErrMemoryShed is returned by LocalRunner.Run when free memory is below LIGHTHOUSE_MEMORY_SHED_THRESHOLD_MB. The consumer treats this like a shutdown cancellation: leave the lighthouse_runs row in 'running' so XAUTOCLAIM redelivers it once memory recovers, and skip the XAck so the Redis stream entry survives.
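A minimal sketch of the consumer-side handling described above; markRunFailed, recordRunResult, and ackStreamEntry are illustrative stand-ins for the real consumer plumbing, not part of this package:

    func handleAudit(ctx context.Context, runner lighthouse.Runner, req lighthouse.AuditRequest) error {
        res, err := runner.Run(ctx, req)
        switch {
        case errors.Is(err, lighthouse.ErrMemoryShed):
            // Leave the lighthouse_runs row in 'running' and skip the XAck so the
            // Redis stream entry survives; XAUTOCLAIM redelivers it once memory recovers.
            return nil
        case err != nil:
            markRunFailed(ctx, req.RunID, err) // assumed helper: flips the row to 'failed'
            return ackStreamEntry(ctx)         // assumed helper: XAck so the entry is not replayed
        default:
            recordRunResult(ctx, req.RunID, res) // assumed helper: persists metrics and ReportKey
            return ackStreamEntry(ctx)
        }
    }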
var ErrRunnerNotImplemented = errors.New("lighthouse local runner not implemented yet")
ErrRunnerNotImplemented is returned by the local runner shim until Phase 3 wires Chromium into the analysis-app image. Keeping the error here means the scheduler and DB layer can be exercised end-to-end with the stub before any Chromium work lands.
Functions ¶
func PerBand ¶
PerBand returns the number of audits to schedule per extreme band for a job with completedPages successful tasks so far. Floored at 1 (so even a 1-page site gets one audit per band) and capped at 15 (so a 10,000-page crawl never queues more than 30 audits per job).
Using math.Floor (rather than math.Round) keeps the curve from overshooting the cap at exactly 10,000 pages and matches the published anchor points exactly.
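The exact curve lives in the implementation and the design doc; purely as an illustration of the floor/cap behaviour described above, a floored logarithmic ramp that happens to hit the stated anchors (1 per band for a 1-page site, 15 at 10,000 pages) could look like this sketch. The 3.75 coefficient is an assumption, not the package's published formula:

    func perBandIllustration(completedPages int) int {
        // Hypothetical curve, NOT the package's actual formula: floor a logarithmic
        // ramp, then clamp to the documented floor (1) and cap (15).
        if completedPages < 1 {
            completedPages = 1
        }
        n := int(math.Floor(3.75 * math.Log10(float64(completedPages))))
        switch {
        case n < 1:
            return 1
        case n > 15:
            return 15
        default:
            return n
        }
    }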
func SanitiseAuditURL ¶
SanitiseAuditURL strips query strings and fragments before logging. Lighthouse audit URLs come from customer crawls and can carry session tokens, signed-link tokens, or other low-entropy PII in the query string; the runner does not need them in central logs. Exported so the analysis service (cmd/analysis) can apply the same rule to its own info-level logs without redefining the helper.
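A sketch of the sanitisation rule described above, assuming the input parses as a URL; the package's own implementation may handle edge cases differently:

    func sanitiseForLogs(raw string) string {
        u, err := url.Parse(raw)
        if err != nil {
            // In this sketch unparseable input is returned untouched; the real
            // helper may choose to redact it instead.
            return raw
        }
        u.RawQuery = ""
        u.Fragment = ""
        return u.String()
    }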
Types ¶
type AuditRequest ¶
type AuditRequest struct {
    RunID        int64
    JobID        string
    PageID       int
    SourceTaskID string
    URL          string
    Profile      Profile
    Timeout      time.Duration
}
AuditRequest is the input handed to a Runner. The scheduler builds these from a lighthouse_runs row plus the matching tasks/pages metadata. Timeout is the per-run budget; runners must respect it.
SourceTaskID is the lighthouse_runs.source_task_id (empty when the FK was NULLed via ON DELETE SET NULL). The local runner uses it to co-locate the report with the matching crawl artefact under jobs/{JobID}/tasks/{SourceTaskID}/. When it is empty, the runner falls back to a run-id-keyed path so a deleted parent task doesn't lose the report.
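A sketch of the co-location rule above; only the jobs/{JobID}/tasks/{SourceTaskID}/ prefix comes from this documentation, and the run-id-keyed fallback format shown here is an assumption:

    func reportKeyPrefix(req lighthouse.AuditRequest) string {
        if req.SourceTaskID != "" {
            return fmt.Sprintf("jobs/%s/tasks/%s/", req.JobID, req.SourceTaskID)
        }
        // Parent task deleted (FK set NULL): fall back to a run-id-keyed path
        // (illustrative format) so the report is still stored somewhere findable.
        return fmt.Sprintf("jobs/%s/lighthouse-runs/%d/", req.JobID, req.RunID)
    }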
type AuditResult ¶
type AuditResult struct {
    PerformanceScore *int
    LCPMs            *int
    CLS              *float64
    INPMs            *int
    TBTMs            *int
    FCPMs            *int
    SpeedIndexMs     *int
    TTFBMs           *int
    TotalByteWeight  *int64
    ReportKey        string
    Duration         time.Duration
}
AuditResult is the output of a successful audit. ReportKey is the R2 object key that the runner uploaded the full Lighthouse JSON to; empty for the stub runner since it doesn't write to R2 in Phase 1.
Optional metric fields are pointers so we can distinguish "not produced" from "produced as zero" — Lighthouse occasionally omits metrics on pages it can't audit cleanly.
func ParseReport ¶
func ParseReport(raw []byte) (AuditResult, error)
ParseReport turns a Lighthouse JSON report into an AuditResult. The Duration field is left zero — the runner stamps wall time from the outside since the report itself doesn't capture exec wrapper cost.
ReportKey is left empty here for the same reason: it's the R2 key, which only the runner that just uploaded the report can know.
Returns an error only on malformed JSON; a report that is well-formed but missing a given audit produces a nil pointer in that field, which preserves the "not measured" semantics of the lighthouse_runs columns.
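A short usage sketch showing the nil-pointer semantics; the logging and the summarise wrapper are illustrative only:

    func summarise(raw []byte) error {
        res, err := lighthouse.ParseReport(raw)
        if err != nil {
            return fmt.Errorf("malformed lighthouse report: %w", err)
        }
        if res.PerformanceScore != nil {
            log.Printf("performance score: %d", *res.PerformanceScore)
        } else {
            log.Print("performance score not produced for this page")
        }
        // Duration and ReportKey stay zero/empty here; the runner stamps both.
        return nil
    }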
type CompletedTask ¶
CompletedTask is the input shape the sampler needs from a completed crawl task. ResponseTime is the HTTP response time in milliseconds (tasks.response_time, BIGINT). The sampler does not need anything else from the row.
type LocalRunner ¶
type LocalRunner struct {
    // contains filtered or unexported fields
}
LocalRunner shells out to the bundled lighthouse CLI to perform real audits. It implements the Runner interface and is, as of Phase 3, the only alternative to StubRunner.
func NewLocalRunner ¶
func NewLocalRunner(cfg LocalRunnerConfig) (*LocalRunner, error)
NewLocalRunner constructs a LocalRunner with the supplied config. Returns an error if the config is missing pieces the runner needs to operate (binary paths, bucket, archive provider) — better to fail at boot than to discover the missing config on the first audit.
func (*LocalRunner) Run ¶
func (l *LocalRunner) Run(ctx context.Context, req AuditRequest) (AuditResult, error)
Run executes a single Lighthouse audit, parses the JSON result, uploads the gzipped raw report to R2, and returns the populated AuditResult. Honours req.Timeout via context wrapping; the caller's XAck/Fail flow on the consumer side decides what to do with errors.
Retry policy: at most one retry on a transient Chromium failure, recognised via stderr substring match. After that, the most recent error (with stderr tail) is propagated up.
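A sketch of the retry shape described above; runOnce and the stderr substrings are assumptions standing in for the real matcher inside LocalRunner:

    func runWithOneRetry(ctx context.Context, req lighthouse.AuditRequest) (lighthouse.AuditResult, error) {
        res, stderr, err := runOnce(ctx, req) // assumed helper: a single CLI invocation
        if err != nil && isTransientChromiumFailure(stderr) {
            res, stderr, err = runOnce(ctx, req) // at most one retry
        }
        if err != nil {
            return res, fmt.Errorf("lighthouse audit failed: %w (stderr tail: %s)", err, stderr)
        }
        return res, nil
    }

    func isTransientChromiumFailure(stderr string) bool {
        // Illustrative substrings only; the real list lives in LocalRunner.
        return strings.Contains(stderr, "Unable to connect to Chrome") ||
            strings.Contains(stderr, "Target closed")
    }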
type LocalRunnerConfig ¶
type LocalRunnerConfig struct {
    LighthouseBin string
    ChromiumBin   string
    Provider      archive.ColdStorageProvider
    Bucket        string
    MemoryShedMB  int
    ProfilePreset Profile // defaults to ProfileMobile if empty
}
LocalRunnerConfig captures the bits the local runner needs from the analysis service config. Kept separate from cmd/analysis so the runner is testable without the whole consumer scaffolding.
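A boot-time construction sketch; the LIGHTHOUSE_BIN, CHROMIUM_BIN, and LIGHTHOUSE_R2_BUCKET environment variable names and the coldStorage provider variable are assumptions, while LIGHTHOUSE_MEMORY_SHED_THRESHOLD_MB is the variable named in ErrMemoryShed's documentation:

    cfg := lighthouse.LocalRunnerConfig{
        LighthouseBin: os.Getenv("LIGHTHOUSE_BIN"),       // assumed env var name
        ChromiumBin:   os.Getenv("CHROMIUM_BIN"),         // assumed env var name
        Provider:      coldStorage,                       // archive.ColdStorageProvider wired during bootstrap
        Bucket:        os.Getenv("LIGHTHOUSE_R2_BUCKET"), // assumed env var name
        MemoryShedMB:  shedThresholdMB,                   // parsed from LIGHTHOUSE_MEMORY_SHED_THRESHOLD_MB
        // ProfilePreset left empty: defaults to ProfileMobile.
    }
    runner, err := lighthouse.NewLocalRunner(cfg)
    if err != nil {
        log.Fatalf("lighthouse runner misconfigured: %v", err) // fail at boot, not on the first audit
    }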
type Profile ¶
type Profile string
Profile selects between Lighthouse's mobile and desktop presets. v1 only schedules mobile audits; desktop is reserved for Phase 5.
type Runner ¶
type Runner interface {
    Run(ctx context.Context, req AuditRequest) (AuditResult, error)
}
Runner executes a single Lighthouse audit. The Phase 1 stub implementation returns canned data so the rest of the pipeline can be exercised before Chromium lands. Phase 3 adds LocalRunner, which shells out to the bundled lighthouse binary.
type Sample ¶
type Sample struct {
    Task CompletedTask
    Band SelectionBand
}
Sample is one task selected by the sampler, tagged with the band it came from. The scheduler turns each Sample into a lighthouse_runs row plus an outbox entry.
func SelectSamples ¶
func SelectSamples(completed []CompletedTask, milestone int, alreadySampled map[int]SelectionBand) []Sample
SelectSamples picks fastest and slowest tasks from completed, enforcing a global per-band cap of PerBand(len(completed)) across the lifetime of a job. alreadySampled maps every page_id already queued for the job (any band) to the band it was scheduled under; the function uses it both to dedupe (a page never appears twice) and to count existing fastest/slowest rows so each milestone only tops up the band quotas — it never re-spends them.
BandReconcile rows in alreadySampled count toward dedupe but not toward fastest/slowest quotas: at milestone 100 the scheduler retags whatever the sampler picks as reconcile, so by construction reconcile rows shouldn't exist before this call. If they do (e.g. a duplicate milestone-100 fire) the dedupe still keeps page IDs disjoint while the quota math correctly treats the existing rows as already-spent budget.
The fastest and slowest output sets are guaranteed disjoint: when fewer than fastestNeeded+slowestNeeded candidates remain, fastest takes priority and slowest fills from what's left.
Order in the returned slice is fastest band first (ascending response_time), then slowest band (descending response_time). The scheduler treats this slice as the per-milestone work list.
The function is pure — it does not touch the database or the network — so it is straightforward to unit test.
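A unit-test sketch that leans on that purity; the CompletedTask field names used here (PageID, ResponseTime) are assumed from the surrounding documentation rather than taken from the struct definition:

    func TestSelectSamplesNeverRepicksAPage(t *testing.T) {
        completed := []lighthouse.CompletedTask{
            {PageID: 1, ResponseTime: 120},
            {PageID: 2, ResponseTime: 4500},
            {PageID: 3, ResponseTime: 90},
        }
        already := map[int]lighthouse.SelectionBand{3: lighthouse.BandFastest}

        samples := lighthouse.SelectSamples(completed, 10, already)

        for _, s := range samples {
            if _, dup := already[s.Task.PageID]; dup {
                t.Fatalf("page %d was sampled twice", s.Task.PageID)
            }
        }
    }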
type Scheduler ¶
type Scheduler struct {
    // contains filtered or unexported fields
}
Scheduler turns crawl progress into pending lighthouse_runs rows and matching task_outbox entries. One scheduler is bound to a single SchedulerDB / TxRunner pair; the JobManager invokes OnMilestone via the OnProgressMilestone callback whenever a flush observes a 10% boundary crossing.
func NewScheduler ¶
func NewScheduler(database SchedulerDB, queue TxRunner) *Scheduler
NewScheduler constructs a Scheduler. Callers are expected to wire the returned Scheduler.OnMilestone into JobManager.OnProgressMilestone during bootstrap.
func (*Scheduler) OnMilestone ¶
OnMilestone runs the sampler at the given milestone (0..100) and enqueues any newly chosen audits. Safe to call from the milestone callback path: errors are returned for the caller to log, but the caller is expected to never let a scheduler failure block the batch loop.
At milestone == 100, samples are tagged as the 'reconcile' band so the analytics layer can distinguish opportunistic per-decade picks from the catch-up pass at job completion.
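A bootstrap wiring sketch; the OnMilestone and OnProgressMilestone signatures shown here are assumptions, and only the direction of the wiring comes from the documentation above:

    sched := lighthouse.NewScheduler(database, queue)
    jobManager.OnProgressMilestone = func(ctx context.Context, jobID string, milestone int) {
        if err := sched.OnMilestone(ctx, jobID, milestone); err != nil {
            // Log and move on: a scheduler failure must never block the batch loop.
            log.Printf("lighthouse scheduler: job %s milestone %d: %v", jobID, milestone, err)
        }
    }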
type SchedulerDB ¶
type SchedulerDB interface {
    GetCompletedTasksForLighthouseSampling(ctx context.Context, jobID string) ([]db.CompletedTaskForSampling, error)
    GetLighthouseRunPageBands(ctx context.Context, jobID string) (map[int]db.LighthouseSelectionBand, error)
}
SchedulerDB is the narrow subset of *db.DB the scheduler needs. Kept as an interface so unit tests can inject a fake without standing up a full Postgres pool.
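A test fake along those lines; the canned fields are whatever a given test needs:

    type fakeSchedulerDB struct {
        tasks []db.CompletedTaskForSampling
        bands map[int]db.LighthouseSelectionBand
    }

    func (f *fakeSchedulerDB) GetCompletedTasksForLighthouseSampling(ctx context.Context, jobID string) ([]db.CompletedTaskForSampling, error) {
        return f.tasks, nil
    }

    func (f *fakeSchedulerDB) GetLighthouseRunPageBands(ctx context.Context, jobID string) (map[int]db.LighthouseSelectionBand, error) {
        return f.bands, nil
    }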
type SelectionBand ¶
type SelectionBand string
SelectionBand identifies which extreme of the response-time distribution a sampled task came from. The reconcile band is reserved for the 100% pass run by the scheduler at job completion.
const (
    BandFastest   SelectionBand = "fastest"
    BandSlowest   SelectionBand = "slowest"
    BandReconcile SelectionBand = "reconcile"
)
type StubRunner ¶
type StubRunner struct{}
StubRunner is a deterministic Runner that returns canned metrics without launching Chromium. Used for local development, CI, and integration tests that drive a synthetic job through the full schedule → enqueue → record pipeline.
Metric values are fixed so test assertions stay simple. Tests that need varied data should construct a custom Runner rather than extending this one.
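One way to build such a custom Runner is a function adapter, sketched here with a synthetic-failure example:

    type runnerFunc func(ctx context.Context, req lighthouse.AuditRequest) (lighthouse.AuditResult, error)

    func (f runnerFunc) Run(ctx context.Context, req lighthouse.AuditRequest) (lighthouse.AuditResult, error) {
        return f(ctx, req)
    }

    // A runner that fails every audit, useful for exercising the failure path.
    var alwaysFails lighthouse.Runner = runnerFunc(func(ctx context.Context, req lighthouse.AuditRequest) (lighthouse.AuditResult, error) {
        return lighthouse.AuditResult{}, fmt.Errorf("synthetic failure for page %d", req.PageID)
    })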
func NewStubRunner ¶
func NewStubRunner() *StubRunner
NewStubRunner returns a StubRunner ready for use.
func (*StubRunner) Run ¶
func (s *StubRunner) Run(ctx context.Context, req AuditRequest) (AuditResult, error)
Run honours ctx cancellation and req.Timeout but otherwise sleeps briefly to approximate the cost of an audit, then returns a canned result.
type TxRunner ¶
type TxRunner interface {
    ExecuteWithContext(ctx context.Context, fn func(ctx context.Context, tx *sql.Tx) error) error
}
TxRunner runs the supplied function inside a Postgres transaction. Mirrors db.QueueExecutor.ExecuteWithContext so the scheduler can reuse the same retry / pool semantics the rest of the codebase already exercises.