Audit Stability
Lighthouse scores vary between runs even on the same page. Some variance is inherent (page nondeterminism, network jitter) and unavoidable. The largest avoidable source is the test machine itself. Underpowered hardware, or hardware doing other work (background apps, browser extensions, other Lighthouse runs) while the audit is in progress, produces unstable scores. This is why running Lighthouse on a developer laptop or a shared CI runner is unreliable.
Dedicated infrastructure
xcelera audits run on dedicated, high-performance machines with no other workloads. We benchmarked several hosting providers and chose one with consistent run-to-run results, and we never share or reuse audit machines across organizations.
CPU throttling
Lighthouse simulates a slower device by applying a CPU throttle multiplier. Because the multiplier is relative to the host's own speed, the same setting simulates a different device on different hardware.
We benchmark each audit machine and compute a throttle multiplier that targets a fixed simulated device. The simulated profile stays consistent regardless of which machine picks up the audit.
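As a rough sketch, that calibration can scale the multiplier by the ratio of the host's benchmark score to a fixed reference score. The constants, scores, and function name below are illustrative assumptions, not xcelera's actual values; `settings.throttling.cpuSlowdownMultiplier` is the standard Lighthouse config option the result would feed into:

```ts
// Illustrative calibration: scale the base multiplier by how much faster
// this host is than the reference machine the base value was tuned for.
// REFERENCE_BENCHMARK and the host score are hypothetical numbers.
const REFERENCE_BENCHMARK = 1500; // benchmark score of the reference machine
const BASE_MULTIPLIER = 4;        // Lighthouse's default mobile CPU slowdown

function calibratedMultiplier(hostBenchmark: number): number {
  // A faster host needs a proportionally larger slowdown to land on the
  // same simulated device profile.
  return BASE_MULTIPLIER * (hostBenchmark / REFERENCE_BENCHMARK);
}

// Passed to Lighthouse as settings.throttling.cpuSlowdownMultiplier.
const cpuSlowdownMultiplier = calibratedMultiplier(2100); // => 5.6
```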
Multi-run aggregation
Even on stable hardware, Lighthouse has some variance from network timing, GC pauses, and JIT warmup. Each audit performs at least three runs, and we apply statistical analysis across them to exclude outliers and pick a representative result. If the initial runs are not consistent, we may trigger additional runs.
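The sketch below shows one way such aggregation can work, using the median and the median absolute deviation (MAD) as a robust spread estimate. The thresholds and the exact statistics are illustrative assumptions, not xcelera's actual implementation:

```ts
// Sketch of median-based aggregation with outlier exclusion, assuming each
// run yields a single numeric performance score.

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

function aggregateRuns(scores: number[]): { score: number; needsMoreRuns: boolean } {
  const m = median(scores);
  // Median absolute deviation: robust against a single anomalous run.
  const mad = median(scores.map((s) => Math.abs(s - m)));
  // Drop runs more than 3 MADs from the median as outliers.
  const kept = scores.filter((s) => mad === 0 || Math.abs(s - m) <= 3 * mad);
  // If the remaining spread is still large, signal that more runs are needed
  // (threshold in score points, purely illustrative).
  return { score: median(kept), needsMoreRuns: mad > 2 };
}

// Example: three runs, one clearly anomalous.
aggregateRuns([92, 91, 78]); // => { score: 91.5, needsMoreRuns: false }
```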
See Concepts for how runs and audits relate.
Monitoring
We track baseline performance, network conditions, and run variance on every audit machine. If a machine stops passing its expected benchmarks, we discard it and provision a new one.
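A minimal sketch of that health check, assuming hypothetical metric names and thresholds:

```ts
// Per-machine health check over the three tracked signals. The field names
// and limits are hypothetical, not xcelera's real thresholds.

interface MachineBenchmarks {
  benchmarkIndex: number; // CPU benchmark score (higher is faster)
  networkRttMs: number;   // round-trip time to a reference host
  runVariance: number;    // score spread across calibration runs
}

const LIMITS = {
  minBenchmarkIndex: 1000, // machine must be at least this fast
  maxNetworkRttMs: 50,     // with a stable, low-latency network
  maxRunVariance: 2,       // and tightly clustered calibration scores
};

// Returns true when the machine meets every benchmark; otherwise it is
// discarded and a replacement is provisioned.
function machineIsHealthy(b: MachineBenchmarks): boolean {
  return (
    b.benchmarkIndex >= LIMITS.minBenchmarkIndex &&
    b.networkRttMs <= LIMITS.maxNetworkRttMs &&
    b.runVariance <= LIMITS.maxRunVariance
  );
}
```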