# Browser Analysis
How ModPageSpeed 2.0 uses headless Chrome to extract critical CSS, detect LCP, and validate optimizations.
ModPageSpeed 2.0 can use headless Chrome to analyze pages with real browser rendering instead of relying solely on heuristics. Browser analysis extracts critical CSS from actual CSS Coverage data, detects the true Largest Contentful Paint element, measures image dimensions, and validates that optimizations do not cause visual regressions.
Browser analysis is strictly additive. Every failure falls back to the heuristic path. Pages still get optimized — they just use the faster, less precise heuristic pipeline instead.
## Enabling Browser Analysis

Browser analysis is off by default. Enable it with the `--enable-browser-analysis` flag and ensure Chrome (or chrome-headless-shell) is available in the container:
```shell
factory_worker \
  --cache-path /data/cache.vol \
  --enable-browser-analysis \
  --chrome-binary /usr/bin/chrome-headless-shell
```
The Docker release images (`modpagespeed/worker`) ship with Chromium pre-installed; no additional setup is required.
## Architecture

```text
Worker (libuv event loop)
 |
 +-- BrowserAnalysisManager
      |
      +-- AnalysisQueue            -- bounded priority queue with dedup
      |
      +-- ChromeProcess            -- spawn/recycle/RSS monitor
      |    |
      |    +-- CdpClient           -- JSON-RPC over pipe (FD 3/4)
      |
      +-- BrowserCssExtractor      -- CSS Coverage API -> critical CSS
      +-- PageAnalyzer             -- LCP, fold, CLS, image dims
      +-- UnusedCssRemover         -- dead rule removal
      +-- VisualRegressionGate     -- PNG pixel diff validation
      +-- FontGlyphScanner         -- code point scanning + @font-face
      +-- ScriptCoverageAnalyzer   -- Profiler coverage + deferral
```
`BrowserAnalysisManager` owns the Chrome lifecycle, the analysis queue, and the CDP pipeline. It runs on the main libuv event loop (where CDP must operate). Worker thread pool threads enqueue analysis requests via `uv_async_send()`.
### CDP Pipe Transport

Chrome DevTools Protocol communication happens over `--remote-debugging-pipe` (file descriptors 3 and 4), not over a WebSocket. Messages are null-byte-delimited JSON-RPC. This avoids the overhead and port management of the WebSocket debugging protocol.
Design decisions:

- Per-command `uv_timer_t` timeout (default 30s)
- Large CDP messages (>64KB) are parsed off the event loop via `uv_queue_work()`
- `CancelAll()` on pipe EOF resolves all pending callbacks
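The null-delimited framing is simple enough to sketch. The following Python sketch is illustrative only (the worker is C++; `frame_command` and `PipeReader` are made-up names): it shows how a command is serialized for the write pipe and how complete messages are reassembled from arbitrary read chunks.

```python
import json

def frame_command(msg_id, method, params=None):
    """Serialize one CDP command for the pipe transport: JSON plus a trailing NUL."""
    msg = {"id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg).encode("utf-8") + b"\0"

class PipeReader:
    """Reassembles NUL-delimited CDP messages from arbitrary read() chunks."""
    def __init__(self):
        self._buf = b""

    def feed(self, chunk):
        """Return every complete message received so far; keep the partial tail."""
        self._buf += chunk
        *frames, self._buf = self._buf.split(b"\0")
        return [json.loads(f) for f in frames if f]
```

Because `feed()` buffers the partial tail, the caller can pass it whatever the pipe read returns, without worrying about message boundaries.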
## How It Works

1. The worker thread runs `HtmlScanner::Scan()` to extract page structure.
2. `TemplateDetector::HashStructure()` computes an FNV-1a hash of the DOM structure, identifying the page template.
3. `LookupProfile()` checks the cache for an existing `OptimizationProfile` for this template hash.
4. Profile found: browser-validated critical CSS and LCP data are used instead of heuristics.
5. No profile: `EnqueueAnalysis()` sends the request to the main event loop via `uv_async_send()`.
6. `DrainQueue()` dequeues items and runs the analysis pipeline across three viewports: Mobile (375x667), Tablet (768x1024), and Desktop (1440x900).
7. The resulting `OptimizationProfile` is stored in the cache with `SentinelId::kBrowserProfile`.
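The template hash in step 2 is standard 64-bit FNV-1a. A minimal sketch: `fnv1a_64` is the textbook algorithm, while `hash_structure` and its tag-joining scheme are assumptions, since only the hash function itself is stated above.

```python
FNV_OFFSET_BASIS = 0xcbf29ce484222325  # 64-bit FNV offset basis
FNV_PRIME = 0x100000001b3              # 64-bit FNV prime
MASK64 = (1 << 64) - 1

def fnv1a_64(data):
    """Standard 64-bit FNV-1a: XOR in each byte, then multiply by the prime."""
    h = FNV_OFFSET_BASIS
    for byte in data:
        h = ((h ^ byte) * FNV_PRIME) & MASK64
    return h

def hash_structure(tag_path):
    """Hash a flattened DOM tag sequence (illustrative stand-in for
    TemplateDetector::HashStructure(); the '/' join is an assumption)."""
    return fnv1a_64("/".join(tag_path).encode("utf-8"))
```

Two pages with the same DOM shape hash to the same template, so one browser analysis run covers every URL rendered from that template.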
## CSS Cache Inlining

Before passing HTML to Chrome, the worker resolves `<link rel="stylesheet">` tags against the Cyclone cache and injects `<style>` blocks into the HTML. This enables Chrome’s CSS Coverage API to compute real coverage percentages instead of returning 0% for external stylesheets.
Guards prevent abuse: 50 stylesheet cap, 2MB per-stylesheet cap, 10MB total HTML cap.
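The guard logic reduces to a selection pass over the page's stylesheet hrefs. A sketch under stated assumptions: `select_inlinable` is an illustrative name, and the real worker rewrites HTML rather than returning a dict.

```python
MAX_SHEETS = 50                    # stylesheet cap
MAX_SHEET_BYTES = 2 * 1024 * 1024  # per-stylesheet cap
MAX_HTML_BYTES = 10 * 1024 * 1024  # total HTML cap after injection

def select_inlinable(hrefs, cache, html_len):
    """Pick cached stylesheets that fit under all three guards."""
    chosen = {}
    total = html_len
    for href in hrefs[:MAX_SHEETS]:
        css = cache.get(href)
        if css is None or len(css) > MAX_SHEET_BYTES:
            continue  # uncached or oversized: leave the <link> untouched
        if total + len(css) > MAX_HTML_BYTES:
            break     # injecting more would exceed the total HTML cap
        chosen[href] = css
        total += len(css)
    return chosen
```

Anything that fails a guard simply stays as an external `<link>`, which Chrome then reports at 0% coverage as before.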
## Analysis Components
| Component | Purpose |
|---|---|
| `BrowserCssExtractor` | Uses Chrome’s CSS Coverage API to identify which CSS rules are actually used on each viewport. Produces per-viewport critical CSS. |
| `PageAnalyzer` | Detects the real LCP element, measures fold position, computes CLS, and reads rendered image dimensions. |
| `UnusedCssRemover` | Takes Coverage data and removes dead rules from stylesheets. |
| `VisualRegressionGate` | Captures before/after screenshots and compares them pixel-by-pixel. Blocks optimizations that cause visible regressions. |
| `FontGlyphScanner` | Scans the DOM with `TreeWalker` for code points used on the page. Maps them to `@font-face` declarations for future subsetting. |
| `ScriptCoverageAnalyzer` | Uses Chrome’s Profiler domain to measure JS code coverage. Identifies scripts safe to defer. |
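The visual regression check amounts to counting differing pixels between equal-size before/after frames. A minimal Python sketch; the 0.1% threshold and the function names are assumptions, not the gate's actual values:

```python
def diff_ratio(before, after):
    """Fraction of pixels that differ between two equal-size frames.
    Frames are flat sequences of (r, g, b) tuples decoded from the PNGs."""
    if len(before) != len(after):
        raise ValueError("screenshot dimensions must match")
    if not before:
        return 0.0
    differing = sum(1 for a, b in zip(before, after) if a != b)
    return differing / len(before)

def passes_gate(before, after, threshold=0.001):
    """Allow the optimization only if at most `threshold` of pixels changed."""
    return diff_ratio(before, after) <= threshold
```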
## Script Coverage Analysis

When browser analysis is enabled, the `ScriptCoverageAnalyzer` component uses Chrome’s Profiler domain to measure JavaScript code coverage. This identifies scripts that are safe to defer, improving page load performance by reducing parser-blocking JavaScript.
### How It Works
- The analyzer loads the page with JavaScript enabled (Profiler + Coverage APIs)
- Each external script’s coverage is measured during page load
- Scripts are classified into deferral categories based on coverage data and execution timing
### Deferral Categories

| Category | Description |
|---|---|
| `kSafeToDefer` | Script has low main-thread impact; safe to add `defer` |
| `kCandidateForAsync` | Script is independent; could use `async` instead |
| `kAlreadyAsync` | Script already has an `async` or `defer` attribute |
| `kKeepSynchronous` | Script must execute synchronously (DOM-dependent, inline handlers) |
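A classifier over these four categories might look like the sketch below. The category names come from the table above, but the decision order, the input fields, and the 20% coverage threshold are all illustrative assumptions; the analyzer's actual rules are not documented here.

```python
from enum import Enum

class Deferral(Enum):
    SAFE_TO_DEFER = "kSafeToDefer"
    CANDIDATE_FOR_ASYNC = "kCandidateForAsync"
    ALREADY_ASYNC = "kAlreadyAsync"
    KEEP_SYNCHRONOUS = "kKeepSynchronous"

def classify(script):
    """Classify one external script from its attributes and measured coverage.
    `script` is a dict of hypothetical analysis fields."""
    if script.get("async") or script.get("defer"):
        return Deferral.ALREADY_ASYNC
    if script.get("writes_dom_sync") or script.get("inline_handlers"):
        return Deferral.KEEP_SYNCHRONOUS       # must stay parser-blocking
    if script.get("independent"):
        return Deferral.CANDIDATE_FOR_ASYNC    # no ordering dependencies
    if script.get("coverage", 1.0) < 0.2:      # little of it ran during load
        return Deferral.SAFE_TO_DEFER
    return Deferral.KEEP_SYNCHRONOUS           # conservative default
```

The conservative fall-through matters: a script that cannot be proven safe keeps its synchronous behavior, matching the fallback philosophy described at the top of this page.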
### SSRF Defense
Script analysis enables JavaScript execution in Chrome (required for accurate coverage measurement). The other three SSRF defense layers remain active: network offline mode, Fetch interception, and DNS-level blocking. Chrome cannot make outbound connections even with JavaScript enabled.
### Configuration

| Flag | Default | Description |
|---|---|---|
| `--no-browser-script-analysis` | (enabled) | Disable script coverage analysis |
Script analysis results feed into the optimization policy engine, which decides whether to enable script deferral for each URL template.
## Optimization Policy
The optimization policy engine computes per-template decisions about optional HTML transforms based on browser analysis data. It runs after profile generation and stores the policy alongside the optimization profile in cache.
### Policy Fields
| Field | Condition | Description |
|---|---|---|
| `async_css_enabled` | Avg CSS coverage < 50% | Enable async loading for render-blocking stylesheets |
| `script_deferral_enabled` | Deferrable scripts detected | Enable `defer` attribute on safe scripts |
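The two conditions translate directly into code. A sketch (the function name and return shape are illustrative; only the conditions come from the table above):

```python
def compute_policy(avg_css_coverage, deferrable_scripts):
    """Per-template policy decisions from browser analysis data.
    avg_css_coverage is a fraction in [0, 1]; deferrable_scripts is a count."""
    return {
        "async_css_enabled": avg_css_coverage < 0.50,
        "script_deferral_enabled": deferrable_scripts > 0,
    }
```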
### Stats Counters
| Counter | Description |
|---|---|
| `policy.computed` | Total optimization policies computed |
| `policy.async_css_enabled` | Times async CSS was enabled by policy |
| `policy.script_deferral_enabled` | Times script deferral was enabled by policy |
These counters appear in `/v1/stats` JSON, `/v1/metrics` Prometheus output, the management socket `STATS` command, and the web console metrics page.
## Chrome Process Management

### Lifecycle
1. `ChromeProcess::Start()` spawns Chrome with headless flags and pipe transport.
2. CDP commands flow through `CdpClient` for page analysis.
3. After each page, `IncrementPageCount()` checks the recycle threshold.
4. At the recycle threshold, `Stop()` sends SIGTERM (then SIGKILL after 5s).
5. A fresh Chrome process starts for the next batch.
### Launch Flags

Chrome is spawned with strict isolation flags:

- `--headless=new` — new headless mode
- `--remote-debugging-pipe` — FD 3/4 pipe transport
- `--disable-gpu` — no GPU required
- `--no-sandbox` — required in containers (Chrome must run with minimal container privileges)
- `--host-resolver-rules="MAP * ~NOTFOUND"` — DNS-level SSRF block
- `--disable-dev-shm-usage` — avoids /dev/shm exhaustion in containers
- Various isolation flags (`--disable-extensions`, `--disable-background-networking`, `--no-first-run`, etc.)
### RSS Monitoring

On Linux, the worker reads `VmRSS` from `/proc/<pid>/status` every 5 seconds. When Chrome exceeds `--chrome-max-memory` (default 512MB), the worker stops it and starts a fresh instance. This prevents memory leaks from accumulating across hundreds of pages.
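The check amounts to parsing one line of `/proc/<pid>/status`. A minimal Python sketch (function names are illustrative; `VmRSS` is reported in kB):

```python
def parse_vmrss_kb(status_text):
    """Extract the VmRSS value in kB from /proc/<pid>/status content."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])  # line format: "VmRSS:   540672 kB"
    return None  # kernel omits VmRSS for some process states

def over_limit(status_text, max_mb=512):
    """True if the process exceeds the --chrome-max-memory budget."""
    kb = parse_vmrss_kb(status_text)
    return kb is not None and kb > max_mb * 1024
```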
## SSRF Defense (4 Layers)
Browser analysis operates on cached content, not live network requests. Four layers prevent Chrome from making any outbound connections:
1. `Network.emulateNetworkConditions({offline: true})` — blocks all network traffic
2. `Fetch.enable` + `Fetch.requestPaused` — intercepts and fails all requests
3. `Emulation.setScriptExecutionDisabled({value: true})` — no JS execution (CSS extractor and visual regression gate)
4. `--host-resolver-rules="MAP * ~NOTFOUND"` — Chrome-level DNS block
The `FontGlyphScanner` and `ScriptCoverageAnalyzer` enable JavaScript (they need it for accurate analysis) but still enforce the other three layers.
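In CDP terms, the first three layers correspond to a handful of commands issued per page; the fourth is a launch flag, not a CDP call. A hedged sketch of that command list (parameter shapes follow the public CDP schema; the worker's exact call sequence is an assumption):

```python
def ssrf_lockdown_commands(allow_js):
    """Build the per-page CDP lockdown commands for the first three layers."""
    cmds = [
        # Layer 1: force Chrome's network stack fully offline.
        {"method": "Network.emulateNetworkConditions",
         "params": {"offline": True, "latency": 0,
                    "downloadThroughput": -1, "uploadThroughput": -1}},
        # Layer 2: intercept every request; each Fetch.requestPaused
        # event is then answered with a failure.
        {"method": "Fetch.enable", "params": {}},
    ]
    if not allow_js:
        # Layer 3: skipped for FontGlyphScanner and ScriptCoverageAnalyzer,
        # which need JavaScript enabled.
        cmds.append({"method": "Emulation.setScriptExecutionDisabled",
                     "params": {"value": True}})
    return cmds
```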
## Configuration Flags
| Flag | Default | Description |
|---|---|---|
| `--enable-browser-analysis` | off | Enable browser analysis pipeline |
| `--chrome-binary` | /usr/bin/chrome-headless-shell | Path to Chrome binary |
| `--chrome-recycle-interval` | 100 | Pages per Chrome instance before restart |
| `--chrome-page-timeout` | 60000 | Per-page analysis timeout in ms |
| `--chrome-max-memory` | 512 | Max Chrome RSS in MB before forced restart |
| `--chrome-startup-timeout` | 10000 | Chrome startup timeout in ms |
| `--browser-queue-size` | 1000 | Max queued analysis requests |
| `--browser-profile-ttl` | 86400 | Profile cache lifetime in seconds (24h) |
| `--no-browser-critical-css` | (enabled) | Disable browser-based critical CSS |
| `--no-browser-lazy-loading` | (enabled) | Disable browser-based lazy-load decisions |
| `--no-browser-lcp-preload` | (enabled) | Disable browser-based LCP detection |
| `--no-browser-image-sizing` | (enabled) | Disable browser-based image dimensions |
| `--no-browser-script-analysis` | (enabled) | Disable script coverage analysis |
All flags are also hot-reloadable via `PATCH /v1/config` from the web console.
## Monitoring

### Stats Counters

Browser analysis stats appear in the management socket `STATS` and `BROWSER-STATUS` commands, and in the web console dashboard:
| Counter | Description |
|---|---|
| `browser.profiles_generated` | Templates analyzed and cached |
| `browser.profiles_used` | Cache hits on existing profiles |
| `browser.analysis_errors` | Failures (timeout, Chrome crash, etc.) |
| `browser.chrome_crashes` | Chrome process crashes |
| `browser.queue_depth` | Current queue size |
| `browser.scripts_analyzed` | Scripts evaluated by browser analysis |
| `browser.scripts_deferrable` | Scripts identified as safe to defer |
| `browser.css_inlining_attempted` | CSS inlining attempts |
| `browser.css_inlining_stylesheets_cached` | Stylesheets found in cache |
| `browser.css_inlining_bytes_inlined` | Total CSS bytes injected |
### Management Socket

The `BROWSER-STATUS` command on the management socket returns detailed JSON including Chrome state, queue contents, and per-profile statistics:

```shell
echo "BROWSER-STATUS" | socat - UNIX-CONNECT:/data/pagespeed.sock.mgmt
```
## Error Handling
Every failure falls back to the heuristic path:
| Failure | Behavior |
|---|---|
| Chrome binary not found | Heuristic only, no retry |
| Chrome fails to start | Retry after 2 seconds |
| Chrome crashes mid-analysis | Cancel current item, restart Chrome after 2s |
| Analysis timeout | Skip item, process next in queue |
| Cache read failure | Skip item |
| Queue full | Head-drop oldest item |
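The queue-full behavior combines with the dedup noted in the architecture diagram. A minimal sketch of a bounded dedup queue with head-drop (FIFO only, ignoring the priority ordering the real `AnalysisQueue` has; the class is illustrative):

```python
from collections import OrderedDict

class AnalysisQueue:
    """Bounded FIFO with dedup by template hash; drops the oldest when full."""
    def __init__(self, max_size=1000):
        self.max_size = max_size
        self._items = OrderedDict()  # template_hash -> request, insertion order

    def enqueue(self, template_hash, request):
        """Return True if queued; False when an identical template is pending."""
        if template_hash in self._items:
            return False  # dedup: one analysis per template is enough
        if len(self._items) >= self.max_size:
            self._items.popitem(last=False)  # head-drop the oldest item
        self._items[template_hash] = request
        return True

    def dequeue(self):
        """Pop the oldest pending request, or None when empty."""
        if not self._items:
            return None
        _, request = self._items.popitem(last=False)
        return request
```

Head-drop is the right choice here because a dropped request is not lost work: the page is still served via the heuristic path, and the template will be re-enqueued on a later request.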
The worker logs all browser analysis errors at the warning level. Monitor them in the debug console (`/logs`) or via the management socket.
## Troubleshooting

### Chrome not available (503 errors in waterfall/diff)
The web console’s waterfall viewer and visual diff features return 503 when Chrome is not running. Check:
- Is `--enable-browser-analysis` set?
- Does the Chrome binary exist at the configured path?
- In Docker: is the worker image the full variant (not the minimal image)?
### High `chrome_crashes` count

Frequent Chrome crashes usually indicate memory pressure:

- Lower `--chrome-recycle-interval` to restart Chrome more often
- Lower `--chrome-max-memory` to catch leaks earlier
- Check container memory limits — Chrome needs at least 256MB of headroom
### Profiles not being generated

If `profiles_generated` stays at zero while traffic flows:

- Check `queue_depth` — if it stays at 0, analysis requests are not being enqueued; verify `--enable-browser-analysis` is set.
- Check `analysis_errors` — errors during analysis prevent profile creation.
- Check `css_inlining_stylesheets_cached` — if external CSS is not yet cached, the worker waits for it before running browser analysis.
### Visual Regression Gate false positives
The visual regression gate disables JavaScript (SSRF defense). Pages that rely on CSS-in-JS frameworks (styled-components, Emotion, etc.) will show differences because their styles are injected by JavaScript. This is a known limitation. The heuristic path optimizes these pages correctly.
## Next Steps
- Web Console — Use the waterfall viewer and visual diff tools powered by browser analysis
- HTTP API Reference — BROWSER-STATUS management command and /v1/stats browser counters
- Configuration Reference — All browser analysis flags
- Troubleshooting — Chrome not found, CDP failures, and analysis timeout diagnostics