Every optimization runs on your infrastructure
Image transcoding, CSS/JS minification, critical CSS extraction, and variant-aware caching. No external service. No data leaving your servers.
Image Optimization
WebP/AVIF transcoding, viewport-aware resizing, 37 variants per image
CSS Optimization
Minification, critical CSS extraction, async loading, @import flattening
JS Optimization
Safe minification, automatic script deferral — no AST transforms
HTML Optimization
Critical CSS injection, async CSS, script deferral, Early Hints (103), lazy loading
Variant-Aware Caching
32-bit capability mask, best-fit fallback, per-client variants
Zero-Copy Serving
Sub-millisecond cache hits via mmap — no copies, no allocations
Image Optimization
Automatically transcode images to the best format each client supports. No manual srcset management, no build pipelines, no CDN configuration.
- ✓
WebP/AVIF transcoding
Content-negotiated format selection based on the client's Accept header. Serve AVIF or WebP to browsers that advertise support for them, and the original format to the rest.
- ✓
Responsive image generation
Viewport-class and pixel-density aware resizing. Desktop, tablet, and mobile variants are generated and cached automatically.
- ✓
Lossless PNG reduction
OptiPNG-based lossless reduction strips unnecessary metadata and optimizes DEFLATE parameters without any quality loss.
- ✓
GIF-to-WebP for animated GIFs
Animated GIFs are transcoded to animated WebP, typically reducing file size by 60-80% while preserving all frames and timing.
- ✓
Save-Data aware compression
Compression levels adapt to Save-Data preferences. Clients requesting reduced data receive more aggressively compressed image variants. Text resources (HTML, CSS, JS) are served with identity, gzip, or Brotli content encoding as appropriate.
- ✓
Proactive variant generation
A single image decode produces up to 37 cache variants: 3 raster formats (WebP, AVIF, original) across 3 viewports, 2 pixel densities, and 2 Save-Data states, plus a resolution-independent SVG variant for eligible images. One request warms the cache for all client types.
- ✓
SVG auto-vectorization
Logos, icons, and simple graphics are automatically converted from raster to SVG using VTracer (Rust FFI). The resulting vector is resolution-independent — one variant serves all viewports and pixel densities. A size gate ensures the SVG is only stored when it is smaller than the raster original. Runs automatically for eligible images with no configuration required.
- ✓
Jpegli encoder with perceptual quality
Jpegli replaces libjpeg-turbo for JPEG encoding. SSIMULACRA2 perceptual quality verification ensures visual fidelity. Content-aware quality presets adjust compression based on image characteristics.
- ✓
Intelligent quality prediction
Per-format ML models predict the optimal encoder quality for a target SSIMULACRA2 score. Trained LightGBM decision trees compiled to C run in ~5 microseconds per format with zero runtime dependencies. SSIMULACRA2 verification catches outliers.
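The figure of 37 variants follows from the axes listed under proactive variant generation: 3 formats × 3 viewports × 2 densities × 2 Save-Data states = 36 raster combinations, plus one SVG. A minimal sketch of that enumeration is shown here; the string labels (including "1x" and "2x" for the two pixel densities) are assumptions for illustration, not names taken from the product.

```python
from itertools import product

# Axis sizes come from the description above; the labels are illustrative.
FORMATS = ("webp", "avif", "original")
VIEWPORTS = ("desktop", "tablet", "mobile")
DENSITIES = ("1x", "2x")          # assumed labels for the two pixel densities
SAVE_DATA = (False, True)


def enumerate_variants(svg_eligible):
    """List every cache variant a single image decode could produce."""
    variants = list(product(FORMATS, VIEWPORTS, DENSITIES, SAVE_DATA))
    if svg_eligible:
        # One resolution-independent SVG covers all viewports and densities.
        variants.append(("svg", None, None, None))
    return variants


if __name__ == "__main__":
    print(len(enumerate_variants(svg_eligible=True)))   # 37
    print(len(enumerate_variants(svg_eligible=False)))  # 36
```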
CSS Optimization
Reduce render-blocking CSS to the minimum required for first paint. Heuristic critical CSS extraction runs standalone — no headless browser required. Optional browser analysis adds CSS coverage validation, waterfall capture, and visual comparison.
- ✓
Minification
Whitespace, comments, and redundant syntax are stripped. Trailing semicolons are removed, decimal values optimized (0.5 to .5), and whitespace around CSS operators eliminated.
- ✓
Critical CSS extraction
Heuristic-based above-the-fold CSS extraction. Automatically includes selectors for html, body, :root, the first 25 DOM elements, and header/nav/hero patterns. Optional browser pipeline adds Lighthouse validation, waterfall capture, and visual comparison.
- ✓
Injected into HTML
Critical CSS is injected as a <style> tag before </head>, creating a self-contained HTML variant that renders without blocking on external stylesheets.
- ✓
Async CSS loading
Render-blocking stylesheets are converted to non-blocking loads using media="print" onload with a <noscript> fallback. Activated automatically by the optimization policy engine when CSS coverage data indicates low stylesheet utilization. Requires --enable-browser-analysis.
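To make the minification rules above concrete (comments and whitespace stripped, trailing semicolons removed, 0.5 shortened to .5), here is a rough regex-based sketch. It is not the worker's minifier: the function name is hypothetical, and this version does not protect string literals or url() contents, which a production minifier would have to do.

```python
import re


def minify_css(css):
    """Naive CSS minifier sketch; strings and url() values are not protected."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)    # strip comments
    css = re.sub(r"\s+", " ", css)                     # collapse whitespace runs
    css = re.sub(r"\s*([{};,>])\s*", r"\1", css)       # trim around punctuation
    css = re.sub(r":\s+", ":", css)                    # "color: red" -> "color:red"
    css = css.replace(";}", "}")                       # drop trailing semicolons
    css = re.sub(r"(?<![\d.])0+(\.\d+)", r"\1", css)   # 0.5 -> .5
    return css.strip()


if __name__ == "__main__":
    source = "/* hero */\nbody {\n  margin: 0;\n  opacity: 0.5;\n}\n"
    print(minify_css(source))  # body{margin:0;opacity:.5}
```

Whitespace before a colon is left alone on purpose, because "a :hover" and "a:hover" are different selectors.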
JavaScript Optimization
Reduce JavaScript payload with automatic minification. Conservative transforms that remove dead weight without changing behavior.
- ✓
Minification
Comments, whitespace, and unnecessary semicolons are removed. Safe by construction — no variable renaming, no AST transforms.
- ✓
Size-gated optimization
Only writes a minified variant if the result is actually smaller. No wasted cache space for already-minified scripts.
- ✓
Script deferral
Browser script analysis identifies scripts safe to defer. The worker adds defer to those <script src="..."> tags, moving them off the critical path. Scripts already marked async, defer, or type="module" are left unchanged. Requires --enable-browser-analysis.
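The deferral rule described above can be sketched as a small tag rewrite. This is an illustration only: the function name and the safe_srcs parameter (standing in for whatever the browser script analysis reports) are assumptions, and a regex over tags is a simplification of real HTML parsing.

```python
import re

SCRIPT_TAG = re.compile(r"<script\b([^>]*)>", re.IGNORECASE)
SRC_ATTR = re.compile(r"(?<![\w-])src\s*=\s*[\"']([^\"']+)[\"']", re.IGNORECASE)
QUOTED = re.compile(r"\"[^\"]*\"|'[^']*'")
MODULE_TYPE = re.compile(r"type\s*=\s*[\"']module[\"']", re.IGNORECASE)


def defer_scripts(html, safe_srcs):
    """Add defer to external scripts listed in safe_srcs (hypothetical input)."""

    def rewrite(match):
        attrs = match.group(1)
        src = SRC_ATTR.search(attrs)
        if src is None or src.group(1) not in safe_srcs:
            return match.group(0)          # inline script, or not known to be safe
        bare = QUOTED.sub("", attrs)       # ignore words inside attribute values
        if re.search(r"\b(async|defer)\b", bare, re.IGNORECASE):
            return match.group(0)          # already off the critical path
        if MODULE_TYPE.search(attrs):
            return match.group(0)          # modules are deferred by default
        return "<script" + attrs + " defer>"

    return SCRIPT_TAG.sub(rewrite, html)
```

Attribute values are blanked before the async/defer check so that a path such as /js/async-loader.js is not mistaken for the async attribute.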
HTML Optimization
The worker processes HTML to inject critical CSS and generate preload hints. Combined with the caching proxy's Early Hints support, pages render faster without any changes to your application.
- ✓
Critical CSS injection
Heuristic-based above-the-fold CSS extraction. The worker scans HTML and linked stylesheets, extracts rules matching header, nav, hero, and early DOM elements, and injects them as a <style> tag before </head>. Optional browser analysis adds Lighthouse validation, waterfall capture, and visual comparison.
- ✓
Early Hints (103)
The worker stores stylesheet preload hints in the cache. On subsequent requests, the server sends 103 Early Hints responses with Link: rel=preload headers before the origin responds, letting the browser fetch CSS while waiting.
- ✓
Preconnect injection
Third-party origins discovered during HTML processing automatically receive Link: rel=preconnect headers, eliminating DNS/TLS setup time for external resources. Enabled by default.
- ✓
Speculation Rules (experimental)
Opt-in prefetching via the Speculation Rules API. When enabled with --enable-speculation-rules, the worker injects prefetch hints for frequently-accessed URLs detected by the hot URL tracker. Off by default.
- ✓
Hot URL warmup
Frequently-accessed URLs are detected automatically. When a URL exceeds the hit threshold, the worker proactively generates all missing optimized variants so they are ready before the next request.
- ✓
Native lazy loading
The worker automatically adds loading="lazy" to off-screen <img> and <iframe> elements. The LCP candidate (or first body image as fallback) receives fetchpriority="high" instead, ensuring above-the-fold content loads at full priority. Enabled by default.
- ✓
Explicit image dimensions
Width and height attributes are injected on <img> tags that lack them, using dimensions from cached image data. This prevents Cumulative Layout Shift (CLS) caused by images loading without reserved space. Enabled by default.
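The Early Hints and preconnect items above both come down to emitting Link header values. The sketch below shows what those values could look like in the standard Link header syntax; the function names and the page_host parameter are hypothetical, and how the worker actually discovers and stores hints is not specified here.

```python
from urllib.parse import urlsplit


def preload_link(stylesheet_url):
    """Link header value asking the browser to preload a stylesheet."""
    return "<" + stylesheet_url + ">; rel=preload; as=style"


def preconnect_links(resource_urls, page_host):
    """Link header values for each distinct third-party origin, in first-seen order."""
    origins = []
    for url in resource_urls:
        parts = urlsplit(url)
        if not parts.scheme or not parts.netloc or parts.netloc == page_host:
            continue                      # relative or same-host URL: nothing to warm up
        origin = parts.scheme + "://" + parts.netloc
        if origin not in origins:
            origins.append(origin)
    return ["<" + origin + ">; rel=preconnect" for origin in origins]


def early_hints_response(link_values):
    """Serialize a 103 interim response carrying the given Link values."""
    lines = ["HTTP/1.1 103 Early Hints"]
    lines += ["Link: " + value for value in link_values]
    return "\r\n".join(lines) + "\r\n\r\n"
```

For a page that links /css/site.css, early_hints_response([preload_link("/css/site.css")]) yields a 103 response with a single Link line, which the server can send before the origin has produced the final response.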
Variant-Aware Caching
Every URL maps to a set of cache alternates, each identified by its 32-bit capability bitmask. Different clients get different optimized variants without cache pollution.
32-bit capability bitmask
On cache miss, the best-fit fallback mechanism finds the closest cached variant rather than going back to the origin. A Desktop WebP request can be served from a Tablet WebP cache entry while the exact variant is being generated.
Pre-compressed gzip and brotli variants are generated for text resources (HTML, CSS, JS). On cache hits, the server serves directly with the appropriate Content-Encoding — no dynamic compression overhead.
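One way to picture the capability mask and best-fit fallback is sketched below. The bit layout, field names, and scoring rule are all assumptions made for illustration; the source only states that the mask is 32 bits wide and that a near-miss variant (for example Tablet WebP for a Desktop WebP request) can be served.

```python
# Hypothetical bit layout inside a 32-bit mask.
FORMAT_SHIFT, VIEWPORT_SHIFT, DENSITY_SHIFT, SAVE_DATA_SHIFT = 0, 2, 4, 5
FORMATS = {"original": 0, "webp": 1, "avif": 2}
VIEWPORTS = {"mobile": 0, "tablet": 1, "desktop": 2}


def make_mask(fmt, viewport, density_2x=False, save_data=False):
    mask = (FORMATS[fmt] << FORMAT_SHIFT) | (VIEWPORTS[viewport] << VIEWPORT_SHIFT)
    mask |= (int(density_2x) << DENSITY_SHIFT) | (int(save_data) << SAVE_DATA_SHIFT)
    return mask & 0xFFFFFFFF


def fields(mask):
    return ((mask >> FORMAT_SHIFT) & 0b11, (mask >> VIEWPORT_SHIFT) & 0b11,
            (mask >> DENSITY_SHIFT) & 1, (mask >> SAVE_DATA_SHIFT) & 1)


def best_fit(requested, cached_masks):
    """Return the closest cached mask with the same format, or None."""
    r_fmt, r_vp, r_den, r_sd = fields(requested)
    best, best_cost = None, None
    for mask in cached_masks:
        fmt, vp, den, sd = fields(mask)
        if fmt != r_fmt:
            continue      # never hand a client a format it did not ask for
        cost = abs(vp - r_vp) + int(den != r_den) + int(sd != r_sd)
        if best_cost is None or cost < best_cost:
            best, best_cost = mask, cost
    return best
```

With only Tablet WebP and Mobile WebP cached, a Desktop WebP request resolves to the Tablet entry under this scoring, which matches the example in the text.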
Zero-Copy Serving
The Cyclone cache memory-maps optimized resources with mmap and hands them straight to the server's output buffers. The response body is served from the mapped cache file with no data copying and no per-request memory allocation for content.
Cache hit path
- 1
Caching proxy classifies the request into a 32-bit capability mask
- 2
URL is looked up in cache; best-fit variant selected by capability mask
- 3
Cyclone lookup returns an mmap'd pointer with best-fit fallback
- 4
Pointer is passed directly to the server's output chain — zero copy, zero allocation
- 5
On miss: origin is served immediately, Factory Worker is notified via Unix socket to generate the optimized variant
Sub-millisecond cache hits. The hot path is a hash lookup and a pointer assignment — no data copying, no content processing in the request path.
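The hit path above (a hash lookup that yields a pointer into mapped memory) can be modelled in a few lines. This is a toy stand-in, not the Cyclone implementation: the class and method names are hypothetical, and Python's memoryview plays the role of the pointer handed to the output chain.

```python
import mmap
import os
import tempfile


class MappedCache:
    """Toy cache keyed by (url, capability mask), backed by memory-mapped files."""

    def __init__(self):
        self._maps = {}

    def put(self, url, mask, path):
        with open(path, "rb") as f:
            # The mapping stays valid after the file object is closed.
            self._maps[(url, mask)] = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)

    def lookup(self, url, mask):
        mapped = self._maps.get((url, mask))    # the hash lookup
        if mapped is None:
            return None                         # miss: serve origin, notify the worker
        return memoryview(mapped)               # a view of the mapped pages, not a copy

    def close(self):
        for mapped in self._maps.values():
            mapped.close()
        self._maps.clear()


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.write(fd, b"optimized-bytes")
    os.close(fd)
    cache = MappedCache()
    cache.put("/hero.jpg", 6, path)
    body = cache.lookup("/hero.jpg", 6)
    print(len(body), body.readonly)
    body.release()
    cache.close()
    os.remove(path)
```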
Architecture Overview
Three purpose-built components. Optimization outside the request path. Each component does one thing well and communicates through minimal, well-defined interfaces.
Caching Proxy
High-performance reverse proxy
Deploys in front of any HTTP origin server. Classifies incoming requests by parsing Accept headers, Client Hints, and Save-Data signals. Composes cache keys, serves cache hits via zero-copy mmap, and passes cache misses through to the origin.
Cyclone Cache
Variant-aware disk cache
High-performance disk cache with capability-based keys. Supports best-fit fallback so near-miss variants can be served while the exact match is being generated. Data is memory-mapped for zero-copy serving.
Factory Worker
Lightweight optimization worker
Receives optimization requests via Unix socket. Runs HTML parsing, image transcoding, CSS/JS minification, and critical CSS extraction. Results are written to the Cyclone cache for subsequent zero-copy serving.
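As a rough illustration of the request classification the caching proxy performs, the sketch below derives a client profile from the Accept, Save-Data, and Client Hints headers. The viewport thresholds, the defaults used when a hint is missing, and the returned field names are assumptions; q-values in Accept are ignored for brevity.

```python
def classify(headers):
    """Map request headers to a client profile (thresholds are assumed)."""
    h = {k.lower(): v for k, v in headers.items()}

    # Media types the client lists in Accept (q-values ignored in this sketch).
    accepted = {part.split(";")[0].strip().lower()
                for part in h.get("accept", "").split(",")}
    if "image/avif" in accepted:
        fmt = "avif"
    elif "image/webp" in accepted:
        fmt = "webp"
    else:
        fmt = "original"

    try:
        width = int(h.get("sec-ch-viewport-width", ""))
    except ValueError:
        width = None                      # hint absent or malformed
    if width is None or width >= 1024:
        viewport = "desktop"              # assumed default without a hint
    elif width >= 768:
        viewport = "tablet"
    else:
        viewport = "mobile"

    try:
        dpr = float(h.get("sec-ch-dpr", "1"))
    except ValueError:
        dpr = 1.0

    return {
        "format": fmt,
        "viewport": viewport,
        "density": "2x" if dpr >= 1.5 else "1x",
        "save_data": h.get("save-data", "").strip().lower() == "on",
    }
```

The resulting profile is the kind of input a capability mask could be composed from before the cache lookup.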
Beyond nginx, ModPageSpeed integrates as middleware for ASP.NET Core applications via the libpagespeed.so C API — same optimization pipeline, different stack.
Monitoring & Operations
Production-grade observability out of the box. Health checks, Prometheus metrics, cache invalidation, and Kubernetes-ready deployment.
- ✓
Prometheus metrics
The worker exposes metrics in Prometheus text exposition format via the management socket. Track variant generation rates, processing times by content type, cache size, and error counts.
- ✓
Helm chart for Kubernetes
Deploy as a Kubernetes pod with the included Helm chart. The caching proxy and worker run as sidecar containers sharing a cache volume.
- ✓
Cache invalidation
Purge all variants for a URL via the management socket. A single PURGE command removes all cached variants for the target URL.
- ✓
Web console
Built-in web console at /console/ for cache inspection, live dashboard, and URL analysis.
- ✓
HTTP management API
REST API for health checks, statistics, Prometheus metrics, configuration, and cache inspection. WebSocket streaming for real-time optimization events.
- ✓
Configuration hot-reload
RCU-based configuration updates. Change settings without restarting the worker or dropping connections. Applied atomically with zero downtime.
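For readers unfamiliar with the Prometheus text exposition format mentioned above, the sketch below renders one metric family in that format. The metric name in the usage example is invented for illustration; the actual metric names exposed by the worker are not listed in this document.

```python
def escape_label(value):
    """Escape a label value per the text exposition format."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family; samples is a list of (labels dict, value)."""
    lines = ["# HELP " + name + " " + help_text,
             "# TYPE " + name + " " + metric_type]
    for labels, value in samples:
        if labels:
            pairs = ",".join(k + '="' + escape_label(v) + '"'
                             for k, v in sorted(labels.items()))
            lines.append(name + "{" + pairs + "} " + str(value))
        else:
            lines.append(name + " " + str(value))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric name, for illustration only.
    print(render_metric("example_variants_generated_total",
                        "Variants generated by content type.", "counter",
                        [({"type": "image"}, 12), ({"type": "css"}, 3)]), end="")
```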
Try it on your own infrastructure
Full access to every optimization. No feature gates, no usage limits.
14-day free trial. Cancel anytime.