Every optimization runs on your infrastructure

Image transcoding, CSS/JS minification, critical CSS extraction, and variant-aware caching. No external service. No data leaving your servers.

Image Optimization

Automatically transcode images to the best format each client supports. No manual srcset management, no build pipelines, no CDN configuration.

  • WebP/AVIF transcoding

    Content-negotiated format selection based on the client's Accept header. Serve AVIF or WebP to clients that advertise support for them, and the original format to the rest.

  • Responsive image generation

    Viewport-class and pixel-density aware resizing. Desktop, tablet, and mobile variants are generated and cached automatically.

  • Lossless PNG reduction

    OptiPNG-based lossless reduction strips unnecessary metadata and optimizes DEFLATE parameters without any quality loss.

  • GIF-to-WebP for animated GIFs

    Animated GIFs are transcoded to animated WebP, typically reducing file size by 60-80% while preserving all frames and timing.

  • Save-Data aware compression

    Compression levels adapt to Save-Data preferences. Clients requesting reduced data receive more aggressively compressed image variants. Text resources (HTML, CSS, JS) are served with Identity, Gzip, or Brotli transfer encoding as appropriate.

  • Proactive variant generation

    A single image decode produces up to 37 cache variants: 3 raster formats (WebP, AVIF, original) across 3 viewports, 2 pixel densities, and 2 Save-Data states, plus a resolution-independent SVG variant for eligible images. One request warms the cache for all client types.

  • SVG auto-vectorization

    Logos, icons, and simple graphics are automatically converted from raster to SVG using VTracer (Rust FFI). The resulting vector is resolution-independent — one variant serves all viewports and pixel densities. A size gate ensures the SVG is only stored when it is smaller than the raster original. Runs automatically for eligible images with no configuration required.

  • Jpegli encoder with perceptual quality

    Jpegli replaces libjpeg-turbo for JPEG encoding. SSIMULACRA2 perceptual quality verification ensures visual fidelity. Content-aware quality presets adjust compression based on image characteristics.

  • Intelligent quality prediction

    Per-format ML models predict the optimal encoder quality for a target SSIMULACRA2 score. Trained LightGBM decision trees compiled to C run in ~5 microseconds per format with zero runtime dependencies. SSIMULACRA2 verification catches outliers.
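The variant count quoted under "Proactive variant generation" can be checked with a few lines of Python. This is only an arithmetic sketch: the tuple layout and the name `enumerate_variants` are illustrative assumptions, not the worker's internal representation.

```python
from itertools import product

# Dimensions as listed in the text above.
RASTER_FORMATS = ("webp", "avif", "original")
VIEWPORTS = ("mobile", "tablet", "desktop")
DENSITIES = ("1x", "2x")
SAVE_DATA = (False, True)


def enumerate_variants(svg_eligible: bool = True) -> list:
    """List every (format, viewport, density, save_data) combination."""
    variants = list(product(RASTER_FORMATS, VIEWPORTS, DENSITIES, SAVE_DATA))
    if svg_eligible:
        # One resolution-independent SVG serves all viewports and densities.
        variants.append(("svg", None, None, None))
    return variants
```

For an eligible image this yields 3 x 3 x 2 x 2 = 36 raster variants plus 1 SVG variant, 37 in total.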

CSS Optimization

Reduce render-blocking CSS to the minimum required for first paint. Heuristic critical CSS extraction runs standalone — no headless browser required. Optional browser analysis adds CSS coverage validation, waterfall capture, and visual comparison.

  • Minification

    Whitespace, comments, and redundant syntax are stripped. Trailing semicolons are removed, decimal values optimized (0.5 to .5), and whitespace around CSS operators eliminated.

  • Critical CSS extraction

    Heuristic-based above-the-fold CSS extraction. Automatically includes selectors for html, body, :root, first 25 DOM elements, and header/nav/hero patterns. Optional browser pipeline adds Lighthouse validation, waterfall capture, and visual comparison.

  • Injected into HTML

    Critical CSS is injected as a <style> tag before </head>, creating a self-contained HTML variant that renders without blocking on external stylesheets.

  • Async CSS loading

    Render-blocking stylesheets are converted to non-blocking loads using media="print" onload with a <noscript> fallback. Activated automatically by the optimization policy engine when CSS coverage data indicates low stylesheet utilization. Requires --enable-browser-analysis.
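The minification rules listed above (comments, whitespace, trailing semicolons, leading zeros) can be illustrated with a deliberately naive regex pass. This is a sketch under stated assumptions, not the product's minifier: it ignores string literals, `url(...)` values, and selectors where a space before a colon is significant, and the function name `minify_css` is hypothetical.

```python
import re


def minify_css(css: str) -> str:
    """Naive CSS minifier for illustration only."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)    # strip comments
    css = re.sub(r"\s+", " ", css)                      # collapse whitespace runs
    css = re.sub(r"\s*([{};:,>])\s*", r"\1", css)       # trim around punctuation
    css = css.replace(";}", "}")                        # drop trailing semicolons
    css = re.sub(r"(?<![\d.])0+(\.\d+)", r"\1", css)    # 0.5 -> .5
    return css.strip()
```

For example, `a { color: red; margin: 0.5em; }` becomes `a{color:red;margin:.5em}`.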

JavaScript Optimization

Reduce JavaScript payload with automatic minification. Conservative transforms that remove dead weight without changing behavior.

  • Minification

    Comments, whitespace, and unnecessary semicolons are removed. Safe by construction — no variable renaming, no AST transforms.

  • Size-gated optimization

    Only writes a minified variant if the result is actually smaller. No wasted cache space for already-minified scripts.

  • Script deferral

    Browser script analysis identifies scripts safe to defer. The worker adds defer to those <script src="..."> tags, moving them off the critical path. Scripts already marked async, defer, or type="module" are left unchanged. Requires --enable-browser-analysis.
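The size gate described above amounts to a single comparison. A minimal sketch, assuming the minifier output is already available as bytes; `size_gated_variant` is a hypothetical name.

```python
from typing import Optional


def size_gated_variant(original: bytes, minified: bytes) -> Optional[bytes]:
    """Return the minified bytes only when they are strictly smaller.

    Returning None signals that no variant should be written, so an
    already-minified script does not take up extra cache space.
    """
    return minified if len(minified) < len(original) else None
```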

HTML Optimization

The worker processes HTML to inject critical CSS and generate preload hints. Combined with the caching proxy's Early Hints support, pages render faster without any changes to your application.

  • Critical CSS injection

    Heuristic-based above-the-fold CSS extraction. The worker scans HTML and linked stylesheets, extracts rules matching header, nav, hero, and early DOM elements, and injects them as a <style> tag before </head>. Optional browser analysis adds Lighthouse validation, waterfall capture, and visual comparison.

  • Early Hints (103)

    The worker stores stylesheet preload hints in the cache. On subsequent requests, the server sends 103 Early Hints responses with Link: rel=preload headers before the origin responds, letting the browser fetch CSS while waiting.

  • Preconnect injection

    Third-party origins discovered during HTML processing automatically receive Link: rel=preconnect headers, so DNS and TLS setup for external resources starts early instead of delaying the first request to each origin. Enabled by default.

  • Speculation Rules (Experimental)

    Opt-in prefetching via the Speculation Rules API. When enabled with --enable-speculation-rules, the worker injects prefetch hints for frequently-accessed URLs detected by the hot URL tracker. Off by default.

  • Hot URL warmup

    Frequently-accessed URLs are detected automatically. When a URL exceeds the hit threshold, the worker proactively generates all missing optimized variants so they are ready before the next request.

  • Native lazy loading

    The worker automatically adds loading="lazy" to off-screen <img> and <iframe> elements. The LCP candidate (or first body image as fallback) receives fetchpriority="high" instead, ensuring above-the-fold content loads at full priority. Enabled by default.

  • Explicit image dimensions

    Width and height attributes are injected on <img> tags that lack them, using dimensions from cached image data. This prevents Cumulative Layout Shift (CLS) caused by images loading without reserved space. Enabled by default.
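The Early Hints and preconnect bullets above both rely on the standard HTTP Link header. The sketch below shows how such header values are typically formatted; the helper names are hypothetical and the example URLs are placeholders, not values the worker is known to emit.

```python
def preload_link(url: str, as_type: str = "style") -> str:
    """Format one preload hint, e.g. for a stylesheet sent in a 103 response."""
    return f"<{url}>; rel=preload; as={as_type}"


def preconnect_link(origin: str) -> str:
    """Format one preconnect hint for a third-party origin."""
    return f"<{origin}>; rel=preconnect"


def link_header(values: list) -> str:
    """Join several hints into a single Link header line."""
    return "Link: " + ", ".join(values)
```

A 103 Early Hints response carrying `link_header([preload_link("/css/site.css")])` lets the browser start fetching the stylesheet before the final response arrives.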

Variant-Aware Caching

Every URL maps to a set of cache alternates, each identified by its 32-bit capability bitmask. Different clients get different optimized variants without cache pollution.

32-bit capability bitmask

Bits   Field              Values
0-1    Image Format       Original / WebP / AVIF / SVG
2-3    Viewport Class     Mobile / Tablet / Desktop
4      Pixel Density      1x / 2x+
5      Save-Data          off / on
6-7    Transfer Encoding  Identity / Gzip / Brotli / Reserved
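The bit layout above can be packed and unpacked with shifts and masks. In this sketch the numeric value of each option (for example Original = 0, WebP = 1) is an assumption taken from the order in which the options are listed; the real encoding may differ. The function names are hypothetical.

```python
FORMATS = ("original", "webp", "avif", "svg")   # bits 0-1
VIEWPORTS = ("mobile", "tablet", "desktop")     # bits 2-3
ENCODINGS = ("identity", "gzip", "brotli")      # bits 6-7 (value 3 is reserved)


def encode_mask(fmt: str, viewport: str, density_2x: bool,
                save_data: bool, encoding: str) -> int:
    """Pack the five capability fields into the low 8 bits of the mask."""
    return (FORMATS.index(fmt)
            | VIEWPORTS.index(viewport) << 2
            | int(density_2x) << 4
            | int(save_data) << 5
            | ENCODINGS.index(encoding) << 6)


def decode_mask(mask: int) -> dict:
    """Unpack a mask produced by encode_mask (reserved values raise IndexError)."""
    return {
        "format": FORMATS[mask & 0b11],
        "viewport": VIEWPORTS[(mask >> 2) & 0b11],
        "density_2x": bool((mask >> 4) & 1),
        "save_data": bool((mask >> 5) & 1),
        "encoding": ENCODINGS[(mask >> 6) & 0b11],
    }
```

Under these assumed values, a Desktop, 2x, WebP, Brotli request without Save-Data packs to 1 + 8 + 16 + 128 = 153.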

On cache miss, the best-fit fallback mechanism finds the closest cached variant rather than going back to the origin. A Desktop WebP request can be served from a Tablet WebP cache entry while the exact variant is being generated.
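One way to picture best-fit fallback is as a lowest-cost search over the variants already cached for a URL. The scoring below is purely an assumption made for illustration (the actual selection rules are not described here): a cached variant in a different optimized format is never substituted, and viewport mismatches cost more than density or Save-Data mismatches.

```python
from typing import NamedTuple, Optional


class Variant(NamedTuple):
    fmt: str
    viewport: str
    density_2x: bool
    save_data: bool


def mismatch_cost(wanted: Variant, cached: Variant) -> Optional[int]:
    """Return a cost for serving `cached` in place of `wanted`, or None if unusable."""
    if cached.fmt != wanted.fmt and cached.fmt != "original":
        return None  # assumed rule: never serve a format the client did not select
    cost = 0
    if cached.fmt != wanted.fmt:
        cost += 4
    if cached.viewport != wanted.viewport:
        cost += 2
    if cached.density_2x != wanted.density_2x:
        cost += 1
    if cached.save_data != wanted.save_data:
        cost += 1
    return cost


def best_fit(wanted: Variant, cached_variants: list) -> Optional[Variant]:
    """Pick the usable cached variant with the lowest mismatch cost."""
    best, best_cost = None, None
    for candidate in cached_variants:
        cost = mismatch_cost(wanted, candidate)
        if cost is not None and (best_cost is None or cost < best_cost):
            best, best_cost = candidate, cost
    return best
```

With only a Tablet WebP and a Mobile AVIF entry cached, a Desktop WebP request resolves to the Tablet WebP entry, matching the example in the text.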

Pre-compressed gzip and brotli variants are generated for text resources (HTML, CSS, JS). On cache hits, the server serves directly with the appropriate Content-Encoding — no dynamic compression overhead.
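The idea of compressing once at write time and selecting at serve time can be sketched with Python's standard `gzip` module. Brotli is left out because it needs a third-party package; the function names are hypothetical, and the Accept-Encoding parsing deliberately ignores q-values.

```python
import gzip


def precompress(body: bytes) -> dict:
    """Build the stored variants once, when the resource is written to cache."""
    return {"identity": body, "gzip": gzip.compress(body, compresslevel=9)}


def select_encoding(accept_encoding: str, variants: dict) -> tuple:
    """Pick a stored variant for the request; no compression happens here."""
    offered = {token.split(";")[0].strip().lower()
               for token in accept_encoding.split(",")}
    if "gzip" in offered and "gzip" in variants:
        return "gzip", variants["gzip"]
    return "identity", variants["identity"]
```

The first element of the returned pair is what would go into the Content-Encoding response header (omitted for identity).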

Zero-Copy Serving

The Cyclone cache maps optimized resources directly into server output buffers using mmap. The response body is served directly from a memory-mapped cache file with no data copying and no per-request memory allocation for content.

Cache hit path

  1. Caching proxy classifies the request into a 32-bit capability mask

  2. URL is looked up in cache; best-fit variant selected by capability mask

  3. Cyclone lookup returns an mmap'd pointer with best-fit fallback

  4. Pointer is passed directly to the server's output chain: zero copy, zero allocation

  5. On miss: origin is served immediately, Factory Worker is notified via Unix socket to generate the optimized variant

Sub-millisecond cache hits. The hot path is a hash lookup and a pointer assignment — no data copying, no content processing in the request path.
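The "hash lookup and a pointer assignment" shape of the hit path can be imitated with a dictionary and Python's `mmap` module. This is a toy model under stated assumptions: the index structure, the one-file-per-variant layout, and the names `store` and `lookup` are all hypothetical, and the real cache is not implemented in Python.

```python
import mmap
import os
import tempfile


def store(index: dict, cache_dir: str, url: str, mask: int, body: bytes) -> None:
    """Write one variant to its own file and record it in the in-memory index."""
    fd, path = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "wb") as f:
        f.write(body)
    index[(url, mask)] = path


def lookup(index: dict, url: str, mask: int):
    """Hash lookup, then map the file read-only; returns None on a miss."""
    path = index.get((url, mask))
    if path is None:
        return None
    with open(path, "rb") as f:
        # The mapping stays valid after the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

The returned mapping exposes the file's pages directly, so a server that writes from it does not need to read the body into a separate buffer first.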

Architecture Overview

Three purpose-built components. Optimization outside the request path. Each component does one thing well and communicates through minimal, well-defined interfaces.

Caching Proxy

High-performance reverse proxy

Deploys in front of any HTTP origin server. Classifies incoming requests by parsing Accept headers, Client Hints, and Save-Data signals. Composes cache keys, serves cache hits via zero-copy mmap, and passes cache misses through to the origin.
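Request classification from Accept, Client Hints, and Save-Data might look roughly like the sketch below. Several details are assumptions made for illustration: the preference for AVIF over WebP when both are advertised, the 768 and 1024 pixel viewport breakpoints, the Desktop and 1x defaults when no hints are sent, and the function name `classify`.

```python
def classify(headers: dict) -> dict:
    """Derive capability fields from request headers (names are case-insensitive)."""
    h = {k.lower(): v for k, v in headers.items()}

    accept = h.get("accept", "").lower()
    if "image/avif" in accept:
        fmt = "avif"
    elif "image/webp" in accept:
        fmt = "webp"
    else:
        fmt = "original"

    save_data = h.get("save-data", "").strip().lower() == "on"

    # Client Hints: current header names first, legacy names as fallback.
    try:
        dpr = float(h.get("sec-ch-dpr", h.get("dpr", "1")))
    except ValueError:
        dpr = 1.0
    try:
        width = int(h.get("sec-ch-viewport-width", h.get("viewport-width", "1280")))
    except ValueError:
        width = 1280

    # Breakpoints are illustrative assumptions.
    viewport = "mobile" if width < 768 else "tablet" if width < 1024 else "desktop"

    return {"format": fmt, "viewport": viewport,
            "density_2x": dpr >= 2.0, "save_data": save_data}
```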

Cyclone Cache

Variant-aware disk cache

High-performance disk cache with capability-based keys. Supports best-fit fallback so near-miss variants can be served while the exact match is being generated. Data is memory-mapped for zero-copy serving.

Factory Worker

Lightweight optimization worker

Receives optimization requests via Unix socket. Runs HTML parsing, image transcoding, CSS/JS minification, and critical CSS extraction. Results are written to the Cyclone cache for subsequent zero-copy serving.

Beyond nginx, ModPageSpeed integrates as middleware for ASP.NET Core applications via the libpagespeed.so C API — same optimization pipeline, different stack.

Monitoring & Operations

Production-grade observability out of the box. Health checks, Prometheus metrics, cache invalidation, and Kubernetes-ready deployment.

  • Prometheus metrics

    The worker exposes metrics in Prometheus text exposition format via the management socket. Track variant generation rates, processing times by content type, cache size, and error counts.

  • Helm chart for Kubernetes

    Deploy as a Kubernetes pod with the included Helm chart. The caching proxy and worker run as sidecar containers sharing a cache volume.

  • Cache invalidation

    Purge all variants for a URL via the management socket. A single PURGE command removes all cached variants for the target URL.

  • Web console

    Built-in web console at /console/ for cache inspection, live dashboard, and URL analysis.

  • HTTP management API

    REST API for health checks, statistics, Prometheus metrics, configuration, and cache inspection. WebSocket streaming for real-time optimization events.

  • Configuration hot-reload

    RCU-based configuration updates. Change settings without restarting the worker or dropping connections. Applied atomically with zero downtime.
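For readers unfamiliar with the Prometheus text exposition format mentioned above, the sketch below renders one metric family in that format. The metric name and label used in the usage note are made up for illustration and are not the worker's actual metric names.

```python
def render_metric(name: str, help_text: str, metric_type: str, samples: list) -> str:
    """Render one metric family: HELP line, TYPE line, then one line per sample.

    `samples` is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"
```

Calling `render_metric("variants_generated_total", "Variants generated.", "counter", [({"type": "image"}, 42)])` produces a HELP line, a TYPE line, and the sample line `variants_generated_total{type="image"} 42`.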

Try it on your own infrastructure

Full access to every optimization. No feature gates, no usage limits.

14-day free trial. Cancel anytime.