Canonicalize JavaScript Libraries
Configuration
The 'Canonicalize JavaScript Libraries' filter is enabled by specifying:
- Apache:
ModPagespeedEnableFilters canonicalize_javascript_libraries
- Nginx:
pagespeed EnableFilters canonicalize_javascript_libraries;
in the configuration file.
Description
This filter identifies popular JavaScript libraries that can be replaced with ones hosted for free by a JavaScript library hosting service (by default, the Google Hosted Libraries). This has several benefits:
- Most important, first-time site visitors can benefit from browser caching, since they may have visited other sites making use of the same service to obtain the libraries.
- The JavaScript hosting service acts as a content delivery network for the hosted files, reducing load on the server and improving browser load times.
- There are no charges for the resulting use of bandwidth by site visitors.
- The hosted versions of library code are generally optimized with third-party minification tools. These optimizations can make use of library-specific annotations or minification settings that aren't portable to arbitrary JavaScript code, so the libraries benefit from more aggressive optimization than can be provided by PageSpeed.
In Apache the default set of libraries can be found in the
pagespeed_libraries.conf
file, which is loaded along with pagespeed.conf when Apache starts up. It
contains signatures for all the Google Hosted Libraries.
In Nginx you need to convert pagespeed_libraries.conf from Apache format to
Nginx format:
$ scripts/pagespeed_libraries_generator.sh > ~/pagespeed_libraries.conf
$ sudo mv ~/pagespeed_libraries.conf /path/to/nginx/configuration_files/
You also need to include it in your Nginx configuration by reference:
include pagespeed_libraries.conf;
Don't edit pagespeed_libraries.conf. Local edits will keep you from being able
to update it when you update PageSpeed. Rather than editing it you should add additional
libraries to your main configuration file:
- Apache:
ModPagespeedLibrary 43 1o978_K0_LNE5_ystNklf \
    //www.modpagespeed.com/rewrite_javascript.js
- Nginx:
pagespeed Library 43 1o978_K0_LNE5_ystNklf //www.modpagespeed.com/rewrite_javascript.js;
The general format of these entries is:
- Apache:
ModPagespeedLibrary bytes MD5 canonical_url
- Nginx:
pagespeed Library bytes MD5 canonical_url;
Here bytes is the size in bytes of the library after minification by
PageSpeed, and MD5 is the MD5 hash of the library after minification.
Minification controls for differences in whitespace that may occur when the same script is
obtained from different sources. The canonical_url is the hosting service URL
used to replace occurrences of the script. Note that the canonical URL in the above example
is protocol-relative; this means the data will be fetched using the same protocol (http
or https) as the containing page. Because older browsers don't handle
protocol-relative URLs reliably, PageSpeed resolves a protocol-relative library URL to an
absolute URL based on the protocol of the containing page. Do not use
http canonical URLs in configurations that may serve content over
https, or the rewritten pages will expose your site to attack and trigger a
mixed-content warning in the browser. Similarly, avoid using https URLs unless
you know that the resulting library will eventually be fetched from a secure page, as SSL
negotiation adds overhead to the initial request.
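The protocol-relative resolution described above can be illustrated with a short Python sketch. It is not PageSpeed's implementation; it only relies on the standard `urllib.parse.urljoin` behavior for scheme-less URLs. The function names `resolve_canonical_url` and `is_mixed_content` and the `www.example.com` page URLs are hypothetical, chosen for illustration.

```python
from urllib.parse import urljoin, urlsplit


def resolve_canonical_url(page_url: str, canonical_url: str) -> str:
    """Resolve a canonical URL against the containing page.

    A protocol-relative URL (one starting with "//") inherits the scheme
    of the page; an absolute URL is returned unchanged.
    """
    return urljoin(page_url, canonical_url)


def is_mixed_content(page_url: str, resolved_url: str) -> bool:
    """Return True when an https page would load the library over plain http."""
    return (
        urlsplit(page_url).scheme == "https"
        and urlsplit(resolved_url).scheme == "http"
    )


if __name__ == "__main__":
    canonical = "//www.modpagespeed.com/rewrite_javascript.js"
    for page in ("http://www.example.com/", "https://www.example.com/"):
        print(page, "->", resolve_canonical_url(page, canonical))
```

With a protocol-relative canonical URL the resolved scheme always matches the page, so `is_mixed_content` can only return True when the configured canonical URL hard-codes `http`.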
Additional library configuration metadata can be generated with the help of the
pagespeed_js_minify utility installed along with PageSpeed. To use this
utility, you will need a local copy of the JavaScript code that you wish to replace. If this
is stored in library.js, you can generate bytes and
MD5 as follows:
- Apache:
$ pagespeed_js_minify --print_size_and_hash library.js
- Nginx:
$ cd /path/to/psol/lib/Release/linux/ia32/
$ pagespeed_js_minify --print_size_and_hash library.js
If you're using the new JavaScript minifier, add
the --use_experimental_minifier argument to pagespeed_js_minify.
If you're using the old minifier, add --nouse_experimental_minifier. (As of
1.10.33.0, --use_experimental_minifier is the default. Previously,
--nouse_experimental_minifier was.) The default
pagespeed_libraries.conf includes hashes for both the old and new minifiers.
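Once you have the size and hash, the two directive forms differ only in their prefix and the trailing semicolon. The following Python sketch formats both from the same three values. The helper name `library_directives` is hypothetical, and the sketch does not run or parse the output of pagespeed_js_minify; it assumes you copy the size and hash from that tool by hand.

```python
from typing import Dict


def library_directives(size_bytes: int, md5_hash: str, canonical_url: str) -> Dict[str, str]:
    """Format a library entry for both servers.

    size_bytes and md5_hash are assumed to come from
    `pagespeed_js_minify --print_size_and_hash`; this helper only
    arranges them in the documented `bytes MD5 canonical_url` order.
    """
    fields = f"{size_bytes} {md5_hash} {canonical_url}"
    return {
        "apache": f"ModPagespeedLibrary {fields}",
        "nginx": f"pagespeed Library {fields};",
    }


if __name__ == "__main__":
    entry = library_directives(
        43,
        "1o978_K0_LNE5_ystNklf",
        "//www.modpagespeed.com/rewrite_javascript.js",
    )
    print(entry["apache"])
    print(entry["nginx"])
```

Add the resulting line to your main configuration file rather than to pagespeed_libraries.conf.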
This filter is based on the best practices of optimizing browser caching and reducing payload size.
Operation
In order to identify a library and canonicalize its URL, PageSpeed must of course be able to fetch the JavaScript code from the URL on the original page. Because library canonicalization identifies libraries solely by their size and hash signature, it is not necessary to authorize PageSpeed to fetch content from the domain hosting the canonical content itself. This means that it is safe to use this filter behind a reverse proxy or in other situations where network access by PageSpeed is deliberately restricted. Browsers visiting the site will fetch the content from the canonical URL, but PageSpeed itself does not need to do so.
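The reason no fetch from the canonical host is needed can be seen in a minimal Python sketch of signature-based lookup. This is an illustration of the idea, not PageSpeed's code: the names `signature` and `canonical_url_for`, the sample script body, and the `cdn.example.com` URL are hypothetical, and the hash is shown as a hex MD5 digest, whereas PageSpeed uses its own encoding of the hash. The input is assumed to be already minified.

```python
import hashlib
from typing import Dict, Optional, Tuple

# A library is identified only by (size in bytes, hash of minified content).
Signature = Tuple[int, str]


def signature(minified: bytes) -> Signature:
    """Compute the (size, hash) pair for already-minified script content.

    Assumption: a hex MD5 digest stands in for PageSpeed's hash encoding.
    """
    return len(minified), hashlib.md5(minified).hexdigest()


def canonical_url_for(minified: bytes, registry: Dict[Signature, str]) -> Optional[str]:
    """Return the canonical URL for a known library, or None if unrecognized.

    Only the script fetched from the original page is inspected; the
    canonical URL is looked up in the table and never fetched.
    """
    return registry.get(signature(minified))


if __name__ == "__main__":
    library = b"function hypotheticalLib(){return 1}"
    registry = {signature(library): "//cdn.example.com/lib.min.js"}
    print(canonical_url_for(library, registry))         # known library
    print(canonical_url_for(library + b";", registry))  # modified copy
```

A modified copy produces a different signature and is therefore left alone, which matches the requirement that only complete, unmodified libraries are rewritten.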
Examples
You can see the filter in action at www.modpagespeed.com on this
example.
If the HTML document looks like this:
<html>
<head>
<script src="jquery_1_8.js">
</script>
<script src="a.js">
</script>
<script src="b.js">
</script>
</head>
<body>
...
</body>
</html>
Then, assuming jquery_1_8.js was an unminified copy of the jQuery library and
a.js and b.js contained site-specific code that made use of
jQuery, the page would be rewritten as follows:
<html>
<head>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js">
</script>
<script src="a.js">
</script>
<script src="b.js">
</script>
</head>
<body>
...
</body>
</html>
The library URL has been replaced by a reference to the canonical minified version hosted on
ajax.googleapis.com. Note that canonical libraries do not participate in most
other JavaScript optimizations. For example, if
Combine JavaScript is also enabled, the above page will be
rewritten as follows:
<html>
<head>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.8.3/jquery.min.js">
</script>
<script src="http://www.example.com/a.js+b.js.pagespeed.jc.zYiUaxFS8I.js">
</script>
</head>
<body>
...
</body>
</html>
The canonical library is not combined with the other two JavaScript files, since this would lose the bandwidth and caching benefits of fetching it from the canonical URL.
If defer_javascript is enabled, and the library is
not tagged with data-pagespeed-no-defer, the canonicalized library is
deferred.
Requirements
Only complete, unmodified libraries referenced by <script>
tags in the HTML will be rewritten. Libraries that are loaded by other means (for example by
injecting a loader script) or that have been modified will not be canonicalized.
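To check which scripts on a page are even candidates, you can list the URLs that appear in `<script src="...">` tags in the HTML. The Python sketch below does this with the standard `html.parser` module. It is a rough approximation of the requirement above, not PageSpeed's parser; the names `ScriptSrcCollector` and `external_script_urls` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag in the markup."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag: str, attrs: List[Tuple[str, Optional[str]]]) -> None:
        if tag == "script":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def external_script_urls(html: str) -> List[str]:
    """Return script URLs referenced directly in the HTML, in document order."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    collector.close()
    return collector.sources


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<script src="jquery_1_8.js"></script>'
        '<script src="a.js"></script>'
        '<script>var s = document.createElement("script"); s.src = "loader.js";</script>'
        '</head><body></body></html>'
    )
    print(external_script_urls(page))
```

The script injected by the inline loader does not appear in the result, because it is never referenced by a tag in the HTML itself.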
Risks
You must ensure that you abide by the terms of service of the providers of the canonical content before enabling canonicalization. The terms of service for the default configuration can be found at https://developers.google.com/speed/libraries/terms.
The canonical URL refers to a third-party domain; this can cause additional DNS lookup latency the first time a library is loaded. This is mitigated by the fact that the canonical copy of the data is shared among multiple sites.
The initial request for a canonical URL will contain a Referer:
header with the URL of the referring page. This permits the host of the canonical content to
see a subset of traffic to your site (the first load of a page on your site that contains an
identified library by a browser that does not already have that library in its cache). The
provider should describe how this data is used in its terms of service. The terms of service
for the default configuration can be found at
https://developers.google.com/speed/libraries/terms. Again, this risk is mitigated by the fact that canonical libraries are shared among
multiple sites; a popular library is likely to already reside in the browser cache.
Sites serving content on both http and https URLs must use
protocol-relative canonical URLs as shown above. Fetching a library
insecurely from a secure page exposes a site to attack. Fetching a library securely from an
ordinary page can increase load time due to SSL overheads.
It is theoretically possible to craft a JavaScript file whose minified size and hash exactly match those of a canonical library, but whose code behaves differently. In such a case the crafted file will be replaced with the canonical (widely used) library. This will break the page that contains the reference to the crafted JavaScript.