When to optimize?
To optimize or not to optimize; that's the question.
It can be difficult to analyse why Percolate does not optimize a page. This section describes the rules Percolate follows to decide whether to optimize.
When Percolate receives a request, it proxies the request to the application backend. The response is then
sent to the client, and when all of the following rules are true, the URL is placed in the optimize queue. When an item is not placed
in the optimize queue it will also not be visible on the
- OPTIMIZE_PASSTHROUGH is set to true. If this is set to false, the sitemap crawler can be used so that only pages in the sitemap are optimized.
- The backend response status code is
- The backend response headers state that the object is cacheable. Percolate uses the http-cache-semantics package for this check. From the http-cache-semantics package page: "CachePolicy tells when responses can be reused from a cache, taking into account HTTP RFC 7234 rules for user agents and shared caches. It also implements RFC 5861, implementing stale-if-error and stale-while-revalidate. It's aware of many tricky details such as the Vary header, proxy revalidation, and authenticated responses."
- The request method is
- HTML_ONLY is set to false, or the content-type header exists and contains
- The request query parameters do not contain any of the parameters defined in
- There are no more than QUERY_MAX_PARAMS (default 0) query parameters. The parameters defined in
p) are not counted. So by default, no URLs with query parameters are optimized.
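The two query-parameter rules above can be sketched as follows. This is a minimal illustration, not Percolate's actual code: the function name, the option names `ignoredParams` and `blockedParams`, and the sample parameter `preview` are hypothetical stand-ins for the configured lists, and counting distinct parameter names (rather than occurrences) is an assumption.

```typescript
// Hypothetical sketch of the query-parameter rules; not Percolate's implementation.
interface QueryRuleOptions {
  maxParams: number;       // mirrors QUERY_MAX_PARAMS (default 0)
  ignoredParams: string[]; // assumed name: parameters that are not counted
  blockedParams: string[]; // assumed name: parameters that disqualify the URL
}

function passesQueryRules(rawUrl: string, opts: QueryRuleOptions): boolean {
  // Collect the distinct query parameter names (assumption: names, not occurrences).
  const names = Array.from(new Set(Array.from(new URL(rawUrl).searchParams.keys())));

  // Rule 1: the URL must not contain any disqualifying parameter.
  if (names.some((name) => opts.blockedParams.indexOf(name) !== -1)) {
    return false;
  }

  // Rule 2: after dropping ignored parameters, at most `maxParams` may remain.
  const counted = names.filter((name) => opts.ignoredParams.indexOf(name) === -1);
  return counted.length <= opts.maxParams;
}
```

With `maxParams` set to 0 and `p` in the ignored list, `https://example.com/shoes?p=2` would pass under these assumptions, while `https://example.com/shoes?color=red` would not.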
The worker picks up the item from the optimize queue and sends a new request to the application backend. This request
does not contain any cookies or other user-related data. For an HTML response, the worker checks all
meta tags and only optimizes the page if these conditions are met:
- OPTIMIZE_ROBOT_NOFOLLOW is set to true, or the tag value does not contain
- OPTIMIZE_ROBOT_NOINDEX is set to true, or the tag value does not contain
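
The meta tag conditions above might be expressed roughly like this. It is only a sketch under stated assumptions: the directive strings `nofollow` and `noindex` are guessed from the setting names and are not confirmed by this page, the function and option names are hypothetical, and the case-insensitive substring match is a simplification. The function takes the already extracted tag values, so HTML parsing is out of scope here.

```typescript
// Hypothetical sketch of the meta tag conditions; not Percolate's implementation.
interface RobotOptions {
  optimizeNofollow: boolean; // mirrors OPTIMIZE_ROBOT_NOFOLLOW
  optimizeNoindex: boolean;  // mirrors OPTIMIZE_ROBOT_NOINDEX
}

function robotsAllowOptimize(tagValues: string[], opts: RobotOptions): boolean {
  // Every checked tag value has to satisfy both conditions.
  return tagValues.every((value) => {
    const v = value.toLowerCase();
    // Assumed directive names: "nofollow" and "noindex".
    const followOk = opts.optimizeNofollow || v.indexOf("nofollow") === -1;
    const indexOk = opts.optimizeNoindex || v.indexOf("noindex") === -1;
    return followOk && indexOk;
  });
}
```

Under these assumptions, a page whose tag value is `noindex` is skipped unless `optimizeNoindex` is true, and a page without any such tag is always eligible as far as this check is concerned.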