Pack Hosting Panel

When does Percolate optimize a page?

To optimize or not to optimize, that's the question.

It can be difficult to work out why Percolate has not optimized a page. This page describes the rules Percolate follows when deciding whether to optimize.

When Percolate receives a request, it proxies the request to the application backend and sends the response to the client. If all of the following rules hold, the URL is placed in the optimize queue. A URL that is not placed in the optimize queue is also not shown on the /.percolate/status page.

  1. OPTIMIZE_PASSTHROUGH is set to true. If it is set to false, the sitemap crawler can be used instead, so that only pages listed in the sitemap are optimized.
  2. The backend response status code is 200.
  3. The backend response headers state that the object is cacheable. Percolate uses the http-cache-semantics package to decide this. From the http-cache-semantics package page: "CachePolicy tells when responses can be reused from a cache, taking into account HTTP RFC 7234 rules for user agents and shared caches. It also implements RFC 5861, implementing stale-if-error and stale-while-revalidate. It's aware of many tricky details such as the Vary header, proxy revalidation, and authenticated responses."
  4. The request method is GET.
  5. HTML_ONLY is set to false, or the content-type header contains html.
  6. The content-type header exists and contains html, javascript or css.
  7. The request's query string contains none of the parameters defined in QUERY_EXCLUDE (default: id, limit, dir, order, mode).
  8. There are no more than QUERY_MAX_PARAMS (default: 0) query parameters. Parameters defined in QUERY_MAX_PARAMS_EXCLUDE (default: p) are not counted. So by default, no URLs with query parameters other than p are optimized.
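
The eight rules above can be read as a single yes/no check. The Python sketch below is only an illustration of that reading, not Percolate's actual code: the names Config and should_enqueue are hypothetical, the cacheability decision (rule 3, made by http-cache-semantics in Percolate) is passed in as a ready-made boolean, and matching on the content-type header is assumed to be a case-insensitive substring test.

```python
from dataclasses import dataclass
from urllib.parse import parse_qsl, urlsplit


@dataclass
class Config:
    # Field names mirror the settings named in the rules (hypothetical class).
    optimize_passthrough: bool
    html_only: bool
    # Defaults below follow the defaults stated in rules 7 and 8.
    query_exclude: frozenset = frozenset({"id", "limit", "dir", "order", "mode"})
    query_max_params: int = 0
    query_max_params_exclude: frozenset = frozenset({"p"})


def should_enqueue(method, url, status, content_type, cacheable, config):
    """Return True only when all eight rules hold for this request/response."""
    if not config.optimize_passthrough:                  # rule 1
        return False
    if status != 200:                                    # rule 2
        return False
    if not cacheable:                                    # rule 3 (decided elsewhere)
        return False
    if method.upper() != "GET":                          # rule 4
        return False
    ctype = (content_type or "").lower()                 # assumption: case-insensitive
    if config.html_only and "html" not in ctype:         # rule 5
        return False
    if not any(kind in ctype for kind in ("html", "javascript", "css")):  # rule 6
        return False
    names = [name for name, _ in parse_qsl(urlsplit(url).query, keep_blank_values=True)]
    if any(name in config.query_exclude for name in names):              # rule 7
        return False
    counted = [name for name in names if name not in config.query_max_params_exclude]
    if len(counted) > config.query_max_params:           # rule 8
        return False
    return True


if __name__ == "__main__":
    cfg = Config(optimize_passthrough=True, html_only=False)
    for url in ("https://example.com/shop",
                "https://example.com/shop?p=2",
                "https://example.com/shop?color=red"):
        print(url, should_enqueue("GET", url, 200, "text/html", True, cfg))
```

With the default query settings, the first two example URLs pass (p is not counted) and the third does not, because color counts as one parameter and QUERY_MAX_PARAMS is 0.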

Robot meta tags

A worker picks up the item from the optimize queue and sends a new request to the application backend. This request contains no cookies or other user-related data. For an HTML response, the worker checks all <meta name="robots"> tags and only optimizes the page if both of these conditions are met.

  1. OPTIMIZE_ROBOT_NOFOLLOW is set to true, or the tag value does not contain nofollow.
  2. OPTIMIZE_ROBOT_NOINDEX is set to true, or the tag value does not contain noindex.
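
As a rough illustration of the robots check, the sketch below collects every <meta name="robots"> tag from an HTML document and applies the two conditions. It is not Percolate's implementation: the names RobotsMetaCollector and robots_allow_optimize are hypothetical, each setting is assumed to govern the directive of the same name, and "contains" is assumed to be a case-insensitive substring test on the tag's content attribute.

```python
from html.parser import HTMLParser


class RobotsMetaCollector(HTMLParser):
    """Collects the content value of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            self.values.append((attributes.get("content") or "").lower())


def robots_allow_optimize(html, optimize_robot_noindex, optimize_robot_nofollow):
    """Return True when no robots meta tag blocks optimization."""
    collector = RobotsMetaCollector()
    collector.feed(html)
    for value in collector.values:
        # Assumption: OPTIMIZE_ROBOT_NOINDEX overrides a noindex directive.
        if "noindex" in value and not optimize_robot_noindex:
            return False
        # Assumption: OPTIMIZE_ROBOT_NOFOLLOW overrides a nofollow directive.
        if "nofollow" in value and not optimize_robot_nofollow:
            return False
    return True


if __name__ == "__main__":
    page = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(robots_allow_optimize(page, False, False))
    print(robots_allow_optimize(page, True, False))
```

A page without any robots meta tag passes this check regardless of the two settings, since there is no tag value to test.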