Efficient caching in a REST API using Varnish ESI

Some time ago we were working on a REST API for our corporate applications. Once the API had stabilized, we moved on to optimization, because we expected it to be used very intensively. Looking for the optimal caching strategy, we established a set of criteria.

After much research we concluded that Varnish and its ESI tags are an almost ideal fit.

Edge-Side Include

Edge-Side Include (ESI) is a web standard proposed by Akamai and Oracle. A server that supports it can assemble a single page from content served under different URLs.

With ESI we can move the part of a page that should not be cached into a separate request and put the rest into the cache. This granularity lets us increase the “cache-hit ratio”, the share of requests that are served from the cache rather than by the backend. The higher the cache-hit ratio, the faster the page loads and the lower the costs.

<div>
    <strong>Your profile:</strong>
    <esi:include src="/user/profile" />
    <esi:include src="/user/last_activity" />
</div>
<div>
    <strong>Our products:</strong>    
    <esi:include src="/products" />
</div>

In this case the block that displays this section of our site was generated from 3 different URLs, and the content of each was inserted in place of the corresponding ESI tag <esi:include src="/..." />.

Assembling pages this way is nothing unusual in itself. The real benefit comes with Varnish, which lets us define a separate caching policy for each element included through ESI.
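To make the idea of per-fragment caching policies more concrete, here is a toy model written in Python. It is only a sketch of the concept and not how Varnish is implemented; the class name ToyEsiCache, the fetch and ttl_for callbacks, the sample pages and the TTL values are all made up for this illustration.

```python
import re
import time
from typing import Callable, Dict, Tuple

# Matches tags of the form <esi:include src="/some/url" />
ESI_TAG = re.compile(r'<esi:include\s+src="([^"]+)"\s*/>')


class ToyEsiCache:
    """Toy model of an ESI-aware cache with a separate TTL per URL."""

    def __init__(self, fetch: Callable[[str], str], ttl_for: Callable[[str], float]):
        self.fetch = fetch      # hypothetical backend call: URL -> body
        self.ttl_for = ttl_for  # hypothetical policy: URL -> TTL in seconds
        self.store: Dict[str, Tuple[float, str]] = {}
        self.backend_calls = 0

    def get_fragment(self, url: str) -> str:
        """Return the raw body for url, from the cache if it is still fresh."""
        now = time.monotonic()
        cached = self.store.get(url)
        if cached and cached[0] > now:
            return cached[1]
        self.backend_calls += 1
        body = self.fetch(url)
        ttl = self.ttl_for(url)
        if ttl > 0:  # a TTL of 0 means "never cache this fragment"
            self.store[url] = (now + ttl, body)
        return body

    def render(self, url: str) -> str:
        """Fetch url and replace every ESI tag with its rendered fragment."""
        body = self.get_fragment(url)
        return ESI_TAG.sub(lambda m: self.render(m.group(1)), body)


# Hypothetical pages: the profile is personal, the product list is shared.
PAGES = {
    "/": '<div><esi:include src="/user/profile" /></div>'
         '<div><esi:include src="/products" /></div>',
    "/user/profile": "<p>profile</p>",
    "/products": "<ul><li>product</li></ul>",
}
cache = ToyEsiCache(PAGES.__getitem__, lambda url: 0 if url.startswith("/user/") else 60)
first = cache.render("/")
second = cache.render("/")
print(cache.backend_calls)
```

In this sketch the second render only goes to the backend for /user/profile, because the page skeleton and the product list are still fresh in the cache.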

For more information about ESI tags in Varnish, refer to the official documentation (https://www.varnish-cache.org/docs/4.0/users-guide/esi.html). A basic understanding of how ESI tags work and of the Varnish configuration language (VCL) is needed to follow the description of this technique below.

ESI tags & REST API - concept

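The sample API used in this article exposes the following endpoints:

GET /api/rest/document
GET /api/rest/document/#{document_id}
GET /api/rest/attachment/#{attachment_id}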

Example response to the request GET /api/rest/document/16629:

<document>
    <id>16629</id>
    <user_id>478</user_id>
    <title>Sample documents</title>
    <attachments>
        <attachment>
            <id>556219</id>
            <title>Invoice</title>
            <content>......</content>
        </attachment>
        <attachment>
            <id>556220</id>
            <title>Invoice #2</title>
            <content>......</content>
        </attachment>
    </attachments>
</document>

The response (the XML data format is irrelevant here) is a document object with its attributes and the attachment objects assigned to it.

To build this response, the backend has to retrieve not only the data of the document entity itself but also the list of attachment objects that belong to it. If we put ESI tags pointing at the endpoint GET /api/rest/attachment/#{attachment_id} in their place, we can delegate fetching them to separate requests made by Varnish.

Inserting ESI tags where the attachment objects would be generated gives the following structure:

<document>
    <id>16629</id>
    <user_id>478</user_id>
    <title>Sample documents</title>
    <attachments>
        <esi:include src="/api/rest/attachment/556219" />
        <esi:include src="/api/rest/attachment/556220" />
    </attachments>
</document>
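On the backend side, producing such a response could look roughly like the sketch below. Python is used only for illustration, since the backend language is not specified here; the function render_document and the constant ATTACHMENT_URL are hypothetical names.

```python
from typing import Iterable
from xml.sax.saxutils import escape, quoteattr

# Hypothetical URL pattern, taken from the attachment endpoint used in the text.
ATTACHMENT_URL = "/api/rest/attachment/{id}"


def render_document(doc_id: int, user_id: int, title: str,
                    attachment_ids: Iterable[int]) -> str:
    """Render a document whose attachments are ESI tags instead of inline XML."""
    lines = [
        "<document>",
        f"    <id>{doc_id}</id>",
        f"    <user_id>{user_id}</user_id>",
        f"    <title>{escape(title)}</title>",
        "    <attachments>",
    ]
    for attachment_id in attachment_ids:
        # quoteattr() returns the URL already wrapped in quotes and escaped
        src = quoteattr(ATTACHMENT_URL.format(id=attachment_id))
        lines.append(f"        <esi:include src={src} />")
    lines += ["    </attachments>", "</document>"]
    return "\n".join(lines)


print(render_document(16629, 478, "Sample documents", [556219, 556220]))
```

Note that the function only needs the attachment IDs; the attachment titles and contents are never loaded when the document itself is rendered.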

When Varnish receives this response from the backend server, it makes two additional requests, one for each ESI tag: GET /api/rest/attachment/556219 and GET /api/rest/attachment/556220.

Varnish executes these requests synchronously, one by one, in the order in which the tags appear. The body of each response is inserted in place of the corresponding tag.

[Figure: REST API with subresources]

It is worth emphasizing that in the community edition ESI requests are synchronous and therefore blocking. Varnish sends the content on to the client as it receives it from the ESI requests, but each ESI request blocks the rest of the response until it completes.

[Figure: Varnish ESI requests flow]

So a request to GET /api/rest/document/16629 actually results in 3 HTTP requests to the backend: one to generate the document resource and two ESI requests to generate the attachment resources it needs. The time the client waits for the response is the sum of the times of these 3 requests. This is of course the most pessimistic case, in which none of the resources was in the cache, so generating each of them required a request to the backend.

In return, we can cache each of these resources separately and invalidate the cache only for the elements that have actually changed, leaving the rest untouched.

Collections

Now that we know the basic concept, we can apply the same technique to endpoints that return collections.

Example response to the request GET /api/rest/document:

<collection>
    <document>
        <id>16629</id>
        ...
    </document>
    <document>
        <id>16630</id>
        ...
    </document>
    ...
</collection>

We modify the response and put ESI tags in place of the documents:

<collection>
    <esi:include src="/api/rest/document/16629" />
    <esi:include src="/api/rest/document/16630" />
    <esi:include src="/api/rest/document/16631" />
    <esi:include src="/api/rest/document/16632" />
    <esi:include src="/api/rest/document/16633" />
</collection>

As in the case of a single element, Varnish keeps making requests as long as there are ESI tags left to resolve.

We also have nested ESI tags here because, as noted earlier, the request GET /api/rest/document/16629 can generate additional requests to GET /api/rest/attachment/#{id_attachment} to retrieve the associated attachment objects. Take the pessimistic case of an empty cache in which each of the 5 documents has 3 attachment objects: 1 request to GET /api/rest/document internally produces 5 document requests, each of which produces 3 attachment requests. That is 15 attachment requests on top of the 5 document requests, 20 synchronous ESI requests in total.

This is a disadvantage and an advantage at the same time. On the one hand it generates additional traffic on the backend; on the other it automatically warms up the cache for many items. In other words, a single request warms up the cache for the collection, its 5 documents and their 15 attachments.
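The request count for the cold-cache case can be checked with a few lines of Python. This is just arithmetic for the two-level nesting described above (collection, documents, attachments); the function name and its parameters are made up for the illustration.

```python
def cold_cache_requests(documents: int, attachments_per_document: int) -> dict:
    """Count backend requests caused by one collection request on an empty cache."""
    document_requests = documents
    attachment_requests = documents * attachments_per_document
    return {
        "collection": 1,
        "documents": document_requests,
        "attachments": attachment_requests,
        "total": 1 + document_requests + attachment_requests,
    }


print(cold_cache_requests(documents=5, attachments_per_document=3))
```

For 5 documents with 3 attachments each this gives 15 attachment requests and 21 backend requests overall, counting the collection request itself.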

An API designed this way requires the developer to implement rendering only for single items, because both collections and nested objects are in fact composed of single-item endpoint responses.

[Figure: REST API collection]

Parallel ESI

Asynchronous ESI requests would be very helpful here, as they would greatly improve the performance of collection endpoints. Unfortunately, as of today (December 2016), parallel ESI is available only in the commercial Varnish Plus (https://info.varnish-software.com/blog/varnish-lab-parallel-esi), and it does not look like it will be moved to the community version any time soon (https://www.varnish-cache.org/lists/pipermail/varnish-misc/2014-October/024039.html).

The difference that parallel processing makes when building pages composed of ESI tags is obvious:

[Figure: Varnish ESI requests flow]
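A rough way to reason about the gain is to compare the sum of the fragment times with the slowest fragment. The Python sketch below is an idealized model only: it assumes a single level of ESI tags, unlimited concurrency in the parallel case and no overhead, and the latencies are invented numbers rather than measurements.

```python
from typing import Sequence


def sequential_ms(parent_ms: float, fragment_ms: Sequence[float]) -> float:
    """Synchronous ESI: every fragment waits for the previous one to finish."""
    return parent_ms + sum(fragment_ms)


def parallel_ms(parent_ms: float, fragment_ms: Sequence[float]) -> float:
    """Idealized parallel ESI: the slowest fragment decides the total time."""
    return parent_ms + max(fragment_ms, default=0)


# Hypothetical latencies: 40 ms for the collection, 30 ms for each of 5 documents.
fragments = [30, 30, 30, 30, 30]
print(sequential_ms(40, fragments), parallel_ms(40, fragments))
```

With these made-up numbers the sequential case takes 190 ms and the idealized parallel case 70 ms; real results depend on the backend and on how deeply the tags are nested.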

ESI tags & REST API - Retrieving Data

In some cases, retrieving the data needed to generate an endpoint's content can be reduced to fetching only the data needed to build the resource URLs.

In our example, when retrieving a collection of document objects, the backend only has to fetch the primary keys and then generate a “template” with the ESI tags.

This reduces the traffic between the database and the application. Note also that the “template” for the collection endpoint is itself saved to the cache once created, so the next request will use neither the application nor the database.
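A minimal Python sketch of such a collection "template" is shown below. It assumes the primary keys have already been fetched, for instance with a query that selects only the id column; the names render_collection and DOCUMENT_URL are hypothetical.

```python
from typing import Iterable

# Hypothetical URL pattern, taken from the document endpoint used in the text.
DOCUMENT_URL = "/api/rest/document/{id}"


def render_collection(document_ids: Iterable[int]) -> str:
    """Build the collection response from primary keys only."""
    lines = ["<collection>"]
    for document_id in document_ids:
        url = DOCUMENT_URL.format(id=document_id)
        lines.append(f'    <esi:include src="{url}" />')
    lines.append("</collection>")
    return "\n".join(lines)


# The IDs would normally come from the database; here they are hard-coded.
print(render_collection([16629, 16630, 16631]))
```

No document attribute other than the ID is touched, which is where the saving in database traffic comes from.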

ESI tags & REST API - Problems with implementation

JSON data format

If the API returns data in a format other than XML, for example JSON, Varnish will not parse the ESI tags by default. To make Varnish scan such documents for ESI tags (which are XML nodes), the parameter feature=+esi_disable_xml_check must be set in the daemon startup options.

DAEMON_OPTS="-a :6081 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,2GB \
             -p feature=+esi_disable_xml_check"
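For illustration, a JSON response carrying ESI tags could be produced as in the Python sketch below. The function name is hypothetical and the field names simply mirror the XML example used earlier. The body starts with "{" rather than "<", and it only becomes valid JSON once Varnish has replaced the tags, which is why the XML check has to be switched off.

```python
import json
from typing import Iterable


def render_document_json(doc_id: int, user_id: int, title: str,
                         attachment_ids: Iterable[int]) -> str:
    """Return a JSON-like template with ESI tags in place of the attachments."""
    includes = ",".join(
        f'<esi:include src="/api/rest/attachment/{attachment_id}" />'
        for attachment_id in attachment_ids
    )
    # Serialize the scalar fields normally, then splice in the raw ESI tags.
    head = json.dumps({"id": doc_id, "user_id": user_id, "title": title})
    prefix = head[:-1]  # drop the closing "}" of the serialized object
    return prefix + ', "attachments": [' + includes + "]}"


body = render_document_json(16629, 478, "Sample documents", [556219, 556220])
print(body)
```

The commas between the tags are part of the template, so each attachment endpoint is expected to return exactly one JSON value.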

Handling errors in subresources

Sometimes Varnish tries to fetch a resource referenced by an ESI tag that no longer exists. For a single resource this is not a big problem, but in a collection the result can be a mix of HTML describing the 404 error and the JSON content of the other resources, which is a syntax error for the whole document.

The problem is easy to solve with Varnish VCL. When Varnish detects an error response for an ESI subresource, we can replace it with our own content, in this case an empty string.

# std.integer() used below comes from the std VMOD
import std;

sub vcl_recv {
    # Pass the ESI nesting level along with each request (0 for a top-level request)
    set req.http.X-Varnish-Esi-Level = req.esi_level;
}

sub vcl_backend_response {
    if (beresp.status == 404 || (beresp.status >= 500 && beresp.status <= 599)) {
        # A top-level request should still get the real error page, so the
        # response is abandoned only when it belongs to an ESI subrequest.
        if (bereq.http.X-Varnish-Esi-Level && std.integer(bereq.http.X-Varnish-Esi-Level, 0) > 0) {
            return (abandon);
        }
    }
}

sub vcl_backend_error {
    # Act only on ESI subrequests
    if (bereq.http.X-Varnish-Esi-Level && std.integer(bereq.http.X-Varnish-Esi-Level, 0) > 0) {
        # The empty string is returned in place of the faulty ESI tag
        synthetic("");
        return (deliver);
    }
}
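The effect of a broken fragment on the assembled document can be reproduced outside Varnish. The Python sketch below uses XML instead of JSON and an invented error page; it only demonstrates why an empty string is a safer replacement than the error body, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Iterable

# Invented HTML error page, similar to what a backend might return for a 404.
ERROR_PAGE = "<!DOCTYPE html><html><body><h1>404 Not Found</h1><br></body></html>"


def assemble(fragments: Iterable[str]) -> str:
    """Concatenate already fetched fragments the way ESI processing would."""
    return "<collection>" + "".join(fragments) + "</collection>"


def is_well_formed(xml_text: str) -> bool:
    """Return True if xml_text parses as XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


good = "<document><id>16629</id></document>"
print(is_well_formed(assemble([good, ERROR_PAGE])))  # error page inserted as is
print(is_well_formed(assemble([good, ""])))          # error replaced by ""
```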

Setting the TTL

If we want to set the TTL for each endpoint separately, the backend should pass it in an HTTP response header, and vcl_backend_response should then apply the received value.

sub vcl_backend_response {
    if (beresp.http.X-Varnish-Cache-Ttl ~ "^\d+$") {
        set beresp.ttl = std.duration(regsub(beresp.http.X-Varnish-Cache-Ttl, "^(\d+)$", "\1s"), 3600s);
    }
}
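On the application side, the header could be produced as in the following Python sketch. The TTL values, the endpoint names used as keys and the function name are assumptions made for the example; only the header name X-Varnish-Cache-Ttl comes from the VCL above.

```python
import re

# Hypothetical per-endpoint TTLs in seconds; these are not real project values.
TTL_BY_ENDPOINT = {"document": 300, "attachment": 3600}
DEFAULT_TTL = 60  # also an assumption


def ttl_header(endpoint: str) -> dict:
    """Build the response header that the VCL above reads the TTL from."""
    ttl = int(TTL_BY_ENDPOINT.get(endpoint, DEFAULT_TTL))
    if ttl < 0:
        raise ValueError("TTL must not be negative")
    value = str(ttl)
    # The VCL only accepts values made of digits, so check the same shape here.
    assert re.fullmatch(r"\d+", value)
    return {"X-Varnish-Cache-Ttl": value}


print(ttl_header("attachment"))
```

The value is sent as a bare number of seconds because the VCL appends the "s" suffix itself before converting it to a duration.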

Tagging elements and cache invalidation

For precise cache invalidation we should tag the ESI responses. Tags can be passed the same way as the TTL, in HTTP response headers. Keep in mind that both collections and individual resources can be composed of different models, so the format should make it possible to distinguish the keys of each model, for example:

X-Varnish-Tags: model1(12,14,15,16,17);model2(44,56,57,58,59);
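A header value in this format is straightforward to generate. The Python sketch below is one possible way to do it; the function name is hypothetical and the model names and IDs are the ones from the example header.

```python
from typing import Dict, Iterable


def build_tags_header(tags: Dict[str, Iterable[object]]) -> str:
    """Serialize {model: ids} into the model(id,id,...); format shown above."""
    parts = []
    for model, ids in tags.items():
        joined = ",".join(str(resource_id) for resource_id in ids)
        parts.append(f"{model}({joined});")
    return "".join(parts)


header_value = build_tags_header({
    "model1": [12, 14, 15, 16, 17],
    "model2": [44, 56, 57, 58, 59],
})
print("X-Varnish-Tags: " + header_value)
```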

With a header built this way we can easily invalidate the cache for a particular resource through the Varnish administration interface:

# Ban cache for model "model1" with ID "123"
ban obj.http.X-Varnish-Tags ~ .*model1\(([A-Za-z0-9-_]+,)*123(,[A-Za-z0-9-_]+)*\);.*
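Before sending a ban, the expression can be checked against sample header values. The Python sketch below builds the core of the pattern for a given model and ID and tests it with Python's re module, which stands in here for the regex engine used by Varnish; the two may differ in corner cases. The function name is hypothetical, and the hyphen in the character class is moved to the end, which matches the same characters.

```python
import re


def ban_pattern(model: str, resource_id: str) -> str:
    """Build a regex matching a tags header that lists resource_id under model."""
    ident = r"[A-Za-z0-9_-]+"
    return (
        rf"{re.escape(model)}\(({ident},)*"
        rf"{re.escape(resource_id)}(,{ident})*\);"
    )


pattern = ban_pattern("model1", "123")
print("ban obj.http.X-Varnish-Tags ~ " + pattern)

# Sample header values (made up) to check the pattern against.
for value in ("model1(12,123,15);model2(44);", "model1(12,1234,15);", "model2(123);"):
    print(value, bool(re.search(pattern, value)))
```

The ID has to be delimited by a comma or a parenthesis on both sides, so banning 123 does not also hit objects tagged with 1234.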

Summary

This technique has its advantages and disadvantages, but for our API, where endpoints contain many subresources, it proved to be very efficient.

The implementation is not complicated. ESI lets us divide the API into separate logical elements that can be put together like building blocks.