Caching Strategies in a Federated GraphQL Architecture

StockX Tech
Mar 29, 2021

Kyle Schrade
Engineer at StockX
@NotKyleSchrade

There are two difficult problems in computer science: naming, caching, and off-by-one errors.

Caching is one of the hard ones, and it gets even trickier once your GraphQL schema is federated across multiple services.

This post will explore multiple caching strategies in a federated GraphQL system, from local caching and memoization to distributed caching.

You can find the code describing each strategy here.

No caching: a basic federated architecture

Our initial architecture is pretty simple:

  • A GraphQL gateway, to consume requests
  • An implementing service, where we define the resolvers
  • A backend service, to retrieve the data for the implementing service

Every request is passed from the gateway to the single implementing service, which resolves the data from the backend service.

There’s no caching here yet, and you can see an example of this basic no-caching architecture here.
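As a rough sketch of the shape of the implementing service (assuming Apollo Server with `@apollo/subgraph`; the schema, backend URL, and port below are placeholders, not the repo's exact code):

```typescript
import { ApolloServer } from "@apollo/server";
import { startStandaloneServer } from "@apollo/server/standalone";
import { buildSubgraphSchema } from "@apollo/subgraph";
import gql from "graphql-tag";

// Hypothetical subgraph schema for the implementing service.
const typeDefs = gql`
  type User @key(fields: "id") {
    id: ID!
    name: String
  }

  type Query {
    user(id: ID!): User
  }
`;

const resolvers = {
  Query: {
    // Every request goes straight to the backend service -- no caching yet.
    user: async (_: unknown, { id }: { id: string }) => {
      const response = await fetch(`http://backend-service/users/${id}`);
      return response.json();
    },
  },
};

// The gateway composes this subgraph (and any others) into the full graph.
const server = new ApolloServer({
  schema: buildSubgraphSchema({ typeDefs, resolvers }),
});

startStandaloneServer(server, { listen: { port: 4001 } });
```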

Local caching

Looking at the architecture above, you might notice that there’s a one-to-one ratio between requests coming into the gateway and requests hitting the underlying backend service.

If we wanted to reduce strain on the backend service, we could introduce a cache, preventing the need to re-fetch data that the implementing service has already fetched.

The simple approach would be introducing a local, in-memory cache. Our diagram now looks like this:

Visit the GitHub repo for this post to find an example of the local caching strategy.
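As a rough illustration of the idea (the interface and names below are mine, not necessarily the repo's), a local cache can be as small as a Map behind a tiny interface. Making `get` and `set` async may look unnecessary here, but it lets the same interface back a distributed cache later:

```typescript
// A minimal cache interface; the method names are illustrative.
// get/set are async so the same interface can back a distributed cache later.
interface Cache<T> {
  get(key: string): Promise<T | undefined>;
  set(key: string, value: T): Promise<void>;
}

// In-memory implementation backed by a Map. Note that every replica of the
// implementing service holds its own private copy of this cache.
class InMemoryCache<T> implements Cache<T> {
  private store = new Map<string, T>();

  async get(key: string): Promise<T | undefined> {
    return this.store.get(key);
  }

  async set(key: string, value: T): Promise<void> {
    this.store.set(key, value);
  }
}
```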

Memoization

When we talk about caching, in most cases, we are talking about memoization. Don’t worry; the name used to confuse me too.

Memoization refers to the technique of caching a function’s output based on its input. For a pure function, if the parameters are the same, the result will be the same too! So instead of re-executing the function, we can store the output and reuse it. Memoization is what our local caching approach is built on.

Let’s say we know that executing getUser(id:1) fetches the user with ID = 1, and running it with the same parameter will return the same result. Since the result will always be the same with these arguments, we can just memoize the result and return that!

Let’s say we have our `getUser` function:
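The original post embeds the function here; as a stand-in, assume something like the following, where the `User` shape and backend URL are hypothetical:

```typescript
interface User {
  id: string;
  name: string;
}

// Fetches a user from the backend service. With the same id, this always
// returns the same data, which makes it a good candidate for memoization.
async function getUser(id: string): Promise<User> {
  const response = await fetch(`http://backend-service/users/${id}`);
  return response.json();
}
```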

If we wanted to memoize this function, we could use this generic memoization pattern.
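A sketch of that pattern, written against the `Cache` interface from the local-caching section (the exact implementation linked from the post may differ):

```typescript
// Wraps innerFn so repeated calls with the same arguments are served
// from the cache instead of re-executing the function.
function memoize<Args extends unknown[], Result>(
  innerFn: (...args: Args) => Promise<Result>,
  getKeyFn: (...args: Args) => string,
  cache: Cache<Result>
): (...args: Args) => Promise<Result> {
  return async (...args: Args): Promise<Result> => {
    const key = getKeyFn(...args);

    // Return the cached value if we already have one for this key.
    const cached = await cache.get(key);
    if (cached !== undefined) {
      return cached;
    }

    // Otherwise call the inner function, cache its result, and return it.
    const result = await innerFn(...args);
    await cache.set(key, result);
    return result;
  };
}
```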

Then we could apply `memoize` to the `getUser` function to create a memoized version of it.
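With the pieces above, that could look like this (the `user:` key prefix is just an illustrative convention):

```typescript
// A local cache instance for user lookups.
const userCache = new InMemoryCache<User>();

// A memoized version of getUser: the cache key is derived from the id.
const memoizedGetUser = memoize(
  getUser,
  (id) => `user:${id}`,
  userCache
);

// The first call fetches from the backend; the second is served from the cache.
await memoizedGetUser("1");
await memoizedGetUser("1");
```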

Let’s break down what’s happening above.

We have a function that takes in an `innerFn`, `getKeyFn`, and a `cache`.

The `innerFn` is the function you are trying to memoize, and the `getKeyFn` function is used to create a cache key.

The memoization function uses the arguments of the `innerFn` to create a cache key with `getKeyFn`, then checks the cache for a value at that key. If the cache contains a value, it is returned.

If the cache doesn’t contain a value, the memoization function executes the `innerFn`, saves that value using the key, and returns the result.

You can find the entire implementation of this function here.

Distributed caching

An in-memory cache works until you start to scale your implementing service.

Once you have more than one replica, the in-memory cache becomes less effective since the caches aren’t shared. A distributed cache helps solve this problem.

Using a distributed cache

Distributed caching is a technique where the cache is external to a service and shared between multiple replicas.

You may already be familiar with distributed caching if you have worked with Redis, but many tools fit this problem space. Key-value stores like Redis, Hazelcast, or Memcached are often used, but you could also use a database as a distributed cache; NoSQL databases such as DynamoDB and MongoDB are popular choices. The goal is to have a shared place to save values, accessible by multiple service replicas.

One important thing to keep in mind when choosing a distributed cache: the latency of retrieving data from the cache should be lower than the latency of re-fetching the data from the underlying service.

If we substitute our existing memoization functions with a cache object that points to a distributed cache instead of a local cache implementation, we get something like this instead:
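Here's one possible sketch using ioredis; the client choice, JSON serialization, key names, and 60-second TTL are assumptions for illustration, not the post's exact code:

```typescript
import Redis from "ioredis";

// A Redis-backed implementation of the same Cache interface used above.
// Values are stored as JSON so any serializable type works.
class RedisCache<T> implements Cache<T> {
  private redis = new Redis(process.env.REDIS_URL ?? "redis://localhost:6379");

  async get(key: string): Promise<T | undefined> {
    const value = await this.redis.get(key);
    return value === null ? undefined : (JSON.parse(value) as T);
  }

  async set(key: string, value: T): Promise<void> {
    // Expire entries after 60 seconds so stale data eventually falls out.
    await this.redis.set(key, JSON.stringify(value), "EX", 60);
  }
}

// Swapping the cache is the only change -- the memoized function is the same.
const memoizedGetUser = memoize(
  getUser,
  (id) => `user:${id}`,
  new RedisCache<User>()
);
```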

For our TypeScript example, you can strictly type the cache object argument of the memoization function. You could then switch from using your local in-memory cache to Redis, MongoDB, DynamoDB, etc. — whatever you like!

Awe-inspiring benefits of distributed caching

Let’s expand our architecture a little bit further to include more federated services:

One of the best parts of using a distributed cache is how it enables the implementing services in your federated GraphQL architecture to cache things for each other. If both implementing services are making calls to the same endpoint in the underlying backend service, they can share that cache value. This is HUGE!

For a practical example, say you have two implementing services: media and product. Suppose a user searches for a product on your site. That search query ends up calling your product implementing service, which calls a backend service to get the image URLs for that product. Using the architecture we’ve described, the URLs would get cached in the distributed cache.

When the user then clicks on the product to take a closer look, the client issues a query to the media service to fetch the image. Because the product service already cached those URLs, the media service can take advantage of the distributed cache and avoid fetching the image URLs from the backend service again.
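Concretely, the sharing works because both implementing services derive the same cache key for the same backend data. The key scheme and the `fetchImageUrls` helper below are made up for illustration:

```typescript
// Stand-in for the backend call both services make to fetch image URLs.
async function fetchImageUrls(productId: string): Promise<string[]> {
  const response = await fetch(
    `http://backend-service/products/${productId}/images`
  );
  return response.json();
}

// In the product implementing service:
const getProductImageUrls = memoize(
  fetchImageUrls,
  (productId) => `product-images:${productId}`,
  new RedisCache<string[]>()
);

// In the media implementing service, using the same key scheme means a value
// cached by the product service is a cache hit here, and vice versa.
const getMediaImageUrls = memoize(
  fetchImageUrls,
  (productId) => `product-images:${productId}`,
  new RedisCache<string[]>()
);
```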

Further improvement: batching requests

Often, fetching data in a batch is more efficient than making individual requests. But when caching the fetched data, you want to cache each response as if it had been requested individually; that way, you can take advantage of partially cached batch requests.

DataLoader is a marvelous Node.js package used to batch requests together (check out the docs to learn more about how it works). We can add this library between our cache and the backend service like this:
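A sketch of how the pieces could fit together, reusing the memoized `getUser` example; the batch endpoint on the backend service is hypothetical:

```typescript
import DataLoader from "dataloader";

// Batches individual user lookups into one request to the backend service.
// In a real service you would usually create the loader per request, since
// DataLoader also memoizes results by key for its own lifetime.
const userLoader = new DataLoader<string, User>(async (ids) => {
  const response = await fetch(
    `http://backend-service/users?ids=${ids.join(",")}`
  );
  const users: User[] = await response.json();

  // DataLoader expects results in the same order as the requested keys.
  return ids.map((id) => users.find((user) => user.id === id)!);
});

// The distributed cache sits in front of the loader: cache hits never reach
// DataLoader, and the remaining cache misses are batched into one request.
const memoizedGetUser = memoize(
  (id: string) => userLoader.load(id),
  (id) => `user:${id}`,
  new RedisCache<User>()
);
```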

Going back to the product and media example, assume we requested seven products and their images. Without batching, every cache miss would result in a request to the backend service; if we had three cache hits and four cache misses, that would result in four requests to the backend service. With batching, we can bundle up the four requests into one!

This strategy can significantly increase our efficiency, and Dataloader enables us to use batching without overcomplicating our system.

Conclusion

We now have a robust caching strategy! Starting with a local cache, adding a distributed cache, and evolving to support batching, we have explored an emergent caching architecture in a federated GraphQL implementation.

But the journey doesn’t stop here! Two additional forms of caching not mentioned in this post are client-side caching and HTTP caching. With client-side caching, you can reduce the amount of data requested via the client, and with HTTP caching — using a tool like Varnish — the cache becomes a bubble around your service, serving whole responses before requests ever reach it.

These approaches are well worth exploring as you further optimize your GraphQL architecture.

If you’re interested in changing the game, please apply to join our team.
