Apollo Caching 101

This blog post gives an overview of the caching mechanisms on both the front end and the back end of a GraphQL Apollo stack. It presents the misconfigurations that can exist in Apollo's caching mechanisms, together with a checklist for penetration testing engagements and recommendations for developers.

When using Apollo Server (with CDNs) and Apollo Client + React.js (very popular and recommended by Apollo for web development), several issues can arise in the web application that lead to various types of information leakage through the caching mechanisms.

Caching in Apollo-Client:

Apollo Client (for React, but also iOS and Android) implements an in-memory cache. As of Apollo Client 3.0, the InMemoryCache class is provided by the @apollo/client package. The local cache maximizes Apollo's performance by caching certain query results to minimize network interactions with Apollo Server.

From the implementation perspective, initializing the in-memory cache only requires creating the object and passing it to the ApolloClient constructor. See the example below:

import { InMemoryCache, ApolloClient } from '@apollo/client';

const client = new ApolloClient({
  // ...other arguments...
  cache: new InMemoryCache(options)
});

There are many options that can be passed to the InMemoryCache constructor. In general, the goal of passing those options is to:

customize the primary key of the cache,
customize the storage and retrieval of the fields,
manage client-side local state.

Advanced applications that serve both static and dynamic content require a more advanced configuration to ensure that the local state works properly and that the local storage is secure.

From the penetration testing perspective, it is not important to know how the configuration is done, but the pentester must check what kind of data is stored in the in-memory cache and for how long it is stored there. The cache is a simple key-value store in which the query results are normalized. By default, the in-memory cache also stores triggered mutations together with the parameters passed to them.

InMemoryCache generates a unique identifier for any object that includes a __typename field. To do so, it combines the object’s __typename with its id or _id field (whichever is defined). These two values are separated by a colon (:). For example, an object with a __typename of Task and an id of 14 is assigned a default identifier of Task:14. The key generation can be customized by developers, so the mechanism can differ between applications. The cache can be accessed via the Developer Console in your favorite web browser. See the example below.

apollo_caching
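The default identifier scheme described above can be sketched in a few lines. This is a simplified model for illustration, not Apollo's actual implementation, and the function name defaultCacheId is hypothetical:

```javascript
// Simplified model of the default identifier scheme: __typename + ":" + id (or _id).
// Applications may override this with custom key fields, so treat it as a sketch.
function defaultCacheId(object) {
  const { __typename, id, _id } = object;
  // Objects without a __typename get no identifier of their own.
  if (__typename === undefined) return undefined;
  // Use id when it is defined, otherwise fall back to _id.
  const key = id !== undefined ? id : _id;
  if (key === undefined) return undefined;
  return `${__typename}:${key}`;
}

console.log(defaultCacheId({ __typename: 'Task', id: 14 })); // Task:14
```

Knowing the key format helps when reading the cache in the console, since each normalized object appears under a key of this shape unless the developers customized it.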

The object window.__APOLLO_CLIENT__.cache.data.data holds all the data stored in the Apollo Client in-memory cache. It should be examined to see whether any personal data, passwords, or credit card data is cached on the front end. By default, the object contains all the query results along with the mutation parameters. For example, a login mutation takes the email and password as parameters, so the in-memory cache would store those values. Example:

apollo_caching

It is also important to check how long the in-memory cache stays valid in the JavaScript context. A high risk arises in web applications where cached data containing PII persists between sessions, i.e. the cache is not purged after logout or session invalidation. Example:

apollo_caching

To persist and rehydrate the Apollo cache from a storage provider like AsyncStorage or localStorage, developers can use apollo3-cache-persist, which works with all Apollo caches. This is done by simply passing the Apollo cache and a storage provider to persistCache. By default, the contents of the Apollo cache are immediately restored (asynchronously) and persisted upon every write to the cache, with a short configurable debounce interval.

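import AsyncStorage from '@react-native-async-storage/async-storage';
import { InMemoryCache } from '@apollo/client';
import { persistCache, AsyncStorageWrapper } from 'apollo3-cache-persist';

const cache = new InMemoryCache();

persistCache({
  cache,
  // AsyncStorage is no longer exported by react-native itself; current
  // apollo3-cache-persist versions expect a storage wrapper.
  storage: new AsyncStorageWrapper(AsyncStorage),
}).then(() => {
  // Continue setting up Apollo as usual.
});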

This is why the penetration tester needs to examine the web browser’s persistent storage for sensitive cached data. If PII or other sensitive data persists between sessions, that must be reported as a security risk.

Recommendations for developers to fix in-memory caching:

Developers should have an established policy for what kind of data can be saved to the in-memory cache. Some data, such as credit-card-related data, must never land in the Apollo Client in-memory cache. Developers can achieve that by using type policies or the @connection directive in the client-side schema definition. https://www.apollographql.com/docs/react/caching/advanced-topics/#the-connection-directive / https://www.apollographql.com/docs/react/caching/cache-configuration/#typepolicy-fields
The mutation parameters must not be saved to the in-memory cache (especially the passwords from the login mutation). Developers can achieve that by using type policies or the @connection directive in the client-side schema definition. https://www.apollographql.com/docs/react/caching/advanced-topics/#the-connection-directive / https://www.apollographql.com/docs/react/caching/cache-configuration/#typepolicy-fields
The in-memory cache must be purged after logout or any other session invalidation. This can be achieved by calling the ApolloClient method resetStore(). https://www.apollographql.com/docs/react/caching/advanced-topics/#resetting-the-store
If persistent storage (local storage, session storage, cookies) is used for the in-memory cache, no PII must be persisted.
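The purge-on-logout recommendation could look roughly like the sketch below. The helper name logoutAndPurge is hypothetical, and the storage key 'apollo-cache-persist' is an assumption about where the persisted cache lives, so check the key your application actually uses. Note that the sketch calls clearStore() rather than the resetStore() mentioned above: both are ApolloClient methods, but resetStore() also refetches active queries, which is usually not wanted right after logout.

```javascript
// Hypothetical logout helper. `client` is expected to be an ApolloClient
// instance and `storage` a localStorage-like object. The default key name
// below is an assumption, not something taken from the application.
async function logoutAndPurge(client, storage, persistedKeys = ['apollo-cache-persist']) {
  // clearStore() empties the in-memory cache without refetching active queries.
  await client.clearStore();
  // Remove any persisted copy of the cache so it cannot be rehydrated later.
  for (const key of persistedKeys) {
    storage.removeItem(key);
  }
}
```

In a browser this might be called as logoutAndPurge(client, window.localStorage) from the logout handler, before redirecting the user.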

Checklist for pentesters for testing caching in Apollo-Client:

Inspect the data saved in window.__APOLLO_CLIENT__.cache.data.data and look for data that should never land in the cache, like passwords, secret values, or credit card data. Inspect both the query results and the mutation parameters.
Check whether the cache is purged once the user logs out of their session. If data stays in the cache after logout, this is considered a security risk.
Check whether session storage, local storage, async storage, or cookies are used to persist the Apollo Client in-memory cache. If so, examine the data stored there for PII and other sensitive data persisting between sessions.
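The first checklist item can be partly automated. The sketch below walks a cache snapshot and flags key names that look sensitive. The pattern list and the sample snapshot are made up for illustration, and the function name findSensitiveEntries is hypothetical:

```javascript
// Field-name patterns that suggest sensitive content. This list is an
// illustrative assumption; extend it for the application under test.
const SENSITIVE_KEY = /pass(word)?|secret|token|card|cvv|ssn|email/i;

// Recursively walk a cache snapshot and collect the paths of keys whose
// names (including serialized mutation arguments) look sensitive.
function findSensitiveEntries(node, path = [], hits = []) {
  if (node === null || typeof node !== 'object') return hits;
  for (const [key, value] of Object.entries(node)) {
    const here = [...path, key];
    if (SENSITIVE_KEY.test(key)) hits.push(here.join(' > '));
    findSensitiveEntries(value, here, hits);
  }
  return hits;
}

// Example snapshot shaped like a normalized cache (made-up data).
const snapshot = {
  'User:1': { __typename: 'User', id: '1', email: 'a@example.com' },
  ROOT_MUTATION: { 'login({"email":"a@example.com","password":"hunter2"})': 'abc' },
};
console.log(findSensitiveEntries(snapshot));
```

In the browser console, the input could be the cache object mentioned in the checklist, or the result of the cache's extract() method. The same function can be pointed at parsed local storage or session storage values to cover the persistence check.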

Caching in Apollo-Server:

Apollo Server provides a mechanism for server authors to declare fine-grained cache control parameters on individual GraphQL types and fields, both statically inside your schema using the @cacheControl directive and dynamically within your resolvers using the info.cacheControl.setCacheHint API.

For each request, Apollo Server combines the cache hints from all the queried fields and uses them to power several caching features. These features include HTTP caching headers for CDNs and browsers and a GraphQL full response cache. Cache hints can be set in the schema or in a resolver. Schema usage:

type Post @cacheControl(maxAge: 240) {
  id: Int!
  title: String
  author: Author
  votes: Int @cacheControl(maxAge: 30)
  comments: [Comment]
  readByCurrentUser: Boolean! @cacheControl(scope: PRIVATE)
}

type Comment @cacheControl(maxAge: 1000) {
  post: Post!
}

type Query {
  latestPost: Post @cacheControl(maxAge: 10)
}

Resolver usage:

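const resolvers = {
  Query: {
    // The context argument must not reuse the name "_" (duplicate parameter).
    post: (_, { id }, _context, info) => {
      info.cacheControl.setCacheHint({ maxAge: 60, scope: 'PRIVATE' });
      return find(posts, { id }); // find (e.g. lodash) and posts defined elsewhere
    },
  },
};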

The directive can specify maxAge and a scope of PUBLIC or PRIVATE, which then determine the values in the Cache-Control HTTP response header and, in turn, what gets cached in CDNs. Additionally, a default maxAge can be set globally during the ApolloServer initialization - this is the riskiest configuration.

const server = new ApolloServer({
  // ...
  cacheControl: {
    defaultMaxAge: 5,
  },
});

It is important to note that the penetration tester must examine (white box if the code is provided, or black box through the Cache-Control headers) whether any PII or sensitive data is set up to be cached, as this may lead to leakage of that information. Also, even if PII is set to be cached on the CDN, the penetration tester must be aware that responses to POST requests are hardly ever cached. For that reason, successful exploitation usually requires transforming the queries from POST to GET. This must be done manually. For example, below is a normal query sent through POST:

apollo_caching

POST query changed to GET:

apollo_caching

URL decoded:

apollo_caching

In this case, the response shows that the Cache-Control header was set to not cache these values. That has to be examined either white box (review the @cacheControl directives) or black box (inspect the Cache-Control header for each query). If the server-side caching configuration is broken and some PII is returned in a cacheable response, the exploitation can be made even easier by utilizing Automatic Persisted Queries, which are described later.
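The manual POST-to-GET conversion described above can be sketched as follows. The endpoint, operation name, and query are placeholders rather than values from a real target, and postBodyToGetUrl is a hypothetical helper name:

```javascript
// Turn a GraphQL POST body into an equivalent GET URL.
function postBodyToGetUrl(endpoint, body) {
  const params = new URLSearchParams();
  params.set('query', body.query);
  if (body.operationName) params.set('operationName', body.operationName);
  // Variables travel as a JSON-encoded string in the query string.
  if (body.variables) params.set('variables', JSON.stringify(body.variables));
  return `${endpoint}?${params.toString()}`;
}

// Placeholder endpoint and query for illustration only.
const getUrl = postBodyToGetUrl('http://localhost:4000/', {
  operationName: 'GetMyTrips',
  query: 'query GetMyTrips { me { trips { id } } }',
  variables: {},
});
console.log(getUrl);
```

URLSearchParams takes care of the URL encoding, so the printed URL corresponds to the encoded form of the request, and decoding it gives back the original query text.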

Let’s analyze the broken cache behavior with an example. For the sake of simplicity, let’s say that we have set the cacheControl option during the ApolloServer initialization:

const server = new ApolloServer({
  // ...
  cacheControl: {defaultMaxAge: 30},
});

This config means that all returned data would have the Cache-Control header set to public with a max-age of 30 seconds. Then, to avoid caching private user data on our CDN, we have set more granular control in our schema.

...
type User {
  id: ID! @cacheControl(scope: PRIVATE, maxAge: 10)
  email: String! @cacheControl(scope: PRIVATE, maxAge: 20)
  trips: [Launch]!
  token: String @cacheControl(scope: PRIVATE, maxAge: 0)
}
...

As you can see, the Cache-Control behavior was modified for some of the fields, but unfortunately the developers forgot to change it for the user’s trips. Let’s see what that looks like in practice. The following screenshot presents the response to the LaunchDetails query. The LaunchDetails query is not user-specific, and its results are the same for all users. The expected behavior is therefore to cache the response, and indeed the response has Cache-Control set to public with max-age=30. At this point, we can start the analysis of our potentially vulnerable configuration.

apollo_caching

Let’s examine the GetMyTrips query. The query returns data from the user (me) object. As specified in the schema, Apollo Server evaluates all the @cacheControl directives for the requested data and returns the Cache-Control header that meets the strictest of them. In this case, it returned private, max-age=10, as we set those values on the id field.

apollo_caching

By removing the id field from the query, we can confirm that the second strictest cache control was assigned to the email field of the user object (private, max-age=20). From the returned Cache-Control values we can assume that the results of both queries would not be cached on the CDN (if the CDN respects the origin’s Cache-Control headers).

apollo_caching

Unfortunately, we can expose a potentially cacheable response by removing both id and email from the query and requesting only the user’s trips. The request returned cache-control: max-age=30, public, which is the default set globally during the ApolloServer initialization (there was no @cacheControl directive on the field in the schema). This response can be cached on the CDN, which may result in leakage of the user’s private data.

apollo_caching
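The behavior observed in the three requests above can be reproduced with a small model. This is a deliberately simplified sketch, not Apollo Server's real algorithm (which also considers parent types and root fields), and the function name overallCacheControl is hypothetical. It assumes that the overall policy takes the lowest maxAge among the requested fields, becomes private if any requested field is private, and that fields without a hint fall back to the global default:

```javascript
// Simplified model: lowest maxAge wins, any PRIVATE field makes the whole
// response private, and fields without a hint use defaultMaxAge and PUBLIC.
function overallCacheControl(requestedFields, hints, defaultMaxAge) {
  let maxAge = Infinity;
  let scope = 'PUBLIC';
  for (const field of requestedFields) {
    const hint = hints[field] || {};
    maxAge = Math.min(maxAge, hint.maxAge !== undefined ? hint.maxAge : defaultMaxAge);
    if (hint.scope === 'PRIVATE') scope = 'PRIVATE';
  }
  return `max-age=${maxAge}, ${scope.toLowerCase()}`;
}

// Hints mirroring the User type from the schema above (trips has none).
const userHints = {
  id: { scope: 'PRIVATE', maxAge: 10 },
  email: { scope: 'PRIVATE', maxAge: 20 },
  token: { scope: 'PRIVATE', maxAge: 0 },
};

console.log(overallCacheControl(['id', 'email', 'trips'], userHints, 30)); // max-age=10, private
console.log(overallCacheControl(['email', 'trips'], userHints, 30));       // max-age=20, private
console.log(overallCacheControl(['trips'], userHints, 30));                // max-age=30, public
```

The last line is the problematic case: a single forgotten field is enough for an attacker to craft a query whose response is publicly cacheable.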

The solution would be to set maxAge 0 at the object level for all data that is not supposed to be cached.

...
type User @cacheControl(scope: PRIVATE, maxAge: 0) {
  ...
}
...

Or, as the best solution, set no-cache, no-store, maxAge=0 by default for all queries and explicitly specify caching only for the queries that you want to cache (a caching whitelist approach).

Automatic Persisted Queries (APQ):

To improve network performance for large query strings, Apollo Server supports Automatic Persisted Queries (APQ). A persisted query is a query string that’s cached on the server-side, along with its unique identifier (always its SHA-256 hash). Clients can send this identifier instead of the corresponding query string, thus reducing request sizes dramatically (response sizes are unaffected).

To persist a query string, Apollo Server must first receive it from a requesting client. Consequently, each unique query string must be sent to Apollo Server at least once. After any client sends a query string to persist, every client that executes that query can then benefit from APQ. https://www.apollographql.com/docs/apollo-server/performance/apq/

Exploitation (attacking the victim) with APQ can be even easier than with normal GET queries, as APQ requests are specifically designed to be cached on CDNs and the Apollo documentation explains how to do this. If the caching is misconfigured, exploitation should be attempted using APQ. This can be done as follows:

Map your query from POST to GET.
Once that is done, generate the SHA-256 hash of your query.

Send the following request:

curl -g 'http://<your_target>:4000/?query=<your_query>&extensions={"persistedQuery":{"version":1,"sha256Hash":"<sha256hash>"}}'

The previous request registered your query with APQ.

To run the same query, remove the query value and send the same hash in the extensions parameter:

curl -g 'http://<your_target>:4000/?extensions={"persistedQuery":{"version":1,"sha256Hash":"<sha256hash>"}}'
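The hash generation and the two requests above can be scripted. The sketch below uses Node's built-in crypto module. The endpoint and the query are placeholders, and buildApqUrls is a hypothetical helper name:

```javascript
const { createHash } = require('crypto');

// Hex-encoded SHA-256 of the exact query string, as used by APQ.
function sha256Hex(query) {
  return createHash('sha256').update(query).digest('hex');
}

// Build the registration URL (query + hash) and the lookup URL (hash only).
function buildApqUrls(endpoint, query) {
  const extensions = JSON.stringify({
    persistedQuery: { version: 1, sha256Hash: sha256Hex(query) },
  });
  const register = new URLSearchParams({ query, extensions });
  const lookup = new URLSearchParams({ extensions });
  return {
    register: `${endpoint}?${register}`,
    lookup: `${endpoint}?${lookup}`,
  };
}

// Placeholder endpoint and query for illustration only.
const apqUrls = buildApqUrls('http://localhost:4000/', '{ me { trips { id } } }');
console.log(apqUrls.register);
console.log(apqUrls.lookup);
```

The hash must be computed over the query string exactly as it is sent, since any whitespace difference produces a different hash and the server would not find the persisted query.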

If PII was cached on the CDN (because of wrong cache settings), you may be able to leak other users’ data. At this point, you may ask what the reason is for testing with APQ if the Cache-Control HTTP response header is properly set by the server. The answer is that CDNs do not always honor the origin’s Cache-Control headers, so APQ queries may be cached on the CDN even though the Cache-Control header is set to no-cache, no-store.

Caching full response server-side with in-memory LRU or external cache like Redis, Memcached:

If full-response server-side caching is misconfigured, you will notice the information leakage just by browsing the application and observing other users’ data in the responses.

The vulnerable configuration is the default one, which caches all responses:

import { ApolloServer } from 'apollo-server';
import responseCachePlugin from 'apollo-server-plugin-response-cache';
const server = new ApolloServer({
  // ...
  plugins: [responseCachePlugin()],
});

To resolve the issue, a separate caching context must be applied to each user session.

import { ApolloServer } from 'apollo-server';
import responseCachePlugin from 'apollo-server-plugin-response-cache';
const server = new ApolloServer({
  // ...
  plugins: [responseCachePlugin({
    sessionId: (requestContext) => (requestContext.request.http.headers.get('sessionid') || null),
  })],
});
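Why the session identifier matters can be shown with a toy cache. This is not the plugin's implementation, only a minimal sketch of the keying idea, and all names in it (makeResponseCache, resolveMe, the request objects) are hypothetical:

```javascript
// Toy full-response cache. When getSessionId returns null for everyone,
// the key depends only on the query, so all users share one entry.
function makeResponseCache(getSessionId = () => null) {
  const store = new Map();
  return function execute(query, request, resolve) {
    const key = JSON.stringify([getSessionId(request), query]);
    if (!store.has(key)) store.set(key, resolve(request));
    return store.get(key);
  };
}

// Stand-in resolver that returns user-specific data.
const resolveMe = (request) => `transactions of ${request.user}`;
const alice = { user: 'alice', sessionid: 's1' };
const bob = { user: 'bob', sessionid: 's2' };

// Shared key: bob is served the entry that alice's request created.
const shared = makeResponseCache();
shared('{ me { transactions } }', alice, resolveMe);
console.log(shared('{ me { transactions } }', bob, resolveMe)); // transactions of alice

// Per-session key: each session gets its own entry.
const perSession = makeResponseCache((request) => request.sessionid);
perSession('{ me { transactions } }', alice, resolveMe);
console.log(perSession('{ me { transactions } }', bob, resolveMe)); // transactions of bob
```

The first output is exactly the leakage pattern a tester would observe: a non-parameterized query returning another user's data.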

Recommendations for developers to fix caching in Apollo-server:

Ensure that the caching of data on the CDN and in the web browser excludes queries with PII. Use the @cacheControl directive at the schema level to granularly control which fields should be cached.
If they are not being utilized, disable Automatic Persisted Queries. Otherwise, ensure that there are no caching issues with the APQ queries: caching of APQ responses should be allowed only for non-PII/public content.
Only non-PII and preferably public data should be cached server-side using the in-memory LRU or external cache providers.

Checklist for pentesters for testing caching in Apollo-Server:

Examine the cache-related headers, in particular the Cache-Control HTTP header, of all query responses. Usually, the header is set to no-cache, no-store, so the query results won’t be cached in the web browser or CDN. But the headers may be wrongly set so that PII is cached on CDNs. When testing black box, remember that the @cacheControl directive can be as granular as a single field, which is why we need to limit our queries to single fields and enumerate through all fields to check the caching.
Apollo Server allows queries to be sent not only using POST but also using the GET HTTP method. If the caching headers are set to cache the PII content but it does not land in the CDN’s cache, try to convert the queries from POST to GET - GET requests are usually cached by CDNs.
If you have access to the code, examine whether cache control was configured either globally or at the schema level, to make the caching review smoother.
Finally, try to use the APQ (Automatic Persisted Queries) mechanism described in this chapter - APQ responses may be cached on the CDN (as that was the reason to create them), which may lead to data leakage. Also, note that CDNs sometimes do not respect the Cache-Control header, which is why it is important to check. Sample attack scenario:
- When APQ is used, the CDN caches the query results containing PII.
- The victim is authenticated to the application, and the attacker sends them a link that performs the query with APQ so that the response is cached on the CDN.
- The attacker then uses the known sha256Hash to perform the APQ query, retrieving the cached data from the CDN instead of directly from the server, which bypasses the access control.
Additionally, check whether other users’ data is served to you from the in-memory LRU or external cache (this is pretty rare, and you will quickly notice the information leakage). You would get other users’ data returned for non-parameterized queries, e.g. query={me{transactions}} would return the transactions of other users.
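The field-by-field enumeration from the first checklist item can be prepared with two small helpers. Both are hypothetical sketches. The header check is a rough heuristic (for instance, it treats no-cache as not cacheable and ignores CDNs that override origin headers), so a positive result should still be confirmed against the real CDN:

```javascript
// Build one query per field so each field's cache policy can be observed
// in isolation. Fields that return objects need a sub-selection.
function singleFieldQueries(root, fields) {
  return fields.map((field) => `{ ${root} { ${field} } }`);
}

// Heuristic: does this Cache-Control value allow storage in a shared cache?
function isSharedCacheable(cacheControl) {
  if (!cacheControl) return false;
  const directives = cacheControl.toLowerCase().split(',').map((d) => d.trim());
  const blocking = ['private', 'no-store', 'no-cache'];
  if (directives.some((d) => blocking.includes(d))) return false;
  const age = directives.find((d) => d.startsWith('max-age=') || d.startsWith('s-maxage='));
  return age !== undefined && Number(age.split('=')[1]) > 0;
}

console.log(singleFieldQueries('me', ['id', 'email', 'trips { id }']));
console.log(isSharedCacheable('max-age=30, public'));  // true
console.log(isSharedCacheable('max-age=10, private')); // false
```

Each generated query would be sent to the target (ideally as a GET request), and the Cache-Control value of every response passed to isSharedCacheable to spot fields that fall back to a public default.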

Try it yourself

You can try playing with Apollo Caching issues yourself using the Damn Vulnerable Apollo Caching app, link: https://github.com/niebardzo/damn-vulnerable-apollo-caching

References:

The sample vulnerable examples were created by modifying the full-stack Apollo tutorial: https://www.apollographql.com/docs/tutorial/introduction/
Modified Full-Stack Apollo can be found here: https://github.com/niebardzo/damn-vulnerable-apollo-caching