Everything you need to know about caching in Java & Spring Boot (plus Redis)

Caching is a technique of storing frequently used data in a fast and accessible memory, such as RAM, to reduce the latency and cost of retrieving it from a slower and more expensive source, such as a database or a network.

Caching can improve the performance, scalability, and reliability of an application, but it also introduces some challenges and trade-offs, such as cache coherence, consistency, invalidation, and eviction.

Types of caching

There are different types of caching, depending on the scope, location, and implementation of the cache. Some common types are:

  • Local caching: The cache is stored within the same JVM as the application, and it is only accessible by that application instance. This type of caching is simple and fast, but its capacity is limited by the JVM heap and it is not shared between application instances. An example of a local cache is a ConcurrentHashMap that stores key-value pairs of data (a plain HashMap also works, but only when a single thread accesses it).
  • Distributed caching: The cache is stored in a separate process or a cluster of nodes, and it is accessible by multiple applications or instances. This type of caching is scalable and resilient, but it requires network communication and synchronization, which can introduce latency and complexity. Examples of distributed caches are Hazelcast, Redis, and Ehcache in its clustered configuration, which provide distributed data structures and APIs for caching.
  • Client caching: The cache is stored on the client side, such as a web browser or a mobile device, and it is only accessible by that client. This type of caching can reduce the load on the server and improve the user experience, but it depends on the client’s capabilities and preferences, and it may not reflect the latest data on the server. An example of a client cache is the HTTP cache, which stores web resources locally and uses headers to control their freshness and validity.
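As a rough illustration of the local caching described above, here is a minimal sketch built on ConcurrentHashMap. The class name LocalCache and its loadCount helper are made up for this example; it only shows the "load once, then reuse" idea and has no size limit or expiration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

/** Hypothetical in-JVM cache: loads a value on the first request for a key, then reuses it. */
final class LocalCache<K, V> {
    private final Map<K, V> store = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    LocalCache(Function<K, V> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, invoking the loader only on a miss. */
    V get(K key) {
        // computeIfAbsent is atomic per key, so concurrent callers trigger at most one load
        return store.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    /** Number of times the loader actually ran (useful to observe cache hits). */
    int loadCount() {
        return loads.get();
    }
}
```

Calling get twice with the same key runs the loader once; a second application instance would have its own, separate copy of this map, which is the main limitation of local caching.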

Practical example

To implement caching in Java, there are several frameworks and libraries that provide various features and options, such as Spring Cache, Guava Cache, Caffeine, and JCache, which is a standard API for caching in Java. These frameworks and libraries can help you to create and manage caches, as well as to configure their properties, such as size, expiration, eviction, and concurrency.
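To make the "size" and "eviction" properties mentioned above more concrete, here is a small sketch of a size-bounded cache that evicts the least recently used entry. It relies only on the JDK's LinkedHashMap; the class name LruCache is an assumption for this example, and libraries such as Caffeine provide far more complete (and thread-safe) implementations.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical size-bounded cache that evicts the least recently used entry. Not thread-safe. */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private static final long serialVersionUID = 1L;
    private final int maxEntries;

    LruCache(int maxEntries) {
        // accessOrder = true: entries are ordered from least to most recently accessed
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after every insertion; returning true drops the eldest entry
        return size() > maxEntries;
    }
}
```

With a capacity of two, putting "a" and "b", reading "a", and then putting "c" evicts "b", because "a" was used more recently.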

Here is a simple example of using Spring Cache to cache the result of a method that queries a database:

// Enable caching in the application
@EnableCaching
@SpringBootApplication
public class Application {
  // ...
}

// Define a service that uses caching
@Service
public class BookService {

  // Inject a repository that accesses the database
  @Autowired
  private BookRepository bookRepository;

  // Annotate the method with @Cacheable to cache its result
  // Specify the name of the cache and the key of the cached data
  @Cacheable(value = "books", key = "#isbn")
  public Book findBookByIsbn(String isbn) {
    // This method will only be executed if the cache does not contain the data
    // Otherwise, the cached data will be returned
    return bookRepository.findByIsbn(isbn);
  }
}

Caching in Spring Boot

Spring Boot provides a powerful abstraction for caching, which allows us to use various cache providers with minimal configuration and annotation-based caching.

To work with cache in Spring Boot, we need to follow these steps:

  • Add the spring-boot-starter-cache dependency to our pom.xml or build.gradle file.
  • Enable caching in our application by adding the @EnableCaching annotation to one of our configuration classes.
  • Choose a cache provider that suits our needs. Spring Boot supports several cache providers, such as JCache, EhCache, Caffeine, Redis, and more. We can also use the default ConcurrentMapCacheManager, which uses a simple ConcurrentHashMap as the cache store.
  • Annotate the methods that we want to cache with the @Cacheable annotation. We can specify the name of the cache, the key, the condition, and other attributes to customize the caching behavior.
  • Optionally, we can also use other caching annotations, such as @CachePut, @CacheEvict, and @Caching, to update or remove data from the cache.

Here is an example of a simple book repository that uses caching with Spring Boot:

package com.example.caching;

import org.springframework.cache.annotation.Cacheable;
import org.springframework.stereotype.Component;

@Component
public class SimpleBookRepository implements BookRepository {

    @Override
    @Cacheable("books")
    public Book getByIsbn(String isbn) {
        simulateSlowService();
        return new Book(isbn, "Some book");
    }

    // Don't do this at home ;)
    private void simulateSlowService() {
        try {
            long time = 3000L;
            Thread.sleep(time);
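        } catch (InterruptedException e) {
            // Restore the interrupt flag so callers can still observe the interruption
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }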
    }
}

In this example, we use the @Cacheable("books") annotation to cache the result of the getByIsbn method in a cache named “books”. Because no key is specified, the single method argument (the isbn) is used as the cache key. This means that the first time we call this method with a given isbn, it will invoke the slow service and return the book. The next time we call this method with the same isbn, it will return the cached book without invoking the slow service.
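By default, Spring applies @Cacheable through a proxy that wraps the bean and intercepts calls made from outside it. The following plain-JDK sketch imitates that mechanism with java.lang.reflect.Proxy. It is not Spring's implementation, and the names BookLookup, SlowBookLookup, and CachingProxies are invented for this illustration.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical service interface standing in for a repository. */
interface BookLookup {
    String titleByIsbn(String isbn);
}

/** Hypothetical slow implementation that counts how often it is really invoked. */
final class SlowBookLookup implements BookLookup {
    final AtomicInteger calls = new AtomicInteger();

    @Override
    public String titleByIsbn(String isbn) {
        calls.incrementAndGet();
        return "Title for " + isbn;
    }
}

final class CachingProxies {
    /** Wraps the target in a proxy that caches titleByIsbn results by their (non-null) argument. */
    static BookLookup cached(BookLookup target) {
        Map<Object, Object> cache = new ConcurrentHashMap<>();
        InvocationHandler handler = (proxy, method, args) -> {
            if (!method.getName().equals("titleByIsbn")) {
                return method.invoke(target, args); // pass everything else straight through
            }
            Object key = args[0];
            Object hit = cache.get(key);
            if (hit != null) {
                return hit; // cache hit: the target method is not executed
            }
            Object result = method.invoke(target, args);
            if (result != null) {
                cache.put(key, result);
            }
            return result;
        };
        return (BookLookup) Proxy.newProxyInstance(
                BookLookup.class.getClassLoader(), new Class<?>[] {BookLookup.class}, handler);
    }
}
```

One consequence of this design is that only calls going through the proxy are cached. When a method of the same class calls a @Cacheable method directly on this, the call bypasses the proxy and the cache is not consulted.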

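We can also use the @CachePut annotation to update the cache with the result of a method invocation. For example, if we have a save method that persists a book to a database, we can use @CachePut on the “books” cache to update the cache with the saved book. The key must match the one used when reading: since the save method receives a Book rather than an isbn, we would write something like @CachePut(value = "books", key = "#book.isbn"), assuming the parameter is named book and Book exposes an isbn property. Similarly, we can use the @CacheEvict annotation to remove an entry from the cache. For example, if we have a delete method that takes an isbn and removes the book from the database, we can use @CacheEvict("books") to evict the book from the cache.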

Caching scenarios

Caching costly operations

One of the common scenarios where caching is useful is when we have a method that performs a costly or time-consuming operation, such as querying a database, calling a web service, or performing a complex calculation. By caching the result of such a method, we can avoid repeating the operation for subsequent calls with the same input, and thus save time and resources.

For example, suppose we have a service that calculates the factorial of a given number:

import java.math.BigInteger;

import org.springframework.stereotype.Service;

@Service
public class MathService {

  public BigInteger factorial(int number) {
    // This method may take a long time for large numbers
    BigInteger result = BigInteger.ONE;
    for (int i = 2; i <= number; i++) {
      result = result.multiply(BigInteger.valueOf(i));
    }
    return result;
  }
}

We can use the @Cacheable annotation to cache the result of this method in a cache named “factorials”. We can also specify the key of the cached data, which is the number parameter in this case:

@Service
public class MathService {

  @Cacheable(value = "factorials", key = "#number")
  public BigInteger factorial(int number) {
    // This method will only be executed if the cache does not contain the data
    // Otherwise, the cached data will be returned
    BigInteger result = BigInteger.ONE;
    for (int i = 2; i <= number; i++) {
      result = result.multiply(BigInteger.valueOf(i));
    }
    return result;
  }
}

Now, the first time we call this method with a given number, it will invoke the factorial calculation and return the result. The next time we call this method with the same number, it will return the cached result without invoking the calculation.

Updating the cache with the result of a method

Another scenario where caching is useful is when we want to update the cache with the result of a method invocation, without interfering with the method execution. For example, suppose we have a service that updates the profile of a user:

@Service
public class UserService {

  @Autowired
  private UserRepository userRepository;

  public User updateProfile(User user) {
    // This method updates the user in the database and returns the updated user
    return userRepository.save(user);
  }
}

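We can use the @CachePut annotation to update the cache with the updated user. Unlike @Cacheable, @CachePut never skips the method: the method always runs, and its return value is then stored in the cache. We also specify the name of the cache and the key of the cached data, which is the user id in this case, so that a lookup by id that reads the same “users” cache with the same key gets the latest data: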

@Service
public class UserService {

  @Autowired
  private UserRepository userRepository;

  @CachePut(value = "users", key = "#user.id")
  public User updateProfile(User user) {
    // This method updates the user in the database and returns the updated user
    // The cache is also updated with the updated user
    return userRepository.save(user);
  }
}

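Now, after each call to updateProfile, the cache holds the version of the user that was just saved, and a lookup by id that reads the same “users” cache can avoid querying the database. Changes made to the user through other code paths or by other applications are not reflected in the cache automatically.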

Removing entries from cache

A third scenario where caching is useful is when we want to remove an entry from the cache, either manually or automatically. For example, suppose we have a service that deletes a user by id:

@Service
public class UserService {

  @Autowired
  private UserRepository userRepository;

  public void deleteUserById(Long id) {
    // This method deletes the user from the database
    userRepository.deleteById(id);
  }
}

We can use the @CacheEvict annotation to evict the user from the cache, so that the cache does not contain stale data. We can also specify the name of the cache and the key of the cached data, which is the id parameter in this case:

@Service
public class UserService {

  @Autowired
  private UserRepository userRepository;

  @CacheEvict(value = "users", key = "#id")
  public void deleteUserById(Long id) {
    // This method deletes the user from the database
    // The matching entry is also removed from the cache
    userRepository.deleteById(id);
  }
}

Now, the cache will not contain the deleted user, and we can avoid returning invalid data from the cache.
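The three scenarios above can be summarized in one plain-Java sketch that imitates what the annotations do, without Spring. The class UserDirectory and its two maps (one standing in for the repository, one for the “users” cache) are assumptions made for this illustration only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical imitation of @Cacheable, @CachePut and @CacheEvict on a "users" cache. */
final class UserDirectory {
    private final Map<Long, String> database = new HashMap<>();             // stand-in for the repository
    private final Map<Long, String> usersCache = new ConcurrentHashMap<>(); // stand-in for the cache
    int databaseReads = 0;

    /** Like @Cacheable: consult the cache first, read the database only on a miss. */
    Optional<String> findById(long id) {
        String cached = usersCache.get(id);
        if (cached != null) {
            return Optional.of(cached);
        }
        databaseReads++;
        String loaded = database.get(id);
        if (loaded != null) {
            usersCache.put(id, loaded);
        }
        return Optional.ofNullable(loaded);
    }

    /** Like @CachePut: always perform the write, then store the result under the same key. */
    String updateProfile(long id, String name) {
        database.put(id, name);
        usersCache.put(id, name);
        return name;
    }

    /** Like @CacheEvict: perform the delete, then drop the cached entry. */
    void deleteById(long id) {
        database.remove(id);
        usersCache.remove(id);
    }
}
```

The important detail is that all three methods use the same cache and the same key (the id). If the read path used a different key than the write path, updates and evictions would silently miss the cached entry.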

How does it all differ from Redis?

Caching in Redis is different from caching in Java in several ways. Redis is an in-memory data structure store that can act as a distributed cache, and it supports various data types, such as strings, lists, sets, hashes, and streams. Redis also provides features such as replication, persistence, transactions, pub/sub, and Lua scripting. Some of the differences are:

  • Redis is a separate process from the Java application, and it requires a client library, such as Jedis, Lettuce, or Redisson, to communicate with it. Java caching frameworks and libraries usually run inside the application process and do not need a separate server.
  • Redis can store and cache data in various formats and structures, while Java caching frameworks and libraries typically use key-value pairs or objects. Redis also supports operations and commands on different data types, such as sorting, ranking, aggregating, and streaming, which can be useful for complex caching scenarios.
  • Redis can scale horizontally by adding more nodes with Redis Cluster, and it can handle high availability and failover by using replication and Redis Sentinel. Java caching frameworks and libraries may have different ways of scaling and handling failures, depending on their implementation and configuration.
  • Redis can persist data to disk, either as point-in-time snapshots (RDB) or as an append-only log of writes (AOF), which can limit data loss in case of a crash or a restart. Java caching frameworks and libraries may or may not support persistence, and they may use different strategies, such as write-through, write-behind, or write-around, to synchronize the cache and the source of truth.
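The write strategies named in the last bullet are easier to compare in code. Below is a simplified, single-threaded sketch of write-through and write-around against a cache-aside read path; the class WriteStrategies and its two maps are hypothetical, and invalidating the entry on write-around is one common choice rather than the only one. Write-behind, where the cache is updated first and the source of truth is updated asynchronously later, is left out to keep the sketch short.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical comparison of two write strategies. Single-threaded sketch. */
final class WriteStrategies {
    final Map<String, String> source = new HashMap<>(); // stand-in for the database
    final Map<String, String> cache = new HashMap<>();  // stand-in for the cache

    /** Write-through: update the source of truth and the cache in the same operation. */
    void writeThrough(String key, String value) {
        source.put(key, value);
        cache.put(key, value);
    }

    /** Write-around: update only the source and invalidate any cached copy. */
    void writeAround(String key, String value) {
        source.put(key, value);
        cache.remove(key);
    }

    /** Cache-aside read: fall back to the source on a miss and remember the result. */
    String read(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = source.get(key);
            if (value != null) {
                cache.put(key, value);
            }
        }
        return value;
    }
}
```

Write-through keeps reads fast right after a write, at the cost of caching values that may never be read. Write-around avoids that cost, but the first read after a write is a cache miss.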

Using Redis with Spring Boot

Spring Boot uses the spring-boot-starter-cache dependency to enable caching in the application. This dependency provides an abstraction layer for various cache providers, such as JCache, EhCache, Hazelcast, Infinispan, Couchbase, Redis, Caffeine, and Simple (the in-memory fallback).

By default, Spring Boot will auto-configure a ConcurrentMapCacheManager, which uses a simple ConcurrentHashMap as the cache store.

If you want to use Redis as the cache provider, you need to add the spring-boot-starter-data-redis dependency to your project, which brings in all the required dependencies for Redis integration.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-redis</artifactId>
</dependency>

You also need to have a Redis server running on your machine or on a remote host. Spring Boot will automatically configure a RedisCacheManager with default cache configuration, but you can also customize it by registering configuration beans. For example, you can set the time-to-live values, the serialization strategy, and the cache names for different caches.

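Then, you need to configure the Redis connection properties in your application.properties file. The property names below are the ones used by Spring Boot 2.x; in Spring Boot 3.x they are named spring.data.redis.host and spring.data.redis.port: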

spring.redis.host=localhost
spring.redis.port=6379

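Now, you can use the RedisTemplate or the StringRedisTemplate to perform operations on the Redis server. Spring Boot auto-configures a RedisTemplate<Object, Object> and a StringRedisTemplate; the example below assumes that a RedisTemplate<String, Object> bean has been defined in the application. For example, you can set and get values using the opsForValue() method: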

@Service
public class RedisService {

    @Autowired
    private RedisTemplate<String, Object> redisTemplate;

    public void setValue(String key, Object value) {
        redisTemplate.opsForValue().set(key, value);
    }

    public Object getValue(String key) {
        return redisTemplate.opsForValue().get(key);
    }
}

You can also use the @Cacheable, @CachePut, and @CacheEvict annotations to cache the results of your methods in Redis. For example, you can cache the result of a method that queries a database by isbn:

@Service
public class BookService {

    @Autowired
    private BookRepository bookRepository;

    @Cacheable(value = "books", key = "#isbn")
    public Book findBookByIsbn(String isbn) {
        return bookRepository.findByIsbn(isbn);
    }
}

This way, the first time you call this method with a given isbn, it will query the database and return the book. The next time you call this method with the same isbn, it will return the cached book from Redis without querying the database.
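One practical difference from a local cache is that the book has to be turned into bytes before it is sent to Redis. With the default RedisCacheConfiguration, values are written using JDK serialization, so the cached class must implement Serializable. The plain-JDK sketch below shows that round trip in isolation; CachedBook and JdkSerializationRoundTrip are hypothetical names, and no Redis server is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical cached value; it must be Serializable to survive JDK serialization. */
final class CachedBook implements Serializable {
    private static final long serialVersionUID = 1L;
    final String isbn;
    final String title;

    CachedBook(String isbn, String title) {
        this.isbn = isbn;
        this.title = title;
    }
}

final class JdkSerializationRoundTrip {
    /** Turns the value into bytes, as happens before it is written to a remote cache. */
    static byte[] toBytes(Serializable value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Rebuilds a value from bytes, as happens on a cache hit. */
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A cache hit therefore returns a new copy of the object rather than the instance that was stored, and a class that does not implement Serializable fails at the moment it is written to the cache.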

When do I want to use Redis, and when is Spring's default cache enough?

As previously mentioned, Spring’s default cache is the ConcurrentMapCacheManager, which uses a simple ConcurrentHashMap as the cache store. This cache is easy to use and fast, as it does not require any external dependencies or network communication. However, it has some limitations, such as:

  • It is local to the JVM, which means it cannot be shared between multiple applications or instances.
  • It has limited capacity, as it depends on the available memory of the JVM.
  • It does not support advanced features, such as expiration, eviction, persistence, or transactions.

Redis is an in-memory data structure store that can act as a distributed cache. It supports various data types, such as strings, lists, sets, hashes, and streams. It also provides features such as replication, persistence, transactions, pub/sub, and Lua scripting. Some of the advantages of using Redis as a cache provider are:

  • It can scale horizontally by adding more nodes with Redis Cluster, and it can handle high availability and failover by using replication and Redis Sentinel.
  • It can store and cache data in various formats and structures, and it supports operations and commands on different data types, such as sorting, ranking, aggregating, and streaming.
  • It can persist data to disk, either as snapshots or as an append-only log, which can limit data loss in case of a crash or a restart.

However, using Redis as a cache provider also has some drawbacks, such as:

  • It requires a separate process from the Java application, and it requires a client library, such as Jedis, Lettuce, or Redisson, to communicate with it.
  • It introduces network latency and complexity, as every cache access involves a network round trip and values have to be serialized and deserialized.
  • It may consume more memory than a simple ConcurrentHashMap, as it stores serialized values together with additional metadata for each entry.

Therefore, the choice of cache provider depends on the trade-offs between simplicity, speed, capacity, scalability, and reliability. Some general guidelines are:

  • Use Spring’s default cache if you have a simple and single-instance application that does not need to cache a large amount of data or complex data structures, and does not need to share the cache between instances.
  • Use Redis as a cache provider if you have a complex and distributed application that needs to cache a large amount of data or various data types, and requires scalability and resilience.
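Expiration is one of the features that the ConcurrentHashMap-backed default cache does not offer. To show what it involves, here is a minimal sketch of a local cache with a fixed time-to-live. The class ExpiringCache and its injectable clock are assumptions made for this example; in practice a library such as Caffeine, or Redis with a TTL, would be the more usual choice.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Hypothetical local cache whose entries expire a fixed time after they were written. */
final class ExpiringCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long expiresAtMillis;

        Entry(T value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<K, Entry<V>> store = new ConcurrentHashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock; // injectable so that tests can control time

    ExpiringCache(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    void put(K key, V value) {
        store.put(key, new Entry<>(value, clock.getAsLong() + ttlMillis));
    }

    /** Returns the value, or null if it is absent or expired (expired entries are removed lazily). */
    V get(K key) {
        Entry<V> entry = store.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMillis) {
            store.remove(key, entry); // only remove if nobody replaced the entry meanwhile
            return null;
        }
        return entry.value;
    }
}
```

In production code the clock would typically be System::currentTimeMillis. Note that this sketch only removes expired entries when they are read, so entries that are never read again stay in memory, which is one reason dedicated cache libraries exist.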

Conclusion

In this article, we have discussed some of the types, frameworks, and libraries of caching in Java, and how they compare to caching in Redis.

We have learned that caching in Java and caching in Redis have different advantages and disadvantages, depending on the use case and the requirements. Therefore, it is important to choose the right cache provider and configuration for our application, and to test and tune the caching behavior to achieve the optimal results.

Caching is a powerful and useful technique, but it also requires careful design and tuning to avoid potential pitfalls and trade-offs. We hope this article has given you some basic understanding of caching in Java and how it differs from caching in Redis, and that you will explore more about this topic in the future.
