This article is about building a distributed cache like Redis and learning the concepts around it.

Things we will cover

  • RESP protocol communication
  • Client interface to interact with Redis
  • Handling load
  • Persistence in both modes, just like Redis: append-only file and RDB file
  • Crash recovery
  • Replication (in next part)
  • Master / Replica architecture (in next part)

Let's dive in.

First and foremost, the most essential part is communication between the client and server. A cache does not work well with an HTTP-based protocol because we would need to establish a connection with every request, which takes time at scale: the three-way handshake, then the payload exchange, and then freeing the resource. That is too slow.

We will proceed with WebSockets instead and reuse the already established connection channel, which conserves resources at scale.

Redis works using the RESP protocol, so I created a small implementation of it for parsing and transferring bytes over the channel. See the Redis docs for RESP and my implementation in the project.
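To make the framing concrete, here is a minimal sketch of RESP encoding and decoding for commands, which are sent as arrays of bulk strings. The function names are illustrative, not taken from the project:

```python
def encode_command(*parts: str) -> bytes:
    """Serialize a command as a RESP array of bulk strings,
    e.g. SET key value -> *3\r\n$3\r\nSET\r\n$3\r\nkey\r\n$5\r\nvalue\r\n"""
    out = [f"*{len(parts)}\r\n".encode()]
    for p in parts:
        data = p.encode()
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)

def decode_command(payload: bytes) -> list:
    """Parse a RESP array of bulk strings back into its parts."""
    lines = payload.split(b"\r\n")
    count = int(lines[0][1:])          # "*3" -> 3 elements follow
    parts = []
    i = 1
    for _ in range(count):
        length = int(lines[i][1:])     # "$3" -> next line holds 3 bytes
        parts.append(lines[i + 1][:length].decode())
        i += 2
    return parts
```

A real parser would read incrementally from the socket and handle the other RESP types (simple strings, errors, integers), but the byte layout is exactly this.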

Whenever the client issues a command, it establishes a connection with the server. The command is serialized into bytes and transferred to the server, which deserializes it and hands it to the command parser module.

I have kept only the essential commands that serve the purpose of a cache: GET, SET with expiry, and a batch SET to simulate how load is handled when multiple commands hit the server.

Client Side

  1. Command Interface file
  2. Connection Manager file
  3. RESP file, Config file

The main point of focus here is the Connection Manager. What it does is reuse the socket connection if the next command arrives within the timeout window, which defaults to 5 seconds.

So when we are under bulk load, the Connection Manager reuses the already established connection to avoid the handshake overhead, saving both time and resources. Additionally, the call to the server is non-blocking: as soon as one command is sent, we can send another without the thread waiting for the IO to complete. When the server sends back the response, the thread resumes.
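The reuse-within-a-window idea can be sketched like this. The real manager wraps a socket; here the connection factory is injected so the sketch stays self-contained, and all names are illustrative:

```python
import time

class ConnectionManager:
    """Reuses an open connection if the next command arrives within
    the idle-timeout window (default 5 seconds); otherwise it pays
    the connection cost again."""

    def __init__(self, connect, idle_timeout=5.0):
        self._connect = connect        # factory that opens a new connection
        self._idle_timeout = idle_timeout
        self._conn = None
        self._last_used = 0.0

    def acquire(self):
        now = time.monotonic()
        # Drop the cached connection if it sat idle past the window.
        if self._conn is not None and now - self._last_used > self._idle_timeout:
            self._conn = None
        if self._conn is None:
            self._conn = self._connect()   # handshake cost paid only here
        self._last_used = now
        return self._conn
```

Under bulk load, every `acquire` after the first lands inside the window, so the handshake happens once for the whole burst.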

Server Side

The server receives the raw bytes, deserializes them from RESP format, and then passes the result to the command parser.

Command

It makes sense of each piece: it finds the command type, keys, values, and other parameters, and then executes the command. The code here misses many things in terms of separation of responsibility; it was written a year ago.

The architecture should be structured as follows: a stateless Command Parser that returns a Command object, the Command itself as just a data object, and a ProcessCommand module that executes it and performs the operation. Each command should then follow the Command Pattern, inheriting from a common base to implement its specific operation.
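A rough sketch of that structure, under the assumption of a simple GET/SET command set (class and function names are mine, not from the project):

```python
from dataclasses import dataclass, field

@dataclass
class Command:
    """Plain data object: no behavior, just the parsed pieces."""
    name: str
    key: str = ""
    value: str = ""
    args: list = field(default_factory=list)

def parse_command(parts):
    """Stateless parser: turns ["SET", "k", "v"] into a Command."""
    name = parts[0].upper()
    key = parts[1] if len(parts) > 1 else ""
    value = parts[2] if len(parts) > 2 else ""
    return Command(name, key, value, parts[3:])

class GetCommand:
    def execute(self, cmd, store):
        return store.get(cmd.key)

class SetCommand:
    def execute(self, cmd, store):
        store[cmd.key] = cmd.value
        return "OK"

# Command Pattern: one handler per command type, looked up by name.
HANDLERS = {"GET": GetCommand(), "SET": SetCommand()}

def process_command(cmd, store):
    return HANDLERS[cmd.name].execute(cmd, store)
```

The parser stays stateless, the Command carries only data, and adding a new command means adding one handler class to the registry.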

Command file

Storage

This class is my access point to all the in-memory key-value pairs. It stores and fetches key-based objects, and that’s it. One additional thing it does is make a copy of the entire memory map.
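A minimal sketch of such a class, assuming lazy expiry on read (the method names are illustrative):

```python
import copy
import time

class Storage:
    """In-memory key-value store: set/get with optional expiry, plus
    a deep copy of the whole map for snapshotting."""

    def __init__(self):
        self._data = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = time.monotonic() + ttl if ttl is not None else None
        self._data[key] = (value, expires_at)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() > expires_at:
            del self._data[key]  # lazy expiry: drop the key on read
            return None
        return value

    def snapshot(self):
        # Deep copy so the snapshot is isolated from ongoing writes.
        return copy.deepcopy(self._data)
```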

Storage file

Persistence

Persistence in Redis is handled in two main ways: an RDB file, which is a snapshot of the memory state, and an append-only log of all write commands.

RDB Snapshot

In this method, I have created a scheduler that runs every x minutes, takes a copy of memory, and saves it to a file. Files are rotated in FIFO order: when a new file is added, the oldest is removed, keeping the 5 most recent files in the directory, so the program can choose which version of the snapshot to restore.

The file is stored with a checksum to validate that the file data is not corrupted.

In Redis, this is performed by creating a fork of the current process. Since Windows does not have the option of forking, I created a deep copy instead, so the RDB operation does not conflict with the ongoing transactions of the main memory space.

RDB file

Append Only File

In this method, every write command or related command is kept in a log file. The log is written instantly, and in my project, asynchronously. RDB loses any data created between two snapshots; this method loses no data, since it records every modification and addition to memory.

To restore, just replay the file and it will restore the memory state as it was. There is also a compaction process that stops the file from growing infinitely, but I have not added that in this project.

AOF file

Which one to choose?

Choose both, depending on the scenario. No data loss is a requirement for most use cases, but transferring the AOF file to a secondary replica is not the correct approach, since it can grow too large. That is where snapshots come in; they also take less time to restore, since we don't have to replay everything from the start.

This project allows you to select one or both persistence methods and lets you choose the restoration method.

In the next part, I will be adding Master/Slave replication along with state sync among replicas.

Thanks for reading.

Feel free to connect with me at rahuldsoni2001@gmail.com.