Multi-Leader Replication
We discussed in Part 1 about Single Leader Replication. In this article, we’ll focus on the other type of replication algorithm, Multi-Leader Replication. But before that let us understand what caused the emergence of this new algorithm.
In single-leader, we have only one leader and all writes go through it, and henceforth it gets copied to other replicas. But what if the connection between the client and the leader gets broken(the leader is alive, only the connection to the client dies)? In this case, the application might get broken. And hence the concept of “Multi-leader” is born. It implies that now multiple nodes can take up writes and henceforth send the logs to other followers and leaders(which become followers to other leaders).
Let’s discuss a few cases where using a multi-leader replication would be reasonable, considering the complexity it adds to the entire configuration or architecture.
- Multi-datacenter operation: Let’s suppose you have a database and its replicas in a different data center. Since you have more than one data center, you need to have a leader in every data center. The interaction would look something like this :
In this case, this architecture can enhance the performance by reducing latency, which in the case of a single leader would have taken much time to replicate logs across multiple data centers. It also enhances the tolerance against data center outrages and network problems, which in the case of a single leader would have caused the entire distributed system to come down, thus increasing the locality of the system and making it more decoupled.
2. Clients with the offline operation: Let’s take an example to understand this. You set an event in your google calendar(while your phone is not connected to the internet) and later, let’s suppose you check the same synced calendar on your pc, and you find both the calendars have the same event. Now what goes behind this is multi-leader replication.
Here each of your devices has a local database and acts as a leader. So when you connect to the internet, there is an asynchronous multi-leader replication process between the replicas of the calendar on all your devices. Architecturally, it’s like each device is a data center and the network connection between them is unreliable. So now you can understand, why the calendar apps might not sometimes be in sync.
3. Collaborative writing: You must have at least once used google Docs or excel to edit when others would also be simultaneously editing the same. Here, also we use multi-leader replication. Let’s understand it in a reverse way, imagine the doc would have been using single-leader replication, so before a user edits, a lock must be put on the document(just like waiting for the leader to take all the writes), and once it’s done it would unlock the doc and the second user starts, thus making the whole point of collaboration vague. And so, multi-leader replication would help in by making every user leader and writing.
Now, in all the above use cases, there’s one big issue that overall dampens the entire advantage of this replication and that is “conflicts”. Conflicts occur when data is replicated or collaborated. Eg — Two writes concurrently modifying the same object on different leaders. So, when dealing with multi-leader replication, handling conflicts or “conflict resolution” is very important. Let’s discuss a few of them :
- Conflict avoidance: It is the simplest strategy to avoid conflicts. We just need to ensure that all writes for a particular record goes to the same leader, or more aptly to the same data center. It might look simple, but edge cases, when the entire data center is down or such may hamper the entire application.
- Convergent Conflict Resolution: In multi-leader replication, there is no defined ordering of writes, thus making it unclear what the final value should be. This inconsistency questions the durability of the data and every replication must ensure that the data is the same at all places. This method of handling conflicts can be done in various ways :
- LWW(Last Write Win) — Each write is given a unique ID and the write with the highest write is chosen as the winner.
- Give each replica a unique Id and let writes originated at higher-numbered replicas take precedence over the lower counterparts.
- Merge the values.
- Record the conflict in an explicit data structure that preserves all the information , and write application code that resolves conflict later by notifying the user.
3. Custom conflict resolution logic: Most multi-leader replication tools provide the option to custom define your conflict resolution in the application code. On write, as soon as a conflict is detected, the conflict handler is called, and it runs in the background to resolve it. On read, if a conflict is detected, all conflicting writes are stored and the next time data is read, these multiple versions of the data are returned to the application, which in turn prompts the user or automatically resolve the conflict, and write back to the database.
4. Automatic conflict resolution: There has been a lot of research on building automatic conflict resolutions which would be intelligent enough to resolve the conflicts caused by concurrent data modifications.
- Conflict-free replicated data types( CRDTs) are a family of data structures for sets, maps, ordered lists, counters, etc that can be concurrently edited by multiple users. It uses two-way merges.
- Mergeable data structure tracks history explicitly, just like it, and uses a three-way merge function
- Operational transformation is the algorithm behind collaborative editing applications such as Google docs. It’s a whole big topic which is very interesting to study.
Implementation of these algorithms is still young, and as more research goes into it, it would be soon integrated with our replicated data system and would make multi-leader data synchronization much simpler.
We have now seen so many conflict resolution techniques in multi-leader replication, but most of them are still not properly implemented, so it’s very important to be aware of these issues of “improper sequence of writes” or “casual ordering” while using it. Some databases support multi-leader, but mostly it is implemented with external tools, such as Tungsten Replicator for MySQL, BDR for PostgreSQL, and GoldenGate for Oracle.
You can also read more about the other type of replications :
Part 1: Single Leader Replication
Part 2: Replication logs and lags
Part 4: Leaderless Replication
Thanks for reading :)