Optimizing state-based CRDTs (part 1)

Updating state with deltas

As I mentioned in previous post, a state-based CRDT replication depends on the ability to serialize and disseminate the entire data structure over the network. While this approach has some advantages - i.e. very simple and straightforward replication protocol - it also comes with a cost.

Imagine that you have a set of 1000 elements. Now, when we'll try to add another item to it, the only way to synchronize this replica with others is to send an entire payload over the wire. It means serializing and sending 1001 elements, even though most of them have remain unchanged. This is even more painful given the fact that state-based CRDTs often carry additional metadata with them.

To face this problem, an alternative approach has been proposed: a delta-state CRDT. As the name suggests, instead of sending an entire state, we're going to accumulate the updates since the last synchronization as part of change set (delta). Once sync message was send, we simply reset delta and start from scratch. This way we'll send only the unseen parts.

This design doesn't stop us from performing full-state merge from time to time: delta-aware CRDTs still maintain the semantics of a state-based ones. Keep in mind that in terms of persistence, the delta itself doesn't have to be stored on disk.

If you want to play with the example data structures, I'm presenting here, feel free to take a look at this repository.

Delta-state growing-only counter

Let's start from defining some helper functions, that we'll use later on:

[<AutoOpen>]
module Helpers =

    /// Insert or update the value of the map.
    let upsert k v fn map =
        match Map.tryFind k map with
        | None -> Map.add k v map
        | Some v -> Map.add k (fn v) map

    /// Option-aware merge operation.
    let mergeOption merge a b =
        match a, b with
        | Some x, Some y -> Some (merge x y)
        | Some x, None   -> Some x
        | None, Some y   -> Some y
        | None, None     -> None

Now, let's write the G-Counter implementation and the explanation will follow:

type GCounter = GCounter of values:Map<ReplicaId, int64> * delta:GCounter option

[<RequireQualifiedAccess>]
module GCounter =
    /// Empty G-counter
    let zero = GCounter(Map.empty, None)
    /// Compute value of the G-counter
    let value (GCounter(v, _)) =
        v |> Map.toSeq |> Seq.map snd |> Seq.fold (+) 0L
    /// Increment G-counter value for a given replica.
    let inc r (GCounter(v, d)) =
        let add1 map = upsert r 1L ((+) 1L) map
        let (GCounter(dmap,None)) = defaultArg d zero
        GCounter(add1 v, Some(GCounter(add1 dmap, None)))
    /// Merge two G-counters.
    let rec merge (GCounter(a, da)) (GCounter(b, db)) = 
        let values = a |> Map.fold (fun acc k va -> upsert k va (max va) acc) b
        let delta = mergeOption merge da db
        GCounter(values, delta)
    /// Merge full-state G-counter with G-counter delta.
    let mergeDelta delta counter = merge counter delta
    /// Split G-counter into full-state G-counter with empty delta, and a delta itself.
    let split (GCounter(v, d)) = GCounter(v, None), d

While this example is more complex than the original GCounter implementation, we again try to keep things fairly easy. Here, we're gonna represent GCounter values as map of partial counters for each replica (just like the last time)... while the detla is optionally a GCounter itself! Why? Because of composition of course :) This way we can merge counters with deltas and deltas themselves using the same merge operation we defined before. In this design the delta of a delta counter will always be None - we don't use it, so there's no risk of creating possibly infinite recursive data structure.

Delta-state increment/decrement counter

Previously, we've build a PNCounter using two GCounters (one for incremented and one for decremented values). While you may ask, how will this work now? This is pretty simple: we'll build a delta of a PNCounter dynamically from deltas of its two components.

type PNCounter = PNCounter of inc:GCounter * dec:GCounter

[<RequireQualifiedAccess>]
module PNCounter =
    let zero = PNCounter(GCounter.zero, GCounter.zero)
    let value (PNCounter(inc, dec)) = GCounter.value inc - GCounter.value dec
    let inc r (PNCounter(inc, dec)) = PNCounter(GCounter.inc r inc, dec)
    let dec r (PNCounter(inc, dec)) = PNCounter(inc, GCounter.inc r dec)
    let rec merge (PNCounter(inc1, dec1)) (PNCounter(inc2, dec2)) =
        PNCounter(GCounter.merge inc1 inc2, GCounter.merge dec1 dec2)
    let mergeDelta delta counter = merge counter delta
    let split (PNCounter(GCounter(inc, a), GCounter(dec, b))) = 
        let delta =
            match a, b with
            | None, None -> None
            | _, _       -> 
                let inc = defaultArg a GCounter.zero
                let dec = defaultArg b GCounter.zero
                Some <| PNCounter(inc, dec)
        PNCounter(GCounter(inc, None), GCounter(dec, None)), delta

I think, the only tricky part here is delta construction. We want it to be a (optionally) PNCounter itself - because of this way we still can preserve commutativity, associativity and idempotency rules, all of our CRDT conform to, but also to reuse implementations of existing functions. Thankfully we can simply build PNCounter from two GCounter and since GCounter.delta itself is a GCounter instance, we can leverage that fact. If one of the deltas is not provided, we simply use a zero element.

Delta-state growing-only set

For the GSet, its implementation is pretty analogous to GCounter:

type GSet<'a when 'a: comparison> = GSet of values:Set<'a> * delta:GSet<'a> option

module GSet =
    let zero = GSet(Set.empty, None)
    let value (GSet(s, _)) = s
    let add elem (GSet(v, d)) =
        let (GSet(delta, None)) = defaultArg d zero
        GSet(Set.add elem v, Some(GSet(Set.add elem delta, None)))
    let rec merge (GSet(a, da)) (GSet(b, db)) = 
        let values = a + b
        let delta = Helpers.mergeOption merge da db
        GSet(values, delta)
    let mergeDelta delta gset = merge gset delta
    let split (GSet(v, d)) = GSet(v, None), d

Just like in case of original state-based CRDTs, we can take advantage of composability.

The technical difference here is that for G-Sets we should acknowledge either deltas or full state updates being broadcasted to all replicas. It was not so necessary for the counter implementations, as if we sent two following updates (D1, D2), if a first one (D1) didn't reach the target, but the later one (D2) did, we'll still end up in correct state - simply because D2 already override the partial counter value of D1.

Final notes about replication protocol

Up to this point our replication protocol is still pretty simple - the only requirement here is that we have to eventually be able to reach every single replica. However this doesn't have to occur immediately and doesn't have to occur in order (read: you're not constrained to use TCP for delta replication).

In the code presented above I'm merging deltas as part of a merge operation. This however is not a requirement, i.e. if you disseminate delta right after the operation being performed, you may as well clear deltas on each merge.

Summary

This part presented how to optimize counters and simple growing only set to take advantage of delta-state CRDTs. In the next post we'll take a closer look as specific optimizations targeting OR-Sets.

Optimizing state-based CRDTs (part 1)

Updating state with deltas

Delta-state growing-only counter

Delta-state increment/decrement counter

Delta-state growing-only set

Final notes about replication protocol

Summary

Comments

Read Next

How does continuous backup and point-in-time recovery work in databases

Conflict-free Replicated Spread Sheets

Updating state with deltas

Delta-state growing-only counter

Delta-state increment/decrement counter

Delta-state growing-only set

Final notes about replication protocol

Summary

Comments

Read Next

Tags