Skip to main content

CAP Theorem and PACELE Theorem

What is CAP Theorem?

CAP Theorem, also known as Brewer's theorem, states that a distributed system can only simultaneously provide two out of the follwing three desirable properties: Consistency (C), Availability (A), Partition Tolerance (p).

  • Consistency (C): Users can read or write from/to any node in the system with all nodes returning the same and most-recently-written data at the same time, or return error/timeout when consistency cannot be guaranteed.

  • Availability (A): Every request received by a non-failing node in the system must return a response in a reasonable amout of time. Even when severe network failures occur, every request must terminate.

  • Partition tolerance (P): System's ability to continue operating even if there are partitions due to failed connection (even if nodes are working properly) in the system by having sufficiently duplicated data scattered across nodes to keep the system up before partitions are resolved.

This statement leaves us with three possible combinations: CA, CP, and AP.

However, one of the eight fallacies about the distributed system is that networks are reliable. In fact, network failure happens unpredictably and frequently (e.g packet loss, rourter malfunction), such that systems that don't guarantee partition tolerance in fact resembles a monolithic system and is not a coherent option.

Thus, the essence of CAP can be rephrased as: In the presence of a network partition, a distributed system must choose either Consistency or Availability.

What to choose? CP or AP?

  • High Availability (AP):

    A system is is considered High Availability when it aims to ensure an agreed level of operational performance, usually uptime, for a higher than normal period (wiki).

    Systems used in hospitals and data centers are examples that require high availability to perform daily routine activities.

    • Three principles are used when designing systems to ensure high availability:

      • Single points of failure: A single point of failure is a component that would cause the whole system to fail if it fails.

      • Reliable crossover: System must have redundant backup components to take over a failed one to perform reliable crossover or failove without losing data or affecting performance.

      • Failure detectability: Failures must be visible and, ideally, systems have built-in automation to handle the failure on their own.

  • High Consistency (CP):

What didn't CAP theorem cover?

Conclusion

CAP merely states the tradeoffs when designing a distributed system, and one system does not necessarily falls into a specific category as a whole. For instance, nodes performing login within a social network system like Twitter might choose consistency over availablity (CP), but nodes performing timeline related functions can allow a short period of inconsistency (AP).

Most importantly, long period of network partition, though inevitable, is in fact not that common in practice and thus we as programmers should still try to attempt CA but not strictly stick to the CAP theorem. As Eric Brewer quoted:

The modern CAP goal should be to maximize combinations of consistency and availability that make sense for the specific application. Such an approach incorporates plans for operation during a partition and for recovery afterward, thus helping designers think about CAP beyond its historically perceived limitations.

-- Eric Brewer, CAP Twelve Years Later: How the "Rules" Have Changed

Reference