Day 6: System Design (Consistency)

Feb 24, 2025

In distributed systems, data is spread across multiple servers, making it hard to keep all copies of data perfectly synchronized. Consistency models define how up-to-date and synchronized data should be when multiple users access a system.

Different applications require different levels of consistency. A banking system needs strong consistency, while a social media feed can work with eventual consistency.

What is Consistency in Distributed Systems?

  • Consistency ensures that all users see the same data across different servers.
  • When multiple copies of data exist (due to replication or sharding), the system must decide when and how to update them.
  • Trade-off: Stronger consistency usually means higher latency and lower availability; weaker consistency usually means faster responses and higher availability.

Types of Consistency Models

Strong Consistency

  • Every read returns the most recent write, ensuring all users see up-to-date data.
  • A write is acknowledged only after the replicas (or a quorum of them) have applied it, so no later read can return the old value.

Example:

  • Banking systems: A user transferring money must see the updated balance instantly.

Pros:

  • Guarantees accuracy and avoids conflicts.

Cons:

  • High latency and lower availability due to synchronization delays.

Eventual Consistency

  • Data updates do not happen immediately across all replicas.
  • Over time, all copies eventually become consistent, but users might see slightly outdated data.
  • Used in systems where high availability and speed are more important than real-time accuracy.
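
The behavior described above can be sketched in a few lines of Python. This is only an illustration: the `Replica` class and the `sync` function are hypothetical stand-ins for real storage nodes and background replication, not part of any library.

```python
class Replica:
    """A hypothetical in-memory replica holding key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def apply(self, key, version, value):
        # Keep the entry with the highest version so replays are harmless.
        current = self.data.get(key)
        if current is None or version > current[0]:
            self.data[key] = (version, value)

    def read(self, key):
        entry = self.data.get(key)
        return entry[1] if entry else None


def sync(source, target):
    """Stand-in for background replication: copy newer entries across."""
    for key, (version, value) in source.data.items():
        target.apply(key, version, value)


a, b = Replica(), Replica()
a.apply("post:1", 1, "hello")   # the write lands on replica A only
stale = b.read("post:1")        # replica B has not seen it yet
sync(a, b)                      # replication catches up later
fresh = b.read("post:1")
print(stale, fresh)             # None hello
```

Between the write and the `sync` call, a reader routed to replica B sees nothing. That gap is the "slightly outdated data" window.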

Example:

  • Social media posts (Twitter, Facebook): A new post might be visible on one server but take a few seconds to reach all the others.

Pros:

  • Fast reads and writes, scalable for high-traffic applications.

Cons:

  • Users might see outdated or missing data temporarily.

Read-Your-Writes Consistency

  • A user always sees their own latest updates, even if the system is eventually consistent for others.
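
One common way to get this guarantee is to remember what each user wrote and fall back to the leader when a replica is behind. The sketch below assumes that approach; `Store` and `Session` are hypothetical names made up for this example.

```python
class Store:
    """Hypothetical node holding key -> (version, value)."""

    def __init__(self):
        self.data = {}

    def put(self, key, version, value):
        self.data[key] = (version, value)

    def get(self, key):
        return self.data.get(key, (0, None))


class Session:
    """Tracks the versions this user wrote so reads never fall behind them."""

    def __init__(self, leader, replica):
        self.leader = leader
        self.replica = replica
        self.written = {}  # key -> version of this user's last write

    def write(self, key, value):
        version = self.leader.get(key)[0] + 1
        self.leader.put(key, version, value)
        self.written[key] = version

    def read(self, key):
        version, value = self.replica.get(key)
        if version >= self.written.get(key, 0):
            return value                 # replica has caught up
        return self.leader.get(key)[1]   # replica is behind: ask the leader


leader, replica = Store(), Store()
me = Session(leader, replica)
other = Session(leader, replica)
me.write("doc", "draft v2")
mine = me.read("doc")        # "draft v2", served by the leader
theirs = other.read("doc")   # None, because the replica has not synced yet
```

The writer sees the new draft right away, while another user reading from the lagging replica does not.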

Example:

  • Google Docs: If you edit a document, you immediately see your changes, even if others don’t see them right away.

Pros:

  • Ensures personal changes are reflected instantly.

Cons:

  • Doesn’t guarantee other users see the latest updates immediately.

Monotonic Read Consistency

  • A user will never read older versions of data after seeing a newer version.
  • Ensures that once a user sees an update, they will always see at least that version.
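
A minimal sketch of one way to enforce this on the client side: remember the highest version seen per key and skip any replica that is older. `MonotonicReader` is a hypothetical helper, and replicas are modeled as plain dicts of key -> (version, value).

```python
class MonotonicReader:
    def __init__(self):
        self.last_seen = {}  # key -> highest version this client has read

    def read(self, key, replicas):
        floor = self.last_seen.get(key, 0)
        for replica in replicas:
            version, value = replica.get(key, (0, None))
            if version >= floor:
                self.last_seen[key] = version
                return value
        raise LookupError("no replica is fresh enough yet")


fresh = {"inbox": (3, ["msg1", "msg2", "msg3"])}
stale = {"inbox": (2, ["msg1", "msg2"])}

reader = MonotonicReader()
first = reader.read("inbox", [fresh])           # sees version 3
second = reader.read("inbox", [stale, fresh])   # skips the stale replica
```

Without the version check, the second read would hit the stale replica first and "msg3" would seem to disappear.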

Example:

  • Email systems: If you open an email on one device, it should not disappear when opened on another device later.

Pros:

  • Ensures a smooth user experience.

Cons:

  • Slightly increases complexity in data synchronization.

Causal Consistency

  • Ensures that operations that are causally related are seen by every node in the same order; unrelated (concurrent) operations may be seen in different orders.
  • If Action A happens before Action B and B depends on A, then anyone who sees B must also see A.
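
A rough sketch of how a replica might enforce this: every operation lists the operations it depends on, and the replica holds it back until those have been applied. `CausalReplica` and the operation ids are hypothetical; real systems usually track dependencies with vector clocks rather than explicit id lists.

```python
class CausalReplica:
    def __init__(self):
        self.applied = []   # operation ids in the order they became visible
        self.pending = []   # (op_id, deps) still waiting on dependencies

    def receive(self, op_id, deps=()):
        self.pending.append((op_id, set(deps)))
        self._drain()

    def _drain(self):
        # Keep applying operations until nothing else becomes ready.
        progress = True
        while progress:
            progress = False
            for item in list(self.pending):
                op_id, deps = item
                if deps <= set(self.applied):
                    self.applied.append(op_id)
                    self.pending.remove(item)
                    progress = True


replica = CausalReplica()
replica.receive("reply", deps=["post"])   # arrives first, must wait
held_back = list(replica.applied)         # []
replica.receive("post")
visible = list(replica.applied)           # ["post", "reply"]
```

Even though the reply reached this replica before the post, readers never see it on its own.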

Example:

  • Facebook comments: A reply should never appear before the comment it answers.

Pros:

  • Avoids logical errors in distributed applications.

Cons:

  • More complex than eventual consistency.

Linearizability vs. Sequential Consistency

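  • Linearizability: Each operation appears to take effect at a single instant between its start and its end, so the order of operations always respects real time.
  • Sequential Consistency: All nodes see operations in the same order, and each client’s own operations stay in program order, but that order does not have to match real time.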

Example:

  • Stock trading platforms need linearizability so that the same shares are never sold to two buyers.
  • Online multiplayer games use sequential consistency to ensure a fair order of moves.

ACID Transactions (Strong Consistency)

  • Atomicity: All operations in a transaction happen together or not at all.
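  • Consistency: A transaction moves the data from one valid state to another, so constraints hold before and after. (This is about data validity, not about replicas agreeing with each other.)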
  • Isolation: Concurrent transactions don’t interfere with each other.
  • Durability: Once committed, a transaction is permanently stored.
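
Atomicity is the easiest of the four to see in code. The sketch below uses Python’s built-in `sqlite3` module; the `accounts` table, the `transfer` function, and the balances are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, "
             "balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money atomically; returns True on commit, False on rollback."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE accounts SET balance = balance + ? "
                         "WHERE name = ?", (amount, dst))
            conn.execute("UPDATE accounts SET balance = balance - ? "
                         "WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "alice", "bob", 30)
rejected = transfer(conn, "alice", "bob", 500)   # would overdraw alice
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, rejected, balances)
```

In the rejected transfer, the credit to bob runs first and succeeds, but the debit violates the `CHECK` constraint, so the whole transaction is rolled back and bob’s balance is unchanged.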

Example:

  • Banking transactions (MySQL, PostgreSQL, Google Spanner) rely on ACID to prevent lost or partial updates.

BASE (Eventually Consistent)

  • Basically Available: System remains functional even if some nodes fail.
  • Soft State: System allows temporary inconsistencies.
  • Eventual Consistency: Data synchronizes over time, rather than immediately.

Example:

  • NoSQL databases (Cassandra, DynamoDB) use BASE to provide high availability and scalability.

Distributed Consensus Algorithms

To maintain consistency across multiple servers, distributed systems use consensus algorithms (Paxos, Raft) and atomic commit protocols (2PC, 3PC) to agree on data changes.

1. Paxos Algorithm

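  • A quorum-based algorithm in which nodes agree on a single value as long as a majority of them stay reachable. Practical deployments (Multi-Paxos) usually elect a stable leader to drive the rounds.
  • Tolerates node crashes and message loss, but is notoriously complex to implement.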

Example:

  • Google’s Chubby lock service uses Paxos to manage distributed configurations.

2. Raft Algorithm

  • A simpler alternative to Paxos, used for leader election and log replication.
  • Ensures all nodes agree on the latest data.
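
One small piece of Raft’s log replication is deciding which entries are safe to commit: an entry counts as committed once it is stored on a majority of nodes. The helper below is a simplified sketch of that rule only. The function name is made up, and it leaves out the extra check real Raft makes (the entry must belong to the leader’s current term) as well as elections and log repair.

```python
def majority_commit_index(match_index):
    """Highest log index stored on a majority of nodes.

    match_index lists, for every node including the leader, the last
    log index known to be replicated there.
    """
    ranked = sorted(match_index, reverse=True)
    majority = len(match_index) // 2 + 1
    return ranked[majority - 1]


# Five-node cluster: leader at index 7, followers at 7, 5, 4 and 2.
commit = majority_commit_index([7, 7, 5, 4, 2])
print(commit)  # 5, because three of the five nodes hold index 5 or later
```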

Example:

  • etcd (the key-value store behind Kubernetes) and Consul use Raft for distributed coordination.

3. Two-Phase Commit (2PC)

  • A transaction protocol ensuring consistency across multiple databases.
  • Step 1: Prepare Phase — Nodes agree to commit or abort.
  • Step 2: Commit Phase — If all agree, the transaction is finalized.
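
The two phases can be sketched as a coordinator function plus a hypothetical `Participant` class. This version runs in one process and ignores timeouts, crashes, and durable logging, which are the parts that make real 2PC hard.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Vote yes only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    # Phase 2: commit only on a unanimous yes, otherwise abort everywhere.
    if all(votes):
        for p in participants:
            p.commit()
        return "committed"
    for p in participants:
        p.abort()
    return "aborted"


good = two_phase_commit([Participant("db_a"), Participant("db_b")])
failing = [Participant("db_a"), Participant("db_b", can_commit=False)]
bad = two_phase_commit(failing)
print(good, bad)  # committed aborted
```

A single "no" vote aborts the transaction on every participant, including the ones that voted yes.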

Example:

  • Bank transfers between two accounts stored in different databases.

4. Three-Phase Commit (3PC)

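  • Extends 2PC with an extra pre-commit phase and timeouts, so participants are not blocked forever waiting on a failed coordinator.
  • Can make progress when the coordinator crashes, but it assumes a reliable network and can still go wrong under network partitions.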

Example:

  • Rarely deployed in practice; systems whose transactions span multiple regions more often combine 2PC with a consensus algorithm such as Paxos or Raft.

Read and Write Consistency Techniques

Read Consistency Models

  • Strong Reads: Always returns the latest committed write.
  • Bounded Staleness Reads: Returns data that is slightly outdated but within a defined limit.
  • Monotonic Reads: Ensures that once a user sees a version, they never see an older version.

Example:

  • Google Drive ensures strong reads so a user always sees the latest document version.

Write Consistency Models

  • Read-Your-Writes: Ensures a user sees their own writes immediately.
  • Write-Through vs. Write-Behind Caching: Defines how quickly data is committed to storage.
  • Quorum-Based Writes: Requires a majority of nodes to confirm a write before considering it successful.
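
For quorum-based writes, the usual rule of thumb is that with N replicas, W write acknowledgements, and R read replies, a read is guaranteed to overlap the latest successful write when R + W > N. Here is a small sketch of that idea. The function names are made up, replicas are plain dicts, and `None` models a node that is down.

```python
def quorums_overlap(n, w, r):
    """True when every read quorum shares a node with every write quorum."""
    return r + w > n


def quorum_write(replicas, key, version, value, w):
    """Write to every reachable replica; succeed only with w acknowledgements."""
    acks = 0
    for replica in replicas:
        if replica is not None:   # None models an unreachable node
            replica[key] = (version, value)
            acks += 1
    return acks >= w


def quorum_read(replicas, key, r):
    """Ask r reachable replicas and return the newest value among the replies."""
    reachable = [rep for rep in replicas if rep is not None]
    replies = [rep.get(key, (0, None)) for rep in reachable][:r]
    if len(replies) < r:
        raise RuntimeError("not enough replicas for a read quorum")
    return max(replies, key=lambda reply: reply[0])[1]


nodes = [{}, {}, {}]   # N = 3, W = 2, R = 2
ok = quorum_write([nodes[0], nodes[1], None], "cart", 1, ["book"], w=2)
value = quorum_read([None, nodes[1], nodes[2]], "cart", r=2)
print(quorums_overlap(3, 2, 2), ok, value)  # True True ['book']
```

The write missed node 2 and the read missed node 0, yet the read still returns the new value because both quorums include node 1.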

Example:

  • Dynamo-style stores such as Cassandra let each write choose how many replicas must acknowledge it (for example QUORUM) to balance consistency and speed.

Conflict Resolution Strategies in Distributed Systems

In eventual consistency, different replicas may have conflicting data. These conflicts must be resolved.

1. Last-Write-Wins (LWW)

  • The latest timestamped update overwrites older ones.
  • Fast, but concurrent writes made on different replicas are silently discarded, and clock skew between servers can make the wrong write win.
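
LWW is tiny in code, which is a big part of its appeal. In this sketch each replica’s copy is a `(timestamp, value)` pair; the timestamps and values are made up.

```python
def lww_merge(a, b):
    """Each argument is (timestamp, value); the later timestamp wins.

    Equal timestamps fall back to comparing the values, so every replica
    picks the same winner no matter which copy it sees first.
    """
    return max(a, b)


replica_eu = (1700000005, "shipping: express")
replica_us = (1700000003, "shipping: standard")
winner = lww_merge(replica_eu, replica_us)
print(winner[1])  # shipping: express
```

The update made on the US replica is simply dropped, even if the user there never saw the EU change. That is the data loss LWW trades for simplicity.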

Example:

  • Cassandra resolves concurrent writes to the same column with LWW, using the timestamp attached to each write.

2. Version Vectors

  • Each replica keeps track of its update history.
  • Helps identify conflicting versions and merge changes.
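
A version vector can be modeled as a dict of replica id -> update counter. The sketch below shows how two vectors are compared to tell "newer" apart from "conflicting"; the function names and return labels are assumptions made for this example.

```python
def compare(a, b):
    """Compare two version vectors (dicts of replica id -> counter).

    Returns "equal", "a_newer", "b_newer", or "conflict".
    """
    keys = set(a) | set(b)
    a_ahead = any(a.get(k, 0) > b.get(k, 0) for k in keys)
    b_ahead = any(b.get(k, 0) > a.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "conflict"   # concurrent updates, the values need a merge
    if a_ahead:
        return "a_newer"
    if b_ahead:
        return "b_newer"
    return "equal"


def merge(a, b):
    """Element-wise maximum, recorded once a conflict has been resolved."""
    return {k: max(a.get(k, 0), b.get(k, 0)) for k in set(a) | set(b)}


r1 = compare({"A": 2, "B": 1}, {"A": 1, "B": 1})    # "a_newer"
r2 = compare({"A": 2, "B": 1}, {"A": 1, "B": 2})    # "conflict"
merged = merge({"A": 2, "B": 1}, {"A": 1, "B": 2})  # {"A": 2, "B": 2}
```

Unlike LWW, nothing is thrown away here: a conflict is detected and handed to the application (or the user) to merge.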

Example:

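  • Amazon’s original Dynamo store (the design that inspired DynamoDB) used vector clocks, a close relative of version vectors, to detect conflicting shopping cart updates.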

3. Operational Transformation (OT)

  • Used in real-time collaborative applications.
  • Ensures users see consistent changes even if edits happen simultaneously.
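
The core of OT is a transform function that adjusts one edit so it still makes sense after a concurrent edit has been applied. The sketch below handles only text inserts, written as `(position, text)` pairs, and uses a made-up tie-break flag. It is not how Google Docs is actually implemented.

```python
def apply_insert(text, op):
    pos, chars = op
    return text[:pos] + chars + text[pos:]


def transform(op, against, op_wins_ties):
    """Shift op so it can be applied after `against` has already been applied."""
    pos, chars = op
    other_pos, other_chars = against
    if other_pos < pos or (other_pos == pos and not op_wins_ties):
        pos += len(other_chars)
    return (pos, chars)


base = "helo"
op_a = (3, "l")   # user A fixes the typo  -> "hello"
op_b = (4, "!")   # user B adds a "!"      -> "helo!"

# Site A applies its own edit first, then B's transformed edit.
site_a = apply_insert(apply_insert(base, op_a),
                      transform(op_b, op_a, op_wins_ties=False))
# Site B applies its own edit first, then A's transformed edit.
site_b = apply_insert(apply_insert(base, op_b),
                      transform(op_a, op_b, op_wins_ties=True))
print(site_a, site_b)  # hello! hello!
```

Both sites apply the edits in a different order but end up with the same text, which is the convergence OT aims for.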

Example:

  • Google Docs uses OT to merge real-time edits.

4. Conflict-Free Replicated Data Types (CRDTs)

  • Data structures designed to automatically merge conflicting updates.
  • Used for highly available distributed databases.
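
One of the simplest CRDTs is the grow-only counter (G-Counter): each replica increments only its own slot, and merging takes the maximum per slot. The `GCounter` class below is a minimal sketch written for this post.

```python
class GCounter:
    """Grow-only counter: one slot per replica, merged by element-wise max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount=1):
        mine = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = mine + amount

    def merge(self, other):
        for rid, count in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), count)

    def value(self):
        return sum(self.counts.values())


a, b = GCounter("a"), GCounter("b")
a.increment()
a.increment()
b.increment()
a.merge(b)
b.merge(a)
a.merge(b)   # repeated or reordered merges are safe
total = (a.value(), b.value())
print(total)  # (3, 3)
```

Because the merge is commutative, associative, and idempotent, replicas can exchange state in any order and as often as they like and still converge.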

Example:

  • Riak ships CRDT-based data types (counters, sets, maps), and Redis Enterprise uses CRDTs for active-active replication.

Single-Leader, Multi-Leader, and Leaderless Replication

1. Leader-Based (Primary-Replica) Consistency

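  • One server acts as the leader and handles all writes, while replicas copy its changes and can serve reads.
  • Reads from the leader are strongly consistent, but reads from an asynchronous replica can be stale, and the leader can become a write bottleneck.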

Example:

  • MySQL with primary-replica replication.

2. Multi-Leader Replication

  • Multiple nodes accept writes, then synchronize with each other.
  • Solves single-leader failures but increases conflict resolution complexity.

Example:

  • CouchDB and MySQL Group Replication (in multi-primary mode) accept writes on several nodes. Google Spanner, by contrast, elects one Paxos leader per group of data.

3. Leaderless Replication

  • Any replica can process a read or write request.
  • Uses quorum-based consistency to determine the final state.

Example:

  • Cassandra and Riak, both modeled on Amazon’s Dynamo design, use leaderless replication for high availability.

Advanced Database Consistency Techniques

1. Global Consistency vs. Local Consistency

  • Global Consistency: Ensures that all data copies worldwide remain synchronized.
  • Local Consistency: Guarantees consistency only within a specific region.

Example:

  • Google Spanner offers global consistency, while AWS Aurora prioritizes local consistency.

2. Geo-Distributed Consistency Challenges

  • High latency in cross-region replication.
  • Time synchronization issues affect data ordering.
  • Conflict resolution becomes harder in multi-region deployments.

Example:

  • Distributed SQL databases like CockroachDB are designed to address these geo-distributed consistency issues.

3. Hybrid Consistency Models

  • Many modern databases provide a mix of strong and eventual consistency.
  • Users can choose consistency levels per query to optimize performance.

Example:

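  • Azure Cosmos DB offers five consistency levels (strong, bounded staleness, session, consistent prefix, eventual). A default is set per account and can be relaxed per request.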

From strong consistency in banking systems to eventual consistency in social networks, different applications need different trade-offs.

I’ll be posting daily to stay consistent in both my learning and my daily pushups. Thank you!

Follow my journey:
Medium: https://ankittk.medium.com/
Instagram: https://www.instagram.com/ankitengram/

Ankit Kumar