Storage Limits

Distributed storage systems must navigate the trade-offs between Consistency, Availability, and Partition Tolerance (the CAP Theorem). This lab uses a distributed MinIO cluster and network chaos tools to visualize these trade-offs in action.

Learning Objectives

Experience network partitions and latency in a distributed storage system.
Understand the real-world implications of the CAP Theorem.
Observe quorum behavior and cluster recovery.

Prerequisites

Docker and Docker Compose installed
Basic understanding of distributed systems
A terminal with curl or a load-testing tool like ab (Apache Bench)
The MinIO Client (mc) binary in your working directory (the commands below invoke it as ./mc)

Step 1: Distributed Cluster Setup

We will deploy a 4-node distributed MinIO cluster with one drive per node. In this setup, a read or write succeeds only if enough nodes are reachable to form a quorum for that operation.

Create a docker-compose.yml file:

services:
  minio1:
    image: quay.io/minio/minio
    volumes:
      - data1-1:/data1
    ports:
      - '9001:9000'
    environment:
      MINIO_ROOT_USER: admin
      MINIO_ROOT_PASSWORD: password
    command: server http://minio{1...4}/data{1...1}

  minio2:
    image: quay.io/minio/minio
    volumes:
      - data2-1:/data1
    ports:
      - '9002:9000'
    environment:
      MINIO_ROOT_USER: admin
      MINIO_ROOT_PASSWORD: password
    command: server http://minio{1...4}/data{1...1}

  minio3:
    image: quay.io/minio/minio
    volumes:
      - data3-1:/data1
    ports:
      - '9003:9000'
    environment:
      MINIO_ROOT_USER: admin
      MINIO_ROOT_PASSWORD: password
    command: server http://minio{1...4}/data{1...1}

  minio4:
    image: quay.io/minio/minio
    volumes:
      - data4-1:/data1
    ports:
      - '9004:9000'
    environment:
      MINIO_ROOT_USER: admin
      MINIO_ROOT_PASSWORD: password
    command: server http://minio{1...4}/data{1...1}

volumes:
  data1-1:
  data2-1:
  data3-1:
  data4-1:

Start the cluster:
```
docker-compose up -d
```
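Verify: Access any node (e.g., localhost:9001) and ensure you can see the health of all 4 nodes in the dashboard. If the dashboard does not load there (the Compose file publishes only container port 9000, the S3 API port), check cluster health with curl -I http://localhost:9001/minio/health/cluster instead; a 200 OK response means the cluster has write quorum.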
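The verification can also be scripted. The following is a minimal sketch, assuming the four host ports from the Compose file above and MinIO's /minio/health/live liveness endpoint; the function name check_nodes and the 2-second timeout are illustrative choices, not part of the lab.

```shell
# Probe the liveness endpoint of each node through its published host port.
# Ports 9001-9004 come from the Compose file; check_nodes is a hypothetical helper.
check_nodes() {
  local port code
  for port in 9001 9002 9003 9004; do
    # -s silences progress, -o discards the body, -w prints only the status code.
    # curl reports 000 when it cannot connect, so a failure is tolerated here.
    code=$(curl -s -o /dev/null --max-time 2 -w '%{http_code}' \
      "http://localhost:${port}/minio/health/live") || true
    if [ "$code" = "200" ]; then
      echo "localhost:${port} up"
    else
      echo "localhost:${port} down (HTTP ${code:-none})"
    fi
  done
}
```

Call check_nodes after the cluster has started; each of the four lines should report "up" before you continue to Step 2.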

Step 2: Baseline Benchmarking

Upload an object:

./mc alias set cluster http://localhost:9001 admin password
./mc mb cluster/test-bucket
dd if=/dev/urandom of=sample.bin bs=1M count=10
./mc cp sample.bin cluster/test-bucket/

Measure Latency: Use a loop to repeatedly read the object and measure the time taken.

time for i in {1..20}; do ./mc cat cluster/test-bucket/sample.bin > /dev/null; done
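The loop above reports only the total time for 20 reads. To compare runs before and after injecting chaos, per-request numbers can be more telling. The following is a sketch under stated assumptions: it relies on GNU date for nanosecond timestamps (%N is not available in BSD/macOS date), and the helper names time_reads and summarize_ms are hypothetical.

```shell
# Summarize latency samples (one value in milliseconds per line) as min/avg/max.
summarize_ms() {
  awk '
    NR == 1 { min = $1 + 0; max = $1 + 0 }
    { v = $1 + 0; sum += v; if (v < min) min = v; if (v > max) max = v }
    END { if (NR > 0) printf "n=%d min=%d avg=%.1f max=%d\n", NR, min, sum / NR, max }
  '
}

# Time each read of the sample object separately and print one sample per line.
# Assumes GNU date (for %N) and the ./mc alias "cluster" configured above.
time_reads() {
  local n="${1:-20}" start end i
  for i in $(seq 1 "$n"); do
    start=$(date +%s%N)
    ./mc cat cluster/test-bucket/sample.bin > /dev/null
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
  done
}
```

Running time_reads 20 | summarize_ms prints a single summary line. Keep the baseline output so you can compare it with the numbers you collect in Step 3.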

Step 3: Inducing Chaos with Toxiproxy

Toxiproxy allows us to simulate network conditions. We will inject latency and then a partition.

Add Toxiproxy to your Compose file: This step is conceptual. In a real lab, you would route the target nodes' traffic through a Toxiproxy container so that the rest of the cluster reaches them only via the proxy.
Simulate Latency: Introduce 500 ms of latency to minio4. Rerun the read loop from Step 2 and observe that it now takes noticeably longer, because requests that involve minio4 must wait for its delayed responses.
Induce a Partition: Completely block traffic to minio3 and minio4.
- Your cluster now has 2 nodes online and 2 offline.
- For a 4-node cluster with default erasure-coding settings, MinIO typically requires N/2 + 1 = 3 nodes for write quorum but only N/2 = 2 nodes for read quorum.
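The latency and partition steps can be driven through Toxiproxy's HTTP API with curl. This is only a sketch: it assumes a Toxiproxy container whose API is reachable at localhost:8474 (Toxiproxy's default API port) and that proxies named minio3 and minio4 already exist and carry those nodes' traffic. The proxy names, the TOXIPROXY_API variable, and both helper functions are hypothetical.

```shell
# Assumed API address; override TOXIPROXY_API if your setup differs.
TOXIPROXY_API="${TOXIPROXY_API:-http://localhost:8474}"

# Add a latency toxic (in milliseconds) to an existing proxy.
add_latency() {
  local proxy="$1" ms="$2"
  curl -s -X POST "${TOXIPROXY_API}/proxies/${proxy}/toxics" \
    -H 'Content-Type: application/json' \
    -d "{\"type\":\"latency\",\"attributes\":{\"latency\":${ms}}}"
}

# Enable or disable a proxy. A disabled proxy refuses connections,
# which approximates a partition for the node behind it.
set_proxy_enabled() {
  local proxy="$1" enabled="$2"   # enabled must be the literal true or false
  curl -s -X POST "${TOXIPROXY_API}/proxies/${proxy}" \
    -H 'Content-Type: application/json' \
    -d "{\"enabled\":${enabled}}"
}
```

With those assumptions, add_latency minio4 500 covers the latency step, and set_proxy_enabled minio3 false followed by set_proxy_enabled minio4 false covers the partition. Passing true re-enables a proxy during recovery.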

Step 4: Observation & Recovery

Test Quorum:
- Attempt to write a new object: ./mc cp sample.bin cluster/test-bucket/new-file.bin
- Attempt to read the existing object: ./mc cat cluster/test-bucket/sample.bin > /dev/null
Analysis:
- Does the write fail? Why? (Likely because only 2 of 4 nodes are reachable, which is below the write quorum.)
- Does the read succeed? MinIO can typically serve reads as long as at least N/2 nodes are up.
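The expected outcomes follow from simple quorum arithmetic, sketched below. The sketch assumes one drive per node, a default parity of 2 for a 4-drive erasure set, and the rule that read quorum equals the number of data shards while write quorum is one higher when data and parity counts are equal. The function names quorum and can_serve are illustrative.

```shell
# Quorum sizes for an erasure set of N drives with P parity shards.
# Assumed rule: read quorum = N - P; write quorum = N - P,
# plus one when the data and parity counts are equal.
quorum() {
  local drives="$1" parity="$2" data rq wq
  data=$(( drives - parity ))
  rq=$data
  wq=$data
  if [ "$data" -eq "$parity" ]; then
    wq=$(( data + 1 ))
  fi
  echo "drives=${drives} parity=${parity} read_quorum=${rq} write_quorum=${wq}"
}

# Report whether an operation can proceed with this many drives online.
can_serve() {
  local online="$1" needed="$2"
  if [ "$online" -ge "$needed" ]; then echo yes; else echo no; fi
}
```

Under these assumptions, quorum 4 2 reports a read quorum of 2 and a write quorum of 3, so with 2 nodes online can_serve 2 2 prints yes for reads and can_serve 2 3 prints no for writes.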

Heal the Cluster: Restore connectivity to the partitioned nodes.

# Remove the toxiproxy blocks or restart the containers
docker-compose restart

Verify Synchronization: Check the container logs (for example, docker-compose logs minio3 minio4) to see the nodes reconnecting and re-syncing any missed data.
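Besides reading logs, recovery can be confirmed by polling the cluster health endpoint until it reports success. The following is a minimal sketch, assuming MinIO's /minio/health/cluster endpoint on the first node's published port; the function name wait_for_cluster and its default of 30 attempts with a 2-second delay are arbitrary choices for illustration.

```shell
# Poll the cluster health endpoint until it returns 200 or attempts run out.
# Arguments (all optional): URL, number of attempts, delay in seconds.
wait_for_cluster() {
  local url="${1:-http://localhost:9001/minio/health/cluster}"
  local attempts="${2:-30}" delay="${3:-2}" i=1 code
  while [ "$i" -le "$attempts" ]; do
    # A connection failure yields a non-200 code and is simply retried.
    code=$(curl -s -o /dev/null --max-time 2 -w '%{http_code}' "$url") || true
    if [ "$code" = "200" ]; then
      echo "healthy after ${i} attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$(( i + 1 ))
  done
  echo "still unhealthy after ${attempts} attempt(s)"
  return 1
}
```

Run wait_for_cluster right after restoring connectivity. A zero exit status indicates that the endpoint reported the cluster healthy again, which you can then cross-check against the logs.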

Lab Reflection

In the CAP theorem, which two letters does MinIO prioritize by default during a partition?
How does adding more nodes (e.g., moving from 4 to 16 nodes) affect the probability of maintaining availability during a network split?
What is the difference between "Strong Consistency" and "Eventual Consistency" in the context of what you saw during the recovery phase?