Distributed storage systems must navigate the trade-offs between Consistency, Availability, and Partition Tolerance (the CAP Theorem). This lab uses a distributed MinIO cluster and network chaos tools to visualize these trade-offs in action.
Learning Objectives
- Experience network partitions and latency in a distributed storage system.
- Understand the real-world implications of the CAP Theorem.
- Observe quorum behavior and cluster recovery.
Prerequisites
- Docker and Docker Compose installed
- Basic understanding of distributed systems
- A terminal with
curlor a load-testing tool likeab(Apache Bench)
Step 1: Distributed Cluster Setup
We will deploy a 4-node distributed MinIO cluster. In a 4-node setup, MinIO requires a quorum of nodes to be available to perform operations.
-
Create a
docker-compose.ymlfile:services: minio1: image: quay.io/minio/minio volumes: - data1-1:/data1 ports: - '9001:9000' environment: MINIO_ROOT_USER: admin MINIO_ROOT_PASSWORD: password command: server http://minio{1...4}/data{1...1} minio2: image: quay.io/minio/minio volumes: - data2-1:/data1 ports: - '9002:9000' environment: MINIO_ROOT_USER: admin MINIO_ROOT_PASSWORD: password command: server http://minio{1...4}/data{1...1} minio3: image: quay.io/minio/minio volumes: - data3-1:/data1 ports: - '9003:9000' environment: MINIO_ROOT_USER: admin MINIO_ROOT_PASSWORD: password command: server http://minio{1...4}/data{1...1} minio4: image: quay.io/minio/minio volumes: - data4-1:/data1 ports: - '9004:9000' environment: MINIO_ROOT_USER: admin MINIO_ROOT_PASSWORD: password command: server http://minio{1...4}/data{1...1} volumes: data1-1: data2-1: data3-1: data4-1: -
Start the cluster:
docker-compose up -d -
Verify: Access any node (e.g.,
localhost:9001) and ensure you can see the health of all 4 nodes in the dashboard.
Step 2: Baseline Benchmarking
-
Upload an object:
./mc alias set cluster http://localhost:9001 admin password ./mc mb cluster/test-bucket dd if=/dev/urandom of=sample.bin bs=1M count=10 ./mc cp sample.bin cluster/test-bucket/ -
Measure Latency: Use a loop to repeatedly read the object and measure the time taken.
time for i in {1..20}; do ./mc cat cluster/test-bucket/sample.bin > /dev/null; done
Step 3: Inducing Chaos with Toxiproxy
Toxiproxy allows us to simulate network conditions. We will inject latency and then a partition.
-
Add Toxiproxy to your Compose file: (Conceptual - in a real lab, you would route
minio1's traffic through a proxy container). -
Simulate Latency: Introduce 500ms of latency to
minio4. Observe how the "time for i in..." loop now takes significantly longer as the cluster waits for nodes to synchronize. -
Induce a Partition: Completely block traffic to
minio3andminio4.- Your cluster now has 2 nodes online and 2 offline.
- For a 4-node cluster, MinIO typically requires
(N/2 + 1)nodes for a "Read+Write" quorum.
Step 4: Observation & Recovery
-
Test Quorum:
- Attempt to Write a new object:
mc cp sample.bin cluster/test-bucket/new-file.bin - Attempt to Read the existing object:
mc cat cluster/test-bucket/sample.bin
- Attempt to Write a new object:
-
Analysis:
- Does the write fail? Why? (Likely due to lack of write quorum).
- Does the read succeed? MinIO often allows reads if at least
N/2nodes are up.
-
Heal the Cluster: Restore connectivity to the partitioned nodes.
# Remove the toxiproxy blocks or restart the containers docker-compose restart -
Verify Synchronization: Check the logs to see the nodes handshaking and re-syncing any missed data.
Lab Reflection
- In the CAP theorem, which two letters does MinIO prioritize by default during a partition?
- How does adding more nodes (e.g., moving from 4 to 16 nodes) affect the probability of maintaining availability during a network split?
- What is the difference between "Strong Consistency" and "Eventual Consistency" in the context of what you saw during the recovery phase?