Erasure Coding
Erasure coding is a data protection method that encodes data across multiple storage nodes using mathematical algorithms, allowing data to be reconstructed even when some nodes fail. Unlike simple replication, erasure coding achieves fault tolerance with significantly lower storage overhead — typically 1.2–1.5× vs 2–3× for full replication.
How Erasure Coding Works
Erasure coding splits data into k data chunks and generates m parity chunks using algorithms like Reed-Solomon. The data can be reconstructed from any k of the total k+m chunks. A common configuration is 4+2 (4 data chunks, 2 parity) — the system tolerates 2 simultaneous drive or node failures.
Erasure Coding vs Replication
| Attribute | 3× Replication | 4+2 Erasure Coding |
|---|---|---|
| Storage overhead | 3× | 1.5× |
| Fault tolerance | 2 node failures | 2 node failures |
| Write latency | Low | Slightly higher (encode step) |
| Read performance | High (read any copy) | High (parallel stripe reads) |
| Best for | Low-latency writes | Cost-efficient at scale |
Erasure Coding in NVMe-oF Storage
Distributed NVMe-oF storage systems like simplyblock implement erasure coding in user space (often using SPDK). Data stripes are written across multiple NVMe-oF storage nodes simultaneously, and parity chunks allow reconstruction after node failures — all with near-NVMe latency because the encoding/decoding runs in memory without kernel overhead.