Erasure Coding — NVMe Storage Glossary

Erasure coding is a data protection method that splits data into chunks, computes additional parity chunks, and spreads both across multiple storage nodes, so the data can be reconstructed even when some nodes fail. Compared with simple replication, erasure coding achieves comparable fault tolerance with significantly lower storage overhead: typically 1.2–1.5× the size of the data, versus 2–3× for full replication.

How Erasure Coding Works

Erasure coding splits data into k data chunks and generates m parity chunks using algorithms like Reed-Solomon. The data can be reconstructed from any k of the total k+m chunks, so up to m chunks can be lost. A common configuration is 4+2 (4 data chunks, 2 parity chunks): with each chunk on a separate drive or node, the system tolerates 2 simultaneous drive or node failures.
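The "any k of k+m" property can be illustrated with a small sketch. It is a toy Reed-Solomon-style code built on polynomial interpolation over the prime field GF(257), not a production implementation: real systems typically work in GF(2^8) with optimized libraries so that every symbol fits in one byte, whereas parity symbols here can take the value 256. The names lagrange_eval, encode, and reconstruct are hypothetical and chosen only for this example, which needs Python 3.8 or later for the modular inverse via pow.

```python
from itertools import combinations

P = 257  # smallest prime above 255, so every byte value is a field element


def lagrange_eval(points, x):
    """Evaluate at x the unique polynomial of degree < len(points) through points (mod P)."""
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        # pow(den, -1, P) is the modular inverse of den (Python 3.8+)
        total = (total + yi * num * pow(den, -1, P)) % P
    return total


def encode(data, m):
    """Return the k data symbols followed by m parity symbols (systematic code)."""
    k = len(data)
    points = list(enumerate(data))  # data symbol i is the polynomial's value at x = i
    parity = [lagrange_eval(points, x) for x in range(k, k + m)]
    return list(data) + parity


def reconstruct(surviving, k):
    """Rebuild the k data symbols from any k surviving (index, symbol) pairs."""
    if len(surviving) < k:
        raise ValueError("need at least k surviving chunks")
    points = surviving[:k]
    return [lagrange_eval(points, x) for x in range(k)]


data = [68, 65, 84, 65]       # bytes of b"DATA", one symbol per chunk
chunks = encode(data, m=2)    # 4+2 layout: 6 chunks in total

# Drop every possible pair of chunks and check the data is still recoverable.
for lost in combinations(range(len(chunks)), 2):
    surviving = [(i, c) for i, c in enumerate(chunks) if i not in lost]
    assert reconstruct(surviving, k=4) == data
```

The loop covers all 15 ways of losing 2 of the 6 chunks, which is the 4+2 guarantee described above. Losing a third chunk leaves fewer than k symbols, and reconstruct raises an error.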

Erasure Coding vs Replication

Attribute	3× Replication	4+2 Erasure Coding
Storage overhead	3×	1.5×
Fault tolerance	2 node failures	2 node failures
Write latency	Low	Slightly higher (encode step)
Read performance	High (read any copy)	High (parallel stripe reads)
Best for	Low-latency writes	Cost-efficient at scale
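The overhead and fault-tolerance rows follow from simple arithmetic: a k+m layout stores (k+m)/k times the data and survives m lost chunks. The sketch below works through that calculation. The function name protection_profile is hypothetical, treating n-way replication as a 1+(n-1) layout is a modeling convenience, and the 10+2 layout is an assumed example of a wider stripe that is not taken from the table.

```python
def protection_profile(k, m):
    """Return (storage overhead factor, tolerated chunk failures) for a k+m layout."""
    if k < 1 or m < 0:
        raise ValueError("k must be >= 1 and m must be >= 0")
    return (k + m) / k, m


# n-way replication is modeled as 1 data chunk plus n-1 full copies.
layouts = {
    "3x replication": (1, 2),
    "4+2 erasure coding": (4, 2),
    "10+2 erasure coding": (10, 2),  # assumed example of a wider stripe
}

for name, (k, m) in layouts.items():
    overhead, failures = protection_profile(k, m)
    print(f"{name}: {overhead:.1f}x raw capacity, survives {failures} failures")
```

All three layouts survive 2 failures, but the raw capacity needed drops from 3.0× to 1.5× to 1.2× as the stripe gets wider. Wider stripes trade that saving for more nodes involved in each write and each rebuild.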

Erasure Coding in NVMe-oF Storage

Distributed NVMe-oF storage systems such as simplyblock implement erasure coding in user space, often using SPDK. The data and parity chunks of each stripe are written in parallel to multiple NVMe-oF storage nodes, and the parity chunks allow reconstruction after node failures. Because encoding and decoding run in memory in user space, without kernel overhead in the I/O path, latency stays close to that of raw NVMe.
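To make the striping idea concrete, the sketch below maps the chunks of each stripe to distinct storage nodes and rotates the starting node from stripe to stripe so that parity does not always land on the same nodes. This is an illustrative assumption, not simplyblock's or SPDK's actual placement logic. The function place_stripe and the node names are hypothetical.

```python
def place_stripe(stripe_id, nodes, k, m):
    """Map each chunk of one k+m stripe to a distinct node, rotating the start per stripe."""
    width = k + m
    if len(nodes) < width:
        raise ValueError("need at least k+m nodes to keep chunks on separate nodes")
    start = stripe_id % len(nodes)
    placement = {}
    for chunk in range(width):
        label = f"data-{chunk}" if chunk < k else f"parity-{chunk - k}"
        placement[label] = nodes[(start + chunk) % len(nodes)]
    return placement


nodes = [f"node-{i}" for i in range(8)]  # hypothetical cluster of 8 storage nodes

for stripe_id in range(3):
    print(stripe_id, place_stripe(stripe_id, nodes, k=4, m=2))
```

Because every chunk of a stripe sits on a different node, a single node failure removes at most one chunk per stripe, which keeps the failure well within what a 4+2 layout can reconstruct.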