NVMe-oF vs iSCSI: The Protocol Comparison
iSCSI has been the dominant networked block storage protocol since 2003. NVMe over Fabrics (NVMe-oF), specifically NVMe/TCP, is its successor. The difference is not incremental: NVMe-oF removes an entire protocol translation layer and replaces a command model designed for spinning disks with one designed for flash. The result is roughly an order of magnitude lower latency, an order of magnitude higher IOPS, and markedly lower CPU overhead per I/O.
Quick Verdict
- New deployments: use NVMe/TCP. Linux kernel support is stable since 5.0. No special hardware required. Same Ethernet infrastructure as iSCSI.
- Existing iSCSI: keep running until hardware refresh. iSCSI is not broken — it just has a latency ceiling that NVMe-oF does not.
- Legacy applications on iSCSI that don't exceed 80K IOPS per volume see no practical reason to migrate immediately.
NVMe-oF vs iSCSI: Head-to-Head
| Attribute | NVMe-oF / NVMe/TCP | iSCSI |
|---|---|---|
| Latency (4K read) | 25–40µs (TCP) | 100–500µs |
| p99 latency under load | <200µs | 10–50ms (protocol overhead + HoL blocking) |
| Max random IOPS (single target) | 1M–5M+ | 100K–300K |
| Command queue depth | 64,000 queues × 64,000 cmds | 1 queue × 128 cmds (SAM limit) |
| Protocol layers | NVMe → NVMe/TCP → TCP/IP | SCSI → iSCSI → TCP/IP (extra translation) |
| Command set size | 13 NVMe commands | 148+ SCSI commands |
| Network requirement | Standard Ethernet (1/10/25/100GbE) | Standard Ethernet (same) |
| Special hardware | None (software only for TCP) | Optional iSCSI HBA (software mode works) |
| Linux kernel support | Stable since kernel 5.0 (2019) | Stable since kernel 2.6 (2004) |
| Kubernetes CSI drivers | Growing: simplyblock, Lightbits, SPDK, etc. | Mature: many vendors (Dell, NetApp, Pure) |
| Multipath / HA | NVMe ANA (Asymmetric Namespace Access) | MPIO (mature, well-tested) |
The SCSI Translation Problem
iSCSI encapsulates SCSI commands inside TCP/IP. When the target receives an iSCSI PDU, it must:
1. Parse the TCP/IP header
2. Reassemble the iSCSI PDU
3. Translate the SCSI CDB (Command Descriptor Block) into a native storage command
4. Issue the command to the underlying NVMe SSD
5. Translate the NVMe completion back to SCSI status
6. Wrap the response in iSCSI and TCP/IP
NVMe/TCP eliminates steps 3 and 5: NVMe commands travel natively from initiator to target, with no SCSI translation. Eliminating that translation is worth 50–200µs per I/O under load, and it is the primary reason NVMe/TCP p99 latency under load is one to two orders of magnitude lower than iSCSI's.
Queue Depth: Why iSCSI IOPS Hit a Ceiling
IOPS = Queue Depth ÷ Latency. iSCSI uses the SCSI Architecture Model (SAM) with a single command queue of 128 commands. At 300µs average latency, the theoretical maximum per queue is 128 / 0.0003 ≈ 426K IOPS, and SAM's head-of-line blocking degrades this in practice to 100K–300K.
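The ceiling formula is easy to sanity-check with a few lines of Python, using the queue depths and latency figures quoted above (these are protocol ceilings, not what any single SSD will actually deliver):

```python
def iops_ceiling(queue_depth: int, latency_s: float) -> int:
    """Theoretical max IOPS a queue can sustain: depth / average latency."""
    return int(queue_depth / latency_s)

# iSCSI: one SAM queue of 128 commands at 300 µs average latency.
iscsi = iops_ceiling(128, 300e-6)
print(f"iSCSI single-queue ceiling: {iscsi:,} IOPS")  # ≈ 426K

# NVMe/TCP: a single queue of 64,000 commands at 30 µs average latency.
# The protocol ceiling is already absurdly high, and NVMe allows up to
# 64,000 such queues; in practice the device, not the protocol, is the limit.
nvme = iops_ceiling(64_000, 30e-6)
print(f"NVMe single-queue ceiling:  {nvme:,} IOPS")
```

The point of the exercise: iSCSI's ceiling is a protocol property, while NVMe pushes the bottleneck back to the hardware.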
NVMe supports 64,000 queues with 64,000 commands each. Modern databases and VMs use many parallel queues. A PostgreSQL instance under heavy load opens 16–32 parallel I/O paths; NVMe lets each path run independently at maximum depth, while iSCSI serializes them through the single SAM queue.
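The serialization effect can be sketched with a toy model. Assume 16 parallel I/O paths each wanting queue depth 32 (both numbers are illustrative; the model ignores head-of-line blocking, which makes the iSCSI side even worse in practice):

```python
# Toy model: N parallel I/O paths, each wanting its own queue depth.
paths, per_path_depth = 16, 32

# iSCSI: every path funnels into one SAM queue capped at 128 outstanding
# commands, so the aggregate depth is clamped to that single-queue limit.
iscsi_effective = min(paths * per_path_depth, 128)

# NVMe: each path gets an independent queue, so depths simply add up.
nvme_effective = paths * per_path_depth

print(iscsi_effective, nvme_effective)  # iSCSI: 128, NVMe: 512
```

Since IOPS scales with outstanding depth at a given latency, the clamp on the iSCSI side translates directly into the IOPS ceiling described above.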
Migrating from iSCSI to NVMe/TCP
The migration is protocol-only: no hardware changes are required if you are already on Ethernet.
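On the initiator side, the whole procedure is a few nvme-cli commands. A minimal sketch, assuming a target at 192.168.1.50 exporting the subsystem `nqn.2024-01.com.example:subsys1` on the default NVMe-oF port 4420 (address, port, and NQN are placeholders for your environment):

```shell
# Load the NVMe/TCP initiator module (in-kernel since Linux 5.0).
modprobe nvme-tcp

# Discover the subsystems the target exports.
nvme discover -t tcp -a 192.168.1.50 -s 4420

# Connect to a subsystem; its namespaces appear as /dev/nvme0n1 (or similar).
nvme connect -t tcp -a 192.168.1.50 -s 4420 \
    -n nqn.2024-01.com.example:subsys1

# Verify the new block device is visible.
nvme list
```

`nvme disconnect -n <nqn>` tears the session down again; for boot-time persistence, distributions typically ship an `nvmf-autoconnect` mechanism with nvme-cli.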
The resulting device (/dev/nvme0n1) is formatted and used identically to a local NVMe SSD.
Existing filesystems, Kubernetes PVCs, and databases require no changes.
See the full NVMe-oF guide and the NVMe/TCP deep-dive at nvme-tcp.com.