NVMe-oF vs iSCSI: The Protocol Comparison

iSCSI has been the dominant networked block storage protocol since 2003. NVMe over Fabrics (NVMe-oF), and specifically NVMe/TCP, is its successor. The difference is not incremental: NVMe-oF removes an entire protocol translation layer and replaces a command model inherited from the spinning-disk era with one designed for flash. The result is roughly 10× lower latency, roughly 10× higher IOPS, and far lower per-I/O CPU overhead.

NVMe-oF vs iSCSI: Head-to-Head

Attribute                       | NVMe-oF / NVMe/TCP                      | iSCSI
Latency (4K read)               | 25–40µs (TCP)                           | 100–500µs
p99 latency under load          | <200µs                                  | 10–50ms (protocol overhead + HoL blocking)
Max random IOPS (single target) | 1M–5M+                                  | 100K–300K
Command queue depth             | 64,000 queues × 64,000 cmds             | 1 queue × 128 cmds (SAM limit)
Protocol layers                 | NVMe → NVMe/TCP → TCP/IP                | SCSI → iSCSI → TCP/IP (extra translation)
Command set size                | 13 NVMe commands                        | 148+ SCSI commands
Network requirement             | Standard Ethernet (1/10/25/100GbE)      | Standard Ethernet (same)
Special hardware                | None (software-only for TCP)            | Optional iSCSI HBA (software mode works)
Linux kernel support            | Stable since kernel 5.0 (2019)          | Stable since kernel 2.6 (2004)
Kubernetes CSI drivers          | Growing: simplyblock, Lightbits, SPDK   | Mature: Dell, NetApp, Pure, others
Multipath / HA                  | NVMe ANA (Asymmetric Namespace Access)  | MPIO (mature, well-tested)

The SCSI Translation Problem

iSCSI encapsulates SCSI commands inside TCP/IP. When the target receives an iSCSI PDU, it must:

  1. Parse the TCP/IP header
  2. Reassemble the iSCSI PDU
  3. Translate the SCSI CDB (Command Descriptor Block) into a native storage command
  4. Issue the command to the underlying NVMe SSD
  5. Translate the NVMe completion back to SCSI status
  6. Wrap in iSCSI and TCP/IP for the response

NVMe-oF/TCP eliminates steps 3 and 5: NVMe commands travel natively from initiator to target, with no SCSI translation. Eliminating that translation saves roughly 50–200µs per I/O under load, and it is the primary reason NVMe/TCP p99 latency is more than an order of magnitude lower than iSCSI's under the same workload.
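The six-step pipeline above can be sketched as a back-of-the-envelope latency budget. The per-stage costs below are illustrative assumptions chosen to fall inside the ranges the article cites, not measurements:

```python
# Illustrative per-I/O latency model for the two protocol stacks.
# Stage costs (in microseconds) are assumed round numbers for illustration.

ISCSI_STAGES_US = {
    "tcp_parse": 5,                    # step 1: parse TCP/IP header
    "pdu_reassembly": 10,              # step 2: reassemble iSCSI PDU
    "scsi_nvme_translation": 60,       # steps 3 + 5: CDB -> NVMe and back
    "nvme_device_io": 80,              # step 4: the actual SSD access
    "response_encapsulation": 10,      # step 6: wrap the response
}

# NVMe/TCP runs the same pipeline minus the translation stage.
NVME_TCP_STAGES_US = {
    k: v for k, v in ISCSI_STAGES_US.items() if k != "scsi_nvme_translation"
}

iscsi_total = sum(ISCSI_STAGES_US.values())     # 165 µs
nvme_total = sum(NVME_TCP_STAGES_US.values())   # 105 µs
print(f"iSCSI: {iscsi_total} µs per I/O")
print(f"NVMe/TCP: {nvme_total} µs per I/O")
print(f"Saved by removing translation: {iscsi_total - nvme_total} µs")
```

Under these assumed stage costs, the translation layer alone accounts for the 60µs gap between the two stacks; real savings depend on target implementation and load.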

Queue Depth: Why iSCSI IOPS Hit a Ceiling

IOPS = Queue Depth ÷ Latency. iSCSI uses the SCSI Architecture Model (SAM) with a single command queue, typically capped at 128 outstanding commands. At 300µs average latency, the theoretical maximum IOPS per queue is 128 ÷ 0.0003 ≈ 426K, but head-of-line blocking in that single queue degrades this in practice to 100K–300K.

NVMe supports 64,000 queues with 64,000 commands each. Modern databases and VMs use many parallel queues. A PostgreSQL instance under heavy load opens 16–32 parallel I/O paths; NVMe lets each path run independently at maximum depth, while iSCSI serializes them through the single SAM queue.
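The ceiling arithmetic above can be reproduced directly. The NVMe-side queue count and latency below are assumptions for illustration (16 queues, one per assumed CPU core, at 35µs), not figures from the article:

```python
# Theoretical IOPS ceiling: IOPS = queue_depth / latency_in_seconds.

def iops_ceiling(queue_depth: int, latency_us: float) -> float:
    """Maximum IOPS a single queue can sustain at a given average latency."""
    return queue_depth / (latency_us / 1_000_000)

# iSCSI: one SAM queue of 128 commands at 300 µs average latency.
iscsi = iops_ceiling(128, 300)          # ~426,667 IOPS, before HoL blocking

# NVMe/TCP: assume 16 parallel queues at depth 128 and 35 µs latency.
nvme = 16 * iops_ceiling(128, 35)       # theoretical; device-bound in practice

print(f"iSCSI ceiling: {iscsi:,.0f} IOPS")
print(f"NVMe/TCP ceiling (16 queues): {nvme:,.0f} IOPS")
```

The point is not the absolute NVMe number (a real SSD saturates long before the protocol does) but that the iSCSI figure is a hard protocol ceiling, while NVMe's ceiling scales with queue count.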

Migrating from iSCSI to NVMe/TCP

The migration is protocol-only; no hardware changes are required if you're already on Ethernet:

# Install nvme-cli
apt-get install nvme-cli
# Load the NVMe/TCP kernel module
modprobe nvme-tcp
# Discover targets
nvme discover -t tcp -a <target-ip> -s 4420
# Connect
nvme connect -t tcp -a <target-ip> -s 4420 -n <subsystem-nqn>
# Verify the new namespace appears
nvme list

The resulting device (e.g. /dev/nvme0n1) can be formatted and used exactly like a local NVMe SSD. Existing filesystems, Kubernetes PVCs, and databases require no changes. See the full NVMe-oF guide and the NVMe/TCP deep-dive at nvme-tcp.com.