Latency
Storage latency is the time elapsed between issuing an I/O request and receiving the response, measured in microseconds (µs) or milliseconds (ms). NVMe SSDs complete reads in 10–20µs; NVMe-oF over TCP adds network and protocol overhead for an end-to-end total of 25–40µs. Legacy storage protocols are an order of magnitude slower: iSCSI typically runs at 100–300µs and NFS at 500µs–2ms.
Latency in NVMe Storage
For networked storage, latency has two major components: device latency (time inside the SSD) and network latency (transmission time). NVMe-oF over TCP keeps both small: NVMe SSDs deliver 10–20µs device latency, and a single TCP round trip on a local network adds 5–20µs.
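This end-to-end time can be probed from userspace by timing a single read. A minimal sketch (hypothetical helper name; a cached read returns in about a microsecond, so real device measurements normally bypass the page cache with O_DIRECT or use a tool such as fio):

```python
import os
import time

def read_latency_us(path, size=4096, offset=0):
    """Time a single pread() of `size` bytes and return elapsed microseconds.

    Measures syscall + cache/device time end to end; for an NVMe-oF target,
    the same call also includes the network round trip.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        t0 = time.perf_counter_ns()
        os.pread(fd, size, offset)
        t1 = time.perf_counter_ns()
    finally:
        os.close(fd)
    return (t1 - t0) / 1_000
```

`time.perf_counter_ns` is used rather than `time.time` because it is monotonic and has nanosecond resolution, which matters when the interval being measured is tens of microseconds.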
Latency Reference Points
- DRAM access: ~100ns (0.1µs)
- Local NVMe SSD (PCIe 4): 10–20µs
- NVMe-oF over TCP (local LAN): 25–40µs
- NVMe-oF over RDMA (RoCE): 10–20µs
- iSCSI: 100–300µs
- NFS over 10GbE: 500µs–2ms
- Cloud EBS (AWS): 100–500µs typical
- HDD: 5–10ms
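Encoded as data, the reference points above can back a quick sanity check on a measurement. A sketch using approximate midpoints of each band (the table and helper are illustrative, not a standard API):

```python
# Approximate midpoints (µs) of the reference bands listed above.
LATENCY_TIERS_US = [
    ("DRAM access", 0.1),
    ("Local NVMe SSD (PCIe 4)", 15),
    ("NVMe-oF over RDMA (RoCE)", 15),
    ("NVMe-oF over TCP (local LAN)", 32),
    ("iSCSI", 200),
    ("Cloud EBS (AWS)", 300),
    ("NFS over 10GbE", 1_250),
    ("HDD", 7_500),
]

def nearest_tier(latency_us):
    """Name the reference tier whose midpoint is closest to a measured latency."""
    return min(LATENCY_TIERS_US, key=lambda t: abs(t[1] - latency_us))[0]
```

For example, a measured read of 35µs lands in the NVMe-oF over TCP band, while 8ms points at a spinning disk (or a badly overloaded network path).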
Tail Latency (p99, p99.9)
Average latency is often misleading. Database workloads care about tail latency — the 99th or 99.9th percentile response time. NVMe-oF over TCP with a well-tuned network typically shows p99 below 200µs; iSCSI p99 can spike to 10–50ms under load due to protocol overhead and head-of-line blocking.
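Tail percentiles are computed from the sorted sample distribution, not the mean. A nearest-rank sketch showing how a handful of stalled requests vanish in the average but dominate p99 (hypothetical function; production monitoring usually keeps a histogram, e.g. HdrHistogram, rather than storing every sample):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample >= fraction p of all samples."""
    if not samples or not 0 < p <= 1:
        raise ValueError("need samples and 0 < p <= 1")
    s = sorted(samples)
    k = math.ceil(p * len(s)) - 1  # nearest-rank index
    return s[k]

# 1,000 requests: 98% fast, 2% stalled at 5ms.
lat_us = [30] * 980 + [5_000] * 20
# mean is ~129µs, but percentile(lat_us, 0.99) returns 5000 -- the stalls.
```

This is why SLOs for database storage are stated as p99/p99.9 targets rather than averages: the 2% of stalled requests above barely move the mean yet define the worst experience most users will regularly hit.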