NVMe Storage on Bare Metal
Bare-metal servers provide direct NVMe PCIe access with zero hypervisor overhead. The architectural choice is local NVMe for single-node maximum performance or NVMe-oF for storage disaggregation across a cluster — both on your own hardware.
NVMe-Equipped Instance Types
| NVMe Device Class | Characteristics |
|---|---|
| PCIe 4.0 U.2 / M.2 NVMe | 5–7 GB/s, 500K–1.5M IOPS, 10–20µs latency per device |
| PCIe 5.0 NVMe (Gen5) | 10–14 GB/s sequential; emerging in server platforms (2024+) |
| E1.S / EDSFF NVMe | Data center form factor; higher density per 1U chassis |
| All-NVMe JBOF | NVMe-oF Just-a-Bunch-of-Flash: shared NVMe pool over 100GbE |
Bare metal delivers the full rated IOPS of each device with no virtualization overhead. The Linux NVMe driver (kernel 5.x and later) with io_uring in poll mode achieves sub-5µs submission latency, faster than any cloud-managed disk offering.
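A minimal sketch of enabling NVMe poll queues and measuring polled-I/O latency with fio's io_uring engine. Device name (`nvme0n1`) and queue count are examples; reloading the nvme module assumes no root filesystem on NVMe.

```shell
# Allocate dedicated poll queues for the NVMe driver
# (assumption: the nvme module can be safely reloaded on this host)
modprobe -r nvme && modprobe nvme poll_queues=4

# Verify polling is enabled on the device
cat /sys/block/nvme0n1/queue/io_poll   # 1 = poll queues active

# Measure single-depth random-read latency with io_uring in poll mode
# (--hipri tells fio to use polled completions instead of interrupts)
fio --name=polltest --filename=/dev/nvme0n1 --direct=1 \
    --ioengine=io_uring --hipri --rw=randread --bs=4k \
    --iodepth=1 --runtime=30 --time_based --group_reporting
```

Compare the reported completion latency with and without `--hipri` to see the interrupt-vs-polling difference on your hardware.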
NVMe-oF/TCP on Bare Metal
For multi-server bare-metal clusters, NVMe-oF/TCP disaggregates storage from compute. Storage servers (or a JBOF) expose NVMe namespaces over 25/100GbE to compute nodes. Total latency: 25–40µs. Compute and storage scale independently — add compute without adding storage nodes, and vice versa.
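The export-and-connect flow above can be sketched with the in-kernel nvmet target (configfs) on the storage server and nvme-cli on the compute node. The NQN, IP address, and backing device below are examples, not fixed values.

```shell
# --- Storage server: export a local NVMe namespace over NVMe/TCP ---
modprobe nvmet-tcp
cd /sys/kernel/config/nvmet

# Create a subsystem (NQN is an example name)
mkdir subsystems/nqn.2024-01.io.example:nvme1
echo 1 > subsystems/nqn.2024-01.io.example:nvme1/attr_allow_any_host

# Attach a backing device as namespace 1
mkdir subsystems/nqn.2024-01.io.example:nvme1/namespaces/1
echo /dev/nvme0n1 > subsystems/nqn.2024-01.io.example:nvme1/namespaces/1/device_path
echo 1 > subsystems/nqn.2024-01.io.example:nvme1/namespaces/1/enable

# Expose the subsystem on a TCP port (example address)
mkdir ports/1
echo tcp          > ports/1/addr_trtype
echo ipv4         > ports/1/addr_adrfam
echo 192.168.1.10 > ports/1/addr_traddr
echo 4420         > ports/1/addr_trsvcid
ln -s /sys/kernel/config/nvmet/subsystems/nqn.2024-01.io.example:nvme1 \
      ports/1/subsystems/

# --- Compute node: discover and connect ---
modprobe nvme-tcp
nvme discover -t tcp -a 192.168.1.10 -s 4420
nvme connect  -t tcp -a 192.168.1.10 -s 4420 -n nqn.2024-01.io.example:nvme1
```

After `nvme connect`, the remote namespace appears as a local block device (e.g. `/dev/nvme1n1`) and can be formatted and mounted like any local NVMe drive.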
For the NVMe/TCP protocol deep-dive, see nvme-tcp.com. For a full NVMe-oF architecture overview, see the NVMe-oF guide.
Recommended Storage Architecture
| Tier / Use Case | Recommendation |
|---|---|
| Single-node max perf | Local NVMe PCIe 4/5 + io_uring poll mode |
| Multi-node disaggregated | NVMe-oF/TCP over 25–100GbE Ethernet |
| Kernel config | io_uring, nvme poll mode, NUMA-aware IRQ affinity |
| User-space bypass | SPDK for 10M+ IOPS (dedicated CPU core) |
| HA | NVMe ANA multipath across dual storage controllers |
simplyblock: NVMe/TCP Storage for Bare Metal
simplyblock deploys as a software-defined NVMe/TCP storage cluster on standard Bare Metal instances. It provides Kubernetes CSI, dynamic provisioning, and sub-40µs persistent block storage — without proprietary hardware or cloud-managed disk limits.