NVMe Storage on Bare-Metal Servers

Bare-metal servers give you full access to NVMe hardware without hypervisor overhead. The main architectural choice is between local NVMe, which maximizes single-node performance, and NVMe over Fabrics (NVMe-oF) disaggregation, which lets compute and storage scale independently across a cluster.

The Storage Challenge

Why NVMe Storage Fits

Zero hypervisor overhead

On bare metal, the OS accesses NVMe devices directly over PCIe. There is no virtio-blk, no emulation layer, and no hypervisor scheduler in the I/O path, so applications can reach the device's full rated IOPS.

NVMe-oF as a scale-out strategy

NVMe-oF disaggregation lets you add storage capacity independently of compute. A cluster of 10 bare-metal compute nodes can share a single NVMe-oF storage pool, balancing storage utilization without adding compute servers.
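As a rough illustration of how a compute node might attach to such a pool over NVMe/TCP, the sketch below prints the nvme-cli commands involved rather than running them. The function name `nvmeof_attach_cmds` is made up for this example, and the address and NQN in the usage note are placeholders, not values from a real deployment.

```shell
# Print the nvme-cli commands a compute node would run to attach a remote
# NVMe/TCP subsystem. Nothing is executed here; review the output first.
# $1: target IP address (placeholder in the example below)
# $2: subsystem NQN     (placeholder in the example below)
# $3: TCP port, defaulting to 4420 (the registered NVMe-oF port)
nvmeof_attach_cmds() (
    addr="$1"
    nqn="$2"
    port="${3:-4420}"
    printf '%s\n' "modprobe nvme-tcp"
    printf '%s\n' "nvme discover -t tcp -a $addr -s $port"
    printf '%s\n' "nvme connect -t tcp -a $addr -s $port -n $nqn"
    printf '%s\n' "nvme list"
)
```

Calling `nvmeof_attach_cmds 192.0.2.10 nqn.2024-01.com.example:pool1` prints four commands; piping that output to `sh` as root would execute them. Every compute node in the cluster would run the same sequence against the shared target.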

Kernel NVMe tuning

Linux's NVMe driver and block layer expose queue depth, CPU affinity, and polling settings. With proper tuning (io_uring plus NVMe polled queues), bare-metal NVMe can achieve <5µs submission latency.
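Before changing any of these settings, it helps to see what the kernel currently reports. The sketch below reads a few standard block-layer sysfs attributes and the nvme module's `poll_queues` parameter. The function name `nvme_queue_report` and the optional second argument (an alternate sysfs root, useful for dry runs) are assumptions made for this example.

```shell
# Report block-layer and driver settings relevant to NVMe tuning.
# $1: block device name, e.g. nvme0n1
# $2: sysfs root (optional, defaults to /sys; hypothetical hook for testing)
nvme_queue_report() (
    dev="$1"
    sysfs="${2:-/sys}"
    queue_dir="$sysfs/block/$dev/queue"

    # Per-device request queue attributes exposed by the block layer.
    for attr in nr_requests io_poll rq_affinity nomerges; do
        if [ -r "$queue_dir/$attr" ]; then
            printf '%s=%s\n' "$attr" "$(cat "$queue_dir/$attr")"
        else
            printf '%s=unavailable\n' "$attr"
        fi
    done

    # Number of queues the nvme driver reserves for polled I/O.
    poll_param="$sysfs/module/nvme/parameters/poll_queues"
    if [ -r "$poll_param" ]; then
        printf 'poll_queues=%s\n' "$(cat "$poll_param")"
    else
        printf 'poll_queues=unavailable\n'
    fi
)
```

Run it as `nvme_queue_report nvme0n1`. A `poll_queues` value of 0 generally means the driver has no dedicated polling queues; on recent kernels they can be requested when the module loads (for example `nvme.poll_queues=4` on the kernel command line), and fio can then issue polled I/O with `--hipri`.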

SPDK for maximum throughput

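SPDK (Storage Performance Development Kit) bypasses the kernel block layer entirely with a polled, user-space NVMe driver. On bare metal, a single dedicated CPU core can drive 10M+ IOPS. Because that exceeds what one typical NVMe device delivers, reaching it means spreading the load across several devices.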
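A quick back-of-the-envelope calculation shows why a figure like this needs a dedicated, polling core. The 3 GHz clock rate below is an assumed value for illustration, and `cycles_per_io` is a helper name invented for this sketch.

```shell
# CPU cycles available per I/O on one core.
# $1: core clock in MHz; $2: target rate in millions of IOPS.
# MHz divided by millions of IOPS gives cycles per I/O (integer division).
cycles_per_io() {
    echo $(( $1 / $2 ))
}

cycles_per_io 3000 10   # assumed 3 GHz core at 10M IOPS: prints 300
```

A budget of roughly 300 cycles per I/O leaves little room for system calls, interrupts, or context switches, which is why SPDK polls for completions from user space instead.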

Reference Architecture

| Layer | Recommendation |
| --- | --- |
| Local NVMe use case | Single-node maximum performance, no HA requirement |
| NVMe-oF use case | Multi-node cluster with shared storage pool |
| Kernel config | io_uring, nvme poll mode, NUMA-aware IRQ affinity |
| SPDK | User-space NVMe driver for 10M+ IOPS (dedicated core) |
| Transport (NVMe-oF) | NVMe/TCP (software) or NVMe/RoCE (RDMA NICs) |

Benchmark This Workload

io_uring engine + direct I/O for bare-metal NVMe benchmark

fio --name=nvme-direct --ioengine=io_uring --iodepth=128 \
    --rw=randread --bs=4k --direct=1 \
    --size=10G --filename=/dev/nvme0n1 \
    --time_based --runtime=60
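A single queue depth gives only one point on the latency/throughput curve. As a sketch of how the same job could be swept across several depths, the helper below prints one fio command per depth, each writing JSON results to its own file. The name `fio_depth_sweep` and the `qdN.json` output file names are choices made for this example; the `--size` limit is omitted so each run covers the whole device for the full 60 seconds.

```shell
# Print one read-only fio command per queue depth.
# $1: device path; remaining arguments: queue depths to test.
# Commands are only printed; pipe the output to `sh` to run them.
fio_depth_sweep() (
    dev="$1"
    shift
    for depth in "$@"; do
        cmd="fio --name=nvme-qd$depth --ioengine=io_uring --iodepth=$depth"
        cmd="$cmd --rw=randread --bs=4k --direct=1 --time_based --runtime=60"
        cmd="$cmd --filename=$dev --output-format=json --output=qd$depth.json"
        printf '%s\n' "$cmd"
    done
)
```

For example, `fio_depth_sweep /dev/nvme0n1 1 32 128` prints three commands. Depth 1 approximates per-I/O latency, while the deeper runs show how far IOPS scale before latency climbs.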

Need shared block storage at NVMe speed?

NVMe over Fabrics (NVMe-oF) extends NVMe performance across standard Ethernet, delivering block storage with 25–40µs latency to any host in your cluster.

simplyblock provides production NVMe/TCP block storage for Kubernetes and bare-metal — no proprietary hardware required.