Ceph Benchmark - Fast SSDs and network speeds in a Proxmox VE Ceph Reef cluster
Current fast SSDs provide great performance, and fast network cards are becoming more affordable. Hence, this is a good time to reevaluate how quickly different network setups for Ceph can be saturated, depending on how many OSDs are present in each node.
In this paper we will present the following three key findings regarding hyper-converged Ceph setups with fast disks and high network bandwidth:
- Our benchmarks show that a 10 Gbit/s network can easily be overwhelmed. Even with only one very fast disk, the network quickly becomes a bottleneck.
- A network with a bandwidth of 25 Gbit/s can also become a bottleneck. Nevertheless, some improvements can be gained through configuration changes. For a full-mesh cluster, routing via FRR is preferable to the Rapid Spanning Tree Protocol (RSTP). If no fallback is needed, a simple routed setup may also be a (less resilient) option.
- When using a 100 Gbit/s network, the bottleneck in the cluster finally seems to shift away from the actual hardware and toward the Ceph client. Here we observed write speeds of up to 6000 MiB/s and read speeds of up to 7000 MiB/s for a single client. With multiple clients in parallel, however, writing at up to 9800 MiB/s and reading at up to 19 500 MiB/s was possible.
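Throughput figures like the ones above are commonly obtained with Ceph's built-in `rados bench` tool. As a minimal sketch (the pool name `testpool`, the runtime, and the thread count are assumptions, not values from this paper):

```shell
# Sequential write benchmark: 60 s, 4 MiB objects, 16 concurrent
# operations; keep the objects for the subsequent read test.
rados bench -p testpool 60 write -b 4M -t 16 --no-cleanup

# Sequential read benchmark against the objects written above.
rados bench -p testpool 60 seq -t 16

# Remove the benchmark objects afterwards.
rados -p testpool cleanup
```

Running several such clients in parallel, each on a different node, is what exposes the aggregate cluster throughput rather than the single-client limit.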