[Openstack] CEPH Speed Limit

Caitlin Bestler caitlin.bestler at nexenta.com
Wed Jan 20 17:11:41 UTC 2016



On 1/19/16 5:28 PM, John van Ommen wrote:
> I have a client who isn't happy with the performance of their storage.
> The client is currently running a mix of SAS HDDs and SATA SSDs.
>
> They wanted to remove the SAS HDDs and replace them with SSDs, so the 
> entire array would be SSDs.
>
> I was running benchmarks on the current hardware and I found that the 
> performance of the HDD array was close to the performance of the SSD 
> array.
>
> To me, this indicates that we're reaching the limits of the controller 
> that it's attached to. (An LSI RAID controller that's built into the 
> system board.)
>
> I was about to recommend that they add a controller, when I realized 
> that we may be reaching the limits of the PCI-E bus itself.
>
> Before I go and make a bad recommendation, I have a few questions:
>
> 1) Am I correct in assuming that the RAID controller, though 
> physically on the system board, is still running through the PCI-E 
> bus, just as if it was plugged into a slot?
> 2) Am I correct in assuming that the limit for the PCI-E bus (version 
> 2) is 500 MB/s per lane? (https://en.wikipedia.org/wiki/PCI_Express)
>
> And if points one and two are correct, is my hypothesis that adding 
> more SSDs won't improve things true?
>
> Right now my benchmarks are showing that sequential reads are hitting 
> about 600 MB/s. (I haven't confirmed whether their server is PCI-E 2.0 or 3.0.)
>
You need to examine both bandwidth and latency. The PCI-e bandwidth is 
probably not what determines storage server throughput.
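
As a rough sanity check on the bus itself, here is a sketch in Python; the 
x8 lane count is an assumption for an onboard LSI controller and should be 
confirmed (e.g. with lspci) on the actual server:

    # Rough PCIe slot bandwidth estimate, per direction (a sketch; the
    # lane count is an assumption, not a measurement).
    PCIE2_PER_LANE_MB_S = 500      # PCIe 2.0: ~500 MB/s per lane
    PCIE3_PER_LANE_MB_S = 985      # PCIe 3.0: ~985 MB/s per lane

    lanes = 8                      # assumed x8 link for the onboard controller

    print("PCIe 2.0 x%d: ~%d MB/s" % (lanes, lanes * PCIE2_PER_LANE_MB_S))
    print("PCIe 3.0 x%d: ~%d MB/s" % (lanes, lanes * PCIE3_PER_LANE_MB_S))
    # A measured ~600 MB/s sequential is well below either figure, so the
    # bus itself is unlikely to be the limit.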

If the typical fetch from one server (a RADOS object, in Ceph's case) is 
a few MB or smaller, then the latency of starting the transfer will have 
more impact than the sustained bandwidth of the local transfer.
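
Here is a minimal back-of-the-envelope sketch of that effect, using an 
illustrative 4 MB object (the usual RBD object size) and made-up startup 
latencies; the real numbers have to come from your own measurements:

    # Effective per-stream throughput when each fetch pays a fixed startup
    # latency before data moves (illustrative numbers, not measurements).
    object_mb = 4.0                 # assumed object size fetched per request
    bandwidth_mb_s = 500.0          # assumed sustained local transfer rate

    for latency_ms in (1.0, 5.0, 20.0):   # hypothetical startup latencies
        transfer_s = object_mb / bandwidth_mb_s
        total_s = latency_ms / 1000.0 + transfer_s
        print("latency %5.1f ms -> effective %6.1f MB/s"
              % (latency_ms, object_mb / total_s))
    # Once latency dominates, a faster bus or faster SSDs barely move the result.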

The other big question to examine is where your network bottleneck is. 
Are your servers using GbE or 10 GbE? What efficiency are you getting 
from TCP?
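
For comparison, a quick sketch of roughly where each link type caps out, 
assuming ~94% efficiency after TCP/IP and Ethernet framing overhead; 
measure the real figure (for example with iperf) rather than trusting 
this guess:

    # Rough usable throughput per NIC after protocol overhead (illustrative).
    overhead_factor = 0.94          # assumed TCP/IP + framing efficiency

    for name, gbit in (("GbE", 1), ("10 GbE", 10)):
        line_rate_mb_s = gbit * 1000 / 8.0          # decimal MB/s
        print("%-7s ~%4.0f MB/s usable" % (name, line_rate_mb_s * overhead_factor))
    # A single GbE link (~117 MB/s) would cap a Ceph client long before the
    # local PCIe bus or the SSDs do.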




