[openstack-dev] [nova][cinder] Limits on volume read throughput?

Philipp Marek philipp.marek at linbit.com
Thu Mar 3 06:10:41 UTC 2016


Hi Preston,

 
> The benchmark scripts are in:
> 
>   https://github.com/pbannister/openstack-bootstrap
In case that might help, here are a few notes and hints about benchmarking 
the DRBD block device driver:

    http://blogs.linbit.com/p/897/benchmarking-drbd/

Perhaps there's something interesting for you.


> Found that if I repeatedly scanned the same 8GB volume from the physical
> host (with 1/4TB of memory), the entire volume was cached in (host) memory
> (very fast scan times).
If the iSCSI target (or QEMU, for direct access) is set up to use buffer 
cache, yes.
Whether you really want that is open to debate - it might be much more 
beneficial to move that RAM from the hypervisor to the VM, which can then 
do more efficient caching of the filesystem contents it actually operates 
on.
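
If you want to keep the host page cache out of the measurement entirely, 
reading the device with O_DIRECT is one option. A rough, untested Python 
sketch - the 1 MiB read size is just a placeholder, and must stay a 
multiple of the device's logical block size:

    import os, mmap

    BLOCK = 1024 * 1024  # 1 MiB per read

    def scan_direct(path):
        """Read the whole device with O_DIRECT, bypassing the host
        page cache so the backing store is actually measured."""
        fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
        buf = mmap.mmap(-1, BLOCK)   # anonymous mmap => page-aligned buffer
        total = 0
        try:
            while True:
                n = os.readv(fd, [buf])  # fills the aligned buffer
                if n == 0:
                    break
                total += n
        finally:
            os.close(fd)
        return total

(Or simply drop the caches between runs, of course.)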


> Scanning the same volume from within the instance still gets the same
> ~450MB/s that I saw before. 
Hmmm, with iSCSI in between that could be the TCP memcpy limitation.

> The "iostat" numbers from the instance show ~44 %iowait, and ~50 %idle.
> (Which to my reading might explain the ~50% loss of performance.) Why so
> much idle/latency?
> 
> The in-instance "dd" CPU use is ~12%. (Not very interesting.)
Because your "dd" testcase will be single-threaded, io-depth 1.
And that means synchronous access, each IO has to wait for the preceeding 
one to finish...
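
For comparison, keeping several reads in flight at once (what fio does 
with --numjobs/--iodepth) usually looks quite different. A quick, 
hypothetical Python sketch of the idea - split the device into ranges and 
read them concurrently:

    import os
    from concurrent.futures import ThreadPoolExecutor

    BLOCK = 1024 * 1024      # 1 MiB per read
    WORKERS = 8              # effective I/O depth

    def read_range(path, start, end):
        """Sequentially read [start, end) of the device in BLOCK-sized chunks."""
        fd = os.open(path, os.O_RDONLY)
        try:
            off = start
            while off < end:
                data = os.pread(fd, min(BLOCK, end - off), off)
                if not data:
                    break
                off += len(data)
            return off - start
        finally:
            os.close(fd)

    def parallel_scan(path, size):
        """Read WORKERS ranges of the device concurrently, so more than
        one I/O is outstanding at any time."""
        chunk = size // WORKERS
        with ThreadPoolExecutor(max_workers=WORKERS) as pool:
            futures = [pool.submit(read_range, path, i * chunk,
                                   size if i == WORKERS - 1 else (i + 1) * chunk)
                       for i in range(WORKERS)]
            return sum(f.result() for f in futures)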


> Not sure from where the (apparent) latency comes. The host iSCSI target?
> The QEMU iSCSI initiator? Onwards...
Thread scheduling, inter-CPU cache thrashing (if the iSCSI target runs on 
a different physical CPU package/socket than the VM), ...
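
If you suspect cross-socket effects, one thing to try is pinning the 
benchmark process (or the target) to CPUs on a single socket - "taskset" 
from the shell, or from Python roughly:

    import os

    # Hypothetical example: CPUs 0-3 are assumed to sit on the same
    # physical socket as the iSCSI target; adjust to your topology.
    os.sched_setaffinity(0, {0, 1, 2, 3})
    print("running on CPUs:", sorted(os.sched_getaffinity(0)))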


Benchmarking is a dark art.



