<div dir="ltr"><div>Hi,</div><p>I wanted to express my sincere gratitude for all the help and advice you've given me. I followed your suggestions and carried out a bunch of tests, but unfortunately, the performance boost I was hoping for hasn't materialized.</p><p>Let me break down the configurations I've tried and the results I got. Just to give you some context, all my tests were done using two INTEL SSDPE2MD400G4 NVMe disks and Ubuntu 20.04 LTS as the OS on the compute node. You can find all the nitty-gritty details in [1] and [2]. Additionally, I've shared the results of the fio tests executed directly on the RAID directory within the compute node in [3].</p><p>Then, I expanded my testing to instances, and here's what I found:</p><ol><li>I tested things out with the default settings and an Ubuntu 22.04 LTS image. The IOPS results were hovering around 18-18.5k. Check out [4] and [5] for the specifics.</li><li>I tweaked the nova.conf file with two changes: force_raw_images = true and images_type = flat. Unfortunately, this only brought the IOPS down a bit, to just under 18k. You can see more in [6] and [7].</li><li>I made an extra change in nova.conf by switching the cpu_model from SandyBridge to IvyBridge. This change dropped the IOPS further, to around 17k. Details are in [8] and [9].</li><li>Lastly, I played around with image properties, setting hw_scsi_model=virtio-scsi and hw_disk_bus=scsi. However, this also resulted in around 17k IOPS. You can find out more in [10] and [11].</li></ol><p>It's a bit disheartening that none of these changes seemed to have the impact I was aiming for. So, I'm starting to think there might be a crucial piece of the puzzle that I'm missing here. 
If you have any ideas or insights, I'd be incredibly grateful for your input.</p><p>Thanks once more for all your help and support.</p><p>/Jan Wasilewski<br></p><div><br></div><div><i>References: <br></i></div><div><i>[1] Disk details and raid details: <a href="https://paste.openstack.org/show/bRyLPZ6TDHpIEKadLC7z/">https://paste.openstack.org/show/bRyLPZ6TDHpIEKadLC7z/</a></i></div><div><i>[2] Compute node and nova details: <a href="https://paste.openstack.org/show/bcGw3Glm6U0r1kUsg8nU/">https://paste.openstack.org/show/bcGw3Glm6U0r1kUsg8nU/</a></i></div><div><i>[3] fio results executed in raid directory inside compute node: <a href="https://paste.openstack.org/show/bN0EkBjoAP2Ig5PSSfy3/">https://paste.openstack.org/show/bN0EkBjoAP2Ig5PSSfy3/</a></i></div><div><i>[4] dumpxml of instance from test 1: <a href="https://paste.openstack.org/show/bVSq8tz1bSMdiYXcF3IP/">https://paste.openstack.org/show/bVSq8tz1bSMdiYXcF3IP/</a></i></div><div><i>[5] fio results from test 1: <a href="https://paste.openstack.org/show/bKlxom8Yl7NtHO8kO53a/">https://paste.openstack.org/show/bKlxom8Yl7NtHO8kO53a/</a></i></div><div><i>[6] dumpxml of instance from test 2: <a href="https://paste.openstack.org/show/bN2JN9DXT4DGKNZnzkJ8/">https://paste.openstack.org/show/bN2JN9DXT4DGKNZnzkJ8/</a></i></div><div><i>[7] fio results from test 2: <a href="https://paste.openstack.org/show/b7GXIVI2Cv0qkVLQaAF3/">https://paste.openstack.org/show/b7GXIVI2Cv0qkVLQaAF3/</a></i></div><div><i>[8] dumpxml of instance from test 3: <a href="https://paste.openstack.org/show/b0821V4IUq8N7YPb73sg/">https://paste.openstack.org/show/b0821V4IUq8N7YPb73sg/</a></i></div><div><i>[9] fio results from test 3: <a href="https://paste.openstack.org/show/bT1Erfxq4XTj0ubTTgdj/">https://paste.openstack.org/show/bT1Erfxq4XTj0ubTTgdj/</a></i></div><div><i>[10] dumpxml of instance from test 4: <a 
href="https://paste.openstack.org/show/bjTXM0do1xgzmVZO02Q7/">https://paste.openstack.org/show/bjTXM0do1xgzmVZO02Q7/</a></i></div><div><i>[11] fio results from test 4: <a href="https://paste.openstack.org/show/bpbVJntkR5aNke3trtRd/">https://paste.openstack.org/show/bpbVJntkR5aNke3trtRd/</a></i></div><div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Wed, 9 Aug 2023 at 19:56, Damian Pietras <<a href="mailto:damian.pietras@hardit.pl">damian.pietras@hardit.pl</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>I would suggest the following:</p>
<p>- Make sure that the "none" I/O scheduler is used inside the VM (check
e.g. /sys/block/sda/queue/scheduler). I am assuming a fairly recent
kernel; on older kernels use "noop" instead.<br>
</p>
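<p>A minimal sketch of that check, to be run inside the VM. It only reads sysfs and assumes the usual layout in which the active scheduler is shown in square brackets; the function names are made up for illustration.</p>

```python
from pathlib import Path


def active_scheduler(text):
    """Return the scheduler marked with [brackets] in a sysfs scheduler line.

    If the line holds a single unbracketed name, return that name.
    Return None when nothing can be identified.
    """
    tokens = text.split()
    for token in tokens:
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    if len(tokens) == 1:
        return tokens[0]
    return None


def report_schedulers(sys_block=Path("/sys/block")):
    """Map each block device name to its active I/O scheduler."""
    result = {}
    for sched_file in sorted(sys_block.glob("*/queue/scheduler")):
        # .../sda/queue/scheduler: two levels up is the device directory
        device = sched_file.parent.parent.name
        result[device] = active_scheduler(sched_file.read_text())
    return result


if __name__ == "__main__":
    for device, scheduler in report_schedulers().items():
        print(f"{device}: {scheduler}")
```

<p>Any disk that does not report "none" (or "noop" on older kernels) is worth changing before repeating the fio run.</p>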
<p>- Make sure that the host has CPU C-states deeper than C1 disabled
(check the values of all /sys/devices/system/cpu/*/cpuidle/state*/disable
for which [..]/name is anything other than "POLL", C1, or C1E), or use
a tool that disables them.<br>
</p>
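<p>The C-state check can be sketched roughly as follows, to be run on the compute host. It only reads the sysfs files mentioned above and changes nothing; the helper names are hypothetical, and the set of "shallow" state names is taken directly from the list in this message.</p>

```python
from pathlib import Path

# State names that the advice above treats as fine to leave enabled.
SHALLOW_STATES = {"POLL", "C1", "C1E"}


def is_deep_state(name):
    """True if a cpuidle state name is anything other than POLL, C1 or C1E."""
    return name.strip().upper() not in SHALLOW_STATES


def deep_states_still_enabled(cpu_root=Path("/sys/devices/system/cpu")):
    """List (cpu, state name) pairs for deep states whose 'disable' file is not 1."""
    enabled = []
    for state_dir in sorted(cpu_root.glob("cpu[0-9]*/cpuidle/state[0-9]*")):
        name = (state_dir / "name").read_text().strip()
        disabled = (state_dir / "disable").read_text().strip() == "1"
        if is_deep_state(name) and not disabled:
            # .../cpu0/cpuidle/state3: two levels up is the CPU directory
            enabled.append((state_dir.parent.parent.name, name))
    return enabled


if __name__ == "__main__":
    for cpu, name in deep_states_still_enabled():
        print(f"{cpu}: {name} is still enabled")
```

<p>An empty result means every state other than POLL, C1 and C1E is already disabled on all CPUs.</p>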
<p>- Use raw images instead of qcow2: in nova.conf set images_type=flat
in the [libvirt] section and force_raw_images=True (which, as far as I
know, belongs in [DEFAULT]), then recreate the instance.<br>
</p>
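<p>To double-check what actually ended up in nova.conf, something like the sketch below could help. It uses Python's configparser rather than Nova's own oslo.config loader, the sample text is invented, and my understanding (please verify against the Nova configuration reference) is that force_raw_images is registered under [DEFAULT] while images_type lives under [libvirt].</p>

```python
import configparser

# Invented sample; on a compute node you would read /etc/nova/nova.conf instead.
SAMPLE_NOVA_CONF = """
[DEFAULT]
force_raw_images = True

[libvirt]
images_type = flat
"""


def image_backend_settings(conf_text):
    """Return the two image-related options as written in nova.conf text."""
    # strict=False tolerates repeated keys; interpolation=None leaves '%' alone.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # configparser lets every section inherit [DEFAULT] keys, so looking in
    # [libvirt] also sees a force_raw_images line written under [DEFAULT].
    return {
        "force_raw_images": parser.get("libvirt", "force_raw_images", fallback=None),
        "images_type": parser.get("libvirt", "images_type", fallback=None),
    }


if __name__ == "__main__":
    print(image_backend_settings(SAMPLE_NOVA_CONF))
```

<p>A value of None means the option is not set in the file, so Nova falls back to its default. Existing instances keep their current disk format until they are recreated.</p>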
<p>Is the difference still this large when you lower the I/O depth (for
example to 1) or increase the block size (for example to 64k)?<br>
</p>
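<p>One way to run that comparison consistently on the host and in the VM is to generate the fio command lines from a single baseline. The baseline below (4k random reads at queue depth 32) is only a placeholder, since the real job parameters are in the linked pastes; the target path is hypothetical too. The sketch only builds and prints the commands, it does not run fio.</p>

```python
import shlex

# Placeholder baseline; replace with the parameters of the original fio job.
BASELINE = {"rw": "randread", "bs": "4k", "iodepth": 32}


def fio_command(target, **overrides):
    """Build a fio argument list for one run, overriding baseline values."""
    job = {**BASELINE, **overrides}
    return [
        "fio",
        "--name=compare",
        f"--filename={target}",
        "--ioengine=libaio",
        "--direct=1",
        f"--rw={job['rw']}",
        f"--bs={job['bs']}",
        f"--iodepth={job['iodepth']}",
        "--size=1G",
        "--runtime=30",
        "--time_based",
        "--output-format=json",
    ]


def comparison_runs(target):
    """Baseline plus the two variations asked about above."""
    return {
        "baseline": fio_command(target),
        "iodepth=1": fio_command(target, iodepth=1),
        "bs=64k": fio_command(target, bs="64k"),
    }


if __name__ == "__main__":
    for label, argv in comparison_runs("/tmp/fio-test.bin").items():
        print(f"{label}: {shlex.join(argv)}")
```

<p>If the host-to-VM gap shrinks a lot at I/O depth 1 or at 64k blocks, the bottleneck is more likely per-request overhead in the virtualization path than raw disk throughput.</p>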
<p><br>
</p>
<div>On 09/08/2023 10:02, Jan Wasilewski
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Hi,</div>
<p>I am reaching out to inquire about the performance of our
local storage setup. Currently, I am conducting tests using
NVMe disks; however, the results appear to be underwhelming.</p>
<p>In terms of my setup, I have recently incorporated two NVMe
disks into my compute node. These disks have been configured
as RAID1 under md127 and subsequently mounted at
/var/lib/nova/instances [1]. During benchmarking using the fio
tool within this directory, I am achieving approximately
160,000 IOPS [2]. This figure serves as a satisfactory
baseline and reference point for upcoming VM tests.</p>
<p>As the next phase, I have established a flavor that employs a
root disk for my virtual machine [3]. Regrettably, the
resulting performance yields around 18,000 IOPS, which is
nearly ten times poorer than the compute node results [4].
While I expected some degradation, a tenfold decrease seems
excessive. Realistically, I anticipated no more than a twofold
reduction compared to the compute node's performance. Hence, I
am led to ask: what should be configured to enhance
performance?</p>
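<p>For reference, the gap described above works out as follows. This is a quick sketch using only the two approximate figures quoted in this message; the function names are purely illustrative.</p>

```python
HOST_IOPS = 160_000  # fio on the RAID1 mount on the compute node (approximate)
VM_IOPS = 18_000     # fio inside the instance (approximate)


def slowdown(host_iops, vm_iops):
    """How many times slower the VM is than the host."""
    return host_iops / vm_iops


def target_iops(host_iops, max_slowdown):
    """IOPS the VM would need in order to stay within the given slowdown."""
    return host_iops / max_slowdown


if __name__ == "__main__":
    print(f"observed slowdown: {slowdown(HOST_IOPS, VM_IOPS):.1f}x")     # 8.9x
    print(f"IOPS needed for 2x: {target_iops(HOST_IOPS, 2):.0f}")        # 80000
```

<p>So the VM would need to reach roughly 80,000 IOPS to meet the twofold expectation, against the roughly 18,000 measured.</p>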
<p>I have already experimented with the settings recommended on
the Ceph page for image properties [5]; however, these changes
did not yield the desired improvements. In addition, I
attempted to modify the CPU architecture within the nova.conf
file, switching to Cascade Lake architecture, yet this
endeavor also proved ineffective. For your convenience, I have
included a link to my current dumpxml results [6].</p>
<p>Your insights and guidance would be greatly appreciated. I am
confident that there is a solution to this performance
disparity that I may have overlooked. Thank you in advance for
your help.</p>
<div>/Jan Wasilewski<br>
</div>
<div><br>
</div>
<div><i>References:</i></div>
<div><i>[1] nvme allocation and raid configuration: <a href="https://paste.openstack.org/show/bMMgGqu5I6LWuoQWV7TV/" target="_blank">https://paste.openstack.org/show/bMMgGqu5I6LWuoQWV7TV/</a></i></div>
<div><i>[2] fio performance inside compute node: <a href="https://paste.openstack.org/show/bcMi4zG7QZwuJZX8nyct/" target="_blank">https://paste.openstack.org/show/bcMi4zG7QZwuJZX8nyct/</a></i></div>
<div><i>[3] Flavor configuration: <a href="https://paste.openstack.org/show/b7o9hCKilmJI3qyXsP5u/" target="_blank">https://paste.openstack.org/show/b7o9hCKilmJI3qyXsP5u/</a></i></div>
<div><i>[4] fio performance inside VM: <a href="https://paste.openstack.org/show/bUjqxfU4nEtSFqTlU8oH/" target="_blank">https://paste.openstack.org/show/bUjqxfU4nEtSFqTlU8oH/</a></i></div>
<div><i>[5] image properties: <a href="https://docs.ceph.com/en/pacific/rbd/rbd-openstack/#image-properties" target="_blank">https://docs.ceph.com/en/pacific/rbd/rbd-openstack/#image-properties</a></i></div>
<div><i>[6] dumpxml of vm: <a href="https://paste.openstack.org/show/bRECcaSMqa8TlrPp0xrT/" target="_blank">https://paste.openstack.org/show/bRECcaSMqa8TlrPp0xrT/</a></i></div>
</div>
</blockquote>
<pre cols="72">--
Damian Pietras</pre>
</div>
</blockquote></div>