Poor I/O performance on OpenStack block device (OpenStack CentOS 8, Ussuri)
I have a problem with I/O performance on OpenStack block devices backed by the HDD class.

Environment:
- OpenStack version: Ussuri
  - OS: CentOS 8
  - Kernel: 4.18.0-240.15.1.el8_3.x86_64
  - KVM: qemu-kvm-5.1.0-20.el8
- Ceph version: Octopus 15.2.8-0.el8.x86_64
  - OS: CentOS 8
  - Kernel: 4.18.0-240.15.1.el8_3.x86_64

The Ceph cluster (BlueStore) has two device classes:
- HDD (only for Cinder volumes)
- SSD (images and Cinder volumes)

Hardware:
- Ceph client network: 2x10 Gbps (bonded), MTU 9000
- Ceph replication network: 2x10 Gbps (bonded), MTU 9000

VM:
- swap disabled
- no LVM

Issue:
When creating a VM on OpenStack with a Cinder volume on the HDD class, write performance is really poor: 60-85 MB/s, and ioping shows high latency.

Diagnostics:
1. I checked performance between the Compute host (OpenStack) and Ceph by creating an RBD image (HDD class) and mounting it on the Compute host: 300-400 MB/s. So I think the problem is in the hypervisor. However, when I test a VM using a Cinder volume on the SSD class, the result matches the RBD (SSD) test mounted on the Compute host.
2. I have already tried disk_cachemodes="network=writeback" (with the RBD client cache enabled) and disk_cachemodes="none"; neither makes a difference.
3. iperf3 from the Compute host to a random Ceph host still shows 20 Gb/s of traffic.
4. The Compute host and Ceph hosts are connected to the same (layer 2) switch.

Where else can I look for issues? Please help me with this case. Thank you.
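For reference, the direct-on-host comparison described in diagnostic step 1 can be reproduced with commands along these lines (pool and image names are hypothetical; assumes krbd, fio, and ioping are installed on the compute host):

```shell
# Create and map a throwaway test image in the HDD-class pool
rbd create volumes-hdd/bench-test --size 10G
DEV=$(rbd map volumes-hdd/bench-test)

# Sequential write throughput, bypassing the page cache
fio --name=seqwrite --filename="$DEV" --rw=write --bs=4M \
    --iodepth=16 --direct=1 --runtime=30 --time_based

# Latency, to compare against the in-VM ioping numbers
ioping -c 20 "$DEV"

# Clean up
rbd unmap "$DEV"
rbd rm volumes-hdd/bench-test
```

Running the same fio and ioping invocations inside the guest against the attached Cinder volume gives a like-for-like comparison.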
On 07/07, Vinh Nguyen Duc wrote:
> I have a problem with I/O performance on Openstack block device HDD.
> [environment and issue details snipped; see the original message above]
Hi,

I probably won't be able to help you on the hypervisor side, but I have a couple of questions that may help narrow down the issue:

- Are the Cinder volumes using encryption?
- How did you connect the volume to the Compute Host, using krbd or rbd-nbd?
- Do both RBD images (Cinder's and yours) have the same Ceph flags?
- Did you try connecting to the Compute Host the same RBD image created by Cinder, instead of creating a new one?

Cheers,
Gorka.
> [diagnostics snipped; see the original message above]
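Regarding the question about Ceph flags: the feature sets of the Cinder-created image and a manually created one can be compared with `rbd info` (pool and image names below are hypothetical placeholders):

```shell
# Features of the Cinder-created volume (Cinder names RBD images volume-<UUID>)
rbd info volumes-hdd/volume-<uuid> | grep features

# Features of a manually created test image in the same pool
rbd info volumes-hdd/bench-test | grep features
```

A mismatch in features such as object-map, exclusive-lock, or striping settings between the two images could explain a performance difference.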
On 07/07, Vinh Nguyen Duc wrote:
> I have a problem with I/O performance on Openstack block device HDD.
> [environment and issue details snipped]
On Thu, 2022-07-07 at 12:06 +0200, Gorka Eguileor wrote:
> Hi,
> I probably won't be able to help you on the hypervisor side, but I have a couple of questions that may help narrow down the issue:
> - Are Cinder volumes using encryption?

If you are not using encryption, you might be encountering a librados issue, tracked downstream by https://bugzilla.redhat.com/show_bug.cgi?id=1897572. This is unfixable without updating to a newer version of the Ceph client libraries.
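The usual client-side mitigation for logging overhead (which the reporter later mentions having applied) is disabling debug logging for the librados/librbd client in ceph.conf on the compute hosts; a sketch, with illustrative values:

```ini
[client]
# Client-side debug logging can cost significant throughput;
# these settings silence the noisiest subsystems (illustrative)
debug_ms = 0/0
debug_rbd = 0/0
debug_rados = 0/0
debug_objecter = 0/0
```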
> - How did you connect the volume to the Compute Host, using krbd or rbd-nbd?

In Ussuri we still technically have the workaround options to use krbd, but they are deprecated and were removed in Xena:
https://github.com/openstack/nova/blob/stable/ussuri/nova/conf/workarounds.p...

In general, using these options might invalidate any support agreement you may have with a vendor. We are aware of at least one edge case currently where enabling this with encrypted volumes breaks live migration, potentially causing data loss: https://bugs.launchpad.net/nova/+bug/1939545. There is a backport in flight for the fix to Train (https://review.opendev.org/q/topic:bug%252F1939545), but it has only been backported to Wallaby so far, so it is not safe to enable those options and use live migration today.

You should also be aware that to enable this option on a host you need to drain the host first, then enable the option and cold migrate instances to the host. Live migration between hosts with local attach enabled and disabled is not supported. If you want to disable it again in the future, which you will have to do in order to upgrade to Xena, you need to cold migrate all instances again.

So if you are deploying your own version of Ceph and can move to a newer version that has the librados performance enhancement feature, that is operationally less painful than using these workarounds. The only reason we developed this workaround to use krbd in Nova was that our hands were tied downstream: we could not ship a new version of Ceph, but needed to support the release with this performance limitation for multiple years. So unless you are in a similar situation, upgrading Ceph, and ensuring you use the new Ceph client libraries with a new enough QEMU to leverage the performance enhancements, is the best option.

With those disclaimers, you may want to consider evaluating those workaround options, but keep in mind the limitations, and the fact that you cannot live migrate until that bug is fixed, before using them in production.
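For concreteness, the deprecated workaround options being referred to look roughly like this in nova.conf on the compute node (option names as I recall them from stable/ussuri; verify against the linked workarounds module, and note all the caveats above before considering this):

```ini
[workarounds]
# Attach RBD volumes on the host via krbd instead of through
# QEMU's built-in librbd driver (deprecated, removed in Xena)
rbd_volume_local_attach = True
# Decrypt LUKSv1 volumes on the host with dm-crypt instead of
# QEMU's native decryption (deprecated, removed in Xena)
disable_native_luksv1 = True
```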
> - Do both RBD images (Cinder and yours) have the same Ceph flags?
> - Did you try connecting to the Compute Host the same RBD image created by Cinder instead of creating a new one?
>
> Cheers, Gorka.
> [diagnostics snipped]
Thanks for your email. We are not using volume encryption.

If this were a librados bug, I would expect it to also affect throughput when the VM uses an SSD volume, but I do not see any effect there. And the performance of a Ceph HDD image mounted directly on the compute host is still good.

We have already disabled debug logging in ceph.conf.

On Thu, 7 Jul 2022 at 19:26 Sean Mooney <smooney@redhat.com> wrote:
> [full quoted thread snipped]
On 07/07, Vinh Nguyen Duc wrote:
> Thanks for your email. We are not using volume encryption.
> If this were a librados bug, I would expect it to also affect throughput when the VM uses an SSD volume, but I do not see any effect there. And the performance of a Ceph HDD image mounted directly on the compute host is still good.
Hi,

Did the Cinder volume that was performing poorly in the VM perform well when manually connected directly to the Compute Host?

Cheers,
Gorka.
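To run that test, the Cinder-created image itself can be mapped on the compute host (pool name and volume UUID are placeholders):

```shell
# Cinder RBD volumes are named volume-<UUID> in the backing pool
rbd ls volumes-hdd
DEV=$(rbd map volumes-hdd/volume-<uuid>)
ioping -c 20 "$DEV"
rbd unmap "$DEV"
```

The volume should be detached from the VM first, to avoid concurrent writers on the same image.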
> We have already disabled debug logging in ceph.conf.
> On Thu, 7 Jul 2022 at 19:26 Sean Mooney <smooney@redhat.com> wrote:
> [full quoted thread snipped]
participants (3)
- Gorka Eguileor
- Sean Mooney
- Vinh Nguyen Duc