Re: [openstack-hpc] CPU intensive apps on OpenStack
Hi Tim,

What's your reference on disabling EPT? I can find some fairly old stuff on this, but I thought newer processors had improved performance here to lower page fault overheads... so I guess I'm wondering if you've seen a recent evaluation somewhere?

Re. NUMA etc, it seems like the most important thing if you're running large memory apps inside your guests is to make sure the guest is pinned (both CPU and NUMA wise) and sees a NUMA topology that matches how it is pinned on the host. I haven't had a chance to try it yet but I thought this was all now possible in Juno. Beyond that you probably want the guest kernel to be NUMA sassy as well, so it needs (IIRC) Linux 3.13+ to get the numa_balancing (nee autonuma) goodies. I'm not sure if for such cases you might actually want to disable numa balancing on the host or whether the pinning would effectively remove any overhead there anyway...

PS: +1 on the meetup suggestion

--
Cheers,
~Blairo
-----Original Message----- From: Blair Bethwaite [mailto:blair.bethwaite@gmail.com] Sent: 17 April 2015 07:36 To: openstack-hpc@lists.openstack.org; Tim Bell Subject: Re: [openstack-hpc] CPU intensive apps on OpenStack
Hi Tim,
What's your reference on disabling EPT? I can find some fairly old stuff on this, but I thought newer processors had improved performance here to lower page fault overheads... so I guess I'm wondering if you've seen a recent evaluation somewhere?
In High Energy Physics, we have a subset of the SPEC2006 benchmark which has been found to scale similarly to our application code (and is much simpler to run). The benchmark is run 'n' times in parallel according to the number of cores in the system (as our applications are high throughput rather than individual high performance applications). With this application set running on a 2-CPU box with Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz and hyperthreading on, we get the following:

- bare metal (CentOS 7, kernel 3.10): 366
- single VM, EPT on (CentOS 6 guest on KVM on CentOS 7): 297
- single VM, EPT off (CentOS 6 guest on KVM on CentOS 7): 316

I also had suspected that this was a solved problem. We're working through the options (NUMA, pinning, huge pages etc.). Some of these we can already do with OpenStack directly through flavour/image flags (or are coming in Juno).
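(For anyone wanting to reproduce the EPT on/off comparison above: EPT is a parameter of the kvm_intel module on the host, so toggling it looks roughly like the following -- a minimal sketch, noting that the module can only be reloaded once all guests on the host are stopped:)

  # Check whether EPT is currently enabled for KVM on this host
  cat /sys/module/kvm_intel/parameters/ept

  # Persist the setting, then reload the module with EPT disabled
  # (all running guests must be shut down before the module reload)
  echo "options kvm_intel ept=0" > /etc/modprobe.d/kvm-intel-ept.conf
  modprobe -r kvm_intel
  modprobe kvm_intel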
Re. NUMA etc, it seems like the most important thing if you're running large memory apps inside your guests is to make sure the guest is pinned (both CPU and NUMA wise) and sees a NUMA topology that matches how it is pinned on the host. I haven't had a chance to try it yet but I thought this was all now possible in Juno. Beyond that you probably want the guest kernel to be NUMA sassy as well, so it needs (IIRC) Linux 3.13+ to get the numa_balancing (nee autonuma) goodies. I'm not sure if for such cases you might actually want to disable numa balancing on the host or whether the pinning would effectively remove any overhead there anyway...
This is the kind of experience I would like to share at the summit. In particular, how some standard recommendations from KVM tuning can be translated into the OpenStack equivalents, and whether the property settings in flavors/images coming out of the NFV work are going to allow us all to explore the phase space without adjusting the XML.

My expectation is that some things will be default recommendations for all HPC/HTC use cases. Others would be more a matter of iterating over the potential setting options while benchmarking your sample workload.

I think we're getting to enough people for the meetup. It will be a smallish set but with a lot to discuss...

Tim
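(As a concrete illustration of the flavour/image knobs being discussed, here is a minimal sketch using Nova flavor extra specs -- the flavour name is hypothetical, hw:numa_nodes came with the Juno NUMA work, and hw:cpu_policy / hw:mem_page_size arrived with the later Kilo pinning and huge-page work, so availability depends on the release you run:)

  # Hypothetical flavour for a 16-core HTC guest: name, id, RAM (MB), disk (GB), vCPUs
  nova flavor-create hpc.16 auto 65536 40 16

  # Expose a guest NUMA topology matching a 2-socket host (Juno+)
  nova flavor-key hpc.16 set hw:numa_nodes=2

  # Pin guest vCPUs to dedicated host cores and back memory with huge pages (Kilo-era specs)
  nova flavor-key hpc.16 set hw:cpu_policy=dedicated
  nova flavor-key hpc.16 set hw:mem_page_size=large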
PS: +1 on the meetup suggestion
-- Cheers, ~Blairo
May I humbly ask that good notes be taken for those of us strongly interested but unable to attend?

Cheers,
Adam
-----Original Message----- From: Adam Huffman [mailto:adam.huffman@gmail.com] Sent: 19 April 2015 08:44 To: Tim Bell Cc: Blair Bethwaite; openstack-hpc@lists.openstack.org Subject: Re: [openstack-hpc] CPU intensive apps on OpenStack
May I humbly ask that good notes be taken for those of us strongly interested but unable to attend?
Along with all the other ops summit sessions, notes will be added to the design summit etherpad (one per session). I'll set one up in the next few days to start collecting the agenda / topics. Tim
Cheers, Adam
----- Original Message -----
From: "Tim Bell" <Tim.Bell@cern.ch> To: "Blair Bethwaite" <blair.bethwaite@gmail.com>, openstack-hpc@lists.openstack.org
Re. NUMA etc, it seems like the most important thing if you're running large memory apps inside your guests is to make sure the guest is pinned (both CPU and NUMA wise) and sees a NUMA topology that matches how it is pinned on the host. I haven't had a chance to try it yet but I thought this was all now possible in Juno. Beyond that you probably want the guest kernel to be NUMA sassy as well, so it needs (IIRC) Linux 3.13+ to get the numa_balancing (nee autonuma) goodies. I'm not sure if for such cases you might actually want to disable numa balancing on the host or whether the pinning would effectively remove any overhead there anyway...
With regards to the work done in *Kilo* (some of this work was done in Juno, but the real end-to-end story is in Kilo): if you choose to enable and use the pinning functionality, then while you *can* run NUMA balancing (either numad or autonuma) on those compute nodes, it should leave the pinned guests alone.

Thanks,
Steve
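(A minimal sketch of how one might check this on a compute node -- the sysctl and virsh sub-commands are standard, but the instance domain name below is just a placeholder:)

  # Check (and, if desired, disable) automatic NUMA balancing on the host (3.13+ kernels)
  sysctl kernel.numa_balancing
  sysctl -w kernel.numa_balancing=0

  # Inspect the vCPU pinning and NUMA memory policy libvirt applied to a pinned guest
  virsh vcpupin instance-0000002a
  virsh numatune instance-0000002a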
participants (4): Adam Huffman, Blair Bethwaite, Steve Gordon, Tim Bell