[nova][cyborg][gpu] GPU performance testing
Hi, I got a question about tools and practices for checking GPU performance in an OpenStack environment, and I need some help answering it. Are there GPU performance testing/benchmarking tools that people in the community are using and would recommend? The scope of the testing work is to check GPU performance in OpenStack VMs (both virtualized and passthrough). All help and pointers are very much appreciated! Thanks, Ildikó
Hi, as we are approaching the end of the holiday season, I wanted to bring my question about GPU performance testing back up. Does anyone have any hints on the best tools for benchmarking? Thanks, Ildikó
On Aug 9, 2021, at 08:46, Ildiko Vancsa <ildiko.vancsa@gmail.com> wrote:
Hi,
I got a question about tools and practices to check GPU performance in an OpenStack environment that I need some help to answer.
The question is about recommended GPU performance testing/benchmarking tools if there are a few that people in the community are using and would recommend? The scope of the testing work is to check GPU performance in OpenStack VMs (both virtualized and passthrough).
All the help and pointers are very much appreciated!
Thanks, Ildikó
On Tue, Aug 31, 2021 at 4:34 PM Ildiko Vancsa <ildiko.vancsa@gmail.com> wrote:
Hi,
As we are approaching the end of the holiday season I wanted to surface back my question about GPU performance testing. Does anyone have any hints to find the best tools to do some benchmarking with?
You made the point, Ildiko: I was on extended time off, so I didn't have time to look at your question yet. Good question, though. I have no knowledge about this myself, but I can ping a few other folks to get you an answer.
-Sylvain
Thanks,
Ildikó
On Tue, Aug 31, 2021 at 5:10 PM Sylvain Bauza <sbauza@redhat.com> wrote:
Btw, I understand this can be a frustratingly short answer, so I'll elaborate.
Technically, benchmarking is a broad term: depending on your use case, performance can differ greatly for the same card (take the general example of CPU-bound vs. IO-bound tasks and you get the idea for GPUs). For this reason, I'd recommend that you first decide which metrics you want to focus on, and only then identify the tools that can meet your needs.
For a standard test, which can be error-prone but still somewhat informative, I'd suggest running a couple of TensorFlow examples against different environments (one with a bare-metal GPU, one with passthrough, one with virtual GPUs on Nova directly, one with Cyborg). This would give you an idea of the performance penalties, but I suspect those to be less than minor.
For real benchmarking cases, I can't answer, hence my call to other folks. By the way, I know CERN invested a bit in HPC testing with GPUs; maybe someone from their team or someone from the related Scientific WG could provide more insights?
-Sylvain
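A comparison like the one Sylvain describes (the same workload run on bare metal, passthrough, vGPU on Nova, and Cyborg) could be scripted roughly as follows. This is only a minimal sketch: the function names `time_workload`, `overhead_vs_baseline` and `placeholder_workload` are hypothetical, and the placeholder is a CPU loop standing in for whatever fixed TensorFlow example you choose to run.

```python
import statistics
import time
from typing import Callable, Dict


def time_workload(workload: Callable[[], None], warmup: int = 1, repeats: int = 5) -> float:
    """Return the median wall-clock time, in seconds, of calling `workload`."""
    for _ in range(warmup):
        # Warm-up runs absorb one-time costs (driver init, compilation, caches).
        workload()
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Assumption: `workload` returns only after its GPU work has finished.
        # Frameworks may queue GPU work asynchronously, so add a sync if needed.
        workload()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def overhead_vs_baseline(times: Dict[str, float], baseline: str) -> Dict[str, float]:
    """Percent slowdown of each environment relative to `baseline`."""
    base = times[baseline]
    return {env: (t - base) / base * 100.0 for env, t in times.items()}


def placeholder_workload() -> None:
    # Stand-in only: replace with one fixed training or inference run.
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    # Run this in each environment, collect the medians by hand or via a
    # shared file, then feed them to overhead_vs_baseline() with the
    # bare-metal result as the baseline.
    print(f"median: {time_workload(placeholder_workload):.4f} s")
```

Using the median rather than the mean keeps a single noisy run from skewing the comparison; the model, data and batch size should be identical across all environments.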
Hi Sylvain, […]
Technically, benchmarking is a broad term: depending on your use case, performance can differ greatly for the same card (take the general example of CPU-bound vs. IO-bound tasks and you get the idea for GPUs). For this reason, I'd recommend that you first decide which metrics you want to focus on, and only then identify the tools that can meet your needs.
That is a very good point. The scope is very much around basic validation of new GPU HW in an OpenStack environment with both virtual and passthrough setups.
For a standard test, which can be error-prone but still somewhat informative, I'd suggest running a couple of TensorFlow examples against different environments (one with a bare-metal GPU, one with passthrough, one with virtual GPUs on Nova directly, one with Cyborg). This would give you an idea of the performance penalties, but I suspect those to be less than minor.
That sounds like a good starting point for a basic setup, I will pass the info along. Thank you!
For real benchmarking cases, I can't answer, hence my call to other folks. By the way, I know CERN invested a bit in HPC testing with GPUs; maybe someone from their team or someone from the related Scientific WG could provide more insights?
I did find a few CERN slides and videos, but they were a few years old so I thought to drop in the question here as well to see if there are any new tools or methods to consider. Thanks, Ildikó
-Sylvain
one with baremetal GPU, one with passthrough, one with virtual GPUs on Nova directly, one with Cyborg
Agreed with Sylvain Bauza: if you want to test GPU performance you can use benchmarks, but the results may depend on the workload being run. Whether it goes through Nova or Cyborg, the GPU is bound either way, so there should be little difference in performance.
brinzhang
Inspur Electronic Information Industry Co.,Ltd.
From: Sylvain Bauza [mailto:sbauza@redhat.com]
Sent: August 31, 2021 23:21
To: Ildiko Vancsa <ildiko.vancsa@gmail.com>
Cc: OpenStack Discuss <openstack-discuss@lists.openstack.org>
Subject: Re: [nova][cyborg][gpu] GPU performance testing
Hi, I recommend a tool for testing GPU performance; it requires a CUDA installation.
link: https://github.com/GPUburn/gpuburn/tree/master/GPUBurn
Thanks, Ke Chen
On Wed, 2021-09-01 at 20:37 +0800, 陈克 wrote:
Hi, I recommend a tool to test GPU performance, which depends on CUDA installation.
link: https://github.com/GPUburn/gpuburn/tree/master/GPUBurn
If YouTube hardware reviews have shown us anything, it is that the best way to benchmark GPU hardware is actually to use representative applications rather than synthetic benchmarks, as GPU vendors have a habit of tuning the GPU driver to perform better in popular synthetic benchmarks.
I would suggest selecting a representative set that covers a number of use cases, from CAD and gaming to video transcoding, ML (inference and training) and computer vision.
Blender is a popular open source 3D modelling application, and it provides some built-in benchmarks that can be used for evaluating the performance of 3D CAD and rendering.
OpenCV is a popular toolkit for computer vision, and they have a number of benchmark examples such as https://docs.opencv.org/4.5.2/dc/d69/tutorial_dnn_superres_benchmark.html
On the more pure ML side there are some synthetic benchmarks provided by TensorFlow, https://github.com/tensorflow/benchmarks/tree/master/perfzero, but often a better approach is to use a common open source data set and model and measure the time it takes to train the same model with the same data set in different environments.
You also want to measure the inference rate, which involves taking a pre-trained model, feeding it data from outside of its training set, and measuring the performance. It's important that the model and data used for inference are the same across all your deployments. PyTorch is another popular AI/ML framework that can be used in a similar way.
Looking quickly at https://github.com/GPUburn/gpuburn/tree/master/GPUBurn, it looks like it's actually not a benchmark utility but instead a stress test tool for measuring stability, which is a very different thing, and I don't think it would be helpful in your case. In fact, using it in a public cloud, for example, could be considered a breach of fair use, since it's really designed to put the GPU under as much stress as possible to detect faulty hardware, which will impact other users of the GPU.
Most of the gaming/Windows-based tools are proprietary, which makes them hard to use as benchmarks in an open source project due to licensing, so they likely should be avoided.
But yeah, my advice would be: don't look for benchmarks in general, but look for frameworks/tools/applications that people use to do actual useful work and that can be used to execute a repeatable action, like using HandBrake to transcode a video file from one format to another to measure the hardware encoder performance with fixed inputs. Then use the output, in this case transcoding time, as one aspect of the performance measurement to establish your benchmark.
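The "repeatable action with fixed inputs" idea can be sketched as a small wrapper that times any external command several times. This is a hedged illustration, not a recommended tool: `time_command` and `summarize` are hypothetical names, and the command shown is a do-nothing placeholder that you would replace with your own transcoder (or other application) command line and a fixed input file.

```python
import statistics
import subprocess
import sys
import time
from typing import Dict, List, Sequence


def time_command(cmd: Sequence[str], repeats: int = 3) -> List[float]:
    """Run `cmd` `repeats` times and return each wall-clock duration in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises CalledProcessError on failure, so a run that
        # crashed early is never mistaken for a fast one.
        subprocess.run(cmd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations: Sequence[float]) -> Dict[str, float]:
    """Reduce the raw durations to a few numbers that are easy to compare."""
    return {
        "runs": len(durations),
        "median_s": statistics.median(durations),
        "min_s": min(durations),
        "max_s": max(durations),
    }


if __name__ == "__main__":
    # Placeholder command (assumption): the Python interpreter doing nothing.
    # Substitute the real command line of the application under test.
    cmd = [sys.executable, "-c", "pass"]
    print(summarize(time_command(cmd)))
```

Running the same command, with the same input file and settings, in each environment gives one comparable number per environment; the spread between `min_s` and `max_s` hints at how noisy the measurement is.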
On 1 Sep 2021, at 14:37, 陈克 <joykechen@163.com> wrote:
Hi, I recommend a tool to test GPU performance, which depends on CUDA installation.
link: https://github.com/GPUburn/gpuburn/tree/master/GPUBurn
gpuburn is also the tool that CERN uses when new hardware is delivered and we want to check that the hardware works under load. It’s also a good way to stress your PDUs and cooling if you have several cards (see https://indico.cern.ch/event/950196/contributions/3993259/attachments/211375...)
We also have some specific High Energy Physics benchmarks - https://gitlab.cern.ch/hep-benchmarks/hep-workloads-gpu
Cheers
Tim
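While a load test like this is running, it can help to sample temperature, utilization and power draw per card. The sketch below is an assumption-laden illustration rather than anything described in the thread: it assumes NVIDIA GPUs with `nvidia-smi` on the PATH, and the helper names `parse_nvidia_smi_csv` and `sample_gpus` are hypothetical.

```python
import csv
import io
import subprocess
from typing import Dict, List

# Fields requested from nvidia-smi; values are kept as strings because some
# setups report placeholders such as "[N/A]" instead of a number.
QUERY_FIELDS = ["index", "temperature.gpu", "utilization.gpu", "power.draw"]


def parse_nvidia_smi_csv(text: str) -> List[Dict[str, str]]:
    """Parse `--format=csv,noheader,nounits` output into one dict per GPU."""
    rows = csv.reader(io.StringIO(text), skipinitialspace=True)
    return [dict(zip(QUERY_FIELDS, (v.strip() for v in row))) for row in rows if row]


def sample_gpus() -> List[Dict[str, str]]:
    """Take one sample; requires the NVIDIA driver and nvidia-smi on PATH."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=" + ",".join(QUERY_FIELDS),
         "--format=csv,noheader,nounits"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_nvidia_smi_csv(out)
```

Calling `sample_gpus()` in a loop with a sleep between samples, and logging the rows with a timestamp, gives a simple record of how each card behaved during the run.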
Hi, Thank you all for the recommendations and guidance. This is exactly the information I was looking for! :) Best Regards, Ildikó
On Sep 1, 2021, at 07:46, Tim Bell <Tim.Bell@cern.ch> wrote:
participants (6)
- Brin Zhang(张百林)
- Ildiko Vancsa
- Sean Mooney
- Sylvain Bauza
- Tim Bell
- 陈克