Hi Sylvain, […]
Technically, benchmarking is a broad term: depending on your use case, performance can differ greatly for the same card (take the general example of CPU-bound vs. IO-bound tasks and you get the idea for GPUs). For this reason, I'd recommend that you first decide which metrics you want to stress, and only then identify the tools that can meet your needs.
That is a very good point. The scope is very much around basic validation of new GPU HW in an OpenStack environment with both virtual and passthrough setups.
For a standard test, which can be error-prone but still somewhat informative, I'd suggest running a couple of TensorFlow examples against different environments (one with a bare-metal GPU, one with passthrough, one with virtual GPUs on Nova directly, one with Cyborg). This would give you an idea of the performance penalties, but I suspect those to be less than minor.
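As a rough illustration of that kind of comparison, here is a minimal sketch of a timing harness that could be run unchanged in each environment. It is only an assumption about how such a test might look, not a recommended benchmark: the function names (`benchmark`, `relative_overhead`, `matmul_workload`) and the matrix size are hypothetical, and a single matrix multiplication stands in for a real TensorFlow example.

```python
import statistics
import time


def benchmark(workload, warmup=2, repeats=5):
    """Time a zero-argument callable; return a list of wall-clock seconds per run."""
    for _ in range(warmup):
        workload()  # warm-up runs absorb one-time costs such as initialization
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        workload()
        timings.append(time.perf_counter() - start)
    return timings


def relative_overhead(baseline_seconds, candidate_seconds):
    """Fractional slowdown of the candidate versus the baseline, using medians."""
    base = statistics.median(baseline_seconds)
    cand = statistics.median(candidate_seconds)
    return (cand - base) / base


def matmul_workload(size=4096):
    """Hypothetical stand-in workload: one large matrix multiplication."""
    import tensorflow as tf  # imported lazily so the harness has no hard dependency

    a = tf.random.uniform((size, size))
    b = tf.random.uniform((size, size))
    # .numpy() copies the result to the host, so the timing covers the computation
    return tf.linalg.matmul(a, b).numpy()
```

One would run `benchmark(matmul_workload)` in each environment (bare metal, passthrough, vGPU), keep the timings, and then compare them with something like `relative_overhead(baremetal_timings, passthrough_timings)`. Medians are used here so that a single slow run does not dominate the comparison.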
That sounds like a good starting point for a basic setup, I will pass the info along. Thank you!
For real benchmarking cases, I can't answer, hence my call to other folks. By the way, I know CERN invested a bit in HPC testing with GPUs; maybe someone from their team or from the related Scientific WG could provide more insight?
I did find a few CERN slides and videos, but they were a few years old, so I thought I'd drop the question here as well to see if there are any new tools or methods to consider. Thanks, Ildikó
-Sylvain
On Aug 9, 2021, at 08:46, Ildiko Vancsa <ildiko.vancsa@gmail.com> wrote:
Hi,
I got a question about tools and practices to check GPU performance in an OpenStack environment that I need some help to answer.
The question is whether there are GPU performance testing/benchmarking tools that people in the community are using and would recommend. The scope of the testing work is to check GPU performance in OpenStack VMs (both virtualized and passthrough).
All the help and pointers are very much appreciated!
Thanks, Ildikó