Hi James,

Thank you for the suggestion. During the Wallaby PTG we considered using different images (not just the cirros ones); see the "Use different guest image for gate jobs to run tempest tests" topic in [1]. It wasn't pursued in the end (we had more pressing topics to deal with), and the action item was closed in the Xena cycle [2].

I think we could start by creating a new option which would allow us to skip the failing tests on a different architecture. If we had at least an experimental job in the gates, which would run a different architecture, we could add a new test exercising that as you suggested. Then let's see where that gets us.
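
A rough sketch of what such a skip option could look like, just to illustrate the idea. Everything here is hypothetical: SKIP_ON_MIXED_ARCH stands in for a new config option that does not exist in Tempest today, and list_compute_nodes is an assumed helper returning the compute nodes together with their architecture.

```python
import functools
import unittest

# Hypothetical switch; in Tempest this would presumably be a new config option.
SKIP_ON_MIXED_ARCH = True


def same_architecture(compute_nodes):
    """Return True if all given compute nodes report one CPU architecture."""
    return len({node["arch"] for node in compute_nodes}) <= 1


def skip_on_mixed_arch(list_compute_nodes):
    """Skip the decorated test when the compute nodes differ in architecture.

    list_compute_nodes is a hypothetical callable returning dicts with an
    "arch" key, e.g. [{"host": "c1", "arch": "ppc64le"}].
    """
    def decorator(test):
        @functools.wraps(test)
        def wrapper(*args, **kwargs):
            if SKIP_ON_MIXED_ARCH and not same_architecture(list_compute_nodes()):
                raise unittest.SkipTest(
                    "compute nodes have different architectures")
            return test(*args, **kwargs)
        return wrapper
    return decorator
```

A migration or affinity test decorated this way would still run on a single-architecture deployment and would be reported as skipped, rather than failed, on a mixed one.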

[1] https://etherpad.opendev.org/p/qa-wallaby-ptg
[2] https://etherpad.opendev.org/p/qa-xena-priority

Regards,

On Mon, 13 Dec 2021 at 22:49, James LaBarre <jlabarre@redhat.com> wrote:

Recently I had been running Tempest on my setup testing a mixed-architecture deployment (x86_64 and ppc64le compute nodes at the same time). It seems that some of the migration and affinity tests will check if there's more than one Compute node before they run. However, it would seem that's as far as they check, without checking if the nodes are in fact compatible or even of the same architecture. (My test cluster is very small, and normally includes two ppc64le Compute nodes, and sometimes one x86_64 Compute node. Currently one ppc64le machine is down for repair.)

Because the two compute nodes are different architectures, I am getting failures in various migration and affinity tests, maybe more if I tested a larger subset. Now, granted, my particular setup is a special case, but it does bring to mind some extensions that may be needed for Tempest in the future. I could see it being possible to have x86_64 and ARM mixed together in one stack, maybe even tossing in RISC-V someday.

I'm thinking we need to start adding extra test images, flavors, etc. into the Tempest configurations (as in defining multiple options so that each architecture can have test images, etc. assigned to it, rather than the current primary and alt image for just one architecture). Additionally, there should be test cases taking into account the architectures involved (such as seeing that an instance on one arch cannot be migrated to the other, as an example). I know this involves a bit of refactoring; I didn't know if it had even been considered yet.


--

James LaBarre

Software Engineer, OpenStack MultiArch

Red Hat

jlabarre@redhat.com   



--
Martin Kopec
Senior Software Quality Engineer
Red Hat EMEA