[infra][cyborg] cyborg-tempest-plugin test failed due to the zuul server has no accelerators

Jeremy Stanley fungi at yuggoth.org
Tue Jun 22 03:44:28 UTC 2021


On 2021-06-22 00:23:32 +0000 (+0000), Brin Zhang(张百林) wrote:
> There is a patch you can check
> https://review.opendev.org/c/openstack/cyborg/+/790937 ,  tempest
> failed
> https://050bde8a54f119be7071-8157e9570cd7007a824b373cbf52d06c.ssl.cf2.rackcdn.com/790937/6/check/cyborg-tempest/82fd3ce/testr_results.html

Thanks, that helps. The build history indicates that the job was
succeeding for openstack/cyborg up through 2021-06-09 08:22 UTC, but
was failing consistently as of 2021-06-10 09:26 UTC, so something
probably changed in that roughly 24-hour period to affect the job:

https://zuul.opendev.org/t/openstack/builds?job_name=cyborg-tempest&project=openstack/cyborg

Broadening that query to other projects, I can see it succeeded as
recently as 2021-06-10 01:34 UTC for an openstack/nova change in check.
What's more interesting is that it's continuing to succeed
consistently for stable branches, even stable/wallaby, just not
master.
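
For anyone who wants to repeat that comparison without clicking
through the dashboard, here is a rough sketch of how the last-success
and first-failure window could be pulled out of a list of build
records. The API URL and the field names ("branch", "result",
"end_time", "project") are my assumptions about what the Zuul builds
endpoint returns, and the sample records are illustrative, loosely
based on the timestamps mentioned above, not real query output.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime

# Assumed endpoint for the Zuul builds API on OpenDev (not verified here).
ZUUL_BUILDS_URL = "https://zuul.opendev.org/api/tenant/openstack/builds"


def fetch_builds(job_name, limit=200):
    """Fetch recent build records for a job (requires network access)."""
    query = urllib.parse.urlencode({"job_name": job_name, "limit": limit})
    with urllib.request.urlopen(f"{ZUUL_BUILDS_URL}?{query}") as response:
        return json.load(response)


def _end(build):
    # Assumed timestamp format: "2021-06-09T08:22:00"
    return datetime.strptime(build["end_time"], "%Y-%m-%dT%H:%M:%S")


def find_regression_window(builds, branch="master"):
    """Return (last_success_end, first_failure_end) for a branch.

    Returns None if the most recent decisive build on that branch was
    a success, or if the branch never succeeded at all.
    """
    decisive = sorted(
        (
            b for b in builds
            if b.get("branch") == branch
            and b.get("end_time")
            and b.get("result") in ("SUCCESS", "FAILURE")
        ),
        key=_end,
    )
    last_success = None
    first_failure = None
    for build in decisive:
        if build["result"] == "SUCCESS":
            # Any later success resets the search for a failure.
            last_success = build
            first_failure = None
        elif first_failure is None:
            first_failure = build
    if last_success is None or first_failure is None:
        return None
    return (last_success["end_time"], first_failure["end_time"])


# Illustrative records only (hypothetical data, not real API output).
SAMPLE_BUILDS = [
    {"project": "openstack/cyborg", "branch": "master",
     "result": "SUCCESS", "end_time": "2021-06-09T08:22:00"},
    {"project": "openstack/nova", "branch": "master",
     "result": "SUCCESS", "end_time": "2021-06-10T01:34:00"},
    {"project": "openstack/cyborg", "branch": "master",
     "result": "FAILURE", "end_time": "2021-06-10T09:26:00"},
    {"project": "openstack/cyborg", "branch": "master",
     "result": "FAILURE", "end_time": "2021-06-21T01:20:00"},
    {"project": "openstack/cyborg", "branch": "stable/wallaby",
     "result": "SUCCESS", "end_time": "2021-06-15T12:00:00"},
]

if __name__ == "__main__":
    print("master:", find_regression_window(SAMPLE_BUILDS))
    print("stable/wallaby:",
          find_regression_window(SAMPLE_BUILDS, branch="stable/wallaby"))
```

Run against real data, passing the result of
fetch_builds("cyborg-tempest") instead of SAMPLE_BUILDS should give
the same kind of window, assuming the field names above are right.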

Both the succeeding and failing builds for master ran on regular
ubuntu-focal nodes in a number of different cloud providers which
don't have any specialized accelerator hardware, so I have to assume
what's changed has nothing to do with the underlying test
environment.

> Recently there is always report "no valid host" when create an
> accelerator server, as below, that out of our control :(,
> """
> tempest.exceptions.BuildErrorException: Server feef6015-5211-481b-813f-c5924cdf6931 failed to build and is in ERROR status
> Details: {'code': 500, 'created': '2021-06-21T01:13:52Z', 'message': 'No valid host was found. '}
> """
[...]

This is when scheduling an accelerator within DevStack, right? Were
you maybe using some sort of mock/fake accelerator for testing
purposes? Because there wouldn't have been actual accelerators
exposed to that environment even back when the job was still
succeeding.

Regardless, I suspect something merged early UTC on 2021-06-10 to
the master branch of one of the services or tools with which Cyborg
interacts, causing this error to begin appearing. The fact that
the same job is running fine for stable/wallaby also indicates it's
probably some behavior which hasn't been backported yet. Hopefully
that helps narrow it down.
-- 
Jeremy Stanley