On Wed, Apr 13, 2022 at 8:21 PM Pierre Riteau <pierre@stackhpc.com> wrote:
Hello,

In the blazar project, we have been seeing a job timeout failure in openstack-tox-py39 affecting master and stable/yoga. tox starts the "lockutils-wrapper python setup.py testr --slowest --testr-args=" process, which shows no progress until the job times out.

I think you should ideally migrate to stestr, which does not have this issue. I've revived https://review.opendev.org/c/openstack/blazar/+/581547.


It started happening sometime between 2022-03-10 16:01:39 (last success on master) and 2022-03-24 16:35:15 (first timeout occurrence) [1], with no change in blazar itself and few changes in requirements.

I resumed debugging today and managed to reproduce it on Ubuntu 20.04 (it doesn't happen on macOS). The traceback after interrupting it is at [2] if anyone wants to take a look. The Python process uses 100% of the CPU until interrupted.

I tracked down the regression to the upper constraint on setuptools. For example, stable/yoga has:

setuptools===59.6.0;python_version=='3.6'
setuptools===60.9.3;python_version=='3.8'

Neither marker matches Python 3.9, so it appears the constraint is ignored in the py39 job and the job runs with the latest setuptools. Indeed, there were some setuptools releases between March 10 and March 24. I still have to figure out what changed in setuptools to cause this behaviour.
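
As a rough illustration of why neither constraint line applies on Python 3.9, here is a minimal Python sketch. It handles only the exact marker form quoted above (python_version=='X.Y'); pip's real marker evaluation follows PEP 508 and is far more general. The function name applicable_pins and the regular expression are hypothetical, introduced only for this example.

```python
import re

# The two stable/yoga constraint lines quoted above.
CONSTRAINTS = [
    "setuptools===59.6.0;python_version=='3.6'",
    "setuptools===60.9.3;python_version=='3.8'",
]

# Simplified pattern (an assumption for this sketch): it only understands
# lines of the form "name===version;python_version=='X.Y'".
LINE_RE = re.compile(
    r"^(?P<name>[A-Za-z0-9._-]+)===(?P<version>[^;]+);"
    r"python_version=='(?P<py>[0-9.]+)'$"
)


def applicable_pins(lines, python_version):
    """Return {name: version} for lines whose marker equals python_version."""
    pins = {}
    for line in lines:
        match = LINE_RE.match(line.replace(" ", ""))
        if match and match.group("py") == python_version:
            pins[match.group("name")] = match.group("version")
    return pins


if __name__ == "__main__":
    for py in ("3.6", "3.8", "3.9"):
        print(py, applicable_pins(CONSTRAINTS, py))
```

For "3.9" the result is an empty dict: no line pins setuptools for that interpreter, so the installer is free to pick the newest release.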

Question for requirements maintainers: is this expected behaviour, or should upper constraints also include lines for python_version=='3.9' on yoga?

Thanks,
Pierre Riteau (priteau)

[1] https://zuul.openstack.org/builds?job_name=openstack-tox-py39&project=openstack%2Fblazar&skip=0
[2] https://paste.opendev.org/show/bZO7ELmTvfMUGJPdlQ4k/


--
Regards,
Rabi Mishra