[blazar][requirements] setuptools and python_version in upper constraints
Hello,

In the blazar project, we have been seeing a job timeout failure in openstack-tox-py39 affecting master and stable/yoga. tox starts the "lockutils-wrapper python setup.py testr --slowest --testr-args=" process which doesn't show progress until job timeout.

It started happening sometime between 2022-03-10 16:01:39 (last success on master) and 2022-03-24 16:35:15 (first timeout occurrence) [1], with no change in blazar itself and few changes in requirements.

I resumed debugging today and managed to reproduce it using Ubuntu 20.04 (it doesn't happen on macOS). Here is the traceback after interrupting it if anyone wants to take a look [2]. The python process is using 100% of the CPU until interrupted.

I tracked down the regression to the upper constraint on setuptools. For example, stable/yoga has:

setuptools===59.6.0;python_version=='3.6'
setuptools===60.9.3;python_version=='3.8'

It appears this is ignored in the py39 job so the job runs with the latest setuptools. Indeed, there were some releases between March 10 and March 24. I still have to figure out what changed in setuptools to cause this behaviour.

Question for requirements maintainers: is this expected behaviour, or should upper constraints also include lines for python_version=='3.9' on yoga?

Thanks,
Pierre Riteau (priteau)

[1] https://zuul.openstack.org/builds?job_name=openstack-tox-py39&project=openstack%2Fblazar&skip=0
[2] https://paste.opendev.org/show/bZO7ELmTvfMUGJPdlQ4k/
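A quick way to see this marker behaviour is to evaluate the yoga constraint markers against a 3.9 environment (a minimal sketch using the packaging library; the environment dict is filled in by hand rather than taken from the job):

    # Evaluate the stable/yoga setuptools constraint markers as pip would on
    # Python 3.9. Requires the 'packaging' library.
    from packaging.markers import Marker

    pins = {
        "setuptools===59.6.0": "python_version=='3.6'",
        "setuptools===60.9.3": "python_version=='3.8'",
    }

    for pin, marker in pins.items():
        # Override python_version to simulate the py39 job's interpreter.
        applies = Marker(marker).evaluate({"python_version": "3.9"})
        print(pin, "applies on 3.9:", applies)

    # Both markers evaluate to False, so neither pin is enforced on 3.9 and
    # pip is free to install the latest setuptools release.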
On 22-04-13 16:48:12, Pierre Riteau wrote:
Hello,
In the blazar project, we have been seeing a job timeout failure in openstack-tox-py39 affecting master and stable/yoga. tox starts the "lockutils-wrapper python setup.py testr --slowest --testr-args=" process which doesn't show progress until job timeout.
It started happening sometime between 2022-03-10 16:01:39 (last success on master) and 2022-03-24 16:35:15 (first timeout occurrence) [1], with no change in blazar itself and few changes in requirements.
I resumed debugging today and managed to reproduce it using Ubuntu 20.04 (it doesn't happen on macOS). Here is the traceback after interrupting it if anyone wants to take a look [2]. The python process is using 100% of the CPU until interrupted.
I tracked down the regression to the upper constraint on setuptools. For example, stable/yoga has:
setuptools===59.6.0;python_version=='3.6'
setuptools===60.9.3;python_version=='3.8'
It appears this is ignored in the py39 job so the job runs with the latest setuptools. Indeed, there were some releases between March 10 and March 24. I still have to figure out what changed in setuptools to cause this behaviour.
Question for requirements maintainers: is this expected behaviour, or should upper constraints also include lines for python_version=='3.9' on yoga?
Thanks, Pierre Riteau (priteau)
[1] https://zuul.openstack.org/builds?job_name=openstack-tox-py39&project=openstack%2Fblazar&skip=0
[2] https://paste.opendev.org/show/bZO7ELmTvfMUGJPdlQ4k/
I plan on removing py36 and adding py39 constraints today or tomorrow. -- Matthew Thode
W dniu 13.04.2022 o 17:02, Matthew Thode pisze:
I tracked down the regression to the upper constraint on setuptools. For example, stable/yoga has:
setuptools===59.6.0;python_version=='3.6'
setuptools===60.9.3;python_version=='3.8'
It appears this is ignored in the py39 job so the job runs with the latest setuptools. Indeed, there were some releases between March 10 and March 24. I still have to figure out what changed in setuptools to cause this behaviour.
Question for requirements maintainers: is this expected behaviour, or should upper constraints also include lines for python_version=='3.9' on yoga?
I plan on removing py36 and adding py39 constraints today or tomorrow.
So we will have ones for 3.8, other ones for 3.9 and then for 3.10 too?

Can we just do one set with "3.8 is minimal, if someone runs older then it is their problem"?

3.8 - Ubuntu 'focal' 20.04
3.9 - Debian 'bullseye' 11, CentOS Stream 9/RHEL 9 (and rebuilds)
3.10 - Ubuntu 'jammy' 22.04

Those are "main" distributions OpenStack Zed runs on. We should have one set with "python_version" limits used only when it is REALLY needed.
On Wed, Apr 13, 2022, at 11:29 AM, Marcin Juszkiewicz wrote:
W dniu 13.04.2022 o 17:02, Matthew Thode pisze:
I tracked down the regression to the upper constraint on setuptools. For example, stable/yoga has:
setuptools===59.6.0;python_version=='3.6'
setuptools===60.9.3;python_version=='3.8'
It appears this is ignored in the py39 job so the job runs with the latest setuptools. Indeed, there were some releases between March 10 and March 24. I still have to figure out what changed in setuptools to cause this behaviour.
Question for requirements maintainers: is this expected behaviour, or should upper constraints also include lines for python_version=='3.9' on yoga?
I plan on removing py36 and adding py39 constraints today or tomorrow.
So we will have ones for 3.8, other ones for 3.9 and then for 3.10 too?
Can we just do one set with "3.8 is minimal, if someone runs older then it is their problem"?
I don't think doing that would be a good idea (or possible in all cases). The idea here is that we're always trying to use the newest possible package versions. If a dependency drops support for 3.8, then you get a 3.8-specific entry for that Python version and another entry for 3.9/3.10 covering the newer releases. It is possible (and likely) that a dependency could drop 3.8 and have newer versions for 3.9. It is also possible that a dependency could have no single version that satisfies all of 3.8, 3.9, and 3.10.

Basically you have to accept that there may be entries for any version of Python that you support, due to the way dependencies handle Python support and the desire to have up-to-date dependencies.
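To make that concrete, a constraints file can legitimately end up with entries like these (the package name and versions are invented for illustration, not taken from upper-constraints.txt):

    examplelib===4.2.0;python_version=='3.8'
    examplelib===5.1.0;python_version>='3.9'

Here 5.1.0 would be a release that dropped 3.8 support, so 3.8 environments stay pinned to the last version that still works there while 3.9/3.10 get the newer one.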
3.8 - Ubuntu 'focal' 20.04
3.9 - Debian 'bullseye' 11, CentOS Stream 9/RHEL 9 (and rebuilds)
3.10 - Ubuntu 'jammy' 22.04
Those are "main" distributions OpenStack Zed runs on. We should have one set with "python_version" limits used only when it is REALLY needed.
On Wed, Apr 13, 2022 at 8:21 PM Pierre Riteau <pierre@stackhpc.com> wrote:
Hello,
In the blazar project, we have been seeing a job timeout failure in openstack-tox-py39 affecting master and stable/yoga. tox starts the "lockutils-wrapper python setup.py testr --slowest --testr-args=" process which doesn't show progress until job timeout.
I think you should ideally be migrating to stestr which does not have the issue. I've revived https://review.opendev.org/c/openstack/blazar/+/581547.
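For reference, the switch usually amounts to something like the following (a rough sketch, not the exact content of the review above; the test path is a guess at blazar's layout):

    # .stestr.conf
    [DEFAULT]
    test_path=./blazar/tests
    top_dir=./

    # tox.ini
    [testenv]
    commands =
        stestr run {posargs}
        stestr slowest

stestr discovers and runs the tests directly instead of going through the "python setup.py testr" entry point, which is the path that hangs here with recent setuptools.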
It started happening sometime between 2022-03-10 16:01:39 (last success on master) and 2022-03-24 16:35:15 (first timeout occurrence) [1], with no change in blazar itself and few changes in requirements.
I resumed debugging today and managed to reproduce it using Ubuntu 20.04 (it doesn't happen on macOS). Here is the traceback after interrupting it if anyone wants to take a look [2]. The python process is using 100% of the CPU until interrupted.
I tracked down the regression to the upper constraint on setuptools. For example, stable/yoga has:
setuptools===59.6.0;python_version=='3.6'
setuptools===60.9.3;python_version=='3.8'
It appears this is ignored in the py39 job so the job runs with the latest setuptools. Indeed, there were some releases between March 10 and March 24. I still have to figure out what changed in setuptools to cause this behaviour.
Question for requirements maintainers: is this expected behaviour, or should upper constraints also include lines for python_version=='3.9' on yoga?
Thanks, Pierre Riteau (priteau)
[1] https://zuul.openstack.org/builds?job_name=openstack-tox-py39&project=openstack%2Fblazar&skip=0
[2] https://paste.opendev.org/show/bZO7ELmTvfMUGJPdlQ4k/
-- Regards, Rabi Mishra
On Thu, 14 Apr 2022 at 05:32, Rabi Mishra <ramishra@redhat.com> wrote:
On Wed, Apr 13, 2022 at 8:21 PM Pierre Riteau <pierre@stackhpc.com> wrote:
Hello,
In the blazar project, we have been seeing a job timeout failure in openstack-tox-py39 affecting master and stable/yoga. tox starts the "lockutils-wrapper python setup.py testr --slowest --testr-args=" process which doesn't show progress until job timeout.
I think you should ideally be migrating to stestr which does not have the issue. I've revived https://review.opendev.org/c/openstack/blazar/+/581547.
Many thanks Rabi! Switching to stestr fixes the issue.
participants (5)
- Clark Boylan
- Marcin Juszkiewicz
- Matthew Thode
- Pierre Riteau
- Rabi Mishra