Updating OS-TRAITS in placement stable/yoga and stable/zed

Sean Mooney smooney at redhat.com
Thu Apr 13 11:31:48 UTC 2023


On Wed, 2023-04-12 at 19:00 -0500, Tony Breeds wrote:
> On Wed, 12 Apr 2023 at 18:27, Karl Kloppenborg <kkloppenborg at rwts.com.au>
> wrote:
> 
> > Hello Placement and OS-Traits teams.
> > 
> > Karl from Openstack-Helm.
> > 
> > 
> > 
> > After debugging a number of issues in Cyborg, it has become necessary to
> > update the pinned OS-TRAITs version in stable/yoga and stable/zed.
> > 
> > I have submitted the following two reviews for updating the pinned
> > versions to 2.10.0
> > 
> > 
> > 
> > https://review.opendev.org/c/openstack/placement/+/880249
> > 
> > https://review.opendev.org/c/openstack/placement/+/880249
> > 
> 
> This won't do what you want.  The version that will be installed is
> controlled by constraints in the openstack/requirements repo
> 
> https://opendev.org/openstack/requirements/src/branch/stable/yoga/upper-constraints.txt#L388
> and
> https://opendev.org/openstack/requirements/src/branch/stable/zed/upper-constraints.txt#L361
> 
> We don't typically allow this kind of bump on stable branches as it often
> indicates a feature backport which we avoid in order to keep the release
> and gate stable.
ya this is not something we can or should change upstream.

> 
> So we could *potentially* do what you need but the community as a whole
> will need to understand the rationale for why.
> 
my understanding is they are trying to use a feature from antelope cyborg with a yoga placement.

the real fix would be to just update placement to at least zed in their deployment.
>   Upping the version on a
> stable branch will also have vendor impacts for supported versions of
> os-traits.
> 
> > By merging these changes, it will allow us to continue to use vGPU
> > capabilities in cyborg in the stable/yoga and stable/zed onwards branches.
> > 
> 
> Continue to use?  Please help us understand this.
they have a downstream backport of a feature that merged in february but was written 3 years ago.
specifically they somehow have https://github.com/openstack/cyborg/commit/79e1928554b6a03dd481ebefd3f550adeb457aed
in their yoga/zed versions, which uses the "OWNER_CYBORG" trait
https://github.com/openstack/cyborg/blob/master/cyborg/accelerator/drivers/gpu/nvidia/sysinfo.py#L52

which does not exist until zed, released in os-traits 2.8.0 https://github.com/openstack/os-traits/commit/f64d50e4dd2f21558fb73dd4b59cd1d4b121b707
placement in zed depends on 2.8.0 https://github.com/openstack/placement/commit/c4e89253a52839a514bd87fdebc4785fe0e64146
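
a rough sketch of how you could check whether the os-traits installed in an environment is new enough for OWNER_CYBORG. the 2.8.0 minimum comes from the commits above; the helper names are made up for illustration, and it assumes plain dotted numeric versions. note that the comparison is done on integer tuples, since comparing the raw strings would sort "2.10.0" before "2.8.0".

```python
from importlib import metadata

# Minimum os-traits release that ships OWNER_CYBORG, per the commits above.
MIN_OS_TRAITS = "2.8.0"


def version_tuple(version):
    """Turn '2.10.0' into (2, 10, 0), stopping at the first non-numeric part."""
    parts = []
    for part in version.split("."):
        if not part.isdigit():
            break
        parts.append(int(part))
    return tuple(parts)


def has_owner_traits(installed, minimum=MIN_OS_TRAITS):
    """Return True if the given os-traits version is at least the minimum."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_os_traits_version():
    """Return the installed os-traits version string, or None if absent."""
    try:
        return metadata.version("os-traits")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_os_traits_version()
    if found is None:
        print("os-traits is not installed in this environment")
    elif has_owner_traits(found):
        print(found, "is new enough for OWNER_CYBORG")
    else:
        print(found, "is too old for OWNER_CYBORG")
```

run it inside the placement container or virtualenv, since that is where the installed os-traits version matters.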

the fact they need os-traits 2.10.0 probably means they have other backports.

standard traits should not really be backported.
when we consider feature backports downstream we do not backport features that require new traits to function.
we treat it like an api change (microversion). strictly speaking it is possible to do, but all that they really need
to do is update the installed version of os-traits they use.

the other thing to note is that we don't currently use the owner traits for anything.

the nova part of https://specs.openstack.org/openstack/nova-specs/specs/zed/approved/owner-nova-trait-usage.html was not implemented,
so the nova managed resources are not tagged with OWNER_NOVA, so you can't have both nova managed gpus and nova managed vgpus sharing the same resource class
yet. at least not out of the box.

the 3 paths forward really are as follows.
1.) work with your downstream vendor distribution to have them remove the usage of the owner trait in their yoga backport, and any other standard traits
that don't exist there. i.e. traits = ["OWNER_CYBORG"] -> traits = [] here
https://github.com/openstack/cyborg/blob/master/cyborg/accelerator/drivers/gpu/nvidia/sysinfo.py#L52

2.) work with your downstream vendor distribution to update the os-traits lib they ship. the newer versions will work with older placement.
it will just break the unit test that counts the number of standard traits in the db. that is trivial to fix to the new count.

3.) upgrade the placement version in your deployment to a zed/antelope container. newer placement will work with older everything else.
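
to make option 1 a bit more concrete, here is a minimal sketch of dropping standard traits that the target release does not know about. this is not what cyborg does; split_traits and the example inputs are hypothetical, and in a real environment the known set would probably come from os_traits.get_traits() rather than a hand-written set.

```python
def split_traits(requested, known_standard):
    """Split requested traits into (usable, dropped).

    CUSTOM_ traits are always kept because they do not depend on the
    os-traits release; standard traits are kept only if they are known.
    """
    usable, dropped = [], []
    for trait in requested:
        if trait.startswith("CUSTOM_") or trait in known_standard:
            usable.append(trait)
        else:
            dropped.append(trait)
    return usable, dropped


if __name__ == "__main__":
    # Hypothetical inputs: an older release that lacks the OWNER_* traits.
    known = {"HW_GPU_API_VULKAN"}
    usable, dropped = split_traits(["OWNER_CYBORG", "CUSTOM_GPU_FOO"], known)
    print("report to placement:", usable)
    print("dropped as unknown:", dropped)
```

for the sysinfo.py case above this boils down to the traits = ["OWNER_CYBORG"] -> traits = [] change, just without hardcoding which traits to remove.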

if you are using the upstream version of cyborg with the upstream version of placement/os-traits there should be no incompatibility.



> 
> Tony.



