[openstack-dev] [Cyborg] [Nova] Backup plan without nested RPs

Eric Fried openstack at fried.cc
Mon Jun 4 18:24:30 UTC 2018


Sundar-

	We've been discussing the upgrade path on another thread [1] and are
working toward a solution [2][3] that would not require downtime or
special scripts (other than whatever's normally required for an upgrade).

	We still hope to have all of that ready for Rocky, but if you're
concerned about timing, this work should make it a viable option for you
to start out modeling everything in the compute RP as you say, and then
move it over later.

		Thanks,
		Eric

[1] http://lists.openstack.org/pipermail/openstack-dev/2018-May/130783.html
[2] http://lists.openstack.org/pipermail/openstack-dev/2018-June/131045.html
[3] https://etherpad.openstack.org/p/placement-migrate-operations

On 06/04/2018 12:49 PM, Nadathur, Sundar wrote:
> Hi,
>      Cyborg needs to create RCs and traits for accelerators. The
> original plan was to do that with nested RPs. To avoid rushing the Nova
> developers, I had proposed that Cyborg could start by applying the
> traits to the compute node RP, and accept the resulting caveats for
> Rocky, till we get nested RP support. That proposal did not find many
> takers, and Cyborg has essentially been in waiting mode.
> 
> Since it is June already, and there is a risk of not delivering anything
> meaningful in Rocky, I am reviving my older proposal, which is
> summarized below:
> 
>   * Cyborg shall create the RCs and traits as per spec
>     (https://review.openstack.org/#/c/554717/), both in Rocky and
>     beyond. Only the RPs will change post-Rocky.
>   * In Rocky:
>       o Cyborg will not create nested RPs. It shall apply the device
>         traits to the compute node RP.
>       o Cyborg will document the resulting caveat, i.e., all devices in
>         the same host should have the same traits. In particular, we
>         cannot have a GPU and an FPGA, or 2 FPGAs of different types, in
>         the same host.
>       o Cyborg will document that upgrades to post-Rocky releases will
>         require operator intervention (as described below).
>   * For upgrade to a post-Rocky world with nested RPs:
>       o The operator needs to stop all running instances that use an
>         accelerator.
>       o The operator needs to run a script that removes the Cyborg
>         traits and the inventory for Cyborg RCs from compute node RPs.
>       o The operator can then perform the upgrade. The new Cyborg
>         agent/driver(s) shall create nested RPs and publish
>         inventory/traits as specified.
> 
> IMHO, it is acceptable for Cyborg to do this because it is new and we
> can set expectations for the (lack of) upgrade plan. The alternative is
> that potentially no meaningful use cases get addressed in Rocky for Cyborg.
> 
> Please LMK what you think.
> 
> Regards,
> Sundar
> 
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
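
For illustration only, the cleanup step in the quoted proposal (removing
the Cyborg traits and the inventory for Cyborg RCs from a compute node RP)
might be sketched as below. This is a minimal sketch under stated
assumptions: it filters plain Python data and does not call Placement, and
the prefixes and the function name strip_cyborg_data are hypothetical, not
taken from the Cyborg spec.

```python
from typing import Dict, List, Tuple

# ASSUMPTION: hypothetical naming conventions, chosen only for this sketch.
# The real trait and resource class names are defined by the Cyborg spec.
CYBORG_TRAIT_PREFIXES = ("CUSTOM_FPGA_", "CUSTOM_GPU_")
CYBORG_RC_PREFIX = "CUSTOM_ACCELERATOR"


def strip_cyborg_data(
    traits: List[str],
    inventories: Dict[str, dict],
) -> Tuple[List[str], Dict[str, dict]]:
    """Return the traits and inventories of a compute node RP with the
    Cyborg-owned entries filtered out. The inputs are not modified."""
    # Keep every trait that does not carry one of the assumed Cyborg prefixes.
    kept_traits = [
        t for t in traits if not t.startswith(CYBORG_TRAIT_PREFIXES)
    ]
    # Keep every inventory whose resource class is not an assumed Cyborg RC.
    kept_inventories = {
        rc: inv
        for rc, inv in inventories.items()
        if not rc.startswith(CYBORG_RC_PREFIX)
    }
    return kept_traits, kept_inventories


# Example: a compute node RP as it might look in Rocky under this proposal.
rp_traits = ["HW_CPU_X86_AVX2", "CUSTOM_FPGA_EXAMPLE"]
rp_inventories = {
    "VCPU": {"total": 16},
    "MEMORY_MB": {"total": 65536},
    "CUSTOM_ACCELERATOR_FPGA": {"total": 1},
}
new_traits, new_inventories = strip_cyborg_data(rp_traits, rp_inventories)
```

A real script would have to read and write this data through Placement for
each compute node RP, and would run only after the instances that use an
accelerator have been stopped, as the proposal describes.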


