[openstack-dev] [nova] [placement] Ocata upgrade procedure and problems when it's optional in Newton

Sylvain Bauza sbauza at redhat.com
Tue Jan 10 16:51:00 UTC 2017



On 10/01/2017 14:49, Sylvain Bauza wrote:
> Aloha folks,
> 
> Recently, I was discussing with some TripleO folks. Disclaimer: I don't
> think this is only a TripleO-related discussion, but rather a larger
> one for all our deployers.
> 
> So, the question I was asked was how to upgrade the Placement API from
> Newton to Ocata when the deployer is not yet running the Placement API
> in Newton (because it was optional in Newton).
> 
> The quick answer was "easy, just upgrade the service and run the
> Placement API *before* the scheduler upgrade". That's because we're
> working on a change that makes the scheduler call the Placement API
> instead of fetching all the compute nodes from the database [1].
> 
> That said, I thought about something else: wait, the Newton compute
> nodes work with the Placement API, cool. But what if the Placement API
> is not deployed in Newton, since it's optional there? Then the Newton
> computes stop calling the Placement API after the first failed attempt,
> because of a nice decorator [2] (okay with me).
> 
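> For reference, here's a minimal sketch of what such a try-once guard
> can look like (a sketch with assumed names and a simplified latch, not
> the actual nova code; the real decorator lives at [2]):
> 
>     import functools
> 
>     from keystoneauth1 import exceptions as ks_exc
> 
>     def safe_connect(f):
>         @functools.wraps(f)
>         def wrapper(self, *args, **kwargs):
>             # Once we failed to reach the placement endpoint, stop
>             # trying until the service is restarted (or SIGHUP'd).
>             if getattr(self, '_placement_disabled', False):
>                 return None
>             try:
>                 return f(self, *args, **kwargs)
>             except (ks_exc.EndpointNotFound,
>                     ks_exc.MissingAuthPlugin):
>                 self._placement_disabled = True
>                 return None
>         return wrapper
> 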
> Then, imagine the problem for the upgrade: given deployers not running
> the Placement API in Newton, they would need to *first* deploy the
> (Newton or Ocata) Placement service, then SIGHUP all the Newton compute
> nodes to have them report their resources (and create the inventories),
> then wait a few minutes for all the inventories to be reported, and
> then upgrade all the services (except the compute nodes, of course) to
> Ocata, including the scheduler service.
> 
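> To make the "wait a few minutes" step concrete, a deployer-side check
> could poll the Placement API until every compute node shows up as a
> resource provider. A rough sketch (GET /resource_providers is the real
> placement endpoint; the helper itself is hypothetical, with `placement`
> being a keystoneauth1 session/adapter for the placement service):
> 
>     import time
> 
>     def wait_for_resource_providers(placement, expected_count,
>                                     timeout=600, interval=10):
>         # Poll until every compute node has registered its resource
>         # provider (and thus its inventories) with the placement API.
>         deadline = time.time() + timeout
>         while time.time() < deadline:
>             resp = placement.get('/resource_providers')
>             rps = resp.json()['resource_providers']
>             if len(rps) >= expected_count:
>                 return rps
>             time.sleep(interval)
>         raise RuntimeError('not all computes reported to placement')
> 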
> The above looks like a different upgrade policy, right?
>  - Either we say you need to run the Newton placement service *before*
> upgrading - and in that case, the Placement service is not optional for
> Newton, right?
>  - Or we say you need to run the Ocata placement service and then
> restart the compute nodes *before* upgrading the services - and that's
> a very different situation from the current upgrade model.
> 
> For example, I know it's not strictly a Nova thing, but most of our
> deployers distinguish "controller" vs. "compute" services, i.e. all the
> Nova services but the computes running on a single (or more)
> machine(s). In that case, the "controller" upgrade is monolithic and
> all the services are upgraded and restarted at the same stage. If so,
> it looks difficult to just ask those deployers to follow a very
> different procedure.
> 
> Anyway, I think we need to carefully consider that, and probably find
> some solutions. For example, we could imagine (disclaimer #2, these are
> probably silly solutions, but they're the ones I'm thinking of now):
>  - a DB migration for creating the inventories and allocations before
> upgrading (i.e. not asking the computes to register themselves with the
> placement API). That would be terrible because it's a data upgrade, I
> know...
>  - having the scheduler keep a backwards-compatible behaviour in [1],
> i.e. trying to call the Placement API to get the list of RPs, or
> falling back to listing all the ComputeNodes if that's not possible
> (see the sketch after this list). But that would mean that the
> Placement API is still optional for Ocata :/
>  - merging the change that has the scheduler call the Placement API [1]
> in a point release after we deliver Ocata (and still making the
> Placement API mandatory for Ocata), so that we would be sure that all
> computes are reporting their status to Placement once we restart the
> scheduler in the point release.
> 
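> To illustrate the second option above, the backwards-compatible lookup
> could look roughly like this (a sketch only, with injected callables
> instead of the real scheduler internals; the actual change is [1]):
> 
>     def list_candidate_nodes(placement, db_get_all, db_get_by_uuids):
>         # Prefer the Placement API to narrow down the compute nodes...
>         try:
>             resp = placement.get('/resource_providers')
>             uuids = [rp['uuid']
>                      for rp in resp.json()['resource_providers']]
>         except Exception:
>             # ... but fall back to the full ComputeNode list from the
>             # DB when placement isn't deployed (keeping it optional).
>             return db_get_all()
>         return db_get_by_uuids(uuids)
> 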

FWIW, another possible solution has been discussed upstream in the
#openstack-nova channel, proposed by Dan Smith: we could remove the
try-once behaviour in the decorator, backport that to Newton and do a
point release, which would allow the compute nodes to keep trying to
reconcile with the Placement API in a self-healing manner.
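
In code terms the change is small; something along these lines (again a
sketch with assumed names, not the actual patch):

    import functools
    import logging

    from keystoneauth1 import exceptions as ks_exc

    LOG = logging.getLogger(__name__)

    def safe_connect(f):
        @functools.wraps(f)
        def wrapper(self, *args, **kwargs):
            try:
                return f(self, *args, **kwargs)
            except (ks_exc.EndpointNotFound,
                    ks_exc.MissingAuthPlugin):
                # No try-once latch anymore: just warn and return, so
                # the next periodic task run retries and the compute
                # self-heals once the placement endpoint shows up.
                LOG.warning('Placement API not available, will retry')
                return None
        return wrapper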

That would mean that deployers would have to upgrade to the latest
Newton point release before upgrading to Ocata, which is, I think, the
best supported model.

I'll propose a patch for that in my series as a bottom change for [1].

-Sylvain



> 
> Thoughts?
> -Sylvain
> 
> 
> [1] https://review.openstack.org/#/c/417961/
> 
> [2]
> https://github.com/openstack/nova/blob/180e6340a595ec047c59365465f36fed7a669ec3/nova/scheduler/client/report.py#L40-L67
> 


