[openstack-dev] [nova] Order of n-api (placement) and n-sch upgrades for Ocata
Sylvain Bauza
sbauza at redhat.com
Thu Jan 19 16:55:51 UTC 2017
On 19/01/2017 17:00, Matt Riedemann wrote:
> On 1/19/2017 9:43 AM, Sylvain Bauza wrote:
>>
>>
>> On 19/01/2017 16:27, Matt Riedemann wrote:
>>> Sylvain and I were talking about how he's going to work placement
>>> microversion requests into his filter scheduler patch [1]. He needs to
>>> make requests to the placement API with microversion 1.4 [2] or later
>>> for resource provider filtering on specific resource classes like VCPU
>>> and MEMORY_MB.
>>>
>>> The question was what happens if microversion 1.4 isn't available in the
>>> placement API, i.e. the nova-scheduler is running Ocata code now but the
>>> placement service is running Newton still.
>>>
>>> Our rolling upgrades doc [3] says:
>>>
>>> "It is safest to start nova-conductor first and nova-api last."
>>>
>>> But since placement is bundled with n-api that would cause issues since
>>> n-sch now depends on the n-api code.
>>>
>>> If you package the placement service separately from the nova-api
>>> service then this is probably not an issue. You can still roll out n-api
>>> last and restart it last (for control services), and just make sure that
>>> placement is upgraded before nova-scheduler (we need to be clear about
>>> that in [3]).
>>>
>>> But do we have any other issues if they are not packaged separately? Is
>>> it possible to install the new code, but still only restart the
>>> placement service before nova-api? I believe it is, but want to ask this
>>> out loud.
>>>
>>> I think we're probably OK here but I wanted to ask this out loud and
>>> make sure everyone is aware and can think about this as we're a week
>>> from feature freeze. We also need to look into devstack/grenade because
>>> I'm fairly certain that we upgrade n-sch *before* placement in a grenade
>>> run which will make any issues here very obvious in [1].
>>>
>>> [1] https://review.openstack.org/#/c/417961/
>>> [2] http://docs.openstack.org/developer/nova/placement.html#filter-resource-providers-having-requested-resource-capacity
>>> [3] http://docs.openstack.org/developer/nova/upgrade.html#rolling-upgrade-process
>>>
>>
>> I thought out loud in the nova channel about the following possibility:
>> since we always ask operators to upgrade n-cpus *AFTER* upgrading the
>> other services, we could let nova-scheduler gracefully accept a Newton
>> placement service *UNLESS* there are Ocata computes.
>>
>> In more technical terms, the scheduler getting a response from the
>> placement service is a hard requirement for Ocata. That said, if the
>> response code is a 400 with a message saying that the schema is
>> incorrect, the scheduler would check the max version of all the
>> computes and then:
>> - either the max version is Newton, and it falls back to
>> ComputeNodeList.get_all() to get the list of nodes
>> - or the max version is Ocata (at least one node is upgraded), and
>> it raises a NoValidHost
>>
>> That way, the upgrade path would be:
>> 1/ upgrade your conductor
>> 2/ upgrade all your other services except n-cpus (we could upgrade and
>> restart n-sch before n-api and that would still work; the reverse
>> order would be fine too)
>> 3/ do a rolling upgrade of your n-cpus
>>
>> I think we would then keep the existing upgrade path, and the placement
>> service would still be mandatory for Ocata.
>>
>> Thoughts ?
>> -Sylvain
>>
>
> I don't like basing the n-sch decision on the service version of the
> computes, because the computes will keep trying to connect to the
> placement service until it's available, but not fail. That doesn't
> really mean that placement is new enough for the scheduler to use the
> 1.4 microversion.
>
> So IMO we either charge forward as planned and make it clear in the docs
> that for Ocata, the placement service must be upgraded *before*
> nova-scheduler, or we punt and provide a fallback to just pulling all
> compute nodes from the database if we can't make the 1.4 request to
> placement. Given my original post here, I'd prefer to charge forward
> unless it becomes clear that is not going to work, or is at least going
> to be very painful.
>
Given the very short deadline for cycle-trailing projects [1], which
is R+2 [2], charging forward and asking them to modify their
deployments would have to happen within the next 3 weeks (even less, given
that we haven't yet agreed and haven't yet provided the documentation).
That looks like a very short time for them, and a fire drill.
I'd rather see a way to accept the placement service still being on
Newton. If you don't agree with verifying the compute node
versions, why not just fall back to calling the database
when the 1.4 placement request is not accepted?
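
To make the fallback idea concrete, here is a minimal Python sketch, not the
actual patch. It assumes two hypothetical callables: query_placement, which
sends the resource provider request with the given headers, and
list_all_compute_nodes, which stands in for ComputeNodeList.get_all().
PlacementResponse is likewise a simplified stand-in for an HTTP response.
The 400 status comes from the discussion above; treating 406 the same way is
an assumption about how an older service might reject the microversion.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Header used to request a placement microversion; 1.4 is the one discussed.
MICROVERSION_HEADER = {"OpenStack-API-Version": "placement 1.4"}

# 400 is the status mentioned in this thread; 406 is an assumption about how
# an older placement service might reject an unknown microversion.
REJECTED_STATUSES = frozenset({400, 406})


@dataclass
class PlacementResponse:
    """Hypothetical, simplified stand-in for an HTTP response."""
    status_code: int
    provider_uuids: List[str] = field(default_factory=list)


def get_candidate_nodes(
    query_placement: Callable[[Dict[str, str]], PlacementResponse],
    list_all_compute_nodes: Callable[[], List[str]],
) -> List[str]:
    """Return filtered providers, or every compute node if 1.4 is rejected."""
    response = query_placement(MICROVERSION_HEADER)
    if response.status_code == 200:
        # Placement understood the 1.4 request: use its filtered list.
        return response.provider_uuids
    if response.status_code in REJECTED_STATUSES:
        # Placement is too old for 1.4: fall back to the database listing.
        return list_all_compute_nodes()
    # Any other failure (for example an outage) is not silently hidden.
    raise RuntimeError(
        "placement request failed: HTTP %d" % response.status_code
    )
```

Falling back only on an explicit rejection keeps placement mandatory: if the
service is unreachable or broken, the scheduler still fails instead of quietly
using the database.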
-Sylvain
[1]
https://releases.openstack.org/reference/release_models.html#cycle-trailing
[2] https://releases.openstack.org/ocata/schedule.html#o-trailing