[openstack-dev] [nova] periodic task
Gary Kotton
gkotton at vmware.com
Tue Sep 1 10:49:21 UTC 2015
On 8/31/15, 9:22 PM, "Matt Riedemann" <mriedem at linux.vnet.ibm.com> wrote:
>
>
>On 8/27/2015 1:22 AM, Gary Kotton wrote:
>>
>>
>> On 8/25/15, 2:43 PM, "Andrew Laski" <andrew at lascii.com> wrote:
>>
>>> On 08/25/15 at 06:08pm, Gary Kotton wrote:
>>>>
>>>>
>>>> On 8/25/15, 9:10 AM, "Matt Riedemann" <mriedem at linux.vnet.ibm.com>
>>>>wrote:
>>>>
>>>>>
>>>>>
>>>>> On 8/25/2015 10:03 AM, Gary Kotton wrote:
>>>>>>
>>>>>>
>>>>>> On 8/25/15, 7:04 AM, "Matt Riedemann" <mriedem at linux.vnet.ibm.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 8/24/2015 9:32 PM, Gary Kotton wrote:
>>>>>>>> In item #2 below the reboot is done via the guest and not the nova
>>>>>>>> APIs :)
>>>>>>>>
>>>>>>>> From: Gary Kotton <gkotton at vmware.com <mailto:gkotton at vmware.com>>
>>>>>>>> Reply-To: OpenStack List <openstack-dev at lists.openstack.org
>>>>>>>> <mailto:openstack-dev at lists.openstack.org>>
>>>>>>>> Date: Monday, August 24, 2015 at 7:18 PM
>>>>>>>> To: OpenStack List <openstack-dev at lists.openstack.org
>>>>>>>> <mailto:openstack-dev at lists.openstack.org>>
>>>>>>>> Subject: [openstack-dev] [nova] periodic task
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>> A couple of months ago I posted a patch for bug
>>>>>>>> https://launchpad.net/bugs/1463688. The issue is as follows: the
>>>>>>>> periodic task detects that the instance state does not match the
>>>>>>>> state on the hypervisor and it shuts down the running VM. There are
>>>>>>>> a number of ways that this may happen and I will try and explain:
>>>>>>>>
>>>>>>>>  1. VMware driver example: a host where the instances are running
>>>>>>>>     goes down. This could be a power outage, host failure, etc. The
>>>>>>>>     first iteration of the periodic task will determine that the
>>>>>>>>     actual instance is down. This will update the state of the
>>>>>>>>     instance to DOWN. The VC has the ability to do HA and it will
>>>>>>>>     start the instance up and running again. The next iteration of
>>>>>>>>     the periodic task will determine that the instance is up and the
>>>>>>>>     compute manager will stop the instance.
>>>>>>>>  2. All drivers: the tenant decides to do a reboot of the instance
>>>>>>>>     and that coincides with the periodic task state validation. At
>>>>>>>>     this point in time the instance will not be up and the compute
>>>>>>>>     node will update the state of the instance as DOWN. On the next
>>>>>>>>     iteration the states will differ and the instance will be shut
>>>>>>>>     down.
>>>>>>>>
>>>>>>>> Basically the issue hit us with our CI and there was no CI running
>>>>>>>> for a couple of hours due to the fact that the compute node decided
>>>>>>>> to shut down the running instances. The hypervisor should be the
>>>>>>>> source of truth and it should not be the compute node that decides
>>>>>>>> to shut down instances. I posted a patch to deal with this,
>>>>>>>> https://review.openstack.org/#/c/190047/, which is the reason for
>>>>>>>> this mail. The patch is backwards compatible, so for existing
>>>>>>>> deployments the random shutdown continues to work as it does today,
>>>>>>>> and the admin now has the ability to just log when there is an
>>>>>>>> inconsistency.
>>>>>>>>
>>>>>>>> We do not want to disable the periodic task as knowing the current
>>>>>>>> state of the instance is very important and has a ton of value; we
>>>>>>>> just do not want the periodic task to shut down a running instance.
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Gary
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ___________________________________________________________________
>>>>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>>>>> Unsubscribe:
>>>>>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>>>>>
>>>>>>>
>>>>>>> In #2 the guest shouldn't be rebooted by the user (tenant) outside of
>>>>>>> the nova-api. I'm not sure if it's actually formally documented in
>>>>>>> the nova documentation, but from what I've always heard/known, nova
>>>>>>> is the control plane and you should be doing everything with your
>>>>>>> instances via the nova-api. If the user rebooted via nova-api, the
>>>>>>> task_state would be set and the periodic task would ignore the
>>>>>>> instance.
>>>>>>
>>>>>> Matt, this is one case that I showed where the problem occurs. There
>>>>>> are others and I can invest time to find them. The fact that the
>>>>>> periodic task is there is important. What I don't understand is why
>>>>>> having an option of a log indication for an admin is something that
>>>>>> is not useful and instead we are going with having the compute node
>>>>>> shut down instances when this should not happen. Our infrastructure
>>>>>> is behaving like cattle. That should not be the case and the
>>>>>> hypervisor should be the source of truth.
>>>>>>
>>>>>> This is a serious issue and instances in production can and will go
>>>>>> down.
>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Matt Riedemann
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> For the HA case #1, the periodic task checks to see if the
>>>>> instance.host doesn't match the compute service host [1] and skips if
>>>>> they don't match.
>>>>>
>>>>> Shouldn't your HA scenario be updating which host the instance is
>>>>> running on? Or is this a vCenter-ism?
>>>>
>>>> The nova compute node has not changed. It is not the compute node's
>>>> host. The host that the instance was running on was down and those
>>>> instances were moved.
>>>
>>> So this is a case where a single compute node is managing multiple
>>> hypervisors? It sounds like there is an assumption being made in the
>>> periodic task that doesn't hold true for the VMware driver, that a
>>> request for the power state of an instance would fail if the host was
>>> down. This may be a better fix here: to not sync the state if the host
>>> is down.
>>>
>>>
>>>
>>>>
>>>> For libvirt the same issues could happen if a process goes down and is
>>>> restarted (there may be some race conditions). But I am not familiar
>>>> enough with the ins and outs there. Just the fact that in some cases
>>>> people are advised to disable the periodic task indicates that this
>>>> too is an issue.
>>>>
>>>> But seriously, we need this and the change is non-intrusive,
>>>> configurable and backwards compatible. Honestly I see no reason why
>>>> this is being blocked.
>>>
>>> The change seems to be under discussion here because this is adding
>>> more complexity to an already quite complex method. I believe the
>>> desire is to find a model that simplifies, or at least doesn't add to
>>> the complexity of, the way that syncs are handled.
>>
>> I am not sure I understand what extra complexity is being added here -
>> the patch in review just logs a message to the log file instead of
>> stopping a running instance.
>>
>> How do you guys suggest that we move forward with this? At the moment
>> the code is blocked and this is a real problem in deployment.
>>
>> BTW I do not think that this is specific to the VMware driver - it is
>> just that we hit it first :)
>>
>>>
>>>>
>>>>
>>>>>
>>>>> [1]
>>>>> http://git.openstack.org/cgit/openstack/nova/tree/nova/compute/manager.py#n5871
>>>>>
>>>>> --
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Matt Riedemann
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>
>I think ideally the virt driver should determine that there is some
>background task going on with the instance (like the HA case of it
>migrating hosts in the VC cluster) and signal to the task in the compute
>manager that that instance should be skipped (basically the same logic
>as if instance.task_state is not None). Barring that, I've proposed an
>alternative solution for what I think is the root issue you're having:
>
>https://review.openstack.org/#/c/218975/
I have a number of issues with this approach:
1. Say there is the edge case where the user does a reboot and the periodic
task detects that the instance is down. Then it will be randomly rebooted at
the next periodic task iteration.
2. If we are going to go the route of a configuration variable then I do
not think that using an existing variable is the correct approach. That
may affect workloads for say libvirt.
3. What about having a new configuration variable indicating the action that
should be taken? For example:
 - stop
 - reboot
 - continue
Let's at least give the admin the power to decide how the guests should
behave.
I would really hate to have a user being in the middle of an operation and
suddenly the instance is rebooted.
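
A minimal sketch of the idea in point 3, assuming a hypothetical option with
the three values listed above. The names MismatchAction and
handle_power_state_mismatch are invented for illustration and are not part of
nova or of either patch under review. Power states are plain strings here
rather than nova's constants, and the real sync logic covers more cases than
this.

```python
import enum
import logging

LOG = logging.getLogger(__name__)


class MismatchAction(enum.Enum):
    """Hypothetical values for the proposed option (not an existing setting)."""
    STOP = "stop"
    REBOOT = "reboot"
    CONTINUE = "continue"


def handle_power_state_mismatch(db_state, hv_state, task_state, action):
    """Return what the periodic task would do for a single instance.

    The result is one of "skip", "stop", "reboot" or "log".
    """
    # An API-driven operation is in flight, so leave the instance alone.
    if task_state is not None:
        return "skip"
    # The recorded state and the hypervisor state agree: nothing to do.
    if db_state == hv_state:
        return "skip"
    # "continue" only reports the inconsistency and touches nothing.
    if action is MismatchAction.CONTINUE:
        LOG.warning("Power state mismatch: recorded %s, hypervisor reports %s",
                    db_state, hv_state)
        return "log"
    # Otherwise apply whichever action the admin configured.
    return action.value
```

With the action set to continue, the HA case described earlier in the thread
(recorded state down, hypervisor reports the instance running again) would
produce a log line and the instance would keep running.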
Thanks
Gary
>
>--
>
>Thanks,
>
>Matt Riedemann
>
>