<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif;">
<div>In item #2 below, the reboot is done via the guest and not via the Nova API :)</div>
<div><br>
</div>
<span id="OLK_SRC_BODY_SECTION">
<div style="font-family:Calibri; font-size:11pt; text-align:left; color:black; BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; PADDING-BOTTOM: 0in; PADDING-LEFT: 0in; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; BORDER-RIGHT: medium none; PADDING-TOP: 3pt">
<span style="font-weight:bold">From: </span>Gary Kotton &lt;<a href="mailto:gkotton@vmware.com">gkotton@vmware.com</a>&gt;<br>
<span style="font-weight:bold">Reply-To: </span>OpenStack List &lt;<a href="mailto:openstack-dev@lists.openstack.org">openstack-dev@lists.openstack.org</a>&gt;<br>
<span style="font-weight:bold">Date: </span>Monday, August 24, 2015 at 7:18 PM<br>
<span style="font-weight:bold">To: </span>OpenStack List &lt;<a href="mailto:openstack-dev@lists.openstack.org">openstack-dev@lists.openstack.org</a>&gt;<br>
<span style="font-weight:bold">Subject: </span>[openstack-dev] [nova] periodic task<br>
</div>
<div><br>
</div>
<div>
<div style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif;">
<div>Hi,</div>
<div>A couple of months ago I posted a patch for bug <a href="https://launchpad.net/bugs/1463688">https://launchpad.net/bugs/1463688</a>. The issue is as follows: the periodic task detects that the instance state does not match the state on the hypervisor and
shuts down the running VM. There are a number of ways this can happen, and I will try to explain:</div>
<ol>
<li>VMware driver example: a host where the instances are running goes down. This could be a power outage, host failure, etc. The first iteration of the periodic task will determine that the actual instance is down and will update the state of the instance
to DOWN. vCenter (VC) has the ability to do HA, and it will start the instance up again. The next iteration of the periodic task will determine that the instance is up, and the compute manager will stop the instance.</li><li>All drivers: the tenant decides to reboot the instance, and that coincides with the periodic task's state validation. At this point in time the instance is not up, so the compute node updates the state of the instance to DOWN. On the next iteration
the states differ and the instance is shut down.</li></ol>
<div>Basically, the issue hit us with our CI: there was no CI running for a couple of hours because the compute node decided to shut down the running instances. The hypervisor should be the source of truth, and it should not be the compute node
that decides to shut down instances. I posted a patch to deal with this, <a href="https://review.openstack.org/#/c/190047">https://review.openstack.org/#/c/190047</a>/, which is the reason for this mail. The patch is backwards compatible: existing
deployments keep today's behavior, random shutdowns included, and the admin now has the option to just log when there is an inconsistency.</div>
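<div><br>
</div>
<div>To make the sequence concrete, here is a minimal sketch of the two behaviors in Python. It is not the code from the patch or from nova: the state names, the <code>Instance</code> class, the <code>sync_power_state</code> function and the <code>on_mismatch</code> parameter are all hypothetical and only follow the simplified description above.</div>
<pre>
```python
import logging
from dataclasses import dataclass

LOG = logging.getLogger(__name__)

# Hypothetical state names for illustration; not nova's real constants.
RUNNING = "RUNNING"
DOWN = "DOWN"


@dataclass
class Instance:
    name: str
    recorded_state: str  # state last recorded by the compute manager


def sync_power_state(instance, hypervisor_state, stop_instance,
                     on_mismatch="stop"):
    """One simplified pass of the periodic power-state check.

    on_mismatch is a hypothetical setting: "stop" models today's
    behavior, "log" models the log-only behavior described above.
    Returns the action taken, as a string.
    """
    if instance.recorded_state == hypervisor_state:
        return "none"

    if hypervisor_state == DOWN:
        # The hypervisor reports the VM as down: record that state.
        instance.recorded_state = DOWN
        return "recorded_down"

    # Recorded as DOWN, but the hypervisor reports the VM as running
    # again (for example after an HA restart or a guest reboot).
    if on_mismatch == "log":
        LOG.warning("Instance %s is %s on the hypervisor but recorded as %s",
                    instance.name, hypervisor_state, instance.recorded_state)
        return "logged"

    stop_instance(instance)
    return "stopped"


# Scenario 1 with today's behavior: the VM ends up being stopped.
stopped = []
vm = Instance("ci-vm", RUNNING)
first = sync_power_state(vm, DOWN, stopped.append)      # host went down
second = sync_power_state(vm, RUNNING, stopped.append)  # HA restarted the VM

# Same mismatch with the log-only setting: the VM is left running.
vm_log = Instance("ci-vm-log", DOWN)
third = sync_power_state(vm_log, RUNNING, stopped.append, on_mismatch="log")
```
</pre>
<div>In this sketch the first pass records DOWN, the second pass stops the VM that HA has just restarted, and the log-only setting leaves the restarted VM alone. Whether the log-only path should also refresh the recorded state is left open here.</div>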
<div><br>
</div>
<div>We do not want to disable the periodic task, as knowing the current state of the instance is very important and has a ton of value. We just do not want the periodic task to shut down a running instance.</div>
<div><br>
</div>
<div>Thanks</div>
<div>Gary</div>
</div>
</div>
</span>
</body>
</html>