<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
</head>
<body>
<p dir="auto">This originated from the Telco requirements I have described there. The implementation will reside in OpenStack. So we are looking at the problem described, at what else operators need, and then at how to accomplish that. Most probably this means looking at
a new tool rather than injecting it into any existing one.</p>
<p dir="auto">Tomi</p>
<p dir="auto">Sent from <a href="https://aka.ms/blhgte">Outlook Mobile</a><br>
</p>
<p dir="auto">From: EXT Tim Bell<br>
Sent: Saturday, April 23, 04:46<br>
Subject: Re: [Openstack-operators] Maintenance<br>
To: Joseph Bajin, Robert Starmer<br>
Cc: OpenStack Operators<br>
<br>
</p>
<p dir="auto">The overall requirements are being reviewed in <a href="https://etherpad.openstack.org/p/AUS-ops-Nova-maint">https://etherpad.openstack.org/p/AUS-ops-Nova-maint</a>. A future tool may make its way into OSOps, but I think we should keep the requirements
discussion distinct from the available community tools and their tool repository.<br>
</p>
<p dir="auto">Tim<br>
</p>
<p dir="auto">From: Joseph Bajin <<a href="mailto:josephbajin@gmail.com">josephbajin@gmail.com</a>><br>
Date: Friday 22 April 2016 at 17:55<br>
To: Robert Starmer <<a href="mailto:robert@kumul.us">robert@kumul.us</a>><br>
Cc: openstack-operators <<a href="mailto:openstack-operators@lists.openstack.org">openstack-operators@lists.openstack.org</a>><br>
Subject: Re: [Openstack-operators] Maintenance<br>
</p>
<blockquote type="cite">
<p dir="auto">Rob/Jay, <br>
</p>
<p dir="auto">The use of the OSOps Working Group and its repos is a great way to address this. If any of you are coming to the Summit, please take a look at the Etherpad that we have created. [1] This could be a great discussion topic for the working sessions,
and we can brainstorm how we could help with this. <br>
</p>
<p dir="auto">Joe<br>
</p>
<p dir="auto">[1] <a href="https://etherpad.openstack.org/p/AUS-ops-OSOps">https://etherpad.openstack.org/p/AUS-ops-OSOps</a><br>
</p>
<p dir="auto">On Fri, Apr 22, 2016 at 4:02 PM, Robert Starmer <<a href="mailto:robert@kumul.us">robert@kumul.us</a>> wrote:</p>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto">Maybe a result of the discussion can be a set of models (let's not go so far as to call them best practices yet :) for how maintenance can be done at scale, perhaps solidifying the descriptions Jay has above with the user stories Tomi described
in his initial note. This seems like an achievable outcome from a working session, and the output even has a target: either creating scriptable workflows that could end up in the OSops repository, or user stories that can be mapped to the PM working group.<br>
</p>
<p dir="auto">R<br>
</p>
<p dir="auto">On Fri, Apr 22, 2016 at 12:47 PM, Jay Pipes <<a href="mailto:jaypipes@gmail.com">jaypipes@gmail.com</a>> wrote:</p>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto">On 04/14/2016 05:14 AM, Juvonen, Tomi (Nokia - FI/Espoo) wrote:<br>
<snip></p>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto">As admin I want to know when the host is ready for actions to be done by the admin<br>
during the maintenance, meaning physical resources are emptied.<br>
</p>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto"></p>
<p dir="auto">You are equating "host maintenance mode" with the end result of a call to `nova host-evacuate-live`. The two are not the same.</p>
<p dir="auto">"host maintenance mode" typically just refers to taking a Nova compute node out of consideration for placing new workloads on that compute node. Putting a Nova compute node into host maintenance mode is as simple as calling `nova service-disable
$hostname nova-compute`.</p>
<p dir="auto">Depending on what you need to perform on the compute node that is in host maintenance mode, you *may* want to migrate the workloads from that compute node to some other compute node that isn't in host maintenance mode. The `nova host-evacuate
$hostname` and `nova host-evacuate-live $hostname` commands in the Nova CLI [1] can be used to migrate or live-migrate all workloads off the target compute node.</p>
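<p dir="auto">Put together, the maintenance-mode-plus-evacuation flow described above can be sketched with the Nova CLI roughly as follows. This is a sketch, not a definitive runbook; <code>$HOSTNAME</code> is a placeholder for the target compute node, and the <code>--reason</code> flag is optional:</p>

```shell
# Take the compute node out of consideration for placing new workloads
# ("host maintenance mode"):
nova service-disable --reason "planned maintenance" $HOSTNAME nova-compute

# Live-migrate all workloads off the node; use `nova host-evacuate` instead
# if cold migration is acceptable for the workloads in question:
nova host-evacuate-live $HOSTNAME

# After maintenance is finished, return the node to the scheduling pool:
nova service-enable $HOSTNAME nova-compute
```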
<p dir="auto">Live migration will reduce the disruption that tenant workloads (data plane) experience during the workload migration. However, research at Mirantis has shown that libvirt/KVM/QEMU live migration performed against workloads with even a medium
rate of memory page dirtying can easily never complete. Solutions like auto-converge and xbzrle compression have minimal effect on this, unfortunately. Pausing a workload manually is typically what is done to force the live migration to complete.</p>
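<p dir="auto">A sketch of that pausing approach, assuming a brief data-plane outage for the guest is acceptable (<code>$SERVER</code> is a placeholder for the instance name or UUID):</p>

```shell
# Pause the guest so it stops dirtying memory pages, which lets the
# final memory copy of the live migration converge:
nova pause $SERVER

# Migrate the now-quiescent guest; with no host argument, the scheduler
# picks the destination:
nova live-migration $SERVER

# Resume the guest once it is running on the destination host:
nova unpause $SERVER
```

<p dir="auto">Later Nova releases (compute API microversion 2.22) also expose <code>nova live-migration-force-complete</code>, which pauses the guest on the operator's behalf once a migration is already underway.</p>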
<p dir="auto">[1] Note that these are commands in the Nova CLI tool (python-novaclient). Neither a host-evacuate nor a host-evacuate-live REST API call exists in the Compute API. This fact alone should suggest to folks that the appropriate place to put logic
associated with performing host maintenance tasks should be *outside* of Nova entirely...</p>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto">As owner of a server I want to prepare for maintenance to minimize downtime,<br>
keep capacity at the needed level, and switch the HA service to a server not<br>
affected by maintenance.<br>
</p>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto"></p>
<p dir="auto">This isn't an appropriate use case, IMHO. HA control planes should, by their very nature, be established across various failure domains. The whole *point* of having an HA service is so that you don't need to "prepare" for some maintenance event
(planned or unplanned).</p>
<p dir="auto">All HA control planes worth their salt will be able to notify some external listener of a partition in the cluster. This HA control plane is the responsibility of the tenant, not the infrastructure (i.e. Nova). I really do not want to add coupling
between infrastructure control plane services and tenant control plane services.</p>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto">As owner of a server I want to know when my servers will be down because of<br>
host maintenance, as it might be that servers are not moved to another host.<br>
</p>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto"></p>
<p dir="auto">See above. As an owner of a server involved in an HA cluster, it is *the server owner's* responsibility to set things up so that the cluster rebalances, handles redirected load, or does the custom thing that they want. This isn't, IMHO, the domain
of the NFVI but rather a much higher-level NFVO orchestration layer.</p>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto">As owner of a server I want to know if the host is to be totally removed, so<br>
instead of keeping my servers on the host during maintenance, I want to move<br>
them somewhere else.<br>
</p>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto"></p>
<p dir="auto">This isn't something the owner of a server even knows about in a cloud environment. Owners of a server don't (and shouldn't) know which compute node they are, nor should they know that a host is having a planned or unplanned host maintenance event.</p>
<p dir="auto">The infrastructure owner (cloud deployer/operator) is responsible for doing the needful and performing a [live] migration of workloads off of a failing host or a host that is undergoing a cold upgrade. The tenant doesn't know anything about these
things, and shouldn't.</p>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto">As owner of a server I want to send an acknowledgement that I am ready for host<br>
maintenance, and I want to state whether servers are to be moved or kept on the host.<br>
</p>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto"></p>
<p dir="auto">This is describing some virtual inventory management or CMDB functionality that isn't in scope for infrastructure services like Nova. Perhaps it's worth looking into how something like Remedy can manage your virtual inventory in this manner, but
I don't see this being in the OpenStack realm really...</p>
<p dir="auto">FWIW, this is the same objection I had to Tacker joining the OpenStack Big Tent. It is essentially a monolithic, purpose-built-for-Telco application that orchestrates VNFs at layers way above the OpenStack deployment.</p>
<p dir="auto">Best,<br>
-jay</p>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto">Removal and creation of a server is in the owner's control already. Optionally,<br>
server configuration data could hold information about automatic actions to be done<br>
when the host is going down unexpectedly or in a controlled manner, and likewise<br>
about actions to take if it is down permanently or only temporarily. Still, this needs<br>
acknowledgement from the server owner, as he needs time for an application-level<br>
controlled HA service switchover.<br>
Br,<br>
Tomi<br>
</p>
<p dir="auto">_______________________________________________<br>
OpenStack-operators mailing list<br>
<a href="mailto:OpenStack-operators@lists.openstack.org">OpenStack-operators@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><br>
</p>
</blockquote>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto"></p>
</blockquote>
</blockquote>
</blockquote>
<blockquote type="cite">
<blockquote type="cite">
<p dir="auto"><br>
</p>
</blockquote>
</blockquote>
<p dir="auto"><br>
</p>
</body>
</html>