<div dir="ltr"><br><div class="gmail_quote">---------- Forwarded message ----------<br>From: <b class="gmail_sendername">Matt Riedemann</b> <span dir="ltr"><<a href="mailto:mriedemos@gmail.com">mriedemos@gmail.com</a>></span><br>Date: Thu, May 18, 2017 at 7:55 PM<br>Subject: [openstack-dev] [nova] Boston Forum session recap - claims in the scheduler (or conductor)<br>To: <a href="mailto:openstack-dev@lists.openstack.org">openstack-dev@lists.openstack.org</a><br><br><br>The etherpad for this session is here [1]. The goal for this session was to inform operators and get feedback on the plan for what we're doing with moving claims from the computes to the control layer (scheduler or conductor).<br>
<br>
We mostly talked about retries, which also came up in the cells v2 session that Dan Smith led [2]; he will recap that session separately.<br>
<br>
Without getting into too many details, in the cells v2 session we came to a compromise on build retries and said that we could pass hosts down to the cell so that the cell-level conductor could retry if needed (even though we expect doing claims at the top will fix the majority of reasons you'd have a reschedule in the first place).<br>
<br>
During the claims in the scheduler session, a new wrinkle came up which is the hosts that the scheduler returns to the top-level conductor may be in different cells. So if we have two cells, A and B, with hosts x and y in cell A and host z in cell B, we can't send z to A for retries, or x or y to B for retries. So we need some kind of post-filter/weigher filtering such that hosts are grouped by cell and then they can be sent to the cells for retries as necessary.<br>
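As a rough illustration of that post-filter/weigher step, here is a minimal sketch of grouping the scheduler's selections by cell so each cell-level conductor only ever receives retry alternates from its own cell. The function name and the (cell, host) tuple shape are illustrative, not nova's actual data model:<br>

```python
from collections import defaultdict

def group_alternates_by_cell(selected_hosts):
    """Group scheduler-selected hosts by the cell they live in.

    ``selected_hosts`` is a list of (cell, host) tuples in weighed
    order. Each cell's conductor must only be given alternates from
    its own cell, since it cannot retry onto a host in another cell.
    """
    by_cell = defaultdict(list)
    for cell, host in selected_hosts:
        by_cell[cell].append(host)
    return dict(by_cell)

# With cells A (hosts x, y) and B (host z): the retry list sent to
# cell A must not contain z, and the one sent to B must not contain x or y.
alternates = group_alternates_by_cell([("A", "x"), ("B", "z"), ("A", "y")])
# alternates == {"A": ["x", "y"], "B": ["z"]}
```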
<br>
There was also some side discussion asking whether we somehow regressed pack-first strategies by using Placement in Ocata. John Garbutt and Dan Smith have the context on this (I think), so I'm hoping they can clarify whether we really need to fix something in Ocata at this point, or whether this is more a case of closing a loophole.<br>
<br>
We also spent a good chunk of the session talking about the overhead calculations for memory_mb and disk_gb, which happen in the compute service on a per-hypervisor basis. In the absence of automated ways to adjust for overhead, our solution for now is that operators can adjust the reserved host resource values (vcpus, memory, disk) via config options and be as conservative or aggressive as they see fit. Chris Dent and I also noted that you can adjust those reserved values via the placement REST API, but they will be overridden by the config in a periodic task - which may be a bug, and is at the very least a surprise to an operator.<br>
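For reference, those reserved values live in the [DEFAULT] section of nova.conf; the numbers below are purely illustrative, and the vcpus option may only exist in newer releases, so check your release's config reference:<br>

```ini
[DEFAULT]
# Reserve headroom for hypervisor/host OS overhead. Values here are
# examples only - tune them as conservatively or aggressively as fits
# your deployment.
reserved_host_memory_mb = 4096
reserved_host_disk_mb = 10240
# Assumption: available in newer releases only; verify against your
# release's configuration reference before setting.
reserved_host_cpus = 2
```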
<br>
We didn't really get into this during the forum session, but there are different opinions within the nova dev team on how to do claims in the controller services (conductor vs scheduler). Sylvain Bauza has a series which uses the conductor service, and Ed Leafe has a series using the scheduler. More on that in the mailing list [3].<br>
<br>
Next steps are weighing the two options from Sylvain and Ed, picking a path, and moving forward, as we don't have a lot of time to sit on this fence if we're going to get it done in Pike.<br>
<br>
As a side request, it would be great if companies that have teams doing performance and scale testing could help out and compare results before (Ocata) and after (Pike with claims in the controller). We eventually want to deprecate the caching scheduler, but it currently outperforms the filter scheduler at scale because of the retries involved with the filter scheduler, which we expect doing claims at the top will fix.<br>
<br>
[1] <a href="https://etherpad.openstack.org/p/BOS-forum-move-claims-from-compute-to-scheduler" rel="noreferrer" target="_blank">https://etherpad.openstack.org/p/BOS-forum-move-claims-from-compute-to-scheduler</a><br>
[2] <a href="https://etherpad.openstack.org/p/BOS-forum-cellsv2-developer-community-coordination" rel="noreferrer" target="_blank">https://etherpad.openstack.org/p/BOS-forum-cellsv2-developer-community-coordination</a><br>
[3] <a href="http://lists.openstack.org/pipermail/openstack-dev/2017-May/116949.html" rel="noreferrer" target="_blank">http://lists.openstack.org/pipermail/openstack-dev/2017-May/116949.html</a><span class="HOEnZb"><font color="#888888"><br>
<br>
-- <br>
<br>
Thanks,<br>
<br>
Matt<br>
<br>
</font></span></div>
</div>