Open Stack

Fri May 19 00:55:18 UTC 2017

The etherpad for this session is here [1]. The goal for this session was 
to inform operators and get feedback on the plan for what we're doing 
with moving claims from the computes to the control layer (scheduler or 
conductor).

We mostly talked about retries, which also came up in the cells v2 
session that Dan Smith led [2] and will recap later.

Without getting into too many details, in the cells v2 session we came 
to a compromise on build retries and said that we could pass hosts down 
to the cell so that the cell-level conductor could retry if needed (even 
though we expect doing claims at the top will fix the majority of 
reasons you'd have a reschedule in the first place).

During the claims in the scheduler session, a new wrinkle came up: the 
hosts that the scheduler returns to the top-level conductor may be in 
different cells. So if we have two cells, A and B, with hosts x and y in 
cell A and host z in cell B, we can't send z to A for retries, or x or y 
to B for retries. So we need some kind of grouping step after filtering 
and weighing, such that hosts are grouped by cell and can then be sent 
to the right cell for retries as necessary.
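
To make the grouping idea concrete, here is a minimal sketch in Python. 
It is not nova code: the function names and the (host, cell) tuple shape 
are assumptions for illustration only. It takes the weighed host list, 
best first, and keeps as retry alternates only the hosts that live in 
the same cell as the selected host.

```python
from collections import OrderedDict


def group_hosts_by_cell(weighed_hosts):
    """Group (host, cell) pairs by cell, preserving the weighed order.

    weighed_hosts is a hypothetical list of (host_name, cell_name)
    tuples, ordered best first, as it might come out of the
    filter/weigher pass.
    """
    by_cell = OrderedDict()
    for host, cell in weighed_hosts:
        by_cell.setdefault(cell, []).append(host)
    return by_cell


def pick_target_and_alternates(weighed_hosts):
    """Return (cell, selected_host, alternates_in_same_cell)."""
    best_host, best_cell = weighed_hosts[0]
    grouped = group_hosts_by_cell(weighed_hosts)
    # Only hosts in the selected host's cell are usable for retries,
    # because the cell-level conductor cannot reach other cells.
    alternates = [h for h in grouped[best_cell] if h != best_host]
    return best_cell, best_host, alternates


# Example from the text: x and y are in cell A, z is in cell B.
hosts = [("x", "A"), ("z", "B"), ("y", "A")]
cell, selected, alternates = pick_target_and_alternates(hosts)
```

With the example above, host z is weighed second overall but is dropped 
from the retry list because it is in cell B, so cell A would get x as 
the target and y as the only alternate.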

There was also some side discussion asking if we somehow regressed 
pack-first strategies by using Placement in Ocata. John Garbutt and Dan 
Smith have the context on this (I think) so I'm hoping they can clarify 
whether we really need to fix something in Ocata at this point, or 
whether this is more a case of closing a loophole.

We also spent a good chunk of the session talking about overhead 
calculations for memory_mb and disk_gb, which happen in the compute and 
on a per-hypervisor basis. In the absence of an automated way to adjust 
for overhead, our solution for now is that operators can adjust the 
reserved host resource values (vcpus, memory, disk) via config options 
and be as conservative or aggressive as they see fit. Chris Dent and I 
also noted that you can adjust those reserved values via the placement 
REST API, but they will be overridden by the config in a periodic task, 
which may be a bug, or at the very least a surprise to an operator.
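
As a rough illustration of what "conservative or aggressive" means in 
practice, the sketch below computes the capacity available for a 
resource class as (total - reserved) * allocation_ratio, which is how I 
understand the placement inventory model to work. The function name and 
the numbers are made up for the example.

```python
def usable_capacity(total, reserved, allocation_ratio=1.0):
    """Capacity the scheduler can hand out for one resource class.

    Assumes the placement-style formula
    (total - reserved) * allocation_ratio; the values passed in below
    are illustrative, not recommendations.
    """
    return int((total - reserved) * allocation_ratio)


# A host with 64 GiB of RAM and a 1.5 memory allocation ratio.
total_memory_mb = 65536

# Conservative: reserve 4 GiB to absorb per-instance hypervisor overhead.
conservative = usable_capacity(total_memory_mb, 4096, 1.5)

# Aggressive: reserve only 512 MiB.
aggressive = usable_capacity(total_memory_mb, 512, 1.5)
```

The larger reservation leaves less schedulable memory, which is the 
knob operators have until overhead can be accounted for automatically.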

We didn't really get into this during the forum session, but there are 
different opinions within the nova dev team on how to do claims in the 
controller services (conductor vs scheduler). Sylvain Bauza has a series 
which uses the conductor service, and Ed Leafe has a series using the 
scheduler. More on that in the mailing list [3].

Next steps are going to be weighing both options between Sylvain and Ed, 
picking a path and moving forward, as we don't have a lot of time to sit 
on this fence if we're going to get it done in Pike.

As a side request, it would be great if companies that have teams doing 
performance and scale testing could help out and compare before (Ocata) 
and after (Pike with claims in the controller) results. We eventually 
want to deprecate the caching scheduler, but it currently outperforms 
the filter scheduler at scale because of the retries involved when 
using the filter scheduler, which we expect doing claims at the top 
will fix.

[1] https://etherpad.openstack.org/p/BOS-forum-move-claims-from-compute-to-scheduler
[2] https://etherpad.openstack.org/p/BOS-forum-cellsv2-developer-community-coordination
[3] http://lists.openstack.org/pipermail/openstack-dev/2017-May/116949.html

-- 

Thanks,

Matt

[openstack-dev] [nova] Boston Forum session recap - claims in the scheduler (or conductor)