[openstack-dev] [nova] Boston Forum session recap - claims in the scheduler (or conductor)
mriedemos at gmail.com
Fri May 19 00:55:18 UTC 2017
There is an etherpad for this session. The goal for this session was
to inform operators and get feedback on the plan for what we're doing
with moving claims from the computes to the control layer (scheduler or
conductor).

We mostly talked about retries, which also came up in the cells v2
session that Dan Smith led and will recap later.

Without getting into too many details, in the cells v2 session we came
to a compromise on build retries and said that we could pass hosts down
to the cell so that the cell-level conductor could retry if needed (even
though we expect doing claims at the top will fix the majority of
reasons you'd have a reschedule in the first place).

During the claims in the scheduler session, a new wrinkle came up: the
hosts that the scheduler returns to the top-level conductor may be in
different cells. So if we have two cells, A and B, with hosts x
and y in cell A and host z in cell B, we can't send z to A for retries,
or x or y to B for retries. So we need some kind of post-filter/weigher
filtering such that hosts are grouped by cell and then they can be sent
to the cells for retries as necessary.
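
To make the grouping idea concrete, here is a minimal sketch of what
post-weigher grouping could look like. This is not nova code: the
function names and the (host, cell) tuple shape are made up for
illustration, and it assumes the scheduler hands back hosts already
sorted best-first.

```python
from collections import OrderedDict


def group_hosts_by_cell(weighed_hosts):
    """Group (host, cell) pairs by cell, keeping the weighed order.

    Hypothetical helper: the input shape is an assumption for this sketch.
    """
    grouped = OrderedDict()
    for host, cell in weighed_hosts:
        grouped.setdefault(cell, []).append(host)
    return grouped


def pick_with_alternates(weighed_hosts):
    """Return the best host, its cell, and retry candidates in that cell."""
    grouped = group_hosts_by_cell(weighed_hosts)
    best_host, best_cell = weighed_hosts[0]
    # Only hosts from the same cell are usable for retries in that cell.
    alternates = [h for h in grouped[best_cell] if h != best_host]
    return best_cell, best_host, alternates


# Cells A and B from the example above: x and y live in A, z lives in B.
weighed = [("x", "A"), ("z", "B"), ("y", "A")]
cell, host, alternates = pick_with_alternates(weighed)
print(cell, host, alternates)
```

With the two-cell example, the build goes to host x in cell A and only
y is passed down as a retry candidate; z never gets sent to cell A.
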
There was also some side discussion asking if we somehow regressed
pack-first strategies by using Placement in Ocata. John Garbutt and Dan
Smith have the context on this (I think) so I'm hoping they can clarify
whether we really need to fix something in Ocata at this point, or
whether this is more of a case of closing a loophole.

We also spent a good chunk of the session talking about overhead
calculations for memory_mb and disk_gb, which happen in the compute and
on a per-hypervisor basis. In the absence of automated ways to adjust
for overhead, our solution for now is that operators can adjust reserved host
resource values (vcpus, memory, disk) via config options and be
conservative or aggressive as they see fit. Chris Dent and I also noted
that you can adjust those reserved values via the placement REST API but
they will be overridden by the config in a periodic task, which may be
a bug, or at the very least a surprise to an operator.
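
As a rough illustration of what tuning the reserved values does,
Placement treats the usable capacity of an inventory as
(total - reserved) * allocation_ratio. The sketch below just applies
that formula; the function name and the host sizes are invented for the
example, and the reserved amounts are not recommendations.

```python
def usable_capacity(total, reserved, allocation_ratio=1.0):
    """Capacity the scheduler can hand out for one resource class."""
    return int((total - reserved) * allocation_ratio)


# Hypothetical compute host with 128 GiB of RAM.
total_memory_mb = 131072

# Conservative: hold back 8 GiB to absorb per-instance hypervisor overhead.
conservative = usable_capacity(total_memory_mb, reserved=8192)

# Aggressive: hold back only 512 MB and accept the risk of overcommit.
aggressive = usable_capacity(total_memory_mb, reserved=512)

print(conservative, aggressive)
```

The same arithmetic applies to vcpus and disk, each with its own
reserved value and allocation ratio.
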
We didn't really get into this during the forum session, but there are
different opinions within the nova dev team on how to do claims in the
controller services (conductor vs scheduler). Sylvain Bauza has a series
which uses the conductor service, and Ed Leafe has a series using the
scheduler. More on that on the mailing list.

Next steps are going to be weighing both options between Sylvain and Ed,
picking a path and moving forward, as we don't have a lot of time to sit
on this fence if we're going to get it done in Pike.

As a side request, it would be great if companies that have teams doing
performance and scale testing could help out and compare before (Ocata)
and after (Pike with claims in the controller) results. We eventually
want to deprecate the caching scheduler, but it currently outperforms
the filter scheduler at scale because of the retries involved when using
the filter scheduler, which we expect doing claims at the top will fix.