[openstack-dev] [nova][placement] Placement requests and caching in the resource tracker

Tetsuro Nakamura tnakamura.openstack at gmail.com
Mon Nov 5 14:55:07 UTC 2018


> Thus we should only read from placement:
> * at compute node startup
> * when a write fails
> And we should only write to placement:
> * at compute node startup
> * when the virt driver tells us something has changed


I agree with this.

We could also provide an interface for operators or other projects to force
nova to pull fresh information from placement into its cache, in
order to avoid predictable conflicts.
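
Something like the following is what I have in mind -- only a rough
sketch, and every name in it is hypothetical rather than an existing
nova API:

    # Hypothetical sketch: an operator-triggered invalidation of the
    # local provider view, so the next update re-reads placement.
    class ProviderViewCache(object):
        """Stands in for the data structure the report client keeps."""

        def __init__(self):
            self._providers = {}   # rp_uuid -> provider data + generation

        def invalidate(self, rp_uuid=None):
            # Could be called from something like a hypothetical
            # "nova-manage placement refresh [<rp_uuid>]" command.
            if rp_uuid is None:
                self._providers.clear()
            else:
                self._providers.pop(rp_uuid, None)

That way an operator who has just changed something out-of-band (for
example traits) can make the compute re-read placement before the next
write, instead of waiting for a failed write to trigger the refresh.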

> Is that right? If it is not right, can we do that? If not, why not?


The same question from me.
Could the periodic refresh strategy now become an optional optimization
for smaller clouds?

On Mon, 5 Nov 2018 at 20:53, Chris Dent <cdent+os at anticdent.org> wrote:

> On Sun, 4 Nov 2018, Jay Pipes wrote:
>
> > Now that we have generation markers protecting both providers and
> consumers,
> > we can rely on those generations to signal to the scheduler report
> client
> > that it needs to pull fresh information about a provider or consumer.
> So,
> > there's really no need to automatically and blindly refresh any more.
>
> I agree with this ^.
>
> I've been trying to tease out the issues in this thread and on the
> associated review [1] and I've decided that much of my confusion
> comes from the fact that we refer to the thing in the resource tracker
> as a "cache" and then talk about either trusting it more or not having
> it at all, and I think that's misleading. To me a "cache" has multiple
> clients and there's some need for reconciliation and invalidation
> amongst them. The thing that's in the resource tracker is in one
> process and changes to it are synchronized; it's merely a data structure.
>
> Some words follow where I try to tease things out a bit more (mostly
> for my own sake, but if it helps other people, great). At the very
> end there's a bit of a list of suggested todos for us to consider.
>
> What we have is a data structure which represents the resource
> tracker's and virt driver's current view of the providers and
> associates they are aware of. We maintain a boundary between the RT and
> the virt driver, which means there's "updating" going on that sometimes
> is a bit fussy to resolve (cf. recent adjustments to allocation
> ratio handling).
>
> In the old way, every now and again we get a bunch of info from
> placement to confirm that our view is right and try to reconcile
> things.
>
> What we're considering moving towards is only doing that "get a
> bunch of info from placement" when we fail to write to placement
> because of a generation conflict.
>
> Thus we should only read from placement:
>
> * at compute node startup
> * when a write fails
>
> And we should only write to placement:
>
> * at compute node startup
> * when the virt driver tells us something has changed
>
> Is that right? If it is not right, can we do that? If not, why not?
>
> Because generations change often, they guard against us making
> changes in ignorance and allow us to write blindly and only GET when
> we fail. We've got this everywhere now, let's use it. So, for
> example, even if something else besides the compute is adding
> traits, it's cool. We'll fail when we (the compute) try to clobber.
>
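For what it's worth, this is how I read that pattern for traits -- a
rough sketch against the placement REST API, where PLACEMENT and
HEADERS are just stand-ins (endpoint discovery, auth and any retry
budget are elided):

    import requests

    PLACEMENT = 'http://placement.example.com'        # assumed endpoint
    HEADERS = {'x-auth-token': 'TOKEN',                # assumed auth
               'openstack-api-version': 'placement 1.28'}

    def set_traits(rp_uuid, traits, cached_generation):
        # Write blindly with the generation we already have ...
        url = '%s/resource_providers/%s/traits' % (PLACEMENT, rp_uuid)
        payload = {'resource_provider_generation': cached_generation,
                   'traits': sorted(traits)}
        resp = requests.put(url, json=payload, headers=HEADERS)
        if resp.status_code == 409:
            # ... and only on a generation conflict do we GET the fresh
            # view and retry once with the new generation.
            fresh = requests.get(url, headers=HEADERS).json()
            payload['resource_provider_generation'] = fresh[
                'resource_provider_generation']
            resp = requests.put(url, json=payload, headers=HEADERS)
        resp.raise_for_status()
        return resp.json()['resource_provider_generation']

In nova proper the fresh view would of course feed back into what the
resource tracker decides to write rather than a blind retry, but the
shape is the same.
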
> Elsewhere in the thread several other topics were raised. A lot of
> that boils down to "what are we actually trying to do in the
> periodics?". As is often the case (and appropriately so) what we're
> trying to do has evolved and accreted in an organic fashion and it
> is probably time for us to re-evaluate and make sure we're doing the
> right stuff. The first step is writing that down. That aspect has
> always been pretty obscure or tribal to me, and I presume it is for others too.
> So doing a legit audit of that code and the goals is something we
> should do.
>
> Mohammed's comments about allocations getting out of sync are
> important. I agree with him that it would be excellent if we could
> go back to self-healing those, especially because of the "wait for
> the computes to automagically populate everything" part he mentions.
> However, that aspect, while related to this, is not quite the same
> thing. The management of allocations and the management of
> inventories (and "associates") is happening from different angles.
>
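On the allocation side, I imagine a self-healing pass for one consumer
would look roughly like this (same PLACEMENT/HEADERS stand-ins and
requests import as in the sketch above, microversion 1.28 consumer
generation semantics; how we compute 'expected' from the instance is
the hard part and is not shown):

    def heal_allocations(consumer_uuid, expected, project_id, user_id):
        url = '%s/allocations/%s' % (PLACEMENT, consumer_uuid)
        current = requests.get(url, headers=HEADERS).json()
        if current.get('allocations') == expected:
            return  # already in sync, nothing to write
        payload = {'allocations': expected,
                   'project_id': project_id,
                   'user_id': user_id,
                   # the consumer generation guards against racing with
                   # another writer, e.g. the scheduler during a move
                   'consumer_generation': current.get('consumer_generation')}
        resp = requests.put(url, json=payload, headers=HEADERS)
        resp.raise_for_status()  # a 409 here means re-GET and re-evaluate
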
> And finally, even if we turn off these refreshes to lighten the
> load, placement still needs to be capable of dealing with frequent
> requests, so we have something to fix there. We need to do the
> analysis to find out where the cost is and implement some solutions.
> At the moment we don't know where it is. It could be:
>
> * In the database server
> * In the python code that marshals the data around those calls to
>    the database
> * In the python code that handles the WSGI interactions
> * In the web server that is talking to the python code
>
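For the "where is the cost" question, one cheap first step to separate
the WSGI/web layer from everything below it is a timing wrapper around
the application -- a generic sketch, nothing placement-specific:

    import logging
    import time

    LOG = logging.getLogger(__name__)

    class TimingMiddleware(object):
        """Log wall-clock time per request; crude, but compared with
        database-side measurements it shows whether time is spent above
        or below the WSGI boundary."""

        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            start = time.time()
            try:
                return self.app(environ, start_response)
            finally:
                LOG.info('%s %s took %.4fs',
                         environ.get('REQUEST_METHOD'),
                         environ.get('PATH_INFO'),
                         time.time() - start)
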
> belmoreira's document [2] suggests some avenues of investigation
> (most CPU time is in user space and not waiting) but we'd need a bit
> more information to plan any concrete next steps:
>
> * what's the web server and which wsgi configuration?
> * where's the database, if it's different what's the load there?
>
> I suspect there's a lot we can do to make our code more correct and
> efficient. And beyond that there is a great deal of standard
> run-of-the-mill server-side caching and etag handling that we could
> implement if necessary. That is: treat placement like a web app that
> needs to be optimized in the usual ways.
>
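And since providers already carry a generation, the etag part could be
nearly free -- a generic sketch of the idea, not placement code, where
'generation' stands in for whatever cheap version marker the response
is derived from:

    import json

    def conditional_get(environ, start_response, generation, get_body):
        # 'get_body' builds the full JSON body only when we need it.
        etag = '"%s"' % generation
        if environ.get('HTTP_IF_NONE_MATCH') == etag:
            start_response('304 Not Modified', [('ETag', etag)])
            return [b'']
        body = json.dumps(get_body()).encode('utf-8')
        start_response('200 OK', [('ETag', etag),
                                  ('Content-Type', 'application/json'),
                                  ('Content-Length', str(len(body)))])
        return [body]
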
> As Eric suggested at the start of the thread, this kind of
> investigation is expected and normal. We've not done something
> wrong. "Make it work, make it correct, make it fast" is the process.
> We're oscillating somewhere between 2 and 3.
>
> So in terms of actions:
>
> * I'm pretty well situated to do some deeper profiling and
>    benchmarking of placement to find the elbows in that.
>
> * It seems like Eric and Jay are probably best situated to define
>    and refine what should really be going on with the resource
>    tracker and other actions on the compute-node.
>
> * We need to have further discussion and investigation on
>    allocations getting out of sync. Volunteers?
>
> What else?
>
> [1] https://review.openstack.org/#/c/614886/
> [2]
> https://docs.google.com/document/d/1d5k1hA3DbGmMyJbXdVcekR12gyrFTaj_tJdFwdQy-8E/edit
>
> --
> Chris Dent                       ٩◔̯◔۶           https://anticdent.org/
> freenode: cdent                                         tw: @anticdent
>