[dev][tc] Part 2: Evaluating projects in relation to OpenStack cloud vision

Chris Dent cdent+os at anticdent.org
Thu Feb 14 14:47:30 UTC 2019


On Thu, 14 Feb 2019, Jay Pipes wrote:

>> * No RPC, no messaging, no notifications.
>
> This is mostly just a historical artifact of wanting placement to be 
> single-purpose; not something that was actively sought after, though :)

I certainly sought it, and would have fought hard to keep RPC and
messaging out if we had ever found ourselves with the time to add
them. These days, given time constraints, these sorts of optional
nice-to-haves are easier to avoid because there are fewer people to
do them...

> I think having placement send event notifications would actually be A Good 
> Thing since it turns placement into a better cloud citizen, enabling 
> interested observers to trigger action instead of polling the placement API 
> for information.

I think some kind of event stream would be interesting, but there
are many ways to skin that cat. The current within-openstack
standards for such things are pretty heavyweight; better ways are on
the scene in the big wide world. By putting it off as long as
possible, we can take advantage of that new stuff.

>> * String adherence to WSGI norms (that is, any WSGI server can run a
>
> Strict adherence I think you meant? :)

My strictness is much better in WSGI than in typing.

> 1) Using generation markers for concurrent update mechanisms

I agree. I'm still conflicted over whether we should have exposed
them as ETags or not (mostly from an HTTP-love standpoint), but
overall they've made lots of stuff possible and easier.

> Finally, the use of generation markers means there is nowhere in either the 
> placement API nor its clients that use any locking semantics *at all*. No 
> mutexes. No semaphores. No "lock this thing" API call. None of that 
> heavyweight old skool concurrency.

Yeah. State handling (lack of) is nice.
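
For readers who have not met the pattern, here is a minimal sketch of
generation-guarded updates. It is an illustration only, not placement's
code: ResourceProvider, ConcurrentUpdate and update_with_retry are
hypothetical names, and the in-memory check stands in for what would
need to be an atomic conditional write in a real datastore.

```python
class ConcurrentUpdate(Exception):
    """Raised when the caller's generation is stale."""


class ResourceProvider:
    """Hypothetical in-memory record guarded by a generation marker."""

    def __init__(self, name):
        self.name = name
        self.generation = 0
        self.inventory = {}

    def set_inventory(self, inventory, generation):
        # Compare-and-swap: accept the write only if the caller has seen
        # the latest state. No lock is ever taken or held.
        if generation != self.generation:
            raise ConcurrentUpdate(
                "expected generation %d, got %d"
                % (self.generation, generation))
        self.inventory = dict(inventory)
        self.generation += 1
        return self.generation


def update_with_retry(provider, change, attempts=3):
    """Read, modify, write; on conflict re-read and try again."""
    for _ in range(attempts):
        seen = provider.generation
        new_inventory = change(dict(provider.inventory))
        try:
            return provider.set_inventory(new_inventory, seen)
        except ConcurrentUpdate:
            continue
    raise ConcurrentUpdate("gave up after %d attempts" % attempts)
```

The client's only obligation is to send back the generation it last
read; a writer that loses the race gets a conflict and starts over
rather than waiting on a lock.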

> 2) Separation of quantitative and qualitative things

Yes, very much agree.

>> * Because of a combination of "we might need it later", "it's a
>>    handy tool and constraint" and "that's the way we do things" the
>>    interface between the placement URL handlers and the database is
>>    mediated through oslo versioned objects. Since there's no RPC, nor
>>    inter-version interaction, this is overkill. It also turns out that
>>    OVO getters and setters are a moderate factor in performance.
>
> Data please.

When I wrote that bullet I just had some random profiling data from
running a profiler during a bunch of requests, which made it clear
that some ovo methods (in the getters and setters) were being called
a ton (in large part because of the number of objects involved in an
allocation candidates response). I didn't copy that down anywhere at
the time because I planned to do it more formally.
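
As a rough illustration of that kind of ad hoc profiling, the sketch
below counts calls with the standard library's cProfile. It is not the
profiling that was actually run: Field, Candidate and build_response
are hypothetical stand-ins, with a plain descriptor playing the part of
a managed getter/setter.

```python
import cProfile
import pstats


class Field:
    """Hypothetical descriptor standing in for a managed getter/setter."""

    def __init__(self, name):
        self.attr = "_" + name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return getattr(obj, self.attr)

    def __set__(self, obj, value):
        setattr(obj, self.attr, value)


class Candidate:
    capacity = Field("capacity")
    used = Field("used")

    def __init__(self, capacity, used):
        self.capacity = capacity
        self.used = used


def build_response(count):
    """Build many small objects and read two fields from each."""
    candidates = [Candidate(100, i % 100) for i in range(count)]
    return sum(c.capacity - c.used for c in candidates)


def call_counts(func, *args):
    """Run func under cProfile and return (result, calls per function name)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    stats = pstats.Stats(profiler)
    counts = {}
    for (_file, _line, name), (_cc, ncalls, _tt, _ct, _callers) in stats.stats.items():
        counts[name] = counts.get(name, 0) + ncalls
    return result, counts
```

Calling call_counts(build_response, 1000) and looking at the entries
for __get__ and __set__ shows how quickly per-field accessor calls
multiply with the number of objects in a response.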

Since then, I've made this:

https://review.openstack.org/#/c/636631/

That's a stack which removes OVO from placement. While we know the
perfload job is not scientific, it does provide a nice guide. An
ovo-using patch
<http://logs.openstack.org/95/633595/2/check/placement-perfload/267131a/logs/placement-perf.txt.gz>
has perfload times of 2.65-ish (seconds).

The base of that OVO removal stack (which changes allocation
candidates)
<http://logs.openstack.org/31/636631/4/check/placement-perfload/a413724/logs/placement-perf.txt>
is 2.3-ish.

The end of it
<http://logs.openstack.org/07/636807/2/check/placement-perfload/fa7d58f/logs/placement-perf.txt>
is 1.5-ish.

And there are ways in which the code is much more explicit. There's
plenty of cleanup to do, and I'm not wed to us making that change if
people aren't keen, but I can see a fair number of reasons above and
beyond performance to do it, though performance alone might be
enough. Lots more info in the commits and comments in that stack.

>> * Despite the strict adherence to being a good WSGI citizen
>>    mentioned above, placement is using a custom (very limited)
>>    framework for the WSGI application. An initial proof of concept
>>    used flask but it was decided that introducing flask into the nova
>>    development environment would be introducing another thing to know
>>    when decoding nova. I suspect the expected outcome was that
>>    placement would reuse nova's framework, but the truth is I simply
>>    couldn't do it. Declarative URL dispatch was a critical feature
>>    that has proven worth it. The resulting code is relatively
>>    straightforward but it is unicorn where a boring pony would have
>>    been the right thing. Boring ponies are very often the right
>>    thing.
>
> Not sure I agree with this. The simplicity of the placement WSGI 
> (non-)framework is a benefit. We don't need to mess with it. Really, it 
> hasn't been an issue at all.

I agree that it is very hands off now, and not worth changing, but
as an example for new projects, it is something to think about. It
had creation costs in various forms. If there hadn't been a me around
(many custom non-frameworks under my belt) it would have been harder
to create something (and then manage it, maintain it, and educate
people about it). sdague and I nearly came to metaphorical blows over
it. If it were just normal to use the boring pony, such things
wouldn't need to happen.
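
To make "declarative URL dispatch" concrete, here is a minimal
standard-library sketch of the idea. It is not placement's handler
code: the handler names, the ROUTES table and the response bodies are
all hypothetical, and regular expressions are used here only to keep
the example free of third-party dependencies.

```python
import json
import re


def _respond(start_response, status, body):
    payload = json.dumps(body).encode("utf-8")
    start_response(status, [("Content-Type", "application/json"),
                            ("Content-Length", str(len(payload)))])
    return [payload]


def list_providers(environ, start_response, **_params):
    return _respond(start_response, "200 OK", {"resource_providers": []})


def show_provider(environ, start_response, uuid):
    return _respond(start_response, "200 OK", {"uuid": uuid})


# The whole URL surface is declared in one table:
# pattern -> {HTTP method: handler}.
ROUTES = [
    (re.compile(r"^/resource_providers$"), {"GET": list_providers}),
    (re.compile(r"^/resource_providers/(?P<uuid>[^/]+)$"),
     {"GET": show_provider}),
]


def application(environ, start_response):
    """Plain WSGI callable; any WSGI server can run it."""
    path = environ.get("PATH_INFO", "")
    method = environ.get("REQUEST_METHOD", "GET")
    for pattern, handlers in ROUTES:
        match = pattern.match(path)
        if match is None:
            continue
        handler = handlers.get(method)
        if handler is None:
            # Known URL, unsupported method: say which methods are allowed.
            start_response("405 Method Not Allowed",
                           [("Allow", ", ".join(sorted(handlers))),
                            ("Content-Length", "0")])
            return [b""]
        return handler(environ, start_response, **match.groupdict())
    return _respond(start_response, "404 Not Found", {"error": "not found"})
```

Because application is an ordinary WSGI callable, it can be tried
locally with wsgiref.simple_server.make_server from the standard
library, or handed to any other WSGI server. The point of the table is
that 404 and 405 handling fall out of the declaration instead of being
repeated in every handler.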

> Placement allocations currently have a distinct lack of temporal awareness. 
> An allocation either exists or doesn't exist -- there is no concept of an 
> allocation "end time". What this means is that placement cannot be used for a 
> reservation system. I used to think this was OK, and that reservation systems 
> should be layered on top of the simpler placement data model.

Yeah, I was thinking about this recently too. Trying to come up with
conceptual hacks that would make it possible without drastically
changing the existing data model. There's stuff percolating in my
brain, potentially as weird as infinite resource classes but maybe
not, but nothing has gelled.

I hope, at least, that we can get the layered on top stuff working
well.

> Best,
> -jay

Thanks very much for chiming in here, I hope other people will too.

>> I'm sure there are more here, but I've run out of brain.

One thing that came up in the TC discussions [1] related to placement
governance [2] was that, given there have been bumps in the
extraction road, it might be useful to also document the learnings
from that. The main one, from my perspective, is:

If there's any inkling that a new service (something with what might
be described as a public interface) is ever going to be
extracted, start it outside from the outset, but make sure the
people involved overlap.

[1] http://eavesdrop.openstack.org/irclogs/%23openstack-tc/%23openstack-tc.2019-02-12.log.html
[2] https://review.openstack.org/#/c/636416/

-- 
Chris Dent                       ٩◔̯◔۶           https://anticdent.org/
freenode: cdent                                         tw: @anticdent

