[openstack-dev] [Nova] Concerns around the Extensible Resource Tracker design - revert maybe?
Nikola Đipanov
ndipanov at redhat.com
Tue Aug 12 16:54:30 UTC 2014
On 08/12/2014 04:49 PM, Sylvain Bauza wrote:
> (sorry for reposting, missed 2 links...)
>
> Hi Nikola,
>
> Le 12/08/2014 12:21, Nikola Đipanov a écrit :
>> Hey Nova-istas,
>>
>> While I was hacking on [1] I was considering how to approach the fact
>> that we now need to track one more thing (NUMA node utilization) in our
>> resources. I went with - "I'll add it to compute nodes table" thinking
>> it's a fundamental enough property of a compute host that it deserves to
>> be there, although I was considering Extensible Resource Tracker at one
>> point (ERT from now on - see [2]) but looking at the code - it did not
>> seem to provide anything I desperately needed, so I went with keeping it
>> simple.
>>
>> So fast-forward a few days, and I caught myself solving a problem that I
>> kept thinking ERT should have solved - but apparently hasn't, and I
>> think it is fundamentally a broken design without it - so I'd really
>> like to see it re-visited.
>>
>> The problem can be described by the following lemma (if you take 'lemma'
>> to mean 'a sentence I came up with just now' :)):
>>
>> """
>> Due to the way scheduling works in Nova (roughly: pick a host based on
>> stale(ish) data, rely on claims to trigger a re-schedule), _same exact_
>> information that scheduling service used when making a placement
>> decision, needs to be available to the compute service when testing the
>> placement.
>> """
>>
>> This is not the case right now, and the ERT does not propose any way to
>> solve it - (see how I hacked around needing to be able to get
>> extra_specs when making claims in [3], without hammering the DB). The
>> result will be that any resource that we add and needs user supplied
>> info for scheduling an instance against it, will need a buggy
>> re-implementation of gathering all the bits from the request that
>> scheduler sees, to be able to work properly.
>
> Well, ERT does provide a plugin mechanism for testing resources at the
> claim level. This is the plugin responsibility to implement a test()
> method [2.1] which will be called when test_claim() [2.2]
>
> So, provided this method is implemented, a local host check can be done
> based on the host's view of resources.
>
>
Yes - the problem is there is no clear API to get all the needed bits to
do so - especially the user supplied one from image and flavors.
On top of that, in current implementation we only pass a hand-wavy
'usage' blob in. This makes anyone wanting to use this in conjunction
with some of the user supplied bits roll their own
'extract_data_from_instance_metadata_flavor_image' or similar which is
horrible and also likely bad for performance.
>> This is obviously a bigger concern when we want to allow users to pass
>> data (through image or flavor) that can affect scheduling, but still a
>> huge concern IMHO.
>
> And here is where I agree with you : at the moment, ResourceTracker (and
> consequently Extensible RT) only provides the view of the resources the
> host is knowing (see my point above) and possibly some other resources
> are missing.
> So, whatever your choice of going with or without ERT, your patch [3]
> still deserves it if we want not to lookup DB each time a claim goes.
>
>
>> As I see that there are already BPs proposing to use this IMHO broken
>> ERT ([4] for example), which will surely add to the proliferation of
>> code that hacks around these design shortcomings in what is already a
>> messy, but also crucial (for perf as well as features) bit of Nova code.
>
> Two distinct implementations of that spec (ie. instances and flavors)
> have been proposed [2.3] [2.4] so reviews are welcome. If you see the
> test() method, it's no-op thing for both plugins. I'm open to comments
> because I have the stated problem : how can we define a limit on just a
> counter of instances and flavors ?
>
Will look at these - but none of them seem to hit the issue I am
complaining about, and that is that it will need to consider other
request data for claims, not only data available for on instances.
Also - the fact that you don't implement test() in flavor ones tells me
that the implementation is indeed racy (but it is racy atm as well) and
two requests can indeed race for the same host, and since no claims are
done, both can succeed. This is I believe (at least in case of single
flavor hosts) unlikely to happen in practice, but you get the idea.
>
>
>> I propose to revert [2] ASAP since it is still fresh, and see how we can
>> come up with a cleaner design.
>>
>> Would like to hear opinions on this, before I propose the patch tho!
>
> IMHO, I think the problem is more likely that the regular RT misses some
> information for each host so it requires to handle it on a case-by-case
> basis, but I don't think ERT either increases complexity or creates
> another issue.
>
RT does not miss info about the host, but about the particular request
which we have to fish out of different places like image_metadata
extra_specs etc, yet - it can't really work without them. This is
definitely a RT issue that is not specific to ERT.
However, I still see several issues with the current ERT implementation,
but the one I am getting at here, and why I think we should revert it,
is that we are building a 3rd party plugin API that is tightly coupled
to an obviously flawed internal API (RT and Claims).
We have no policy AFIK about what guarantees we provide to 3rd party
plugins, but if we are going to be breaking them all the time, or in
this case providing very limited usefulness - I see little value in the
current implementation of ERT, and see issues with it staying, which
means future work will be based on it.
N.
>
> Thanks,
> -Sylvain
>
>> Thanks all,
>>
>> Nikola
>>
>> [1]
>> https://blueprints.launchpad.net/nova/+spec/virt-driver-numa-placement
>> [2] https://review.openstack.org/#/c/109643/
>> [3] https://review.openstack.org/#/c/111782/
>> [4] https://review.openstack.org/#/c/89893
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> [2.1]
> https://github.com/openstack/nova/blob/master/nova/compute/resources/__init__.py#L75
>
> [2.2]
> https://github.com/openstack/nova/blob/master/nova/compute/claims.py#L134
> [2.3] https://review.openstack.org/112578
> [2.4] https://review.openstack.org/113373
>
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
More information about the OpenStack-dev
mailing list