[openstack-dev] [Nova][FFE] Feature freeze exception for virt-driver-numa-placement

Gary Kotton gkotton at vmware.com
Thu Sep 4 13:51:17 UTC 2014



On 9/4/14, 4:30 PM, "Sean Dague" <sean at dague.net> wrote:

>On 09/04/2014 09:21 AM, Daniel P. Berrange wrote:
>> On Thu, Sep 04, 2014 at 03:07:24PM +0200, Nikola Đipanov wrote:
>>> On 09/04/2014 02:31 PM, Sean Dague wrote:
>>>> On 09/04/2014 07:58 AM, Nikola Đipanov wrote:
>>>>> Hi team,
>>>>>
>>>>> I am requesting the exception for the feature from the subject (find
>>>>> specs at [1] and outstanding changes at [2]).
>>>>>
>>>>> Some reasons why we may want to grant it:
>>>>>
>>>>> First of all, all patches were approved in time and just lost the
>>>>> gate race.
>>>>>
>>>>> Rejecting it makes little sense really, as it has been commented on by
>>>>> a good chunk of the core team, most of the invasive stuff (db migrations
>>>>> for example) has already merged, and the few parts that may seem
>>>>> contentious have either been discussed and agreed upon [3], or can
>>>>> easily be addressed in subsequent bug fixes.
>>>>>
>>>>> It would be very beneficial to merge it so that we actually get real
>>>>> testing on the feature ASAP (scheduling features are not tested in the
>>>>> gate so we need to rely on downstream/3rd party/user testing for those).
>>>>
>>>> This statement bugs me. It seems kind of backwards to say we should
>>>> merge a thing that we don't have a good upstream test plan on and put it
>>>> in a release so that the testing will happen only in the downstream case.
>>>>
>>>
>>> The objective reality is that many other things have not had upstream
>>> testing for a long time (anything that requires more than 1 compute node
>>> in Nova for example, and any scheduling feature - as I mention clearly
>>> above), so not sure how that is backwards from any reasonable point.
>> 
>> More critically with the NUMA feature, AFAIK, there is no public cloud in
>> existence which exposes NUMA to the guest. So unless someone is willing
>> to pay for 100s of bare metal servers to run tempest on, I don't know
>> of any infrastructure on which we can test NUMA today.
>> 
>> Of course once we include NUMA features in Nova and release Nova, then
>> the Rackspace and/or HP clouds will be in a position to start considering
>> how & when they might expose NUMA features for instances they host. So by
>> including it in Nova today, we would be helping move towards a future
>> where we will be able to run tempest against NUMA features.
>> 
>> Blocking NUMA from Nova for lack of automated testing will leave us trapped
>> in a chicken and egg scenario, potentially forever. That's not in anyone's
>> best interests IMHO.
>
>The spec specifically calls out the scheduler piece being the part that
>probably most needs to be tested, especially at large scales here. Those
>pieces don't need Tempest to test them, they need more solid functional
>tests around the scheduler under those circumstances.
>
>There are interesting (and not all that difficult) ways to do this given
>the resources we have, which don't seem to be being explored, which is
>my concern.

I share your concern with this feature. I stated it on review
https://review.openstack.org/#/c/115007/ in PS 16. I think that we have
well-known scheduling issues and these will be accentuated by a feature
like this. My feeling is that this feature and the PCI feature are both
going to be problematic at scale.

My reservation is that even when the feature is not enabled, a lot of
unnecessary data will be passed between hosts and the scheduler. This is
why we should have gone with the extensible resources, but that is opening
a can of worms.

Having said that I think that Nova needs features like this. I am in favor
of moving ahead with this for a number of reasons:
1. The filter is not enabled by default
2. We can fix things moving forwards

So I am +1 on this. If we can document that it is experimental or use at
your own risk then I am +2. But I think that because the admin needs to
configure the filter, she/he knows it is at their own risk.

A luta continua


>
>	-Sean
>
>-- 
>Sean Dague
>http://dague.net/


