[openstack-dev] [Nova] [Gantt] Scheduler split status (updated)

Debojyoti Dutta ddutta at gmail.com
Tue Jul 15 15:45:39 UTC 2014


https://etherpad.openstack.org/p/SchedulerUseCases

[08:43:35] <n0ano> #action all update the use case etherpad at
https://etherpad.openstack.org/p/SchedulerUseCases

Please update your use cases here.

thx
debo

On Tue, Jul 15, 2014 at 2:50 AM, Sylvain Bauza <sbauza at redhat.com> wrote:
> On 14/07/2014 20:10, Jay Pipes wrote:
>> On 07/14/2014 10:16 AM, Sylvain Bauza wrote:
>>> On 12/07/2014 06:07, Jay Pipes wrote:
>>>> On 07/11/2014 07:14 AM, John Garbutt wrote:
>>>>> On 10 July 2014 16:59, Sylvain Bauza <sbauza at redhat.com> wrote:
>>>>>> On 10/07/2014 15:47, Russell Bryant wrote:
>>>>>>> On 07/10/2014 05:06 AM, Sylvain Bauza wrote:
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> === tl;dr: Now that we agree on waiting for the split
>>>>>>>> prereqs to be done, we are debating whether ResourceTracker
>>>>>>>> should be part of the scheduler code, and consequently whether
>>>>>>>> the Scheduler should expose ResourceTracker APIs so that Nova
>>>>>>>> wouldn't own compute node resources. I'm proposing to first
>>>>>>>> keep RT as a Nova resource in Juno and move ResourceTracker
>>>>>>>> into the Scheduler for K, so we at least merge some patches by
>>>>>>>> Juno. ===
>>>>>>>>
>>>>>>>> Some debates occurred recently about the scheduler split, so
>>>>>>>> I think it's important to loop back with you all to see
>>>>>>>> where we are and what is being discussed. Again, feel free
>>>>>>>> to express your opinions, they are welcome.
>>>>>>> Where did this resource tracker discussion come up?  Do you
>>>>>>> have any references that I can read to catch up on it?  I
>>>>>>> would like to see more detail on the proposal for what should
>>>>>>> stay in Nova vs. be moved.  What is the interface between
>>>>>>> Nova and the scheduler here?
>>>>>>
>>>>>> Oh, missed the most important question you asked. So, about
>>>>>> the interface between the scheduler and Nova, the originally
>>>>>> agreed proposal is in the spec
>>>>>> https://review.openstack.org/82133 (approved), where the
>>>>>> Scheduler exposes:
>>>>>>  - select_destinations(): for querying the scheduler to provide
>>>>>>    candidates
>>>>>>  - update_resource_stats(): for updating the scheduler's
>>>>>>    internal state (i.e. HostState)
>>>>>>
>>>>>> Here, update_resource_stats() is called by the
>>>>>> ResourceTracker, see the implementations (in review)
>>>>>> https://review.openstack.org/82778 and
>>>>>> https://review.openstack.org/104556.
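
A minimal sketch of the two calls named in the approved spec quoted
above, assuming an in-memory scheduler. Only the method names
select_destinations() and update_resource_stats() and the HostState
name come from the thread; the class SchedulerSketch, the argument
lists, and the free_ram_mb / free_vcpus fields are hypothetical and
are not taken from the spec or from Nova.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HostState:
    """Scheduler-side view of one compute node (fields are hypothetical)."""
    free_ram_mb: int = 0
    free_vcpus: int = 0


@dataclass
class SchedulerSketch:
    """Illustrative stand-in for the scheduler, not the Nova code."""
    hosts: Dict[str, HostState] = field(default_factory=dict)

    def update_resource_stats(self, host: str, stats: Dict[str, int]) -> None:
        # The compute-side ResourceTracker pushes its latest stats here.
        self.hosts[host] = HostState(**stats)

    def select_destinations(self, ram_mb: int, vcpus: int) -> List[str]:
        # Keep hosts that can fit the request, most free RAM first.
        fits = [
            (name, state) for name, state in self.hosts.items()
            if state.free_ram_mb >= ram_mb and state.free_vcpus >= vcpus
        ]
        fits.sort(key=lambda item: item[1].free_ram_mb, reverse=True)
        return [name for name, _ in fits]


scheduler = SchedulerSketch()
scheduler.update_resource_stats("node1", {"free_ram_mb": 2048, "free_vcpus": 2})
scheduler.update_resource_stats("node2", {"free_ram_mb": 8192, "free_vcpus": 4})
candidates = scheduler.select_destinations(ram_mb=4096, vcpus=2)
```

In this shape the scheduler only stores what it is told and answers
placement queries; the resources themselves stay tracked on the
compute side.
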
>>>>>>
>>>>>> The alternative that has just been raised this week is to
>>>>>> provide a new interface where the ComputeNode claims resources
>>>>>> and frees them, so that all the resources are fully owned by
>>>>>> the Scheduler. An initial PoC has been posted here:
>>>>>> https://review.openstack.org/103598, but I tried to see what a
>>>>>> ResourceTracker proxied by a Scheduler client would look like
>>>>>> here: https://review.openstack.org/105747. As the spec hasn't
>>>>>> been written, the names of the interfaces are not properly
>>>>>> defined, but I made a proposal as:
>>>>>>  - select_destinations(): same as above
>>>>>>  - usage_claim(): claims a resource amount
>>>>>>  - usage_update(): updates a resource amount
>>>>>>  - usage_drop(): frees the resource amount
>>>>>>
>>>>>> Again, this is a dummy proposal; a spec has to be written if we
>>>>>> consider moving the RT.
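
A minimal sketch of the claim-based alternative quoted above, where
the scheduler itself owns the resource accounting. Only the names
usage_claim(), usage_update() and usage_drop() come from the thread;
the class ClaimingSchedulerSketch, the claim_id / host / amount_mb
parameters, the free_mb() helper, and the single "RAM in MB"
resource are assumptions made for illustration, since no spec
defines these signatures.

```python
import threading
from typing import Dict, Tuple


class ClaimingSchedulerSketch:
    """Illustrative scheduler that owns per-host free capacity."""

    def __init__(self, capacity_mb: Dict[str, int]) -> None:
        self._free = dict(capacity_mb)
        self._claims: Dict[str, Tuple[str, int]] = {}  # claim_id -> (host, MB)
        self._lock = threading.Lock()  # one lock serializes all accounting

    def usage_claim(self, claim_id: str, host: str, amount_mb: int) -> bool:
        # Check and decrement under the same lock so two requests
        # cannot both take the last free capacity on a host.
        with self._lock:
            if claim_id in self._claims or self._free.get(host, 0) < amount_mb:
                return False
            self._free[host] -= amount_mb
            self._claims[claim_id] = (host, amount_mb)
            return True

    def usage_update(self, claim_id: str, amount_mb: int) -> bool:
        # Resize an existing claim if the host can absorb the change.
        with self._lock:
            if claim_id not in self._claims:
                return False
            host, old_mb = self._claims[claim_id]
            if self._free[host] + old_mb < amount_mb:
                return False
            self._free[host] += old_mb - amount_mb
            self._claims[claim_id] = (host, amount_mb)
            return True

    def usage_drop(self, claim_id: str) -> None:
        # Release the claim and return its amount to the host.
        with self._lock:
            claim = self._claims.pop(claim_id, None)
            if claim is not None:
                host, amount_mb = claim
                self._free[host] += amount_mb

    def free_mb(self, host: str) -> int:
        with self._lock:
            return self._free[host]


sched = ClaimingSchedulerSketch({"node1": 4096})
first = sched.usage_claim("vm-a", "node1", 3072)
second = sched.usage_claim("vm-b", "node1", 2048)  # does not fit any more
resized = sched.usage_update("vm-a", 1024)
sched.usage_drop("vm-a")
free_after = sched.free_mb("node1")
```

The point of the sketch is only that check-and-decrement happen in
one place; a real service would need persistent state rather than a
process-local lock.
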
>>>>>
>>>>> While I am not against moving the resource tracker, I feel we
>>>>> could move this to Gantt after the core scheduling has been
>>>>> moved.
>>>>
>>>> Big -1 from me on this, John.
>>>>
>>>> Frankly, I see no urgency whatsoever -- and actually very little
>>>> benefit -- to moving the scheduler out of Nova. The Gantt project I
>>>> think is getting ahead of itself by focusing on a split instead of
>>>> focusing on cleaning up the interfaces between nova-conductor,
>>>> nova-scheduler, and nova-compute.
>>>>
>>>
>>> -1 on saying there is no urgency. Don't you see the NFV group asking
>>> at each meeting about the status of the scheduler split?
>>
>> Frankly, I don't think a lot of the NFV use cases are well-defined.
>>
>> Even more frankly, I don't see any benefit to a split-out scheduler to
>> a single NFV use case.
>
> I don't know if you can, but if you're interested in giving feedback to
> the NFV team, we run a weekly meeting on #openstack-meeting-alt every
> Wednesday at 2pm UTC.
>
> You can find a list of all the associated blueprints here:
> https://wiki.openstack.org/wiki/Teams/NFV#Active_Blueprints. That list
> is processed hourly by a backend script that generates a Gerrit
> dashboard, accessible here: http://nfv.russellbryant.net
>
> That said, you can find
> https://blueprints.launchpad.net/nova/+spec/solver-scheduler as a
> possible use case for NFV.
> As Paul and Yathi said, there is a need for a global resource placement
> engine able to cope with both network and compute resources if we want
> to support NFV use cases. That seems quite clear to me, and that's
> why I joined the NFV team.
>
>>
>>> Don't you see, at each Summit, the many talks (and the people
>>> attending them) about how OpenStack should look at Pets vs. Cattle,
>>> saying that the scheduler should be out of Nova?
>>
>> There have been no concrete benefits discussed to having the scheduler
>> outside of Nova.
>>
>> I don't really care how many people say that the scheduler should be
>> out of Nova unless those same people come to the table with concrete
>> reasons why. Just saying something is a benefit does not make it a
>> benefit, and I think I've outlined some of the very real dangers -- in
>> terms of code and payload complexity -- of breaking the scheduler out
>> of Nova until the interfaces are cleaned up and the scheduler actually
>> owns the resources upon which it exercises placement decisions.
>>
>>> From an operator perspective, people have waited a long time for a
>>> scheduler that does "scheduling" and not only "resource placement".
>>
>> Could you elaborate a bit here? What operators are begging for the
>> scheduler to do more than resource placement? And if they are begging
>> for this, what use cases are they trying to address?
>>
>> I'm genuinely curious, so looking forward to your reply here! :)
>
> Sure, I don't keep track of all the talks and presentations related to
> the scheduler, but I remember these:
>
>  - Solver Scheduler (see Yathi's email
> http://lists.openstack.org/pipermail/openstack-dev/2014-July/040255.html )
>  - Simultaneous Scheduling for Server Groups (
> http://summit.openstack.org/cfp/details/400,
> https://blueprints.launchpad.net/nova/+spec/simultaneous-server-group
> and https://etherpad.openstack.org/p/juno-nova-scheduling-server-groups )
>  - Enterprise-Grade Scheduling (
> https://www.openstack.org/summit/openstack-summit-atlanta-2014/session-videos/presentation/enterprise-grade-scheduling-enterprise-grade-openstack-from-a-scheduler-perspective
> )
>  - Policy-based Resource Scheduling (
> http://openstacksummitmay2014atlanta.sched.org/event/b4313b37de4645079e3d5506b1d725df#.U8TvQPGOqao,
> sorry can't find the slides)
>
>
> All of these discussions presented the need for a holistic
> scheduler able to address various metrics from heterogeneous
> worlds. In particular, some of these talks raised the
> possibility of having a "scheduler" (and not a "resource placer"), i.e.
> something acting like a maestro and handling the scheduling requests in
> a certain order.
>
> As was agreed during the Icehouse and Juno summits, these use cases
> are too big to fit in Nova and must reside in a separate project,
> hence Gantt.
>
>
>
>>
>> snip...
>>
>>>> As for the idea that things will get *easier* once scheduler code
>>>> is broken out of Nova, I go back to my original statement that I
>>>> don't really see the benefit of the split at this point, and I
>>>> would just bring up the fact that Neutron/nova-network is a shining
>>>> example of how things can easily backfire when splitting of code is
>>>> done too early, before interfaces are cleaned up and
>>>> responsibilities between internal components are clearly agreed
>>>> upon.
>>>
>>> Please, please, don't mix the rationale for an extensible Resource
>>> Tracker with the current efforts to move out the Scheduler. Both
>>> aim for an agnostic and heterogeneous scheduler, but the two
>>> efforts are independent.
>>>
>>> The ResourceTracker is purely a Nova thing. Saying to Gantt "I want
>>> to store this data" and "I want you to select a destination" is
>>> agnostic enough that it doesn't require porting the
>>> ResourceTracker to the Scheduler.
>>
>> Sorry, I'm not following you. Who is saying to Gantt "I want to store
>> this data"?
>>
>> All I am saying is that the thing that places a resource on some
>> provider of that resource should be the thing that owns the process of
>> a requester *claiming* the resources on that provider, and in order to
>> properly track resources in a race-free way in such a system, then the
>> system needs to contain the resource tracker.
>>
>
> That's where we diverge : I'm thinking about what we should express for
> Gantt, not how the implementation is done.
>
> Provided we agree that Gantt (or the scheduler, if you wish)
> needs to track the metrics in order to make decisions, and provided we
> agree that the scheduler should not query the computes to get these
> metrics, then the interface between the computes and the
> scheduler has to be "Gantt, please take these metrics and store them in
> your own datastore".
>
> This interface is clear enough to be defined : it accepts a unique
> identifier for the resource to be stored or updated, and it also accepts
> a set of metrics.
>
> Whether we keep track of the resources in Nova or not (i.e. whether we
> accept a ResourceTracker or not) just moves a line without modifying
> its interface: the scheduler has to accept metrics coming from other
> components, and that's it.
>
>
> Of course, there is another interface for what you call "claiming":
> "hey, Gantt, could you please select for me a set of available resources
> matching the needs I express?". With the current Scheduler
> implementation, this decision is made internally by the Scheduler, so
> the current select_destinations call is a good candidate for expressing
> those needs.
>
> As Paul said in his email, the current ResourceTracker and Scheduler
> implementation is not that great, and some drawbacks can be observed
> (persistent claiming, possibly incorrect decisions based on delayed
> information), so I fully admit that there is room for improvement.
> But if you look at my 3-step proposal, I was just saying that these
> improvements are not prerequisites for a split (because they don't block
> the split) and can be done once the prerequisites are satisfied, as they
> don't change the interface after all.
>
>
>
>>> While I approve of defining the interfaces now, there is no reason
>>> to say we would have to change anything in how Nova does that.
>>> The role of Gantt is to define the interfaces, draw the line between
>>> Scheduler and Nova, and forklift the Scheduler into a single project.
>>> No big bang is needed here.
>>
>> Yeah, I just don't see the need to split the scheduler at this point,
>> sorry. :(
>>
>
> tl;dr: I'm just saying that the level of change required for doing that
> is far greater than what was previously targeted and is not a
> requirement for the split, so it can be done later on.
>
> Best,
> -Sylvain
>
>
>> Best,
>> -jay
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



-- 
-Debo~


