[openstack-dev] [nova] RT/Scheduler summit summary and Kilo development plan
Dugger, Donald D
donald.d.dugger at intel.com
Tue Nov 18 00:43:38 UTC 2014
Good, detailed summary, it pretty much matches with what I heard.
The one thing I want to add is that I've set up a wiki page to track our progress (I find email/etherpads deficient for this task). I've tried to include everything from the email threads and the summit etherpads at:
If I've missed anything let me know and I'll update the wiki.
"Censeo Toto nos in Kansa esse decisse." - D. Gale
From: Jay Pipes [mailto:jaypipes at gmail.com]
Sent: Monday, November 17, 2014 8:59 AM
To: OpenStack Development Mailing List; Michael Still; sgordon at redhat.com; Dugger, Donald D
Subject: [nova] RT/Scheduler summit summary and Kilo development plan
Good morning Stackers,
At the summit in Paris, we put together a plan for work on the Nova resource tracker and scheduler in the Kilo timeframe. A large number of contributors across many companies are all working on this particular part of the Nova code base, so it's important that we keep coordinated and updated on the overall efforts. I'll work together with Don Dugger this cycle to make sure we make steady, measured progress. If you are involved in this effort, please do be sure to attend the weekly scheduler IRC meetings (Tuesdays @ 1500UTC on #openstack-meeting).
== Decisions from Summit ==
The following decisions were made at the summit session:
1) The patch series for virt CPU pinning and huge page support shall not be approved until nova/virt/hardware.py is modified to use nova.objects as its serialization/domain object model. Jay is responsible for the conversion patches, and this patch series should be fully proposed by end of this week.
2) We agreed on the concepts introduced by the resource-objects blueprint, with a caveat that child object versioning be discussed in greater depth with Jay, Paul, and Dan Smith.
3) We agreed on all concepts and implementation from the 2 isolate-scheduler-db blueprints: aggregates and instance groups.
4) We agreed on implementation and need for separating compute node object from the service object.
5) We agreed on concept and implementation for converting the request spec from a dict to a versioned object as well as converting select_destinations() to use said object.
6) We agreed on the need for returning a proper object from the virt driver's get_available_resource() method but AFAICR, we did not say that this object needed to use nova/objects because this is an interface internal to the virt layer and resource tracker, and the ComputeNode nova.object will handle the setting of resource-related fields properly.
7) We agreed the unit tests for the resource tracker were, well, crappy, and are a real source of pain in making changes to the resource tracker itself. So, we resolved to fix them up in early Kilo-1.
8) We are not interested in adding any additional functionality to the scheduler outside already-agreed NUMA blueprint functionality in Kilo.
The goal is to get the scheduler fully independent of the Nova database, and communicating with nova-conductor and nova-compute via fully versioned interfaces by the end of Kilo, so that a split of the scheduler can occur at the start of the L release cycle.
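To illustrate decision 5, here is a minimal sketch of what moving a request spec from a free-form dict to a versioned object could look like. It uses only the Python standard library and is not Nova's code: the class name RequestSpecSketch, its fields, and the "name/version/data" primitive layout are all assumptions made up for this example.

```python
from dataclasses import asdict, dataclass


@dataclass
class RequestSpecSketch:
    """Hypothetical stand-in for a versioned request spec (not Nova's class)."""

    # Illustrative fields only; the real request spec carries far more.
    num_instances: int = 1
    vcpus: int = 1
    memory_mb: int = 512

    # Unannotated class attribute, so the dataclass does not treat it as a field.
    VERSION = "1.0"

    @classmethod
    def from_legacy_dict(cls, spec):
        """Build the object from an untyped dict, rejecting unknown keys."""
        unknown = set(spec) - set(cls.__dataclass_fields__)
        if unknown:
            raise ValueError("unknown request spec keys: %s" % sorted(unknown))
        return cls(**spec)

    def to_primitive(self):
        """Serialize to a plain dict that carries an explicit version."""
        return {
            "name": type(self).__name__,
            "version": self.VERSION,
            "data": asdict(self),
        }

    @classmethod
    def from_primitive(cls, primitive):
        """Rebuild the object, refusing a different major version."""
        sent_major = primitive["version"].split(".")[0]
        own_major = cls.VERSION.split(".")[0]
        if sent_major != own_major:
            raise ValueError("incompatible version %s" % primitive["version"])
        return cls(**primitive["data"])
```

The point of the sketch is that a typo'd or unexpected key fails loudly at the boundary instead of travelling silently through the scheduler, and that the receiver can check the version before trusting the payload.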
== Action Items ==
1) Jay to propose patches that objectify the domain objects in nova/virt/hardware.py by EOB November 21
2) Paul Murray, Jay, and Alexis Lee to work on refactoring of the unit tests around the resource tracker in early Kilo-1
3) Dan Smith, Paul Murray, and Jay to discuss the issues with child object versioning
4) Ed Leafe to work on separating the compute node from the service object in Kilo-1
5) Sylvain Bauza to work on the request spec and select_destinations() to use request spec blueprints to be completed for Kilo-2
6) Paul Murray, Sylvain Bauza to work on the isolate-scheduler-db aggregate and instance groups blueprints to be completed by Kilo-3
7) Jay to complete the resource-objects blueprint work by Kilo-2
8) Dan Berrange, Sahid, and Nikola Dipanov to work on completing the CPU pinning, huge page support, and get_available_resource() blueprints in
== Open Items ==
1) We need to figure out who is working on the objectification of the PCI tracker stuff (Yunjong maybe or Robert Li?)
2) The child object version thing needs to be thoroughly vetted.
Basically, the nova.objects.compute_node.ComputeNode object will have a series of sub objects for resources (NUMA, PCI, other stuff) and Paul Murray has some questions on how to handle the child object versioning properly.
3) Need to coordinate with Steve Gordon, Adrian Hoban, and Ian Wells on NUMA hardware in an external testing lab that the NFV subteam is working on getting up and running. We need functional tests (Tempest+Nova) written for all NUMA-related functionality in the RT and scheduler by end of Kilo-3, but have yet to divvy up the work to make this a reality.
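The child object versioning question in open item 2 can be made concrete with a small sketch. The problem is that when a parent object is sent to a consumer that only understands an older parent version, each nested child has to be downgraded to the child version that parent version expects. Everything below is hypothetical: the tables CHILD_VERSIONS and CHILD_FIELDS, the field names, and the backport() helper are invented for illustration and do not describe how nova.objects resolves this.

```python
# Hypothetical table: for each parent version, the child version it may carry.
CHILD_VERSIONS = {
    "1.0": {"numa_topology": "1.0"},
    "1.1": {"numa_topology": "1.1"},
}

# Hypothetical table: the fields each child version understands.
CHILD_FIELDS = {
    "numa_topology": {
        "1.0": {"cells"},
        "1.1": {"cells", "hugepages"},
    },
}


def backport(parent, target_version):
    """Return a copy of a parent primitive downgraded to target_version.

    In this sketch every entry in parent["data"] is a child object.
    """
    if target_version not in CHILD_VERSIONS:
        raise ValueError("unknown parent version %s" % target_version)
    result = {"version": target_version, "data": {}}
    for name, child in parent["data"].items():
        child_version = CHILD_VERSIONS[target_version][name]
        allowed = CHILD_FIELDS[name][child_version]
        # Drop any child field the older child version does not know about.
        result["data"][name] = {
            "version": child_version,
            "data": {k: v for k, v in child["data"].items() if k in allowed},
        }
    return result
```

A parent at 1.1 carrying a NUMA child with both fields would come out of backport(parent, "1.0") with the child at 1.0 and only the cells field. The open design question is where such a mapping should live and who keeps it in sync as the children evolve.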
== Conclusion ==
Please everyone read the above thoroughly and respond if I have missed anything or left anyone out of the conversation. Really appreciate everyone coming together to get this work done over the next 4-5 months.