[openstack-dev] [nova][placement] PTG Summary and Rocky Priorities
Jay Pipes
jaypipes at gmail.com
Thu Mar 8 12:51:27 UTC 2018
We had a productive PTG and were able to discuss a great many
scheduler-related topics. I've put together an etherpad [0] with a
summary, reproduced below.
Expect follow-up emails about each priority item in the scheduler track
from those contributors working on that area.
Best,
-jay
Placement/scheduler: Rocky PTG Summary
== Key topics ==
- Aggregates
- How we messed up operators using nova host aggregates for
allocation ratios
- Placement currently doesn't "auto-create" placement aggregates when
nova host aggregates change
- Standardizing trait handling for virt drivers
- Placement REST API
- Partial allocation patching
- Removing assumptions around generation 0
- Supporting policy/RBAC
- NUMA
- Supporting both shared and dedicated CPU on the same host as well
as the same instance
- vGPU handling
- Tracking ingress/egress bandwidth resources using placement
- Finally supporting live migration of CPU-pinned instances
== Agreements and decisions ==
- dansmith's "placement request filters" work is an important enabler of
a number of use cases, particularly around aggregate filtering. Spec is
already approved here: https://review.openstack.org/#/c/544585/
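To illustrate the idea (a standalone sketch, not code from the approved
spec; all names here are illustrative): a request filter runs before nova
asks placement for allocation candidates and narrows the query up front,
e.g. to tenant-specific aggregates:

    class RequestSpec(object):
        def __init__(self, project_id):
            self.project_id = project_id
            # Aggregate UUIDs the placement query will be limited to.
            self.requested_aggregates = []

    def require_tenant_aggregate(tenant_to_aggs, spec):
        # Hypothetical filter: only hosts in the tenant's aggregates
        # should be considered by the scheduler.
        spec.requested_aggregates.extend(
            tenant_to_aggs.get(spec.project_id, []))

    spec = RequestSpec(project_id='abc123')
    require_tenant_aggregate({'abc123': ['agg-uuid-1']}, spec)
    # The eventual placement call would then include member_of=agg-uuid-1.
    assert spec.requested_aggregates == ['agg-uuid-1']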
- We need a method of filtering providers that do NOT have a certain
trait. This is tentatively being called "forbidden traits". Spec review
here: https://review.openstack.org/548915
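As a hedged example of the shape under review: a forbidden trait would be
expressed with a "!" prefix in the existing required query parameter,
along the lines of

    GET /allocation_candidates?required=HW_CPU_X86_AVX2,!CUSTOM_SLOW_DISK

i.e. "must have AVX2, must NOT have the (custom) slow-disk trait". The
exact syntax is whatever the spec settles on.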
- For parity/consistency reasons, we should add the in_tree=<RP_UUID>
query parameter to GET /resource_providers
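Concretely, GET /resource_providers?in_tree=<RP_UUID> would return every
provider in the same tree as the named provider, i.e. its root provider
and all descendants of that root.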
- To assist operators, add some new osc-placement CLI commands for
applying traits and allocation ratios to batches of resource providers in
an aggregate
- We should allow image metadata to specify required traits in the same
fashion as flavor extra specs. Spec review here:
https://review.openstack.org/#/c/541507/
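Concretely (pending spec approval), an image would carry the same
trait:<TRAIT_NAME>=required syntax that flavor extra specs already use,
e.g.:

    openstack image set --property trait:HW_CPU_X86_AVX2=required my-image

so an instance booted from that image could only land on hosts whose
providers expose HW_CPU_X86_AVX2.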
- virt drivers should begin reporting their CPU features as traits. Spec
review here: https://review.openstack.org/#/c/497733/
- Furthermore, virt drivers should respect the cpu_model CONF option
for overriding CPU-related traits
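For reference, os-traits already defines standard traits for common CPU
flags (e.g. HW_CPU_X86_AVX2, HW_CPU_X86_SSE42, HW_CPU_X86_AESNI); the
spec is about drivers reporting those traits on their compute node
providers automatically.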
- We will eventually want to provide the ability to patch an already
existing allocation
- Hot-attaching a network interface is the canonical use case here.
We want to add the new NIC resources to the existing allocation for the
instance consumer without needing to re-PUT the entire allocation
- In order to do this, we will need to add a generation field to the
consumers table, allowing multiple allocation writers to ensure their
view of the consumer is consistent (TODO: need a blueprint/spec for this)
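One possible shape, purely illustrative since the spec has not been
written yet: PUT /allocations/{consumer_uuid} would carry the consumer
generation, so a writer adding NIC resources fails fast if someone else
changed the allocation underneath it:

    PUT /allocations/{instance_uuid}
    {
        "allocations": {
            "{compute_rp_uuid}": {"resources": {"VCPU": 2,
                                                "MEMORY_MB": 2048}},
            "{nic_rp_uuid}": {"resources": {"SRIOV_NET_VF": 1}}
        },
        "consumer_generation": 3
    }

A stale consumer_generation would be rejected with a conflict, prompting
the caller to re-read and retry.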
- We should extricate the standard resource classes currently defined in
`nova.objects.fields.ResourceClass` into a small `os-resource-classes`
library (TODO: need a blueprint/spec for this)
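The library itself would be tiny; a hypothetical sketch (nothing here is
decided) is just a module of shared string constants:

    # os_resource_classes/__init__.py (sketch)
    VCPU = 'VCPU'
    MEMORY_MB = 'MEMORY_MB'
    DISK_GB = 'DISK_GB'
    SRIOV_NET_VF = 'SRIOV_NET_VF'
    # Ordered list of the standard classes, mirroring what nova's
    # ResourceClass field defines today.
    STANDARDS = [VCPU, MEMORY_MB, DISK_GB, SRIOV_NET_VF]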
- We should use oslo.policy in the placement API (TODO: specless
blueprint for this)
- Use case here is making the transition to placement easy for
operators that currently use the os-aggregates interface for managing
compute resources
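A minimal sketch of the oslo.policy wiring (the rule name and its default
are illustrative, not decided):

    from oslo_config import cfg
    from oslo_policy import policy

    enforcer = policy.Enforcer(cfg.CONF)
    enforcer.register_default(policy.RuleDefault(
        'placement:resource_providers:list', 'role:admin',
        description='List resource providers.'))

    # Per request; creds would come from the keystone auth context.
    allowed = enforcer.authorize('placement:resource_providers:list',
                                 target={}, creds={'roles': ['admin']})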
- Calling code should not assume the initial generation for a resource
provider is zero. Spec review here: https://review.openstack.org/#/c/548903/
- Extracting placement into separate packages is not a priority, but we
think incremental progress toward extraction can be made in Rocky
- Placement's microversion handling should be extracted into a
separate library
- Trimming nova imports
- We should add some support to nova-manage to assist operators using
the caching scheduler to migrate to placement (and get rid of the caching
scheduler)
- VGPU_DISPLAY_HEAD resource class should be removed and replaced with
a set of os-traits traits that indicate the maximum supported number of
display heads for the vGPU type
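(Trait naming is purely illustrative at this point, e.g. one trait per
supported maximum such as VGPU_DISPLAY_HEAD_2; the actual names are up
to the os-traits review.)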
- A new PCPU resource class should be created to describe physical CPUs
(logical processors in the hardware). Virt drivers will be able to set
inventories of PCPU on resource providers representing NUMA nodes and
therefore use placement to track dedicated CPU resources (TODO: need a
blueprint/spec for this)
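The data shape is illustrative only until the spec exists, but the idea
is inventory like the following on each NUMA node child provider, using
the existing placement inventory fields:

    # What a virt driver might report for one NUMA node provider;
    # allocation_ratio stays 1.0 because dedicated CPUs are never
    # overcommitted.
    numa0_inventory = {
        'PCPU': {'total': 8, 'reserved': 0, 'min_unit': 1, 'max_unit': 8,
                 'step_size': 1, 'allocation_ratio': 1.0},
    }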
- artom is going to write a spec for supporting live migration of
CPU-pinned instances (and abandon the complicated old patches)
- Multiple agreements were reached about the strict minimum bandwidth
support feature in nova. The spec has already been updated accordingly:
https://review.openstack.org/#/c/502306/
- For now, the hostname remains the piece of information connecting the
nova-compute service and the neutron agent on the same host, but we are
aiming to use an FQDN as the hostname to avoid possible ambiguity.
- We agreed not to make this feature dependent on moving nova's port
creation to the conductor. The current scope is to support pre-created
neutron ports only.
- Neutron will provide the resource request in the port API, so this
feature does not depend on the neutron port binding API work
- Neutron will create resource providers in placement under the compute
RP, and will also report inventories on those RPs
- Nova will claim the port-related resources in placement, and the
consumer_id will be the instance UUID
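One possible shape for the port-level request (hedged: the spec is still
in review, and both the field and resource class names may change):

    "resource_request": {
        "resources": {
            "NET_BW_IGR_KILOBIT_PER_SEC": 1000,
            "NET_BW_EGR_KILOBIT_PER_SEC": 1000
        },
        "required": ["CUSTOM_PHYSNET_PHYSNET0", "CUSTOM_VNIC_TYPE_NORMAL"]
    }

Nova would fold these amounts into the instance's existing allocation
when it does the claim.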
- We should mirror nova host aggregate information to placement using
an online data migration technique in the add/remove_host methods of
nova.objects.Aggregate and a `nova-manage db online_data_migrations`
command
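The placement side of the mirroring is the existing aggregates endpoint:
roughly, add_host/remove_host would end up doing the equivalent of a

    PUT /resource_providers/{compute_node_rp_uuid}/aggregates

that includes (or drops) the nova host aggregate's UUID; the exact
plumbing is what this work item will define.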
== Priorities for Rocky release cycle ==
1. Merge the update_provider_tree patch series (efried)
2. Placement request filters (dansmith)
3. Mirror aggregate information from nova to placement (jaypipes)
4. Forbidden traits (cdent)
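For context on priority 1, the interface in the series looks roughly
like this (hedged: details may still shift in review). The virt driver
gets handed a ProviderTree and edits it in place rather than returning
inventory dicts:

    class MyVirtDriver(object):
        def update_provider_tree(self, provider_tree, nodename):
            # Edit the passed-in tree in place: set inventory on the
            # compute node provider; child providers (e.g. NUMA nodes,
            # vGPU-capable devices) can be added via the same object.
            provider_tree.update_inventory(nodename, {
                'VCPU': {'total': 16, 'allocation_ratio': 4.0},
                'MEMORY_MB': {'total': 32768, 'allocation_ratio': 1.5},
            })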
== Non-priority Items for Rocky ==
- Add consumers.generation field and related API plumbing (efried and cdent)
- Support requested traits in image metadata (arvind)
- Provide CLI functionality to set traits and things like allocation
ratios for a batch of resource providers via aggregate (ttsurya)
- Migrating off of the caching scheduler and on to placement (mriedem)
- Create `os-resource-classes` library and write migration code to
replace `nova.objects.fields.ResourceClass` usage with calls to
os_resource_classes
- Policy/RBAC support in Placement REST API (mriedem)
- Extract placement's microversion handling into separate library (cdent)
- CPU-pinned instance live migration support (stephenfin and artom)
[0] https://etherpad.openstack.org/p/rocky-ptg-scheduler-placement-summary