Open Stack

Thu May 31 18:26:46 UTC 2018

> 1. Make everything perform the pivot on compute node start (which can be
>    re-used by a CLI tool for the offline case)
> 2. Make everything default to non-nested inventory at first, and provide
>    a way to migrate a compute node and its instances one at a time (in
>    place) to roll through.

I agree that it sure would be nice to do ^ rather than requiring the
"slide puzzle" thing.

But how would this be accomplished, in light of the current "separation
of responsibilities" drawn at the virt driver interface, whereby the
virt driver isn't supposed to talk to placement directly, or know
anything about allocations?  Here's a first pass:

The virt driver, via the return value from update_provider_tree, tells
the resource tracker that "inventory of resource class A on provider B
has moved to provider C" for all applicable AxBxC.  E.g.

[ { 'from_resource_provider': <cn_rp_uuid>,
    'moved_resources': [VGPU: 4],
    'to_resource_provider': <gpu_rp1_uuid>
  },
  { 'from_resource_provider': <cn_rp_uuid>,
    'moved_resources': [VGPU: 4],
    'to_resource_provider': <gpu_rp2_uuid>
  },
  { 'from_resource_provider': <cn_rp_uuid>,
    'moved_resources': [
        SRIOV_NET_VF: 2,
        NET_BANDWIDTH_EGRESS_KILOBITS_PER_SECOND: 1000,
        NET_BANDWIDTH_INGRESS_KILOBITS_PER_SECOND: 1000,
    ],
    'to_resource_provider': <gpu_rp2_uuid>
  }
]
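
To make that shape concrete, here is a rough Python rendering of the
same idea.  It is only a sketch: writing 'moved_resources' as a dict of
resource class to amount, the placeholder UUID strings, and the
validate_moves helper are all assumptions made for illustration, not an
existing nova interface.

```python
# Hypothetical placeholders; real values would be resource provider UUIDs.
CN_RP = 'cn-rp-uuid'
GPU_RP1 = 'gpu-rp1-uuid'
GPU_RP2 = 'gpu-rp2-uuid'

# Assumed concrete form of the "moves" list returned by the virt driver.
moves = [
    {'from_resource_provider': CN_RP,
     'moved_resources': {'VGPU': 4},
     'to_resource_provider': GPU_RP1},
    {'from_resource_provider': CN_RP,
     'moved_resources': {'VGPU': 4},
     'to_resource_provider': GPU_RP2},
]

REQUIRED_KEYS = {
    'from_resource_provider',
    'moved_resources',
    'to_resource_provider',
}


def validate_moves(moves):
    """Check each move record and total what leaves each source provider.

    Returns {(source_provider, resource_class): total_amount_moved}.
    None or an empty list is accepted and yields an empty dict.
    """
    totals = {}
    for move in moves or []:
        missing = REQUIRED_KEYS - set(move)
        if missing:
            raise ValueError('move is missing keys: %s' % sorted(missing))
        src = move['from_resource_provider']
        if src == move['to_resource_provider']:
            raise ValueError('source and destination provider are the same')
        for rc, amount in move['moved_resources'].items():
            if not isinstance(amount, int) or amount <= 0:
                raise ValueError('bad amount for %s: %r' % (rc, amount))
            totals[(src, rc)] = totals.get((src, rc), 0) + amount
    return totals


totals = validate_moves(moves)
```

The totals could then be compared against provider B's current
inventory before anything is flushed to placement.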

As today, the resource tracker takes the updated provider tree and
invokes [1] the report client method update_from_provider_tree [2] to
flush the changes to placement.  But now update_from_provider_tree also
accepts the return value from update_provider_tree and, for each "move":

- Creates provider C (as described in the provider_tree) if it doesn't
  already exist.
- Creates/updates provider C's inventory as described in the
  provider_tree (without yet updating provider B's inventory).  This ought
  to create the inventory of resource class A on provider C.
- Discovers allocations of rc A on rp B and POSTs to move them to rp C*.
- Updates provider B's inventory.
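
The ordering of those steps can be sketched against plain in-memory
dicts standing in for placement.  Everything here is an assumption for
illustration: apply_moves is a hypothetical helper, not the real
update_from_provider_tree, the dict layouts are invented, and the
example uses a single destination provider so it sidesteps the question
of splitting allocations across several children.

```python
def apply_moves(inventories, allocations, new_inventories, moves):
    """Apply "move" records to fake placement state, in the order above.

    inventories:     {provider: {resource_class: total}}, current state
    allocations:     {consumer: {provider: {resource_class: amount}}}
    new_inventories: {provider: {resource_class: total}}, the desired
                     state as described by the provider tree
    moves:           list of move records, or None/[] for "no moves"
    """
    for move in moves or []:
        src = move['from_resource_provider']
        dst = move['to_resource_provider']

        # Steps 1 and 2: create the destination provider if needed and
        # set its inventory, leaving the source inventory untouched.
        inventories.setdefault(dst, {})
        inventories[dst].update(new_inventories.get(dst, {}))

        # Step 3: move allocations of the moved classes from src to dst.
        for by_provider in allocations.values():
            src_alloc = by_provider.get(src, {})
            for rc in move['moved_resources']:
                if rc in src_alloc:
                    amount = src_alloc.pop(rc)
                    dst_alloc = by_provider.setdefault(dst, {})
                    dst_alloc[rc] = dst_alloc.get(rc, 0) + amount
            if src in by_provider and not by_provider[src]:
                del by_provider[src]

        # Step 4: only now shrink the source provider's inventory.
        inventories[src] = dict(new_inventories.get(src, {}))


inventories = {'cn': {'VCPU': 8, 'SRIOV_NET_VF': 2}}
allocations = {'inst1': {'cn': {'VCPU': 2, 'SRIOV_NET_VF': 1}}}
new_inventories = {'cn': {'VCPU': 8}, 'pf': {'SRIOV_NET_VF': 2}}
moves = [{'from_resource_provider': 'cn',
          'moved_resources': {'SRIOV_NET_VF': 2},
          'to_resource_provider': 'pf'}]

apply_moves(inventories, allocations, new_inventories, moves)
```

Doing step 4 last means that at no point does an allocation refer to
inventory that has already been removed from its provider.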

(*There's a hole here: if we're splitting a glommed-together inventory
across multiple new child providers, as with the VGPUs in the example,
we don't know which allocations to put where.  The virt driver should
know which instances own which specific inventory units, and would be
able to report that info within the data structure.  That's getting
kinda close to the virt driver mucking with allocations, but maybe it
fits well enough into this model to be acceptable?)
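
One possible way to carry that extra information, purely as a sketch:
each move record gains a hypothetical 'consumers' key mapping an
instance UUID to the amounts it owns on the destination provider.  The
key name, the dict layout, and the allocations_from_moves helper are
all invented here for illustration.

```python
def allocations_from_moves(moves):
    """Build target allocations from per-move consumer hints.

    Returns {consumer: {provider: {resource_class: amount}}}.  Raises
    ValueError if the consumers named in one move claim more of a
    resource class than that move actually carries.
    """
    result = {}
    for move in moves or []:
        dst = move['to_resource_provider']
        claimed = {}
        # 'consumers' is the hypothetical extension; absent means no hints.
        for consumer, resources in move.get('consumers', {}).items():
            for rc, amount in resources.items():
                claimed[rc] = claimed.get(rc, 0) + amount
                if claimed[rc] > move['moved_resources'].get(rc, 0):
                    raise ValueError(
                        'consumers claim more %s than was moved to %s'
                        % (rc, dst))
                by_rc = result.setdefault(consumer, {}).setdefault(dst, {})
                by_rc[rc] = by_rc.get(rc, 0) + amount
    return result


moves = [
    {'from_resource_provider': 'cn-rp-uuid',
     'moved_resources': {'VGPU': 4},
     'to_resource_provider': 'gpu-rp1-uuid',
     'consumers': {'inst-a': {'VGPU': 1}}},
    {'from_resource_provider': 'cn-rp-uuid',
     'moved_resources': {'VGPU': 4},
     'to_resource_provider': 'gpu-rp2-uuid',
     'consumers': {'inst-b': {'VGPU': 2}}},
]

targets = allocations_from_moves(moves)
```

With hints like these the report client would not have to guess which
child provider each instance's VGPUs land on; the virt driver still
never talks to placement itself.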

Note that the return value from update_provider_tree is optional, and
only used when the virt driver is indicating a "move" of this ilk.  If
it's None/[] then the RT/update_from_provider_tree flow is the same as
it is today.

If we can do it this way, we don't need a migration tool.  In fact, we
don't even need to restrict provider tree "reshaping" to release
boundaries.  As long as the virt driver understands its own data model
migrations and reports them properly via update_provider_tree, it can
shuffle its tree around whenever it wants.

Thoughts?

-efried

[1]
https://github.com/openstack/nova/blob/8753c9a38667f984d385b4783c3c2fc34d7e8e1b/nova/compute/resource_tracker.py#L890
[2]
https://github.com/openstack/nova/blob/8753c9a38667f984d385b4783c3c2fc34d7e8e1b/nova/scheduler/client/report.py#L1341

[openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers