[openstack-dev] [nova] [placement] Upgrade concerns with nested Resource Providers
Eric Fried
openstack at fried.cc
Fri Jun 1 15:11:43 UTC 2018
Sylvain-
On 05/31/2018 02:41 PM, Sylvain Bauza wrote:
>
>
> On Thu, May 31, 2018 at 8:26 PM, Eric Fried <openstack at fried.cc> wrote:
>
> > 1. Make everything perform the pivot on compute node start (which can be
> > re-used by a CLI tool for the offline case)
> > 2. Make everything default to non-nested inventory at first, and provide
> > a way to migrate a compute node and its instances one at a time (in
> > place) to roll through.
>
> I agree that it sure would be nice to do ^ rather than requiring the
> "slide puzzle" thing.
>
> But how would this be accomplished, in light of the current "separation
> of responsibilities" drawn at the virt driver interface, whereby the
> virt driver isn't supposed to talk to placement directly, or know
> anything about allocations? Here's a first pass:
>
>
>
> What we usually do is implement, either at the compute service level
> or at the virt driver level, some init_host() method that will
> reconcile what you want.
> For example, we could imagine a non-virt-specific method (and I like
> that because it's non-virt-specific), called by the compute's
> init_host(), that would look up the compute root RP inventories and
> see whether one or more inventories tied to specific resource classes
> have to be moved from the root RP and attached to a child RP.
> The only subtlety that would require a virt-specific update is the
> name of the child RP (as both Xen and libvirt plan to use the child
> RP name as the vGPU type identifier), but that's an implementation
> detail that a possible virt driver update by the resource tracker
> could reconcile.
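For concreteness, that init_host()-driven reconciliation might look
something like the sketch below. To be clear, this is illustrative
only: 'reportclient' stands in for nova's placement report client, and
every method name on it is made up, not a real API.

    # Sketch only: all reportclient method names are hypothetical.
    def reconcile_child_providers(reportclient, root_rp_uuid, child_specs):
        # child_specs maps a (virt-specific) child RP name -- e.g. the
        # vGPU type -- to the resource classes whose inventory should
        # move off the root RP onto that child.
        root_inv = reportclient.get_inventory(root_rp_uuid)
        for child_name, resource_classes in child_specs.items():
            child_uuid = reportclient.ensure_child_provider(
                root_rp_uuid, child_name)
            child_inv = {rc: root_inv.pop(rc)
                         for rc in resource_classes if rc in root_inv}
            # Put the inventory on the child before shrinking the root,
            # so capacity never disappears entirely.
            reportclient.set_inventory(child_uuid, child_inv)
        reportclient.set_inventory(root_rp_uuid, root_inv)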
The question was rhetorical; my suggestion (below) was an attempt at
designing exactly what you've described. Let me know if I can
explain/clarify it further. I'm looking for feedback as to whether it's
a viable approach.
> The virt driver, via the return value from update_provider_tree, tells
> the resource tracker that "inventory of resource class A on provider B
> has moved to provider C" for all applicable AxBxC. E.g.
>
> [ { 'from_resource_provider': <cn_rp_uuid>,
>     'moved_resources': { 'VGPU': 4 },
>     'to_resource_provider': <gpu_rp1_uuid>
>   },
>   { 'from_resource_provider': <cn_rp_uuid>,
>     'moved_resources': { 'VGPU': 4 },
>     'to_resource_provider': <gpu_rp2_uuid>
>   },
>   { 'from_resource_provider': <cn_rp_uuid>,
>     'moved_resources': {
>         'SRIOV_NET_VF': 2,
>         'NET_BANDWIDTH_EGRESS_KILOBITS_PER_SECOND': 1000,
>         'NET_BANDWIDTH_INGRESS_KILOBITS_PER_SECOND': 1000,
>     },
>     'to_resource_provider': <sriov_rp_uuid>
>   }
> ]
>
> As it does today, the resource tracker takes the updated provider tree and
> invokes [1] the report client method update_from_provider_tree [2] to
> flush the changes to placement. But now update_from_provider_tree also
> accepts the return value from update_provider_tree and, for each "move":
>
> - Creates provider C (as described in the provider_tree) if it doesn't
> already exist.
> - Creates/updates provider C's inventory as described in the
> provider_tree (without yet updating provider B's inventory). This ought
> to create the inventory of resource class A on provider C.
> - Discovers allocations of rc A on rp B and POSTs to move them to rp C*.
> - Updates provider B's inventory.
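In pseudo-Python, that move handling might look like the following.
Again a sketch, not a patch: the underscore-prefixed helpers are
invented names for the steps above.

    # Sketch only: the self._* helpers are invented; the real logic
    # would live inside update_from_provider_tree.
    def _apply_moves(self, provider_tree, moves):
        for move in moves:
            src = move['from_resource_provider']
            dst = move['to_resource_provider']
            # Create provider C if it doesn't already exist.
            self._ensure_provider(dst)
            # Write C's inventory first, so the moved resource classes
            # have capacity to land on before B shrinks.
            self._set_inventory(dst, provider_tree.data(dst).inventory)
            # Re-POST each affected consumer's allocations with rp B
            # swapped for rp C; placement replaces a consumer's
            # allocations atomically.
            for consumer_uuid, alloc in self._allocations_to_move(
                    src, move['moved_resources']):
                self._replace_allocations(consumer_uuid, src, dst, alloc)
            # Only now shrink B's inventory.
            self._set_inventory(src, provider_tree.data(src).inventory)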
>
> (*There's a hole here: if we're splitting a glommed-together inventory
> across multiple new child providers, as the VGPUs in the example, we
> don't know which allocations to put where. The virt driver should know
> which instances own which specific inventory units, and would be able to
> report that info within the data structure. That's getting kinda close
> to the virt driver mucking with allocations, but maybe it fits well
> enough into this model to be acceptable?)
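For illustration, one way the virt driver could report that would be
to extend each "move" entry with a per-consumer breakdown. This is a
purely hypothetical addition to the structure above:

    { 'from_resource_provider': <cn_rp_uuid>,
      'to_resource_provider': <gpu_rp1_uuid>,
      'moved_resources': { 'VGPU': 4 },
      # Hypothetical: which consumers own which of the moved units, so
      # the RT knows where each allocation should land.
      'moved_consumers': { <instance1_uuid>: { 'VGPU': 2 },
                           <instance2_uuid>: { 'VGPU': 2 } } }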
>
> Note that the return value from update_provider_tree is optional, and
> only used when the virt driver is indicating a "move" of this ilk. If
> it's None/[] then the RT/update_from_provider_tree flow is the same as
> it is today.
>
> If we can do it this way, we don't need a migration tool. In fact, we
> don't even need to restrict provider tree "reshaping" to release
> boundaries. As long as the virt driver understands its own data model
> migrations and reports them properly via update_provider_tree, it can
> shuffle its tree around whenever it wants.
>
> Thoughts?
>
> -efried
>
> [1]
> https://github.com/openstack/nova/blob/8753c9a38667f984d385b4783c3c2fc34d7e8e1b/nova/compute/resource_tracker.py#L890
> [2]
> https://github.com/openstack/nova/blob/8753c9a38667f984d385b4783c3c2fc34d7e8e1b/nova/scheduler/client/report.py#L1341