[openstack-dev] [neutron] [nova] scheduling bandwidth resources / NIC_BW_KB resource class

Miguel Angel Ajo Pelayo majopela at redhat.com
Mon Apr 11 14:22:26 UTC 2016

On Mon, Apr 11, 2016 at 1:46 PM, Jay Pipes <jaypipes at gmail.com> wrote:
> Hi Miguel Angel, comments/answers inline :)
> On 04/08/2016 09:17 AM, Miguel Angel Ajo Pelayo wrote:
>> Hi!,
>>     In the context of [1] (generic resource pools / scheduling in nova)
>> and [2] (minimum bandwidth guarantees -egress- in neutron), I had a talk
>> a few weeks ago with Jay Pipes.
>>     The idea was to leverage the generic resource pools and scheduling
>> mechanisms defined in [1] to find the right hosts and track the total
>> available bandwidth per host (and per host "physical network").
>> Something in neutron (still to be defined where) would notify the new
>> API about the total amount of "NIC_BW_KB" available on every host/physnet.
> Yes, what we discussed was making it initially per host, meaning the host
> would advertise a total aggregate bandwidth amount for all NICs that it uses
> for the data plane as a single amount.
> The other way to track this resource class (NIC_BW_KB) would be to make the
> NICs themselves be resource providers and then the scheduler could pick a
> specific NIC to bind the port to based on available NIC_BW_KB on a
> particular NIC.
> The former method makes things conceptually easier at the expense of
> introducing greater potential for retrying placement decisions (since the
> specific NIC to bind a port to wouldn't be known until the claim is made on
> the compute host). The latter method adds complexity to the filtering and
> scheduler in order to make more accurate placement decisions that would
> result in fewer retries.
>>     That part is quite clear to me,
>>     From [1] I'm not sure which blueprint introduces the ability to
>> schedule based on the resource allocation/availability itself,
>> ("resource-providers-scheduler" seems more like an optimization to the
>> schedule/DB interaction, right?)
> Yes, you are correct about the above blueprint; it's only for moving the
> Python-side filters to be a DB query.
> The resource-providers-allocations blueprint:
> https://review.openstack.org/300177
> Is the one where we convert the various consumed resource amount fields to
> live in the single allocations table that may be queried for usage
> information.
> We aim to use the ComputeNode object as a facade that hides the migration of
> these data fields as much as possible so that the scheduler actually does
> not need to know that the schema has changed underneath it. Of course, this
> only works for *existing* resource classes, like vCPU, RAM, etc. It won't
> work for *new* resource classes like the discussed NIC_BW_KB because,
> clearly, we don't have an existing field in the instance_extra or other
> tables that contain that usage amount and therefore can't use ComputeNode
> object as a facade over a non-existing piece of data.
> Eventually, the intent is to change the ComputeNode object to return a new
> AllocationList object that would contain all of the compute node's resources
> in a tabular format (mimicking the underlying allocations table):
> https://review.openstack.org/#/c/282442/20/nova/objects/resource_provider.py
> Once this is done, the scheduler can be fitted to query this AllocationList
> object to make resource usage and placement decisions in the Python-side
> filters.
> We are still debating on the resource-providers-scheduler-db-filters
> blueprint:
> https://review.openstack.org/#/c/300178/
> Whether to change the existing FilterScheduler or create a brand new
> scheduler driver. I could go either way, frankly. If we made a brand new
> scheduler driver, it would do a query against the compute_nodes table in the
> DB directly. The legacy FilterScheduler would manipulate the AllocationList
> object returned by the ComputeNode.allocations attribute. Either way we get
> to where we want to go: representing all quantitative resources in a
> standardized and consistent fashion.
>>      And that brings me to another point: at the moment of filtering
>> hosts, nova, I guess, will have the neutron port information, and it has
>> to somehow identify whether the port is tied to a minimum bandwidth QoS policy.
> Yes, Nova's conductor gathers information about the requested networks
> *before* asking the scheduler where to place instances:
> https://github.com/openstack/nova/blob/stable/mitaka/nova/conductor/manager.py#L362
>>      That would require identifying that the port has a "qos_policy_id"
>> attached to it, and then, asking neutron for the specific QoS policy
>>   [3], then look out for a minimum bandwidth rule (still to be defined),
>> and extract the required bandwidth from it.
> Yep, exactly correct.
>>     That moves, again, some of the responsibility to examine and
>> understand external resources to nova.
> Yep, it does. The alternative is more retries for placement decisions
> because accurate decisions cannot be made until the compute node is already
> selected and the claim happens on the compute node.
>>      Could it make sense to make that part pluggable via stevedore, so
>> that we would provide something that takes the "resource id" (for a port
>> in this case) and returns the requirements translated to resource classes
>> (NIC_BW_KB in this case)?
> Not sure Stevedore makes sense in this context. Really, we want *less*
> extensibility and *more* consistency. So, I would envision rather a system
> where Nova would call to Neutron before scheduling when it has received a
> port or network ID in the boot request and ask Neutron whether the port or
> network has any resource constraints on it. Neutron would return a
> standardized response containing each resource class and the amount
> requested in a dictionary (or better yet, an os_vif.objects.* object,
> serialized). Something like:
> {
>   'resources': {
>     '<UUID of port or network>': {
>       'NIC_BW_KB': 2048,
>       'IPV4_ADDRESS': 1
>     }
>   }
> }

Oh, true, that's a great idea: having an API that translates a
neutron resource into scheduling constraints. The external call will
still be required, but the coupling issue is removed.

> In the case of the NIC_BW_KB resource class, Nova's scheduler would look for
> compute nodes that had a NIC with that amount of bandwidth still available.
> In the case of the IPV4_ADDRESS resource class, Nova's scheduler would use
> the generic-resource-pools interface to find a resource pool of IPV4_ADDRESS
> resources (i.e. a Neutron routed network or subnet allocation pool) that has
> available IP space for the request.

Not sure about the IPV4_ADDRESS part because I still haven't looked at
how routed networks are resolved with this new framework, but for the
other constraints it makes perfect sense to me.
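
For the NIC_BW_KB case, the trade-off between the two tracking methods
discussed earlier (one aggregate amount per host versus each NIC as its
own resource provider) can be illustrated with a toy example. This is
only a sketch under assumptions: the data layout and both function names
are hypothetical and are not nova scheduler code.

```python
def hosts_by_aggregate(hosts, requested_kb):
    """Per-host tracking: compare against the sum of free bandwidth."""
    return [name for name, nics in hosts.items()
            if sum(nics.values()) >= requested_kb]


def hosts_by_nic(hosts, requested_kb):
    """Per-NIC tracking: a single NIC must satisfy the whole request."""
    return [(name, nic)
            for name, nics in hosts.items()
            for nic, free_kb in nics.items()
            if free_kb >= requested_kb]


# Free NIC_BW_KB per NIC on each host (made-up numbers).
hosts = {
    "host-a": {"eth0": 1024, "eth1": 1024},
    "host-b": {"eth0": 4096},
}

print(hosts_by_aggregate(hosts, 2048))
print(hosts_by_nic(hosts, 2048))
```

The aggregate check accepts host-a even though neither of its NICs can
carry 2048 on its own, so the claim on the compute host would fail and
trigger a retry. The per-NIC check rules it out up front, at the cost of
a more complex filter.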

> Best,
> -jay
>> Best regards,
>> Miguel Ángel Ajo
>> [1]
>> http://lists.openstack.org/pipermail/openstack-dev/2016-February/086371.html
>> [2] https://bugs.launchpad.net/neutron/+bug/1560963
>> [3]
>> http://developer.openstack.org/api-ref-networking-v2-ext.html#showPolicy
