[openstack-dev] [nova] [placement] compute nodes use of placement

Chris Dent cdent+os at anticdent.org
Thu Jul 26 16:15:03 UTC 2018

HTML: https://anticdent.org/novas-use-of-placement.html

A year and a half ago I did some analysis on how nova uses placement.

I've repeated some of that analysis today and here's a brief summary
of the results. Note that I don't present this because I'm concerned
about load on placement, we've demonstrated that placement scales
pretty well. Rather, this analysis indicates that the compute node
is doing redundant work which we'd prefer not to do. The compute
node can't scale horizontally in the same way placement does. If
offloading the work to placement and being redundant is the easiest
way to avoid work on the compute node, let's do that, but that
doesn't seem to be quite what's happening here.

Nova uses placement mainly from two places:

* The `nova-compute` nodes report resource provider and inventory to
   placement and make sure that the placement view of what hardware
   is present is accurate.

* The `nova-scheduler` processes request candidates for placement,
   and claim resources by writing allocations to placement.

There are some additional interactions, mostly associated with
migrations or fixing up unusual edge cases. Since those things are
rare they are mostly noise in this discussion, so they are left out here.

When a basic (where basic means no nested resource providers)
compute node starts up it POSTs to create a resource provider and
then PUTs to set the inventory. After that a periodic job runs,
usually every 60 seconds. In that job we see the following 11 requests:

     GET /placement/resource_providers?in_tree=82fffbc6-572b-4db0-b044-c47e34b27ec6
     GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/inventories
     GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/aggregates
     GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/traits
     GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/inventories
     GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/allocations
     GET /placement/resource_providers?in_tree=82fffbc6-572b-4db0-b044-c47e34b27ec6
     GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/inventories
     GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/aggregates
     GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/traits
     GET /placement/resource_providers/82fffbc6-572b-4db0-b044-c47e34b27ec6/inventories

A year and a half ago it was 5 requests per-cycle, but they were
different requests:

     GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/aggregates
     GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/inventories
     GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/allocations
     GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/aggregates
     GET /placement/resource_providers/0e33c6f5-62f3-4522-8f95-39b364aa02b4/inventories

The difference comes from two changes:

* We no longer confirm allocations on the compute node.
* We now have things called ProviderTrees which are responsible
   for managing nested providers, aggregates and traits in a unified
   fashion.
It appears, however, that we have some redundancies: we get
inventories 4 times; aggregates, providers and traits 2 times each;
and allocations once.
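As a toy illustration of that per-cycle pattern (this is not nova code; the helper name is invented), the 11 requests can be enumerated as two identical passes with one allocations check between them:

```python
def periodic_requests(uuid):
    """Return the GET paths one periodic job iteration issues.

    A sketch of the request log shown above, not an actual client.
    """
    base = "/placement/resource_providers"
    # One pass through get_provider_tree_and_ensure_root: an in_tree
    # lookup, two inventory fetches, aggregates, and traits.
    one_pass = [
        f"{base}?in_tree={uuid}",
        f"{base}/{uuid}/inventories",
        f"{base}/{uuid}/aggregates",
        f"{base}/{uuid}/traits",
        f"{base}/{uuid}/inventories",
    ]
    # The pass runs twice per iteration, with the resource tracker's
    # single allocations check after the first pass.
    return one_pass + [f"{base}/{uuid}/allocations"] + one_pass
```

Counting over the result makes the redundancy explicit: 11 requests total, 4 of them inventory fetches against the same provider.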

The `in_tree` calls happen from the report client method
`_get_providers_in_tree` which is called by
`_ensure_resource_provider` which can be called from multiple
places, but in this case is being called both times from
`get_provider_tree_and_ensure_root`, which is also responsible for
two of the inventory requests.

`get_provider_tree_and_ensure_root` is called by `_update` in the
resource tracker.

`_update` is called by both `_init_compute_node` and
`_update_available_resource`, every single periodic job iteration.
`_init_compute_node` is itself called from `_update_available_resource`.

That accounts for the overall doubling.
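The doubling can be modelled with a toy call chain (the method names mirror the resource tracker, but the bodies are placeholders, not nova code):

```python
# Record every time the report client method would be reached.
calls = []

def get_provider_tree_and_ensure_root():
    # Stands in for the report client call that triggers the
    # in_tree/inventories/aggregates/traits requests.
    calls.append("get_provider_tree_and_ensure_root")

def _update():
    get_provider_tree_and_ensure_root()

def _init_compute_node():
    _update()

def _update_available_resource():
    _init_compute_node()  # first call to _update
    _update()             # second call to _update

# One periodic job iteration.
_update_available_resource()
```

After one iteration `calls` holds two entries: every request `get_provider_tree_and_ensure_root` makes is issued twice per cycle.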

The two inventory calls per group come from the following, in order:

1. `_ensure_resource_provider` in the report client calls
    `_refresh_and_get_inventory` for every provider in the tree (the
    result of the `in_tree` query)

2. Immediately after the call to `_ensure_resource_provider`
    every provider in the provider tree (from
    `self._provider_tree.get_provider_uuids()`) then has a
    `_refresh_and_get_inventory` call made.

In a non-sharing, non-nested scenario (such as a single node
devstack, which is where I'm running this analysis) these are the
exact same one resource provider. I'm insufficiently aware of what
might be in the provider tree in more complex situations to be clear
on what could be done to limit redundancy here, but it's a place
worth looking.
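A sketch of that redundancy in the simple case, with invented helper names (this is an illustration, not a proposed nova patch): when the `in_tree` result and the cached provider tree contain the same single UUID, refreshing inventory for each list independently fetches it twice, while taking the union of the two sources first would fetch it once.

```python
fetches = []

def refresh_inventory(uuid):
    # Stands in for _refresh_and_get_inventory.
    fetches.append(uuid)

# In a single-node, non-sharing deployment both sources hold the
# same one provider.
in_tree_uuids = ["82fffbc6"]      # result of the ?in_tree= query
cached_tree_uuids = ["82fffbc6"]  # self._provider_tree.get_provider_uuids()

# Current behaviour: refresh each list independently -> two fetches.
for u in in_tree_uuids + cached_tree_uuids:
    refresh_inventory(u)
assert len(fetches) == 2

# Possible alternative: union the sources before refreshing -> one fetch.
fetches.clear()
for u in set(in_tree_uuids) | set(cached_tree_uuids):
    refresh_inventory(u)
```

Whether the union is safe when the tree contains sharing or nested providers is exactly the open question above.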

The requests for aggregates and traits happen via
`_refresh_associations` in `_ensure_resource_provider`.

The single allocation request is from the resource tracker calling
`_remove_deleted_instances_allocations` checking to see if it is
possible to clean up any allocations left over from migrations.

## Summary/Actions

So what now? There are two avenues for potential investigation:

1. Each time `_update` is called it calls
   `get_provider_tree_and_ensure_root`. Can one of those be skipped
   while keeping the rest of `_update`? Or perhaps it is possible to
   avoid one of the calls to `_update` entirely?
2. Can the way `get_provider_tree_and_ensure_root` tries to manage
    inventory twice be rationalized for simple cases?

I've run out of time for now, so this doesn't address the requests
that happen once an instance exists. I'll get to that another time.

Chris Dent                       ٩◔̯◔۶           https://anticdent.org/
freenode: cdent                                         tw: @anticdent
