[openstack-dev] [nova][scheduler][placement] Trying to understand the proposed direction
Jay Pipes
jaypipes at gmail.com
Tue Jun 20 14:13:50 UTC 2017
On 06/20/2017 09:51 AM, Alex Xu wrote:
> 2017-06-19 22:17 GMT+08:00 Jay Pipes <jaypipes at gmail.com
> <mailto:jaypipes at gmail.com>>:
> * Scheduler then creates a list of N of these data structures,
> with the first being the data for the selected host, and the
> rest being data structures representing alternates consisting of
> the next hosts in the ranked list that are in the same cell as
> the selected host.
>
> Yes, this is the proposed solution for allowing retries within a cell.
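To make the shape of that list concrete, here is a minimal sketch of building it. The `Selection` record and `build_selection_list` helper are hypothetical names for illustration; Nova's actual structures carry more claim data (limits, allocation requests, etc.).

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Selection:
    # Hypothetical record for one scheduling candidate.
    host: str
    cell_uuid: str
    allocation: Dict[str, int]  # resource class -> amount to claim


def build_selection_list(ranked: List[Selection],
                         num_alternates: int) -> List[Selection]:
    """Return the selected host first, then up to num_alternates
    alternates drawn from the ranked list, restricted to hosts in
    the same cell as the selected host."""
    selected = ranked[0]
    same_cell = [s for s in ranked[1:]
                 if s.cell_uuid == selected.cell_uuid]
    return [selected] + same_cell[:num_alternates]
```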
>
> Is it possible to use traits to distinguish different cells? Then the
> retry could be done within the cell by querying placement directly with
> a trait that indicates the specific cell.
>
> Those traits would be custom traits, generated from the cell name.
No, we're not going to use traits in this way, for a couple reasons:
1) Placement doesn't and shouldn't know about Nova's internals. Cells
are internal structures of Nova. Users don't know about them, neither
should placement.
2) Traits describe a resource provider. A cell ID doesn't describe a
resource provider, just like an aggregate ID doesn't describe a resource
provider.
> * Scheduler returns that list to conductor.
> * Conductor determines the cell of the selected host, and sends
> that list to the target cell.
> * Target cell tries to build the instance on the selected host.
> If it fails, it uses the allocation data in the data structure
> to unclaim the resources for the selected host, and tries to
> claim the resources for the next host in the list using its
> allocation data. It then tries to build the instance on the next
> host in the list of alternates. Only when all alternates fail
> does the build request fail.
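The claim/unclaim retry loop described in those steps can be sketched roughly as follows. The `placement` and `compute` client objects and their method names are assumptions for illustration, not Nova's actual interfaces; in the real workflow this logic is split between the conductor and the cell.

```python
class BuildFailure(Exception):
    """Raised when an instance fails to build on a host."""


def build_with_alternates(instance_uuid, selections, placement, compute):
    """Try the selected host, then each alternate in turn.

    `selections` is a list of dicts, each with a "host" and the
    "allocation" data needed to claim that host's resources. The
    selected host (index 0) is assumed to be already claimed by
    the scheduler. Only when every alternate fails does the build
    request fail.
    """
    for i, sel in enumerate(selections):
        if i > 0:
            # The failed host's resources were unclaimed below; now
            # claim this alternate using its allocation data.
            placement.put_allocations(instance_uuid, sel["allocation"])
        try:
            compute.build(sel["host"], instance_uuid)
            return sel["host"]
        except BuildFailure:
            # Unclaim the resources held for the host that failed.
            placement.delete_allocations(instance_uuid)
    raise BuildFailure("all alternates exhausted")
```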
>
> In the compute node, will we get rid of the allocation update in the
> periodic task "update_available_resource"? Otherwise, we will have a
> race between the claim in the nova-scheduler and that periodic task.
Yup, good point, and yes, we will be removing the call to PUT
/allocations in the compute node resource tracker. Only DELETE
/allocations/{instance_uuid} will be called if something goes terribly
wrong on instance launch.
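In other words, once the periodic PUT /allocations is removed, the only allocation write left on the compute node is the failure-path cleanup. A tiny sketch, using a hypothetical session object standing in for a placement API client:

```python
class FakeSession:
    """Stand-in for a placement API client session (test double)."""
    def __init__(self):
        self.calls = []

    def delete(self, url):
        self.calls.append(("DELETE", url))


def cleanup_failed_launch(session, instance_uuid):
    # With the periodic PUT /allocations gone from the resource
    # tracker, this DELETE is the only allocation write the compute
    # node performs, and only when instance launch fails.
    session.delete("/allocations/%s" % instance_uuid)
```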
Best,
-jay