On Mon, May 6, 2019 at 8:03 PM, Eric Fried <openstack@fried.cc> wrote:
Summary: In keeping with the first proposed cycle theme [1] (though we didn't land on that until later in the PTG), we would like to be able to add required traits to the GET /allocation_candidates query to reduce the number of results returned - i.e. do more filtering in placement rather than in the scheduler (or worse, the compute). You can already do this by explicitly adding required traits to flavor/image; we want to be able to do it implicitly based on things like:
- If the instance requires multiattach, make sure it lands on a compute that supports multiattach [2].
- If the image is in X format, make sure it lands on a compute that can read X format [3].
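To make the idea concrete, here is a minimal sketch of how a required trait narrows the query. The helper name is hypothetical and this is not nova's actual code; the trait COMPUTE_VOLUME_MULTI_ATTACH and the resource amounts are assumed purely for illustration.

```python
from urllib.parse import urlencode


def build_allocation_candidates_query(resources, required_traits):
    """Build a GET /allocation_candidates query string.

    Illustrative helper only; nova assembles this query elsewhere.
    """
    params = [
        ("resources",
         ",".join(f"{rc}:{amount}" for rc, amount in sorted(resources.items()))),
    ]
    if required_traits:
        # Each required trait shrinks the set of candidates placement returns.
        params.append(("required", ",".join(sorted(required_traits))))
    # Keep ':' and ',' unescaped so the query stays readable.
    return "/allocation_candidates?" + urlencode(params, safe=":,")


query = build_allocation_candidates_query(
    {"VCPU": 2, "MEMORY_MB": 2048},
    {"COMPUTE_VOLUME_MULTI_ATTACH"},  # assumed trait for the multiattach case
)
print(query)
```

The point is only that the filtering happens in the placement query itself, so unsuitable computes never reach the scheduler.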
Currently the proposals in [2],[3] work by modifying the RequestSpec.flavor right before select_destinations calls GET /allocation_candidates. This just happens to be okay because we don't persist that copy of the flavor back to the instance (which we wouldn't want to do, since we don't want these implicit additions to e.g. show up when we GET server details, or to affect other lifecycle operations).
But this isn't a robust design.
What we would like to do instead is exploit the RequestSpec.requested_resources field [4] as it was originally intended, accumulating all the resource/trait/aggregate/etc. criteria from the flavor, image, *and* request_filter-y things like the above. However, gibi started on this [5] and it turns out to be difficult to express the unnumbered request group in that field for... reasons.
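As a rough sketch of the intended accumulation, assuming a simplified stand-in for the request group (the class and function below are hypothetical, not nova's RequestGroup or RequestSpec API), the criteria from every source end up in one group while the flavor itself is never modified:

```python
from dataclasses import dataclass, field


@dataclass
class UnnumberedGroup:
    """Simplified stand-in for a request group; not nova's RequestGroup."""
    resources: dict = field(default_factory=dict)
    required_traits: set = field(default_factory=set)


def accumulate(flavor_resources, flavor_traits, image_traits, implicit_traits):
    """Collect every criterion into one group, leaving the inputs untouched."""
    group = UnnumberedGroup(resources=dict(flavor_resources))
    for source in (flavor_traits, image_traits, implicit_traits):
        group.required_traits |= set(source)
    return group


flavor_traits = {"HW_CPU_X86_AVX2"}  # example trait, assumed for illustration
group = accumulate(
    {"VCPU": 2, "MEMORY_MB": 2048},
    flavor_traits,
    image_traits=set(),
    implicit_traits={"COMPUTE_VOLUME_MULTI_ATTACH"},  # e.g. from a pre-filter
)
print(sorted(group.required_traits))
print(sorted(flavor_traits))  # the flavor's own traits are unchanged
```

Because the implicit traits live only in the group, they would not show up in GET server details or leak into other lifecycle operations.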
Sorry that I was not able to describe the problems with the approach at the PTG. I will try now in a mail.

So this patch [5] tries to create the unnumbered group in RequestSpec.requested_resources based on the other fields (flavor, image, ...) in the RequestSpec early enough that the above-mentioned pre-filters can add traits to this group instead of adding them to the flavor extra_specs.

The current sequence is the following:
* RequestSpec is created in three different ways:
  1) RequestSpec.from_components(): used during server create (and cold migrate if a legacy compute is present)
  2) RequestSpec.from_primitives(): deprecated but still used during re-schedule
  3) RequestSpec.__init__(): oslo OVO deepcopy calls __init__ then copies over every field one by one.
* Before the nova scheduler sends the Placement a_c query it calls nova.scheduler.utils.resources_from_request_spec(RequestSpec). That code uses the RequestSpec fields to collect all the request groups and all the other parameters (e.g. limit, group_policy).

What we would need at the end:
* When the RequestSpec is created in any way we need to populate the RequestSpec.requested_resources field based on the other RequestSpec fields. Note that __init__ cannot be used for this, as all three ways of instantiating the object create an empty object first with __init__ and then populate the fields later one by one.
* When any of the interesting fields (flavor, image, is_bfv, force_*, ...) is updated on the RequestSpec, the request groups in RequestSpec.requested_resources need to be updated to reflect the change. However, we have to be careful not to blindly re-generate such data, as the unnumbered group might already contain traits that do not come from any of these direct sources but from the above-mentioned implicit required traits code paths.
* When the Placement a_c query is generated it needs to be generated from RequestSpec.requested_resources.

There are a couple of problems:
1) Detecting a change of a RequestSpec field cannot be done by wrapping the field in a property due to OVO limitations [6]. Even if it were possible, given the way we create the RequestSpec object (init an empty object then set fields one by one), the field setters might be called on an incomplete object.
2) Regeneration of RequestSpec.requested_resources would need to distinguish between data that can be regenerated from the other fields of the RequestSpec and the traits added from outside (implicit required traits).
3) The request pre-filters [7] run before the placement a_c query is generated. But today these change fields of the RequestSpec (e.g. requested_destination), which would mean that regeneration of RequestSpec.requested_resources is needed. This is probably solvable by changing the pre-filters to work directly on RequestSpec.requested_resources after we have solved all the other issues.
4) The numbered request groups can come from multiple places. When a group comes from the Flavor the number is stable, as provided by the person who created the Flavor. But when it comes from a Neutron port the number is generated (the next unoccupied int). So a re-generation of such groups would potentially re-number the groups. This makes debugging hard, as well as mapping a numbered group back to the entity that requested the resource (the port) after allocation. This is probably solvable by using the proposed placement extension that allows a string in the numbered group name instead of just a single int [8]. This way the port uuid can be used as the identity of the numbered group to make the identity stable.

Cheers,
gibi

[6] https://bugs.launchpad.net/oslo.versionedobjects/+bug/1821619
[7] https://github.com/openstack/nova/blob/master/nova/scheduler/request_filter....
[8] https://storyboard.openstack.org/#!/story/2005575
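Problems 2) and 4) above can be illustrated with a small sketch. It assumes a hypothetical container that tracks derived and implicit traits separately and keys port groups by port uuid; nothing like this exists on RequestSpec today, and the trait, resource class, and uuid values are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class GroupState:
    """Hypothetical container; RequestSpec has no such split today."""
    derived_traits: set = field(default_factory=set)   # regenerable from flavor/image
    implicit_traits: set = field(default_factory=set)  # added from outside (pre-filters)
    numbered: dict = field(default_factory=dict)       # group key -> resources

    def regenerate(self, flavor_traits, image_traits):
        # Only the derived part is rebuilt; implicit traits survive.
        self.derived_traits = set(flavor_traits) | set(image_traits)

    @property
    def required(self):
        return self.derived_traits | self.implicit_traits

    def add_port_group(self, port_uuid, resources):
        # Keyed by port uuid, so re-adding the same port never renumbers it.
        self.numbered[port_uuid] = dict(resources)


state = GroupState(implicit_traits={"COMPUTE_VOLUME_MULTI_ATTACH"})
state.regenerate(flavor_traits={"HW_CPU_X86_AVX2"}, image_traits=set())
state.regenerate(flavor_traits=set(), image_traits=set())  # flavor changed
print(sorted(state.required))

port = "port-uuid-1"  # placeholder standing in for a real Neutron port uuid
state.add_port_group(port, {"NET_BW_EGR_KILOBIT_PER_SEC": 1000})
state.add_port_group(port, {"NET_BW_EGR_KILOBIT_PER_SEC": 1000})
print(list(state.numbered))
```

With a generated integer suffix, the second regeneration could hand the port a different number; a string key derived from the port uuid keeps the mapping from group back to port stable.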
Action: Since gibi is going to be pretty occupied and unlikely to have time to resolve [5], aspiers has graciously (been) volunteered to take it over; and then follow [2] and [3] to use that mechanism once it's available.
aspiers, ping me if you want to talk about these on IRC.

Cheers,
gibi
efried
[1] https://review.opendev.org/#/c/657171/1/priorities/train-priorities.rst@13
[2] https://review.opendev.org/#/c/645316/
[3] https://review.opendev.org/#/q/topic:bp/request-filter-image-types+(status:open+OR+status:merged)
[4] https://opendev.org/openstack/nova/src/commit/5934c5dc6932fbf19ca7f3011c4ccc07b0038ac4/nova/objects/request_spec.py#L93-L100
[5] https://review.opendev.org/#/c/647396/