[Openstack-operators] [openstack-dev] [nova][placement] Trying to summarize bp/glance-image-traits scheduling alternatives for rebuild
Arvind N
arvindn05 at gmail.com
Wed May 2 16:16:23 UTC 2018
> What if the API compares the original image required traits against the
> new image required traits, and if the new image has required traits which
> weren't in the original image, then (punt) fail in the API? Then you would
> at least have a chance to rebuild with a new image that has required traits
> as long as those required traits are less than or equal to the originally
> validated traits for the host on which the instance is currently running.
This is what I was proposing with #1; sorry if that was unclear. I will make
it more explicit.
1. Reject the rebuild request, indicating that rebuilding with a new image
with **different** (that is, additional) required traits compared to the
original request is not supported.
If the new image has the same or a reduced set of required traits compared
to the old image, the request is passed through to the conductor, etc.
Pseudo code:

    if not set(new_image.traits_required).issubset(
            set(original_image.traits_required)):
        raise exception
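
For illustration, here is a minimal runnable sketch of the check in #1. It is
not the actual nova code: the function name, the exception class and the use
of plain sets of trait names are all assumptions made for this example.

```python
class RebuildTraitsNotSupported(Exception):
    """Hypothetical exception for this sketch; not a real nova class."""


def check_rebuild_image_traits(original_required, new_required):
    """Fail if the new image requires traits the original image did not.

    Both arguments are iterables of trait names (assumed representation).
    """
    added = set(new_required) - set(original_required)
    if added:
        raise RebuildTraitsNotSupported(
            "new image adds required traits: %s" % ", ".join(sorted(added)))


# Same or reduced set of required traits: the request passes through.
check_rebuild_image_traits({"HW_CPU_X86_AVX2"}, set())

# New required trait that the original image did not have: rejected.
try:
    check_rebuild_image_traits(set(), {"HW_NIC_OFFLOAD_XX"})
except RebuildTraitsNotSupported as exc:
    print(exc)
```

Computing the set difference is equivalent to the issubset() test in the
pseudo code, and it also lets the error message name the offending traits.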
On Wed, May 2, 2018 at 7:07 AM, Matt Riedemann <mriedemos at gmail.com> wrote:
> On 5/1/2018 5:26 PM, Arvind N wrote:
>
>> In cases of rebuilding of an instance using a different image where the
>> image traits have changed between the original launch and the rebuild, is
>> it reasonable to ask to just re-launch a new instance with the new image?
>>
>> The argument for this approach is that given that the requirements have
>> changed, we want the scheduler to pick and allocate the appropriate host
>> for the instance.
>>
>
> We don't know if the requirements have changed with the new image until we
> check them.
>
> Here is another option:
>
> What if the API compares the original image required traits against the
> new image required traits, and if the new image has required traits which
> weren't in the original image, then (punt) fail in the API? Then you would
> at least have a chance to rebuild with a new image that has required traits
> as long as those required traits are less than or equal to the originally
> validated traits for the host on which the instance is currently running.
>
>
>> The approach above also gives you consistent results vs the other
>> approaches where the rebuild may or may not succeed depending on how the
>> original allocation of resources went.
>>
>>
> Consistently frustrating, I agree. :) Because as a user, I can rebuild
> with some images (that don't have required traits) and can't rebuild with
> other images (that do have required traits).
>
> I see no difference with this and being able to rebuild (with a new image)
> some instances (image-backed) and not others (volume-backed). Given that, I
> expect if we punt on this, someone will just come along asking for the
> support later. Could be a couple of years from now when everyone has moved
> on and it then becomes someone else's problem.
>
>> For example (from Alex Xu), if you launched an instance on a host which has
>> two SRIOV NICs: one is a normal SRIOV NIC (A), the other has some kind of
>> offload feature (B).
>>
>> So, the original request is: resources=SRIOV_VF:1 The instance gets a VF
>> from the normal SRIOV nic(A).
>>
>> But with a new image, the new request is: resources=SRIOV_VF:1
>> traits=HW_NIC_OFFLOAD_XX
>>
>> With all the solutions discussed in the thread, a rebuild request like the
>> one above may or may not succeed, depending on whether NIC A or NIC B was
>> allocated during the initial launch.
>>
>> Remember that in a rebuild, new allocations don't happen; we have to reuse
>> the existing allocations.
>>
>> Given the above background, there seem to be two competing options.
>>
>> 1. Fail in the API saying you can't rebuild with a new image with new
>> required traits.
>>
>> 2. Look at the current allocations for the instance and try to match the
>> new requirement from the image with the allocations.
>>
>> With #1, we get consistent results in regards to how rebuilds are treated
>> when the image traits changed.
>>
>> With #2, the rebuild may or may not succeed, depending on how well the
>> original allocations match up with the new requirements.
>>
>> #2 will also need to account for handling preferred traits or
>> granular resource traits if we decide to implement them for images at some
>> point...
>>
>
> Option 10: Don't support image-defined traits at all. I know that won't
> happen though.
>
> At this point I'm exhausted with this entire issue and conversation and
> will probably bow out and need someone else to step in with different
> perspective, like melwitt or dansmith.
>
> All of the solutions are bad in their own way, either because they add
> technical debt and poor user experience, or because they make rebuild more
> complicated and harder to maintain for the developers.
>
> --
>
> Thanks,
>
> Matt
--
Arvind N