[openstack-dev] [Nova][Ironic] Question about scheduling two instances to same baremetal node
Sylvain Bauza
sbauza at redhat.com
Fri Jan 9 14:21:13 UTC 2015
On 09/01/2015 15:07, Murray, Paul (HP Cloud) wrote:
>
> > There is a bug when running nova with ironic:
> > https://bugs.launchpad.net/nova/+bug/1402658
>
> I filed this bug – it has been a problem for us.
>
> > The problem is that on the scheduler side the IronicHostManager will
> > consume all the resources for that node, no matter how much resource
> > the instance actually uses. But on the compute node side, the
> > ResourceTracker won't consume resources like that; it consumes them as
> > it would for a normal virtual instance. The ResourceTracker will update
> > the resource usage once the instance's resources are claimed, so the
> > scheduler will see free resources on that node and will try to
> > schedule another new instance to it.
>
> You have summed up the problem nicely – i.e.: the resource
> availability is calculated incorrectly for ironic nodes.
>
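(To make the mismatch concrete, a rough Python sketch of the two code paths
described above; simplified, with invented names, not the actual Nova code:)

    # Illustration only: how the two components account for the same
    # 1024MB baremetal node when a 512MB instance lands on it.

    def scheduler_side_consume(node, flavor):
        # IronicHostManager: one instance exhausts the whole node,
        # regardless of what the flavor actually asks for.
        node['free_ram_mb'] = 0
        node['free_disk_gb'] = 0
        node['vcpus_used'] = node['vcpus_total']

    def resource_tracker_side_consume(node, flavor):
        # ResourceTracker on the compute host: deducts only the flavor's
        # request, as it would for a virtual instance. A 1024MB node
        # still reports 512MB free after a 512MB instance, so the
        # scheduler happily places a second one there.
        node['free_ram_mb'] -= flavor['memory_mb']
        node['free_disk_gb'] -= flavor['root_gb']
        node['vcpus_used'] += flavor['vcpus']
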
> > I took a look at this, and there is the NumInstancesFilter, which
> > limits how many instances can be scheduled to one host. So could we
> > just use this filter to achieve the goal? The maximum number of
> > instances is configured by the option 'max_instances_per_host', and we
> > could make the virt driver report how many instances it supports. The
> > ironic driver could just report max_instances_per_host=1, and the
> > libvirt driver could report max_instances_per_host=-1, meaning no
> > limit. Then we could remove the IronicHostManager and make the
> > scheduler side simpler. Does that make sense, or are there more traps?
>
> Makes sense, but solves the wrong problem. The problem is what you
> said above – i.e.: the resource availability is calculated incorrectly
> for ironic nodes.
>
> The right solution would be to fix the resource tracker. The ram
> resource on an ironic node has different allocation behavior from a
> regular node: the test to see if a new instance fits is the same, but
> instead of deducting the requested amount to get the remaining
> availability it should simply return 0. This should be dealt with in
> the new resource objects ([2] below), either by having a different
> version of the resource object for ironic nodes (certainly doable and
> the most sensible option – resources should be presented according to
> the resources on the host), or by having the ram resource object cater
> for the difference in its calculations.
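(A minimal sketch of that idea against the proposed resource objects; the
class and method names below are made up for illustration, the real objects
are still under review in [2]:)

    class RamResource(object):
        # Regular compute node RAM accounting.
        def __init__(self, total_mb):
            self.total_mb = total_mb
            self.free_mb = total_mb

        def fits(self, requested_mb):
            # The fit test is the same in both cases.
            return requested_mb <= self.free_mb

        def claim(self, requested_mb):
            # Regular node: deduct exactly what was requested.
            self.free_mb -= requested_mb

    class IronicRamResource(RamResource):
        def claim(self, requested_mb):
            # Ironic node: one instance consumes the whole node, so the
            # remaining availability is simply 0, whatever the flavor asked.
            self.free_mb = 0
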
>
> I have a local fix for this that I was too shy to propose upstream
> because it’s a bit hacky and will hopefully be obsolete soon. I could
> share it if you like.
>
> Paul
>
> [2] https://review.openstack.org/#/c/127609/
>
Agreed, I think that [2] will help a lot. Until it's done, are we really
sure we want to fix the bug? It can be worked around by creating flavors
that are sized to at least half of a compute node, and I really would not
like to add more tech debt.
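
(For example, for the 1024MB node in the bug report, a flavor sized to the
whole node keeps a second instance from fitting even while the accounting is
wrong. With python-novaclient that would look roughly like the following;
the credentials and endpoint are placeholders:)

    from novaclient import client

    # Placeholders: use real credentials and your Keystone endpoint.
    nova = client.Client('2', 'admin', 'password', 'admin',
                         'http://keystone:5000/v2.0')

    # A flavor matching the full baremetal node (1024MB RAM, 10GB disk,
    # 1 vCPU here), so only one instance can ever be scheduled to it.
    nova.flavors.create(name='baremetal-small', ram=1024, vcpus=1, disk=10)
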
-Sylvain
> From: Sylvain Bauza <sbauza at redhat.com>
> Date: 9 January 2015 at 09:17
> Subject: Re: [openstack-dev] [Nova][Ironic] Question about scheduling
> two instances to same baremetal node
> To: "OpenStack Development Mailing List (not for usage questions)"
> <openstack-dev at lists.openstack.org>
>
> On 09/01/2015 09:01, Alex Xu wrote:
>
> Hi, All
>
> There is a bug when running nova with ironic:
> https://bugs.launchpad.net/nova/+bug/1402658
>
> The case is simple: one baremetal node with 1024MB ram, then boot
> two instances with a 512MB ram flavor. Those two instances will both be
> scheduled to the same baremetal node.
>
> The problem is that on the scheduler side the IronicHostManager will
> consume all the resources for that node, no matter how much resource
> the instance actually uses. But on the compute node side, the
> ResourceTracker won't consume resources like that; it consumes them as
> it would for a normal virtual instance. The ResourceTracker will update
> the resource usage once the instance's resources are claimed, so the
> scheduler will see free resources on that node and will try to
> schedule another new instance to it.
>
> I took a look at this, and there is the NumInstancesFilter, which
> limits how many instances can be scheduled to one host. So could we
> just use this filter to achieve the goal? The maximum number of
> instances is configured by the option 'max_instances_per_host', and we
> could make the virt driver report how many instances it supports. The
> ironic driver could just report max_instances_per_host=1, and the
> libvirt driver could report max_instances_per_host=-1, meaning no
> limit. Then we could remove the IronicHostManager and make the
> scheduler side simpler. Does that make sense, or are there more traps?
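(A rough sketch of how the filter side of that proposal could look; this is
simplified and assumes the virt driver could report a per-host instance
limit, which is the part that does not exist today:)

    # Simplified sketch, not the real NumInstancesFilter code.

    class NumInstancesFilter(object):
        def host_passes(self, host_state, filter_properties):
            # Hypothetically reported by the virt driver:
            #   ironic driver  -> 1
            #   libvirt driver -> -1 (no limit)
            max_instances = getattr(host_state, 'max_instances_per_host', -1)
            if max_instances < 0:
                return True
            return host_state.num_instances < max_instances
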
>
> Thanks in advance for any feedback and suggestions.
>
> Mmm, I think I disagree with your proposal. Let me explain why as best
> I can:
>
> tl;dr: any proposal that doesn't do the claiming at the scheduler level
> tends to be wrong
>
> The ResourceTracker should only be a module that provides stats about
> compute nodes to the Scheduler.
> How the Scheduler consumes these resources when making a decision
> should only be a Scheduler concern.
>
> Here, the problem is that the decision making is also shared with the
> ResourceTracker because of the claiming system managed by the context
> manager when booting an instance. It means that we have 2 distinct
> decision makers for validating a resource.
>
> Let's set realism aside for a moment and discuss what a decision could
> mean for something other than a compute node. OK, let's say a volume.
> Provided that *something* reported the volume statistics to the
> Scheduler, it would be the Scheduler that decides whether a volume
> manager can accept a volume request. There is no sense in re-validating
> the Scheduler's decision on the volume manager, beyond maybe doing
> some error handling.
>
> We know that the current model is kinda racy with Ironic because there
> is a 2-stage validation (see [1]). I'm not in favor of making the model
> more complex, but would rather put all the claiming logic in the
> scheduler, which is a longer path to win, but a safer one.
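(To make the "single decision maker" point concrete, a purely conceptual
sketch of claiming in the scheduler; nothing like this exists today and the
names are invented:)

    # Conceptual only: the scheduler both picks the host and claims the
    # resources, so the compute node no longer re-validates the decision.

    class ClaimingScheduler(object):
        def __init__(self, host_states):
            self.host_states = host_states

        def schedule_and_claim(self, request_spec):
            for host in self.host_states:
                if host.fits(request_spec):
                    # Claim under the scheduler's control instead of in
                    # the compute node's ResourceTracker.
                    host.claim(request_spec)
                    return host
            raise Exception('No valid host was found')
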
>
> -Sylvain
>
> [1] https://bugs.launchpad.net/nova/+bug/1341420
>
>
> Thanks
>
> Alex
>