[openstack-dev] [Nova][Ironic] Question about scheduling two instances to same baremetal node

Alex Xu soulxu at gmail.com
Fri Jan 9 15:02:12 UTC 2015


2015-01-09 22:07 GMT+08:00 Murray, Paul (HP Cloud) <pmurray at hp.com>:

>   >There is a bug when running nova with ironic
> https://bugs.launchpad.net/nova/+bug/1402658
>
>
>
> I filed this bug – it has been a problem for us.
>
>
>
> >The problem is that, at the scheduler side, the IronicHostManager consumes
> >all of the node's resources, no matter how much the instance actually uses.
> >But at the compute node side, the ResourceTracker does not consume resources
> >like that; it consumes them as it would for a normal virtual instance. Once
> >the instance's resources are claimed, the ResourceTracker updates the
> >resource usage, so the scheduler sees free resources on that node and will
> >try to schedule another new instance there
>
>
>
> You have summed up the problem nicely – i.e.: the resource availability is
> calculated incorrectly for ironic nodes.
>
>
>
> >I took a look at that: there is a NumInstancesFilter, which limits how many
> >instances can be scheduled to one host. So can we just use this filter to
> >reach the goal? The maximum number of instances is configured by the option
> >'max_instances_per_host', and we can make the virt driver report how many
> >instances it supports. The ironic driver can just report
> >max_instances_per_host=1, and the libvirt driver can report
> >max_instances_per_host=-1, meaning no limit. Then we could remove the
> >IronicHostManager and make the scheduler side simpler. Does that make
> >sense, or are there more traps?
>
>
>
>
>
> Makes sense, but solves the wrong problem. The problem is what you said
> above – i.e.: the resource availability is calculated incorrectly for
> ironic nodes.
>
> The right solution would be to fix the resource tracker. The ram resource
> on an ironic node has different allocation behavior to a regular node. The
> test to see if a new instance fits is the same, but instead of deducting
> the requested amount to get the remaining availability it should simply
> return 0. This should be dealt with in the new resource objects ([2] below),
> either by having a different version of the resource object for ironic nodes
> (certainly doable and the most sensible option – resources should be
> presented according to the resources on the host), or by having the ram
> resource object cater for the difference in its calculations.
>
Dang it, I reviewed that spec... why didn't I spot that? :( Totally beat me!
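
Just to check I understand the suggestion, here is a rough sketch of how I
read it (purely hypothetical names, not the actual resource objects API from
[2]): the fit test stays the same, but a claim on an ironic node consumes the
whole node:

    # Hypothetical sketch only. The fit test is unchanged, but a claim on a
    # baremetal node leaves no remaining availability behind.

    class RamResource(object):
        def __init__(self, total_mb, used_mb=0):
            self.total_mb = total_mb
            self.used_mb = used_mb

        def fits(self, requested_mb):
            # Same test for virtual and baremetal hosts.
            return requested_mb <= self.total_mb - self.used_mb

        def claim(self, requested_mb):
            # Virtual host: deduct only what the instance asked for.
            self.used_mb += requested_mb

    class IronicRamResource(RamResource):
        def claim(self, requested_mb):
            # Baremetal host: the instance owns the whole node, so report
            # zero free RAM afterwards.
            self.used_mb = self.total_mb

With something like that, the scheduler and the ResourceTracker would agree
that the node is full after the first claim.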

>  I have a local fix for this that I was too shy to propose upstream
> because it’s a bit hacky and will hopefully be obsolete soon. I could share
> it if you like.
>
> Paul
>
> [2] https://review.openstack.org/#/c/127609/
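
(For reference, the filter-based idea I floated above would have looked
roughly like the sketch below; the class and the 'max_instances_per_host'
host_state attribute are made up for illustration, this is not the real
NumInstancesFilter code. As you say, though, it doesn't fix the underlying
accounting.)

    # Hypothetical sketch: the virt driver reports a per-host instance limit
    # (1 for ironic, -1 meaning unlimited for libvirt) and the filter simply
    # compares the current instance count against it.

    class MaxInstancesPerHostFilter(object):
        def host_passes(self, host_state, filter_properties):
            limit = getattr(host_state, 'max_instances_per_host', -1)
            if limit < 0:
                # -1 means no limit, e.g. what libvirt would report.
                return True
            return host_state.num_instances < limit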
>
>
>
>
>
> From: Sylvain Bauza <sbauza at redhat.com>
> Date: 9 January 2015 at 09:17
> Subject: Re: [openstack-dev] [Nova][Ironic] Question about scheduling two
> instances to same baremetal node
> To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-dev at lists.openstack.org>
>
>
>
> On 09/01/2015 09:01, Alex Xu wrote:
>
>  Hi, All
>
>
>
> There is a bug when running nova with ironic
> https://bugs.launchpad.net/nova/+bug/1402658
>
>
>
> The case is simple: one baremetal node with 1024MB of RAM, then boot two
> instances with a 512MB RAM flavor.
>
> Those two instances will be scheduled to the same baremetal node.
>
>
>
> The problem is that, at the scheduler side, the IronicHostManager consumes
> all of the node's resources, no matter how much the instance actually uses.
> But at the compute node side, the ResourceTracker does not consume resources
> like that; it consumes them as it would for a normal virtual instance. Once
> the instance's resources are claimed, the ResourceTracker updates the
> resource usage, so the scheduler sees free resources on that node and will
> try to schedule another new instance there.
>
>
>
> I took a look at that: there is a NumInstancesFilter, which limits how many
> instances can be scheduled to one host. So can we just use this filter to
> reach the goal? The maximum number of instances is configured by the option
> 'max_instances_per_host', and we can make the virt driver report how many
> instances it supports. The ironic driver can just report
> max_instances_per_host=1, and the libvirt driver can report
> max_instances_per_host=-1, meaning no limit. Then we could remove the
> IronicHostManager and make the scheduler side simpler. Does that make sense,
> or are there more traps?
>
>
>
> Thanks in advance for any feedback and suggestions.
>
>
>
>
>
> Mmm, I think I disagree with your proposal. Let me explain why as best I
> can:
>
> tl;dr: any proposal other than claiming at the scheduler level tends to be
> wrong.
>
> The ResourceTracker should only be a module that provides stats about
> compute nodes to the Scheduler.
> How the Scheduler consumes these resources when making a decision should be
> a Scheduler-only concern.
>
> Here, the problem is that the decision making is also shared with the
> ResourceTracker, because of the claiming system managed by the context
> manager when booting an instance. That means we have two distinct decision
> makers validating a resource.
>
> Let's step away from reality for a moment and discuss what a decision could
> mean for something other than a compute node. OK, let's say a volume.
> Provided that *something* reported volume statistics to the Scheduler, it
> would be the Scheduler that decides whether a volume manager can accept a
> volume request. There is no sense in re-validating the Scheduler's decision
> on the volume manager, beyond maybe some error handling.
>
> We know that the current model is kind of racy with Ironic because there is
> a 2-stage validation (see [1]). I'm not in favor of making the model more
> complex, but rather of putting all the claiming logic in the scheduler,
> which is a longer path to win, but a safer one.
>
> -Sylvain
>
> [1]  https://bugs.launchpad.net/nova/+bug/1341420
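
If I follow, "claiming at the scheduler level" would mean something like the
sketch below (invented names, just my reading of the proposal): the scheduler
consumes from its own in-memory host state under a lock as soon as it picks a
host, so there is a single decision maker and a second request in the same
window already sees the node as full.

    import threading

    # Rough sketch of scheduler-side claiming, with invented names.

    class HostState(object):
        def __init__(self, free_ram_mb, is_baremetal=False):
            self.free_ram_mb = free_ram_mb
            self.is_baremetal = is_baremetal
            self._lock = threading.Lock()

        def try_claim(self, requested_mb):
            with self._lock:
                if requested_mb > self.free_ram_mb:
                    return False
                if self.is_baremetal:
                    # A baremetal node is consumed entirely by one instance.
                    self.free_ram_mb = 0
                else:
                    self.free_ram_mb -= requested_mb
                return True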
>
>
>   Thanks
>
> Alex