[openstack-dev] [nova] Resource tracker

Vishvananda Ishaya vishvananda at gmail.com
Tue Oct 7 17:56:42 UTC 2014


On Oct 7, 2014, at 6:21 AM, Daniel P. Berrange <berrange at redhat.com> wrote:

> On Mon, Oct 06, 2014 at 02:55:20PM -0700, Joe Gordon wrote:
>> On Mon, Oct 6, 2014 at 6:03 AM, Gary Kotton <gkotton at vmware.com> wrote:
>> 
>>> Hi,
>>> At the moment the resource tracker in Nova ignores the statistics that
>>> are returned by the hypervisor and calculates the values on its own. Not
>>> only is this highly error prone, it is also very costly: all of the
>>> resources on the host are read from the database. Beyond the cost, the
>>> fact that we are overcalculating the resources used by the hypervisor is
>>> also an issue. In my opinion this leads us to not fully utilize the hosts
>>> at our disposal. I have a number of concerns with this approach and would
>>> like to know why we are not using the actual resources reported by the
>>> hypervisor.
>>> The reason for asking this is that I have added a patch which uses the
>>> actual hypervisor resources returned, and it led to a discussion on the
>>> particular review (https://review.openstack.org/126237).
>>> 
>> 
>> So it sounds like you have mentioned two concerns here:
>> 
>> 1. The current method to calculate hypervisor usage is expensive in terms
>> of database access.
>> 2. Nova ignores the statistics that are returned by the hypervisor and
>> uses its own calculations.
>> 
>> 
>> To #1, maybe we can do something better: optimize the query, cache the
>> result, etc. As for #2, nova intentionally doesn't use the hypervisor
>> statistics for a few reasons:
>> 
>> * Make scheduling more deterministic, make it easier to reproduce issues
>> etc.
>> * Things like memory ballooning and thin provisioning in general mean that
>> the hypervisor is not reporting how much of the resources can be allocated
>> but rather how much is currently in use (this behavior can vary from
>> hypervisor to hypervisor today AFAIK -- which makes things confusing). So
>> if I don't want to over subscribe RAM, and the hypervisor is using memory
>> ballooning, the hypervisor statistics are mostly useless. I am sure there
>> are more complex schemes that we can come up with that allow us to factor
>> in the properties of thin provisioning, but is the extra complexity worth
>> it?
> 
> That is just an example of problems with the way Nova virt drivers
> /currently/ report usage to the scheduler. It is easily within the
> realm of possibility for the virt drivers to be changed so that they
> report stats which take into account things like ballooning and thin
> provisioning so that we don't oversubscribe. Ignoring the hypervisor
> stats entirely and re-doing the calculations in the resource tracker
> code is just a crude workaround really. It is just swapping one set
> of problems for a new set of problems.

+1 let's make the hypervisors report detailed enough information that we
can do it without having to recalculate.
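
To make the distinction in this thread concrete, here is a minimal sketch
of the two ways of deriving free RAM on a host. It is illustrative only and
is not Nova's resource tracker code: the Instance class, its field names,
and both function names are hypothetical, and the figures are made up.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    # Hypothetical fields, for illustration only.
    flavor_memory_mb: int    # RAM promised to the guest by its flavor
    resident_memory_mb: int  # RAM the guest occupies right now (e.g. after ballooning)


def free_by_allocation(total_mb, reserved_mb, instances):
    """Free RAM derived from what has been promised to guests."""
    return total_mb - reserved_mb - sum(i.flavor_memory_mb for i in instances)


def free_by_usage(total_mb, reserved_mb, instances):
    """Free RAM derived from what guests are using at this moment."""
    return total_mb - reserved_mb - sum(i.resident_memory_mb for i in instances)


# Two guests whose balloons are currently deflated well below their flavors.
instances = [Instance(4096, 1024), Instance(8192, 2048)]
total_mb, reserved_mb = 16384, 512

print(free_by_allocation(total_mb, reserved_mb, instances))  # 3584
print(free_by_usage(total_mb, reserved_mb, instances))       # 12800
```

Scheduling against the second number would oversubscribe RAM as soon as the
guests grow back to their flavor size. A driver that reported both figures
would give the scheduler what it needs without a separate recalculation.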

Vish

> 
>> That being said I am fine with discussing in a spec the idea of adding an
>> option to use the hypervisor reported statistics, as long as it is off by
>> default.
> 
> I'm against the idea of adding config options to switch between multiple
> codepaths because it is just punting the problem to the admins, who are
> in an even worse position to decide what is best. It is asking whether you
> would rather your cloud have bug A or bug B. We should be fixing the data
> the hypervisors report so that the resource tracker doesn't have to ignore
> it, and giving the admins something which just works, so they avoid having
> to choose between 2 differently broken options.
> 
> 
> Regards,
> Daniel
> -- 
> |: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
> |: http://libvirt.org              -o-             http://virt-manager.org :|
> |: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
> |: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|
> 