[Openstack-operators] [nova] Automatically disabling compute service on RBD EMFILE failures
Daniel P. Berrange
berrange at redhat.com
Mon Jan 9 10:23:32 UTC 2017
On Sat, Jan 07, 2017 at 12:04:25PM -0600, Matt Riedemann wrote:
> A few weeks ago someone in the operators channel was talking about issues
> with ceph-backed nova-compute hitting "too many open files" OSErrors.
>
> We have a bug reported that's very similar sounding:
>
> https://bugs.launchpad.net/nova/+bug/1651526
>
> During the periodic update_available_resource audit, the call to RBD to get
> disk usage fails with the EMFILE OSError. Since this happens in a periodic
> task it doesn't cause any direct operations to fail, but it will cause
> issues with scheduling because that host is effectively down; however,
> nothing sets the service to down (disabled).
>
> I had proposed a solution in the bug report that we could automatically
> disable the service for that host when this happens, and then automatically
> enable the service again if/when the next periodic task run is successful.
> Disabling the service would take that host out of contention for scheduling
> and may also trigger an alarm for the operator to investigate the failure
> (although if there are EMFILE errors from the ceph cluster I'm guessing
> alarms should already be going off).
>
> Anyway, I wanted to see how hacky of an idea this is. We already
> automatically enable/disable the service from the libvirt driver when the
> connection to libvirt itself drops, via an event callback. This would be
> similar, albeit less sophisticated since it doesn't use an event listening
> mechanism; we'd have to maintain some local state in memory to know whether
> we need to enable/disable the service again. And it seems pretty
> hacky/one-offish to handle this just for the RBD failure, but maybe we
> handle it generically for any EMFILE error when collecting disk usage in
> the resource audit?
Presumably this deployment was using the default Linux file descriptor
limit, which is a ridiculously low 1024. Ceph with 900 OSDs will
potentially need 900 file descriptors, leaving almost no slack for Nova
to do other work. I'd be willing to bet there are other scenarios in
which Nova would hit the 1024 FD limit under high usage, not merely
Ceph. So perhaps, regardless of whether Ceph is used, we should just
recommend that Nova always run with at least 4096 FDs, check that in
initialize() on startup, and log a warning if the limit is lower than
this.
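Such a startup check could be as simple as the sketch below (the 4096
floor and the check_fd_limit name are illustrative, not an existing
Nova helper):

    import logging
    import resource

    LOG = logging.getLogger(__name__)

    RECOMMENDED_NOFILE = 4096  # suggested floor, not an official constant

    def check_fd_limit():
        """Warn at startup if the soft FD limit looks too low."""
        soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        if soft < RECOMMENDED_NOFILE:
            LOG.warning(
                'Open file descriptor limit is %d, lower than the '
                'recommended %d; RBD-backed hosts in particular may hit '
                'EMFILE errors. Consider raising ulimit -n / LimitNOFILE.',
                soft, RECOMMENDED_NOFILE)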
With pretty much all distros using systemd, it would be nice if Nova
shipped a standard systemd unit file, which could then also contain
the recommended higher FD limit so people get sane limits out of the
box.
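For example, a unit file fragment along these lines would do it (the
service name, binary path and exact limit are illustrative only, not an
official unit file shipped by Nova today):

    # nova-compute.service (fragment)
    [Service]
    Type=simple
    ExecStart=/usr/bin/nova-compute
    # Raise the FD limit above the 1024 default so RBD-heavy hosts
    # have some slack.
    LimitNOFILE=4096
    Restart=on-failure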
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://entangle-photo.org -o- http://search.cpan.org/~danberr/ :|