[openstack-dev] [ceilometer]ceilometer-collector high CPU usage

Gyorgy Szombathelyi gyorgy.szombathelyi at doclerholding.com
Wed Feb 17 13:46:47 UTC 2016


> 
> hi,
Hi Gordon,

> 
> this seems to be similar to a bug we were tracking in earlier[1].
> basically, any service with a listener never seemed to idle properly.
> 
> based on earlier investigation, we found it relates to the heartbeat
> functionality in oslo.messaging. i'm not entirely sure if it's because of it or
> some combination of things including it. the short answer is to disable
> heartbeat by setting heartbeat_timeout_threshold = 0 and see if it fixes your
> cpu usage. you can track the comments in the bug.

As I see in the bug report, you mention that the problem is only with the notification agent,
and the collector is fine. For me it is exactly the opposite.

strace-ing the two processes:
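(The output below is trimmed to the repeating pattern; something like
strace -p <pid> against each daemon should show the same.)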

Notification agent:
----------------------
epoll_wait(4, {}, 1023, 43)             = 0
epoll_wait(4, {}, 1023, 0)              = 0
epoll_ctl(4, EPOLL_CTL_DEL, 8, {EPOLLWRNORM|EPOLLMSG|EPOLLERR|EPOLLHUP|EPOLLRDHUP|EPOLLONESHOT|EPOLLET|0x1ec88000, {u32=32738, u64=24336577484324834}}) = 0
recvfrom(8, 0x7fe2da3a4084, 7, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(4, EPOLL_CTL_ADD, 8, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=8, u64=40046962262671368}}) = 0
epoll_wait(4, {}, 1023, 1)              = 0
epoll_ctl(4, EPOLL_CTL_DEL, 24, {EPOLLWRNORM|EPOLLMSG|EPOLLERR|EPOLLHUP|EPOLLRDHUP|EPOLLONESHOT|EPOLLET|0x1ec88000, {u32=32738, u64=24336577484324834}}) = 0
recvfrom(24, 0x7fe2da3a4084, 7, 0, 0, 0) = -1 EAGAIN (Resource temporarily unavailable)
epoll_ctl(4, EPOLL_CTL_ADD, 24, {EPOLLIN|EPOLLPRI|EPOLLERR|EPOLLHUP, {u32=24, u64=40046962262671384}}) = 0
epoll_wait(4, {}, 1023, 0)              = 0

ceilometer-collector:
-------------------------
epoll_wait(4, {}, 1023, 0)              = 0
epoll_wait(4, {}, 1023, 0)              = 0
epoll_wait(4, {}, 1023, 0)              = 0
epoll_wait(4, {}, 1023, 0)              = 0
epoll_wait(4, {}, 1023, 0)              = 0
epoll_wait(4, {}, 1023, 0)              = 0
epoll_wait(4, {}, 1023, 0)              = 0
epoll_wait(4, {}, 1023, 0)              = 0
epoll_wait(4, {}, 1023, 0)              = 0
epoll_wait(4, {}, 1023, 0)              = 0

So at least the notification agent does something between the crazy epoll()s.
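
For what it's worth, the busy pattern itself is easy to reproduce with plain
eventlet; a minimal sketch (not ceilometer code, just an illustration of how a
permanently-runnable greenthread keeps the hub calling epoll_wait() with a
zero timeout):

import eventlet
eventlet.monkey_patch()

def busy_loop():
    while True:
        # sleep(0) only yields; the greenthread is runnable again
        # immediately, so the hub polls with a zero timeout forever
        eventlet.sleep(0)

eventlet.spawn(busy_loop)
eventlet.sleep(30)  # strace -p <pid> now shows epoll_wait(..., 0) spinning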

It is the same with or without heartbeat_timeout_threshold = 0 in [oslo_messaging_rabbit].
So something must still be wrong with the listeners; I think the bug[1] should not be closed.
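
For completeness, what I tried in ceilometer.conf was simply:

[oslo_messaging_rabbit]
heartbeat_timeout_threshold = 0

Reproducing should not even need a full deployment; a minimal standalone
listener along these lines (a rough sketch, assuming a local RabbitMQ with the
default guest credentials; the topic name is just an example) should be enough
to watch the idle CPU usage in isolation:

import time

import eventlet
eventlet.monkey_patch()

from oslo_config import cfg
import oslo_messaging


class DummyEndpoint(object):
    # method name matches the notification priority ("info")
    def info(self, ctxt, publisher_id, event_type, payload, metadata):
        print(event_type)


transport = oslo_messaging.get_notification_transport(
    cfg.CONF, url='rabbit://guest:guest@localhost:5672/')
targets = [oslo_messaging.Target(topic='notifications')]
listener = oslo_messaging.get_notification_listener(
    transport, targets, [DummyEndpoint()], executor='eventlet')
listener.start()
time.sleep(600)  # the listener is idle here, yet the CPU usage shows up
listener.stop()
listener.wait()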

Br,
György

> 
> [1] https://bugs.launchpad.net/oslo.messaging/+bug/1478135
> 
> On 17/02/2016 4:14 AM, Gyorgy Szombathelyi wrote:
> > Hi!
> >
> > Excuse me, if the following question/problem is a basic one, already
> > known problem, or even a bad setup on my side.
> >
> > I just noticed that the most CPU consuming process in an idle
> > OpenStack cluster is ceilometer-collector. When there are only
> > 10-15 samples/minute, it just constantly eats about 15-20% CPU.
> >
> > I started to debug, and noticed that it epoll()s constantly with a
> > zero timeout, so it seems it just polls for events in a tight loop.
> > I found out that _maybe_ the python side of the problem is
> > oslo_messaging.get_notification_listener() with the eventlet executor.
> > A quick search showed that this function is only used in aodh_listener
> > and ceilometer_collector, and both are using relatively high CPU even
> > if they're just 'listening'.
> >
> > My skills for further debugging are limited, but I'm just curious why
> > this listener uses so much CPU, while other executors which are also using
> > eventlet are not that bad.
> >
> > Br,
> > György
> >
> >
> 
> --
> gord
> 


