[openstack-dev] [nova][ceilometer] model for ceilo/nova interaction going forward

Sandy Walsh sandy.walsh at RACKSPACE.COM
Fri Nov 23 13:31:30 UTC 2012


From: Jiang, Yunhong [yunhong.jiang at intel.com]
Sent: Friday, November 23, 2012 3:37 AM

>> Regardless, could you provide some info on the services that are suffering so
>> badly that eventlet lags the periodic_task for so long? Sounds like a nova bug
>> we should fix? Actually, if we're monitoring green thread counts via the periodic
>> task we should be able to warn upstream monitoring before it becomes an
>> issue.
>
>Data tells everything.
>
>I did a quick experiment on my SandyBridge EP system. The period_all.txt file gives the time it takes to finish one periodic_task run. Nova_list is the output of my nova list.
>
>I have pasted the durations of the periodic tasks below; it takes more than 2 minutes to finish a round!
>
>Of course, you can call it a "nova bug", and we can tune the periodic tasks to shorten the time, but the key thing is that "there is no architectural guarantee of the timeliness of the periodic task" with eventlet and greenthreads, and we need to make sure the periodic task will not block the RPC calls, will not block the status report ......

Thanks for this Jiang ... I agree, data good. Several questions/observations:

1. I agree that there should be no architectural guarantees for periodic_task. But it should be "within reason". It looks like about a 1 minute periodic task with about a 30 second margin of error. If it were a 5 minute periodic task, would we still only see a 30s margin of error? I think a 1 minute periodic task is too high-resolution for this mechanism.

2. Which service is this coming from? I'm assuming Compute? Is the service especially busy? Or are these lags just inherent even under low load?

3. Are you dipping down into user-space (on the instances)? Could this be causing an explosion of green threads that is adding to the latency? My personal view is that user-space monitoring should be out-of-scope for ceilometer ... the "incubation" part of it is around measuring OpenStack, not the applications that run on it.
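
To make the concern in the quoted message concrete (a long periodic task holding up RPC calls and status reports), here is a minimal sketch of how one task that does not yield delays every other task under cooperative scheduling. It uses Python's standard asyncio as a stand-in for eventlet, which is an assumption made purely for illustration; the names heartbeat, hog and run_demo are hypothetical, and none of this is nova or ceilometer code.

```python
import asyncio
import time


async def heartbeat(interval, ticks, gaps):
    """Try to wake every `interval` seconds and record the real gap between wakes."""
    last = time.monotonic()
    for _ in range(ticks):
        await asyncio.sleep(interval)
        now = time.monotonic()
        gaps.append(now - last)
        last = now


async def hog(block_for):
    """Stand-in for a long periodic task that never yields to the scheduler."""
    await asyncio.sleep(0.01)   # let the heartbeat get scheduled first
    time.sleep(block_for)       # blocking call: no other task can run meanwhile


async def run_demo(interval=0.02, ticks=5, block_for=0.2):
    gaps = []
    await asyncio.gather(heartbeat(interval, ticks, gaps), hog(block_for))
    return gaps


if __name__ == "__main__":
    gaps = asyncio.run(run_demo())
    print("longest gap between heartbeats: %.3fs" % max(gaps))
```

The heartbeat asks for a 20 ms interval, but its longest observed gap is at least as long as the blocking call in hog, because nothing else can run until that call returns. Whether the same effect explains the numbers in this thread depends on what the periodic tasks are actually doing, which the log alone does not show.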



--jyh

periodic tasks 2012-11-23 06:53:12.209065
PPP periodic tasks done 2012-11-23 06:53:35.225337
periodic tasks 2012-11-23 06:54:35.226013
PPP periodic tasks done 2012-11-23 06:55:37.826033
periodic tasks 2012-11-23 06:56:37.826802
PPP periodic tasks done 2012-11-23 06:57:40.110641
periodic tasks 2012-11-23 06:58:40.113476
PPP periodic tasks done 2012-11-23 06:59:23.248379
periodic tasks 2012-11-23 07:00:23.248486
PPP periodic tasks done 2012-11-23 07:01:21.104802
periodic tasks 2012-11-23 07:02:21.105146
PPP periodic tasks done 2012-11-23 07:03:31.180916
periodic tasks 2012-11-23 07:04:31.181067
PPP periodic tasks done 2012-11-23 07:05:57.834720
periodic tasks 2012-11-23 07:06:57.834820
PPP periodic tasks done 2012-11-23 07:08:42.046682
periodic tasks 2012-11-23 07:09:42.055921
PPP periodic tasks done 2012-11-23 07:11:49.662566
periodic tasks 2012-11-23 07:12:49.667298
PPP periodic tasks done 2012-11-23 07:14:42.141007
periodic tasks 2012-11-23 07:15:42.141126
PPP periodic tasks done 2012-11-23 07:17:54.201270
periodic tasks 2012-11-23 07:18:54.201386
PPP periodic tasks done 2012-11-23 07:20:54.931626
periodic tasks 2012-11-23 07:21:54.937052
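
For anyone who wants to turn the timestamps above into numbers, a small sketch follows. It assumes the log format shown here (a "periodic tasks" line at the start of a round and a "PPP periodic tasks done" line at the end); the helper names parse and intervals are hypothetical, and only the first five log lines are embedded as a sample.

```python
from datetime import datetime

# First five lines of the log above; paste the full log to analyse every round.
SAMPLE = """\
periodic tasks 2012-11-23 06:53:12.209065
PPP periodic tasks done 2012-11-23 06:53:35.225337
periodic tasks 2012-11-23 06:54:35.226013
PPP periodic tasks done 2012-11-23 06:55:37.826033
periodic tasks 2012-11-23 06:56:37.826802
"""

STAMP = "%Y-%m-%d %H:%M:%S.%f"


def parse(log):
    """Return (kind, timestamp) pairs, where kind is 'start' or 'done'."""
    events = []
    for line in log.splitlines():
        line = line.strip()
        if not line:
            continue
        kind = "done" if " done " in line else "start"
        # The timestamp is always the last two whitespace-separated fields.
        stamp = " ".join(line.split()[-2:])
        events.append((kind, datetime.strptime(stamp, STAMP)))
    return events


def intervals(events):
    """Split consecutive events into run durations and idle gaps, in seconds."""
    runs, idles = [], []
    for (k1, t1), (k2, t2) in zip(events, events[1:]):
        delta = (t2 - t1).total_seconds()
        if (k1, k2) == ("start", "done"):
            runs.append(delta)
        elif (k1, k2) == ("done", "start"):
            idles.append(delta)
    return runs, idles


runs, idles = intervals(parse(SAMPLE))
print("run durations (s):", [round(r, 1) for r in runs])
print("idle gaps (s):", [round(i, 1) for i in idles])
```

On this sample the two runs take roughly 23 s and 63 s, with about 60 s of idle time after each one. That pattern suggests the next round is scheduled about a minute after the previous one finishes, so the time between starts would be the sleep plus however long the run took.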


--jyh

>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


