[openstack-dev] OS tracing??
Sandy Walsh
sandy.walsh at RACKSPACE.COM
Tue Sep 4 23:27:59 UTC 2012
Actually, if you look at the default configs, you'll see we hook into the RPC dispatcher. All incoming/outgoing calls are tracked on all services, which is the majority of what's important. I have some specific ones for compute.run_instance, but they're optional. I'll dig it out and send a paste.
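
As a rough illustration of that kind of hook (this is not the actual tach code or config format), the sketch below wraps one method on a hypothetical Dispatcher class so that every call is timed and reported as a statsd-style timer line. The Dispatcher class, the instrument() helper and the metric name are all assumptions made for the example.

```python
import functools
import time


class Dispatcher(object):
    """Hypothetical stand-in for an RPC dispatcher (not nova's class)."""

    def dispatch(self, method, **kwargs):
        return "%s ok" % method


def instrument(cls, attr_name, emit):
    """Monkeypatch cls.attr_name so each call reports its duration.

    `emit` is any callable taking one string; a real setup would send
    the string to a statsd daemon over UDP instead of collecting it.
    """
    original = getattr(cls, attr_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return original(*args, **kwargs)
        finally:
            elapsed_ms = (time.time() - start) * 1000.0
            # statsd timer line format: <bucket>:<value>|ms
            emit("%s.%s:%.3f|ms" % (cls.__name__, attr_name, elapsed_ms))

    setattr(cls, attr_name, wrapper)
    return original  # returned so the caller can undo the patch later


metrics = []
instrument(Dispatcher, "dispatch", metrics.append)
result = Dispatcher().dispatch("run_instance")
```

After the call, metrics holds a single line beginning with "Dispatcher.dispatch:" and ending in "|ms". The calling code does not change, which is the non-intrusive property being asked for in this thread.
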
Never thought about hooking into python trace, but you'd likely spend more time telling it what *not* to report. Have to think about that a little more.
Eventlet and RPC in-queue time are definitely concerns. That's what Inflight is meant to monitor.
-S
________________________________________
From: Joshua Harlow [harlowja at yahoo-inc.com]
Sent: Tuesday, September 04, 2012 8:14 PM
To: Sandy Walsh; openstack-dev at yahoo-inc.com; OpenStack Development Mailing List
Subject: Re: OS tracing??
Does this mean there is a massive set of functions which u guys have
wrapped this around?
Is there any way that full config can be distributed? I wonder if it is
possible to hook into the profiling/trace functions that python provides
to automatically get this information (filter for certain
namespaces/modules to avoid all the other 'garbage'?) Then no config would
be needed at all, and this could become even more agnostic to what
functions/classes/methods to wrap.
Thoughts?
I wonder whether the interactions with eventlet would cause some issues, though...
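
A minimal sketch of the trace-hook idea above, assuming the standard sys.setprofile() hook and a simple module-name prefix filter. The NamespaceProfiler class and its attribute names are made up for illustration, and nothing here is tach's approach. Note that sys.setprofile() only applies to the thread that calls it, and wall-clock timings taken this way would also include time spent in other greenlets that eventlet switched to in the meantime.

```python
import collections
import sys
import time


class NamespaceProfiler(object):
    """Hypothetical helper: time Python calls in selected modules only."""

    def __init__(self, prefixes):
        self.prefixes = tuple(prefixes)
        self.totals = collections.defaultdict(float)  # seconds per function
        self.counts = collections.defaultdict(int)    # calls per function
        self._starts = {}

    def _hook(self, frame, event, arg):
        # Ignore C-function events; only Python-level call/return matter here.
        if event not in ("call", "return"):
            return
        module = frame.f_globals.get("__name__", "")
        if not module.startswith(self.prefixes):
            return  # filter out everything outside the chosen namespaces
        key = "%s.%s" % (module, frame.f_code.co_name)
        if event == "call":
            self._starts[id(frame)] = time.time()
        else:
            started = self._starts.pop(id(frame), None)
            if started is not None:
                self.totals[key] += time.time() - started
                self.counts[key] += 1

    def start(self):
        sys.setprofile(self._hook)

    def stop(self):
        sys.setprofile(None)
        self._starts.clear()


def work(n):
    return sum(range(n))


# Trace only this module; a real run might pass prefixes such as "nova.".
this_module = globals().get("__name__", "")
profiler = NamespaceProfiler([this_module])
profiler.start()
for _ in range(3):
    work(1000)
profiler.stop()
work_key = "%s.work" % this_module
```

Here profiler.counts[work_key] ends up as 3, and no per-function config is needed. The cost is that the hook runs on every Python call in the thread, so the prefix filter does the job of telling it what not to report.
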
On 9/4/12 4:03 PM, "Sandy Walsh" <sandy.walsh at rackspace.com> wrote:
>Yes, Tach can be used against any python program, but the sample configs
>are for nova services.
>
>You would call your program like this:
>
>tach tach.conf nova_foo nova_foo.conf
>
>This will load tach, load your program and monkeypatch the
>functions/methods defined. The measurements go to statsd (timings,
>counts, etc)
>
>We drive all this via puppet, so when we update our tach puppet variables
>the services all update automatically.
>
>-S
>
>________________________________________
>From: Joshua Harlow [harlowja at yahoo-inc.com]
>Sent: Tuesday, September 04, 2012 4:29 PM
>To: openstack-dev at yahoo-inc.com; Sandy Walsh; OpenStack Development
>Mailing List
>Subject: Re: OS tracing??
>
>Thanks much,
>
>Almost forgot about tach, it seems like it can be hooked into arbitrary
>functions, which is great. It'd be cool if that type of functionality was
>included with say nova, and it could be remotely enabled/disabled as
>needed (say a weird production issue u want to find more info about, so u
>send a special command that says start monitoring this function, or even
>better, integrate it into eventlet so that it can start reporting
>automatically on 'hot' functions).
>
>Is tach monkey patching the functions that it is asked to instrument?
>
>-Josh
>
>On 9/4/12 10:46 AM, "Sandy Walsh" <sandy.walsh at rackspace.com> wrote:
>
>>We've been using Tach to orchestrate OpenStack services and report to
>>statsd/graphite. https://github.com/ohthree/tach ... works great
>>
>>I've been trying to land this Inflight Service branch to measure RPC and
>>greenlet overhead
>>https://review.openstack.org/#/c/11179/
>>BP: https://blueprints.launchpad.net/nova/+spec/monitoring-service
>>
>>Hope it helps,
>>-S
>>
>>
>>
>>
>>From: Joshua Harlow [harlowja at yahoo-inc.com]
>>Sent: Tuesday, September 04, 2012 2:35 PM
>>To: OpenStack Development Mailing List
>>Cc: openstack-dev
>>Subject: [openstack-dev] OS tracing??
>>
>>Has anyone had any luck with trying out some tracing/coverage with the
>>openstack projects to see where the bottlenecks are (outside of test
>>coverage)?
>>
>>I was thinking about possible ways to do this (there seems to be a lot of
>>different libraries that might help) but was wondering if anyone else has
>>figured out the best one to use yet.
>>
>>Ideally it should have the following properties (in my mind):
>>
>>- Non-intrusive (shouldn't require sprinkling of timing/trace logic all
>>  over)
>>- Works with eventlet/greenlet (eventlet is going to switch things in
>>  and out, so that has to be taken account of)
>>- Probably does this via sampling (?)
>>- Writes out some standard format (valgrind like?) for analysis...
>>- Can be turned on and off remotely (nice to have, it'd be cool to
>>  have an API/entrypoint/... that says enable tracing which can be used
>>  on a live system, that system will become slower but it'd be neat)
>>
>>This could be some special 'admin' entry point (restricted to certain
>>users of course) that could also do stuff like 'reload-configs' or
>>'enable-tracing' or 'adjust-log-level' or similar administrative actions
>>that would be useful during those crazy debug
>>sessions (think a simple admin telnet entrypoint to view stats, similar
>>to what memcache/redis provide via their 'stats' commands...)
>>
>>
>>Anyone have any ideas on this :-)
>>
>>
>>
>>
>>
>>-Josh
>>
>>
>>
>>
>