[openstack-dev] OS tracing??
harlowja at yahoo-inc.com
Tue Sep 4 23:37:30 UTC 2012
All good info. RPC should definitely cover most; as for the other ones,
a paste would be awesome. I'm hoping that this info can start to find
spots where issues will pop up, and we can fix them early. As for
eventlet, I did a little digging: they have http://tinyurl.com/cc2uwlc which
seems to take the eventlet switching into account. Might be useful.
I'll also look into the trace stuff. It'd be cool if we could hook into
that to automatically pick up certain modules and start actively tracing
them, then be able to turn this on/off remotely (possibly via the eventlet
backdoor server?). Then you could have some pretty gnarly debug
capabilities (when needed) as well as being able to track exactly what
your server is doing (without having to keep the 'tracing' always on,
which it seems like tach requires?). Of course at some point this might
have to be more intrusive, as you start wanting to know context and the...
On 9/4/12 4:27 PM, "Sandy Walsh" <sandy.walsh at rackspace.com> wrote:
>Actually, if you look at the default configs, you'll see we hook into the
>RPC dispatcher. All incoming/outgoing calls are tracked on all services,
>which is the majority of what's important. I have some specific ones for
>compute.run_instance, but it's optional. I'll dig it out and send a paste.
>Never thought about hooking into python trace, but you'd likely spend
>more time telling it what *not* to report. Have to think about that a bit.
>Eventlet and RPC in-queue time are definitely concerns. That's what
>Inflight is meant to monitor.
>From: Joshua Harlow [harlowja at yahoo-inc.com]
>Sent: Tuesday, September 04, 2012 8:14 PM
>To: Sandy Walsh; openstack-dev at yahoo-inc.com; OpenStack Development
>Subject: Re: OS tracing??
>Does this mean there is a massive set of functions which you guys have
>wrapped this around?
>Is there any way that full config can be distributed? I wonder if it is
>possible to hook into the profiling/trace functions that python provides
>to automatically get this information (filter for certain
>namespaces/modules to avoid all the other 'garbage'?). Then no config would
>be needed at all, and this could become even more agnostic about which
>functions/classes/methods to wrap.
>I wonder, though, if the interactions with eventlet would cause some issues...
>On 9/4/12 4:03 PM, "Sandy Walsh" <sandy.walsh at rackspace.com> wrote:
>>Yes, Tach can be used against any python program, but the sample configs
>>are for nova services.
>>You would call your program like this:
>>tach tach.conf nova_foo nova_foo.conf
>>This will load tach, load your program, and monkeypatch the
>>functions/methods defined. The measurements go to statsd (timings, ...).
>>We drive all this via puppet, so when we update our tach puppet variables
>>the services all update automatically.
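For concreteness, the monkeypatching described above amounts to something like the following (a minimal sketch, not tach's actual code; the `report` callback and the `Compute` class stand in for a statsd client and a real service):

```python
import functools
import time

def instrument(cls, method_name, report):
    """Replace cls.method_name with a timed wrapper; return the
    original so the patch can be undone."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return original(*args, **kwargs)
        finally:
            elapsed_ms = (time.time() - start) * 1000.0
            report("%s.%s" % (cls.__name__, method_name), elapsed_ms)

    setattr(cls, method_name, wrapper)
    return original

# Hypothetical stand-in for a nova service method.
class Compute(object):
    def run_instance(self):
        time.sleep(0.01)

timings = []
instrument(Compute, "run_instance", lambda key, ms: timings.append((key, ms)))
Compute().run_instance()
```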
>>From: Joshua Harlow [harlowja at yahoo-inc.com]
>>Sent: Tuesday, September 04, 2012 4:29 PM
>>To: openstack-dev at yahoo-inc.com; Sandy Walsh; OpenStack Development
>>Subject: Re: OS tracing??
>>Almost forgot about tach; it seems like it can be hooked into arbitrary
>>functions, which is great. It'd be cool if that type of functionality was
>>included with, say, nova, and it could be remotely enabled/disabled as
>>needed (say a weird production issue you want to find more info about, so
>>you send a special command that says start monitoring this function, or
>>even better, integrate it into eventlet so that it can start reporting
>>automatically on 'hot' functions).
>>Is tach monkeypatching the functions that it is asked to instrument?
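One way the remote enable/disable idea could look, sketched with made-up names: leave the wrappers installed permanently but dormant behind a flag, so a special command only has to flip the flag rather than re-patch anything:

```python
import time

class Instrumentor(object):
    """Toggleable timing wrappers: near-zero cost while disabled."""

    def __init__(self):
        self.enabled = set()   # names currently being monitored
        self.samples = []      # (name, elapsed_seconds) pairs

    def wrap(self, name, func):
        def wrapper(*args, **kwargs):
            if name not in self.enabled:  # dormant fast path
                return func(*args, **kwargs)
            start = time.time()
            try:
                return func(*args, **kwargs)
            finally:
                self.samples.append((name, time.time() - start))
        return wrapper

    def enable(self, name):
        self.enabled.add(name)

    def disable(self, name):
        self.enabled.discard(name)
```

The "start monitoring this function" command would then just call enable(name), e.g. from an RPC handler or an eventlet backdoor session.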
>>On 9/4/12 10:46 AM, "Sandy Walsh" <sandy.walsh at rackspace.com> wrote:
>>>We've been using Tach to instrument OpenStack services and report to
>>>statsd/graphite. https://github.com/ohthree/tach ... works great.
>>>I've been trying to land this Inflight Service branch to measure RPC and...
>>>Hope it helps,
>>>From: Joshua Harlow [harlowja at yahoo-inc.com]
>>>Sent: Tuesday, September 04, 2012 2:35 PM
>>>To: OpenStack Development Mailing List
>>>Subject: [openstack-dev] OS tracing??
>>>Has anyone had any luck with trying out some tracing/coverage with the
>>>openstack projects to see where the bottlenecks are (outside of test...)?
>>>I was thinking about possible ways to do this (there seem to be a lot of
>>>different libraries that might help) but was wondering if anyone else has
>>>figured out the best one to use yet.
>>>Ideally it should have the following properties (in my mind):
>>>- Non-intrusive (shouldn't require sprinkling of timing/trace logic all
>>>over)
>>>- Works with eventlet/greenlet (eventlet is going to switch things in
>>>and out, so that has to be taken into account)
>>>- Probably does this via sampling (?)
>>>- Writes out some standard format (valgrind-like?) for analysis...
>>>- Can be turned on and off remotely (nice to have; it'd be cool to
>>>have an API/entrypoint/... that says enable tracing, which could be used
>>>on a live system; that system would become slower, but it'd be neat)
>>>This could be some special 'admin' entry point (restricted to certain
>>>users, of course) that could also do stuff like 'reload-configs',
>>>'enable-tracing', 'adjust-log-level', or similar administrative actions
>>>that would be useful during those crazy debug sessions (think a simple
>>>admin telnet entrypoint to view stats, similar to what memcache/redis
>>>provide via their 'stats' commands...)
>>>Anyone have any ideas on this :-)
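The admin-entrypoint idea above sketches out to something like this: a tiny line-based TCP server answering a memcache-style 'stats' command. This is a toy illustration only (the STATS counters and function names are invented); a real one would hang off the service's event loop, e.g. eventlet, and restrict access:

```python
import socketserver
import threading

STATS = {"requests": 0, "errors": 0}  # stand-in for real service counters

class AdminHandler(socketserver.StreamRequestHandler):
    def handle(self):
        # One command per line, memcache-style.
        for raw in self.rfile:
            cmd = raw.strip().decode("ascii", "ignore")
            if cmd == "stats":
                # Reply with one STAT line per counter, then END.
                for key in sorted(STATS):
                    self.wfile.write(
                        ("STAT %s %s\r\n" % (key, STATS[key])).encode())
                self.wfile.write(b"END\r\n")
            elif cmd == "quit":
                return

def serve_admin(host="127.0.0.1", port=0):
    """Start the admin endpoint on a background thread; port 0 picks a
    free port, readable from server.server_address."""
    server = socketserver.ThreadingTCPServer((host, port), AdminHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

From there, 'reload-configs' or 'enable-tracing' would just be more branches in the command dispatch.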