[osprofiler] Distributed tracing in OpenStack

12 Apr 2019

      Hi,

Distributed tracing is one of must-have features when one wants to track
the full path of request going through different services and APIs. This
makes it similar to shared request-id, but with nice visualization at the
end [1]. In OpenStack the tracing can be achieved via osprofiler library.
The library was introduced 5 years ago, and back then there was no standard
approach on how to do tracing and that's why it stays aside from what has
become a mainstream. Yet there is no single standard, but the major players
are OpenTracing and OpenCensus communities. OpenTracing is represented by
Uber's Jaeger which is the default tracer from k8s world.

Issues and limitations to be fixed:
1. Compatibility. While osprofiler library supports many different storage
drivers, it has only one way of transferring trace context over the wire.
Ideally the library should be compatible with other third-party tracers and
allow traces to start in front of OpenStack APIs (e.g. in user apps) and
continue after (e.g. in storage systems, or network management tools). [2]
2. Operation mode. With osprofiler tracing is initiated by user request,
while in industrial solutions the tracing can be managed centrally via
dynamic sampling policies.
3. In-process trace propagation. Depending on execution model (threaded,
async) the ways of storing current trace context differ. OSProfiler
supports thread-local model, which recently got broken with new async
implementation in openstacksdk [3]. With OpenTracing it is possible to
select the appropriate model alongside with tracer configuration.

What's the plan:
Switching to OpenTracing could be a good option to gain compatibility with
3rd-party solutions. The actual change should go to osprofiler library, but
indirectly affects all OpenStack projects (should it be a global team goal
then?). I'm going to make a PoC of proposed change, so reviews would be
highly appreciated.

Comments, suggestions?

Thanks,
Ilya

[1] e.g.
http://logs.openstack.org/15/650915/4/check/tempest-smoke-py3-osprofiler-red...
[2] https://bugs.launchpad.net/osprofiler/+bug/1798565
[3] https://bugs.launchpad.net/osprofiler/+bug/1818493

Ilya Shakhat

Fox, Kevin M

Monty Taylor

Monty Taylor

Slawomir Kaplonski

tags

participants (4)