[openstack-dev] [oslo][performance] Proposing tail-based sampling in OSProfiler

Boris Pavlovic boris at pavlovic.me
Fri Aug 4 20:24:50 UTC 2017


Ilya,

> Continuous tracing is a cool story, but before proceeding it would be good
> to estimate the overhead. There will be an additional delay introduced by
> OSProfiler library itself and delay caused by events transfer to consumer.
> OSProfiler overhead is critical to minimize. E.g. VM creation produces >1k
> events, which gives almost 2 times performance penalty in DevStack. Would
> be definitely nice to have the same test run on real environment --
> something that Performance Team could help with.


As far as I understand, the idea of continuous tracing is to collect as few
metrics as possible while still getting insight into the request (not all
tracepoints). If you keep only API, RPC and driver calls, it is going to
drastically reduce the amount of metrics collected.
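
To make this concrete, here is a rough sketch of that kind of filtering. The
event shape and the category names are hypothetical, chosen only for
illustration; this is not OSProfiler's actual API.

```python
# Hypothetical sketch: the "category" field and its values are assumptions,
# not part of OSProfiler.
KEPT_CATEGORIES = frozenset({"api", "rpc", "driver"})


def keep_event(event, kept=KEPT_CATEGORIES):
    """Return True if the event belongs to a category worth keeping."""
    return event.get("category") in kept


def filter_events(events, kept=KEPT_CATEGORIES):
    """Drop every event whose category is not in the kept set."""
    return [e for e in events if keep_event(e, kept)]


# A made-up request: most of the noise comes from DB tracepoints.
events = [
    {"category": "api", "name": "POST /servers"},
    {"category": "db", "name": "SELECT instances"},
    {"category": "rpc", "name": "build_and_run_instance"},
    {"category": "db", "name": "UPDATE instances"},
    {"category": "driver", "name": "spawn"},
]
kept_events = filter_events(events)
```

In this toy request only the three API, RPC and driver events survive; the DB
tracepoints are dropped before anything is sent.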

Also, one of the things that should be done is sending the metrics in
bulk after the request, asynchronously; that way we won't slow down UX and
won't add too much load on the underlying infrastructure.
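
A minimal sketch of the bulk, asynchronous part, using only the Python
standard library. The class name, the finish_request() hook and the
send_batch callable are all hypothetical; a real version would plug into
whatever collector driver is configured.

```python
import queue
import threading


class BulkTraceSender:
    """Ships one batch of events per request from a background thread.

    Hypothetical helper, not part of OSProfiler.
    """

    def __init__(self, send_batch):
        self._send_batch = send_batch  # callable that takes a list of events
        self._batches = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            batch = self._batches.get()
            if batch is None:  # sentinel: stop the worker
                return
            try:
                self._send_batch(batch)
            except Exception:
                pass  # tracing must never break the service; drop the batch

    def finish_request(self, events):
        """Call once at the end of a request; only enqueues, never sends."""
        if events:
            self._batches.put(list(events))

    def close(self):
        """Flush queued batches and stop the worker."""
        self._batches.put(None)
        self._worker.join()


# Demo: the "collector" here is just a list.
received = []
sender = BulkTraceSender(received.append)
sender.finish_request([{"name": "api"}, {"name": "rpc"}, {"name": "driver"}])
sender.close()
```

The request path only pays for one queue put; the transfer to the consumer
happens on the worker thread, as a single batch per request.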


Rajul,

>> ICYMI, Boris is father of OSprofiler in OpenStack [1]
>
> This is why I was excited to get the first response from him and curious
> about his stand. Really looking forward to getting more on this from him.
> Also, Josh's response on other Tracing thread piqued my curiosity further.

I'll try to elaborate on my points. From a monitoring perspective it's going
to be super beneficial to have continuous tracing, and I fully support the
effort. However, it won't help the community much to fix the real problems
in the architecture (in my opinion it's too late). For example, creating a VM
performs ~400 DB requests... and yep, this is going to be slow, and now
what? How can you fix that?..

Best regards,
Boris Pavlovic



On Fri, Aug 4, 2017 at 1:12 PM, Rajul Kumar <kumar.raju at husky.neu.edu>
wrote:

> Hi Vinh
>
> For the `agent idea`, I think it is very good.
>
> However, in OpenStack, that idea may be really hard for us.
>
> The reason is the same as what Boris thinks.
>
>
> Thanks. We did a PoC and are working to integrate it with OSProfiler
> without affecting any of the services.
> I understand this will be difficult.
>
> For tail-based and adaptive sampling, it is another story.
>
> Exactly. This needs some major changes. We will need this if we look to
> have an effective tracing and any kind of automated analysis of the system.
>
> However, in naïve way, we can use sampling abilities from other
> OpenTracing compatible tracers
>
> such as Uber Jaeger, Appdash, Zipkin (has an open pull request), LightStep
> … by making OSprofiler
>
> compatible with OpenTracing API.
>
> I agree. Initially, this can be done.
> However, the limitations of the traces they generate are another story,
> and we are working to come up with another blueprint on that.
>
> ICYMI, Boris is father of OSprofiler in OpenStack [1]
>
> This is why I was excited to get the first response from him and curious
> about his stand. Really looking forward to getting more on this from him.
> Also, Josh's response on other Tracing thread piqued my curiosity further.
>
> Thanks
> Rajul
>
>
>
>
>
> On Thu, Aug 3, 2017 at 10:04 PM, vinhnt at vn.fujitsu.com <
> vinhnt at vn.fujitsu.com> wrote:
>
>> Hi Rajul,
>>
>>
>>
>> For the `agent idea`, I think it is very good.
>>
>> However, in OpenStack, that idea may be really hard for us.
>>
>> The reason is the same as what Boris thinks.
>>
>>
>>
>> For the sampling part, head-based sampling can be implemented in
>> OSprofiler.
>>
>> For tail-based and adaptive sampling, it is another story.
>>
>> However, in naïve way, we can use sampling abilities from other
>> OpenTracing compatible tracers
>>
>> such as Uber Jaeger, Appdash, Zipkin (has an open pull request), LightStep
>> … by making OSprofiler
>>
>> compatible with OpenTracing API.
>>
>>
>>
>> ICYMI, Boris is father of OSprofiler in OpenStack [1]
>>
>>
>>
>> [1] https://specs.openstack.org/openstack/oslo-specs/specs/mitaka/osprofiler-cross-service-project-profiling.html
>>
>>
>>
>> Best regards,
>>
>>
>>
>> Vinh Nguyen Trong
>>
>> PODC – Fujitsu Vietnam Ltd.
>>
>>
>>
>> *From:* Rajul Kumar [mailto:kumar.raju at husky.neu.edu]
>> *Sent:* Friday, 04 August, 2017 03:49
>> *To:* OpenStack Development Mailing List (not for usage questions) <
>> openstack-dev at lists.openstack.org>
>> *Subject:* Re: [openstack-dev] [oslo][performance] Proposing tail-based
>> sampling in OSProfiler
>>
>>
>>
>> Hi Boris
>>
>>
>>
>> That is a point of concern.
>>
>> Can you please point me to any of those?
>>
>>
>>
>> Anyways, we don't have anything in place for OpenStack yet.
>>
>> Now, either we pick another tracing solution like Zipkin, Jaeger etc.
>> which have their own limitations OR enhance OSProfiler.
>>
>> We pick the latter as it's the most native and better coupled with
>> OpenStack as of now.
>>
>> I understand that we may be blocked by these issues. However, I feel
>> it'll be better to fight with OSProfiler than anything else till we come up
>> with something better :)
>>
>>
>>
>> Thanks
>>
>> Rajul
>>
>>
>>
>>
>>
>>
>>
>> On Thu, Aug 3, 2017 at 4:01 PM, Boris Pavlovic <boris at pavlovic.me> wrote:
>>
>> Rajul,
>>
>>
>>
>> May I ask why you think so?
>>
>>
>>
>> The issues exposed by OSprofiler are going to be really hard to fix in
>> the current OpenStack architecture.
>>
>>
>>
>> Best regards,
>>
>> Boris Pavlovic
>>
>>
>>
>> On Thu, Aug 3, 2017 at 12:56 PM, Rajul Kumar <kumar.raju at husky.neu.edu>
>> wrote:
>>
>> Hi Boris
>>
>>
>>
>> Good to hear from you.
>>
>> May I ask why you think so?
>>
>>
>>
>> We do see some potential with OSProfiler for this and further objectives.
>>
>>
>>
>> Thanks
>>
>> Rajul
>>
>>
>>
>> On Thu, Aug 3, 2017 at 3:48 PM, Boris Pavlovic <boris at pavlovic.me> wrote:
>>
>> Rajul,
>>
>>
>>
>> It makes sense! However, maybe it's a bit too late... ;)
>>
>>
>>
>> Best regards,
>>
>> Boris Pavlovic
>>
>>
>>
>> On Thu, Aug 3, 2017 at 12:16 PM, Rajul Kumar <kumar.raju at husky.neu.edu>
>> wrote:
>>
>> Hello everyone
>>
>>
>>
>> I have added a blueprint on having tail-based sampling as a sampling
>> option for continuous tracing in OSProfiler. It would be really helpful to
>> have some thoughts, ideas, comments on this from the community.
>>
>>
>>
>> Continuous tracing provides good insight into how various transactions
>> behave across a distributed system. Currently, OpenStack doesn't have a
>> defined solution for continuous tracing. Though it has OSProfiler, which
>> does generate selective traces, it may not capture the occurrence. Even if
>> we have OSProfiler running continuously [1], we need to sample the traces
>> so as to cut down the data generated and still keep the useful info.
>>
>>
>>
>> Head-based sampling can be applied, which decides at the start whether a
>> trace should be saved or not. However, it may miss some useful traces. I
>> propose a tail-based sampling [2] mechanism that makes the decision
>> at the end of the transaction and tends to keep all the useful traces. This
>> may require a lot of changes, depending on what type of info is required
>> and the solution that we pick to implement it [2]. This may not affect the
>> current working of any of the services in OpenStack, as it will be off the
>> critical path [3].
>>
>>
>>
>> Please share your thoughts on this and on which solution should be
>> preferred from a broader OpenStack perspective.
>>
>> This is a step in the process of having an automated diagnostic solution
>> for an OpenStack cluster.
>>
>>
>>
>> [1] https://blueprints.launchpad.net/osprofiler/+spec/osprofiler-overhead-control
>>
>> [2] https://blueprints.launchpad.net/osprofiler/+spec/tail-based-coherent-sampling
>>
>> [3] https://blueprints.launchpad.net/osprofiler/+spec/asynchronous-trace-collection
>>
>>
>>
>> Thanks
>>
>> Rajul Kumar
>>
>>
>>
>>
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev