Open Stack

Thu Oct 29 15:55:38 UTC 2015

On 10/29/2015 9:30 AM, Dina Belova wrote:
> Hey folks!
>
> On Tuesday we had a great summit session about the performance team
> kick-off, and yesterday there was a great LDT session as well. I’m really
> glad to see how important the OpenStack performance topic is for all of
> us. A 40-minute session was surely not enough to analyse everyone’s
> feedback and the bottlenecks people usually see, so I’ll try to summarise
> what was discussed and the next steps in this email.
>
> Performance team kick-off session
> (https://etherpad.openstack.org/p/mitaka-cross-project-performance-team-kick-off)
> can be briefly summarised in the following points:
>
>   * IBM, Intel, HP, Mirantis, Rackspace, Red Hat, Yahoo! and others took
>     part in the session
>   * Various tools are currently used for OpenStack benchmarking and
>     profiling:
>       o Rally (IBM, HP, Mirantis, Yahoo!)
>       o Shaker (Mirantis, its functionality is being merged into Rally)
>       o Gatling (Rackspace)
>       o Zipkin (Yahoo!)
>       o JMeter (Yandex)
>       o and others…
>   * Various issues have been seen while operating OpenStack clouds
>     (the full list can be found here -
>     https://etherpad.openstack.org/p/openstack-performance-issues). The
>     most frequently mentioned issues were the following:
>       o performance of DB-related layers (DB itself and oslo.db) - there
>         are about 7 DB abstraction layers in Nova; performance of Nova
>         conductor was mentioned several times
>       o performance of MQ-related layers (MQ itself and oslo.messaging)
>   * Different companies are using different standards for performance
>     benchmarking (both control plane and data plane testing)
>   * According to the comments, the most wanted outputs from the team are:
>       o agree on the “performance testing standard”, including answers
>         to the following questions:
>           + what tools need to be used for OpenStack performance
>             benchmarking?
>           + what benchmarking metrics need to be covered? what would we
>             like to compare?
>           + what scenarios need to be covered?
>           + how can we compare performance of different cloud deployments?
>           + what performance deployment patterns can be used for various
>             workloads?
>       o share test plans and perform benchmarking tests
>       o create methodologies and documentation on best practices for
>         OpenStack deployment and performance testing
>
>
> We’re going to cover all these topics further. First of all, an IRC
> channel for the discussions has been created: *#openstack-performance*.
> We’re going to have a weekly meeting on that channel to track current
> progress; a doodle with the voting can be found here:
> http://doodle.com/poll/wv6qt8eqtc3mdkuz#table
>   (I was brave enough not to include timeslots that overlap with some
> of my really hard-to-move activities :))
>
> Let’s use next week for voting and have the first IRC meeting in our
> channel the week after next. We can start our further discussions with
> defining the terms “performance” and “performance testing” and with an
> analysis of benchmarking tools.
>
> Cheers,
> Dina
>
>

Thanks for writing this up. It's great to see people getting together,
sharing info on performance issues, and trying to pinpoint the big ones.

I poked through the performance issues etherpad and was wondering how
many people with DB issues, particularly for nova-conductor, are using a
level of oslo.db that's new enough to be using pymysql rather than
mysql-python. From what I remember, there were eventlet issues without
pymysql. That was added in oslo.db 1.12.0 [1].
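
For anyone who wants to check which driver a deployment is on, here is a
rough sketch. It assumes the driver is picked through a SQLAlchemy-style
connection URL (for example the "connection" option in the [database]
section of nova.conf) and that a bare mysql:// URL falls back to MySQLdb,
i.e. mysql-python. The function name and the example URLs are made up for
illustration; they are not part of oslo.db.

```python
# Assumption: a mysql:// URL with no explicit "+driver" suffix means
# SQLAlchemy will use its default MySQL driver, MySQLdb (mysql-python).
DEFAULT_MYSQL_DRIVER = "mysqldb"


def mysql_driver(connection_url):
    """Return the DBAPI driver implied by a SQLAlchemy-style URL.

    Returns None when the URL is not a MySQL URL at all.
    """
    # Everything before "://" is "<dialect>" or "<dialect>+<driver>".
    scheme = connection_url.partition("://")[0].lower()
    dialect, _, driver = scheme.partition("+")
    if dialect != "mysql":
        return None
    return driver or DEFAULT_MYSQL_DRIVER


if __name__ == "__main__":
    # Hypothetical connection strings, for illustration only.
    for url in (
        "mysql+pymysql://nova:secret@db.example.org/nova",
        "mysql://nova:secret@db.example.org/nova",
    ):
        print(url, "->", mysql_driver(url))
```

If this reports "mysqldb" for a service running under eventlet, that
deployment is still on mysql-python and may be worth re-testing with a
mysql+pymysql:// URL before digging further.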

The nova-conductor workers / CPU usage is also a known issue in the
large ops gate job [2], but I'm not aware of anyone spending the time
drilling into what exactly is causing a lot of that overhead and whether
any of it is abnormal.

Finally, wrt DB, I'd also be interested to know if Rackspace, or anyone 
else, is still running with the direct-to-sql stuff that comstud wrote 
for nova [3] and if that still shows significant performance 
improvements over using sqlalchemy ORM. Not to open that can of worms in 
the -dev list here again, but it'd be an interesting data point.

[1] https://review.openstack.org/#/c/184392/
[2] https://review.openstack.org/#/c/228636/
[3] https://blueprints.launchpad.net/nova/+spec/db-mysqldb-impl

-- 

Thanks,

Matt Riedemann

[openstack-dev] Performance Team summit session results
