[openstack-dev] [tripleo][ironic] Hardware provisioning testing for Ocata
jtaleric at redhat.com
Tue Jun 13 22:48:18 UTC 2017
On Fri, Jun 9, 2017 at 5:25 AM, Dmitry Tantsur <dtantsur at redhat.com> wrote:
> On 06/08/2017 02:21 PM, Justin Kilpatrick wrote:
>> Morning everyone,
>> I've been working on a performance testing tool for TripleO hardware
>> provisioning operations off and on for about a year now, and I've been
>> using it to collect more detailed data about how TripleO
>> performs in scale and production use cases. Perhaps more importantly,
>> YODA (Yet Openstack Deployment Tool, Another) automates the task
>> enough that days of deployment testing are a set-it-and-forget-it
>> operation.
>> You can find my testing tool here, and the test report has
>> links to raw data and visualization. Just scroll down, click the
>> captcha and click "go to kibana". I still need to port that machine
>> from my own solution over to Search Guard.
>> If you have too much email to consider clicking links I'll copy the
>> results summary here.
>> TripleO inspection workflows have seen massive improvements since
>> Newton, with the failure rate for 50 nodes under the default workflow
>> falling from 100% to <15%. With patches slated for Pike, that spurious
>> failure rate reaches zero.
>> Overcloud deployments show a significant improvement of deployment
>> speed in HA and stack update tests.
>> Ironic deployments in the overcloud allow the use of Ironic for bare
>> metal scale-out alongside more traditional VM compute. Considering that
>> a single conductor starts to struggle around 300 nodes, it will be
>> difficult to push a multi-conductor setup to its limits.
> This number of "300", does it come from your testing or from other sources?
Dmitry - The "300" comes from my testing in different environments.
Most recently, here is what I saw at CNCF -
The undercloud was "idle" during this period.
> If the former, which driver were you using?
> What exactly problems have you seen approaching this number?
I would have to restart ironic-conductor before every scale-up; here
is what ironic-conductor looks like after a restart. Without
restarting Ironic, the scale-up would fail due to Ironic itself (I do
not have the exact error we encountered documented).
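For anyone wanting to experiment before resorting to restarts, a minimal
sketch of the ironic.conf knobs relevant to conductor load and to running
multiple conductors (option names are real, but the values here are
illustrative assumptions, not the configuration used in these tests):

```
# ironic.conf - illustrative sketch only; values are assumptions,
# not what was used in the tests discussed above.
[conductor]
# Size of the conductor's worker pool; raising it can help when
# many nodes are being deployed or inspected concurrently.
workers_pool_size = 300

# How often (seconds) each conductor reports itself alive; with
# multiple conductors this also affects hash-ring failover time.
heartbeat_interval = 10

[DEFAULT]
# Each conductor in a multi-conductor setup needs a unique hostname
# so the hash ring can partition nodes across conductors.
host = undercloud-conductor-1
```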
>> Finally, Ironic node cleaning shows a similar failure rate to
>> inspection and will require similar attention in TripleO workflows to
>> become painless.
> Could you please elaborate? (a bug could also help). What exactly were you
>>  https://review.openstack.org/#/c/384530/
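For context, automated node cleaning on a TripleO undercloud is toggled
in undercloud.conf; a minimal sketch, assuming the standard `clean_nodes`
option (which defaults to off in this era, precisely because of failure
rates like the ones reported above):

```
# undercloud.conf - illustrative fragment; clean_nodes is the
# TripleO toggle for Ironic automated cleaning between deployments.
[DEFAULT]
clean_nodes = true
```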
>> Thanks for your time!
> Thanks for YOUR time, this work is extremely valuable!
>> - Justin
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe