[openstack-dev] [puppet] [infra] split integration jobs

Emilien Macchi emilien at redhat.com
Wed Sep 30 21:14:27 UTC 2015


Hello,

Today our Puppet OpenStack Integration jobs are deploying:
- mysql / rabbitmq
- keystone in wsgi with apache
- nova
- glance
- neutron with openvswitch
- cinder
- swift
- sahara
- heat
- ceilometer in wsgi with apache

Currently WIP:
- Horizon
- Trove

The current status of the jobs is that some tempest tests (related to
compute) fail randomly. Most of the failures are caused by timeouts:

http://logs.openstack.org/70/229470/1/check/gate-puppet-openstack-integration-dsvm-centos7/e374fd1/logs/neutron/server.txt.gz#_2015-09-30_18_38_32_425

http://logs.openstack.org/70/229470/1/check/gate-puppet-openstack-integration-dsvm-centos7/e374fd1/logs/nova/nova-compute.txt.gz#_2015-09-30_18_38_34_799

http://logs.openstack.org/70/229470/1/check/gate-puppet-openstack-integration-dsvm-centos7/e374fd1/logs/nova/nova-compute.txt.gz#_2015-09-30_18_38_12_636

http://logs.openstack.org/70/229470/1/check/gate-puppet-openstack-integration-dsvm-centos7/1d88f34/logs/nova/nova-compute.txt.gz#_2015-09-30_20_26_34_730

The timeouts happen because Nova needs more than 300s (the default) to
spawn a VM. Neutron is barely able to keep up with Nova's requests.

It's obvious we have reached the resource limits of the Jenkins slaves.


We have 3 options:

#1 increase timeouts and try to give more time to services to accomplish
what they need to do.

#2 drop some services from our testing scenario.

#3 split our scenario to have scenario001 and scenario002.

I feel like #1 does not really scale, since we are going to test more
and more services.

I don't like #2 because we want to test all our modules, not just a
subset of them.

I like #3, but we are going to consume more CI resources (that's why I
added the [infra] tag).


Side note: we have some non-voting upgrade jobs that we don't really pay
attention to right now, because we lack the time to work on them. They
consume 2 slaves. If resources are a problem, we can drop them and
replace them with the 2 new integration jobs.

So I propose option #3, in one of two ways:
* if infra says we would use too many resources with 2 more jobs, drop
  the upgrade jobs and replace them with the 2 new integration jobs;
* otherwise, keep the upgrade jobs and add 2 more jobs with a new
  scenario, where services would be split.
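
To make the split easier to discuss, here is a minimal Python sketch of
what it could look like. The grouping is purely an assumption for
illustration: this thread does not decide which service goes where, and
the names BASE, SCENARIOS and uncovered are hypothetical. The only thing
the sketch checks is the concern raised for option #2, namely that every
service deployed today is still deployed by at least one scenario.

```python
# Services deployed by the current single integration job (from the list above).
ALL_SERVICES = {
    "mysql", "rabbitmq", "keystone", "nova", "glance", "neutron",
    "cinder", "swift", "sahara", "heat", "ceilometer",
}

# Assumption: both scenarios keep the services needed to boot a VM,
# and the remaining services are divided between them.
BASE = {"mysql", "rabbitmq", "keystone", "nova", "glance", "neutron"}

# Hypothetical grouping, not something agreed on in this thread.
SCENARIOS = {
    "scenario001": BASE | {"cinder", "ceilometer"},
    "scenario002": BASE | {"swift", "sahara", "heat"},
}


def uncovered(all_services, scenarios):
    """Return the services that no scenario deploys."""
    deployed = set().union(*scenarios.values())
    return all_services - deployed


if __name__ == "__main__":
    print("uncovered services:", sorted(uncovered(ALL_SERVICES, SCENARIOS)))
    for name, services in sorted(SCENARIOS.items()):
        print(name, "deploys", len(services), "services")
```

With a grouping like this, each job deploys fewer services than today,
while the two jobs together still cover the full list.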

Any feedback from Infra / Puppet teams is welcome,
Thanks,
-- 
Emilien Macchi
