[openstack-dev] [devstack] [Cinder-GlusterFS CI] centos7 gate job abrupt failures

Deepak Shetty dpkshetty at gmail.com
Thu Mar 5 10:24:51 UTC 2015


Update:

   The Cinder-GlusterFS CI job (ubuntu based) was added as experimental
(non-voting) to the cinder project [1].
It's running successfully without any issues so far [2], [3].

We will monitor it for a few days, and if it continues to run fine, we
will propose a patch to make it a check (voting) job.

[1]: https://review.openstack.org/160664
[2]: https://jenkins07.openstack.org/job/gate-tempest-dsvm-full-glusterfs/
[3]: https://jenkins02.openstack.org/job/gate-tempest-dsvm-full-glusterfs/
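
For reference, a rough sketch of the kind of zuul layout entry this
involves, assuming the 2015-era openstack-infra/project-config
layout.yaml format ([1] has the actual change):

  projects:
    - name: openstack/cinder
      experimental:
        - gate-tempest-dsvm-full-glusterfs

Promoting it to check (voting) would then amount to listing the job
under the project's check pipeline instead.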

thanx,
deepak

On Fri, Feb 27, 2015 at 10:47 PM, Deepak Shetty <dpkshetty at gmail.com> wrote:

>
>
> On Fri, Feb 27, 2015 at 4:02 PM, Deepak Shetty <dpkshetty at gmail.com>
> wrote:
>
>>
>>
>> On Wed, Feb 25, 2015 at 11:48 PM, Deepak Shetty <dpkshetty at gmail.com>
>> wrote:
>>
>>>
>>>
>>> On Wed, Feb 25, 2015 at 8:42 PM, Deepak Shetty <dpkshetty at gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Wed, Feb 25, 2015 at 6:34 PM, Jeremy Stanley <fungi at yuggoth.org>
>>>> wrote:
>>>>
>>>>> On 2015-02-25 17:02:34 +0530 (+0530), Deepak Shetty wrote:
>>>>> [...]
>>>>> > Run 2) We removed the glusterfs backend, so Cinder was configured
>>>>> > with the default storage backend, i.e. LVM. We re-created the OOM
>>>>> > here too
>>>>> >
>>>>> > So that proves that glusterfs doesn't cause it, as it's happening
>>>>> > without glusterfs too.
>>>>>
>>>>> Well, if you re-ran the job on the same VM then the second result is
>>>>> potentially contaminated. Luckily this hypothesis can be confirmed
>>>>> by running the second test on a fresh VM in Rackspace.
>>>>>
>>>>
>>>> Maybe true, but we did the same on an hpcloud provider VM too, and
>>>> both times it ran successfully with glusterfs as the Cinder backend.
>>>> Also, before starting the 2nd run, we did an unstack and saw that free
>>>> memory went back to 5G+ before re-invoking your script. I believe the
>>>> contamination could result in some additional testcase failures (which
>>>> we did see) but shouldn't be related to whether the system can OOM or
>>>> not, since that's a runtime thing.
>>>>
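>>>> For the record, the between-runs check was roughly the following
>>>> (the devstack path here is an assumption, adjust to your checkout):
>>>>
>>>>     cd /opt/stack/devstack   # path assumed
>>>>     ./unstack.sh             # tear down the previous devstack run
>>>>     free -m                  # confirm free memory is back to ~5G+
>>>>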
>>>> I see that the VM is up again. We will execute the 2nd run afresh now
>>>> and update
>>>> here.
>>>>
>>>
>>> Ran tempest configured with the default backend, i.e. LVM, and was able
>>> to recreate the OOM issue. So running tempest without gluster against a
>>> fresh VM reliably recreates the OOM; snip below from syslog.
>>>
>>> Feb 25 16:58:37 devstack-centos7-rax-dfw-979654 kernel: glance-api
>>> invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
>>>
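>>> A quick way to spot these is to grep the node's syslog for the
>>> oom-killer line (on centos7 the syslog lands in /var/log/messages):
>>>
>>>     grep 'invoked oom-killer' /var/log/messages
>>>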
>>> Had a discussion with clarkb on IRC, and given that F20 is
>>> discontinued, F21 has issues with tempest (being debugged by ianw), and
>>> centos7 also has issues on rax (as evident from this thread), the only
>>> option left is to go with an ubuntu-based CI job, which BharatK is
>>> working on now.
>>>
>>
>> Quick Update:
>>
>>     The Cinder-GlusterFS CI job on ubuntu was added (
>> https://review.openstack.org/159217).
>>
>> We ran it 3 times against our stackforge repo patch @
>> https://review.openstack.org/159711
>> and it works fine (2 testcase failures, which are expected and which
>> we're working towards fixing).
>>
>> For the logs of the 3 experimental runs, look @
>>
>> http://logs.openstack.org/11/159711/1/experimental/gate-tempest-dsvm-full-glusterfs/
>>
>> Of the 3 jobs, 1 was scheduled on rax and 2 on hpcloud, so it's working
>> nicely across the different cloud providers.
>>
>
> Clarkb, Fungi,
>   Given that the ubuntu job is stable, I would like to propose adding it
> as experimental to the openstack/cinder project while we work on fixing
> the 2 failed test cases in parallel.
>
> thanx,
> deepak
>
>