[Third-party-announce] Cisco ml2 CI is disabled

Nikolay Fedotov (nfedotov) nfedotov at cisco.com
Wed Jun 17 16:23:24 UTC 2015


Hello Anita

Here are the results of my investigation:
* Jenkins died at "Jun 11, 2015 8:15:45 PM". Part of the Jenkins log is attached.
* Zuul experienced memory issues too, as reported by "apport" (log attached). "Signal 11, or officially known as "segmentation fault", means that the program accessed a memory location that was not assigned". I am not sure why zuul did not die as Jenkins did.
* "ml2-nexus" got the EXCEPTION status because zuul was not able to submit a new job. The code where zuul sets the EXCEPTION status is here: https://github.com/openstack-infra/zuul/blob/master/zuul/launcher/gearman.py#L368-L370 . I could find other places where a job gets EXCEPTION status. I think it failed to add the job to the Gearman queue because of the RAM issue (Gearman is managed by zuul).

To avoid such issues, I will move Jenkins and the other services to a virtual machine that has more RAM.

Thanks

-----Original Message-----
From: Anita Kuno [mailto:anteaya at anteaya.info] 
Sent: Saturday, June 13, 2015 2:41 AM
To: third-party-announce at lists.openstack.org
Subject: Re: [Third-party-announce] Cisco ml2 CI is disabled

On 06/12/2015 06:20 PM, Nikolay Fedotov (nfedotov) wrote:
> Hello Anita
> 
> Thank you for pointing me to the cacti graph and the IRC log.
> 
> I did not change the comment text of the "Cisco ml2 CI". If tests fail, the CI should reply:
> " Build failed. For help on isolating this failure, please contact cisco-openstack-neutron-ci at cisco.com. To re-run, post a 'recheck cisco-ml2' comment. "

The belief is it is the recheck syntax in the comment text that is the problem.

Have you logs for your system activity from yesterday?

Is anyone associated with this account active on irc?

Thank you,
Anita

anteaya on irc


> 
> I see there are guesses that the looping might be caused by a wrong "comment_filter" value (the regexp matched against the comment text). The CI uses:
>         - event: comment-added
>           comment_filter: (?i)^(Patch Set [0-9]+:\n\n)?\s*recheck(( (?:bug|lp)[\s#:]*(\d+))|( no bug))\s*$
>         - event: comment-added
>           comment_filter: (?i)^(Patch Set [0-9]+:\n\n)?\s*recheck cisco-ml2\s*$
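
The two comment_filter patterns quoted above can be checked against the CI's own failure comment. This is only a minimal sketch: it assumes the filter is applied with Python's re.search to the whole comment text, and that Gerrit prefixes the comment with "Patch Set N:" followed by a blank line. The names FILTERS, CI_FAILURE_COMMENT and matches_any are made up for illustration.

```python
import re

# The two comment_filter values quoted in this thread, as raw strings.
FILTERS = [
    r"(?i)^(Patch Set [0-9]+:\n\n)?\s*recheck(( (?:bug|lp)[\s#:]*(\d+))|( no bug))\s*$",
    r"(?i)^(Patch Set [0-9]+:\n\n)?\s*recheck cisco-ml2\s*$",
]

# The failure comment text quoted in this thread, with an assumed
# "Patch Set 4:" prefix in front of it.
CI_FAILURE_COMMENT = (
    "Patch Set 4:\n\n"
    "Build failed. For help on isolating this failure, please contact "
    "cisco-openstack-neutron-ci at cisco.com. To re-run, post a "
    "'recheck cisco-ml2' comment."
)


def matches_any(comment):
    """Return True if any of the quoted filters matches the comment."""
    return any(re.search(pattern, comment) for pattern in FILTERS)


if __name__ == "__main__":
    print("CI failure comment triggers:", matches_any(CI_FAILURE_COMMENT))
    print("'recheck cisco-ml2' triggers:", matches_any("recheck cisco-ml2"))
```

Under these assumptions the patterns are anchored with ^ and $, so a comment that merely mentions the recheck phrase inside a longer sentence should not match them. A filter without the anchors would behave differently.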
> 
> My guess is that Jenkins died at "Jun 11, 2015 8:15:45 PM" due to error='Cannot allocate memory', and that this may have caused zuul to start looping. But I am not sure.
> 
> Could you advise on further steps that would help me get the CI account enabled?
> 
> Thanks.
> 
> -----Original Message-----
> From: Anita Kuno [mailto:anteaya at anteaya.info]
> Sent: Friday, June 12, 2015 11:13 PM
> To: third-party-announce at lists.openstack.org
> Subject: Re: [Third-party-announce] Cisco ml2 CI is disabled
> 
> On 06/12/2015 04:06 PM, Anita Kuno wrote:
>> On 06/12/2015 03:47 PM, Nikolay Fedotov (nfedotov) wrote:
>>> Hello All!
>>>
>>> I was wondering what kind of looping you are talking about.
>>>
>>> I see that the change https://review.openstack.org/#/c/188221/ has 5 patchsets, and each Cisco CI replied:
>>> - Cisco Tail-f CI - 4 times
>>> - Cisco N1kv CI - 4 times
>>> - Cisco APIC CI - 4 times
>>> - Cisco ml2 CI - 0 times
>>
>> We had to perform database surgery to remove the hundreds of comments 
>> left by Cisco ml2 CI. Trust us it was looping.
>>
>> We also had to restart gerrit to close the connection as our gerrit 
>> input spiked because of it. See the inbound spike from yesterday?
>> http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=34&rra_id=all
>> that was you.
>>
>> Your comment text changed yesterday sometime. No other patches were 
>> affected by this.
>>
>> Please confirm your comment text changed yesterday.
>>
>> Thank you,
>> Anita.
> 
> Also here is the log of the incident:
> 
> http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2015-06-11.log.html#t2015-06-11T20:34:08
> 
>>
>>>
>>> So each CI system tested every patchset (except the 5th) only once.
>>>
>>> Thanks.
>>>
>>> -----Original Message-----
>>> From: Asselin, Ramy [mailto:ramy.asselin at hp.com]
>>> Sent: Thursday, June 11, 2015 11:55 PM
>>> To: Announcements for third party CI operators.
>>> Subject: Re: [Third-party-announce] Cisco ml2 CI is disabled
>>>
>>> Wow, good call. We should document that and find a way to protect that from happening.
>>>
>>> -----Original Message-----
>>> From: Doug Wiegley [mailto:dougw at a10networks.com]
>>> Sent: Thursday, June 11, 2015 1:49 PM
>>> To: Announcements for third party CI operators.
>>> Subject: Re: [Third-party-announce] Cisco ml2 CI is disabled
>>>
>>> Not sure if this is your issue, but this looks similar to a looping issue that I inflicted on myself, by including my recheck text in my CI comment.
>>>
>>>
>>> "Cisco ml2 CI 2:22 PM Patch Set 4:
>>> Build failed. For help on isolating this failure, please contact cisco-openstack-neutron-ci at cisco.com. To re-run, post a 'recheck cisco-ml2' comment.
>>>
>>> * ml2-nexus <http://128.107.233.10:8080/job/ml2-nexus/None> EXCEPTION
>>> * python27 <http://128.107.233.10:8080/job/python27/225> SUCCESS in 6m 06s (non-voting) "
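
The kind of loop described above can be sketched as follows. This is a hypothetical illustration, not the actual Cisco or zuul configuration: LOOSE_FILTER is an assumed unanchored pattern, and should_trigger and simulate are invented helper names. It shows how a CI whose own failure comment contains its recheck phrase could keep re-triggering itself, and how ignoring comments posted by the CI's own account would stop that.

```python
import re

# Hypothetical unanchored filter: matches the phrase anywhere in a comment.
LOOSE_FILTER = re.compile(r"(?i)recheck cisco-ml2")

# Account name and failure text as they appear in the quoted comment.
CI_ACCOUNT = "Cisco ml2 CI"
FAILURE_TEXT = (
    "Build failed. For help on isolating this failure, please contact "
    "cisco-openstack-neutron-ci at cisco.com. To re-run, post a "
    "'recheck cisco-ml2' comment."
)


def should_trigger(author, comment, ignore_own=True):
    """Decide whether a comment should start a new CI run."""
    if ignore_own and author == CI_ACCOUNT:
        # Guard: never react to comments posted by the CI account itself.
        return False
    return LOOSE_FILTER.search(comment) is not None


def simulate(max_rounds, ignore_own):
    """Count re-runs caused by one failure comment, capped at max_rounds."""
    runs = 0
    while runs < max_rounds and should_trigger(CI_ACCOUNT, FAILURE_TEXT, ignore_own):
        runs += 1  # the job runs, fails again and posts the same comment
    return runs


if __name__ == "__main__":
    print("re-runs without guard:", simulate(5, ignore_own=False))
    print("re-runs with guard:", simulate(5, ignore_own=True))
```

Without the guard the simulation only stops because of the max_rounds cap. With the guard the CI's own comment never starts a run, while a recheck comment from a reviewer still does.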
>>>
>>> Thanks,
>>> Doug
>>>
>>>
>>>
>>>
>>>
>>> On 6/11/15, 2:40 PM, "Anita Kuno" <anteaya at anteaya.info> wrote:
>>>
>>>> https://review.openstack.org/#/c/188221/
>>>>
>>>> The patch speaks for itself.
>>>>
>>>> Thank you, Russell Bryant for the alert.
>>>>
>>>> Before this system is enabled we will need to hear the results of 
>>>> your investigation as to why this happened as well as your assurance 
>>>> that it won't happen again.
>>>>
>>>> Thank you,
>>>> Anita.
>>>>
>>>> _______________________________________________
>>>> Third-party-announce mailing list
>>>> Third-party-announce at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/third-party-announce
>>
> 
> 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: jenkins.log
Type: application/octet-stream
Size: 2793 bytes
Desc: jenkins.log
URL: <http://lists.openstack.org/pipermail/third-party-announce/attachments/20150617/59dc5b30/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: apport.log
Type: application/octet-stream
Size: 378 bytes
Desc: apport.log
URL: <http://lists.openstack.org/pipermail/third-party-announce/attachments/20150617/59dc5b30/attachment-0001.obj>

