<div dir="ltr">Hi, Craig.<div><br><div>Yes, I thought about the configuration test suites.</div></div><div>For now, the core team should probably extend the gate running time.</div><div>But for the tempest tests, I would suggest excluding some tests (the longest ones) from the 'gate' group.</div>
<div>We need to deal with this ASAP, because the gate has been failing for four or five days.</div><div><br></div><div><div style="font-family:arial,sans-serif;font-size:13px">Best regards</div><div style="font-family:arial,sans-serif;font-size:13px">
Denis Makogon.</div><div style="font-family:arial,sans-serif;font-size:13px"><br>Sent from an iPad</div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Feb 17, 2014 at 6:33 AM, Craig Vyvial <span dir="ltr"><<a href="mailto:cp16net@gmail.com" target="_blank">cp16net@gmail.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div>Trovesters,</div><div><br></div>One reason for the longer-running tests is that, for the configuration groups, I added the creation of a new instance. This tests that a new instance can be created with a configuration group applied. That might make the run a little longer, but I am surprised that it still takes over an hour to run through everything.<span class="HOEnZb"><font color="#888888"><div>

<br></div><div>-Craig Vyvial</div></font></span></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Feb 16, 2014 at 12:25 AM, Mirantis <span dir="ltr"><<a href="mailto:dmakogon@mirantis.com" target="_blank">dmakogon@mirantis.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="auto"><div>Hello, Mathew.</div><div><br></div><div>I'm seeing the same issues with the gate.</div><div>I also tried to find out why the gate job is failing. First I ran into an issue related to a cinder installation failure in devstack, but then I found the same problem you described. The best option is to increase the job time limit. </div>

<div>Thanks for the research. I hope the gate will be fixed in the easiest way and as quickly as possible.<br><br>Best regards</div><div>Denis Makogon.<br>Sent from an iPad</div><div><br>On Feb 16, 2014, at 00:46, "Lowery, Mathew" <<a href="mailto:mlowery@ebay.com" target="_blank">mlowery@ebay.com</a>> wrote:<br>

<br></div><div><div><blockquote type="cite"><div>




<div style="font-size:14px;font-family:Calibri,sans-serif">
Hi all,</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<b>Issue #1: Jobs that need more than one hour</b></div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
Of the last 30 <a href="https://rdjenkins.dyndns.org/job/Trove-Gate/" target="_blank">Trove-Gate</a> builds (spanning three days), 7 have failed due to a Jenkins job-level timeout (not a proboscis timeout). These jobs had no failed tests when the timeout occurred.</div>


<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
Not having access to the job config to see what the job looks like, I used the console output to guess what was going on. It appears that a Jenkins plugin named
<a href="https://github.com/mrhoades/boot-hpcloud-vm/blob/2272770b0ce54752eabb84229dc8939d79b2be50/models/boot_vm_concurrent.rb#L181" target="_blank">
boot-hpcloud-vm</a> is booting a VM and running the commands given, including redstack int-tests. From the console output, it states that it was supplied with an ssh_shell_timeout="7200". This is passed down to another library called
<a href="https://github.com/busyloop/net-ssh-simple/blob/e3834f259a47606bfb06a487ca701fc20dbad8a5/lib/net/ssh/simple.rb#L632" target="_blank">
net-ssh-simple</a>. net-ssh-simple has two timeouts: an idle timeout and an operation timeout.</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
In the <a href="https://github.com/mrhoades/boot-hpcloud-vm/blob/2272770b0ce54752eabb84229dc8939d79b2be50/models/boot_vm_concurrent.rb#L182" target="_blank">
latest boot-hpcloud-vm</a>, ssh_shell_timeout is passed down to net-ssh-simple for both the idle timeout and the operation timeout. But in
<a href="https://github.com/mrhoades/boot-hpcloud-vm/blob/9260e957d6c54142c33dd9e9632b86e17fd5c02f/models/boot_vm_concurrent.rb#L141" target="_blank">
older versions of boot-hpcloud-vm</a>, ssh_shell_timeout is passed down to net-ssh-simple for only the idle timeout, leaving a default operation timeout of 3600 seconds. This is why I believe these jobs are failing after exactly one hour.</div>
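<div style="font-size:14px;font-family:Calibri,sans-serif">
To make the difference between the two plugin versions concrete, here is a minimal Python sketch of the behaviour described above. It is only a model of my reading of the linked sources, not code from boot-hpcloud-vm or net-ssh-simple (both are Ruby); the names SshTimeouts, effective_timeouts and passes_operation_timeout are hypothetical, and the 3600 default is the value stated above.</div>
<pre style="white-space:pre-wrap;word-wrap:break-word;margin-top:0px;margin-bottom:0px;font-size:11px">
```python
from dataclasses import dataclass

# Assumption taken from the message above: when the caller does not set an
# operation timeout, the SSH library falls back to 3600 seconds.
DEFAULT_OPERATION_TIMEOUT = 3600


@dataclass(frozen=True)
class SshTimeouts:
    """Hypothetical container for the two timeouts discussed above."""
    idle: int       # seconds without any activity before giving up
    operation: int  # hard cap in seconds on the whole remote command


def effective_timeouts(ssh_shell_timeout, passes_operation_timeout):
    """Model which timeouts apply for a given plugin behaviour.

    passes_operation_timeout=False models the older plugin (idle timeout only),
    passes_operation_timeout=True models the latest plugin (both timeouts).
    """
    if passes_operation_timeout:
        operation = ssh_shell_timeout
    else:
        operation = DEFAULT_OPERATION_TIMEOUT
    return SshTimeouts(idle=ssh_shell_timeout, operation=operation)


old_plugin = effective_timeouts(7200, passes_operation_timeout=False)
new_plugin = effective_timeouts(7200, passes_operation_timeout=True)

print("older plugin: ", old_plugin)
print("latest plugin:", new_plugin)
```
</pre>
<div style="font-size:14px;font-family:Calibri,sans-serif">
Under these assumptions, the older plugin still cuts the job off at 3600 seconds even though ssh_shell_timeout="7200" was supplied, which matches the one-hour failures.</div>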


<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<div>FYI: Here are the jobs that failed due to the Jenkins job-level timeout (and had no test failures when the timeout occurred) along with their associated patch sets:</div>
<div><a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2532/console" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2532/console</a> (<a href="http://review.openstack.org/73786" target="_blank">http://review.openstack.org/73786</a>)</div>


<div><a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2530/console" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2530/console</a> (<a href="http://review.openstack.org/73736" target="_blank">http://review.openstack.org/73736</a>)</div>


<div><a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2517/console" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2517/console</a> (<a href="http://review.openstack.org/63789" target="_blank">http://review.openstack.org/63789</a>)</div>


<div><a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2514/console" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2514/console</a> (<a href="https://review.openstack.org/50944" target="_blank">https://review.openstack.org/50944</a>)</div>


<div><a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2513/console" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2513/console</a> (<a href="https://review.openstack.org/50944" target="_blank">https://review.openstack.org/50944</a>)</div>


<div><a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2504/console" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2504/console</a> (<a href="https://review.openstack.org/73147" target="_blank">https://review.openstack.org/73147</a>)</div>


<div><a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2503/console" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2503/console</a> (<a href="https://review.openstack.org/73147" target="_blank">https://review.openstack.org/73147</a>)</div>


<div><br>
</div>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<b>Suggested action items:</b></div>
<ul style="font-size:14px;font-family:Calibri,sans-serif">
<li>If it is acceptable to have jobs that run over one hour, then install the latest boot-hpcloud-vm plugin for Jenkins, which will make the operation timeout match the idle timeout.</li></ul>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<b>Issue #2: The running time of all jobs is 1 hr 1 min</b></div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
While the Jenkins job-level timeout will end the job after one hour, it also appears to keep every job running for a minimum of one hour.  To be more precise, the timeout (or minimum running time) occurs on the part of the Jenkins job that runs commands on
 the VM; the VM provision (which takes about one minute) is excluded from this timeout which is why the
<a href="https://rdjenkins.dyndns.org/job/Trove-Gate/buildTimeTrend" target="_blank">running time of all jobs is around 1 hr 1 min</a>. A sampling of console logs showing the time the int-tests completed and when the timeout kicks in:</div>


<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2531/console" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2531/console</a> (00:01:03 wasted)</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<pre style="white-space:pre-wrap;word-wrap:break-word;margin-top:0px;margin-bottom:0px;font-size:11px"><pre style="white-space:pre-wrap;margin-bottom:0px;margin-top:0px;word-wrap:break-word"><span><b>04:51:12</b> COMMAND_0: echo refs/changes/36/73736/2</span></pre>

<pre style="white-space:pre-wrap;word-wrap:break-word;margin-top:0px;margin-bottom:0px"><span style="background-color:rgb(255,254,254)">...</span></pre><pre style="white-space:pre-wrap;margin-bottom:0px;margin-top:0px;word-wrap:break-word">
<span><b>05:50:10</b> </span>    335.41     proboscis.case.MethodTest (test_instance_created)
<span><b>05:50:10</b> </span>    194.05     proboscis.case.MethodTest (test_instance_returns_to_active_after_resize)
<span><b>05:51:13</b> </span>**************************************
<span><b>05:51:13</b> </span>****** STDERR-BEGIN ******</pre></pre>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2521/console" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2521/console</a> (00:06:44 wasted)</div>
<div>
<pre style="white-space:pre-wrap;margin-bottom:0px;margin-top:0px;word-wrap:break-word"><pre style="font-size:11px;white-space:pre-wrap;word-wrap:break-word;margin-top:0px;margin-bottom:0px"><pre style="white-space:pre-wrap;margin-bottom:0px;margin-top:0px;word-wrap:break-word">
<span><b>21:11:44</b> </span>COMMAND_0: echo refs/changes/89/63789/13</pre><pre style="white-space:pre-wrap;word-wrap:break-word;margin-top:0px;margin-bottom:0px"><span style="background-color:rgb(255,254,254)">...</span></pre>

</pre><pre style="font-size:11px;white-space:pre-wrap;margin-bottom:0px;margin-top:0px;word-wrap:break-word"><font face="Courier"><span><b>22:05:00</b> </span>    195.11     proboscis.case.MethodTest (test_instance_returns_to_active_after_resize)
<span><b>22:05:00</b> </span>    186.89     proboscis.case.MethodTest (test_resize_down)
<span><b>22:11:44</b> </span>**************************************
<span><b>22:11:44</b> </span>****** STDERR-BEGIN ******</font><span style="font-family:Calibri,sans-serif">
</span></pre><div style="font-size:11px;font-family:Calibri,sans-serif"><br></div><div><font face="Calibri"><a href="https://rdjenkins.dyndns.org/job/Trove-Gate/2518/consoleFull" target="_blank">https://rdjenkins.dyndns.org/job/Trove-Gate/2518/consoleFull</a> (00:06:01 wasted)</font></div>

<div style="font-size:11px;font-family:Calibri,sans-serif"><pre style="white-space:pre-wrap;margin-bottom:0px;margin-top:0px;word-wrap:break-word"><span><b>17:46:59</b> COMMAND_0: echo refs/changes/02/64302/20</span></pre>

<pre style="white-space:pre-wrap;word-wrap:break-word;margin-top:0px;margin-bottom:0px"><span style="background-color:rgb(255,254,254)">...</span></pre><pre style="white-space:pre-wrap;margin-bottom:0px;margin-top:0px;word-wrap:break-word">
<span><b>18:40:57</b> </span>    210.03     proboscis.case.MethodTest (test_instance_returns_to_active_after_resize)
<span><b>18:40:57</b> </span>    187.89     proboscis.case.MethodTest (test_resize_down)
<span><b>18:46:58</b> </span>**************************************
<span><b>18:46:58</b> </span>****** STDERR-BEGIN ******
</pre></div><div style="font-size:11px;font-family:Calibri,sans-serif"><br></div><font face="Calibri,sans-serif"><b>Suggested action items:</b></font></pre>
<ul>
<li>
<pre style="margin-bottom:0px;margin-top:0px;word-wrap:break-word"><font face="Calibri,sans-serif"><span style="white-space:pre-wrap">Given that the minimum running time is one hour, I assume the problem is in the net-ssh-simple library. Needs more investigation.</span></font></pre>


</li></ul>
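<div style="font-size:14px;font-family:Calibri,sans-serif">
For anyone who wants to check the "wasted" figures in Issue #2, here is a small Python sketch that recomputes them from the console timestamps quoted above. The helper names (elapsed, summarize, SAMPLES) are my own, and I am assuming that the first COMMAND_0 line marks the start of the SSH session and the STDERR-BEGIN line marks the moment the timeout fires.</div>
<pre style="white-space:pre-wrap;word-wrap:break-word;margin-top:0px;margin-bottom:0px;font-size:11px">
```python
from datetime import datetime, timedelta

FMT = "%H:%M:%S"
ONE_DAY = timedelta(days=1)


def elapsed(start, end):
    """Time from start to end; the modulo handles a run that crosses midnight."""
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)) % ONE_DAY


# build number: (first COMMAND_0 line, last test timing line, STDERR-BEGIN line)
# All timestamps are copied from the console excerpts above.
SAMPLES = {
    2531: ("04:51:12", "05:50:10", "05:51:13"),
    2521: ("21:11:44", "22:05:00", "22:11:44"),
    2518: ("17:46:59", "18:40:57", "18:46:58"),
}


def summarize(samples):
    """Return total session time and idle ("wasted") time per build."""
    return {
        build: {
            "total": elapsed(first, cutoff),
            "wasted": elapsed(tests_done, cutoff),
        }
        for build, (first, tests_done, cutoff) in samples.items()
    }


for build, times in summarize(SAMPLES).items():
    print(build, "total:", times["total"], "wasted:", times["wasted"])
```
</pre>
<div style="font-size:14px;font-family:Calibri,sans-serif">
In all three samples the session lasts one hour to within a second, regardless of when the tests finished.</div>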
<pre style="white-space:pre-wrap;margin-bottom:0px;margin-top:0px;word-wrap:break-word"><font face="Calibri,sans-serif" style="font-size:11px"><br></font></pre>
<pre style="white-space:pre-wrap;margin-bottom:0px;font-family:Calibri,sans-serif;margin-top:0px;word-wrap:break-word"><span style="background-color:rgb(255,254,254)"><b>Issue #3: Jenkins console log line timestamps differ between full and truncated views</b></span></pre>


</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
I assume this is due to <a href="https://issues.jenkins-ci.org/browse/JENKINS-17779" target="_blank">JENKINS-17779</a>.</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<br>
</div>
<div style="font-size:14px;font-family:Calibri,sans-serif">
<b>Suggested action items:</b></div>
<ul style="font-size:14px;font-family:Calibri,sans-serif">
<li>Upgrade the <a href="https://wiki.jenkins-ci.org/display/JENKINS/Timestamper" target="_blank">
timestamper plugin</a>.</li></ul>


</div></blockquote></div></div><blockquote type="cite"><div><span>_______________________________________________</span><br><span>OpenStack-dev mailing list</span><br><span><a href="mailto:OpenStack-dev@lists.openstack.org" target="_blank">OpenStack-dev@lists.openstack.org</a></span><br>

<span><a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a></span><br></div></blockquote></div><br>
<br></blockquote></div><br></div>
</div></div><br>
<br></blockquote></div><br></div>