<div dir="ltr"><div><div>Sorry for top-posting -- using a web mail client.<br><br></div>Is it possible to change the retry interval in Cirros (or cloud-init?) so that the backoff is less than 60 seconds?<br><br>Best,<br></div>
-jay<br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Mon, Jan 20, 2014 at 10:23 AM, Darragh O'Reilly <span dir="ltr">&lt;<a href="mailto:dara2002-openstack@yahoo.com" target="_blank">dara2002-openstack@yahoo.com</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div><div style="font-size:12pt;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif"><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">
<br><span></span><span>I did a test to see what the dhcp client on cirros does. I killed the dhcp agent and started an instance. The instance sent the first dhcp discover after about 35 sec. Then another 60 sec later, and a final one after another 60 sec.</span></div>
<div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif"><span><br></span></div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">
<span>So a revised theory for what happened is this: </span></div><div style="display:block"> <br>t=0 tempest starts vm and starts polling for ACTIVE status<br>t=20 instance-->ACTIVE and tempest starts polling the floating ip for 60 sec<br>
t=40 instance sends a dhcp discover - no response - so sets a timer for 60 sec<br>t=45 ovs-agent sets the port vlan<br>t=80 tempest gives up and kills vm<br>t=100 instance would have sent another dhcp discover now if it had been left alive<br>
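<br>As a rough sketch of the timeline above (Python, with hypothetical names; the numbers are the approximate ones observed in this thread, and it ignores the time dhcp and floating ip setup take once a discover is answered), this checks whether a discover goes out after the port is tagged but before tempest stops polling:<br>
<pre>
```python
# Illustrative model only: values are the approximate ones from this thread,
# and every name here is hypothetical.
FIRST_DISCOVER = 40   # s: first dhcp discover from the instance
RETRY_INTERVAL = 60   # s: gap between discovers seen on cirros
MAX_DISCOVERS = 3     # number of discovers observed in the test
PORT_TAGGED = 45      # s: ovs-agent sets the port vlan
POLL_START = 20       # s: instance goes ACTIVE, tempest starts polling


def discover_times(first=FIRST_DISCOVER, interval=RETRY_INTERVAL,
                   count=MAX_DISCOVERS):
    """Times at which the instance sends a dhcp discover."""
    return [first + i * interval for i in range(count)]


def first_answerable_discover(tagged_at=PORT_TAGGED):
    """First discover sent once the port is tagged, or None if there is none."""
    for t in discover_times():
        if t >= tagged_at:
            return t
    return None


def poll_succeeds(poll_seconds, poll_start=POLL_START):
    """True if an answerable discover goes out before tempest gives up."""
    t = first_answerable_discover()
    return t is not None and poll_start + poll_seconds > t


print(discover_times())     # [40, 100, 160]
print(poll_succeeds(60))    # False
print(poll_succeeds(120))   # True
```
</pre>
With these assumed values only the discover at t=100 can be answered, and a 60 sec poll that ends at t=80 misses it.<br>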
<br>I think it would be worth trying to change that test to poll for 120 seconds instead of 60.<div><div class="h5"><br> <br> <div style="font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif;font-size:12pt">
<div style="font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif;font-size:12pt"> <div dir="ltr"> <font face="Arial"> On Monday, 20
January 2014, 11:23, Darragh O'Reilly &lt;<a href="mailto:dara2002-openstack@yahoo.com" target="_blank">dara2002-openstack@yahoo.com</a>&gt; wrote:<br> </font> </div> <blockquote style="border-left:2px solid rgb(16,16,255);margin-left:5px;margin-top:5px;padding-left:5px">
<div><div><div><div style="font-size:12pt;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif"><div>Hi Salvatore,</div><div><br></div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">
I presume it's this one? </div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif"><a rel="nofollow" href="http://logs.openstack.org/38/65838/4/check/check-tempest-dsvm-neutron-isolated/d108e4a/logs/tempest.txt.gz?#_2014-01-19_20_50_14_604" target="_blank">http://logs.openstack.org/38/65838/4/check/check-tempest-dsvm-neutron-isolated/d108e4a/logs/tempest.txt.gz?#_2014-01-19_20_50_14_604</a></div>
<div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif"><br></div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">
Is it true that the cirros image just fires off a few dhcp discovers and then gives up? If so, then maybe it did so before the tagging happened. Do we have the instance console log? It took about 45 seconds from when the port was created to when it was tagged.</div>
<div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif"><br></div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">
2014-01-19
20:48:57.412 8142 DEBUG neutron.agent.linux.ovsdb_monitor [-] Output
received from ovsdb monitor:
{"data":[["3602a7b2-b559-4709-9bf0-53ae2af68d06","insert","tap496b808c-b5"]],"headings":["row","action","name"]}</div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">
&lt;snip&gt;</div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">2014-01-19 20:49:41.925 8142 DEBUG neutron.agent.linux.utils [-] <br>
Command:
['sudo', '/usr/local/bin/neutron-rootwrap',
'/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=10', 'set',
'Port', 'tap496b808c-b5', 'tag=64']<br>Exit code: 0</div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">
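<br>For reference, the gap between those two log lines can be checked with a few lines of Python (timestamps copied from the log above; the variable names are only illustrative):<br>
<pre>
```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the two log lines above.
port_seen = datetime.strptime("2014-01-19 20:48:57.412", FMT)
port_tagged = datetime.strptime("2014-01-19 20:49:41.925", FMT)

# Seconds between the tap device appearing and the vlan tag being set.
delay = (port_tagged - port_seen).total_seconds()
print(round(delay, 3))
```
</pre>
That comes to roughly 44.5 sec, which is where the "about 45 seconds" figure comes from.<br>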
<br></div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">Darragh.<br></div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">
<br></div><div style="font-style:normal;font-size:16px;background-color:transparent;font-family:HelveticaNeue,Helvetica Neue,Helvetica,Arial,Lucida Grande,Sans-Serif">>I have been seeing in the past 2 days timeout failures on gate jobs which I<br>
>am struggling to explain. An example is
available in [1]<br>>These are the usual failures that we associate with bug 1253896, but this<br>>time I can verify that:<br>>- The floating IP is correctly wired (IP and NAT rules)<br>>- The DHCP port is correctly wired, as well as the VM port and the router<br>
>port<br>>- The DHCP agent is correctly started for the network<br>><br>>However, no DHCP DISCOVER request is sent. Only the DHCP RELEASE message is<br>>seen.<br>>Any help at interpreting the logs will be appreciated.<br>
><br>><br>>Salvatore<br>><br>>[1] <a href="http://logs.openstack.org/38/65838" target="_blank">http://logs.openstack.org/38/65838</a><br></div></div></div></div><br><br></div> </blockquote> </div> </div> </div>
</div></div> </div></div><br>_______________________________________________<br>
OpenStack-dev mailing list<br>
<a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br></blockquote></div><br></div>