[openstack-dev] [neutron] [gate] 35% failure rate for neutron tempest jobs

Armando M. armamig at gmail.com
Tue Nov 10 18:37:35 UTC 2015


On 10 November 2015 at 09:49, Sean Dague <sean at dague.net> wrote:

> The neutron tempest jobs are now at a 35% failure rate:
> http://tinyurl.com/ne3ex4v (note, 35% is basically the worst possible
> fail rate, because it's just passing enough to land patches that cause
> that kind of fail on two test runs check/gate with a coin flip).
>

Sean, thanks for the heads-up.


>
> The failure is currently seen here -
>
> http://logstash.openstack.org/#/dashboard/file/logstash.json?query=message:%5C%22No%20IPv4%20addresses%20found%20in:%20%5B%5D%5C%22
>
> That is a new assert that was added in Tempest. However it was added in
> a path that expects there should be an IPv4 address. The fact that the
> port sometimes does not return one is problematic.
> https://review.openstack.org/#/c/241800/
>
> The server via nova is returning an address here -
>
> http://logs.openstack.org/76/243676/1/check/gate-tempest-dsvm-neutron-full/291e1d7/logs/tempest.txt.gz#_2015-11-10_17_14_35_465
>
> But then when the port is polled here:
>
> http://logs.openstack.org/76/243676/1/check/gate-tempest-dsvm-neutron-full/291e1d7/logs/tempest.txt.gz#_2015-11-10_17_14_35_527
> it comes back with {"ports": []}
>
>
> This can be contrasted with a working path where we do a similar
> action once the server is active here -
>
> http://logs.openstack.org/76/243676/1/check/gate-tempest-dsvm-neutron-full/291e1d7/logs/tempest.txt.gz#_2015-11-10_17_13_48_193
>
> Then we verify the port -
>
> http://logs.openstack.org/76/243676/1/check/gate-tempest-dsvm-neutron-full/291e1d7/logs/tempest.txt.gz#_2015-11-10_17_13_48_230
>
> Which returns:
>
>   Body: {"ports": [{"status": "ACTIVE", "binding:host_id":
> "devstack-trusty-rax-dfw-5784820", "allowed_address_pairs": [],
> "extra_dhcp_opts": [], "dns_assignment": [{"hostname":
> "host-10-100-0-3", "ip_address": "10.100.0.3", "fqdn":
> "host-10-100-0-3.openstacklocal."}], "device_owner": "compute:None",
> "port_security_enabled": true, "binding:profile": {}, "fixed_ips":
> [{"subnet_id": "147b1e65-3463-4965-8461-11b76a00dd99", "ip_address":
> "10.100.0.3"}], "id": "65c11c76-42fc-4010-bbb8-58996911803e",
> "security_groups": ["f2d48dcf-ea8d-4a7c-bf09-da37d3c2ee37"],
> "device_id": "b03bec85-fe69-4c0d-94e8-51753a8bebd5", "name": "",
> "admin_state_up": true, "network_id":
> "eb72d3af-f1a0-410b-8085-76cbe19ace90", "dns_name": "",
> "binding:vif_details": {"port_filter": true, "ovs_hybrid_plug": true},
> "binding:vnic_type": "normal", "binding:vif_type": "ovs", "tenant_id":
> "eab50a3d331c4db3a68f71d1ebdc41bf", "mac_address": "fa:16:3e:02:e4:ee"}]}
>
>
> HenryG suggested this might be related to the ERROR of "No more IP
> addresses available on network". However that ERROR is thrown a lot in
> neutron, and 60% of the time the tempest run is successful.
>
>
> This issue is currently stuck and needs neutron folks to engage to get
> us somewhere. Reverting the tempest patch which does the early
> verification might make this class of fail go away, but I think what
> it's done is surface a more fundamental bit where ports aren't active
> when the server is active, which may explain deeper races we've had over
> the years. So actually getting folks to dive in here would be really great.
>

We'll dig into this more deeply. AFAIK, Nova servers won't go ACTIVE if the
port isn't, so we might have a regression. That said, it's been on our
radar to better synchronize actions that need to happen on port setup.
Right now, for instance, DHCP and L2 setup are uncoordinated, and Kevin
Benton has been looking into it.
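
To make the check under discussion concrete, here is a minimal sketch (not
the actual Tempest code) of pulling the IPv4 fixed IPs out of a "list ports"
response body like the ones quoted above, and of polling until one shows up
instead of asserting on the first answer. The names ipv4_addresses,
wait_for_ipv4 and list_ports are hypothetical, as are the timeout values;
the error text simply echoes the message in the logstash query.

```python
import ipaddress
import time


def ipv4_addresses(ports_body):
    """Return the IPv4 fixed IPs found in a decoded 'list ports' body."""
    found = []
    for port in ports_body.get("ports", []):
        for fixed_ip in port.get("fixed_ips", []):
            addr = fixed_ip.get("ip_address")
            if addr and ipaddress.ip_address(addr).version == 4:
                found.append(addr)
    return found


def wait_for_ipv4(list_ports, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll list_ports() until some port reports an IPv4 address.

    list_ports is a hypothetical callable that returns the decoded JSON
    body of a port listing filtered by the server's device_id.
    """
    deadline = time.monotonic() + timeout
    while True:
        addrs = ipv4_addresses(list_ports())
        if addrs:
            return addrs
        if time.monotonic() >= deadline:
            # Mirrors the failure message seen in the gate logs.
            raise TimeoutError("No IPv4 addresses found in: %s" % addrs)
        sleep(interval)
```

With the failing response ({"ports": []}) the first function returns an
empty list; with the working response it returns the single address
10.100.0.3. Polling like this would only paper over the race, though, so it
is a debugging aid rather than a fix.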

In the meantime, I wonder if reverting the tempest patch is the best course of
action: we can then use Depends-on to test a Neutron fix and the revert of
the revert together without causing the gate too much grief.

Thoughts?

Armando


>
>         -Sean
>
> --
> Sean Dague
> http://dague.net