[Openstack-operators] Issues with hybrid neutron ml2/ovs-agent ports after Icehouse upgrade

Michael Dorman mdorman at godaddy.com
Fri Oct 3 23:19:15 UTC 2014

Hi all,

Wanted to share details of an issue we just discovered around hybrid ml2/ovs configuration under Icehouse.

We run ml2 on the API nodes, but the openvswitch plugin/ovs-agent on the compute/network nodes.  We ran this split setup because under Havana it was the only way we could get ml2 working correctly, and it was recommended by an ml2 dev.  We kept this design because it continued to work under Icehouse, seemingly without issue.  We upgraded from Havana to Icehouse without too much trouble a couple of months ago.

However, we had not rebooted any compute nodes since the upgrade until this week.  When the compute nodes came back up, instances that had been created before the move to Icehouse did not start, because their vifs were not being created.

Exact error is:  https://gist.githubusercontent.com/krislindgren/c1f4f79dc12403c4815d/raw/386ef0607f32088ad372a27e06e3606f6c1ac220/gistfile1.txt

Turns out this is because ports created under Havana were missing the 'hybrid' property, which prevented the vif from being recreated on the compute host.  Ports for instances created after the Icehouse upgrade did have this property, and those instances started back up without a problem.

Specifically, the problem is that in the neutron.ml2_port_bindings table, instances created before the upgrade had this for vif_details:
{"port_filter": true}
Instances created after the upgrade had this for vif_details:
{"port_filter": true, "ovs_hybrid_plug": true}
Missing this flag caused those instances' vifs to never get plugged.  Specifically, because the ovs_hybrid_plug flag isn't in the vif_details, vif.is_hybrid_plug_enabled() returns False, and instead of calling plug_ovs_hybrid(), the driver calls plug_ovs_bridge().  plug_ovs_bridge() only calls its super implementation, which is a no-op, so the vif never actually gets plugged.
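To make the failure mode concrete, here's a simplified sketch of that dispatch logic (these are stand-in classes, not Nova's actual LibvirtGenericVIFDriver; the point is only how a missing key in vif_details silently routes plugging into a no-op):

```python
# Sketch of the vif-plug dispatch described above. Class names are
# simplified stand-ins, not Nova's real implementation.

class VIF:
    def __init__(self, vif_details):
        self.details = vif_details

    def is_hybrid_plug_enabled(self):
        # A missing key defaults to False -- the crux of the problem
        # for ports bound under Havana.
        return self.details.get("ovs_hybrid_plug", False)


class VIFDriver:
    def __init__(self):
        self.plugged = []

    def plug_ovs_hybrid(self, vif):
        # Real driver: creates the veth pair / linux bridge and
        # attaches it to OVS. Here we just record that it happened.
        self.plugged.append("hybrid")

    def plug_ovs_bridge(self, vif):
        # Real driver: only calls its no-op super implementation,
        # so nothing is created on the host.
        pass

    def plug(self, vif):
        if vif.is_hybrid_plug_enabled():
            self.plug_ovs_hybrid(vif)
        else:
            self.plug_ovs_bridge(vif)


driver = VIFDriver()
# Port bound before the upgrade: no ovs_hybrid_plug key -> no-op path.
driver.plug(VIF({"port_filter": True}))
# Port bound after the upgrade: hybrid path, vif actually gets plugged.
driver.plug(VIF({"port_filter": True, "ovs_hybrid_plug": True}))
print(driver.plugged)
```

Only the second (post-upgrade) port ends up in the plugged list, which matches what we saw on reboot.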

We ended up solving this by manually adding the hybrid property to the ports that were missing it via MySQL.  After that, starting all the Havana instances worked normally.
Here's the sql update we used:
update ml2_port_bindings set vif_details = '{"port_filter": true, "ovs_hybrid_plug": true}' where vif_details not like '%ovs_hybrid_plug%';
Note: that update statement will overwrite ALL entries that don't contain the ovs_hybrid_plug property. This was fine for us, but you should verify that it won't munge any of your data.
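If you'd rather not blanket-overwrite the column, a more surgical approach is to merge the key into each row's existing JSON and write rows back individually (SELECT, transform, UPDATE per row).  A sketch of the transform step -- the function name is ours, not anything in Neutron:

```python
import json

def add_hybrid_plug(vif_details):
    """Merge ovs_hybrid_plug into an existing vif_details JSON string,
    preserving whatever other keys the row already has."""
    details = json.loads(vif_details) if vif_details else {}
    details["ovs_hybrid_plug"] = True
    return json.dumps(details, sort_keys=True)

# A Havana-era row keeps its original port_filter value:
print(add_hybrid_plug('{"port_filter": true}'))
```

Feed each row's vif_details through this and UPDATE by port_id, and rows with unusual contents keep their other keys intact.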

Not sure if we missed a step in the Icehouse upgrade, and/or if this is just a function of our particular configuration.  It may be that running the ml2 plugin with the openvswitch mechanism driver and the ovs-agent is now the correct setup, since that mechanism driver hardcodes ovs_hybrid_plug: true.  https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/mech_openvswitch.py#L40
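By "hardcodes" we mean the mechanism driver carries a fixed vif_details dict that gets stamped onto every port it binds.  Very roughly, the pattern looks like this (a simplified sketch with made-up names, not the actual Neutron source -- see the link above for the real code):

```python
# Rough sketch of the hardcoded-vif_details pattern. Names are
# simplified stand-ins for illustration only.

VIF_TYPE_OVS = "ovs"

class OpenvswitchMechanismDriver:
    def __init__(self):
        # Fixed details applied to every port this driver binds,
        # including ovs_hybrid_plug -- so ports bound through it
        # never hit the missing-key problem described earlier.
        self.vif_type = VIF_TYPE_OVS
        self.vif_details = {"port_filter": True,
                            "ovs_hybrid_plug": True}

    def bind_port(self, port_context):
        # Simplified: record the binding with the fixed details.
        port_context["vif_type"] = self.vif_type
        port_context["vif_details"] = dict(self.vif_details)
        return port_context

ctx = OpenvswitchMechanismDriver().bind_port({})
print(ctx["vif_details"])
```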

Hope this may be useful info for somebody.

Mike (et al.)
