[openstack-dev] [neutron][L3] IPAM alternate refactoring

Assaf Muller amuller at redhat.com
Mon Apr 13 22:50:38 UTC 2015



----- Original Message -----
> I think removing all occurrences of create_port inside of another transaction
> is something we should be doing for a couple of reasons.

The issues you're pointing out are very real. It's a *huge* pain to work around
this issue; for an example, see:
https://github.com/openstack/neutron/blob/master/neutron/db/l3_hamode_db.py#L303

The thing is that you *should* be able to call core_plugin.create_port in a
transaction. I think the correct thing to do is to eliminate the issue with
create_port, not to work around it with awful patterns such as the one
in the link above. There are a few acute issues with that pattern:
1) We have no automated way to tell whether create_port is being called in a
   transaction. Currently it's left up to reviewers to spot such occurrences and
   prevent them from being merged.
2) The mental load it adds when reading that code is not trivial.
3) Transactions are awesome... I'd very much like to group core_plugin.create_port
   and create_ha_port_binding in a single transaction and avoid having to deal with
   edge cases manually.
4) Sometimes you can't use the try/except/manual cleanup approach (if you delete a resource
   in transaction A and then transaction B fails, good luck re-creating the resource you
   already deleted).
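
To make point 4 concrete, here is a minimal sketch using Python's standard
sqlite3 module. It is not Neutron code: the table names, the 'old-port' row and
the failing_step helper are all made up for illustration. It only shows why a
delete committed in one transaction cannot be rolled back by a later one, while
a single enclosing transaction restores it.

```python
import sqlite3


def make_db():
    # isolation_level=None puts sqlite3 in autocommit mode, so the explicit
    # BEGIN/COMMIT/ROLLBACK statements below are the only transaction control.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE ports (id TEXT PRIMARY KEY)")
    conn.execute("INSERT INTO ports VALUES ('old-port')")
    return conn


def failing_step():
    # Hypothetical stand-in for the second operation blowing up.
    raise RuntimeError("second step failed")


def two_transactions(conn):
    # Transaction A: delete the resource and commit.
    conn.execute("BEGIN")
    conn.execute("DELETE FROM ports WHERE id = 'old-port'")
    conn.execute("COMMIT")
    # Transaction B: fails, but rolling it back cannot undo transaction A.
    conn.execute("BEGIN")
    try:
        failing_step()
        conn.execute("COMMIT")
    except RuntimeError:
        conn.execute("ROLLBACK")
        # Manual cleanup would have to re-create 'old-port' here by hand.


def single_transaction(conn):
    # Both steps share one transaction, so one rollback undoes the delete too.
    conn.execute("BEGIN")
    try:
        conn.execute("DELETE FROM ports WHERE id = 'old-port'")
        failing_step()
        conn.execute("COMMIT")
    except RuntimeError:
        conn.execute("ROLLBACK")


def port_ids(conn):
    return [row[0] for row in conn.execute("SELECT id FROM ports")]
```

After two_transactions() the ports table is empty; after single_transaction()
on a fresh database the original row is still there.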

The better long-term approach would be to introduce a framework at the API layer that queues
up notifications (both HTTP to vendor servers and RPC to agents), starting at the beginning of an
API or RPC call. You're then free to use a single huge transaction (fun!), and at the end all
queued-up notifications are sent for you automagically. That's the simplest approach. I haven't
thought this through, and I'm sure there will be issues, but it should be possible. We're still
left with questions such as: what happens if I commit a mega-transaction and then all (or, even
more complicated, one) of the notifications fail? But this isn't a new problem.
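
A rough sketch of what such a queue could look like, in plain Python. Nothing
here exists in Neutron: NotificationQueue, api_call, enqueue, flush and discard
are hypothetical names, and the senders are ordinary callables standing in for
the HTTP and RPC notifiers. The assumption is that the database transaction is
opened inside the api_call block, so that leaving the block without an
exception means the transaction has already committed.

```python
import contextlib


class NotificationQueue:
    """Collects notifications during a call and sends them afterwards."""

    def __init__(self):
        self._pending = []
        self.failures = []

    def enqueue(self, send, *args):
        # Record the sender and its arguments; nothing is sent yet.
        self._pending.append((send, args))

    def discard(self):
        # Drop everything queued, e.g. because the transaction rolled back.
        self._pending.clear()

    def flush(self):
        # Send everything that was queued. A failed send is recorded and the
        # remaining notifications are still attempted.
        pending, self._pending = self._pending, []
        for send, args in pending:
            try:
                send(*args)
            except Exception as exc:
                self.failures.append((args, exc))
        return self.failures


@contextlib.contextmanager
def api_call(queue):
    # Wraps one API or RPC call. The (single, huge) transaction lives inside
    # the with-block; notifications go out only if the block exits cleanly.
    try:
        yield queue
    except Exception:
        queue.discard()
        raise
    queue.flush()
```

With this shape, code inside the block calls queue.enqueue(notifier, payload)
instead of notifying directly. What to do with queue.failures after the commit
is exactly the open question above; this sketch only records them.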

> 
> First, it's a recipe for the cherished "lock wait timeout" deadlocks because
> create_port makes yielding calls. These are awful to troubleshoot and are
> pretty annoying for users (request takes ~60 seconds and then blows up).
> 
> Second, create_port in ML2 expects the transaction to be committed to the DB
> by the time it's done with pre-commit phase, which we break by opening a
> parent transaction before calling it so the failure handling semantics may
> be messed up.
> 
> 
> 
> On Mon, Apr 13, 2015 at 9:48 AM, Carl Baldwin < carl at ecbaldwin.net > wrote:
> 
> 
> Have we found the last of them? I wonder. I suppose any higher level
> service like a router that needs to create ports under the hood (under
> the API) will have this problem. The DVR fip namespace creation comes
> to mind. It will create a port to use as the external gateway port
> for that namespace. This could spring up in the context of another
> create_port, I think (VM gets new port bound to a compute host where a
> fip namespace needs to spring in to existence).
> 
> Carl
> 
> On Mon, Apr 13, 2015 at 10:24 AM, John Belamaric
> < jbelamaric at infoblox.com > wrote:
> > Thanks Pavel. I see an additional case in L3_NAT_dbonly_mixin, where it
> > starts the transaction in create_router, then eventually gets to
> > create_port:
> > 
> > create_router (starts tx)
> > ->self._update_router_gw_info
> > ->_create_gw_port
> > ->_create_router_gw_port
> > ->create_port(plugin)
> > 
> > So that also would need to be unwound.
> > 
> > On 4/13/15, 10:44 AM, "Pavel Bondar" < pbondar at infoblox.com > wrote:
> > 
> >>Hi,
> >> 
> >>I did some investigation on the topic [1] and see several issues with
> >>this approach.
> >> 
> >>1. The plugin's create_port() is wrapped in a top-level transaction in
> >>the create floating IP case [2], so it becomes more complicated to make
> >>IPAM calls outside the main DB transaction.
> >> 
> >>- For the create floating IP case, the transaction is started at the
> >>create_floatingip level:
> >>create_floatingip(l3_db)->create_port(plugin)->create_port(db_base)
> >>So the IPAM call should be added to create_floatingip to stay outside
> >>the DB transaction.
> >> 
> >>- For the usual port create, the transaction is started at the plugin's
> >>create_port level, and John's change [1] covers this case:
> >>create_port(plugin)->create_port(db_base)
> >> 
> >>The create floating IP workflow involves calling the plugin's create_port,
> >>so the IPAM code inside it should be executed only when it is not wrapped
> >>in a top-level transaction.
> >> 
> >>2. Error handling is an open question.
> >>Should we use taskflow to manage IPAM calls to external systems?
> >>Or is a simple exception-based model enough to handle rollback actions on
> >>third-party systems when the main DB transaction fails?
> >> 
> >>[1] https://review.openstack.org/#/c/172443/
> >>[2] neutron/db/l3_db.py: line 905
> >> 
> >>Thanks,
> >>Pavel
> >> 
> >>On 10.04.2015 21:04, openstack-dev-request at lists.openstack.org wrote:
> >>> L3 Team,
> >>> 
> >>> I have put up a WIP [1] that provides an approach that shows the ML2
> >>>create_port method refactored to use the IPAM driver prior to initiating
> >>>the database transaction. Details are in the commit message - this is
> >>>really just intended to provide a strawman for discussion of the
> >>>options. The actual refactor here is only about 40 lines of code.
> >>> 
> >>> [1] https://review.openstack.org/#/c/172443/
> >>> 
> >>> 
> >>> Thanks,
> >>> John
> >> 
> >> 
> >>__________________________________________________________________________
> >>OpenStack Development Mailing List (not for usage questions)
> >>Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > 
> > 
> 
> 
> 
> 
> --
> Kevin Benton
> 
> 


