<div dir="ltr">><span style="font-size:12.8000001907349px">The thing is, is that you *should* be able to call core_plugin.create_port in a </span><span style="font-size:12.8000001907349px">transaction. </span><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">Well it depends on what you mean by that. If you mean create_port should be part of the same transaction, I disagree because it leads to either inconsistency or a loss of veto power for drivers with external backends.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">With the current code, if you enclose create_port in a transaction and then have a failure in the parent transaction after the port is created, the DB creation will be rolled back but nothing will inform the backend to release the resources it allocated for the port.</span><br></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">If we switch to a notification system like you described where notifications are deferred until after create_port is complete, we just end up removing the ability for backends to block a create_port call if necessary. That's a pretty significant change because callers will think they have successfully created a port when not all of the relevant systems have confirmed it.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">This is going to become even more pronounced if procedures to allocate IP addresses and whatever else for the port result in calls to external servers.</span><br></div><div><span style="font-size:12.8000001907349px"><br></span></div><div>In the hack you showed, wouldn't it be easier to just to have a way to register extra DB operations to be performed on port_create? Something like a run-time defined mechanism driver with only a create port pre-commit method.</div><div><br></div><div><br></div><div>><span style="font-size:12.8000001907349px">We're still left with questions such </span><span style="font-size:12.8000001907349px">as: What happens if I commit a mega-transaction and then all (Or even more complicated, one) of </span><span style="font-size:12.8000001907349px">the notifications fails, but this isn't a new problem.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">This is why I think we shouldn't just rely on the DB to make mega-transactions. It doesn't really work with us calling out to other systems. We need a more generic system to manage flows of tasks that each have rollback mechanisms so the semantics rolling back large operations are handled in a database independent manner. If only such a system existed. ;-)</span></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 13, 2015 at 3:50 PM, Assaf Muller <span dir="ltr"><<a href="mailto:amuller@redhat.com" target="_blank">amuller@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class=""><br>

<br>

----- Original Message -----<br>

> I think removing all occurrences of create_port inside of another transaction<br>

> is something we should be doing for a couple of reasons.<br>

<br>

</span>The issues you're pointing out are very much real. It's a *huge* pain to workaround<br>

this issue and you can look for an example here:<br>

<a href="https://github.com/openstack/neutron/blob/master/neutron/db/l3_hamode_db.py#L303" target="_blank">https://github.com/openstack/neutron/blob/master/neutron/db/l3_hamode_db.py#L303</a><br>

<br>

The thing is, is that you *should* be able to call core_plugin.create_port in a<br>

transaction. I think that the correct thing to do is to eliminate the issue with<br>

create_port, and not work around the issue with awful patterns such as the one<br>

in the link above. There's a few different acute issues with that pattern:<br>

1) We have no automated way to tell if create_port is being called in a transaction<br>

   or not, currently it's left up to reviewers to spot such occurrences and prevent<br>

   them from being merged.<br>

2) The mental load it adds to read that code is not trivial.<br>

3) Transactions are awesome... I'd very much like to group up core_plugin.create_port<br>

   and create_ha_port_binding in a single transaction and avoid having to deal with<br>

   edge cases manually.<br>

4) Sometimes you can't use the try/except/manual cleanup approach (If you delete a resource<br>

   in transaction A, then transaction B fails, good luck re-creating the resource you already<br>

   deleted).<br>

<br>

The better long term approach would be to introduce a framework at the API layer that queues<br>

up notifications (Both HTTP to vendor servers and RPC to agents) at the start of an API or RPC call.<br>

You're then free to use a single huge transaction (Fun!), and finally all queued up notifications<br>

will be sent for you automagically. That's the simplest approach, I haven't thought this through<br>

and I'm sure there will be issues but it should be possible. We're still left with questions such<br>

as: What happens if I commit a mega-transaction and then all (Or even more complicated, one) of<br>

the notifications fails, but this isn't a new problem.<br>

<div class="HOEnZb"><div class="h5"><br>

><br>

> First, it's a recipe for the cherished "lock wait timeout" deadlocks because<br>

> create_port makes yielding calls. These are awful to troubleshoot and are<br>

> pretty annoying for users (request takes ~60 seconds and then blows up).<br>

><br>

> Second, create_port in ML2 expects the transaction to be committed to the DB<br>

> by the time it's done with pre-commit phase, which we break by opening a<br>

> parent transaction before calling it so the failure handling semantics may<br>

> be messed up.<br>

><br>

><br>

><br>

> On Mon, Apr 13, 2015 at 9:48 AM, Carl Baldwin < <a href="mailto:carl@ecbaldwin.net">carl@ecbaldwin.net</a> > wrote:<br>

><br>

><br>

> Have we found the last of them? I wonder. I suppose any higher level<br>

> service like a router that needs to create ports under the hood (under<br>

> the API) will have this problem. The DVR fip namespace creation comes<br>

> to mind. It will create a port to use as the external gateway port<br>

> for that namespace. This could spring up in the context of another<br>

> create_port, I think (VM gets new port bound to a compute host where a<br>

> fip namespace needs to spring in to existence).<br>

><br>

> Carl<br>

><br>

> On Mon, Apr 13, 2015 at 10:24 AM, John Belamaric<br>

> < <a href="mailto:jbelamaric@infoblox.com">jbelamaric@infoblox.com</a> > wrote:<br>

> > Thanks Pavel. I see an additional case in L3_NAT_dbonly_mixin, where it<br>

> > starts the transaction in create_router, then eventually gets to<br>

> > create_port:<br>

> ><br>

> > create_router (starts tx)<br>

> > ->self._update_router_gw_info<br>

> > ->_create_gw_port<br>

> > ->_create_router_gw_port<br>

> > ->create_port(plugin)<br>

> ><br>

> > So that also would need to be unwound.<br>

> ><br>

> > On 4/13/15, 10:44 AM, "Pavel Bondar" < <a href="mailto:pbondar@infoblox.com">pbondar@infoblox.com</a> > wrote:<br>

> ><br>

> >>Hi,<br>

> >><br>

> >>I made some investigation on the topic[1] and see several issues on this<br>

> >>way.<br>

> >><br>

> >>1. Plugin's create_port() is wrapped up in top level transaction for<br>

> >>create floating ip case[2], so it becomes more complicated to do IPAM<br>

> >>calls outside main db transaction.<br>

> >><br>

> >>- for create floating ip case transaction is initialized on<br>

> >>create_floatingip level:<br>

> >>create_floatingip(l3_db)->create_port(plugin)->create_port(db_base)<br>

> >>So IPAM call should be added into create_floatingip to be outside db<br>

> >>transaction<br>

> >><br>

> >>- for usual port create transaction is initialized on plugin's<br>

> >>create_port level, and John's change[1] cover this case:<br>

> >>create_port(plugin)->create_port(db_base)<br>

> >><br>

> >>Create floating ip work-flow involves calling plugin's create_port,<br>

> >>so IPAM code inside of it should be executed only when it is not wrapped<br>

> >>into top level transaction.<br>

> >><br>

> >>2. It is opened question about error handling.<br>

> >>Should we use taskflow to manage IPAM calls to external systems?<br>

> >>Or simple exception based model is enough to handle rollback actions on<br>

> >>third party systems in case of failing main db transaction.<br>

> >><br>

> >>[1] <a href="https://review.openstack.org/#/c/172443/" target="_blank">https://review.openstack.org/#/c/172443/</a><br>

> >>[2] neutron/db/l3_db.py: line 905<br>

> >><br>

> >>Thanks,<br>

> >>Pavel<br>

> >><br>

> >>On 10.04.2015 21:04, <a href="mailto:openstack-dev-request@lists.openstack.org">openstack-dev-request@lists.openstack.org</a> wrote:<br>

> >>> L3 Team,<br>

> >>><br>

> >>> I have put up a WIP [1] that provides an approach that shows the ML2<br>

> >>>create_port method refactored to use the IPAM driver prior to initiating<br>

> >>>the database transaction. Details are in the commit message - this is<br>

> >>>really just intended to provide a strawman for discussion of the<br>

> >>>options. The actual refactor here is only about 40 lines of code.<br>

> >>><br>

> >>> [1] <a href="https://review.openstack.org/#/c/172443/" target="_blank">https://review.openstack.org/#/c/172443/</a><br>

> >>><br>

> >>><br>

> >>> Thanks,<br>

> >>> John<br>

> >><br>

> >><br>

> >>__________________________________________________________________________<br>

> >>OpenStack Development Mailing List (not for usage questions)<br>

> >>Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>

> >> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

> ><br>

> ><br>

> > __________________________________________________________________________<br>

> > OpenStack Development Mailing List (not for usage questions)<br>

> > Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>

> > <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

><br>

> __________________________________________________________________________<br>

> OpenStack Development Mailing List (not for usage questions)<br>

> Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>

> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

><br>

><br>

><br>

> --<br>

> Kevin Benton<br>

><br>

> __________________________________________________________________________<br>

> OpenStack Development Mailing List (not for usage questions)<br>

> Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>

> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

><br>

<br>

__________________________________________________________________________<br>

OpenStack Development Mailing List (not for usage questions)<br>

Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>

<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

</div></div></blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div>Kevin Benton</div></div>

</div>