<div dir="ltr"><div>This doesn't actually solve the issue: if you run multiple Neutron servers behind a load balancer, I believe you will still run into the same problem with the database transaction.</div><div>
<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex"><div style="font-size:14px;font-family:Calibri,sans-serif;word-wrap:break-word">
</div></blockquote><div class="gmail_extra">We handle this issue in the NVP plugin by removing the transaction and attempting to manually delete the port if the REST call to NVP fails. In the (unlikely) case where the port cannot be deleted from the database, its operational status eventually goes to an error state, set by a background thread that syncs operational status from NVP to the Neutron database. Later, we have to garbage collect the ports in error state.<br>
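<div><br></div>
<div>A minimal sketch of that approach, for illustration only: the DB write is completed first, the back-end call runs outside any transaction, and a failure is compensated by deleting the row by hand. FakeDB, FailingBackend and BackendError are hypothetical stand-ins, not the NVP plugin's actual classes, and the background status sync and garbage collection are not shown.</div>

```python
class BackendError(Exception):
    """Raised by the hypothetical back-end client when a call fails."""


class FakeDB:
    """In-memory stand-in for the Neutron database (illustration only)."""

    def __init__(self):
        self.ports = {}

    def create_port(self, port_id, data):
        self.ports[port_id] = dict(data)
        return self.ports[port_id]

    def delete_port(self, port_id):
        del self.ports[port_id]


class FailingBackend:
    """Hypothetical back-end whose REST call always fails."""

    def create_port(self, port):
        raise BackendError("back-end unreachable")


def create_port(db, backend, port_id, data):
    # Step 1: write the DB row; nothing is left open afterwards.
    port = db.create_port(port_id, data)
    try:
        # Step 2: slow back-end call, made outside any DB transaction.
        backend.create_port(port)
    except BackendError:
        # Step 3: compensate manually by deleting the row created above.
        db.delete_port(port_id)
        raise
    return port
```

<div>The trade-off is that rollback is no longer automatic: if the compensating delete itself fails, the row is left behind and has to be caught by some later cleanup.</div>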
<div class="gmail_quote"><br></div><div class="gmail_quote">On Mon, Nov 18, 2013 at 12:43 PM, Joshua Harlow <span dir="ltr"><<a href="mailto:harlowja@yahoo-inc.com" target="_blank">harlowja@yahoo-inc.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-color:rgb(204,204,204);border-left-style:solid;padding-left:1ex">
<div style="font-size:14px;font-family:Calibri,sans-serif;word-wrap:break-word">
<div>An idea, make the lock more granular.</div>
<div><br>
</div>
<div>Instead of @utils.synchronized('any-name'), I wonder if you could do something like:</div>
<div><br>
</div>
<div>with utils.synchronized('any-name-$device-id'):</div>
<div><span style="white-space:pre-wrap"></span># Code here</div>
<div><br>
</div>
<div>Then at least you won't be locking at the method level (which means no concurrency). Would that work?</div>
<div><br>
</div>
<span>
<div style="border-width:1pt medium medium;border-style:solid none none;padding:3pt 0in 0in;text-align:left;font-size:11pt;font-family:Calibri;border-top-color:rgb(181,196,223)">
<span style="font-weight:bold">From: </span>Edgar Magana <<a href="mailto:emagana@plumgrid.com" target="_blank">emagana@plumgrid.com</a>><br>
<span style="font-weight:bold">Reply-To: </span>"OpenStack Development Mailing List (not for usage questions)" <<a href="mailto:openstack-dev@lists.openstack.org" target="_blank">openstack-dev@lists.openstack.org</a>><br>
<span style="font-weight:bold">Date: </span>Monday, November 18, 2013 12:25 PM<br>
<span style="font-weight:bold">To: </span>OpenStack List <<a href="mailto:openstack-dev@lists.openstack.org" target="_blank">openstack-dev@lists.openstack.org</a>><br>
<span style="font-weight:bold">Subject: </span>[openstack-dev] [Neutron] Race condition between DB layer and plugin back-end implementation<br>
</div><div><div class="h5">
<div><br>
</div>
<div>
<div style="font-size:14px;font-family:Calibri,sans-serif;word-wrap:break-word">
<div>Developers,</div>
<div><br>
</div>
<div>This topic has been discussed before but I do not remember if we have a good solution or not.</div>
<div>Basically, if concurrent API calls are sent to Neutron, all of them are sent to the plug-in level, where two actions have to be performed:</div>
<div>
<div><br>
</div>
<div>1. DB transaction – Not just for data persistence but also to collect the information needed for the next action</div>
<div>2. Plug-in back-end implementation – In our case, this is a call to the Python library that in turn calls the PLUMgrid REST GW (soon SAL)</div>
<div><br>
</div>
<div>For instance:</div>
<div><br>
</div>
<div>
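<div>def create_port(self, context, port):</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;with context.session.begin(subtransactions=True):</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;# Plugin DB - Port Create and Return port</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;port_db = super(NeutronPluginPLUMgridV2, self).create_port(context,</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;port)</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;device_id = port_db["device_id"]</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;if port_db["device_owner"] == "network:router_gateway":</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;router_db = self._get_router(context, device_id)</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;else:</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;router_db = None</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;try:</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;LOG.debug(_("PLUMgrid Library: create_port() called"))</div>
<div><span style="white-space:pre-wrap">            </span># Back-end implementation</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;self._plumlib.create_port(port_db, router_db)</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;except Exception:</div>
<div>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;…</div>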
</div>
<div><br>
</div>
<div>The way we have implemented this at the plugin level in Havana (and even in Grizzly) is that both actions are wrapped in the same "transaction", which automatically rolls back any operation to its original state if the back-end part fails, mainly protecting the DB from inconsistent state or leftover data.</div>
<div>The problem we are experiencing is that when concurrent calls to the same API are sent, the operations at the plug-in back-end take long enough that the next concurrent API call gets stuck at the DB transaction level. This creates a hung state for the Neutron Server, to the point that all concurrent API calls fail.</div>
</div>
<div><br>
</div>
<div>This can be fixed if we include some "locking" system such as calling:</div>
<div><br>
</div>
<div>from neutron.common import utils</div>
<div>…</div>
<div><br>
</div>
<div>@utils.synchronized('any-name', external=True)</div>
<div>def create_port(self, context, port):</div>
<div>…</div>
<div><br>
</div>
<div>Obviously, this will serialize all concurrent calls, which will end up giving really bad performance. Does anyone have a better solution?</div>
<div><br>
</div>
<div>Thanks,</div>
<div><br>
</div>
<div>Edgar</div>
</div>
</div>
</div></div></span>
</div>
<br>_______________________________________________<br>
OpenStack-dev mailing list<br>
<a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br></blockquote></div><br></div></div>