<div dir="ltr">For what it's worth, we considered this aspect during the past release cycle from the perspective of NVP, the Neutron plugin my team maintains.<div><br><div>The synchronous model that most plugins with a backend controller currently implement is simple and convenient, but it has some flaws:</div>

<div><br></div>

<div>- reliability: the current approach, where the plugin orchestrates the backend, is not well suited to ensuring that your running configuration (backend/control plane) is in sync with your desired configuration (neutron/mgmt plane). Moreover, in some cases, due to neutron internals, API calls to the backend are wrapped in a transaction too, leading to very long SQL transactions, which are quite dangerous. It is not easy to recover from a failure caused by an eventlet thread deadlocking with a mysql transaction, where by 'recover' I mean ensuring that neutron and backend state are in sync.</div>

<div><br></div>

<div>- maintainability: since handling rollback after failures on the backend and/or the db is cumbersome, this often leads to spaghetti code which is very hard to maintain regardless of the effort (ok, I agree that this also depends on how good the devs are - most of the guys in my team are very good, but unfortunately they have me too...).</div>

<div><br></div><div>- performance &amp; scalability:</div><div>    - roundtrips to the backend take a non-negligible toll on the duration of an API call, whereas most Neutron API calls should probably just terminate at the DB, just as a nova boot call does not wait for the VM to be ACTIVE before returning.</div>

<div>    - we need to keep some operations serialized in order to avoid the race issues mentioned in this thread</div><div><br></div><div>For these reasons we're progressively changing the NVP plugin through a series of patches under this umbrella blueprint [1].</div>
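<div><br></div><div>To illustrate what "terminating at the DB" could look like, here is a minimal sketch in plain Python. It is not NVP plugin code: AsyncPlugin, FakeBackend and the in-memory dict standing in for the DB are names made up for illustration, and the status names simply follow the pending_* idea raised in this thread. The API call only records the desired state and returns; a background worker talks to the backend and updates the status afterwards.</div>
<pre>
```python
import queue
import threading

# Hypothetical status names, used only for this illustration.
PENDING_CREATE, ACTIVE, ERROR = "PENDING_CREATE", "ACTIVE", "ERROR"


class FakeBackend:
    """Stand-in for the controller; fails for port ids listed in `broken`."""

    def __init__(self, broken=()):
        self.broken = set(broken)
        self.ports = set()

    def create_port(self, port_id):
        if port_id in self.broken:
            raise RuntimeError("backend rejected %s" % port_id)
        self.ports.add(port_id)


class AsyncPlugin:
    """API calls end at the 'DB'; a worker thread syncs the backend later."""

    def __init__(self, backend):
        self.db = {}  # port_id to status; stand-in for the Neutron DB
        self.db_lock = threading.Lock()
        self.backend = backend
        self.tasks = queue.Queue()
        worker = threading.Thread(target=self._sync_loop, daemon=True)
        worker.start()

    def create_port(self, port_id):
        # The "transaction" only touches the DB: no backend round trip here.
        with self.db_lock:
            self.db[port_id] = PENDING_CREATE
        self.tasks.put(port_id)
        return {"id": port_id, "status": PENDING_CREATE}

    def _sync_loop(self):
        while True:
            port_id = self.tasks.get()
            try:
                self.backend.create_port(port_id)
                status = ACTIVE
            except Exception:
                # No DB rollback: the failure is recorded as a status.
                status = ERROR
            with self.db_lock:
                self.db[port_id] = status
            self.tasks.task_done()

    def wait_until_synced(self):
        # Blocks until every queued task has been processed.
        self.tasks.join()


def run_demo():
    plugin = AsyncPlugin(FakeBackend(broken={"port-b"}))
    first = plugin.create_port("port-a")
    plugin.create_port("port-b")
    plugin.wait_until_synced()
    return first["status"], dict(plugin.db)
```
</pre>
<div>In this sketch a backend failure never has to be rolled back inside a DB transaction; it just ends up as an ERROR status that a client (or a later resync) can see. Whether that trade-off suits a given plugin is of course a separate question.</div>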

<div><br></div><div>To address the issues mentioned by Isaku, we've been looking for a task management library with an efficient and reliable set of abstractions for ensuring operations are properly ordered, thus avoiding those races (I agree with the observation on the pre/post commit solution).</div>
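<div><br></div><div>As a rough illustration of the ordering property we are after (this uses neither celery nor taskflow, just the Python standard library), the sketch below runs operations on the same resource strictly in submission order while letting different resources proceed in parallel. KeyedSerialExecutor and the resource names are hypothetical.</div>
<pre>
```python
from concurrent.futures import ThreadPoolExecutor


class KeyedSerialExecutor:
    """Run tasks that share a key strictly in submission order."""

    def __init__(self, shards=4):
        # One single-threaded pool per shard: each shard runs its tasks FIFO.
        self._pools = [ThreadPoolExecutor(max_workers=1) for _ in range(shards)]

    def submit(self, key, fn, *args):
        # Within one process the same key always maps to the same shard,
        # so two tasks for one resource can never run concurrently.
        pool = self._pools[hash(key) % len(self._pools)]
        return pool.submit(fn, *args)

    def shutdown(self):
        for pool in self._pools:
            pool.shutdown(wait=True)


def run_demo():
    log = []
    executor = KeyedSerialExecutor()
    ops = [
        ("net-1", "create"),
        ("net-2", "create"),
        ("net-1", "update"),
        ("net-1", "delete"),
    ]
    futures = [executor.submit(res, log.append, (res, op)) for res, op in ops]
    for future in futures:
        future.result()  # would re-raise any exception from the task
    executor.shutdown()
    # Operations on net-1 are observed in the order they were submitted.
    return [op for res, op in log if res == "net-1"]
```
</pre>
<div>A real task library adds what this sketch lacks, such as persistence of pending tasks, retries and workflow composition, which is why we would rather not write this ourselves.</div>
<div><br></div>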

<div>We are currently looking at using celery [2] rather than taskflow, mostly because we already have expertise in using it in our applications, and because it has very easy abstractions for workflow design as well as for handling task failures.</div>

<div>That said, I think we're still open to switching to taskflow should we become aware of some very good reason for using it.</div><div><br></div><div>Regards,</div><div>Salvatore</div><div><br></div><div>[1] <a href="https://blueprints.launchpad.net/neutron/+spec/nvp-async-backend-communication">https://blueprints.launchpad.net/neutron/+spec/nvp-async-backend-communication</a></div>

<div>[2] <a href="http://docs.celeryproject.org/en/master/index.html">http://docs.celeryproject.org/en/master/index.html</a></div>

<div><br></div></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On 19 November 2013 19:42, Joshua Harlow <span dir="ltr"><<a href="mailto:harlowja@yahoo-inc.com" target="_blank">harlowja@yahoo-inc.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">And also of course, nearly forgot a similar situation/review in heat.<br>

<br>

<a href="https://review.openstack.org/#/c/49440/" target="_blank">https://review.openstack.org/#/c/49440/</a><br>

<br>

Except theirs was/is dealing with stack locking (a heat concept).<br>

<div class="HOEnZb"><div class="h5"><br>

On 11/19/13 10:33 AM, "Joshua Harlow" <<a href="mailto:harlowja@yahoo-inc.com">harlowja@yahoo-inc.com</a>> wrote:<br>

<br>

>If you start adding these states you might really want to consider the<br>

>following work that is going on in other projects.<br>

><br>

>It surely appears that everyone is starting to hit the same problem (and<br>

>joining efforts would produce a more beneficial result).<br>

><br>

>Relevant icehouse etherpads:<br>

>- <a href="https://etherpad.openstack.org/p/CinderTaskFlowFSM" target="_blank">https://etherpad.openstack.org/p/CinderTaskFlowFSM</a><br>

>- <a href="https://etherpad.openstack.org/p/icehouse-oslo-service-synchronization" target="_blank">https://etherpad.openstack.org/p/icehouse-oslo-service-synchronization</a><br>

><br>

>And of course my obvious plug for taskflow (which is designed to be a<br>

>useful library to help in all these usages).<br>

><br>

>- <a href="https://wiki.openstack.org/wiki/TaskFlow" target="_blank">https://wiki.openstack.org/wiki/TaskFlow</a><br>

><br>

>The states you just mentioned start to line up with<br>

><a href="https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow" target="_blank">https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow</a><br>

><br>

>If this sounds like a useful way to go (joining efforts) then let's see how<br>

>we can make it possible.<br>

><br>

>IRC: #openstack-state-management is where I am usually at.<br>

><br>

>On 11/19/13 3:57 AM, "Isaku Yamahata" <<a href="mailto:isaku.yamahata@gmail.com">isaku.yamahata@gmail.com</a>> wrote:<br>

><br>

>>On Mon, Nov 18, 2013 at 03:55:49PM -0500,<br>

>>Robert Kukura <<a href="mailto:rkukura@redhat.com">rkukura@redhat.com</a>> wrote:<br>

>><br>

>>> On 11/18/2013 03:25 PM, Edgar Magana wrote:<br>

>>> > Developers,<br>

>>> ><br>

>>> > This topic has been discussed before but I do not remember if we have a<br>

>>> > good solution or not.<br>

>>><br>

>>> The ML2 plugin addresses this by calling each MechanismDriver twice. The<br>

>>> create_network_precommit() method is called as part of the DB<br>

>>> transaction, and the create_network_postcommit() method is called after<br>

>>> the transaction has been committed. Interactions with devices or<br>

>>> controllers are done in the postcommit methods. If the postcommit method<br>

>>> raises an exception, the plugin deletes that partially-created resource<br>

>>> and returns the exception to the client. You might consider a similar<br>

>>> approach in your plugin.<br>

>><br>

>>Splitting the work into two phases, pre/post, is a good approach.<br>

>>But a race window still remains.<br>

>>Once the transaction is committed, the result is visible to the outside.<br>

>>So concurrent requests to the same resource will be racy.<br>

>>There is a window after pre_xxx_yyy() and before post_xxx_yyy() where<br>

>>other requests can be handled.<br>

>><br>

>>The state machine needs to be enhanced, I think. (plugins need<br>

>>modification)<br>

>>For example, adding more states like pending_{create, delete, update}.<br>

>>Also we would like to consider serializing operations between ports<br>

>>and subnets, or between subnets and networks, depending on<br>

>>performance requirements.<br>

>>(Or carefully audit complex status changes, e.g.<br>

>>changing a port during subnet/network update/deletion.)<br>

>><br>

>>I think it would be useful to establish a reference locking policy<br>

>>for the ML2 plugin for SDN controllers.<br>

>>Thoughts or comments? If this is considered useful and acceptable,<br>

>>I'm willing to help.<br>

>><br>

>>thanks,<br>

>>Isaku Yamahata<br>

>><br>

>>> -Bob<br>

>>><br>

>>> > Basically, if concurrent API calls are sent to Neutron, all of them are<br>

>>> > sent to the plug-in level where two actions have to be made:<br>

>>> ><br>

>>> > 1. DB transaction - Not just for data persistence but also to collect the<br>

>>> > information needed for the next action<br>

>>> > 2. Plug-in back-end implementation - In our case it is a call to the python<br>

>>> > library that consequently calls the PLUMgrid REST GW (soon SAL)<br>

>>> ><br>

>>> > For instance:<br>

>>> ><br>

>>> > def create_port(self, context, port):<br>

>>> >     with context.session.begin(subtransactions=True):<br>

>>> >         # Plugin DB - Port Create and Return port<br>

>>> >         port_db = super(NeutronPluginPLUMgridV2,<br>

>>> >                         self).create_port(context, port)<br>

>>> >         device_id = port_db["device_id"]<br>

>>> >         if port_db["device_owner"] == "network:router_gateway":<br>

>>> >             router_db = self._get_router(context, device_id)<br>

>>> >         else:<br>

>>> >             router_db = None<br>

>>> >         try:<br>

>>> >             LOG.debug(_("PLUMgrid Library: create_port() called"))<br>

>>> >             # Back-end implementation<br>

>>> >             self._plumlib.create_port(port_db, router_db)<br>

>>> >         except Exception:<br>

>>> >             ...<br>

>>> ><br>

>>> > The way we have implemented it at the plugin level in Havana (even in<br>

>>> > Grizzly) is that both actions are wrapped in the same "transaction", which<br>

>>> > automatically rolls back any operation to its original state,<br>

>>> > mostly protecting the DB from being left in an inconsistent state or with<br>

>>> > leftover data if the back-end part fails.<br>

>>> > The problem that we are experiencing is that when concurrent calls to the<br>

>>> > same API are sent, the operations at the plug-in back-end take<br>

>>> > long enough to make the next concurrent API call get stuck at the DB<br>

>>> > transaction level, which creates a hung state for the Neutron Server to<br>

>>> > the point that all concurrent API calls will fail.<br>

>>> ><br>

>>> > This can be fixed if we include some "locking" system such as calling:<br>

>>> ><br>

>>> > from neutron.common import utils<br>

>>> > ...<br>

>>> ><br>

>>> > @utils.synchronized('any-name', external=True)<br>

>>> > def create_port(self, context, port):<br>

>>> > ...<br>

>>> ><br>

>>> > Obviously, this will serialize all concurrent calls,<br>

>>> > which will end up giving really bad performance. Does anyone have a<br>

>>> > better solution?<br>

>>> ><br>

>>> > Thanks,<br>

>>> ><br>

>>> > Edgar<br>

>>> ><br>

>>> ><br>

>>> > _______________________________________________<br>

>>> > OpenStack-dev mailing list<br>

>>> > <a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>

>>> > <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

>>> ><br>

>>><br>

>>><br>


>><br>

>>--<br>

>>Isaku Yamahata <<a href="mailto:isaku.yamahata@gmail.com">isaku.yamahata@gmail.com</a>><br>

>><br>


><br>

><br>


<br>

<br>


</div></div></blockquote></div><br></div>