<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=windows-1250">
</head>
<body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; color: rgb(0, 0, 0); font-size: 14px; font-family: Calibri, sans-serif; ">
<div>Can you explain a little how using celery achieves workflow reliability and avoids races (or mitigates spaghetti code)?</div>
<div><br>
</div>
<div>To me celery acts as a way to distribute tasks, but it does not deal with actually forming an easily understandable way of knowing that a piece of code you design is actually going to go through the various state transitions (or states &amp; workflows) that you expect (this is a higher-level mechanism that you can build on top of a distribution system). So this means that NVP (or neutron, or others?) must be maintaining an orchestration/engine layer on top of celery to add this additional set of code that 'drives' celery to accomplish a given workflow in a reliable manner.</div>
<div><br>
</div>
<div>This starts to sound pretty similar to what taskflow is doing: not a direct competitor to a distributed task queue such as celery, but a provider of this higher-level mechanism that adds these benefits, since they are needed anyway. </div>
<div><br>
</div>
<div>To me these benefits currently are (the list may get bigger in the future): </div>
<div><br>
</div>
<div>1. A way to define a workflow (in a way that is not tied to celery, since celery's '@task' decorator ties you to celery's internal implementation).</div>
<div> - This includes ongoing work to determine how to easily define a state machine in a way that is relevant to cinder (and other projects).</div>
<div>2. A way to keep track of the state that the workflow goes through (this brings along resumption, progress information… when you track at the right level).</div>
<div>3. A way to execute that workflow reliably (potentially using celery, rpc, local threads, or other future hotness). </div>
<div> - This becomes important when you ask yourself: how do you plan on testing celery in the gate/jenkins/CI?</div>
<div>4. A way to guarantee that the workflow upon failure is *automatically* resumed by some other entity.</div>
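<div>As a rough illustration of points 1-3, here is a minimal, hypothetical sketch in plain Python (these names are illustrative, not taskflow's actual API): a workflow is defined independently of any execution backend, and state transitions are recorded so that some other entity could resume a failed run.</div>

```python
# Hypothetical sketch: a workflow defined independently of how it runs.
# Task/Flow/run_flow are illustrative names, not taskflow's real API.

RUNNING, SUCCESS, FAILURE = "RUNNING", "SUCCESS", "FAILURE"

class Task:
    """A unit of work; knows nothing about celery/rpc/threads."""
    def __init__(self, name, func):
        self.name, self.func = name, func

class Flow:
    """An ordered workflow definition, decoupled from execution."""
    def __init__(self, name, tasks):
        self.name, self.tasks = name, tasks

def run_flow(flow, state_log, start_at=0):
    """Execute tasks in order, recording every state transition.

    Returns None on success, or the index of the failed task so a
    different worker can resume from that point later.
    """
    for i, task in enumerate(flow.tasks[start_at:], start=start_at):
        state_log.append((task.name, RUNNING))
        try:
            task.func()
        except Exception:
            state_log.append((task.name, FAILURE))
            return i  # resume point for whoever picks this flow up
        state_log.append((task.name, SUCCESS))
    return None

log = []
flow = Flow("create_port", [Task("db_create", lambda: None),
                            Task("backend_create", lambda: None)])
assert run_flow(flow, log) is None
```

<div>The point is that the Flow object above could be handed to celery, an rpc dispatcher, or a thread pool without rewriting the tasks, and the recorded log is what makes resumption (point 4) possible.</div>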
<div><br>
</div>
<div>More details @ http://www.slideshare.net/harlowja/taskflow-27820295</div>
<div><br>
</div>
<span id="OLK_SRC_BODY_SECTION">
<div style="font-family:Calibri; font-size:11pt; text-align:left; color:black; BORDER-BOTTOM: medium none; BORDER-LEFT: medium none; PADDING-BOTTOM: 0in; PADDING-LEFT: 0in; PADDING-RIGHT: 0in; BORDER-TOP: #b5c4df 1pt solid; BORDER-RIGHT: medium none; PADDING-TOP: 3pt">
<span style="font-weight:bold">From: </span>Salvatore Orlando <<a href="mailto:sorlando@nicira.com">sorlando@nicira.com</a>><br>
<span style="font-weight:bold">Date: </span>Tuesday, November 19, 2013 2:22 PM<br>
<span style="font-weight:bold">To: </span>"OpenStack Development Mailing List (not for usage questions)" <<a href="mailto:openstack-dev@lists.openstack.org">openstack-dev@lists.openstack.org</a>><br>
<span style="font-weight:bold">Cc: </span>Joshua Harlow <<a href="mailto:harlowja@yahoo-inc.com">harlowja@yahoo-inc.com</a>>, Isaku Yamahata <<a href="mailto:isaku.yamahata@gmail.com">isaku.yamahata@gmail.com</a>>, Robert Kukura <<a href="mailto:rkukura@redhat.com">rkukura@redhat.com</a>><br>
<span style="font-weight:bold">Subject: </span>Re: [openstack-dev] [Neutron] Race condition between DB layer and plugin back-end implementation<br>
</div>
<div><br>
</div>
<div>
<div>
<div dir="ltr">For what it's worth, we have considered this aspect from the perspective of the Neutron plugin my team maintains (NVP) during the past release cycle.
<div><br>
<div>The synchronous model that most plugins with a controller on the backend currently implement is simple and convenient, but has some flaws:</div>
<div><br>
</div>
<div>- reliability: the current approach, where the plugin orchestrates the backend, is not really optimal when it comes to ensuring your running configuration (backend/control plane) is in sync with your desired configuration (neutron/mgmt plane); moreover, in some cases, due to neutron internals, API calls to the backend are wrapped in a transaction too, leading to very long SQL transactions, which are quite dangerous indeed. It is not easy to recover from a failure due to an eventlet thread deadlocking with a mysql transaction, where by 'recover' I mean ensuring that neutron and backend state are in sync.</div>
<div><br>
</div>
<div>- maintainability: since handling rollback in case of failures on the backend and/or the db is cumbersome, this often leads to spaghetti code which is very hard to maintain regardless of the effort (ok, I agree here that this also depends on how good the
devs are - most of the guys in my team are very good, but unfortunately they have me too...).</div>
<div><br>
</div>
<div>- performance & scalability:</div>
<div> - roundtrips to the backend take a non-negligible toll on the duration of an API call, whereas most Neutron API calls should probably just terminate at the DB, just like a nova boot call does not wait for the VM to be ACTIVE before returning.</div>
<div> - we need to keep some operations serialized in order to avoid the race issues mentioned above</div>
<div><br>
</div>
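<div>The reliability flaw above (backend round-trips inside an open DB transaction) versus the asynchronous shape we are moving toward can be sketched roughly as follows; the session and backend objects are hypothetical stand-ins, not Neutron's actual API:</div>

```python
# Illustrative sketch only: FakeSession and the 'backend' callable are
# hypothetical stand-ins, not Neutron or SQLAlchemy objects.

class FakeSession:
    def __init__(self):
        self.committed, self.rows = False, {}
    def add(self, key, value):
        self.rows[key] = value
    def commit(self):
        self.committed = True

def create_port_synchronous(db, backend, port):
    # Anti-pattern: the backend round-trip happens while the DB
    # transaction is still open, so a slow or hung backend call keeps
    # the SQL transaction (and its locks) open too.
    db.add(port["id"], dict(port, status="ACTIVE"))
    backend(port)          # network I/O inside the transaction window
    db.commit()

def create_port_async_style(db, backend, port):
    # Safer shape: commit a short transaction first, marking the
    # resource as pending; reconcile with the backend afterwards and
    # flip the status in a second short transaction.
    db.add(port["id"], dict(port, status="PENDING_CREATE"))
    db.commit()
    backend(port)          # outside any transaction window
    db.rows[port["id"]]["status"] = "ACTIVE"
    db.commit()
```

<div>In the second shape a backend failure leaves a PENDING_CREATE row behind, which is exactly what a task management layer can pick up and reconcile later.</div>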
<div>For this reason we're progressively moving toward a change in the NVP plugin, with a series of patches under this umbrella blueprint [1].</div>
<div><br>
</div>
<div>To answer the issues mentioned by Isaku, we've been looking at a task management library with an efficient and reliable set of abstractions for ensuring operations are properly ordered, thus avoiding those races (I agree with the observation on the pre/post commit solution).</div>
<div>We are currently looking at using celery [2] rather than taskflow, mostly because we already have expertise in using it in our applications, and it has very easy abstractions for workflow design, as well as for handling task failures.</div>
<div>That said, I think we're still open to switching to taskflow should we become aware of some very good reason for using it.</div>
<div><br>
</div>
<div>Regards,</div>
<div>Salvatore</div>
<div><br>
</div>
<div>[1] <a href="https://blueprints.launchpad.net/neutron/+spec/nvp-async-backend-communication">https://blueprints.launchpad.net/neutron/+spec/nvp-async-backend-communication</a></div>
<div>[2] <a href="http://docs.celeryproject.org/en/master/index.html">http://docs.celeryproject.org/en/master/index.html</a></div>
<div><br>
</div>
</div>
</div>
<div class="gmail_extra"><br>
<br>
<div class="gmail_quote">On 19 November 2013 19:42, Joshua Harlow <span dir="ltr">
<<a href="mailto:harlowja@yahoo-inc.com" target="_blank">harlowja@yahoo-inc.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
And also of course, I nearly forgot, a similar situation/review in heat.<br>
<br>
<a href="https://review.openstack.org/#/c/49440/" target="_blank">https://review.openstack.org/#/c/49440/</a><br>
<br>
Except theirs was/is dealing with stack locking (a heat concept).<br>
<div class="HOEnZb">
<div class="h5"><br>
On 11/19/13 10:33 AM, "Joshua Harlow" <<a href="mailto:harlowja@yahoo-inc.com">harlowja@yahoo-inc.com</a>> wrote:<br>
<br>
>If you start adding these states you might really want to consider the<br>
>following work that is going on in other projects.<br>
><br>
>It surely appears that everyone is starting to hit the same problem (and<br>
>joining efforts would produce a more beneficial result).<br>
><br>
>Relevant icehouse etherpads:<br>
>- <a href="https://etherpad.openstack.org/p/CinderTaskFlowFSM" target="_blank">https://etherpad.openstack.org/p/CinderTaskFlowFSM</a><br>
>- <a href="https://etherpad.openstack.org/p/icehouse-oslo-service-synchronization" target="_blank">
https://etherpad.openstack.org/p/icehouse-oslo-service-synchronization</a><br>
><br>
>And of course my obvious plug for taskflow (which is designed to be a<br>
>useful library to help in all these usages).<br>
><br>
>- <a href="https://wiki.openstack.org/wiki/TaskFlow" target="_blank">https://wiki.openstack.org/wiki/TaskFlow</a><br>
><br>
>The states you just mentioned start to line up with<br>
><a href="https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow" target="_blank">https://wiki.openstack.org/wiki/TaskFlow/States_of_Task_and_Flow</a><br>
><br>
>If this sounds like a useful way to go (joining efforts) then let's see how<br>
>we can make it possible.<br>
><br>
>IRC: #openstack-state-management is where I am usually at.<br>
><br>
>On 11/19/13 3:57 AM, "Isaku Yamahata" <<a href="mailto:isaku.yamahata@gmail.com">isaku.yamahata@gmail.com</a>> wrote:<br>
><br>
>>On Mon, Nov 18, 2013 at 03:55:49PM -0500,<br>
>>Robert Kukura <<a href="mailto:rkukura@redhat.com">rkukura@redhat.com</a>> wrote:<br>
>><br>
>>> On 11/18/2013 03:25 PM, Edgar Magana wrote:<br>
>>> > Developers,<br>
>>> ><br>
>>> > This topic has been discussed before but I do not remember if we have<br>
>>>a<br>
>>> > good solution or not.<br>
>>><br>
>>> The ML2 plugin addresses this by calling each MechanismDriver twice.<br>
>>>The<br>
>>> create_network_precommit() method is called as part of the DB<br>
>>> transaction, and the create_network_postcommit() method is called after<br>
>>> the transaction has been committed. Interactions with devices or<br>
>>> controllers are done in the postcommit methods. If the postcommit<br>
>>>method<br>
>>> raises an exception, the plugin deletes that partially-created resource<br>
>>> and returns the exception to the client. You might consider a similar<br>
>>> approach in your plugin.<br>
>><br>
>>Splitting the work into two phases, pre/post, is a good approach.<br>
>>But a race window still remains.<br>
>>Once the transaction is committed, the result is visible to the outside,<br>
>>so a concurrent request to the same resource will be racy.<br>
>>There is a window after pre_xxx_yyy() and before post_xxx_yyy() where<br>
>>other requests can be handled.<br>
>><br>
>>The state machine needs to be enhanced, I think (plugins need<br>
>>modification).<br>
>>For example, adding more states like pending_{create, delete, update}.<br>
>>Also, we would like to consider serializing between operations on ports<br>
>>and subnets, or between operations on subnets and networks, depending on<br>
>>performance requirements.<br>
>>(Or carefully audit complex status changes, i.e.<br>
>>changing a port during subnet/network update/deletion.)<br>
>><br>
>>I think it would be useful to establish reference locking policy<br>
>>for ML2 plugin for SDN controllers.<br>
>>Thoughts or comments? If this is considered useful and acceptable,<br>
>>I'm willing to help.<br>
>><br>
>>thanks,<br>
>>Isaku Yamahata<br>
>><br>
>>> -Bob<br>
>>><br>
>>> > Basically, if concurrent API calls are sent to Neutron, all of them<br>
>>>are<br>
>>> > sent to the plug-in level where two actions have to be made:<br>
>>> ><br>
>>> > 1. DB transaction - Not just for data persistence but also to collect<br>
>>>the<br>
>>> > information needed for the next action<br>
>>> > 2. Plug-in back-end implementation - In our case it is a call to the<br>
>>>python<br>
>>> > library that consequently calls PLUMgrid REST GW (soon SAL)<br>
>>> ><br>
>>> > For instance:<br>
>>> ><br>
>>> > def create_port(self, context, port):<br>
>>> > with context.session.begin(subtransactions=True):<br>
>>> > # Plugin DB - Port Create and Return port<br>
>>> > port_db = super(NeutronPluginPLUMgridV2,<br>
>>> > self).create_port(context,<br>
>>> ><br>
>>> port)<br>
>>> > device_id = port_db["device_id"]<br>
>>> > if port_db["device_owner"] == "network:router_gateway":<br>
>>> > router_db = self._get_router(context, device_id)<br>
>>> > else:<br>
>>> > router_db = None<br>
>>> > try:<br>
>>> > LOG.debug(_("PLUMgrid Library: create_port()<br>
>>>called"))<br>
>>> > # Back-end implementation<br>
>>> > self._plumlib.create_port(port_db, router_db)<br>
>>> > except Exception:<br>
>>> > …<br>
>>> ><br>
>>> > The way we have implemented this at the plugin level in Havana (even in<br>
>>> > Grizzly) is that both actions are wrapped in the same "transaction",<br>
>>> > which automatically rolls back any operation done, returning to its<br>
>>> > original state and protecting mostly the DB from having any<br>
>>> > inconsistent state or leftover data if the back-end part fails.<br>
>>> > The problem that we are experiencing is that when concurrent calls to<br>
>>> > the same API are sent, the operations at the plug-in back-end take<br>
>>> > long enough that the next concurrent API call gets stuck at the DB<br>
>>> > transaction level, which creates a hung state for the Neutron Server,<br>
>>> > to the point that all concurrent API calls will fail.<br>
>>> ><br>
>>> > This can be fixed if we include some "locking" system such as<br>
>>>calling:<br>
>>> ><br>
>>> > from neutron.common import utils<br>
>>> > …<br>
>>> ><br>
>>> > @utils.synchronized('any-name', external=True)<br>
>>> > def create_port(self, context, port):<br>
>>> > …<br>
>>> ><br>
>>> > Obviously, this will create a serialization of all concurrent calls,<br>
>>> > which will end up in really bad performance. Does anyone have a<br>
>>> > better solution?<br>
>>> ><br>
>>> > Thanks,<br>
>>> ><br>
>>> > Edgar<br>
>>> ><br>
>>> ><br>
>>> > _______________________________________________<br>
>>> > OpenStack-dev mailing list<br>
>>> > <a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
>>> > <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
>>> ><br>
>>><br>
>>><br>
>>> _______________________________________________<br>
>>> OpenStack-dev mailing list<br>
>>> <a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
>>> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
>><br>
>>--<br>
>>Isaku Yamahata <<a href="mailto:isaku.yamahata@gmail.com">isaku.yamahata@gmail.com</a>><br>
>><br>
>>_______________________________________________<br>
>>OpenStack-dev mailing list<br>
>><a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
>><a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
><br>
><br>
>_______________________________________________<br>
>OpenStack-dev mailing list<br>
><a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
><a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br>
<br>
_______________________________________________<br>
OpenStack-dev mailing list<br>
<a href="mailto:OpenStack-dev@lists.openstack.org">OpenStack-dev@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
</div>
</div>
</blockquote>
</div>
<br>
</div>
</div>
</div>
</span>
</body>
</html>