[openstack-dev] [Neutron] Introducing task oriented workflows

Salvatore Orlando sorlando at nicira.com
Fri May 23 13:58:17 UTC 2014


While Mistral is a service with its own REST API endpoint, taskflow is a
library (shoot me if I'm wrong here).
Also, Mistral appears, in my opinion, to satisfy a set of use cases aimed
at cloud operators rather than at building tasks within an application.

These are the reasons why I did not consider Mistral. I don't think it is
a good idea to rely on a third-party service for managing the operations
needed to complete a Neutron API operation.
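
Just to give a flavour of what I mean by "library": a minimal sketch of
in-process taskflow usage could look like the snippet below (the task
itself is made up; only the taskflow imports, the linear_flow pattern and
the engines.run call reflect the actual library):

    from taskflow import engines
    from taskflow import task
    from taskflow.patterns import linear_flow

    class ConfigureBackend(task.Task):
        def execute(self, port):
            # a real task would push the port config to the backend here
            return 'configured-%s' % port['id']

    # everything runs inside the calling process; there is no separate
    # service or REST endpoint involved, unlike with Mistral
    engines.run(linear_flow.Flow('port-config').add(ConfigureBackend()),
                store={'port': {'id': 'p1'}})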

Regarding Cinder, the answer is yes; that project is trying to make API
transformations similar to the ones Neutron is aiming to achieve.

Salvatore


On 23 May 2014 14:26, Endre Karlson <endre.karlson at gmail.com> wrote:

> I think Cinder has some of the same sauce?
>
> https://review.openstack.org/#/c/94742/
> https://review.openstack.org/#/c/95037/
>
>
>
> 2014-05-23 10:57 GMT+02:00 Jaume Devesa <devvesa at gmail.com>:
>
>> Hello,
>>
>> I think the Mistral project [1] aims at the same goal, doesn't it?
>>
>> Regards,
>> jaume
>>
>> [1]: https://wiki.openstack.org/wiki/Mistral
>>
>>
>> On 23 May 2014 09:28, Salvatore Orlando <sorlando at nicira.com> wrote:
>>
>>> Nachi,
>>>
>>> I would be glad if the solution were as easy as sticking a task_state
>>> attribute on a resource! I'm afraid, however, that would only be the tip
>>> of the iceberg, or the icing on the cake, if you want.
>>> However, I agree with you that consistency across OpenStack APIs is very
>>> important; whether this is a cross-project discussion is debatable,
>>> though. My feeling here is that taskflow is the cross-project piece of
>>> the architecture, and every project might then have a different strategy
>>> for integrating it - as long as that does not result in inconsistent
>>> APIs exposed to customers!
>>>
>>> This is something that will obviously be considered when designing how
>>> to represent whether a DB resource is in sync with its actual
>>> configuration on the backend.
>>> I think it can happen regardless of whether we also agree to let API
>>> consumers access task execution information through the API.
>>>
>>> Salvatore
>>>
>>>
>>>
>>>
>>> On 23 May 2014 01:16, Nachi Ueno <nachi at ntti3.com> wrote:
>>>
>>>> Hi Salvatore
>>>>
>>>> Thank you for posting this.
>>>>
>>>> IMO, this topic shouldn't be limited to Neutron only.
>>>> Users want consistent APIs across OpenStack projects, right?
>>>>
>>>> In Nova, a server has a task_state, so Neutron should do the same.
>>>>
>>>>
>>>>
>>>> 2014-05-22 15:34 GMT-07:00 Salvatore Orlando <sorlando at nicira.com>:
>>>> > As most of you probably know already, this is one of the topics
>>>> > discussed during the Juno summit [1].
>>>> > I would like to kick off the discussion in order to move towards a
>>>> > concrete design.
>>>> >
>>>> > Preamble: Considering the meat that's already on the plate for Juno,
>>>> > I'm not advocating that whatever comes out of this discussion should
>>>> > be put on the Juno roadmap. However, preparation (or yak shaving)
>>>> > activities that are identified as prerequisites might happen during
>>>> > the Juno time frame, assuming they won't interfere with other
>>>> > critical or high-priority activities.
>>>> > This is also a very long post; the TL;DR summary is that I would like
>>>> > to explore task-oriented communication with the backend and how it
>>>> > should be reflected in the API - gauging how the community feels
>>>> > about this, and collecting feedback regarding design, constructs, and
>>>> > related tools/techniques/technologies.
>>>> >
>>>> > At the summit a broad range of items were discussed during the
>>>> > session, and most of them have been reported in the etherpad [1].
>>>> >
>>>> > First, I think it would be good to clarify whether we're advocating a
>>>> > task-based API, workflow-oriented operation processing, or both.
>>>> >
>>>> > --> About a task-based API
>>>> >
>>>> > In a task-based API, most PUT/POST API operations would return tasks
>>>> > rather than neutron resources, and users of the API would interact
>>>> > directly with tasks.
>>>> > I put an example in [2] to avoid cluttering this post with too much
>>>> > text.
>>>> > As the API operation simply launches a task, the database state won't
>>>> > be updated until the task is completed.
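>>>> > Just to give a rough flavour here too (the paths and attribute names
>>>> > below are invented; see [2] for the actual example), the interaction
>>>> > might look like:
>>>> >
>>>> > POST /v2.0/networks {...}  ->  202 Accepted
>>>> > {"task": {"id": "<task_id>", "state": "PENDING", "resource_id": null}}
>>>> >
>>>> > GET /v2.0/tasks/<task_id>  ->  poll until COMPLETED or FAILED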
>>>> >
>>>> > Needless to say, this would be a radical change to Neutron's API; it
>>>> > should be carefully evaluated and not considered for the v2 API.
>>>> > Even if it is easy to recognise that this approach has a few
>>>> > benefits, I don't think it would improve usability of the API at all.
>>>> > Indeed it would limit the ability to operate on a resource while a
>>>> > task is in execution on it, and it would also require neutron API
>>>> > users to change the paradigm they use to interact with the API; not
>>>> > to mention the fact that it would look weird if neutron were the only
>>>> > API endpoint in OpenStack operating in this way.
>>>> > For the Neutron API, I think that its operations should still
>>>> > manipulate the database state, and possibly return immediately after
>>>> > that (*) - a task, or better a workflow, will then be started,
>>>> > executed asynchronously, and update the resource status on
>>>> > completion.
>>>> >
>>>> > --> On workflow-oriented operations
>>>> >
>>>> > The benefits when it comes to easily controlling operations and
>>>> > ensuring consistency in case of failures are obvious. For what it's
>>>> > worth, I have been experimenting with introducing this kind of
>>>> > capability in the NSX plugin over the past few months. I've been
>>>> > using celery as a task queue and writing the task management code
>>>> > from scratch - only to realize that the same features I was
>>>> > implementing are already supported by taskflow.
>>>> >
>>>> > I think that all parts of the Neutron API can greatly benefit from
>>>> > introducing a flow-based approach.
>>>> > Some examples:
>>>> > - pre/post commit operations in the ML2 plugin can be orchestrated a
>>>> > lot better as a workflow, articulating operations on the various
>>>> > drivers in a graph
>>>> > - operations spanning multiple plugins (eg: add router interface)
>>>> > could be simplified using clearly defined tasks for the L2 and L3
>>>> > parts (a rough sketch follows this list)
>>>> > - it would finally be possible to properly manage resources'
>>>> > "operational status", as well as to know whether the actual
>>>> > configuration of the backend matches the database configuration
>>>> > - synchronous plugins might be converted into asynchronous ones, thus
>>>> > improving their API throughput
>>>> >
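>>>> > To make the router interface example above more concrete, here is a
>>>> > minimal sketch of how such a flow could be composed. The task classes
>>>> > and their bodies are hypothetical; only the taskflow imports, the
>>>> > linear_flow pattern and the engines.run call reflect the actual
>>>> > library:
>>>> >
>>>> >     from taskflow import engines
>>>> >     from taskflow import task
>>>> >     from taskflow.patterns import linear_flow
>>>> >
>>>> >     class CreateRouterPort(task.Task):       # hypothetical L2 step
>>>> >         default_provides = 'port_id'
>>>> >
>>>> >         def execute(self, router_id, subnet_id):
>>>> >             # the real task would create the port via the L2 plugin
>>>> >             return 'port-for-%s' % subnet_id
>>>> >
>>>> >         def revert(self, router_id, subnet_id, **kwargs):
>>>> >             # run automatically if a later task fails, so L2 and L3
>>>> >             # state cannot silently diverge
>>>> >             print('deleting port for %s' % subnet_id)
>>>> >
>>>> >     class PlugRouterInterface(task.Task):    # hypothetical L3 step
>>>> >         def execute(self, router_id, port_id):
>>>> >             # the real task would wire the port into the router
>>>> >             print('plugging %s into %s' % (port_id, router_id))
>>>> >
>>>> >     flow = linear_flow.Flow('add-router-interface')
>>>> >     flow.add(CreateRouterPort(), PlugRouterInterface())
>>>> >     engines.run(flow, store={'router_id': 'r1', 'subnet_id': 's1'})
>>>> >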
>>>> > Now, the caveats:
>>>> > - during the sessions it was correctly pointed out that special care
>>>> > is required with multiple producers (ie: API servers), as workflows
>>>> > should always be executed in the correct order
>>>> > - it is probably advisable to serialize workflows operating on the
>>>> > same resource; this might lead to unexpected situations (potentially
>>>> > deadlocks) with workflows operating on multiple resources (a naive
>>>> > sketch of per-resource serialization follows this list)
>>>> > - if the API is asynchronous, and multiple workflows might be queued
>>>> > or in execution at a given time, rolling back the DB operation on
>>>> > failure is probably not advisable (it would not be advisable anyway
>>>> > in any asynchronous framework). If the API instead stays synchronous,
>>>> > the revert action for a failed task might also restore the DB state
>>>> > for a resource; but I think that keeping the API synchronous misses
>>>> > the point of this whole work a bit - feel free to show your
>>>> > disagreement here!
>>>> > - some neutron workflows are actually initiated by agents; this is
>>>> > the case, for instance, of the workflow doing the initial L2 and
>>>> > security group configuration for a port
>>>> > - it's going to be a lot of work, and we need to devise a strategy to
>>>> > either roll these changes into the existing plugins or just decide
>>>> > that future v3 plugins will use it
>>>> >
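>>>> > As for the per-resource serialization caveat above, a very naive
>>>> > sketch of the idea is shown below. It is purely illustrative: it only
>>>> > serializes within a single process, so it does not address the
>>>> > multiple-producer (multiple API server) problem, and taking several
>>>> > such locks for a workflow spanning multiple resources is exactly
>>>> > where the deadlock risk mentioned above comes in:
>>>> >
>>>> >     import collections
>>>> >     import threading
>>>> >
>>>> >     from taskflow import engines
>>>> >
>>>> >     _resource_locks = collections.defaultdict(threading.Lock)
>>>> >
>>>> >     def run_serialized(resource_id, flow, store):
>>>> >         # one workflow at a time per resource, in this process only
>>>> >         with _resource_locks[resource_id]:
>>>> >             return engines.run(flow, store=store)
>>>> >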
>>>> > From the implementation side, I've done a bit of research, and task
>>>> > queues like celery only implement half of what is needed; conversely,
>>>> > I have not been able to find a workflow manager, at least in the
>>>> > Python world, as complete and suitable as taskflow.
>>>> > So my preference will obviously be to use it, and to contribute to it
>>>> > should we realize it needs some changes to suit Neutron's needs.
>>>> > Growing something neutron-specific in tree is something I'd rule out.
>>>> >
>>>> > (*) This is a bit different from what many plugins do, as they execute
>>>> > requests synchronously and return only once the backend request is
>>>> > completed.
>>>> >
>>>> > --> Tasks and the API
>>>> >
>>>> > The etherpad [1] contains a lot of interesting notes on this topic.
>>>> > One important item is to understand how tasks affect the resource's
>>>> > status to indicate their completion or failure. So far a Neutron
>>>> > resource's status pretty much expresses its "fabric" status. For
>>>> > instance, a port is "UP" if it's been wired by the OVS agent; it
>>>> > often does not tell us whether the actual resource configuration is
>>>> > exactly the desired one in the database.
>>>> > For instance, if the OVS agent fails to apply security groups to a
>>>> > port, the port stays "ACTIVE" and the user might never know there was
>>>> > an error and that the actual state diverged from the desired one.
>>>> >
>>>> > It is therefore important to allow users to know whether the backend
>>>> > state is in sync with the DB; tools like taskflow will be really
>>>> > helpful to this aim.
>>>> > However, how should this be represented? The main options are either
>>>> > to have a new attribute describing the resource sync state, or to
>>>> > extend the semantics of the current status attribute to also include
>>>> > the resource sync state. I've put some ramblings on the subject in
>>>> > the etherpad [3].
>>>> > Still, it has been correctly pointed out that it might not be enough
>>>> > to know that a resource is out of sync; it is also good to know which
>>>> > operation exactly failed, and this is where somehow exposing tasks
>>>> > through the API might come in handy.
>>>> >
>>>> > For instance one could do something like:
>>>> >
>>>> > GET /tasks?resource_id=<res_id>&task_state=FAILED
>>>> >
>>>> > to get failure details for a given resource.
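>>>> > The response could then carry something like this (purely
>>>> > illustrative; these field names are made up):
>>>> >
>>>> > {"tasks": [{"id": "<task_id>", "resource_id": "<res_id>",
>>>> >             "task_state": "FAILED",
>>>> >             "message": "failed to apply security groups"}]}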
>>>> >
>>>> > --> How to proceed
>>>> >
>>>> > This is where I really don't know... and I will therefore be brief.
>>>> > We'll probably need some more brainstorming to flesh out all the
>>>> > details.
>>>> > Once that is done, it might be a matter of evaluating what needs to
>>>> > be done and whether it is better to target this work at existing
>>>> > plugins, or to move it out to v3 plugins (and hence do the actual
>>>> > work once the "core refactoring" activities are complete).
>>>> >
>>>> > Regards,
>>>> > Salvatore
>>>> >
>>>> >
>>>> > [1] https://etherpad.openstack.org/p/integrating-task-into-neutron
>>>> > [2] http://paste.openstack.org/show/81184/
>>>> > [3] https://etherpad.openstack.org/p/sillythings
>>>> >
>>>> >
>>>> >
>>>> >
>> --
>> Jaume Devesa
>> Software Engineer at Midokura