[openstack-dev] Nova workflow management update

Joshua Harlow harlowja at yahoo-inc.com
Wed May 1 17:35:47 UTC 2013


Wow, very interesting that the work documented in your paper is so similar!

Very impressive :-)

Let's definitely chat sometime. Ping me on IRC or email and let's arrange something.

There seem to be a lot of interested parties, so it might make sense to arrange an organized time for everyone, since the more we align, the more we can accomplish together.

-Josh

From: Changbin Liu <changbin.liu at gmail.com>
Reply-To: OpenStack Development Mailing List <openstack-dev at lists.openstack.org>
Date: Wednesday, May 1, 2013 8:51 AM
To: OpenStack Development Mailing List <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] Nova workflow management update

Hi Joshua,

First, +1 on your documents and code!

My name is Changbin Liu, from AT&T Labs Research. Together with Yun Mao (also from AT&T), I once worked on a research prototype cloud controller which supports "transactional" cloud resource orchestration (e.g., state rollback, concurrency, consistency issues, etc.), and we used ZooKeeper to provide HA and serve as the state store. I believe your work shares many interesting similarities with ours, and we would be very happy to join your efforts here to improve Nova state management.

FYI: our work is documented in a paper: https://www.usenix.org/system/files/conference/atc12/atc12-final41_0.pdf

Please let us know when we can set up meetings to discuss more.



Thanks

Changbin


On Fri, Apr 26, 2013 at 11:10 PM, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
Great to hear all the encouraging feedback!

Since people likely want to see code, I thought I'd just throw out some
pointers to what the prototype is doing.

- https://github.com/yahoo/NovaOrc/blob/master/nova/orc/manager.py#L187
(new manager, could be part of conductor, TBD)
- Workflow to fulfill the create request @
https://github.com/yahoo/NovaOrc/blob/master/nova/orc/manager.py#L216
- Refactored run_instance states/plugins @
https://github.com/yahoo/NovaOrc/tree/master/nova/orc/states
- Potential pieces of new workflow library @
https://github.com/yahoo/NovaOrc/blob/master/nova/orc/states/__init__.py

Work is in progress to add the ZooKeeper part (which will vastly
improve HA and concurrency, and add things like distributed resumption).
--- Related to discussion @
http://lists.openstack.org/pipermail/openstack-dev/2013-April/007881.html
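
To make "distributed resumption" a bit more concrete, here is a minimal
sketch (not the NovaOrc code): each finished step is recorded in a shared
store, so a second worker can pick up where a crashed one stopped.
InMemoryStore is a hypothetical stand-in for the role ZooKeeper would
play, and the step names are made up for illustration.

```python
# Hypothetical sketch: none of these names come from NovaOrc or ZooKeeper.


class InMemoryStore:
    """Stand-in for a shared, durable state store (assumed role of ZooKeeper)."""

    def __init__(self):
        self._done = {}

    def mark_done(self, workflow_id, step_name):
        self._done.setdefault(workflow_id, []).append(step_name)

    def completed(self, workflow_id):
        return list(self._done.get(workflow_id, []))


def run_workflow(workflow_id, steps, store):
    """Run (name, func) steps in order, skipping steps already recorded as done."""
    finished = set(store.completed(workflow_id))
    for name, func in steps:
        if name in finished:
            continue  # an earlier worker already completed this step
        func()
        store.mark_done(workflow_id, name)


def demo_resume():
    store = InMemoryStore()
    log = []
    crashed = {"yet": False}

    def allocate_network():
        # Fail on the first attempt only, to simulate a worker dying mid-workflow.
        if not crashed["yet"]:
            crashed["yet"] = True
            raise RuntimeError("worker died")
        log.append("network")

    steps = [
        ("reserve_quota", lambda: log.append("quota")),
        ("allocate_network", allocate_network),
        ("boot_instance", lambda: log.append("boot")),
    ]
    try:
        run_workflow("req-1", steps, store)  # first worker crashes part-way
    except RuntimeError:
        pass
    run_workflow("req-1", steps, store)  # second worker resumes
    return log
```

Calling demo_resume() runs the quota step only once: the second pass skips
it because the store already lists it as completed.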

For those who are interested, we are hoping to figure out the coordination
and inter/intra team+project direction shortly (the harder problem IMHO).

More details & code to be landed soon!

On 4/26/13 3:14 PM, "Patil, Tushar" <Tushar.Patil at nttdata.com> wrote:

>+1 on all the work done by Joshua and Rohit so far.
>I think in highly scalable solutions like Nova, we need a
>framework to resume/retry/rollback the various stages of a VM in a
>structured way.
>I can see this is driving in that direction.
>
>- Tushar
>
>>-----Original Message-----
>>From: Mike Wilson [mailto:geekinutah at gmail.com]
>>Sent: Friday, April 26, 2013 8:30 AM
>>To: OpenStack Development Mailing List
>>Subject: Re: [openstack-dev] Nova workflow management update
>>
>>Very exciting to see this happening. Structured state management is
>>sorely needed. I also like the plan of attack; tackling creation of an
>>instance first is a good way to feel this one out.
>>
>>-Mike
>>
>>
>>On Fri, Apr 26, 2013 at 9:21 AM, Senhua Huang (senhuang)
>><senhuang at cisco.com> wrote:
>>
>>
>>      +1 on the great work on initiating the design and implementation
>>of a more structured and "manageable" provision manager.
>>      +1 on the documentation.
>>
>>      Having state management separated from resource selection is
>>very helpful for group scheduling, cross compute/storage/network
>>scheduling, as well as an improved migration solution. It makes Nova
>>more modular, makes errors easier to track and reason about, and makes
>>it easier to scale.
>>
>>      Thanks,
>>      Senhua
>>
>>
>>      From: Karajgi, Rohit <Rohit.Karajgi at nttdata.com>
>>      Reply-To: OpenStack Development Mailing List <openstack-dev at lists.openstack.org>
>>      Date: Thursday, April 25, 2013 11:07 PM
>>      To: OpenStack Development Mailing List <openstack-dev at lists.openstack.org>
>>
>>      Subject: Re: [openstack-dev] Nova workflow management update
>>
>>
>>
>>      +1x on the really well written plan and wikis.
>>
>>
>>
>>      It just goes to show how important having structured state
>>management is to Nova for making it highly reliable and resilient
>>across all APIs.
>>
>>
>>
>>      We would eventually want to see Nova become SuperNova!! :-)
>>
>>
>>
>>      Regards,
>>
>>      Rohit
>>
>>
>>
>>
>>
>>      From: Adrian Otto [mailto:adrian.otto at rackspace.com]
>>      Sent: Friday, April 26, 2013 9:28 AM
>>      To: OpenStack Development Mailing List
>>      Cc: OpenStack Development Mailing List
>>      Subject: Re: [openstack-dev] Nova workflow management update
>>
>>
>>
>>      Joshua,
>>
>>
>>
>>      I'm one of the Rackers helping to add development resources to
>>Convection. I also work on the OASIS CAMP TC as an editor. I want to
>>help create a reusable task system that helps Nova, Heat, and many other
>>OpenStack projects. I am happy to see this progressing. The recent
>>collaboration among the Nova and Heat/Convection teams is very
>>encouraging. Thanks for all your efforts to form a sensible written
>>plan.
>>
>>
>>
>>      I'd like to take an editorial pass through the
>>StructuredStateManagement wiki, and help tighten up the definitions a
>>bit. Some wording changes may avoid some of the more overloaded
>>technical terms (but there are practically no pure ones left). I'm
>>planning to make a few edits to the wiki page (so there will be diffs),
>>but if another approach is preferred, I'm open to that too. Please
>>advise.
>>
>>
>>
>>      Thanks,
>>
>>
>>
>>      Adrian
>>
>>
>>      On Apr 25, 2013, at 5:16 PM, "Joshua Harlow" <harlowja at yahoo-inc.com> wrote:
>>
>>              I wanted to make sure everyone is aware of this, since
>>some of you might have missed the summit session, and I'd like to start
>>discussions so we can land code in Havana.
>>
>>
>>
>>              For those who missed the session, here is the associated material:
>>
>>
>>
>>              - https://etherpad.openstack.org/the-future-of-orch
>>(session details + discussion ...)
>>
>>
>>
>>              The summary of what I am trying to do is to move nova away
>>from having ad-hoc tasks and toward having a central entity (not
>>a single entity, but a central one that can be horizontally
>>scaled) which can execute these tasks on behalf of nova-compute. This
>>central entity (a new orchestrator or conductor...) would centrally
>>manage the workflow that nova goes through when completing an API
>>request and would do so in an organized, controlled and resumable
>>manner (it would also support rollbacks and more...). The reasons why
>>what exists currently may not be optimal/good are listed in that
>>etherpad, so I won't repeat them here.
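
As a rough illustration of the rollback part of this idea (an
assumption-laden sketch, not the prototype's actual code): a workflow can
be modeled as an ordered list of tasks, each with an apply and a revert
action. When one task fails, the tasks that already completed are
reverted in reverse order. Task, execute and the task names below are
hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    """One unit of work plus the action that undoes it (hypothetical shape)."""
    name: str
    apply: Callable[[], None]
    revert: Callable[[], None]


def execute(tasks: List[Task]) -> None:
    """Apply tasks in order; on failure, revert completed tasks in reverse."""
    completed: List[Task] = []
    try:
        for task in tasks:
            task.apply()
            completed.append(task)
    except Exception:
        for task in reversed(completed):
            task.revert()
        raise  # let the caller see the original failure


def demo_rollback() -> List[str]:
    events: List[str] = []

    def fail_boot() -> None:
        raise RuntimeError("hypervisor error")

    tasks = [
        Task("reserve_quota",
             lambda: events.append("quota reserved"),
             lambda: events.append("quota released")),
        Task("allocate_network",
             lambda: events.append("network allocated"),
             lambda: events.append("network released")),
        Task("boot_instance", fail_boot, lambda: None),
    ]
    try:
        execute(tasks)
    except RuntimeError:
        events.append("workflow failed")
    return events
```

In demo_rollback() the third task fails, so the network and quota steps
are undone (in that order) before the failure is reported to the caller.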
>>
>>
>>
>>              For example this is a possible diagram for the run_instance
>>'workflow' under this new scheme: http://imgur.com/sYOVz5X
>>
>>
>>
>>              NTT Data and Y! have been working on how to refactor this
>>into a well-thought-out design, and even have prototype code @
>>https://github.com/Yahoo/NovaOrc which has some of these changes (see
>>the last 4-10 commits). The prototype was shown in the session, but feel
>>free to check out the code. It's based on stable/grizzly, so if you set
>>up with that code it should run (note that no external API changes
>>occurred).
>>
>>
>>
>>              Some of the outcomes of that meeting I received that are
>>relevant here:
>>
>>
>>
>>              - HEAT may have a convection library (WIP -
>>https://wiki.openstack.org/wiki/Convection ) that this workflow
>>restructuring can use.
>>
>>              --- Note: If this code is created quickly (creating a solid
>>core) then it seems like we can use this code in nova itself and start
>>restructuring nova around it. This of course then allows both HEAT
>>and nova to use said library (and likely creates future
>>capabilities for something like http://aws.amazon.com/swf). I think the
>>discussion about this is just getting started, but it seems like a
>>solid core can be created in a week or two.
>>
>>              --- The documentation for my attempt at what I would like
>>this central library to do was put @
>>https://etherpad.openstack.org/task-system (thx to the heat team for
>>starting that pad)
>>
>>              - There was an ask to further document the overall design
>>and how to accomplish it. I have started this @
>>https://wiki.openstack.org/wiki/StructuredStateManagement (input is
>>welcome)
>>
>>              --- More details are at
>>https://wiki.openstack.org/wiki/StructuredStateManagementDetails (WIP)
>>since I didn't want to clutter the main page up...
>>
>>              --- Other thoughts of mine at
>>http://lists.openstack.org/pipermail/openstack-dev/2013-April/007881.html
>>(with other code associated with it)
>>
>>              - There was an ask on how conductor fits into this picture;
>>this is still being worked out and discussed (thoughts welcome!)
>>
>>              - There was talk about how live migration/resizing can take
>>advantage of such a workflow-like system to become more secure (details
>>in another email)
>>
>>              --- This one involves planning, where IMHO I would like
>>nova/heat groups to focus on this core library, and when adjusting the
>>live migration/resize path they should use said core library. If not a
>>core library then the prototype code I have created above (along with
>>nttdata) can be altered to focus on those paths instead of the initial
>>prototype path of 'run_instance'.
>>
>>              - More blueprints - I have started a few @
>>https://wiki.openstack.org/wiki/StructuredStateManagement#Blueprints
>>
>>              - Make a plan on how to get this into mainline, started
>>this @
>>https://wiki.openstack.org/wiki/StructuredStateManagement#Plan_of_record
>>
>>
>>
>>              Discussion is always welcome! I believe we can make this
>>happen (and in all honesty must make it happen).
>>
>>
>>
>>              I know there are others interested in this idea/solution,
>>so if they want to chime in that would be wonderful :-)
>>
>>
>>
>>              -Josh
>>
>>
>>
>>              _______________________________________________
>>              OpenStack-dev mailing list
>>              OpenStack-dev at lists.openstack.org
>>              http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>>      ______________________________________________________________________
>>      Disclaimer: This email and any attachments are sent in strictest
>>confidence for the sole use of the addressee and may contain legally
>>privileged, confidential, and proprietary data. If you are not the
>>intended recipient, please advise the sender by replying promptly to
>>this email and then delete and destroy this email and any attachments
>>without any further use, copying or forwarding
>>
>>
>>
>>
>


