[openstack-dev] Nova workflow management update

Joshua Harlow harlowja at yahoo-inc.com
Mon Apr 29 20:55:11 UTC 2013


More people is great!

Any thoughts on how the current conductor db-activity works with this?

I can see two entry points to conductor:

DB data calls
  |
  |
  +---------------------------------> Conductor --> RPC/DB calls to do this stuff
                                          ^
Workflow on behalf of something calls     |
  |                                       |
  |                                       |
  +---------------------------------------+

Maybe it's not a concern for 'H', but it seems one of those doesn't belong
there (cough cough DB stuff).

In particular, I personally want to be able to scale 'conductors' without
having every conductor connect to the DB.
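
A minimal sketch of that separation, assuming hypothetical class names
(DBProxy, WorkflowConductor) that are for illustration only and are not
actual nova code: only one component owns the DB connection, and any
number of workflow conductors reach it indirectly.

```python
class DBProxy:
    """Hypothetical DB entry point: the only piece that owns a DB connection."""

    def __init__(self):
        self._rows = {}  # stand-in for a real database connection

    def instance_update(self, uuid, **values):
        # Merge the new values into the stored record and return a copy.
        self._rows.setdefault(uuid, {}).update(values)
        return dict(self._rows[uuid])


class WorkflowConductor:
    """Hypothetical workflow entry point: reaches the DB only via the proxy."""

    def __init__(self, db_proxy):
        # Assumption: in a real deployment this would be an RPC client,
        # not a direct object reference.
        self._db = db_proxy

    def migrate(self, uuid, dest_host):
        self._db.instance_update(uuid, task_state="migrating")
        # ... calls out to scheduler/compute would go here ...
        return self._db.instance_update(uuid, host=dest_host, task_state=None)


db = DBProxy()
# Many workflow conductors, a single owner of the DB connection.
conductors = [WorkflowConductor(db) for _ in range(3)]
result = conductors[0].migrate("uuid-1", "host-b")
```

Here the number of workflow conductors can grow without increasing the
number of DB connections.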

My writeup @ https://wiki.openstack.org/wiki/StructuredStateManagement
covers a big part of the overall goal, I think. The small iterations are
part of that goal, but since the small and big efforts will likely be
happening at once, it would be useful to talk about the bigger goal and
make sure each smaller iteration will eventually arrive at it (or can be
adjusted to do so). Since some Rackspace folks will also be helping build
the underlying foundation (convection library) for the end-goal, it would
be great to work together and make sure all small iterations also align
with that foundational library work.

I'd be interested in what you think about moving the scheduler code around,
since this also connects to some work the Cisco folks want to do for
better scheduling, so that is yet another piece of work that needs to be
coordinated (to end up at the same end-goal there as well).

Overall, though, I think we all see that we are working toward the same
goal; it's just the coordination that becomes 'fun', ha.

I was thinking that documenting the current situation, possibly @
https://wiki.openstack.org/wiki/TheBetterPathToLiveMigration would help.

Then documenting your possible 'flow' described below would make sure that
everyone sees how the fixed 'flow' fits into the end-goal and can adjust it
if it doesn't...

Something like https://wiki.openstack.org/wiki/File:Run_workflow.png might
help to easily visualize the current and fixed 'flow'/thread of execution.
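
As a rough illustration of what a 'flow' as a single thread of execution
could look like, here is a small sketch. The names (Task, run_flow) are
hypothetical and are not the API of the convection library or of nova;
the revert-on-failure behaviour is an assumption about what such a flow
might want.

```python
class Task:
    """Hypothetical unit of work with an undo step."""

    def __init__(self, name, apply, revert):
        self.name = name
        self.apply = apply
        self.revert = revert


def run_flow(tasks, log):
    """Run tasks in order; on failure, revert completed tasks in reverse."""
    done = []
    for task in tasks:
        try:
            task.apply()
            log.append("applied " + task.name)
            done.append(task)
        except Exception:
            log.append("failed " + task.name)
            for finished in reversed(done):
                finished.revert()
                log.append("reverted " + finished.name)
            return False
    return True


def fail():
    # Simulated failure in the middle of the flow.
    raise RuntimeError("destination host unavailable")


log = []
flow = [
    Task("pick_destination", lambda: None, lambda: None),
    Task("migrate_disk", fail, lambda: None),
]
ok = run_flow(flow, log)
```

The log records every step in one place, which is the kind of state
tracking a diagram of the current and fixed 'flow' would make visible.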

Thoughts?

On 4/29/13 11:04 AM, "John Garbutt" <john at johngarbutt.com> wrote:

>Hi,
>
>Sorry, I was still traveling last week, but I am keen to help with this
>too, although I have other stuff at the top of my list right now
>(getting the agent and config drive separation much cleaner inside
>XenAPI and looking into the set/get-password situation).
>
>I did some work on getting live-migrate working across libvirt and
>XenAPI not too long ago, so I am keen to help with this one:
>https://blueprints.launchpad.net/nova/+spec/live-migration-to-conductor
>
>I am +1 to Russell's suggestion of small iterations, and we should
>co-ordinate to make sure we don't waste effort.
>
>I think Dan's blueprints represent how I saw the best way forward:
>- move cold migrate (and resize?) into conductor
>  - make api call conductor to start the process
>  - make above call out to scheduler to get the suggested nodes,
>    rather than rpc to scheduler
>  - move code in scheduler into the conductor
>  - then look at moving stuff from compute manager into the scheduler
>  - should then have a single thread of execution call out to people
>    who need to do things, with callbacks as required
>- move live migrate into conductor, where possible merging with the
>  above code, maybe just moving to conductor first.
>- move towards restructuring the above code with more 'orchestration
>  logic'
>- look at moving evacuate into the above structure (i.e. 'dead' migration)
>
>John
>
>On 27 April 2013 16:43, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
>> Thx Russell,
>>
>> Now we just need to keep this momentum on fixing all of this going
>> forward. Likely a ton of coordination will be required to achieve the
>> biggest bang for the buck.
>>
>> NTT and Y! will do what we can to drive this prototype (whichever
>> pieces get accepted...) into reality (and get it reviewed in small
>> chunks...) and will likely be helping drive the core workflow library
>> into reality (which hopefully will be the foundation for the future
>> state fixing tasks/reworking workflows...).
>>
>> The nova core seems like it understands where it wants the conductor to
>> go. I am still iffy on mixing the db stuff with conductor-like stuff,
>> due to scaling and functional differences, but that I think was
>> mentioned/discussed in the last meeting. It might help the larger work
>> that is underway if more of the conductor vision/plan were documented,
>> which would help us all get on the same page (and know where to place it
>> in the larger work). Perhaps this can be a todo? This will help avoid
>> future conflicts and misunderstandings if time is put in upfront to
>> make this clear to all involved parties.
>>
>> Overall, it gets very interesting since these are all working toward the
>> same goals (and will likely be happening simultaneously) and must be
>> driven toward the same finish line...
>>
>> Likely we all should keep in close touch to achieve maximum benefit...
>> What do you think?
>>
>> Sent from my really tiny device...
>>
>> On Apr 27, 2013, at 7:45 AM, "Russell Bryant" <rbryant at redhat.com> wrote:
>>
>>> On 04/25/2013 08:08 PM, Joshua Harlow wrote:
>>>> I wanted to make sure everyone was aware of this, since some of
>>>> you might have missed the summit session, and I'd like discussion so
>>>> we can land code in Havana.
>>>>
>>>> For those that missed the session & associated material:
>>>>
>>>> - https://etherpad.openstack.org/the-future-of-orch (session details +
>>>> discussion ...)
>>>
>>> So my take on all of this is:
>>>
>>> The goals here are good.  Having better tracking of state through long
>>> running operations is A Good Thing, and we should continue to take steps
>>> in that direction.
>>>
>>> The trick with a large effort in an extremely active open source
>>> project is how to approach it in an iterative manner so that the
>>> patches have a chance of going in.
>>>
>>> I believe that the specific operations that would benefit the most from
>>> this type of improvement are migrate/live-migrate/resize/evacuate.
>>> These operations are complex long running tasks.  Further, they involve
>>> multiple compute nodes.  Right now the flow of control is passed around
>>> and not tracked as well as it could be.  We should improve it, which
>>> includes some of the ideas from this proposal (at some point down the
>>> path).
>>>
>>> So, here is my specific proposal for a first set of tasks to complete
>>> in Havana:
>>>
>>> 1) At the design summit, we spent a session discussing the state of
>>> these code paths.  The *first* problem we have to solve with them is
>>> the fact that they are quite separate code paths, when much of it can
>>> be and should be combined.
>>>
>>> This is a cleanup task, but a fairly complex one.  Combining these in
>>> the areas where it makes sense will also aid us in getting effective
>>> test coverage over these operations.  Right now we have poor coverage
>>> here and they are at high risk for breaking over time.
>>>
>>> Some summit discussion notes here:
>>>
>>> https://etherpad.openstack.org/HavanaUnifyMigrateAndLiveMigrate
>>>
>>> 2) At the same time as #1, one thing that would be a helpful design
>>> artifact is some diagrams that show the flow of how these operations
>>> work today.  I spoke with Joshua Harlow about this and it sounded like
>>> something he would be willing to work on.  I think many people would
>>> benefit from having a diagram that explains these operations.
>>>
>>> 3) Take a look at shifting the flow of control from distributed among
>>> services and compute nodes to a more centralized control.  This task is
>>> taking the code, much as it exists today, but moving it around.
>>> Specifically, this will probably mean having nova-conductor 'conduct'
>>> these operations.
>>>
>>>
>>> Havana blueprint (with a couple child blueprints) for reference:
>>>
>>> https://blueprints.launchpad.net/nova/+spec/unified-migrations
>>>
>>>
>>> I believe that all of those things will be a lot of work.  I will be
>>> quite happy if we accomplish all of that in the Havana cycle.  I also
>>> believe that all of these tasks are prerequisites to applying a
>>> workflow library (or service) to these operations.  Once we have the
>>> code cleaned up, moved so that it's more central, then we can look at
>>> applying a workflow library as the next step.
>>>
>>> Iterative, forward progress!  :-)
>>>
>>> Thanks,
>>>
>>> --
>>> Russell Bryant
>>>
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev at lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



