[openstack-dev] Nova workflow management update
john at johngarbutt.com
Mon Apr 29 18:04:28 UTC 2013
Sorry, I was still traveling last week, but I am keen to help with this
too, although I have other stuff at the top of my list right now
(getting the agent and config drive separation much cleaner inside
XenAPI, and looking into the set/get-password situation).
I did some work on getting live-migrate working across libvirt and
XenAPI not too long ago, so I am keen to help with this one:
I am +1 to Russell's suggestion of small iterations, and we should
co-ordinate to make sure we don't waste effort.
I think Dan's blueprints represent how I saw the best way forward:
- move cold migrate (and resize?) into conductor
- make api call conductor to start the process
- make above call out to scheduler to get the suggested nodes,
rather than rpc to scheduler
- move code in scheduler into the conductor
- then look at moving stuff from compute manager into the scheduler
- should then have a single thread of execution call out to people
who need to do things, with callbacks as required
- move live migrate into conductor, where possible merging with the
above code, maybe just moving to conductor first.
- move towards restructuring the above code with more 'orchestration logic'
- look at moving evacuate into the above structure (i.e. 'dead' migration)
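As a rough illustration of the "single thread of execution" idea in the list above, the conductor side might look something like the Python sketch below. All names here (MigrationTask, select_hosts, prep_dest, move_disks, finish) are hypothetical stand-ins for this example, not existing Nova code, and the callables simply stand in for the rpc calls to the scheduler and compute nodes.

```python
# Illustrative sketch only: none of these names are real Nova APIs.
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MigrationTask:
    """Hypothetical conductor-side driver for one cold migration."""
    instance: str
    source: str
    select_hosts: Callable[[str], List[str]]     # stands in for the scheduler
    prep_dest: Callable[[str, str], None]        # stands in for dest compute
    move_disks: Callable[[str, str, str], None]  # stands in for source compute
    finish: Callable[[str, str], None]           # stands in for dest compute
    log: List[str] = field(default_factory=list)

    def run(self) -> str:
        # Ask for candidate hosts and keep control here, rather than
        # handing the whole operation over to the scheduler.
        candidates = [h for h in self.select_hosts(self.instance)
                      if h != self.source]
        if not candidates:
            self.log.append("no_valid_host")
            raise RuntimeError("no valid host for %s" % self.instance)
        dest = candidates[0]
        self.log.append("scheduled:" + dest)

        # Each step is called from this one place, so progress is
        # recorded in a single log instead of being spread over services.
        self.prep_dest(self.instance, dest)
        self.log.append("dest_prepared")
        self.move_disks(self.instance, self.source, dest)
        self.log.append("disks_moved")
        self.finish(self.instance, dest)
        self.log.append("finished")
        return dest
```

The point is only that one object drives every step and records where it got to, which is also where retry or rollback logic could later be attached.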
On 27 April 2013 16:43, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
> Thx Russell,
> Now we just need to keep this momentum on fixing all of this going forward. Likely a ton of coordination will be required to achieve the biggest bang for the buck.
> NTT and Y! will do what we can to drive this prototype (whichever pieces get accepted...) into reality (and get it reviewed in small chunks...) and will likely be helping drive the core workflow library into reality (which hopefully will be the foundation for the future state fixing tasks/reworking workflows...).
> The nova core team seems to understand where it wants the conductor to go. I am still iffy on mixing the db stuff with conductor-like stuff, due to scaling and functional differences, but I think that was mentioned/discussed in the last meeting. It might help the larger work that is under way if more of the conductor vision/plan were documented, which would help us all get on the same page (and know where to place it in the larger work). Perhaps this can be a todo? Putting in the time up front to make this clear to all involved parties will help avoid future conflicts and misunderstandings.
> Overall, it gets very interesting since these are all working along the same goals (and will likely be happening simultaneously) and must be driven toward the same finish line...
> Likely we all should keep in close touch to achieve maximum benefit... What do u think?
> Sent from my really tiny device...
> On Apr 27, 2013, at 7:45 AM, "Russell Bryant" <rbryant at redhat.com> wrote:
>> On 04/25/2013 08:08 PM, Joshua Harlow wrote:
>>> I wanted to make sure everyone was aware of this, since some of
>>> you might have missed the summit session, and I'd like discussion so we
>>> can land code in Havana.
>>> For those that missed the session & associated material:
>>> - https://etherpad.openstack.org/the-future-of-orch (session details +
>>> discussion …)
>> So my take on all of this is:
>> The goals here are good. Having better tracking of state through long
>> running operations is A Good Thing, and we should continue to take steps
>> in that direction.
>> The trick with a large effort in an extremely active open source project
>> is how to approach it in an iterative manner so that the patches have a
>> chance of going in.
>> I believe that the specific operations that would benefit the most from
>> this type of improvement are migrate/live-migrate/resize/evacuate.
>> These operations are complex long running tasks. Further, they involve
>> multiple compute nodes. Right now the flow of control is passed around
>> and not tracked as well as it could be. We should improve it, which
>> includes some of the ideas from this proposal (at some point down the path).
>> So, here is my specific proposal for a first set of tasks to complete in
>> 1) At the design summit, we spent a session discussing the state of
>> these code paths. The *first* problem we have to solve with them is the
>> fact that they are quite separate code paths, when much of it can be and
>> should be combined.
>> This is a cleanup task, but a fairly complex one. Combining these in
>> the areas where it makes sense will also aid us in getting effective
>> test coverage over these operations. Right now we have poor coverage
>> here and they are at high risk for breaking over time.
>> Some summit discussion notes here:
>> 2) At the same time as #1, one thing that would be a helpful design
>> artifact is some diagrams that show the flow of how these operations
>> work today. I spoke with Joshua Harlow about this and it sounded like
>> something he would be willing to work on. I think many people would
>> benefit from having a diagram that explains these operations.
>> 3) Take a look at shifting the flow of control from distributed among
>> services and compute nodes to a more centralized control. This task is
>> taking the code, much as it exists today, but moving it around.
>> Specifically, this will probably mean having nova-conductor 'conduct'
>> these operations.
>> Havana blueprint (with a couple child blueprints) for reference:
>> I believe that all of those things will be a lot of work. I will be
>> quite happy if we accomplish all of that in the Havana cycle. I also
>> believe that all of these tasks are prerequisites to applying a workflow
>> library (or service) to these operations. Once we have the code cleaned
>> up and moved so that it's more central, we can look at applying a
>> workflow library as the next step.
>> Iterative, forward progress! :-)
>> Russell Bryant
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org