[openstack-dev] Nova workflow management update
Joshua Harlow
harlowja at yahoo-inc.com
Wed May 1 18:43:53 UTC 2013
I've started
https://wiki.openstack.org/wiki/TheBetterPathToLiveMigrationResizing and
will try to continue there.
The other aspect that makes me wonder is, after we have the conductor doing
this work, how do we ensure that the locking around what it is doing is
done correctly.
Say you have the following:
API call #1 -> resize instance X (let's call this action A)
API call #2 -> resize instance X (let's call this action B)
Now both of those happen in the same millisecond, so what happens (thought
game time!)?
It would seem they each attempt to mark something in the DB saying 'working
on X' by altering instance X's 'task/vm_state'. OK, so you can put a
transaction around said write to the 'task/vm_state' of instance X to avoid
both of those API calls attempting to continue doing the work. So that's
good. Then let's say API call #1 sends a message to some conductor Z via
the MQ asking it to do the work; that's great, and conductor Z starts doing
work on instance X and so on.
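That compare-and-swap on 'task/vm_state' can be sketched like this (a
minimal illustration using sqlite3 and a made-up `instances` schema in
place of the real Nova DB API; only one of the two racing calls wins):

```python
import sqlite3

# Hypothetical minimal schema standing in for Nova's instances table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE instances (uuid TEXT PRIMARY KEY, task_state TEXT)")
conn.execute("INSERT INTO instances VALUES ('X', NULL)")
conn.commit()

def claim_instance(conn, uuid, new_state):
    """Atomically claim an instance by setting task_state only if it is
    currently unset (a compare-and-swap).  Returns True on success."""
    cur = conn.execute(
        "UPDATE instances SET task_state = ? "
        "WHERE uuid = ? AND task_state IS NULL",
        (new_state, uuid))
    conn.commit()
    # rowcount is 1 only for the caller whose UPDATE matched the row.
    return cur.rowcount == 1

# Two racing resize requests against the same instance: only one wins.
a = claim_instance(conn, 'X', 'resizing')   # True: claims the instance
b = claim_instance(conn, 'X', 'resizing')   # False: task_state already set
```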
So now the big iffy question that I have is: what happens if conductor Z is
'killed' (say via error, exception, power failure, kill -9)? What happens
to action A? How can another conductor be assigned the work to do action A?
Will there be a new periodic task to scan the DB for 'dead' actions, and
how do we determine whether an action is dead or just taking a very long
time? This 'liveness' issue is a big one that I think needs to be
considered, and if conductor and ZooKeeper get connected, I think it can be
solved.
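A naive dead-action scanner based on heartbeat timestamps might look like
the sketch below (the `actions` records, heartbeat field, and timeout are
all made up for illustration). Note that it cannot distinguish a slow
conductor from a dead one, which is exactly why lease/ephemeral-node
mechanisms like ZooKeeper's are attractive here:

```python
import time

# Timeout after which an action with no heartbeat is presumed dead.
# Tuning this is the hard part: too short and slow actions get stolen,
# too long and recovery is delayed.
HEARTBEAT_TIMEOUT = 10.0  # seconds

# Hypothetical action records; each conductor would periodically update
# last_heartbeat for the actions it owns.
actions = {
    'A': {'conductor': 'Z', 'last_heartbeat': time.time() - 60},  # stale
    'B': {'conductor': 'Y', 'last_heartbeat': time.time()},       # alive
}

def find_dead_actions(actions, now=None, timeout=HEARTBEAT_TIMEOUT):
    """Return ids of actions whose owner has stopped heartbeating, so a
    periodic task could reassign them to another conductor."""
    now = time.time() if now is None else now
    return [aid for aid, rec in actions.items()
            if now - rec['last_heartbeat'] > timeout]

dead = find_dead_actions(actions)   # ['A']
```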
Then the other big iffy issue is how we stop a third API call from invoking
a third action on a resource associated with instance X (say a deletion of
a volume) while the first API action is still being conducted. Just
associating an instance-level lock via 'task/vm_state' is not the correct
way to lock resources associated with instance X. This is where ZooKeeper
can come into play again (since its core design was built for distributed
locking): it can be used to lock not only instance X's 'task/vm_state' but
all the other resources associated with instance X (in a reliable manner).
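To show the multi-resource locking shape in miniature: one classic trick is
acquiring all the needed locks in a globally consistent (sorted) order, so
two concurrent actions touching overlapping resources cannot deadlock each
other. This single-process sketch uses threading locks and invented
resource names; ZooKeeper's lock recipe generalizes the same idea across
hosts:

```python
import threading

# Hypothetical per-resource locks: the instance and an attached volume.
locks = {
    'instance-X': threading.Lock(),
    'volume-V1': threading.Lock(),
}

class MultiLock:
    """Acquire a set of named locks in sorted order, release in reverse.

    Sorting means every caller takes overlapping locks in the same
    order, which rules out lock-ordering deadlocks."""

    def __init__(self, locks, names):
        self._locks = [locks[n] for n in sorted(names)]

    def __enter__(self):
        for lock in self._locks:
            lock.acquire()
        return self

    def __exit__(self, *exc):
        for lock in reversed(self._locks):
            lock.release()

# An action that touches both the instance and its volume:
with MultiLock(locks, ['volume-V1', 'instance-X']):
    pass  # do the resize / volume work here, safely exclusive
```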
Thoughts?
On 5/1/13 10:56 AM, "John Garbutt" <john at johngarbutt.com> wrote:
>Hey,
>
>I think some lightweight sequence diagrams could make sense.
>
>On 29 April 2013 21:55, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
>> Any thoughts on how the current conductor db-activity works with this?
>> I can see two entry points to conductor:
>>
>>   DB data calls ---------------------------\
>>                                             Conductor --> RPC/DB calls
>>   Workflow-on-behalf-of-something calls ---/              to do this stuff
>>
>>
>> Maybe it's not a concern for 'H', but it seems one of those doesn't
>> belong there (cough cough, DB stuff).
>
>Maybe for the next release. It should become obvious I guess. I hope
>those db calls will disappear once we pull the workflows properly into
>conductor and the other servers become more stateless (in terms of
>nova db state).
>
>Key question: Should the conductor be allowed to make DB calls? I think
>yes?
>
>> My writeup @ https://wiki.openstack.org/wiki/StructuredStateManagement
>> is a big part of the overall goal, I think. The small iterations are
>> part of this goal, yet likely both small and big goals will be happening
>> at once, so it would be useful to keep the bigger goal in view and make
>> sure the smaller iteration goals eventually arrive at it (or can be
>> adjusted to be that way). Since some Rackspace folks will also be
>> helping build the underlying foundation (the convection library) for
>> the end-goal, it would be great to work together and make sure all
>> small iterations also align with that foundational library work.
>
>Take a look at spawn in XenAPI, it is heading in this direction:
>https://github.com/openstack/nova/blob/master/nova/virt/xenapi/vmops.py#L335
>
>I think we should just make a very small bit of the operation do
>rollback and state management, which is more just an exercise, and
>then start to pull more of the code into line as time progresses.
>Probably best done on something that has already been pulled into a
>conductor style job?
>
>> I'd be interested in what you think about moving the scheduler code
>> around, since this also connects into some work the Cisco folks want to
>> do for better scheduling. So that is yet another coordination of work
>> that needs to happen (to end up at the same end-goal there as well).
>
>Yes, I think it's very related. I see this kind of thing:
>
>API --cast--> Conductor --call--> scheduler
>                        --call--> compute
>                        --call--> .....
>                        --db--> finally, state update shows completion of task
>
>Eventually the whole workflow, its persistence and rollback will be
>controlled by the new framework. In the first case we may just make
>sure the resource assignment gets rolled back if the call after the
>schedule fails, and we correctly try to call the scheduler again? The
>current live-migration scheduling code sort of does this kind of thing
>already.
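The "roll back the resource assignment if the call after the schedule
fails" shape could be sketched like this. This is a toy illustration, not
the real convection/taskflow API; all names are invented:

```python
class Task:
    """A workflow step paired with a compensating revert action."""
    def __init__(self, name, apply_fn, revert_fn):
        self.name, self.apply, self.revert = name, apply_fn, revert_fn

def run_workflow(tasks):
    """Run tasks in order; on failure, revert completed tasks in
    reverse order, then re-raise."""
    done = []
    try:
        for task in tasks:
            task.apply()
            done.append(task)
    except Exception:
        # E.g. release the scheduler's host assignment when the
        # subsequent compute call blows up.
        for task in reversed(done):
            task.revert()
        raise

log = []

def fail_compute():
    raise RuntimeError('compute call failed')  # simulated failure

tasks = [
    Task('schedule',
         apply_fn=lambda: log.append('claimed host'),
         revert_fn=lambda: log.append('released host')),
    Task('compute',
         apply_fn=fail_compute,
         revert_fn=lambda: log.append('never runs')),
]

try:
    run_workflow(tasks)
except RuntimeError:
    pass
# log is now ['claimed host', 'released host']: the claim was rolled back.
```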
>
>> I was thinking that documenting the current situation, possibly @
>> https://wiki.openstack.org/wiki/TheBetterPathToLiveMigration, would
>> help. Something like
>> https://wiki.openstack.org/wiki/File:Run_workflow.png might help to
>> easily visualize the current and fixed 'flow'/thread of execution.
>
>Seems valuable. I will do one for live-migration before starting on that.
>I kinda started on this (in text form) when I was doing the XenAPI
>live-migration:
>https://wiki.openstack.org/wiki/XenServer/LiveMigration#Live_Migration_RPC_Calls
>
>We should probably do one for resize too.
>
>John