[openstack-dev] Nova workflow management update

John Garbutt john at johngarbutt.com
Wed May 1 17:56:00 UTC 2013


Hey,

I think some lightweight sequence diagrams could make sense.

On 29 April 2013 21:55, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
> Any thoughts on how the current conductor db-activity works with this?
> I can see two entry points to conductor:
> DB data calls
>   |
>   ------------------------------------------Conductor-->RPC/DB calls to do this stuff
>                                                |
> Workflow on behalf of something calls          |
>   |                                            |
>   ---------------------------------------------|
>
> Maybe it's not a concern for 'H' but it seems one of those doesn't belong
> there (cough cough DB stuff).

Maybe for the next release. It should become obvious I guess. I hope
those db calls will disappear once we pull the workflows properly into
conductor and the other servers become more stateless (in terms of
nova db state).

Key question: Should the conductor be allowed to make DB calls? I think yes?

> My writeup @ https://wiki.openstack.org/wiki/StructuredStateManagement is
> a big part of the overall goal I think, where I think the small iterations
> are part of this goal, yet likely both small and big goals will be
> happening at once, so it would be useful to ensure that we talk about the
> bigger goal and make sure the smaller iteration goal will eventually
> arrive at the bigger goal (or can be adjusted to be that way). Since some
> rackspace folks will also be helping out building the underlying
> foundation (convection library) for the end-goal it would be great to work
> together and make sure all small iterations also align with that
> foundational library work.

Take a look at spawn in XenAPI; it is heading in this direction:
https://github.com/openstack/nova/blob/master/nova/virt/xenapi/vmops.py#L335

I think we should just make a very small bit of the operation do
rollback and state management, which is more just an exercise, and
then start to pull more of the code into line as time progresses.
Probably best done on something that has already been pulled into a
conductor style job?
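
To make the "small bit of rollback and state management" idea concrete,
here is a rough Python sketch. It is not the code from vmops.py or from
any existing library; Step, run_with_rollback and the "task_state" key
are names I made up for illustration. Each step carries its own undo
action, and a failure reverts the completed steps in reverse order:

```python
class Step:
    """One unit of work paired with its undo action (hypothetical helper)."""

    def __init__(self, name, apply, revert):
        self.name = name
        self.apply = apply      # callable(state) that does the work
        self.revert = revert    # callable(state) that undoes it


def run_with_rollback(steps, state):
    """Run steps in order; on failure revert the finished ones in reverse."""
    done = []
    for step in steps:
        # Record which step is in flight (illustrative state tracking only).
        state["task_state"] = step.name
        try:
            step.apply(state)
        except Exception:
            for finished in reversed(done):
                finished.revert(state)
            state["task_state"] = "error"
            raise
        done.append(step)
    state["task_state"] = None  # nothing in flight any more
    return state
```

The step that failed is not reverted here, only the ones that completed
before it; whether a half-finished step needs its own cleanup is a
design choice we would have to make per operation.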

> I'd be interested in what you think about moving the scheduler code around,
> since this also connects into some work the cisco folks want to do for
> better scheduling, so that is yet another coordination of work that needs
> to happen (to end up at the same end-goal there as well).

Yes, I think it's very related. I see this kind of thing:

API --cast--> Conductor --call--> scheduler
                                    --call--> compute
                                    --call-->.....
                                    --db--> finally state update shows completion of task

Eventually the whole workflow, its persistence and rollback will be
controlled by the new framework. In the first case we may just make
sure the resource assignment gets rolled back if the call after the
schedule fails, and we correctly try to call the scheduler again? The
current live-migration scheduling code sort of does this kind of thing
already.
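
As a rough illustration of that first case, here is a Python sketch of
a conductor-side task that rolls back the resource assignment and asks
the scheduler again when the call after the schedule fails. All names
(run_task, select_host, claim, release, build, save_state and the two
exception classes) are hypothetical stand-ins passed in as plain
callables; none of this is the real Nova RPC or scheduler API:

```python
class BuildFailed(Exception):
    """Raised by the (hypothetical) compute call when a host cannot build."""


class NoValidHost(Exception):
    """Raised when no host could complete the task."""


def run_task(instance, select_host, claim, release, build, save_state,
             max_attempts=3):
    """Schedule, claim, build; on failure release the claim and reschedule."""
    excluded = []
    for _ in range(max_attempts):
        host = select_host(instance, excluded)   # "call" to the scheduler
        if host is None:
            break
        claim(host, instance)                    # resource assignment
        try:
            build(host, instance)                # "call" to compute
        except BuildFailed:
            release(host, instance)              # roll back the assignment
            excluded.append(host)                # do not pick this host again
            continue
        save_state(instance, "active", host)     # final "db" state update
        return host
    save_state(instance, "error", None)
    raise NoValidHost("no host could run %s" % instance)
```

Passing the collaborators in as callables is only there to keep the
sketch self-contained; in practice those would be the RPC calls in the
diagram above.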

> I was thinking that documenting the current situation, possibly @
> https://wiki.openstack.org/wiki/TheBetterPathToLiveMigration would help.
> Something like https://wiki.openstack.org/wiki/File:Run_workflow.png might
> help to easily visualize the current and fixed 'flow'/thread of execution.

Seems valuable. I will do a diagram for live-migration before
starting on that work. I kinda started on this (in text form) when I was
doing the XenAPI live-migration:
https://wiki.openstack.org/wiki/XenServer/LiveMigration#Live_Migration_RPC_Calls

We should probably do one for resize too.

John


