[openstack-dev] [Nova] no-db-compute, a new service

Russell Bryant rbryant at redhat.com
Fri Nov 9 18:04:50 UTC 2012


Greetings,

Dan Smith and I are getting pretty close to finishing the first stage of
no-db-compute work for Grizzly.  Specifically, that means these two things:

1) Sending more data from the nova-api service to avoid db reads in
nova-compute.

2) Pulling db access out of nova virt drivers and having the only code
in nova-compute that touches the db in nova/compute/manager.py.

Removing the remaining db access is going to take some more drastic
changes.  This is what the majority of our discussion at the design
summit was about.  The idea of a db writer service (nova-sink) was
discussed, but the majority opinion seemed to be that if we're going to
make a big change to how the nova-compute service works, the "nova-sink"
idea was not enough.  I'd like to kick that discussion off again so that
we can settle on a specific direction forward.


Here's a proposal based on the suggestions from the design summit
session.  The intent is to generate some discussion around this so that
we can keep writing code knowing that we're generally headed in a
direction that others are happy with.

How about we create a new service called "nova-compute-proxy"?  I would
envision it looking very much like "nova-sink" in the short term, but
evolving into much more over time as we carefully rework how some
operations are handled.

This service would act as a proxy for nova-compute in a couple of ways.

1) The nova-compute service would use it as a proxy to accomplish
certain tasks, such as targeted operations that need database access.

2) Over time, it would be used as a proxy for other services that wish
to execute some sort of compute action.  For example, instead of a
service directly asking nova-compute to complete a long-running
operation, nova-compute-proxy would take this operation and monitor its
progress.


Architecture notes:

The nova-compute-proxy service must be horizontally scalable.  You can
run as many of them as needed to scale out.  There is not a 1 to 1
relationship with nova-compute services.  While nova-compute services
have ownership of specific instances, this is not true of
nova-compute-proxy.
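
A minimal sketch of what "no ownership" means in practice, assuming the
proxy workers all consume from one shared queue.  The queue, the worker
names, and the instance ids below are made up for illustration and are
not real Nova code:

```python
import queue
import threading

# One shared request queue; every proxy worker consumes from it.
requests = queue.Queue()
handled = []                      # (worker_name, instance_uuid) pairs
handled_lock = threading.Lock()


def proxy_worker(name):
    """Stateless worker: handles whatever request it happens to pick up."""
    while True:
        instance_uuid = requests.get()
        if instance_uuid is None:     # sentinel: shut this worker down
            break
        with handled_lock:
            handled.append((name, instance_uuid))


# Run three workers; the number is independent of how many compute
# nodes or instances exist.
workers = [threading.Thread(target=proxy_worker, args=("proxy-%d" % i,))
           for i in range(3)]
for worker in workers:
    worker.start()

for i in range(10):
    requests.put("instance-%d" % i)
for _ in workers:
    requests.put(None)
for worker in workers:
    worker.join()

handled_uuids = sorted(uuid for _, uuid in handled)
```

Which worker handles which instance is arbitrary here, and that is the
point: scaling out is just a matter of starting more workers.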


Evolution:

The most immediate short term goal is to remove database accesses from
the nova-compute service.  Certain database accesses will just be pushed
into nova-compute-proxy and exposed as methods that can be executed
using rpc.call().  One such example will be instance_update() since that
is used in many places in nova-compute.  I expect a lot of this to be
done in the near term.
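
As a rough sketch of that pattern, assuming an in-memory stand-in for
both the database and the rpc layer.  Every class name here (FakeDB,
FakeRPC, ComputeProxyManager, ComputeProxyAPI) is hypothetical; only
instance_update() and the rpc.call() style come from the proposal:

```python
import copy


class FakeDB:
    """Hypothetical stand-in for the real database layer."""

    def __init__(self):
        self.instances = {}

    def instance_update(self, instance_uuid, values):
        record = self.instances.setdefault(instance_uuid,
                                           {"uuid": instance_uuid})
        record.update(values)
        return copy.deepcopy(record)


class ComputeProxyManager:
    """Runs inside nova-compute-proxy; the only side that touches the db."""

    def __init__(self, db):
        self.db = db

    def instance_update(self, context, instance_uuid, updates):
        return self.db.instance_update(instance_uuid, updates)


class FakeRPC:
    """Synchronous stand-in for rpc.call(): dispatch on the method name."""

    def __init__(self, manager):
        self.manager = manager

    def call(self, context, msg):
        method = getattr(self.manager, msg["method"])
        return method(context, **msg["args"])


class ComputeProxyAPI:
    """Client that nova-compute would use instead of importing the db."""

    def __init__(self, rpc):
        self.rpc = rpc

    def instance_update(self, context, instance_uuid, **updates):
        msg = {"method": "instance_update",
               "args": {"instance_uuid": instance_uuid,
                        "updates": updates}}
        return self.rpc.call(context, msg)


db = FakeDB()
proxy_api = ComputeProxyAPI(FakeRPC(ComputeProxyManager(db)))
result = proxy_api.instance_update(None, "fake-uuid", vm_state="active")
```

The compute side only ever sees ComputeProxyAPI, so callers of
instance_update() in nova-compute would not need to know where the
write actually happens.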

Operations that are top candidates for being orchestrated by
nova-compute-proxy are operations that cross multiple compute nodes,
such as migrations and resizes.  However, there may be benefit in doing
this for other long running operations, such as starting an instance.
We should look into doing as much of this as is practical in this
development cycle, but I suspect much of this would carry over into H
development.
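
To illustrate the general shape only, here is a sketch of a proxy
driving a multi-node operation and tracking its progress.  The step
names, the FakeComputeNode class, and the error handling are all
assumptions made up for this example; the real order of operations for
a resize has not been mapped out:

```python
class FakeComputeNode:
    """Hypothetical compute node that records the steps it is asked to run."""

    def __init__(self, host, fail_on=None):
        self.host = host
        self.fail_on = fail_on
        self.log = []

    def run(self, step, instance_uuid):
        if step == self.fail_on:
            raise RuntimeError("%s failed on %s" % (step, self.host))
        self.log.append((step, instance_uuid))


def orchestrate_resize(instance_uuid, source, dest):
    """Drive a resize from the proxy, recording progress as it goes."""
    # Made-up steps: each one is executed by exactly one compute node.
    steps = [
        ("claim_resources", dest),
        ("transfer_disk", source),
        ("start_resized_instance", dest),
    ]
    progress = []
    for step, node in steps:
        progress.append(step)
        try:
            node.run(step, instance_uuid)
        except RuntimeError:
            # The proxy, not the compute node, notices the failure.
            progress.append("error")
            return progress
    progress.append("done")
    return progress


source = FakeComputeNode("compute-1")
dest = FakeComputeNode("compute-2")
ok_progress = orchestrate_resize("fake-uuid", source, dest)

bad_dest = FakeComputeNode("compute-3", fail_on="start_resized_instance")
failed_progress = orchestrate_resize("fake-uuid", source, bad_dest)
```

The interesting property is that the state of the whole operation lives
in one place instead of being spread across the two compute nodes.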


Some questions, complications, vagueness:

a) There is still a bit of hand waving around how the more significant
functionality will work in nova-compute-proxy (something like resizes).
It seems to make sense from a high level, but I haven't tried mapping
out the exact order of operations or anything.  Does it seem to make
sense, or do we need to dive deeper?

b) How about periodic tasks?  Right now nova-compute does a number of
periodic tasks, mostly associated with cleanup for the local instances.
 If long term we want to simplify nova-compute, where do these go?
Short term, for no-db-compute, we can leave them where they are and just
use nova-compute-proxy for db access as needed.  It feels like we need a
good long term vision here though, and I'm not sure what it is.
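
A sketch of the short-term arrangement, assuming a cleanup-style
periodic task that stays in nova-compute and only asks the proxy for
the db view.  FakeProxyAPI, ComputeManager, and cleanup_orphans() are
hypothetical names for this example, as is the lookup method on the
proxy:

```python
class FakeProxyAPI:
    """Hypothetical stand-in for db reads done via nova-compute-proxy."""

    def __init__(self, instances_by_host):
        self._instances_by_host = instances_by_host

    def instance_get_all_by_host(self, context, host):
        return list(self._instances_by_host.get(host, []))


class ComputeManager:
    """Stays on the compute node; knows the hypervisor, not the db."""

    def __init__(self, host, proxy_api, running_on_hypervisor):
        self.host = host
        self.proxy_api = proxy_api
        self.running_on_hypervisor = set(running_on_hypervisor)

    def cleanup_orphans(self, context):
        """Periodic task: find local instances the db does not know about."""
        known = set(self.proxy_api.instance_get_all_by_host(context,
                                                            self.host))
        return sorted(self.running_on_hypervisor - known)


proxy_api = FakeProxyAPI({"compute-1": ["inst-a", "inst-b"]})
manager = ComputeManager("compute-1", proxy_api,
                         ["inst-a", "inst-b", "inst-c"])
orphans = manager.cleanup_orphans(None)
```

The task still runs where the local state is, and the only thing that
changes is where the db read goes.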

If nova-compute-proxy were directly tied to N instances of nova-compute,
nova-compute-proxy could just do the periodic tasks.  I was hoping we
could avoid tying instances of the two services together, because
keeping them independent seems to leave the architecture more flexible
and less complicated.

Is the short-term approach for periodic tasks acceptable for now?

What should the long term plan be?  And does that have an impact on what
we do right now?


Thoughts?

-- 
Russell Bryant


