[openstack-dev] [Nova][Neutron] Thoughts on the nova<->neutron interface

Ian Wells ijw.ubuntu at cack.org.uk
Thu Jan 29 02:50:06 UTC 2015


On 28 January 2015 at 17:32, Robert Collins <robertc at robertcollins.net>
wrote:

> E.g. it's a call (not cast) out to Neutron, and Neutron returns when
> the VIF(s) are ready to use, at which point Nova brings the VM up. If
> the call times out, we error.
>

I don't think this model really works with distributed systems, and it
really doesn't work when you have a limited number of threads to play with
- because they get consumed by anything that has to wait a long time for a
thing to happen, and eventually you can't service requests any more.  Also,
it's entirely opposite to what Nova does.  Does it return when the VM is
running?  No, it returns when the VM is requested, saying 'I note your
request and will act on it in my own time'.
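
To make the thread-starvation point concrete, here's a minimal sketch in
Python. Nothing in it is Nova or Neutron code: the pool size, the function
names and the Event standing in for "Neutron says the VIFs are ready" are all
assumptions made up for illustration. It only shows that once every worker is
parked in a blocking call, an unrelated quick request cannot be served.

```python
import concurrent.futures
import threading


def demo_pool_starvation(workers=2):
    # Hypothetical stand-in for "Neutron says the VIFs are ready".
    vifs_ready = threading.Event()

    def blocking_call():
        # A synchronous call that holds its thread until the far end answers.
        vifs_ready.wait()
        return "vif ready"

    def quick_request():
        # An unrelated request that would take no time if a thread were free.
        return "served"

    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        # Fill every worker with a blocking call...
        slow = [pool.submit(blocking_call) for _ in range(workers)]
        # ...then try to service one more request.
        quick = pool.submit(quick_request)
        try:
            quick.result(timeout=0.2)
            starved = False
        except concurrent.futures.TimeoutError:
            starved = True
        # Only once the slow calls are released does the quick one get a thread.
        vifs_ready.set()
        results = [f.result(timeout=5) for f in slow]
        return starved, quick.result(timeout=5), results
```

With two workers and two blocked calls, the third request sits in the queue
until something outside the pool releases the waiters.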

What does Neutron have to do to complete a call?  That's entirely dependent
on the driver, but it could be talking to one, ten or a thousand devices,
any of which might be slow to respond: there is no upper bound on how long
it takes to bind a port, for instance.  So any REST call to Neutron should
change its DB and return, and leave an asynchronous process to deal with
making the network state change.  Neutron should notify Nova when it has
changed, and Nova can go on with its life doing other things till the
notification comes in.
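
As a rough sketch of that shape (not Neutron's actual code: the dicts and
queues below are in-memory stand-ins I've invented for the DB and the
notification channel, and update_port_binding, bind_worker and the status
strings are hypothetical names), the API handler records the request and
returns, and a separate worker does the slow part and then notifies:

```python
import queue
import threading

# Hypothetical in-memory stand-ins for the Neutron DB and the channel to Nova.
ports = {}                      # port_id -> {"host": ..., "status": ...}
notifications = queue.Queue()   # events that would be sent to Nova
work = queue.Queue()            # port ids waiting for the backend to act


def update_port_binding(port_id, host):
    """API handler: record the desired state and return immediately."""
    ports[port_id] = {"host": host, "status": "BUILD"}
    work.put(port_id)
    return {"id": port_id, "status": "BUILD"}   # "I note your request"


def bind_worker(bind_on_backend):
    """Background worker: does the slow device work, then notifies."""
    while True:
        port_id = work.get()
        if port_id is None:          # sentinel used to stop the worker
            break
        try:
            bind_on_backend(port_id, ports[port_id]["host"])
            ports[port_id]["status"] = "ACTIVE"
        except Exception:
            ports[port_id]["status"] = "ERROR"
        notifications.put((port_id, ports[port_id]["status"]))


def run_demo():
    # The backend here is a no-op; a real driver could take arbitrarily long.
    worker = threading.Thread(target=bind_worker, args=(lambda p, h: None,))
    worker.start()
    reply = update_port_binding("port-1", "compute-7")
    event = notifications.get(timeout=5)   # the caller is free until this arrives
    work.put(None)
    worker.join()
    return reply, event
```

The caller gets its reply before any device has been touched, and learns the
outcome from the notification rather than from the return value.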

> Right now we have this mix of synchronous and async code, and it's
> causing us to overlook things and have bugs. I'd be equally happy if
> we went all in with an async event driven approach, but we should
> decide if we're fish or fowl, not pick bits of both and hope reviewers
> can remember every little detail.
>

On this much we agree, I just happen to like fowl.

> > One other problem, not yet raised, is that Nova doesn't express its needs
> > when it asks for a port to be bound [...]


> +1, OTOH I don't think this is a structural problem - it doesn't
> matter what protocol or calling style we use, this is just the
> parameters in the call :).
>

Agreed.  We just need to make it a proper negotiation, and that's it done.
No-one seems to have a problem with this, so I'll have a play with the idea
(out of tree for now, given the time of the cycle).


> I think your desire and Salvatore's are compatible: an interface that
> is excellent for Nova can also be excellent for other users.
>

Agreed.  But if there's one interface for everything it doesn't really need
to be a plugin.  The question is whether one interface is enough.


> Notifications aren't a complete solution to the orphaning issue unless
> the notification system is guaranteed non-lossy. Something like Kafka
> would be an excellent substrate for such a system, or we could look at
> per-service journalling (on either side of the integration point).
>

I prefer lossy notification systems.  RabbitMQ is non-lossy, and that means
it will sit on messages for days and then deliver them long past the point
at which they're useful, plus its queue depth is unbounded.  It's not a
great way to run an eventually consistent system, in my opinion.

The pattern I like is where you are notified (via an unreliable channel)
when your operation has completed, but you must also have a background
checking task that goes to see if the notification has gone missing by
checking the datamodel.  The task doesn't have to trigger very often, and
in fact you could hold it off indefinitely with a heartbeat providing the
communications channel remains functioning - but it does have to exist.
You don't have the problem of having to provide an infinite queue, and it's
not a crisis when your messaging system loses a message.
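
A minimal sketch of that pattern, under stated assumptions: OperationWatcher,
read_status, the "ACTIVE" status string and the 60 second interval are all
hypothetical, and read_status stands for whatever call reads the
authoritative state from the datamodel. Notifications are the fast path,
heartbeats defer the check, and tick() falls back to the datamodel once the
channel has been quiet for too long.

```python
import time


class OperationWatcher:
    """Waits for completion notifications, falling back to the datamodel."""

    def __init__(self, read_status, check_interval=60.0, clock=time.monotonic):
        self.read_status = read_status        # hypothetical: reads status from the DB/API
        self.check_interval = check_interval  # how long the channel may stay quiet
        self.clock = clock
        self.pending = set()
        self.completed = set()
        self.last_heard = clock()

    def track(self, op_id):
        self.pending.add(op_id)

    def on_notification(self, op_id):
        """Fast path: the unreliable channel did deliver the event."""
        self.last_heard = self.clock()
        if op_id in self.pending:
            self.pending.discard(op_id)
            self.completed.add(op_id)

    def on_heartbeat(self):
        """A heartbeat shows the channel is alive, so the check is deferred."""
        self.last_heard = self.clock()

    def tick(self):
        """Call periodically; checks the datamodel only if the channel went quiet."""
        if self.clock() - self.last_heard < self.check_interval:
            return False
        for op_id in list(self.pending):
            if self.read_status(op_id) == "ACTIVE":
                self.pending.discard(op_id)
                self.completed.add(op_id)
        self.last_heard = self.clock()
        return True
```

Note the sketch takes the heartbeat at face value: it assumes messages are
only lost when the channel itself is down. If a live channel can still drop
individual messages, the check would need to run occasionally regardless.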
-- 
Ian.