[openstack-dev] [Nova][Neutron][Technical Committee] nova-network -> Neutron. Throwing a wrench in the Neutron gap analysis
beagles at redhat.com
Fri Aug 8 20:54:19 UTC 2014
On Wed, Aug 06, 2014 at 01:40:28PM +0800, Tom Fifield wrote:
> >While DB migrations are running things like the nova metadata service
> >can/will misbehave - and user code within instances will be affected.
> >That's arguably VM downtime.
> >OTOH you could define it more narrowly as 'VMs are not powered off' or
> >'VMs are not stalled for more than 2s without a time slice' etc etc -
> >my sense is that most users are going to be particularly concerned
> >about things for which they have to *do something* - e.g. VMs being
> >powered off or rebooted - but having no network for a short period
> >while vifs are replugged and the overlay network re-establishes itself
> >would be much less concerning.
> I think you've got it there, Rob - nicely put :)
> In many cases the users I've spoken to who are looking for a live path out
> of nova-network on to neutron are actually completely OK with some "API
> service" downtime (metadata service is an API service by their definition).
> A little 'glitch' in the network is also OK for many of them.
> Contrast that with the original proposal in this thread ("snapshot VMs in
> old nova-network deployment, store in Swift or something, then launch VM
> from a snapshot in new Neutron deployment") - it is completely unacceptable
> and is not considered a migration path for these users.
I've thought about this off and on since it was brought up at summit. I
have some concerns about expectations. While I could probably rattle on,
I'll stick to two for now.
- We need to be clear about expectations around connection resets and other
  odd connection behavior. There are some nice little gotchas for some
  applications when an IP address is moved, depending on how the connection
  is being used. Floating IPs could be interesting as well, since
  nova-network and neutron differ quite a bit in how they are
  implemented. The ultimate effect on running applications will of
  course depend on whether or not they can handle things of that nature.
  Apps designed for failover, stale connections, etc., will probably fare
  better than those that are not. Apps designed for cattle VMs probably
  will do okay too. I imagine pets will be higher risk and, interestingly
  enough, they seem to be a more likely target use case. I suppose this
  falls under the category of "glitch", but the pessimist (realist?) in
  me is having a hard time shaking the feeling that some deployments are
  going to run into problems... which is a nice segue into the next concern.
- I wonder about uncommunicated expectations around migration rollback in
  case of the "all gone to hell, we need to put it back" situation. We
  have been talking about migrating a live VM from nova-network to
  neutron, but what about the way back? Are new VM boots going to be
  prevented until an all-clear is given, to prevent orphans if
  nova-network needs to be put back in place? Or are we saying it is a
  "never look back" type of deal? Has this been discussed and all
  worked out and I just missed it? This concerns me a great deal because
  I cannot imagine any of the admins I've ever worked with doing something
  without a failsafe backup to a "known good" state, whether they end up
  needing it or not.
I'm not convinced that these have been thoroughly considered, nor are
they addressable in the very near future. I also am *deeply* concerned
that placing significant focus on this PRIOR to achieving parity with
nova-network both in function and stability jeopardizes all. That is not
to diminish the efforts of those that have already contributed heavily
in this area. However, this work is all for nothing if we haven't
covered the necessary gaps so that the users have something to migrate to.