[openstack-dev] [Nova][Neutron][Technical Committee] nova-network -> Neutron. Throwing a wrench in the Neutron gap analysis

Brent Eagles beagles at redhat.com
Fri Aug 8 20:54:19 UTC 2014

On Wed, Aug 06, 2014 at 01:40:28PM +0800, Tom Fifield wrote:
> >While DB migrations are running things like the nova metadata service
> >can/will misbehave - and user code within instances will be affected.
> >That's arguably VM downtime.
> >
> >OTOH you could define it more narrowly as 'VMs are not powered off' or
> >'VMs are not stalled for more than 2s without a time slice' etc etc -
> >my sense is that most users are going to be particularly concerned
> >about things for which they have to *do something* - e.g. VMs being
> >powered off or rebooted - but having no network for a short period
> >while vifs are replugged and the overlay network re-establishes itself
> >would be much less concerning.
> I think you've got it there, Rob - nicely put :)
> In many cases the users I've spoken to who are looking for a live path out
> of nova-network on to neutron are actually completely OK with some "API
> service" downtime (metadata service is an API service by their definition).
> A little 'glitch' in the network is also OK for many of them.
> Contrast that with the original proposal in this thread ("snapshot VMs in
> old nova-network deployment, store in Swift or something, then launch VM
> from a snapshot in new Neutron deployment") - it is completely unacceptable
> and is not considered a migration path for these users.
> Regards,
> Tom

I've thought about this off and on since it was brought up at summit. I
have some concerns about expectations. While I could probably rattle on,
I'll stick to two for now.

- We need to be clear about expectations around connection resets and
  other odd connection behavior. There are some nice little gotchas for
  some applications when an IP address is moved, depending on how the
  connection is being used. Floating IPs could be interesting as well, as
  nova-network and neutron differ quite a bit in how they are
  implemented. The ultimate effect on running applications will of
  course depend on whether or not they can handle things of that nature.
  Apps designed for failover, stale connections, etc, will probably fare
  better than those that are not. Apps designed for cattle VMs probably
  will do okay too. I imagine pets will be higher risk and interestingly
  enough, they seem to be a more likely target use case. I suppose this
  falls under the category of "glitch", but the pessimist (realist?) in
  me is having a hard time believing that some deployments are not going
  to run into problems... which is a nice segue into the next concern.

- I wonder about uncommunicated expectations with migration rollback in
  case of the "all gone to hell, we need to put it back" situation. We
  have been talking about migrating a live VM from nova-network to
  neutron, but what about the way back? Are new VM boots going to be
  prevented until an all-clear is given to prevent orphans if
  nova-network needs to be put back in place? Or are we saying it is a
  "never look back" type of deal? Has this been discussed and all
  worked out and I just missed it? This concerns me a great deal because
  I cannot imagine any of the admins I've ever worked with doing
  something without a failsafe backup to a "known good" state, whether
  they end up needing it or not.

I'm not convinced that these have been thoroughly considered, nor that
they are addressable in the very near future. I am also *deeply*
concerned that placing significant focus on this PRIOR to achieving
parity with nova-network, both in function and stability, jeopardizes
everything. That is not
to diminish the efforts of those that have already contributed heavily
in this area. However, this work is all for nothing if we haven't
covered the necessary gaps so that the users have something to migrate


