[openstack-dev] [nova][neutron] Migration from nova-network to Neutron for large production clouds
Michael Still
mikal at stillhq.com
Mon Aug 25 21:38:25 UTC 2014
On Thu, Aug 21, 2014 at 1:17 AM, Tim Bell <Tim.Bell at cern.ch> wrote:
> Michael has been posting very informative blogs on the summary of the
> mid-cycle meetups for Nova. The one on the Nova Network to Neutron
> migration was of particular interest to me as it raises a number of
> potential impacts for the CERN production cloud. The blog itself is at
> http://www.stillhq.com/openstack/juno/000014.html
>
>
>
> I would welcome suggestions from the community on the approach to take and
> areas that the nova/neutron team could review to limit the impact on the
> cloud users.
>
>
>
> For some background, CERN has been running nova-network in flat DHCP mode
> since our first Diablo deployment. We moved to production for our users in
> July last year and are currently supporting around 70,000 cores, 6 cells,
> 100s of projects and thousands of VMs. Upgrades generally involve disabling
> the API layer while allowing running VMs to carry on without disruption.
> Within the time scale of the migration to Neutron (M release at the
> latest), these numbers are expected to double.
>
>
>
> For us, the concerns we have with the ‘cold’ approach would be on the user
> impact and operational risk of such a change. Specifically,
>
>
>
> 1. A big bang approach of shutting down the cloud, upgrading and then
> resuming the cloud would cause significant user disruption
>
> 2. The risks involved with a cloud of this size and the open source
> network drivers would be difficult to mitigate through testing and could
> lead to site wide downtime
>
> 3. Rebooting VMs may be possible to schedule in batches but would
> need to be staggered to keep availability levels
>
>
>
> Note, we are not looking to use Neutron features initially, just to find a
> functional equivalent of the flat DHCP network.
>
>
>
> We would appreciate suggestions on how we could achieve a smooth migration
> for the simple flat DHCP models.
>
>
Thanks for sending this, Tim. Sorry for my slow reply; a day-long meeting
and some international travel got in the way. When we originally talked, I
said I needed to understand more of the background to your need for a "zero
downtime" upgrade. That said...
Mark McClain and I discussed a possible plan for nova-network to neutron
upgrades at the Ops Meetup today, and it seemed generally acceptable. It
defines a "cold migration" as freezing the ability to create or destroy
instances during the upgrade, and then requiring a short network outage for
each instance in the cell.
This is why I'm trying to understand the "no downtime" use case better. Is
it literally no downtime, ever? Or is it a simpler "no simultaneous
downtime for instances"?
Michael
--
Rackspace Australia