[Openstack-operators] [nova][cinder][neutron] Cross-cell cold migration
Tim Bell
Tim.Bell at cern.ch
Wed Aug 29 19:49:24 UTC 2018
I've not followed all the arguments here regarding internals, but CERN's background usage of Cells v2 (and thoughts on the impact of cross-cell migration) is below. Some background at https://www.openstack.org/videos/vancouver-2018/moving-from-cellsv1-to-cellsv2-at-cern. Some rough parameters follow, with the team able to provide more concrete numbers if needed...
- The VMs to be migrated are generally not expensive configurations; the driver is hardware lifecycle, where boxes go out of warranty or the computer centre rack/cooling needs re-organising. For CERN, this happens on a 6-12 month cycle and covers ~10,000 VMs per year (with a ~30% pet share)
- We make a cell from identical hardware at a single location; this greatly simplifies working out hardware issues, provisioning and management
- Some cases can be handled with 'please delete and re-create'. Many other cases need significant user support/downtime (and require significant effort, or risk delaying retirements, to get agreement)
- When a new hardware delivery is made, we would hope to define a new cell (as it is a different configuration)
- Depending on the facilities retirement plans, we would work out what needed to be moved to new resources
- There are many different scenarios for migration (either live or cold)
-- All instances in the old cell would be migrated to the new hardware, which would have sufficient capacity (a rough sketch of this case follows below)
-- All instances in a single cell would be migrated to several different cells, for example where the new cells are smaller
-- Some instances would be migrated because those racks need to be retired, but other servers in the cell would remain for a further year or two until retirement was mandatory
With many cells and multiple locations, spreading the hypervisors across the cells in anticipation of potential migrations is unattractive.
From my understanding, these models were feasible with Cells V1.
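For concreteness, here is a rough sketch of the first scenario above (draining retiring hypervisors onto new capacity) as it can be driven today within a single cell, using openstacksdk. The cloud name and hostnames are hypothetical:

    import openstack

    conn = openstack.connect(cloud='prod')  # hypothetical clouds.yaml entry

    # Hypothetical names for the hypervisors being retired.
    retiring_hosts = ['oldcell-compute-001', 'oldcell-compute-002']

    for host in retiring_hosts:
        # Admin-only server listing, filtered by hypervisor (Nova's 'host' filter).
        for server in conn.compute.servers(all_projects=True, host=host):
            # Cold migrate; the scheduler chooses the target host.
            conn.compute.migrate_server(server)
            server = conn.compute.wait_for_server(
                server, status='VERIFY_RESIZE', wait=3600)
            # Cold migration parks in VERIFY_RESIZE until confirmed.
            conn.compute.confirm_server_resize(server)

Cross-cell, that same loop is exactly what does not work today, which is the gap being discussed in this thread.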
We can discuss further at the PTG or Summit, covering the operational flexibility we have taken advantage of so far and alternative models.
Tim
-----Original Message-----
From: Dan Smith <dms at danplanet.com>
Date: Wednesday, 29 August 2018 at 18:47
To: Jay Pipes <jaypipes at gmail.com>
Cc: "openstack-operators at lists.openstack.org" <openstack-operators at lists.openstack.org>
Subject: Re: [Openstack-operators] [nova][cinder][neutron] Cross-cell cold migration
> A release upgrade dance involves coordination of multiple moving
> parts. It's about as similar to this scenario as I can imagine. And
> there's a reason release upgrades are not done entirely within Nova;
> clearly an external upgrade tool or script is needed to orchestrate
> the many steps and components involved in the upgrade process.
I'm lost here, and assume we must be confusing terminology or something.
> The similar dance for cross-cell migration is the coordination that
> needs to happen between Nova, Neutron and Cinder. It's called
> orchestration for a reason and is not what Nova is good at (as we've
> repeatedly seen).
Most other operations in Nova meet this criterion. Boot requires
coordination between Nova, Cinder, and Neutron. As do migrate, start,
stop, evacuate. We might decide that (for now) the volume migration
thing is beyond the line we're willing to cross, and that's cool, but I
think it's an arbitrary limitation we shouldn't assume is
impossible. Moving instances around *is* what Nova is (supposed to be)
good at.
> The thing that makes *this* particular scenario problematic is that
> cells aren't user-visible things. User-visible things could much more
> easily be orchestrated via external actors, as I still firmly believe
> this kind of thing should be done.
I'm having a hard time reconciling these:
1. Cells aren't user-visible, and shouldn't be (your words and mine).
2. Cross-cell migration should be done by an external service (your
words).
3. External services work best when things are user-visible (your words).
You say the user-invisible-ness makes orchestrating this externally
difficult and I agree, but...is your argument here just that it
shouldn't be done at all?
>> As we discussed in YVR most recently, it also may become an important
>> thing for operators and users where expensive accelerators are committed
>> to instances with part-time usage patterns.
>
> I don't think that's a valid use case in respect to this scenario of
> cross-cell migration.
You're right, it has nothing to do with cross-cell migration at all. I
was pointing to *other* legitimate use cases for shelve.
> Also, I'd love to hear from anyone in the real world who has
> successfully migrated (live or otherwise) an instance that "owns"
> expensive hardware (accelerators, SR-IOV PFs, GPUs or otherwise).
Again, the accelerator case has nothing to do with migrating across
cells, but merely demonstrates another example of where shelve may be
the thing operators actually desire. Maybe I shouldn't have confused the
discussion by bringing it up.
> The patterns that I have seen are one of the following:
>
> * Applications don't move. They are pets that stay on one or more VMs
> or baremetal nodes and they grow roots.
>
> * Applications are designed to *utilize* the expensive hardware. They
> don't "own" the hardware itself.
>
> In this latter case, the application is properly designed and stores
> its persistent data in a volume and doesn't keep state outside of the
> application volume. In these cases, the process of "migrating" an
> instance simply goes away. You just detach the application persistent
> volume, shut down the instance, start up a new one elsewhere (allowing
> the scheduler to select one that meets the resource constraints in the
> flavor/image), attach the volume again and off you go. No messing
> around with shelving, offloading, migrating, or any of that nonsense
> in Nova.
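For reference, the detach/re-create/re-attach flow described above is short enough to sketch with openstacksdk's cloud layer; every name here (cloud, server, volume, image, flavor, network) is hypothetical:

    import openstack

    conn = openstack.connect(cloud='prod')        # hypothetical clouds.yaml entry

    server = conn.get_server('app-01')            # hypothetical instance name
    volume = conn.get_volume('app-01-data')       # hypothetical persistent volume

    # Detach the persistent data volume (in practice, quiesce or stop the
    # application first), then retire the old instance.
    conn.detach_volume(server, volume, wait=True)
    conn.delete_server(server.id, wait=True)

    # Boot a replacement wherever the scheduler finds capacity matching the
    # flavor/image constraints, then re-attach the data volume.
    new = conn.create_server('app-01', image='ubuntu-18.04',
                             flavor='m1.large', network='app-net',
                             wait=True)           # hypothetical names
    conn.attach_volume(new, volume, wait=True)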
Jay, you know I sympathize with the fully-ephemeral application case,
right? Can we agree that pets are a thing and that migrations are not
going to be leaving Nova's scope any time soon? If so, I think we can
get back to the real discussion, and if not, I think we probably, er,
can't :)
> We should not pretend that what we're discussing here is anything
> other than hacking orchestration workarounds into Nova to handle
> poorly-designed applications that have grown roots on some hardware
> and think they "own" hardware resources in a Nova deployment.
I have no idea how we got to "own hardware resources" here. The point of
this discussion is to make our instance-moving operations work across
cells. We designed cellsv2 to be invisible and baked into the core of
Nova. We intended for it not to fall into the trap laid by cellsv1,
where the presence of multiple cells meant that a bunch of regular
operations didn't work like they otherwise would.
If we're going to discuss removing move operations from Nova, we should
do that in another thread. This one is about making existing operations
work :)
> If that's the case, why are we discussing shelve at all? Just stop the
> instance, copy/migrate the volume data (if needed, again it completely
> depends on the deployment, network topology and block storage
> backend), to a new location (new cell, new AZ, new host agg, does it
> really matter?) and start a new instance, attaching the volume after
> the instance starts or supplying the volume in the boot/create
> command.
Because shelve potentially makes it less dependent on the answers to
those questions, and Matt suggested it as a first step to being able to
move things around at all. It means that "copy the data" becomes "talk
to glance", which compute nodes can already do. Requiring compute nodes
across cells to talk to each other (which could be in different
buildings, sites, or security domains) is a whole extra layer of
complexity. I do think we'll go there (via resize/migrate) at some point,
but shelve going through glance for data and through a homeless phase in
Nova does simplify a whole set of things.
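To make that concrete, a minimal sketch of the shelve/unshelve flow as an operator can drive it today, with hypothetical names; cross-cell unshelve is precisely the part that does not exist yet:

    import openstack

    conn = openstack.connect(cloud='prod')        # hypothetical clouds.yaml entry

    server = conn.compute.find_server('pet-01')   # hypothetical instance

    # Shelve: snapshot the root disk to Glance; Nova offloads the host
    # resources per its shelved_offload_time setting (0 = immediately).
    conn.compute.shelve_server(server)
    server = conn.compute.wait_for_server(
        server, status='SHELVED_OFFLOADED', wait=3600)

    # Unshelve: the scheduler picks a new home and the target compute node
    # pulls the image back from Glance; no compute-to-compute data path.
    conn.compute.unshelve_server(server)
    conn.compute.wait_for_server(server, status='ACTIVE', wait=3600)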
> The admin only "owns" the instance because we have no ability to
> transfer ownership of the instance and a cell isn't a user-visible
> thing. An external script that accomplishes this kind of orchestrated
> move from one cell to another could easily update the ownership of
> said instance in the DB.
So step 5 was "do surgery on the database"? :)
> My point is that Nova isn't an orchestrator, and building
> functionality into Nova to do this type of cross-cell migration IMHO
> just will lead to even more unmaintainable code paths that few, if
> any, deployers will ever end up using because they will end up doing
> it externally anyway due to the need to integrate with backend
> inventory management systems and other things.
On the contrary, per the original goal of cellsv2, I want to make the
*existing* code paths in Nova work properly when multiple cells are
present. Just like we had to make boot and list work properly with
multiple cells, I think we need to do the same with migrate, shelve,
etc.
--Dan