[Openstack-operators] resizing an instance

Rafi Khardalian rafi at metacloud.com
Tue May 21 22:53:38 UTC 2013


Unfortunately resizes are much worse than what's being described here.

Aside from the SSH, which we all agree has a number of associated problems,
there are 3 separate image conversions which occur (assuming you are using
raw-backed qcow2, which is the default).  The first conversion flattens the
image, eliminating the backing file.  This occurs on the origin hypervisor,
with the result being SCP'd to the destination.  Once the copy completes, the
second conversion kicks off, which converts from qcow2 to raw.  This is done
in an attempt to resize the filesystem contained within the image.  This
resize will fail unless you're using AMIs, as there is no logic in place to
resize images containing partitions.  The third conversion occurs when the
raw image is converted back into qcow2.
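
To make the sequence concrete, here is a rough sketch of the three
conversions expressed as qemu-img command lines.  This is not the nova code
path itself, only an approximation of the steps described above.  The
function name and the intermediate file names (disk_flat.qcow2, disk.raw,
disk.qcow2) are made up for illustration, and the sketch only builds the
commands rather than running them.

```python
from typing import List


def resize_conversion_commands(disk: str, workdir: str) -> List[List[str]]:
    """Build (but do not run) the three qemu-img conversions, in order.

    `disk` is the instance's qcow2 disk with a raw backing file.
    All output file names below are hypothetical, chosen for illustration.
    """
    flat = f"{workdir}/disk_flat.qcow2"
    raw = f"{workdir}/disk.raw"
    final = f"{workdir}/disk.qcow2"
    return [
        # 1. Origin hypervisor: converting without naming a backing file
        #    writes a standalone image, which flattens the backing chain.
        ["qemu-img", "convert", "-f", "qcow2", "-O", "qcow2", disk, flat],
        # (the flattened image is then copied to the destination host)
        # 2. Destination: qcow2 -> raw, so the filesystem resize can be tried.
        ["qemu-img", "convert", "-f", "qcow2", "-O", "raw", flat, raw],
        # 3. Destination: raw -> qcow2 again.
        ["qemu-img", "convert", "-f", "raw", "-O", "qcow2", raw, final],
    ]


if __name__ == "__main__":
    # Example paths are assumptions, not taken from a real deployment.
    for cmd in resize_conversion_commands(
        "/var/lib/nova/instances/demo/disk", "/tmp/resize"
    ):
        print(" ".join(cmd))
```

Each step reads and rewrites the whole disk, which is why the cost grows with
the size of the image and not with the size of the change.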

Until recently, there was absolutely nothing taking shared storage into
account.  In fact, if you are using shared storage, be absolutely certain
to apply the patch associated with this bug (
https://bugs.launchpad.net/nova/+bug/1177247).  Otherwise, there are a
number of cases where *you will lose data*.  If you want to go a step
further, you're welcome to apply my patch to make the entire process a lot
more efficient (https://gist.github.com/rmk40/ab2c6f518a7a40a261af).  The
patch copies the disk file around untouched and relies on code introduced
in Grizzly to repopulate the backing files on the destination via Glance.

The good news is, we know how much work migrate/resize needs and are
committed to fixing it in the Havana cycle.  I haven't submitted the
aforementioned patch for this very reason: we're overhauling the code path
entirely.  Nonetheless, Grizzly is the stable release today, so if you're
doing resizes on a regular basis, take a look at the patch.  It has zero
chance of going into stable/grizzly, because of the constraints around how
the stable tree is managed.

- Rafi


On Tue, May 21, 2013 at 3:24 PM, Joshua Harlow <harlowja at yahoo-inc.com> wrote:

>  Hi Ryan,
>
>  Yes, it is a little bit weird. I think the reason it's doing this is
> likely that it is the most generic solution to the resizing problem. I
> believe there is a flag like 'resize_same_host' that might help, but I
> haven't used it.
>
>  I think your concern is valid though and could/should(?) be fixed.
>
>  You could imagine the scheduler 'preferring' the origin compute node, and
> the origin compute node in this case doing a 'fast path' resize that would
> trigger the compute manager to just move some folders around (instead of
> invoking the ssh sequence you talked about). This would avoid the 'slow
> path' of actually moving the instance (and disks and so on) to a new node,
> which would only be needed when the origin node doesn't have enough
> space/cpu/memory. That would make sense to me, and with beefy enough
> compute nodes the 'fast path' would likely be the common case.
>
>  -Josh
>
>   From: Ryan Lane <rlane at wikimedia.org>
> Date: Tuesday, May 21, 2013 2:25 PM
> To: Juan José Pavlik Salles <jjpavlik at gmail.com>
> Cc: "openstack-operators at lists.openstack.org" <
> openstack-operators at lists.openstack.org>
> Subject: Re: [Openstack-operators] resizing an instance
>
>   On Tue, May 21, 2013 at 2:20 PM, Juan José Pavlik Salles <
> jjpavlik at gmail.com> wrote:
>
>> I'm not sure about your deployment, but I noticed that when I try to
>> resize a VM, nova also tries to move the VM to another compute node, so if
>> you don't have shared storage for the VMs this is not possible (unless
>> your compute node can ssh to the other compute node, of course). I found
>> things in my logs like "*ssh root@node2 mkdir -p
>> /var/lib/nova/instances/instances_dir*" when trying to resize a VM.
>> Tomorrow I'll try resizing with shared storage and I'll let you know.
>>
>>
>  Yes. This is the weirdest behavior. Why in the world is it necessary to
> move the instance to another compute node just to do a resize? It requires
> ssh between compute nodes, makes the process *much* slower and also makes
> it way more error prone. I don't get it. This seems like a really
> convoluted way of handling resizes.
>
>  Is there some really great reasoning behind this?
>
>  - Ryan
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>
>