[openstack-dev] [nova][libvirt] Block migrations and Cinder volumes

Robert Collins robertc at robertcollins.net
Mon Feb 16 00:39:21 UTC 2015


On 19 June 2014 at 20:38, Daniel P. Berrange <berrange at redhat.com> wrote:
> On Wed, Jun 18, 2014 at 11:09:33PM -0700, Rafi Khardalian wrote:
>> I am concerned about how block migration functions when Cinder volumes are
>> attached to an instance being migrated.  We noticed some unexpected
>> behavior recently, whereby attached generic NFS-based volumes would become
>> entirely unsparse over the course of a migration.  After spending some time
>> reviewing the code paths in Nova, I'm more concerned that this was actually
>> a minor symptom of a much more significant issue.
>>
>> For those unfamiliar, NFS-based volumes are simply RAW files residing on an
>> NFS mount.  From Libvirt's perspective, these volumes look no different
>> than root or ephemeral disks.  We are currently not filtering out volumes
>> whatsoever when making the request into Libvirt to perform the migration.
>>  Libvirt simply receives an additional flag (VIR_MIGRATE_NON_SHARED_INC)
>> when a block migration is requested, which applies to the entire migration
>> process and is not differentiated on a per-disk basis.  Numerous guards
>> exist within Nova to prevent a block-based migration from being allowed if
>> the instance disks exist on the destination; yet volumes remain attached
>> and within the defined XML during a block migration.
>>
>> Unless Libvirt has a lot more logic around this than I am led to believe,
>> this seems like a recipe for corruption.  It seems as though this would
>> also impact any type of volume attached to an instance (iSCSI, RBD, etc.),
>> NFS just happens to be what we were testing.  If I am wrong and someone can
>> correct my understanding, I would really appreciate it.  Otherwise, I'm
>> surprised we haven't had more reports of issues when block migrations are
>> used in conjunction with any attached volumes.
>
> Libvirt/QEMU has no special logic. When told to block-migrate, it will do
> so for *all* disks attached to the VM in read-write-exclusive mode. It will
> only skip those in read-only or read-write-shared mode. Even that
> distinction is somewhat dubious and so not reliably what you would want.
>
> It seems like we should just disallow block migrate when any cinder volumes
> are attached to the VM, since there is never any valid use case for doing
> block migrate from a cinder volume to itself.
>
> Regards,
> Daniel

Just ran across this from bug
https://bugs.launchpad.net/nova/+bug/1398999. Is there some way to
signal to libvirt that some block devices shouldn't be migrated by it
but instead are known to be networked, etc.? Or put another way, how can
we have our cake and eat it too? It's not uncommon for a VM to be
Cinder-booted but have local storage for swap... and AIUI the fix we
put in for this bug stops those VMs from being migrated. Do you think it
is tractable (but needs libvirt work), or is it something endemic to
the problem (e.g. dirty page synchronisation with the VM itself) that
will be in the way?
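
For concreteness, here is a rough Python sketch of the skip rule as
Daniel describes it above: copy every disk except those marked read-only
or read-write-shared. This only models the behaviour described in this
thread; it is not libvirt's or Nova's actual code. The function names,
paths and device names are made up for illustration, and I'm assuming the
two skipped modes correspond to the <readonly/> and <shareable/> elements
in the domain XML.

```python
import xml.etree.ElementTree as ET


def disks_copied_by_block_migration(domain_xml):
    """Return target dev names of disks a block migration would copy.

    Models only the rule described in this thread: every disk is copied
    unless it is marked <readonly/> or <shareable/>.
    """
    root = ET.fromstring(domain_xml)
    copied = []
    for disk in root.findall("./devices/disk"):
        if disk.find("readonly") is not None:
            continue  # read-only disks are skipped
        if disk.find("shareable") is not None:
            continue  # read-write-shared disks are skipped
        target = disk.find("target")
        if target is not None:
            copied.append(target.get("dev"))
    return copied


def volumes_at_risk(domain_xml, volume_devs):
    """Attached volumes (by target dev) that would be copied onto themselves."""
    copied = set(disks_copied_by_block_migration(domain_xml))
    return sorted(copied & set(volume_devs))


# Hypothetical guest: local swap on vda, two NFS-backed volumes on vdb/vdc.
EXAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'>
      <source file='/var/lib/nova/instances/hypothetical/disk.swap'/>
      <target dev='vda' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <source file='/mnt/nfs/hypothetical/volume-1234'/>
      <target dev='vdb' bus='virtio'/>
    </disk>
    <disk type='file' device='disk'>
      <source file='/mnt/nfs/hypothetical/volume-5678'/>
      <target dev='vdc' bus='virtio'/>
      <shareable/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(disks_copied_by_block_migration(EXAMPLE_XML))
    print(volumes_at_risk(EXAMPLE_XML, ["vdb", "vdc"]))
```

Under that rule the NFS-backed volume on vdb is copied alongside the
local swap disk on vda, which is exactly the disk we would want to be
able to tell libvirt to leave alone.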

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud


