<div dir="ltr">I am concerned about how block migration behaves when Cinder volumes are attached to the instance being migrated. We recently noticed some unexpected behavior: attached generic NFS-based volumes became entirely non-sparse (fully allocated) over the course of a migration. After spending some time reviewing the code paths in Nova, I am concerned that this is actually a minor symptom of a much more significant issue.<div>
<br></div><div>For those unfamiliar, NFS-based volumes are simply RAW files residing on an NFS mount. From Libvirt's perspective, these volumes look no different from root or ephemeral disks. We currently do not filter out volumes at all when making the request to Libvirt to perform the migration. Libvirt simply receives an additional flag (VIR_MIGRATE_NON_SHARED_INC) when a block migration is requested, and that flag applies to the entire migration process; it is not differentiated on a per-disk basis. There are numerous guards within Nova that prevent a block-based migration from being allowed if the instance disks exist on the destination; yet volumes remain attached and within the defined XML during a block migration.</div>
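<div><br></div><div>To make the concern concrete, here is a minimal sketch (not Nova's actual code) of how one could check which attached volumes are still present in the domain XML handed to a block migration. The function names and the volume_devs argument are hypothetical; volume_devs is assumed to be the set of guest device names (for example vdb) that the instance's block device mapping reports as Cinder volumes.</div>

```python
import xml.etree.ElementTree as ET


def disks_in_domain(domain_xml):
    """Return a (target_dev, source_path) tuple for every disk in the domain XML."""
    root = ET.fromstring(domain_xml)
    disks = []
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        source = disk.find("source")
        dev = target.get("dev") if target is not None else None
        path = None
        if source is not None:
            # The attribute holding the source depends on the disk type:
            # file-backed, block-device-backed, or network-backed.
            path = source.get("file") or source.get("dev") or source.get("name")
        disks.append((dev, path))
    return disks


def volumes_exposed_to_block_migration(domain_xml, volume_devs):
    """Return the disks in the XML whose target device is a known volume.

    volume_devs is a hypothetical input: the set of device names that the
    caller already knows to be attached volumes.
    """
    return [
        (dev, path)
        for dev, path in disks_in_domain(domain_xml)
        if dev in volume_devs
    ]
```

<div>Because the migration flag is set for the whole domain rather than per disk, any entry this helper returns is a volume that the migration request does not distinguish from the instance's own disks.</div>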
<div><br></div><div>Unless Libvirt has a lot more logic around this than I am led to believe, this seems like a recipe for corruption. It seems as though this would also impact any type of volume attached to an instance (iSCSI, RBD, etc.); NFS just happens to be what we were testing. If I am wrong and someone can correct my understanding, I would really appreciate it. Otherwise, I'm surprised we haven't had more reports of issues when block migrations are used in conjunction with attached volumes.</div>
<div><br></div><div>I have ideas on how we can address the issue if we can reach some consensus that the issue is valid, but we can discuss those if/when we get to that point.</div><div><br></div><div>Regards,</div>
<div>Rafi</div></div>