[openstack-dev] [nova] Adding temporary code to nova to work around bugs in system utilities
Jay Pipes
jaypipes at gmail.com
Mon Dec 8 20:19:40 UTC 2014
On 12/03/2014 04:00 AM, Tony Breeds wrote:
> Hi All,
> I'd like to accomplish 2 things with this message:
> 1) Unblock (one way or another) https://review.openstack.org/#/c/123957
> 2) Create some form of consensus on when it's okay to add temporary code to
> nova to work around bugs in external utilities.
>
> So some background on this specific issue. The issue was first reported in
> July 2014 at [1] and then clarified at [2]. The synopsis of the bug is that
> calling qemu-img convert -O raw /may/ generate a corrupt output file if the
> source image isn't fully flushed to disk. The coreutils folk discovered
> something similar in 2011 *sigh*
>
> The clear and correct solution is to ensure that qemu-img uses
> FIEMAP_FLAG_SYNC. This in turn produces a measurable slowdown in that code
> path, so additionally it's best if qemu-img uses an alternate method to
> determine data status in a disk image. This has been done and will be included
> in qemu 2.2.0 when it's released. These fixes prompted a more substantial
> rework of that code in qemu, which is awesome but not *required* to fix the
> bug in qemu.
>
> While we wait for $distros to get the fixed qemu, nova is still vulnerable to
> the bug. To that end I proposed a workaround in nova that forces images
> retrieved from glance to disk with an fsync() prior to calling qemu-img on
> them. I admit that this is ugly and has a performance impact.
>
> In order to reduce the impact of the fsync() I considered:
> 1) Testing the qemu version and only fsync()ing on affected versions.
> - Vendors will backport the fix to their version of qemu. The fixed version
> will still claim to be 2.1.0 (for example) and therefore trigger the
> fsync() when not required. Given how unreliable this will be, I dismissed
> it as an option.
>
> 2) API Change
> - In the case of this specific bug we only need to fsync() in certain
> scenarios. It would be easy to add a flag to IMAGE_API.download() to
> determine if this fsync() is required. This has the nice property of only
> having a performance impact in the suspect case (personally I'll take
> slow-and-correct over fast-and-buggy any day). My hesitation is that
> after we've modified the API it's very hard to remove that change when we
> decide the workaround is redundant.
>
> 3) Config file option
> - For many of the same reasons as the API change this seemed like a bad
> idea.
>
> Does anyone have any other ideas?
>
> One thing that I haven't done is measure the impact of the fsync() on any
> reasonable workload. This is mainly because I don't really know how. Sure I
> could do some statistics in devstack but I don't really think they'd be
> meaningful. Also the size of the image in glance is fairly important. An
> fsync() of a 100GB image is many times more painful than a 1GB image.
>
> While in Paris I was asked to look at other code paths in nova where we use
> qemu-img convert. I'm doing this analysis. To date I have some suspicions
> that snapshot (and migration) are affected, but no data that confirms or
> refutes that. I continue to look at the appropriate code in nova, libvirt and
> qemu.
>
> I understand that there is more work to be done in this area, and I'm happy to
> do it. Having said that, from where I sit that work is not directly related to
> the bug that started this.
>
> As the idea is to remove this code as soon as all the distros we care about
> have a fixed qemu, I started an admittedly brief discussion here [3] on which distros
> are in that list. Armed with that list I have opened (or am in the process of
> opening) bugs for each version of each distribution to make them aware of the
> issue and the fix. I have a status page at [4].
>
> Okay, I think I'm done raving.
>
> So moving forward:
>
> 1) So what should I do with the open review?
I reviewed the patch. I don't mind the idea of a [workarounds] section
of configuration options, but I had an issue with where that code was
executed.
> 2) What can we learn from this in terms of how we work around key utilities
> that are not in our direct power to change?
> - Is taking ugly code for "some time" okay? I understand that this is a
> complex issue as we're relying on $developer to be around (or leave enough
> information for those that follow) to determine when it's okay to remove
> the ugliness.
I think it would be fine to have a [workarounds] config section for just
this purpose.
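
A rough sketch of the idea, assuming a hypothetical option name and
hypothetical helper functions (none of these are taken from the patch under
review, and a real nova change would register the option through the
project's configuration machinery rather than a plain dataclass). It shows
the fsync() being gated on a single [workarounds]-style flag, so the
performance cost is only paid where an operator leaves the workaround on:

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass
class Workarounds:
    """Stand-in for a [workarounds] config group (hypothetical)."""
    # Hypothetical option name, used only for illustration.
    fsync_before_convert: bool = True


def fsync_path(path):
    """Ask the kernel to flush data already written to `path` to disk."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def prepare_for_convert(path, workarounds):
    """Flush `path` if the workaround is enabled.

    Returns True when an fsync() was issued, False otherwise. The caller
    would invoke qemu-img convert only after this returns.
    """
    if workarounds.fsync_before_convert:
        fsync_path(path)
        return True
    return False


if __name__ == "__main__":
    # Simulate an image that has just been downloaded from glance.
    with tempfile.NamedTemporaryFile(delete=False) as image:
        image.write(b"\0" * 4096)
        image_path = image.name
    try:
        flushed = prepare_for_convert(image_path, Workarounds())
        print("fsync issued:", flushed)
    finally:
        os.remove(image_path)
```

Keeping the check in one small helper means that removing the workaround
later amounts to deleting the option and the single call site.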
Best,
-jay