[Openstack-security] [Bug 1350766] Re: Race condition: compute intermittently corrupts base images on download from glance

Michael Steffens michael_steffens at posteo.de
Mon Aug 4 11:41:00 UTC 2014


** Description changed:

  Under certain conditions, which I happen to meet often on my Icehouse
  single node setup, uploaded images or snapshots fail to boot. See also
  https://ask.openstack.org/en/question/42804/icehouse-how-to-boot-a-snapshot-from-a-running-instance/
  
  Reason: When a QCOW2 image is first instantiated, it is
  
  (1)  downloaded as QCOW2 to /var/lib/nova/instances/_base/IMAGEID.part
  (2)  converted to a RAW format base image at /var/lib/nova/instances/_base/IMAGEID.converted using qemu-img
  
  Step (1) is performed in nova/image/glance.py,
  GlanceImageService.download, using buffered IO, which does not guarantee
  that the resulting data has been written to disk when the file is closed.
  Consequently, the source image file may not be completely written when the
  qemu-img subprocess starts reading it in step (2). Whether the result is
  good or bad depends on the download speed, the file size, and how quickly
  qemu-img can digest its input.
  
  Proposed fix: call fsync on the output file object before returning from
  download. Patch attached.
+ 
+ Security considerations:
+ 
+  * Because the race depends on resources shared between users and tenants
+ (compute node network and filesystem IO), a failure can be triggered
+ across tenants, implying a risk of DoS.
+ 
+  * To make things worse -- with the default setting of not cleaning the
+ image cache -- any corrupted image will remain in the cache until it is
+ replaced by a fresh upload using a new image ID. Affected snapshots remain
+ unusable until they are exported and re-imported manually under better
+ conditions.
+ 
+  * Base image corruptions here are not detected and cannot be caught.
+ Theoretically (a bit esoteric, quite unlikely, but not impossible), an
+ attacker might modulate resource usage precisely enough to create an
+ incompletely written image that boots and runs but has access control
+ information stripped.

-- 
You received this bug notification because you are a member of OpenStack
Security Group, which is subscribed to OpenStack.
https://bugs.launchpad.net/bugs/1350766

Title:
  Race condition: compute intermittently corrupts base images on
  download from glance

Status in OpenStack Compute (Nova):
  New

Bug description:
  Under certain conditions, which I happen to meet often on my Icehouse
  single node setup, uploaded images or snapshots fail to boot. See also
  https://ask.openstack.org/en/question/42804/icehouse-how-to-boot-a-snapshot-from-a-running-instance/

  Reason: When a QCOW2 image is first instantiated, it is

  (1)  downloaded as QCOW2 to /var/lib/nova/instances/_base/IMAGEID.part
  (2)  converted to a RAW format base image at /var/lib/nova/instances/_base/IMAGEID.converted using qemu-img

  Step (1) is performed in nova/image/glance.py,
  GlanceImageService.download, using buffered IO, which does not
  guarantee that the resulting data has been written to disk when the
  file is closed. Consequently, the source image file may not be
  completely written when the qemu-img subprocess starts reading it in
  step (2). Whether the result is good or bad depends on the download
  speed, the file size, and how quickly qemu-img can digest its input.

  Proposed fix: call fsync on the output file object before returning
  from download. Patch attached.
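
  For illustration only, the proposed fix could look roughly like the
  sketch below. This is not the attached patch and not the actual code
  in nova/image/glance.py; the function name write_chunks_durably and
  its parameters are hypothetical. It assumes the image arrives as an
  iterable of byte chunks, flushes Python's own buffer to the operating
  system, and then asks the kernel via os.fsync to commit the data to
  storage before the file is closed and the function returns.

```python
import os


def write_chunks_durably(chunks, path):
    """Write byte chunks to path and force them to disk before returning.

    Hypothetical helper for illustration; not part of nova.
    """
    with open(path, "wb") as data:
        for chunk in chunks:
            data.write(chunk)
        # Push Python's userspace buffer down to the operating system.
        data.flush()
        # Ask the kernel to commit the file's data to the storage device.
        os.fsync(data.fileno())
    return path
```

  A caller would start the conversion of step (2) only after this
  function has returned, so that qemu-img never reads a file that is
  still being written.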

  Security considerations:

   * Because the race depends on resources shared between users and
  tenants (compute node network and filesystem IO), a failure can be
  triggered across tenants, implying a risk of DoS.

   * To make things worse -- with the default setting of not cleaning
  the image cache -- any corrupted image will remain in the cache until
  it is replaced by a fresh upload using a new image ID. Affected
  snapshots remain unusable until they are exported and re-imported
  manually under better conditions.

   * Base image corruptions here are not detected and cannot be caught.
  Theoretically (a bit esoteric, quite unlikely, but not impossible), an
  attacker might modulate resource usage precisely enough to create an
  incompletely written image that boots and runs but has access
  control information stripped.
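
  As a rough sketch of how a truncated download could at least be
  noticed, the downloaded file might be compared against a known
  checksum before conversion. The helper names file_md5 and
  verify_download are hypothetical, and the sketch assumes the caller
  already knows the expected MD5 hex digest of the image (for example
  from the image's metadata, if it carries one). It only checks the
  downloaded source file, not the converted base image.

```python
import hashlib


def file_md5(path, chunk_size=65536):
    """Return the MD5 hex digest of the file at path, read in blocks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """Raise IOError if the file's checksum differs from expected_md5.

    Hypothetical helper for illustration; not part of nova.
    """
    actual = file_md5(path)
    if actual != expected_md5:
        raise IOError("checksum mismatch for %s: expected %s, got %s"
                      % (path, expected_md5, actual))
    return True
```

  A mismatch would indicate that the file on disk differs from what was
  uploaded, so the conversion in step (2) could be skipped and the
  partial file discarded instead of being cached.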

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1350766/+subscriptions



