[Openstack-operators] missing base images and spinning qemu-nbd processes?

Francois Deppierraz francois at ctrlaltdel.ch
Mon Apr 14 06:54:44 UTC 2014


Hi Jon,

I had the same problem here and ended up disabling the injection
functionality altogether. Here is the actual Puppet config I used:

  # Disable file injection
  # This is the default in Icehouse
  # https://blueprints.launchpad.net/nova/+spec/disable-file-injection-by-default

  nova_config { 'DEFAULT/libvirt_inject_key':
    value => false,
  }

  nova_config { 'DEFAULT/libvirt_inject_partition':
    value => '-2',
  }
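
To confirm the change landed in the rendered nova.conf, something like
the following Python sketch could be used. It is only an illustration:
the function name is made up, the fallback values are what I assume the
pre-Icehouse defaults to be, and the file path in the usage comment is
the usual location rather than something I checked on every distro.

```python
import configparser


def injection_disabled(conf_text):
    """Return True if the given nova.conf text disables key and file injection."""
    # nova.conf may repeat multi-valued options and contain '%' characters,
    # so turn off strict duplicate checking and value interpolation.
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.read_string(conf_text)
    # Fallbacks are assumed pre-Icehouse defaults (injection enabled).
    inject_key = parser.get("DEFAULT", "libvirt_inject_key", fallback="true")
    partition = parser.get("DEFAULT", "libvirt_inject_partition", fallback="1")
    return inject_key.strip().lower() == "false" and partition.strip() == "-2"


# Usage (path is an assumption):
#   with open("/etc/nova/nova.conf") as f:
#       print(injection_disabled(f.read()))
```

A partition value of -2 is what tells nova to skip injection entirely,
so the check requires both settings rather than just the key one.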

The only drawback seems to be that non-cloud-init images won't get any
configuration. However, given the potential security impact of mounting
a guest image directly on the hypervisor, it feels much safer to just
disable it.

Hope this helps,

François

On 11. 04. 14 21:57, Jonathan Proulx wrote:
> Hi All,
> 
> I'm running Ubuntu 12.04 + Havana (2013.2.2) from the cloud archive.
> 
> Over the past few days I've noticed a number of my nodes (5% ish) have
> been spending a lot of CPU time in 'system' state.  This seems to be
> related to 'qemu-nbd -c' processes that are spinning madly, mostly on
> disks from deleted instances. 'kill -9' seems to be the only way to get
> them to stop.
> 
> Today I caught one that was spinning but on a file & instance that
> actually existed.  Turns out the base image that the qcow2 file in the
> qemu-nbd command line referenced was missing.
> 
> /var/lib/nova/instances is on local disk on the node (not on a shared
> filesystem). Grepping the nova-compute logs I see recent references to
> the base image being "active" and not eligible for removal, and there
> are many similar older references (it's a popular base).  I can't find
> any point where it was referenced in the logs but wasn't active.
> 
> The instance in question was launched last night and was functioning
> *mostly* normally.  It should have had a 16G root based on instance
> type, though the root was much smaller (same size as if the instance
> type had specified a 0 size root).  Presumably this is because the nbd
> mapping never happened properly to grow the rootfs?  But if that's the
> case, and the base was missing prior to launch, I don't see where the
> running OS came from.
> 
> Any guesses what is going on or best recovery practices?  For now I
> manually copied the base image from the glance store to the local file
> it was expected in (setting owner & perms to match others), which
> seems to work.
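
For spotting other instances in the same state, a rough Python sketch
like the one below could read the backing file name straight out of the
qcow2 header and report it when it is gone from disk. This is an
illustration only: the function names are made up, it relies on the
standard qcow2 header layout (magic, version, backing file offset and
size, all big-endian), and the instance disk path in the usage comment
is a placeholder.

```python
import os
import struct

QCOW_MAGIC = b"QFI\xfb"


def backing_file(path):
    """Return the backing file name recorded in a qcow2 header, or None."""
    with open(path, "rb") as f:
        header = f.read(20)
        if len(header) < 20 or header[:4] != QCOW_MAGIC:
            raise ValueError("%s is not a qcow2 image" % path)
        # Big-endian: version (u32), backing_file_offset (u64),
        # backing_file_size (u32).
        _version, offset, size = struct.unpack(">IQI", header[4:20])
        if offset == 0:
            return None  # image has no backing file
        f.seek(offset)
        return f.read(size).decode("utf-8", errors="replace")


def missing_backing_file(path):
    """Return the backing file path if it is recorded but absent, else None."""
    backing = backing_file(path)
    if backing is None:
        return None
    if not os.path.isabs(backing):
        # Relative names are resolved against the image's own directory.
        backing = os.path.join(os.path.dirname(path), backing)
    return None if os.path.exists(backing) else backing


# Usage (placeholder path):
#   print(missing_backing_file("/var/lib/nova/instances/<instance>/disk"))
```

Reading the header directly avoids attaching the image to anything,
which matters here since the nbd attach is the step that hangs.
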
> 
> There haven't been any system or config level changes in the past
> couple months, though I did recently refresh the base image in
> question (and most of my other public base images).
> 
> -Jon
> 
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
> 



