[Openstack-operators] missing base images and spinning qemu-nbd processes?

Michael Still mikal at stillhq.com
Sat Apr 12 00:25:02 UTC 2014


Does a grep of the logs with the instance UUID reveal any activity on
the instance which might provide a pointer?
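
A minimal sketch of that search, in case it helps. The function name
grep_instance and the "*.log" pattern are illustrative assumptions;
point it at wherever your nova logs live (often /var/log/nova on the
Ubuntu packages, but check your own config).

```python
from pathlib import Path


def grep_instance(log_dir, instance_uuid, pattern="*.log"):
    """Yield (file name, line number, line) for every log line mentioning the UUID.

    Hypothetical helper: log_dir and pattern are assumptions, not nova settings.
    """
    needle = instance_uuid.lower()
    for path in sorted(Path(log_dir).glob(pattern)):
        # errors="replace" keeps the scan going if a log has stray bytes in it
        with path.open(errors="replace") as fh:
            for lineno, line in enumerate(fh, 1):
                if needle in line.lower():
                    yield path.name, lineno, line.rstrip("\n")
```

Calling list(grep_instance("/var/log/nova", "<instance uuid>")) returns
the hits in file order. Rotated or compressed logs are not covered by
this sketch.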

Michael

On Sat, Apr 12, 2014 at 5:57 AM, Jonathan Proulx <jon at jonproulx.com> wrote:
> Hi All,
>
> I'm running Ubuntu 12.04 + Havana (2013.2.2) from the cloud archive.
>
> Over the past few days I've noticed a number of my nodes (5% ish) have
> been spending a lot of cpu time in 'system' state.  This seems to be
> related to 'qemu-nbd -c' processes that are spinning madly, mostly on
> disks from deleted instances. 'kill -9' seems the only way to get them
> to stop.
>
> Today I caught one that was spinning but on a file & instance that
> actually existed.  Turns out the base image that the qcow2 file in the
> qemu-nbd command line referenced was missing.
>
> /var/lib/nova/instances is on local disk on the node (not on a shared
> filesystem). Grepping the nova-compute logs I see recent references to
> the base image being "active" and not eligible for removal; there are
> many similar older references (it's a popular base).  I can't find any
> point where it was referenced in the logs but wasn't active.
>
> The instance in question was launched last night and was functioning
> *mostly* normally.  It should have had a 16G root based on instance
> type, though the root was much smaller (same size as if the instance
> type had specified a 0 size root).  Presumably this is because the nbd
> mapping never happened properly to grow the rootfs?  But if that's the
> case, and the base was missing prior to launch, I don't see where the
> running OS came from.
>
> Any guesses what is going on or best recovery practices?  For now I
> manually copied the base image from the glance store to the local file
> it was expected in (setting owner & perms to match others), which
> seems to work.
>
> There haven't been any system or config level changes in the past
> couple months, though I did recently refresh the base image in
> question (and most of my other public base images).
>
> -Jon
>
> _______________________________________________
> OpenStack-operators mailing list
> OpenStack-operators at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
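
To find other instance disks whose base has gone missing before
qemu-nbd trips over them, something like the following might work. It
is only a sketch: it reads the backing file name straight out of the
qcow2 header (the same value 'qemu-img info' reports as "backing
file"), and the helper names are made up for illustration.

```python
import os
import struct

QCOW2_MAGIC = b"QFI\xfb"


def qcow2_backing_file(path):
    """Return the backing file name stored in a qcow2 header, or None if unset."""
    with open(path, "rb") as fh:
        header = fh.read(20)
        if len(header) < 20 or header[:4] != QCOW2_MAGIC:
            raise ValueError(f"{path} is not a qcow2 image")
        # Big-endian fields after the magic:
        # version (u32), backing_file_offset (u64), backing_file_size (u32)
        _version, offset, size = struct.unpack(">IQI", header[4:20])
        if offset == 0 or size == 0:
            return None
        fh.seek(offset)
        return fh.read(size).decode("utf-8", errors="replace")


def missing_backing_file(path):
    """Return the backing file path if it does not exist on disk, else None."""
    backing = qcow2_backing_file(path)
    if backing is None:
        return None
    if not os.path.isabs(backing):
        # Relative names are resolved against the image's own directory
        backing = os.path.join(os.path.dirname(os.path.abspath(path)), backing)
    return None if os.path.exists(backing) else backing
```

Running missing_backing_file() over each disk file under
/var/lib/nova/instances would list the bases that need restoring. It
only checks the first level of the backing chain.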



-- 
Rackspace Australia


