[Openstack] nova-compute won't restart (on some nodes) after Grizzly upgrade

Michael Still mikal at stillhq.com
Tue Aug 13 07:59:07 UTC 2013


Jonathan, sorry for the slow reply. I had a baby on Friday last week
instead of keeping up with email. I promise it won't happen again. ;)

Did you manage these instances in virsh manually at all as part of the
upgrade? If not, I'd love you to file a bug with a log to show the
problem.

Thanks,
Michael

On Sun, Aug 11, 2013 at 10:17 PM, Jonathan Proulx <jon at jonproulx.com> wrote:
> Hi Michael,
>
> Thanks for the offer.  I'd be happy to paste up some compute logs if you
> have an interest, but I got around the issue with:
>
> virsh list --all
>
> and then 'virsh undefine' for all deleted instances on each host.  I've used
> hypervisors directly and high-level stuff like OpenStack (and others) but
> never spent much time at the libvirt layer, so that was a bit of new info for
> me. It apparently came from the operators list not long after I sent my
> query here.
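
A minimal sketch of the cleanup described above, for anyone who would rather script it than run it by hand on each host. It only parses the text that `virsh list --all` prints and builds the matching `virsh undefine` commands; it does not run anything. The function names and the `deleted_names` set are hypothetical: the list of deleted instance names has to come from the operator, since the thread does not say how it was assembled. Limiting the result to domains in the "shut off" state is an extra precaution assumed here, not something stated in the thread.

```python
import shlex


def parse_virsh_list(output):
    """Return (name, state) pairs from the text printed by `virsh list --all`."""
    domains = []
    for line in output.splitlines():
        # Columns are Id, Name, State; State may contain a space ("shut off").
        fields = line.split(None, 2)
        # Skip blank lines, the dashed rule, and the "Id Name State" header.
        if len(fields) < 3 or fields[0] == "Id":
            continue
        domains.append((fields[1], fields[2].strip()))
    return domains


def undefine_commands(output, deleted_names):
    """Build one `virsh undefine` command per leftover, shut-off domain.

    deleted_names is a hypothetical, operator-supplied set of libvirt domain
    names that are known to belong to deleted instances.
    """
    return [
        "virsh undefine " + shlex.quote(name)
        for name, state in parse_virsh_list(output)
        if name in deleted_names and state == "shut off"
    ]
```

Printing the commands first and reviewing them before running any of them keeps a typo in the name list from undefining a live instance.
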
>
> Thanks,
> -Jon
>
>
> On Wed, Aug 7, 2013 at 9:02 PM, Michael Still <mikal at stillhq.com> wrote:
>>
>> Jonathan,
>>
>> This would be easier to debug with a nova-compute log. Are you willing
>> to post one somewhere that people could take a look at?
>>
>> Thanks,
>> Michael
>>
>> On Thu, Aug 8, 2013 at 7:35 AM, Jonathan Proulx <jon at jonproulx.com> wrote:
>> > Hi All,
>> >
>> > Apologies to those who saw this on the operators list earlier. There is a
>> > bit of new info here, and having gotten no response there I thought I'd
>> > take it to a wider audience...
>> >
>> >
>> > I'm almost through my Grizzly upgrade.  I'd upgraded everything except
>> > nova-compute before upgrading that (Ubuntu 12.04 Cloud Archive packages).
>> >
>> > On most nodes the nova-compute service upgraded and restarted properly,
>> > but on some it immediately exits with:
>> >
>> > CRITICAL nova [-] 'instance_type_memory_mb'
>> >
>> > It would seem like this is https://code.launchpad.net/bugs/1161022, but
>> > the fix for that was released in March and I've verified it is in the
>> > packaged version I'm using.
>> >
>> > The referenced bug involves the DB migration only updating non-deleted
>> > instances in the instance_system_metadata table, and the patch skips the
>> > lookups that are broken (and irrelevant) for deleted instances.
>> >
>> > Tracing the DB calls from the host shows it is trying to do lookups for
>> > instances that were deleted last October, which is a bit surprising, as
>> > it has run thousands of instances since and it's not looking those up.
>> >
>> > It is noteworthy that this is around the time I upgraded from Essex ->
>> > Folsom, so it's possible their state is weirder than most, having run
>> > through that update.
>> >
>> > There were directories for the instances in question in
>> > /var/lib/nova/instances, so I thought "Aha!" and moved them, but on
>> > restart I still get the same failure and same DB query for the old
>> > instances.  Where is nova getting the idea it should look these up & how
>> > can I stop it?
>> >
>> > I've gone so far as to generate instance_type_<foo> entries in the
>> > instance_system_metadata table for all instances ever on my deployment
>> > (about 500k), but I still only have the cryptic "CRITICAL nova [-]
>> > 'instance_type_memory_mb'" error and a failure to start, so clearly I'm
>> > chasing the wrong problem somehow.
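
On why the error is so terse: the message has the shape Python produces when a KeyError is turned into a string, which gives only the quoted name of the missing key and no hint about which dictionary or which instance was involved. The sketch below assumes that is what happened here; the thread contains no traceback to confirm it. `describe_failure` and the empty dictionary are hypothetical stand-ins, not nova code.

```python
def describe_failure(metadata, key):
    """Look up key and, on failure, format the error the way a bare
    'log the exception as a string' handler would."""
    try:
        return metadata[key]
    except KeyError as exc:
        # str() of a KeyError is just the repr of the missing key, so the
        # log line carries the key name in quotes and nothing else.
        return "CRITICAL nova [-] %s" % exc


# An instance record with no instance_type_* entries reproduces the
# same shape of message as the one quoted in this thread.
print(describe_failure({}, "instance_type_memory_mb"))
```

If that assumption holds, the message only says that some lookup of that key failed; it does not say the key is missing from the database for every instance.
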
>> >
>> > Help?
>> > -Jon
>> >
>> > _______________________________________________
>> > Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> > Post to     : openstack at lists.openstack.org
>> > Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
>> >
>>
>>
>>
>> --
>> Rackspace Australia
>
>



-- 
Rackspace Australia



