[Openstack] how to deal with failed compute node

gtt116 gtt116 at 126.com
Thu Oct 18 06:23:11 UTC 2012


Hi guys,

Today, when an instance is terminated, nova-api checks whether the 
nova-compute service is alive. If nova-compute is dead, nova-api just 
deletes the instance from the database, but does not release the fixed 
IP, floating IP, volumes, etc. If the failed nova-compute starts again, 
it will find the erroneously running instance and clean it up. But until 
nova-compute has started, the resources associated with the dead VM 
cannot be used. For example, the fixed IP cannot be associated with 
another VM.

So I found a way to clean up these resources quickly. If nova-api finds 
that nova-compute is dead, it picks another nova-compute that is alive. 
Although the alive nova-compute is not the real host of the VM, it can 
clean up the resources in the database, and even the network, by making 
an RPC call to nova-network. It may raise some exceptions, but it works. 
What do you think about this?
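
To make the idea concrete, here is a rough sketch of the host selection 
only. It is not nova code: ComputeService, is_alive, pick_cleanup_host 
and the 60-second threshold are all made-up names and values for 
illustration, and the real service records and liveness check in nova 
will look different.

```python
import time
from dataclasses import dataclass

# Hypothetical threshold: a service with no heartbeat for this many
# seconds is treated as dead. Not taken from nova's configuration.
SERVICE_DOWN_TIME = 60


@dataclass
class ComputeService:
    """Hypothetical stand-in for a nova-compute service record."""
    host: str
    last_heartbeat: float  # Unix timestamp of the last report


def is_alive(service, now, down_time=SERVICE_DOWN_TIME):
    """A service counts as alive if it reported recently enough."""
    return now - service.last_heartbeat <= down_time


def pick_cleanup_host(instance_host, services, now=None):
    """Choose which nova-compute should handle a terminate request.

    Returns (host, is_real_host). Prefers the instance's own host when
    it is alive; otherwise falls back to any other alive host, which can
    only release database and network resources. Returns (None, False)
    when no compute service is alive.
    """
    now = time.time() if now is None else now
    alive = [s for s in services if is_alive(s, now)]
    for service in alive:
        if service.host == instance_host:
            return service.host, True
    if alive:
        return alive[0].host, False
    return None, False


services = [
    ComputeService("node1", last_heartbeat=0.0),   # dead at now=100
    ComputeService("node2", last_heartbeat=95.0),  # alive at now=100
]
host, is_real_host = pick_cleanup_host("node1", services, now=100.0)
```

When is_real_host is False, the chosen nova-compute should skip anything 
that touches the hypervisor and only release the fixed IP, floating IP 
and volumes, since the VM does not live on that node.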

Why do we have many nova-compute and nova-network services? I think one 
reason is that when one node fails, another can do some of its work.

Best regards,
gtt




