Open Stack

Thu Sep 6 13:00:13 UTC 2012

Running Essex (as packaged) on Ubuntu 12.04, launching new instances
has stopped working. They almost immediately enter the Error Spawning
state, which is consistent with my experience of resource starvation
in the cluster or scheduling errors. Most recently I had /var/lib/nova
on one of the compute nodes fill up, but the scheduler kept trying to
run new instances there (which is its own issue, but I fixed that for
myself simply by expanding that partition). In that case, though, the
instances were assigned to that node and then failed; in this case
there is no compute node associated with the failed instance.

In my current case all compute nodes seem to have sufficient free
resources (vCPU, memory, and disk). The problem affects all users and
all tenants as far as I can tell, and is not hitting any quota limits.
I do notice this oddness in the compute_nodes table, though I'm not
sure it's related (or exactly what free_disk_gb represents, as it's
clearly not simply local_gb - local_gb_used or it wouldn't have its
own column, and other nodes show larger or smaller free GB than the
simple math would suggest):

mysql> select id,vcpus_used,local_gb,local_gb_used,free_disk_gb from
compute_nodes where free_disk_gb < 0;
+----+------------+----------+---------------+--------------+
| id | vcpus_used | local_gb | local_gb_used | free_disk_gb |
+----+------------+----------+---------------+--------------+
|  3 |         23 |      605 |           214 |          -25 |
+----+------------+----------+---------------+--------------+
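To make the mismatch concrete, here is a minimal sketch of the "simple math" applied to the row above. The function name naive_free_gb and the dict layout are purely illustrative, not anything from Nova; the numbers are copied from the query output. It only shows how far the stored free_disk_gb is from local_gb - local_gb_used, and makes no claim about how Nova actually derives that column.

```python
# Hypothetical helper: the "simple math" from the post, not Nova's own logic.
def naive_free_gb(local_gb, local_gb_used):
    """Free disk as one might naively expect it: total minus used."""
    return local_gb - local_gb_used


# Row copied from the compute_nodes query output above.
row = {
    "id": 3,
    "vcpus_used": 23,
    "local_gb": 605,
    "local_gb_used": 214,
    "free_disk_gb": -25,
}

expected = naive_free_gb(row["local_gb"], row["local_gb_used"])
gap = expected - row["free_disk_gb"]

print("node %d: naive free = %d GB, reported free_disk_gb = %d GB, gap = %d GB"
      % (row["id"], expected, row["free_disk_gb"], gap))
```

For node 3 the naive figure is 391 GB, against a reported -25 GB, so whatever free_disk_gb tracks is off from the simple subtraction by 416 GB on this host.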

Any of this sound familiar to anyone?

Is there a way to trace what the scheduler is doing or failing to do?

Thanks,
-Jon


[Openstack-operators] Launching new instances failing despite free resources
