[openstack-dev] [Magnum] Consistent functional test failures

Clark Boylan cboylan at sapwetik.org
Thu Aug 13 23:58:21 UTC 2015


On Thu, Aug 13, 2015, at 03:13 AM, Tom Cammann wrote:
> Hi Team,
> 
> Wanted to let you know why we are having consistent functional test 
> failures in the gate.
> 
> This is being caused by Nova returning "No valid host" to heat:
> 
> 2015-08-13 08:26:16.303 31543 INFO heat.engine.resource [-] CREATE: 
> Server "kube_minion" [12ab45ef-0177-4118-9ba0-3fffbc3c1d1a] Stack 
> "testbay-y366b2atg6mm-kube_minions-cdlfyvhaximr-0-dufsjliqfoet" 
> [b40f0c9f-cb54-4d75-86c3-8a9f347a27a6]
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource Traceback (most 
> recent call last):
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
> "/opt/stack/new/heat/heat/engine/resource.py", line 625, in
> _action_recorder
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     yield
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
> "/opt/stack/new/heat/heat/engine/resource.py", line 696, in _do_action
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     yield 
> self.action_handler_task(action, args=handler_args)
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
> "/opt/stack/new/heat/heat/engine/scheduler.py", line 320, in wrapper
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     step = 
> next(subtask)
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
> "/opt/stack/new/heat/heat/engine/resource.py", line 670, in 
> action_handler_task
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     while not 
> check(handler_data):
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
> "/opt/stack/new/heat/heat/engine/resources/openstack/nova/server.py", 
> line 759, in check_create_complete
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     return 
> self.client_plugin()._check_active(server_id)
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource File 
> "/opt/stack/new/heat/heat/engine/clients/os/nova.py", line 232, in 
> _check_active
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource     'code': 
> fault.get('code', _('Unknown'))
> 2015-08-13 08:26:16.303 31543 ERROR heat.engine.resource 
> ResourceInError: Went to status ERROR due to "Message: No valid host was 
> found. There are not enough hosts available., Code: 500"
> 
> And this in turn is being caused by the compute instance running out of 
> disk space:
> 
> 2015-08-13 08:26:15.216 DEBUG nova.filters 
> [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Starting with 1 
> host(s) get_filtered_objects /opt/stack/new/nova/nova/filters.py:70
> 2015-08-13 08:26:15.217 DEBUG nova.filters 
> [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter 
> RetryFilter returned 1 host(s) get_filtered_objects 
> /opt/stack/new/nova/nova/filters.py:84
> 2015-08-13 08:26:15.217 DEBUG nova.filters 
> [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter 
> AvailabilityZoneFilter returned 1 host(s) get_filtered_objects 
> /opt/stack/new/nova/nova/filters.py:84
> 2015-08-13 08:26:15.217 DEBUG nova.filters 
> [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter RamFilter 
> returned 1 host(s) get_filtered_objects 
> /opt/stack/new/nova/nova/filters.py:84
> 2015-08-13 08:26:15.218 DEBUG nova.scheduler.filters.disk_filter 
> [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] 
> (devstack-trusty-rax-dfw-4299602, devstack-trusty-rax-dfw-4299602) 
> ram:5172 disk:17408 io_ops:0 instances:1 does not have 20480 MB usable 
> disk, it only has 17408.0 MB usable disk. host_passes 
> /opt/stack/new/nova/nova/scheduler/filters/disk_filter.py:60
> 2015-08-13 08:26:15.218 INFO nova.filters 
> [req-e5bb52cb-387e-4638-911e-8c72aa1b6400 admin admin] Filter DiskFilter 
> returned 0 hosts
> 
> For now a recheck seems to work about 1 in 2, so we can still land
> patches.
> 
> The fix for this could be to clean up our Magnum devstack install more 
> aggressively, which might be as simple as cleaning up the images we use, 
> or get infra to provide our tests with a larger disk size. I will 
> probably test out a patch today which cleans up the images we use in 
> devstack to see if that helps.
> 
It is not trivial to provide your tests with more disk, as we are using
the flavors appropriate for our RAM and CPU needs and are constrained by
quotas in the clouds we use. Do you really need 20GB nested test
instances? The VMs these jobs run on have ~13GB images, which is almost
half the size of the instances you are trying to boot there. I would
definitely look into trimming the disk requirements for the nested VMs
before anything else.

As for rechecks working ~50% of the time: hpcloud gives us more disk
than rackspace, which is likely why you see about half the runs fail and
half pass. The runs that pass probably ran on hpcloud VMs.
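
To make the arithmetic in the quoted scheduler log concrete, here is a
minimal sketch of the comparison the disk filter is reporting. It is a
simplified stand-in, not Nova's actual DiskFilter code: the function
name is made up for illustration, and it ignores details such as
allocation ratios. The 17408 MB and 20480 MB figures come from the log
above; the 10 GB flavor is a hypothetical value, not something proposed
in this thread.

```python
def host_passes_disk_check(usable_disk_mb: float, requested_disk_mb: float) -> bool:
    """Simplified stand-in for a disk-capacity scheduler filter.

    A host passes only if its usable disk covers the requested disk.
    """
    return usable_disk_mb >= requested_disk_mb


# Values taken from the nova.scheduler.filters.disk_filter log line.
USABLE_DISK_MB = 17408.0
REQUESTED_DISK_MB = 20 * 1024  # 20GB nested instance = 20480 MB

# The 20GB request does not fit, so the filter returns 0 hosts.
print(host_passes_disk_check(USABLE_DISK_MB, REQUESTED_DISK_MB))  # False

# Hypothetical smaller nested flavor (10GB), for illustration only.
print(host_passes_disk_check(USABLE_DISK_MB, 10 * 1024))  # True
```

With these numbers the request is 3072 MB short on the rackspace node,
so trimming the nested VM disk would let the same host pass.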

Clark


