[openstack-dev] [nova] Silent nova fails

Andrew Laski andrew at lascii.com
Mon Aug 17 14:03:54 UTC 2015


On 08/17/15 at 02:59pm, Timofei Durakov wrote:
>Hello,
>
>In the current design there are places where nova fails while executing a
>user's CLI command, but no error message is produced apart from some logs
>in nova-compute [1]. The problem is that the compute node sends no response
>to the conductor, because an RPC cast is used.
>
>To fix this, nova should make a synchronous call before the operation itself
>to verify that it is valid. E.g. here is my patch that fixes this problem in
>the resize operation [2].

I think that Nova should avoid synchronous calls whenever possible.
They often lead to timeouts, and they force us to be very careful
about locking and idempotence: the natural reaction to a timeout
is to try again, but often the original operation is still in progress.
And when there is a timeout or a disconnect, you've lost the immediate
feedback you were hoping to gain.  I think that rather than trying to
treat requests as local operations we should embrace the asynchronous
nature of the distributed system and work on a robust way to provide
feedback that works with, rather than against, how Nova is architected.
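
To make the retry hazard concrete, here is a toy sketch in plain Python.
It is not Nova code: ToyCompute, call_with_retry and the "vm-1" id are
hypothetical names, and a thread pool stands in for a synchronous RPC
call with a timeout.  The caller times out, retries, and the slow
operation ends up running twice while the caller still has no answer.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError


class ToyCompute:
    """Hypothetical stand-in for a compute service with a slow resize."""

    def __init__(self):
        self.release = threading.Event()        # set when the "resize" may finish
        self.started = threading.Semaphore(0)   # signals that a run has begun
        self.resize_runs = 0
        self._lock = threading.Lock()

    def resize(self, instance_id):
        with self._lock:
            self.resize_runs += 1
        self.started.release()
        self.release.wait()                     # simulate a long-running operation
        return "resized " + instance_id


def call_with_retry(pool, compute, instance_id, timeout, attempts):
    """Blocking call with a timeout; on timeout, naively try again."""
    for _ in range(attempts):
        future = pool.submit(compute.resize, instance_id)
        compute.started.acquire()               # the operation really is running
        try:
            return future.result(timeout=timeout)
        except TimeoutError:
            continue                            # the first run is still in progress
    return None


compute = ToyCompute()
with ThreadPoolExecutor(max_workers=2) as pool:
    outcome = call_with_retry(pool, compute, "vm-1", timeout=0.01, attempts=2)
    compute.release.set()                       # let both stuck runs finish

print(outcome, compute.resize_runs)
```

Without locking or an idempotent resize, both runs would act on the same
instance, which is the extra care a synchronous design demands.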

There is already a framework in place for doing this called "instance
actions", which is visible via the Nova API.  There is also a longer-term
solution under discussion called tasks.  With a resize task exposed in
the API, a user could check its status, see whether it had succeeded or
failed, and get a relevant error message for a failure.
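
As a rough illustration of how a client might consume instance actions,
the sketch below inspects one action's detail payload.  It assumes the
response shape of the os-instance-actions API as I recall it: a
top-level "instanceAction" with an "events" list whose "result" is
"Success", "Error", or null while the event is still running.  The
sample payload, request id and traceback text are made up for the
example, and summarize_action is a hypothetical helper.

```python
def summarize_action(payload):
    """Return (state, messages) for one instance-action detail response."""
    action = payload["instanceAction"]
    events = action.get("events", [])
    errors = [e for e in events if e.get("result") == "Error"]
    if errors:
        # Prefer the recorded traceback; fall back to the event name.
        return "failed", [e.get("traceback") or e["event"] for e in errors]
    if not events or any(e.get("result") is None for e in events):
        return "in progress", []
    return "succeeded", []


# Hand-written sample, not captured from a real deployment.
sample = {
    "instanceAction": {
        "action": "resize",
        "request_id": "req-hypothetical",
        "events": [
            {
                "event": "compute_prep_resize",
                "result": "Error",
                "traceback": "illustrative error message only",
            },
        ],
    }
}

state, messages = summarize_action(sample)
print(state, messages)
```

A client would poll the action for its request until the state is no
longer "in progress", instead of blocking on the original API call.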

>
>So, I would like to get feedback about such hypervisor checks before
>operations. Nova already makes these checks during the live-migration
>process: the conductor calls the compute manager [3], which also consults
>the driver [4]. I think we should use the same logic in the resize operation.
>
>Timofey.
>
>[1] https://bugs.launchpad.net/nova/+bug/1455460
>
>[2] https://review.openstack.org/195088
>
>[3]
>https://github.com/openstack/nova/blob/master/nova/conductor/tasks/live_migrate.py#L144
>
>[4]
>https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L5157




