[Openstack-security] [Bug 1713783] Re: After failed evacuation the recovered source compute tries to delete the instance

OpenStack Infra 1713783 at bugs.launchpad.net
Fri Apr 20 05:56:02 UTC 2018


Reviewed:  https://review.openstack.org/518733
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=604954a70a5dbf6a1bf79d8b67e8d92c2bf46386
Submitter: Zuul
Branch:    stable/ocata

commit 604954a70a5dbf6a1bf79d8b67e8d92c2bf46386
Author: Előd Illés <elod.illes at ericsson.com>
Date:   Wed Aug 30 16:54:36 2017 +0200

    Set error state after failed evacuation
    
    When evacuation fails with NoValidHost, the migration status remains
    'accepted' instead of 'error'. This causes a problem when the compute
    service starts up again and looks for evacuations with status 'accepted',
    as it then removes the local instances for those evacuations even though
    the instance was never actually evacuated to another host.
    
    Conflicts:
          nova/conductor/manager.py
    
    NOTE(mriedem): The conflict is due to not having change
    I6590f0eda4ec4996543ad40d8c2640b83fc3dd9d in Ocata.
    
    Change-Id: I06d78c744fa75ae5f34c5cfa76bc3c9460767b84
    Closes-Bug: #1713783
    (cherry picked from commit a8ebf5f1aac080854704e27146e8c98b053c6224)
    (cherry picked from commit a3f286f43d866cd343d26d9bafadecab1c225e4b)
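
The change described in the commit message can be illustrated with a minimal sketch. This is not the nova code: `Migration`, `NoValidHost`, `schedule_destination` and `evacuate` are hypothetical stand-ins invented for illustration. The only behavior taken from the commit message is that a NoValidHost failure should move the migration from 'accepted' to 'error'.

```python
from dataclasses import dataclass


class NoValidHost(Exception):
    """Stand-in for the scheduler failure named in the commit message."""


@dataclass
class Migration:
    # Hypothetical, simplified record; the real nova object has more fields.
    instance_uuid: str
    source_host: str
    status: str = "accepted"


def schedule_destination(candidate_hosts):
    """Toy scheduler: return the first candidate or raise NoValidHost."""
    if not candidate_hosts:
        raise NoValidHost("No valid host was found.")
    return candidate_hosts[0]


def evacuate(migration, candidate_hosts):
    """Pick a destination; on NoValidHost, record the failure and re-raise."""
    try:
        return schedule_destination(candidate_hosts)
    except NoValidHost:
        # Without this assignment the status would stay 'accepted'.
        migration.status = "error"
        raise
```

In this sketch the exception is re-raised after the status update, so the caller still sees the scheduling failure while the migration record no longer looks like a pending evacuation.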


** Changed in: nova/ocata
       Status: In Progress => Fix Committed

-- 
You received this bug notification because you are a member of OpenStack
Security SIG, which is subscribed to OpenStack.
https://bugs.launchpad.net/bugs/1713783

Title:
  After failed evacuation the recovered source compute tries to delete
  the instance

Status in OpenStack Compute (nova):
  Fix Released
Status in OpenStack Compute (nova) ocata series:
  Fix Committed
Status in OpenStack Compute (nova) pike series:
  Fix Committed
Status in OpenStack Security Advisory:
  Won't Fix

Bug description:
  Description
  ===========
  After a failed evacuation attempt, the status of the migration is 'accepted' instead of 'failed', so when the source compute is recovered, the compute manager tries to delete the instance from the source host. However, a secondary fault prevents deleting the allocation in placement, so the actual deletion of the instance fails as well.
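
The failure mode described above can be sketched as follows. This is a simplified illustration, not the nova compute manager: `evacuated_instances_to_delete` and the dictionary layout are hypothetical. The only rule taken from the report is that a restarted compute treats migration records with status 'accepted' for its own host as instances to remove locally.

```python
def evacuated_instances_to_delete(migrations, host):
    """Return instance UUIDs a restarted compute would remove from `host`."""
    return [
        m["instance_uuid"]
        for m in migrations
        if m["source_host"] == host and m["status"] == "accepted"
    ]


migrations = [
    # Evacuation that failed with NoValidHost but was left as 'accepted'.
    {"instance_uuid": "stale", "source_host": "host1", "status": "accepted"},
    # Same kind of failure, recorded with a non-'accepted' status.
    {"instance_uuid": "marked", "source_host": "host1", "status": "error"},
]

to_delete = evacuated_instances_to_delete(migrations, "host1")
```

Under these assumptions only the record left in 'accepted' is selected, which is why a migration that never left the source host still leads to a local delete attempt.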

  Steps to reproduce
  ==================
  The following functional test reproduces the bug: https://review.openstack.org/#/c/498482/
  What it does: it initiates an evacuation when no valid host is available, so the evacuation fails, but the compute manager still tries to delete the instance.
  Logs:

      2017-08-29 19:11:15,751 ERROR [oslo_messaging.rpc.server] Exception during message handling
      NoValidHost: No valid host was found. There are not enough hosts available.
      2017-08-29 19:11:16,103 INFO [nova.tests.functional.test_servers] Running periodic for compute1 (host1)
      2017-08-29 19:11:16,115 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/4e8e23ff-0c52-4cf7-8356-d9fa88536316/aggregates" status: 200 len: 18 microversion: 1.1
      2017-08-29 19:11:16,120 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/4e8e23ff-0c52-4cf7-8356-d9fa88536316/inventories" status: 200 len: 401 microversion: 1.0
      2017-08-29 19:11:16,131 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/4e8e23ff-0c52-4cf7-8356-d9fa88536316/allocations" status: 200 len: 152 microversion: 1.0
      2017-08-29 19:11:16,138 INFO [nova.compute.resource_tracker] Final resource view: name=host1 phys_ram=8192MB used_ram=1024MB phys_disk=1028GB used_disk=1GB total_vcpus=10 used_vcpus=1 pci_stats=[]
      2017-08-29 19:11:16,146 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/4e8e23ff-0c52-4cf7-8356-d9fa88536316/aggregates" status: 200 len: 18 microversion: 1.1
      2017-08-29 19:11:16,151 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/4e8e23ff-0c52-4cf7-8356-d9fa88536316/inventories" status: 200 len: 401 microversion: 1.0
      2017-08-29 19:11:16,152 INFO [nova.tests.functional.test_servers] Running periodic for compute2 (host2)
      2017-08-29 19:11:16,163 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/531b1ce8-def1-455d-95b3-4140665d956f/aggregates" status: 200 len: 18 microversion: 1.1
      2017-08-29 19:11:16,168 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/531b1ce8-def1-455d-95b3-4140665d956f/inventories" status: 200 len: 401 microversion: 1.0
      2017-08-29 19:11:16,176 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/531b1ce8-def1-455d-95b3-4140665d956f/allocations" status: 200 len: 54 microversion: 1.0
      2017-08-29 19:11:16,184 INFO [nova.compute.resource_tracker] Final resource view: name=host2 phys_ram=8192MB used_ram=512MB phys_disk=1028GB used_disk=0GB total_vcpus=10 used_vcpus=0 pci_stats=[]
      2017-08-29 19:11:16,192 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/531b1ce8-def1-455d-95b3-4140665d956f/aggregates" status: 200 len: 18 microversion: 1.1
      2017-08-29 19:11:16,197 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/531b1ce8-def1-455d-95b3-4140665d956f/inventories" status: 200 len: 401 microversion: 1.0
      2017-08-29 19:11:16,198 INFO [nova.tests.functional.test_servers] Finished with periodics
      2017-08-29 19:11:16,255 INFO [nova.api.openstack.requestlog] 127.0.0.1 "GET /v2.1/6f70656e737461636b20342065766572/servers/5058200c-478e-4449-88c1-906fdd572662" status: 200 len: 1875 microversion: 2.53 time: 0.056198
      2017-08-29 19:11:16,262 INFO [nova.api.openstack.requestlog] 127.0.0.1 "GET /v2.1/6f70656e737461636b20342065766572/os-migrations" status: 200 len: 373 microversion: 2.53 time: 0.004618
      2017-08-29 19:11:16,280 INFO [nova.api.openstack.requestlog] 127.0.0.1 "PUT /v2.1/6f70656e737461636b20342065766572/os-services/c269bc74-4720-4de4-a6e5-889080b892a0" status: 200 len: 245 microversion: 2.53 time: 0.016442
      2017-08-29 19:11:16,281 INFO [nova.service] Starting compute node (version 16.0.0)
      2017-08-29 19:11:16,296 INFO [nova.compute.manager] Deleting instance as it has been evacuated from this host

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1713783/+subscriptions



