[Openstack-operators] Migrating instances in grizzly

Juan José Pavlik Salles jjpavlik at gmail.com
Mon Sep 2 15:51:55 UTC 2013


Hi guys, last Friday I started testing live migration in my Grizzly cloud
with shared storage (GFS2), but I ran into a slightly weird problem:
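
For context, the live-migration-related settings in my nova.conf on both nodes
look roughly like this (I'm typing this part from memory, so treat the exact
values as approximate):

# /etc/nova/nova.conf (relevant excerpt, approximate, from memory)
live_migration_uri = qemu+tcp://%s/system
live_migration_flag = VIR_MIGRATE_UNDEFINE_SOURCE,VIR_MIGRATE_PEER2PEER,VIR_MIGRATE_LIVE
# /var/lib/nova/instances sits on the shared gfs2 mount on both nodes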

This is the status before migrating:

- I have a "p9" instance, also called instance-00000022, running on the
"acelga" compute node.

root@acelga:~/tools# virsh list
 Id    Name                           State
----------------------------------------------------
 6     instance-00000022              running

root@acelga:~/tools#


root@cebolla:~/tool# virsh list
 Id    Name                           State
----------------------------------------------------

root@cebolla:~/tool#

- Here you can see all the info about the instance:

root@cebolla:~/tool# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXXXX --os-auth-url http://172.19.136.1:35357/v2.0 show de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09
+-------------------------------------+-----------------------------------------------------------+
| Property                            | Value                                                     |
+-------------------------------------+-----------------------------------------------------------+
| status                              | ACTIVE                                                    |
| updated                             | 2013-09-02T15:27:39Z                                      |
| OS-EXT-STS:task_state               | None                                                      |
| OS-EXT-SRV-ATTR:host                | acelga                                                    |
| key_name                            | None                                                      |
| image                               | Ubuntu 12.04.2 LTS (1359ca8d-23a2-40e8-940f-d90b3e68bb39) |
| vlan1 network                       | 172.16.16.175                                             |
| hostId                              | 81be94870821e17e327d92e9c80548ffcdd37d24054a235116669f53  |
| OS-EXT-STS:vm_state                 | active                                                    |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000022                                         |
| OS-EXT-SRV-ATTR:hypervisor_hostname | acelga.psi.unc.edu.ar                                     |
| flavor                              | m1.tiny (1)                                               |
| id                                  | de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09                      |
| security_groups                     | [{u'name': u'default'}]                                   |
| user_id                             | 20390b639d4449c18926dca5e038ec5e                          |
| name                                | p9                                                        |
| created                             | 2013-09-02T15:27:06Z                                      |
| tenant_id                           | d1e3aae242f14c488d2225dcbf1e96d6                          |
| OS-DCF:diskConfig                   | MANUAL                                                    |
| metadata                            | {}                                                        |
| accessIPv4                          |                                                           |
| accessIPv6                          |                                                           |
| progress                            | 0                                                         |
| OS-EXT-STS:power_state              | 1                                                         |
| OS-EXT-AZ:availability_zone         | nova                                                      |
| config_drive                        |                                                           |
+-------------------------------------+-----------------------------------------------------------+
root@cebolla:~/tool#

- So I try to move it to the other node, "cebolla":

root@acelga:~/tools# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXXXX --os-auth-url http://172.19.136.1:35357/v2.0 live-migration de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09 cebolla
root@acelga:~/tools# virsh list
 Id    Name                           State
----------------------------------------------------

root@acelga:~/tools#

No error messages at all on the "acelga" compute node so far. If I check the
other node, I can see the instance has been migrated:

root@cebolla:~/tool# virsh list
 Id    Name                           State
----------------------------------------------------
 11    instance-00000022              running

root@cebolla:~/tool#


- BUT... after a few seconds I get this in "acelga"'s nova-compute.log:


2013-09-02 15:35:45.784 4601 DEBUG nova.openstack.common.rpc.common [-] Timed out waiting for RPC response: timed out _error_callback /usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py:628
2013-09-02 15:35:45.790 4601 ERROR nova.utils [-] in fixed duration looping call
2013-09-02 15:35:45.790 4601 TRACE nova.utils Traceback (most recent call last):
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 594, in _inner
2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.f(*self.args, **self.kw)
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3129, in wait_for_live_migration
2013-09-02 15:35:45.790 4601 TRACE nova.utils     migrate_data)
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 3208, in _post_live_migration
2013-09-02 15:35:45.790 4601 TRACE nova.utils     migration)
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 664, in network_migrate_instance_start
2013-09-02 15:35:45.790 4601 TRACE nova.utils     migration)
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 415, in network_migrate_instance_start
2013-09-02 15:35:45.790 4601 TRACE nova.utils     return self.call(context, msg, version='1.41')
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
2013-09-02 15:35:45.790 4601 TRACE nova.utils     return rpc.call(context, self._get_topic(topic), msg, timeout)
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
2013-09-02 15:35:45.790 4601 TRACE nova.utils     return _get_impl().call(CONF, context, topic, msg, timeout)
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 798, in call
2013-09-02 15:35:45.790 4601 TRACE nova.utils     rpc_amqp.get_connection_pool(conf, Connection))
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 612, in call
2013-09-02 15:35:45.790 4601 TRACE nova.utils     rv = list(rv)
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 554, in __iter__
2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.done()
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.gen.next()
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 551, in __iter__
2013-09-02 15:35:45.790 4601 TRACE nova.utils     self._iterator.next()
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 648, in iterconsume
2013-09-02 15:35:45.790 4601 TRACE nova.utils     yield self.ensure(_error_callback, _consume)
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 566, in ensure
2013-09-02 15:35:45.790 4601 TRACE nova.utils     error_callback(e)
2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 629, in _error_callback
2013-09-02 15:35:45.790 4601 TRACE nova.utils     raise rpc_common.Timeout()
2013-09-02 15:35:45.790 4601 TRACE nova.utils Timeout: Timeout while waiting on RPC response.
2013-09-02 15:35:45.790 4601 TRACE nova.utils
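
The way I read that trace (please correct me if I'm wrong): once libvirt finishes
the migration, nova-compute on acelga runs _post_live_migration, which makes a
blocking rpc.call to nova-conductor (network_migrate_instance_start) and never
gets a reply. A rough Python sketch of that pattern, just to illustrate what I
think is happening (this is my own toy code, NOT Nova's):

# Toy sketch (my own code, not Nova's) of the synchronous rpc.call pattern that
# is timing out above: the caller publishes a request and then blocks on a reply
# queue. If nova-conductor never answers, a Timeout is raised and the
# post-migration cleanup that would flip the state back never runs.
import queue

RPC_RESPONSE_TIMEOUT = 60  # roughly what rpc_response_timeout does in nova.conf


class RPCTimeout(Exception):
    pass


def rpc_call(publish_request, reply_queue, timeout=RPC_RESPONSE_TIMEOUT):
    """Send a request and block until a reply arrives or the timeout expires."""
    publish_request()                              # send the message to the conductor topic
    try:
        return reply_queue.get(timeout=timeout)    # wait for the reply
    except queue.Empty:
        raise RPCTimeout("Timeout while waiting on RPC response.")


if __name__ == "__main__":
    replies = queue.Queue()                        # nobody ever puts a reply here
    try:
        rpc_call(lambda: None, replies, timeout=1)
    except RPCTimeout as exc:
        print(exc)                                 # same symptom as acelga's nova-compute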


- And the VM state never changes back to ACTIVE from MIGRATING:


root@cebolla:~/tool# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXX --os-auth-url http://172.19.136.1:35357/v2.0 show de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09
+-------------------------------------+-----------------------------------------------------------+
| Property                            | Value                                                     |
+-------------------------------------+-----------------------------------------------------------+
| status                              | MIGRATING                                                 |
| updated                             | 2013-09-02T15:33:54Z                                      |
| OS-EXT-STS:task_state               | migrating                                                 |
| OS-EXT-SRV-ATTR:host                | acelga                                                    |
| key_name                            | None                                                      |
| image                               | Ubuntu 12.04.2 LTS (1359ca8d-23a2-40e8-940f-d90b3e68bb39) |
| vlan1 network                       | 172.16.16.175                                             |
| hostId                              | 81be94870821e17e327d92e9c80548ffcdd37d24054a235116669f53  |
| OS-EXT-STS:vm_state                 | active                                                    |
| OS-EXT-SRV-ATTR:instance_name       | instance-00000022                                         |
| OS-EXT-SRV-ATTR:hypervisor_hostname | acelga.psi.unc.edu.ar                                     |
| flavor                              | m1.tiny (1)                                               |
| id                                  | de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09                      |
| security_groups                     | [{u'name': u'default'}]                                   |
| user_id                             | 20390b639d4449c18926dca5e038ec5e                          |
| name                                | p9                                                        |
| created                             | 2013-09-02T15:27:06Z                                      |
| tenant_id                           | d1e3aae242f14c488d2225dcbf1e96d6                          |
| OS-DCF:diskConfig                   | MANUAL                                                    |
| metadata                            | {}                                                        |
| accessIPv4                          |                                                           |
| accessIPv6                          |                                                           |
| OS-EXT-STS:power_state              | 1                                                         |
| OS-EXT-AZ:availability_zone         | nova                                                      |
| config_drive                        |                                                           |
+-------------------------------------+-----------------------------------------------------------+
root@cebolla:~/tool#
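
Side note: to get the instance usable again while I keep debugging, I'm assuming
(correct me if this is a bad idea) that I can reset its state by hand with
something like:

nova reset-state --active de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09

but that wouldn't fix the root cause, and nova would still think the instance
lives on acelga.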


Funny fact:
- The VM still answers ping after the migration, so I think that part is good.

Any ideas about this problem? At first I thought it could be related to a
connectivity problem between the nodes, but the VM migrates completely at the
hypervisor level; somehow there seems to be some "instance has been migrated"
ACK missing.
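
My next step (unless someone has a better idea) is to double-check that
nova-conductor on the controller is actually alive and consuming from RabbitMQ
while the migration finishes, with something like:

rabbitmqctl list_queues name messages consumers | grep conductor
tail -f /var/log/nova/nova-conductor.log

If the conductor queue keeps piling up messages with no consumers, that would
at least explain why acelga never gets its reply.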


-- 
Pavlik Salles Juan José

