[Openstack-operators] Migrating instances in grizzly

Juan José Pavlik Salles jjpavlik at gmail.com
Tue Sep 3 00:44:09 UTC 2013


I've also found this in nova-conductor.log:

2013-09-02 15:35:27.208 DEBUG nova.openstack.common.rpc.common [req-e0473533-89af-4ff5-b6fa-4b0b6eb50a6d 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] Timed out waiting for RPC response: timed out _error_callback /usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py:628
2013-09-02 15:35:27.222 ERROR nova.openstack.common.rpc.amqp [req-e0473533-89af-4ff5-b6fa-4b0b6eb50a6d 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] Exception during message handling
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 430, in _process_data
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     rval = self.proxy.dispatch(ctxt, version, method, **args)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/dispatcher.py", line 133, in dispatch
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return getattr(proxyobj, method)(ctxt, **kwargs)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 399, in network_migrate_instance_start
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.network_api.migrate_instance_start(context, instance, migration)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/api.py", line 89, in wrapped
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return func(self, context, *args, **kwargs)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/api.py", line 501, in migrate_instance_start
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.network_rpcapi.migrate_instance_start(context, **args)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/rpcapi.py", line 333, in migrate_instance_start
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     version='1.2')
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return rpc.call(context, self._get_topic(topic), msg, timeout)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return _get_impl().call(CONF, context, topic, msg, timeout)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 798, in call
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     rpc_amqp.get_connection_pool(conf, Connection))
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 612, in call
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     rv = list(rv)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 554, in __iter__
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.done()
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.gen.next()
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 551, in __iter__
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self._iterator.next()
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 648, in iterconsume
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     yield self.ensure(_error_callback, _consume)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 566, in ensure
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     error_callback(e)
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 629, in _error_callback
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     raise rpc_common.Timeout()
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp Timeout: Timeout while waiting on RPC response.
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp
2013-09-02 15:35:27.237 ERROR nova.openstack.common.rpc.common [req-e0473533-89af-4ff5-b6fa-4b0b6eb50a6d 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] Returning exception Timeout while waiting on RPC response. to caller
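
To make the call path in a trace like the one above easier to read, the frames can be reduced to a short call chain. This is only a minimal sketch: `FRAME_RE` and `call_chain` are hypothetical helper names, not part of Nova, and the sample text is two frames copied from the trace above, wrapped the way a mail client wraps them.

```python
import re

# Matches the 'File "<path>", line <n>, in <function>' part of a traceback frame.
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\w+)')


def call_chain(log_text):
    """Return (path, function) pairs in call order, outermost frame first."""
    # Collapse all whitespace so frames wrapped across several lines still match.
    flat = " ".join(log_text.split())
    return [(m.group("path"), m.group("func")) for m in FRAME_RE.finditer(flat)]


# Two frames taken from the nova-conductor trace, wrapped as in the email.
sample = '''
TRACE nova.openstack.common.rpc.amqp   File
"/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 399, in
network_migrate_instance_start
TRACE nova.openstack.common.rpc.amqp   File
"/usr/lib/python2.7/dist-packages/nova/network/rpcapi.py", line 333, in
migrate_instance_start
'''

chain = call_chain(sample)
for path, func in chain:
    # Show only the file name to keep the chain short.
    print("%s:%s" % (path.rsplit("/", 1)[-1], func))
```

Whitespace is collapsed before matching, so the sketch tolerates line wrapping but not identifiers that were split in the middle of a word.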

Does anybody know all the steps it takes to live-migrate an instance? It
seems to be stopping inside the network_migrate_instance_start function, but
I really have no clue why.
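
One way to picture what a timeout at that point would do, as a toy sketch rather than Nova's actual code: if the network call raises, nothing after it in the post-migration step runs. The `Timeout` class and both functions below are stand-ins named after what appears in the traces, and the order of operations inside `post_live_migration` is an assumption made only for illustration.

```python
class Timeout(Exception):
    """Stand-in for the rpc_common.Timeout seen in the traces."""


def network_migrate_instance_start(instance):
    # Hypothetical stub: pretend the RPC call never gets an answer.
    raise Timeout("Timeout while waiting on RPC response.")


def post_live_migration(instance):
    # Assumed order, for illustration only: network call first, bookkeeping after.
    network_migrate_instance_start(instance)
    # Not reached when the call above raises.
    instance["host"] = "cebolla"
    instance["task_state"] = None


instance = {"host": "acelga", "task_state": "migrating"}
error = None
try:
    post_live_migration(instance)
except Timeout as exc:
    error = str(exc)

print(instance, error)
```

Under that assumption the record keeps its old host and its migrating task state even though the guest itself has already moved, which would be consistent with the hypervisor and the API disagreeing.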


2013/9/2 Juan José Pavlik Salles <jjpavlik at gmail.com>

> Hi guys, last Friday I started testing live migration in my Grizzly cloud
> with shared storage (GFS2), but I ran into a slightly weird problem.
>
> This is the status before migrating:
>
> -I have the instance p9 (also called instance-00000022) running on the
> "acelga" compute node.
>
> root@acelga:~/tools# virsh list
>  Id    Name                           State
> ----------------------------------------------------
>  6     instance-00000022              running
>
> root@acelga:~/tools#
>
> root@cebolla:~/tool# virsh list
>  Id    Nombre                         Estado
> ----------------------------------------------------
>
> root@cebolla:~/tool#
>
> -Here you can see all the info about the instance
>
> root@cebolla:~/tool# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXXXX --os-auth-url http://172.19.136.1:35357/v2.0 show de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09
> +-------------------------------------+-----------------------------------------------------------+
> | Property                            | Value                                                     |
> +-------------------------------------+-----------------------------------------------------------+
> | status                              | ACTIVE                                                    |
> | updated                             | 2013-09-02T15:27:39Z                                      |
> | OS-EXT-STS:task_state               | None                                                      |
> | OS-EXT-SRV-ATTR:host                | acelga                                                    |
> | key_name                            | None                                                      |
> | image                               | Ubuntu 12.04.2 LTS (1359ca8d-23a2-40e8-940f-d90b3e68bb39) |
> | vlan1 network                       | 172.16.16.175                                             |
> | hostId                              | 81be94870821e17e327d92e9c80548ffcdd37d24054a235116669f53  |
> | OS-EXT-STS:vm_state                 | active                                                    |
> | OS-EXT-SRV-ATTR:instance_name       | instance-00000022                                         |
> | OS-EXT-SRV-ATTR:hypervisor_hostname | acelga.psi.unc.edu.ar                                     |
> | flavor                              | m1.tiny (1)                                               |
> | id                                  | de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09                      |
> | security_groups                     | [{u'name': u'default'}]                                   |
> | user_id                             | 20390b639d4449c18926dca5e038ec5e                          |
> | name                                | p9                                                        |
> | created                             | 2013-09-02T15:27:06Z                                      |
> | tenant_id                           | d1e3aae242f14c488d2225dcbf1e96d6                          |
> | OS-DCF:diskConfig                   | MANUAL                                                    |
> | metadata                            | {}                                                        |
> | accessIPv4                          |                                                           |
> | accessIPv6                          |                                                           |
> | progress                            | 0                                                         |
> | OS-EXT-STS:power_state              | 1                                                         |
> | OS-EXT-AZ:availability_zone         | nova                                                      |
> | config_drive                        |                                                           |
> +-------------------------------------+-----------------------------------------------------------+
> root@cebolla:~/tool#
>
> -So I try to move it to the other node, "cebolla":
>
> root@acelga:~/tools# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXXXX --os-auth-url http://172.19.136.1:35357/v2.0 live-migration de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09 cebolla
> root@acelga:~/tools# virsh list
>  Id    Name                           State
> ----------------------------------------------------
>
> root@acelga:~/tools#
>
> No error messages at all on the "acelga" compute node so far. If I check the
> other node, I can see the instance has been migrated:
>
> root@cebolla:~/tool# virsh list
>  Id    Nombre                         Estado
> ----------------------------------------------------
>  11    instance-00000022              ejecutando
>
> root@cebolla:~/tool#
>
>
> -BUT... after a few seconds I get this in "acelga"'s nova-compute.log:
>
>
> 2013-09-02 15:35:45.784 4601 DEBUG nova.openstack.common.rpc.common [-] Timed out waiting for RPC response: timed out _error_callback /usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py:628
> 2013-09-02 15:35:45.790 4601 ERROR nova.utils [-] in fixed duration looping call
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils Traceback (most recent call last):
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 594, in _inner
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.f(*self.args, **self.kw)
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3129, in wait_for_live_migration
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     migrate_data)
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 3208, in _post_live_migration
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     migration)
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 664, in network_migrate_instance_start
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     migration)
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 415, in network_migrate_instance_start
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     return self.call(context, msg, version='1.41')
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     return rpc.call(context, self._get_topic(topic), msg, timeout)
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     return _get_impl().call(CONF, context, topic, msg, timeout)
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 798, in call
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     rpc_amqp.get_connection_pool(conf, Connection))
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 612, in call
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     rv = list(rv)
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 554, in __iter__
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.done()
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.gen.next()
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 551, in __iter__
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     self._iterator.next()
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 648, in iterconsume
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     yield self.ensure(_error_callback, _consume)
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 566, in ensure
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     error_callback(e)
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 629, in _error_callback
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     raise rpc_common.Timeout()
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils Timeout: Timeout while waiting on RPC response.
> 2013-09-02 15:35:45.790 4601 TRACE nova.utils
>
>
> -And the VM state never changes back to ACTIVE from MIGRATING:
>
>
> root@cebolla:~/tool# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXX --os-auth-url http://172.19.136.1:35357/v2.0 show de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09
> +-------------------------------------+-----------------------------------------------------------+
> | Property                            | Value                                                     |
> +-------------------------------------+-----------------------------------------------------------+
> | status                              | MIGRATING                                                 |
> | updated                             | 2013-09-02T15:33:54Z                                      |
> | OS-EXT-STS:task_state               | migrating                                                 |
> | OS-EXT-SRV-ATTR:host                | acelga                                                    |
> | key_name                            | None                                                      |
> | image                               | Ubuntu 12.04.2 LTS (1359ca8d-23a2-40e8-940f-d90b3e68bb39) |
> | vlan1 network                       | 172.16.16.175                                             |
> | hostId                              | 81be94870821e17e327d92e9c80548ffcdd37d24054a235116669f53  |
> | OS-EXT-STS:vm_state                 | active                                                    |
> | OS-EXT-SRV-ATTR:instance_name       | instance-00000022                                         |
> | OS-EXT-SRV-ATTR:hypervisor_hostname | acelga.psi.unc.edu.ar                                     |
> | flavor                              | m1.tiny (1)                                               |
> | id                                  | de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09                      |
> | security_groups                     | [{u'name': u'default'}]                                   |
> | user_id                             | 20390b639d4449c18926dca5e038ec5e                          |
> | name                                | p9                                                        |
> | created                             | 2013-09-02T15:27:06Z                                      |
> | tenant_id                           | d1e3aae242f14c488d2225dcbf1e96d6                          |
> | OS-DCF:diskConfig                   | MANUAL                                                    |
> | metadata                            | {}                                                        |
> | accessIPv4                          |                                                           |
> | accessIPv6                          |                                                           |
> | OS-EXT-STS:power_state              | 1                                                         |
> | OS-EXT-AZ:availability_zone         | nova                                                      |
> | config_drive                        |                                                           |
> +-------------------------------------+-----------------------------------------------------------+
> root@cebolla:~/tool#
>
>
> Funny fact:
> -The VM still answers ping after migration, so I think this is good.
>
> Any ideas about this problem? At first I thought it could be related to a
> connection problem between the nodes, but the VM migrates completely at the
> hypervisor level, so it seems that some "instance has been migrated" ACK is
> missing.
>
>
> --
> Pavlik Salles Juan José
>



-- 
Pavlik Salles Juan José