[Openstack-operators] Migrating instances in grizzly

Daneyon Hansen (danehans) danehans at cisco.com
Thu Sep 5 15:50:46 UTC 2013


Have you tried the patch for this bug?

https://bugs.launchpad.net/oslo/+bug/856764
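
If it helps to check whether the conductor and compute logs are failing in the same place, here is a rough Python sketch that pulls the call chain out of TRACE lines. It is not part of nova: `call_chain`, `timed_out` and the `SAMPLE` list are names made up for illustration, and it assumes each log entry sits on one line, as in the raw log file rather than the wrapped mail text.

```python
import re

# Matches the 'File "...", line N, in func' part of a Python traceback frame.
FRAME_RE = re.compile(r'File "(?P<path>[^"]+)", line (?P<line>\d+), in (?P<func>\S+)')


def call_chain(log_lines):
    """Return (path, line_number, function) for every traceback frame found."""
    frames = []
    for text in log_lines:
        # Only TRACE entries carry traceback frames in these logs.
        if " TRACE " not in text:
            continue
        match = FRAME_RE.search(text)
        if match:
            frames.append(
                (match.group("path"), int(match.group("line")), match.group("func"))
            )
    return frames


def timed_out(log_lines):
    """True if any entry reports the RPC timeout message seen in this thread."""
    return any("Timeout while waiting on RPC response" in text for text in log_lines)


# Hypothetical sample: three entries copied from the conductor log quoted below,
# joined back onto single lines.
SAMPLE = [
    '2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File '
    '"/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 399, '
    'in network_migrate_instance_start',
    '2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File '
    '"/usr/lib/python2.7/dist-packages/nova/network/api.py", line 501, '
    'in migrate_instance_start',
    '2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp '
    'Timeout: Timeout while waiting on RPC response.',
]

if __name__ == "__main__":
    for path, line, func in call_chain(SAMPLE):
        print("%s:%d in %s" % (path, line, func))
    print("RPC timeout seen: %s" % timed_out(SAMPLE))
```

Running the same function over both log files should show whether the last nova frame before the timeout matches on each side.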



Regards,
Daneyon Hansen
Software Engineer
Email: danehans at cisco.com
Phone: 303-718-0400
http://about.me/daneyon_hansen




On 9/5/13 7:02 AM, "Emilien Macchi" <emilien.macchi at enovance.com> wrote:

>Hi,
>
>We have the same issue here with Grizzly 2013.1.2 / Ubuntu 12.04 /
>libvirt 1.0.2.
>
>Which release are you running?
>
>Emilien Macchi
>----------------------------------------------------
># OpenStack Engineer
>// eNovance Inc.              http://enovance.com
>// ✉ emilien at enovance.com     ☎ +33 (0)1 49 70 99 80
>// 10 rue de la Victoire 75009 Paris
>
>On Tue 03 Sep 2013 02:44:09 AM CEST, Juan José Pavlik Salles wrote:
>> I've also found this in nova-conductor.log:
>>
>> 2013-09-02 15:35:27.208 DEBUG nova.openstack.common.rpc.common [req-e0473533-89af-4ff5-b6fa-4b0b6eb50a6d 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] Timed out waiting for RPC response: timed out _error_callback /usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py:628
>> 2013-09-02 15:35:27.222 ERROR nova.openstack.common.rpc.amqp [req-e0473533-89af-4ff5-b6fa-4b0b6eb50a6d 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] Exception during message handling
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 430, in _process_data
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     rval = self.proxy.dispatch(ctxt, version, method, **args)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/dispatcher.py", line 133, in dispatch
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return getattr(proxyobj, method)(ctxt, **kwargs)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 399, in network_migrate_instance_start
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.network_api.migrate_instance_start(context, instance, migration)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/api.py", line 89, in wrapped
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return func(self, context, *args, **kwargs)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/api.py", line 501, in migrate_instance_start
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.network_rpcapi.migrate_instance_start(context, **args)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/rpcapi.py", line 333, in migrate_instance_start
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     version='1.2')
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return rpc.call(context, self._get_topic(topic), msg, timeout)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return _get_impl().call(CONF, context, topic, msg, timeout)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 798, in call
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     rpc_amqp.get_connection_pool(conf, Connection))
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 612, in call
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     rv = list(rv)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 554, in __iter__
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.done()
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.gen.next()
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 551, in __iter__
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self._iterator.next()
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 648, in iterconsume
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     yield self.ensure(_error_callback, _consume)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 566, in ensure
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     error_callback(e)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 629, in _error_callback
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     raise rpc_common.Timeout()
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp Timeout: Timeout while waiting on RPC response.
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp
>> 2013-09-02 15:35:27.237 ERROR nova.openstack.common.rpc.common [req-e0473533-89af-4ff5-b6fa-4b0b6eb50a6d 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] Returning exception Timeout while waiting on RPC response. to caller
>>
>> Does anybody know all the steps it takes to live-migrate an
>> instance? It seems to be stopping inside the
>> network_migrate_instance_start function, and I really have no clue
>> why...
>>
>>
>> 2013/9/2 Juan José Pavlik Salles <jjpavlik at gmail.com>
>>
>>     Hi guys, last Friday I started testing live-migration in my
>>     Grizzly cloud with shared storage (gfs2), but I ran into a
>>     slightly weird problem.
>>
>>     This is the status before migrating:
>>
>>     -I have the p9 instance, also called instance-00000022, running
>>     on the "acelga" compute node.
>>
>>     root at acelga:~/tools# virsh list
>>      Id    Name                           State
>>     ----------------------------------------------------
>>      6     instance-00000022              running
>>
>>     root at acelga:~/tools#
>>
>>     root at cebolla:~/tool# virsh list
>>      Id    Nombre                         Estado
>>     ----------------------------------------------------
>>
>>     root at cebolla:~/tool#
>>
>>     -Here you can see all the info about the instance
>>
>>     root at cebolla:~/tool# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXXXX --os-auth-url http://172.19.136.1:35357/v2.0 show de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09
>>     +-------------------------------------+-----------------------------------------------------------+
>>     | Property                            | Value                                                     |
>>     +-------------------------------------+-----------------------------------------------------------+
>>     | status                              | ACTIVE                                                    |
>>     | updated                             | 2013-09-02T15:27:39Z                                      |
>>     | OS-EXT-STS:task_state               | None                                                      |
>>     | OS-EXT-SRV-ATTR:host                | acelga                                                    |
>>     | key_name                            | None                                                      |
>>     | image                               | Ubuntu 12.04.2 LTS (1359ca8d-23a2-40e8-940f-d90b3e68bb39) |
>>     | vlan1 network                       | 172.16.16.175                                             |
>>     | hostId                              | 81be94870821e17e327d92e9c80548ffcdd37d24054a235116669f53  |
>>     | OS-EXT-STS:vm_state                 | active                                                    |
>>     | OS-EXT-SRV-ATTR:instance_name       | instance-00000022                                         |
>>     | OS-EXT-SRV-ATTR:hypervisor_hostname | acelga.psi.unc.edu.ar                                     |
>>     | flavor                              | m1.tiny (1)                                               |
>>     | id                                  | de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09                      |
>>     | security_groups                     | [{u'name': u'default'}]                                   |
>>     | user_id                             | 20390b639d4449c18926dca5e038ec5e                          |
>>     | name                                | p9                                                        |
>>     | created                             | 2013-09-02T15:27:06Z                                      |
>>     | tenant_id                           | d1e3aae242f14c488d2225dcbf1e96d6                          |
>>     | OS-DCF:diskConfig                   | MANUAL                                                    |
>>     | metadata                            | {}                                                        |
>>     | accessIPv4                          |                                                           |
>>     | accessIPv6                          |                                                           |
>>     | progress                            | 0                                                         |
>>     | OS-EXT-STS:power_state              | 1                                                         |
>>     | OS-EXT-AZ:availability_zone         | nova                                                      |
>>     | config_drive                        |                                                           |
>>     +-------------------------------------+-----------------------------------------------------------+
>>     root at cebolla:~/tool#
>>
>>     -So I try to move it to the other node, "cebolla":
>>
>>     root at acelga:~/tools# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXXXX --os-auth-url http://172.19.136.1:35357/v2.0 live-migration de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09 cebolla
>>
>>     root at acelga:~/tools# virsh list
>>      Id    Name                           State
>>     ----------------------------------------------------
>>
>>     root at acelga:~/tools#
>>
>>     No error messages at all on the "acelga" compute node so far. If
>>     I check the other node, I can see the instance has been migrated:
>>
>>     root at cebolla:~/tool# virsh list
>>      Id    Nombre                         Estado
>>     ----------------------------------------------------
>>      11    instance-00000022              ejecutando
>>
>>     root at cebolla:~/tool#
>>
>>
>>     -BUT... after a few seconds I get this in nova-compute.log on
>>     "acelga":
>>
>>
>>     2013-09-02 15:35:45.784 4601 DEBUG nova.openstack.common.rpc.common [-] Timed out waiting for RPC response: timed out _error_callback /usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py:628
>>     2013-09-02 15:35:45.790 4601 ERROR nova.utils [-] in fixed duration looping call
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils Traceback (most recent call last):
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 594, in _inner
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.f(*self.args, **self.kw)
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3129, in wait_for_live_migration
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     migrate_data)
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 3208, in _post_live_migration
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     migration)
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 664, in network_migrate_instance_start
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     migration)
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 415, in network_migrate_instance_start
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     return self.call(context, msg, version='1.41')
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     return rpc.call(context, self._get_topic(topic), msg, timeout)
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     return _get_impl().call(CONF, context, topic, msg, timeout)
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 798, in call
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     rpc_amqp.get_connection_pool(conf, Connection))
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 612, in call
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     rv = list(rv)
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 554, in __iter__
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.done()
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.gen.next()
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 551, in __iter__
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     self._iterator.next()
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 648, in iterconsume
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     yield self.ensure(_error_callback, _consume)
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 566, in ensure
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     error_callback(e)
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 629, in _error_callback
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils     raise rpc_common.Timeout()
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils Timeout: Timeout while waiting on RPC response.
>>     2013-09-02 15:35:45.790 4601 TRACE nova.utils
>>
>>
>>     -And the VM state never changes back to ACTIVE from MIGRATING:
>>
>>
>>     root at cebolla:~/tool# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXX --os-auth-url http://172.19.136.1:35357/v2.0 show de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09
>>     +-------------------------------------+-----------------------------------------------------------+
>>     | Property                            | Value                                                     |
>>     +-------------------------------------+-----------------------------------------------------------+
>>     | status                              | MIGRATING                                                 |
>>     | updated                             | 2013-09-02T15:33:54Z                                      |
>>     | OS-EXT-STS:task_state               | migrating                                                 |
>>     | OS-EXT-SRV-ATTR:host                | acelga                                                    |
>>     | key_name                            | None                                                      |
>>     | image                               | Ubuntu 12.04.2 LTS (1359ca8d-23a2-40e8-940f-d90b3e68bb39) |
>>     | vlan1 network                       | 172.16.16.175                                             |
>>     | hostId                              | 81be94870821e17e327d92e9c80548ffcdd37d24054a235116669f53  |
>>     | OS-EXT-STS:vm_state                 | active                                                    |
>>     | OS-EXT-SRV-ATTR:instance_name       | instance-00000022                                         |
>>     | OS-EXT-SRV-ATTR:hypervisor_hostname | acelga.psi.unc.edu.ar                                     |
>>     | flavor                              | m1.tiny (1)                                               |
>>     | id                                  | de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09                      |
>>     | security_groups                     | [{u'name': u'default'}]                                   |
>>     | user_id                             | 20390b639d4449c18926dca5e038ec5e                          |
>>     | name                                | p9                                                        |
>>     | created                             | 2013-09-02T15:27:06Z                                      |
>>     | tenant_id                           | d1e3aae242f14c488d2225dcbf1e96d6                          |
>>     | OS-DCF:diskConfig                   | MANUAL                                                    |
>>     | metadata                            | {}                                                        |
>>     | accessIPv4                          |                                                           |
>>     | accessIPv6                          |                                                           |
>>     | OS-EXT-STS:power_state              | 1                                                         |
>>     | OS-EXT-AZ:availability_zone         | nova                                                      |
>>     | config_drive                        |                                                           |
>>     +-------------------------------------+-----------------------------------------------------------+
>>     root at cebolla:~/tool#
>>
>>
>>     Funny fact:
>>     -The VM still answers ping after migration, so I think this is
>>     good.
>>
>>     Any ideas about this problem? At first I thought it could be
>>     related to a connection problem between the nodes, but the VM
>>     migrates completely at the hypervisor level, so it looks like
>>     some "instance has been migrated" ACK is missing.
>>
>>
>>     --
>>     Pavlik Salles Juan José
>>
>>
>>
>>
>> --
>> Pavlik Salles Juan José
>>
>>
>


