[Openstack-operators] Migrating instances in grizzly
Daneyon Hansen (danehans)
danehans at cisco.com
Thu Sep 5 15:50:46 UTC 2013
Have you tried the patch for this bug?
https://bugs.launchpad.net/oslo/+bug/856764
Regards,
Daneyon Hansen
Software Engineer
Email: danehans at cisco.com
Phone: 303-718-0400
http://about.me/daneyon_hansen
On 9/5/13 7:02 AM, "Emilien Macchi" <emilien.macchi at enovance.com> wrote:
>Hi,
>
>We have the same issue here with Grizzly 2013.1.2 / Ubuntu 12.04 /
>libvirt 1.0.2.
>
>Which release are you running?
>
>Emilien Macchi
>----------------------------------------------------
># OpenStack Engineer
>// eNovance Inc. http://enovance.com
>// ✉ emilien at enovance.com ☎ +33 (0)1 49 70 99 80
>// 10 rue de la Victoire 75009 Paris
>
>On Tue 03 Sep 2013 02:44:09 AM CEST, Juan José Pavlik Salles wrote:
>> I've also found this in nova-conductor.log:
>>
>> 2013-09-02 15:35:27.208 DEBUG nova.openstack.common.rpc.common [req-e0473533-89af-4ff5-b6fa-4b0b6eb50a6d 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] Timed out waiting for RPC response: timed out _error_callback /usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py:628
>> 2013-09-02 15:35:27.222 ERROR nova.openstack.common.rpc.amqp [req-e0473533-89af-4ff5-b6fa-4b0b6eb50a6d 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] Exception during message handling
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp Traceback (most recent call last):
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 430, in _process_data
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     rval = self.proxy.dispatch(ctxt, version, method, **args)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/dispatcher.py", line 133, in dispatch
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return getattr(proxyobj, method)(ctxt, **kwargs)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 399, in network_migrate_instance_start
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.network_api.migrate_instance_start(context, instance, migration)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/api.py", line 89, in wrapped
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return func(self, context, *args, **kwargs)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/api.py", line 501, in migrate_instance_start
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.network_rpcapi.migrate_instance_start(context, **args)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/rpcapi.py", line 333, in migrate_instance_start
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     version='1.2')
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return rpc.call(context, self._get_topic(topic), msg, timeout)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     return _get_impl().call(CONF, context, topic, msg, timeout)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 798, in call
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     rpc_amqp.get_connection_pool(conf, Connection))
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 612, in call
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     rv = list(rv)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 554, in __iter__
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.done()
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self.gen.next()
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 551, in __iter__
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     self._iterator.next()
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 648, in iterconsume
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     yield self.ensure(_error_callback, _consume)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 566, in ensure
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     error_callback(e)
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 629, in _error_callback
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp     raise rpc_common.Timeout()
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp Timeout: Timeout while waiting on RPC response.
>> 2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp
>> 2013-09-02 15:35:27.237 ERROR nova.openstack.common.rpc.common [req-e0473533-89af-4ff5-b6fa-4b0b6eb50a6d 31020076174943bdb7486c330a298d93 d1e3aae242f14c488d2225dcbf1e96d6] Returning exception Timeout while waiting on RPC response. to caller
>>
>> Does anybody know all the steps involved in live-migrating an
>> instance? It seems to stall inside the
>> network_migrate_instance_start function, and I really have no clue
>> why.
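One way to see where the request stalls is to strip the shared RPC plumbing out of the traceback and keep only nova's own frames. Below is a minimal sketch, not part of the original report. It assumes the log is read unwrapped, as it appears on disk rather than as re-wrapped by the mail client; the names FRAME_RE, SAMPLE and nova_frames are made up for this example, and SAMPLE only repeats four frames from the nova-conductor traceback quoted above.

```python
import re

# Matches one traceback frame, for example:
#   File "/usr/lib/.../nova/network/rpcapi.py", line 333, in migrate_instance_start
FRAME_RE = re.compile(r'File\s+"([^"]+)",\s+line\s+(\d+),\s+in\s+(\w+)')

# A few frames copied from the nova-conductor traceback in this thread
# (unwrapped, one log record per line).
SAMPLE = '''\
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/conductor/manager.py", line 399, in network_migrate_instance_start
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/api.py", line 501, in migrate_instance_start
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/network/rpcapi.py", line 333, in migrate_instance_start
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
2013-09-02 15:35:27.222 1363 TRACE nova.openstack.common.rpc.amqp Timeout: Timeout while waiting on RPC response.
'''


def nova_frames(log_text):
    """Return (path, line, function) for nova's own frames, in call order."""
    frames = []
    for path, line, func in FRAME_RE.findall(log_text):
        # Skip anything outside nova and the generic openstack/common RPC code.
        if "/nova/" not in path or "/openstack/common/" in path:
            continue
        short = "nova/" + path.split("/nova/", 1)[1]
        frames.append((short, int(line), func))
    return frames


if __name__ == "__main__":
    for frame in nova_frames(SAMPLE):
        print(frame)
```

On the conductor traceback the last frame that survives the filter is migrate_instance_start in nova/network/rpcapi.py, so the call that gets no answer is the one nova-conductor sends out through the network RPC API.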
>>
>>
>> 2013/9/2 Juan José Pavlik Salles <jjpavlik at gmail.com>
>>
>> Hi guys, last Friday I started testing live migration in my
>> Grizzly cloud with shared storage (GFS2), but I ran into a
>> slightly weird problem.
>>
>> This is the status before migrating:
>>
>> -I have the instance p9 (libvirt name instance-00000022) running
>> on the "acelga" compute node.
>>
>> root at acelga:~/tools# virsh list
>>  Id    Name                           State
>> ----------------------------------------------------
>>  6     instance-00000022              running
>>
>> root at acelga:~/tools#
>>
>> root at cebolla:~/tool# virsh list
>>  Id    Nombre                         Estado
>> ----------------------------------------------------
>>
>> root at cebolla:~/tool#
>>
>> -Here you can see all the info about the instance
>>
>> root at cebolla:~/tool# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXXXX --os-auth-url http://172.19.136.1:35357/v2.0 show de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09
>> +-------------------------------------+-----------------------------------------------------------+
>> | Property                            | Value                                                     |
>> +-------------------------------------+-----------------------------------------------------------+
>> | status                              | ACTIVE                                                    |
>> | updated                             | 2013-09-02T15:27:39Z                                      |
>> | OS-EXT-STS:task_state               | None                                                      |
>> | OS-EXT-SRV-ATTR:host                | acelga                                                    |
>> | key_name                            | None                                                      |
>> | image                               | Ubuntu 12.04.2 LTS (1359ca8d-23a2-40e8-940f-d90b3e68bb39) |
>> | vlan1 network                       | 172.16.16.175                                             |
>> | hostId                              | 81be94870821e17e327d92e9c80548ffcdd37d24054a235116669f53  |
>> | OS-EXT-STS:vm_state                 | active                                                    |
>> | OS-EXT-SRV-ATTR:instance_name       | instance-00000022                                         |
>> | OS-EXT-SRV-ATTR:hypervisor_hostname | acelga.psi.unc.edu.ar                                     |
>> | flavor                              | m1.tiny (1)                                               |
>> | id                                  | de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09                      |
>> | security_groups                     | [{u'name': u'default'}]                                   |
>> | user_id                             | 20390b639d4449c18926dca5e038ec5e                          |
>> | name                                | p9                                                        |
>> | created                             | 2013-09-02T15:27:06Z                                      |
>> | tenant_id                           | d1e3aae242f14c488d2225dcbf1e96d6                          |
>> | OS-DCF:diskConfig                   | MANUAL                                                    |
>> | metadata                            | {}                                                        |
>> | accessIPv4                          |                                                           |
>> | accessIPv6                          |                                                           |
>> | progress                            | 0                                                         |
>> | OS-EXT-STS:power_state              | 1                                                         |
>> | OS-EXT-AZ:availability_zone         | nova                                                      |
>> | config_drive                        |                                                           |
>> +-------------------------------------+-----------------------------------------------------------+
>> root at cebolla:~/tool#
>>
>> -So I try to move it to the other node, "cebolla":
>>
>> root at acelga:~/tools# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXXXX --os-auth-url http://172.19.136.1:35357/v2.0 live-migration de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09 cebolla
>>
>> root at acelga:~/tools# virsh list
>>  Id    Name                           State
>> ----------------------------------------------------
>>
>> root at acelga:~/tools#
>>
>> No error messages at all on the "acelga" compute node so far. If I
>> check the other node, I can see the instance has been migrated:
>>
>> root at cebolla:~/tool# virsh list
>>  Id    Nombre                         Estado
>> ----------------------------------------------------
>>  11    instance-00000022              ejecutando
>>
>> root at cebolla:~/tool#
>>
>>
>> -BUT... after a few seconds I get this in nova-compute.log on "acelga":
>>
>>
>> 2013-09-02 15:35:45.784 4601 DEBUG nova.openstack.common.rpc.common [-] Timed out waiting for RPC response: timed out _error_callback /usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py:628
>> 2013-09-02 15:35:45.790 4601 ERROR nova.utils [-] in fixed duration looping call
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils Traceback (most recent call last):
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/utils.py", line 594, in _inner
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.f(*self.args, **self.kw)
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/virt/libvirt/driver.py", line 3129, in wait_for_live_migration
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     migrate_data)
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 3208, in _post_live_migration
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     migration)
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/conductor/api.py", line 664, in network_migrate_instance_start
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     migration)
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/conductor/rpcapi.py", line 415, in network_migrate_instance_start
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     return self.call(context, msg, version='1.41')
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/proxy.py", line 80, in call
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     return rpc.call(context, self._get_topic(topic), msg, timeout)
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/__init__.py", line 140, in call
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     return _get_impl().call(CONF, context, topic, msg, timeout)
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 798, in call
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     rpc_amqp.get_connection_pool(conf, Connection))
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 612, in call
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     rv = list(rv)
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 554, in __iter__
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.done()
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     self.gen.next()
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/amqp.py", line 551, in __iter__
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     self._iterator.next()
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 648, in iterconsume
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     yield self.ensure(_error_callback, _consume)
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 566, in ensure
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     error_callback(e)
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils   File "/usr/lib/python2.7/dist-packages/nova/openstack/common/rpc/impl_kombu.py", line 629, in _error_callback
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils     raise rpc_common.Timeout()
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils Timeout: Timeout while waiting on RPC response.
>> 2013-09-02 15:35:45.790 4601 TRACE nova.utils
>>
>>
>> -And the VM state never changes back to ACTIVE from MIGRATING:
>>
>>
>> root at cebolla:~/tool# nova --os-username=noc-admin --os-tenant-name=noc --os-password=XXXXX --os-auth-url http://172.19.136.1:35357/v2.0 show de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09
>> +-------------------------------------+-----------------------------------------------------------+
>> | Property                            | Value                                                     |
>> +-------------------------------------+-----------------------------------------------------------+
>> | status                              | MIGRATING                                                 |
>> | updated                             | 2013-09-02T15:33:54Z                                      |
>> | OS-EXT-STS:task_state               | migrating                                                 |
>> | OS-EXT-SRV-ATTR:host                | acelga                                                    |
>> | key_name                            | None                                                      |
>> | image                               | Ubuntu 12.04.2 LTS (1359ca8d-23a2-40e8-940f-d90b3e68bb39) |
>> | vlan1 network                       | 172.16.16.175                                             |
>> | hostId                              | 81be94870821e17e327d92e9c80548ffcdd37d24054a235116669f53  |
>> | OS-EXT-STS:vm_state                 | active                                                    |
>> | OS-EXT-SRV-ATTR:instance_name       | instance-00000022                                         |
>> | OS-EXT-SRV-ATTR:hypervisor_hostname | acelga.psi.unc.edu.ar                                     |
>> | flavor                              | m1.tiny (1)                                               |
>> | id                                  | de2bcbed-f7b6-40cd-89ca-acf6fe2f2d09                      |
>> | security_groups                     | [{u'name': u'default'}]                                   |
>> | user_id                             | 20390b639d4449c18926dca5e038ec5e                          |
>> | name                                | p9                                                        |
>> | created                             | 2013-09-02T15:27:06Z                                      |
>> | tenant_id                           | d1e3aae242f14c488d2225dcbf1e96d6                          |
>> | OS-DCF:diskConfig                   | MANUAL                                                    |
>> | metadata                            | {}                                                        |
>> | accessIPv4                          |                                                           |
>> | accessIPv6                          |                                                           |
>> | OS-EXT-STS:power_state              | 1                                                         |
>> | OS-EXT-AZ:availability_zone         | nova                                                      |
>> | config_drive                        |                                                           |
>> +-------------------------------------+-----------------------------------------------------------+
>> root at cebolla:~/tool#
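The mismatch described above (libvirt shows the guest on "cebolla" while nova still reports host "acelga" and status MIGRATING) can be checked mechanically from the `nova show` output. This is only a rough sketch under assumptions: it parses the ASCII table as text, the helper names parse_nova_table, still_migrating and host_is_stale are invented for this example, and SAMPLE is a shortened copy of the table quoted above.

```python
# Shortened copy of the "nova show" output from this thread.
SAMPLE = """\
+-----------------------+-----------+
| Property              | Value     |
+-----------------------+-----------+
| status                | MIGRATING |
| OS-EXT-STS:task_state | migrating |
| OS-EXT-SRV-ATTR:host  | acelga    |
| accessIPv4            |           |
+-----------------------+-----------+
"""


def parse_nova_table(text):
    """Turn the two-column ASCII table into a {property: value} dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # border rows start with "+", blank lines are skipped too
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) != 2 or cells[0] == "Property":
            continue  # header row or something that is not a plain key/value row
        props[cells[0]] = cells[1]
    return props


def still_migrating(props):
    """True while nova reports the instance as being migrated."""
    return (props.get("status") == "MIGRATING"
            and props.get("OS-EXT-STS:task_state") == "migrating")


def host_is_stale(props, libvirt_host):
    """True when nova's recorded host differs from where virsh shows the guest."""
    return props.get("OS-EXT-SRV-ATTR:host") != libvirt_host


if __name__ == "__main__":
    props = parse_nova_table(SAMPLE)
    # "cebolla" is the node where virsh list shows instance-00000022 running.
    print(still_migrating(props), host_is_stale(props, "cebolla"))
```

If both checks stay true long after the guest is running on the destination, the post-migration bookkeeping never completed, which matches the RPC timeout in the log above.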
>>
>>
>> Funny fact:
>> -The VM still answers ping after the migration, so I think that part is good.
>>
>> Any ideas about this problem? At first I thought it could be
>> related to a connection problem between the nodes, but the VM
>> migrates completely at the hypervisor level, so it looks like some
>> "instance has been migrated" ACK is missing.
>>
>>
>> --
>> Pavlik Salles Juan José
>>
>>
>>
>>
>> --
>> Pavlik Salles Juan José
>>
>>
>> _______________________________________________
>> OpenStack-operators mailing list
>> OpenStack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>