[Openstack] "nova live-migration" failing on Ubuntu precise; libvirt live migration is OK

Florian Haas florian at hastexo.com
Wed Jun 6 09:11:56 UTC 2012


Hi everyone,

a few people have reported issues with live migration lately, and
I've been digging into them to narrow them down.

The symptom is relatively easy to describe: you run "nova live-migration
<guest> <host>", and nothing happens.

A few words of background:

- System is Ubuntu precise with stock packages and regular updates, no
external PPAs. nova-compute is at version 2012.1-0ubuntu2.1.

- libvirtd is running with the "-l" option and with a working
TCP socket (see the config excerpt after this list), as described here:
http://docs.openstack.org/trunk/openstack-compute/admin/content/configuring-live-migrations.html

- /var/lib/nova/instances is on GlusterFS.
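
For anyone following along, the setup that guide describes boils down
to roughly this on precise (exact settings may differ, and
auth_tcp = "none" is of course only acceptable on a trusted network):

/etc/default/libvirt-bin:
libvirtd_opts="-d -l"

/etc/libvirt/libvirtd.conf:
listen_tls = 0
listen_tcp = 1
auth_tcp = "none"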

Now, if you're setting various --*vnc* flags in nova.conf, live
migration fails even at the libvirt level (a similar issue has been
reported here recently, see
https://lists.launchpad.net/openstack/msg12425.html).

# virsh migrate --live --p2p --domain instance-0000000a \
  --desturi qemu+tcp://skunk-x/system
error: Unable to read from monitor: Connection reset by peer

("skunk-x" is secondary IP address of the host "skunk", living in a
dedicated network used for migrations).
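
(Plain TCP connectivity to libvirtd on that address is not the problem,
by the way; that's easy to sanity-check with something along the lines
of

# virsh --connect qemu+tcp://skunk-x/system list --all

which works fine here.)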

This is what shows up in the libvirtd log on the source host:
2012-06-05 20:39:25.838+0000: 12241: error :
virNetClientProgramDispatchError:174 : Unable to read from monitor:
Connection reset by peer

At the same time, I am seeing this in the libvirtd log on the target host:
2012-06-05 20:39:25.394+0000: 6828: error : qemuMonitorIORead:513 :
Unable to read from monitor: Connection reset by peer

Removing all --*vnc* flags from nova.conf resolved that issue for me.
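
By "--*vnc* flags" I mean entries along these lines; the values below
are purely illustrative, not my exact config:

--vnc_enabled=true
--vncserver_listen=0.0.0.0
--vncserver_proxyclient_address=192.168.122.11
--novncproxy_base_url=http://cloud.example.com:6080/vnc_auto.html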

Then, running the same command as above resulted in a connection
timeout, because even though I set "qemu+tcp://skunk-x/system" as the
libvirt destination URI, libvirt opens a separate socket on an
ephemeral port on skunk's primary interface, which in this case was
being blocked by my iptables config:

# virsh migrate --live --p2p \
  --domain instance-0000000d --desturi qemu+tcp://skunk-x/system
error: unable to connect to server at 'skunk:49159': Connection timed out

Switching the migration to tunnelled mode solved that issue.

# virsh domstate instance-0000000d
running
# virsh migrate --live --p2p \
  --domain instance-0000000d --desturi qemu+tcp://skunk-x/system \
  --tunnelled
# virsh --connect qemu+tcp://skunk-x/system domstate instance-0000000d
running
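
(Presumably the non-tunnelled variant could also be made to work by
opening libvirt's default migration port range, 49152-49215, between
the compute nodes, roughly like so:

# iptables -A INPUT -p tcp -s <other compute node> \
  --dport 49152:49215 -j ACCEPT

But tunnelling the migration traffic through the existing libvirtd
connection means only the libvirt TCP port needs to be open, so that's
what I've stuck with.)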

So these are the flags I'm now using in my nova.conf:

--live_migration_uri="qemu+tcp://%s-x/system"
--live_migration_flag="VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER, VIR_MIGRATE_TUNNELLED"

(Note that "VIR_MIGRATE_UNDEFINE_SOURCE, VIR_MIGRATE_PEER2PEER" is the
default for --live_migration_flag; VIR_MIGRATE_TUNNELLED is my addition.
I've also tried migrating over the primary interface, without
tunnelling. No change: works in libvirt, doesn't work with Nova.)

"nova live-migration <guest> <host>" returns an exit code of 0, and the
only trace that I find of the migration in the logs is this, which is
evidently from the pre_live_migration method.

2012-06-06 11:05:13 DEBUG nova.rpc.amqp [-] received {u'_context_roles':
[u'KeystoneServiceAdmin', u'admin', u'KeystoneAdmin'], u'_msg_id':
u'069c958b7c03482aa4f0dda00010eb10', u'_context_read_deleted': u'no',
u'_context_request_id': u'req-71c4ffea-4d3d-471c-98bc-8a27aaff8f2c',
u'args': {u'instance_id': 13, u'block_migration': False, u'disk': None},
u'_context_auth_token': '<SANITIZED>', u'_context_is_admin': True,
u'_context_project_id': u'9c929e61e7624fbe895ae0de38bd1471',
u'_context_timestamp': u'2012-06-06T09:05:09.992775',
u'_context_user_id': u'1c8c118c7c244d2d94cc516ab6f24c03', u'method':
u'pre_live_migration', u'_context_remote_address': u'10.43.0.2'} from
(pid=14437) _safe_log
/usr/lib/python2.7/dist-packages/nova/rpc/common.py:160

Looks like it never gets to live_migration.
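
For anyone who wants to double-check this on their own setup, a crude
but effective way to watch for the follow-up call is something like

# grep live_migration /var/log/nova/nova-compute.log

on both nodes (assuming the stock Ubuntu log location).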

I'd be thankful for any clues as to where to dig further.

Cheers,
Florian

-- 
Need help with High Availability?
http://www.hastexo.com/now



