[Openstack] nova-compute and cinder-scheduler HA

Сергей Мотовиловец motovilovets.sergey at gmail.com
Wed May 14 18:49:57 UTC 2014


Hello everyone!

I'm running into some trouble with nova and cinder.

I have 2 control nodes (active/active) in my test environment with a
Percona XtraDB Cluster (Galera + xtrabackup), garbd on a separate node (to
avoid split-brain), and OpenStack Icehouse, the latest from the Ubuntu
14.04 main repo.

The problem is the horizontal scalability of the nova-conductor and
cinder-scheduler services. It seems that all active instances of these
services try to execute the same MySQL queries they get from RabbitMQ,
which leads to numerous deadlocks in a setup with Galera.

When multiple nova-conductor services are running (each using the MySQL
instance on its own control node), this shows up in the log as "Deadlock
found when trying to get lock; try restarting transaction".
With cinder-scheduler it leads to "InvalidBDM: Block Device Mapping is
Invalid."
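
To make clear what "try restarting transaction" would mean on the
application side, here is a minimal sketch of a retry wrapper. It is not
how nova or cinder handle this; DeadlockError, retry_on_deadlock, and the
parameter values are hypothetical names chosen for illustration, with
DeadlockError standing in for whatever exception the database driver
raises for MySQL error 1213.

```python
import functools
import random
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the driver error carrying MySQL code 1213."""


def retry_on_deadlock(max_attempts=5, base_delay=0.05):
    """Re-run the wrapped transaction when it fails with DeadlockError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    # Give up and re-raise once the attempts are exhausted.
                    if attempt == max_attempts:
                        raise
                    # Exponential backoff with random jitter, so writers on
                    # different nodes do not retry in lockstep.
                    delay = base_delay * (2 ** (attempt - 1))
                    time.sleep(random.uniform(0, delay))
        return wrapper
    return decorator
```

The wrapped function has to contain the whole transaction (begin through
commit), since a retry only makes sense if the work is redone from the
start.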

Is there any way to run multiple instances of these services
simultaneously without them duplicating queries?
(I don't really like the idea of handling this with Heartbeat + Pacemaker
or similar tools, mostly because I want load distributed equally across
the control nodes. In this case, though, it seems to have the opposite
effect, multiplying the load on MySQL.)

Another thing that is extremely annoying: if an instance gets stuck in the
ERROR state because of a deadlock during its termination, it can no longer
be terminated from Horizon, only through nova-api with reset-state. How
can this be handled?

I'd really appreciate any help, advice, or thoughts regarding these problems.


Best regards,
Motovilovets Sergey
Software Operation Engineer