[wallaby][ovn][Open vSwitch] HA for OVN DB servers using pacemaker

Faisal Sheikh faisalsheikh.cyber at gmail.com
Tue Dec 14 09:31:37 UTC 2021


Hi,

I am using the OpenStack Wallaby release with OVN on Ubuntu 20.04.
My environment consists of 2 compute nodes and 1 controller node.

ovs-vswitchd (Open vSwitch) 2.15.0
Ubuntu Kernel Version: 5.4.0-88-generic
- compute node1 172.16.30.1
- compute node2 172.16.30.3
- controller-khi01/Network node/Primary ovs-db IP 172.16.30.46
- backup ovs-db 172.16.30.47

I want to run the OVSDB servers in Active/Passive mode. The active
OVSDB server is on node "controller-khi01" and the passive one is on
"controller02-khi01". I have set up Pacemaker to manage the
ovn-northd (Open vSwitch) service between the primary and backup
ovn-db servers; the "pcs status" output is included further down.
However, I am unable to replicate the database state between the
primary and backup OVSDB servers. When I run
"ovsdb-server --sync-from=server", I get the following error:


root@controller-khi01# ovsdb-server --sync-from=172.16.30.47
2021-12-14T09:16:29Z|00001|lockfile|WARN|/var/lib/openvswitch/.conf.db.~lock~: cannot lock file because it is already locked by pid 14121
ovsdb-server: I/O error: /etc/openvswitch/conf.db: failed to lock lockfile (Resource temporarily unavailable)
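
The warning says the lock on conf.db is already held by pid 14121, so another process (presumably an ovsdb-server instance that is already running) has the database open. A minimal sketch for pulling the PID out of such a warning and seeing what that process is; `lock_holder_pid` is a hypothetical helper name, not part of Open vSwitch:

```shell
#!/bin/sh
# Sketch: extract the PID named in an ovsdb-server "cannot lock file" warning.
# lock_holder_pid is a hypothetical helper; it reads log text on stdin
# and prints the first PID it finds (or nothing if there is no match).
lock_holder_pid() {
    sed -n 's/.*already locked by pid \([0-9][0-9]*\).*/\1/p' | head -n 1
}

# Feed it the warning text from the failed command.
pid="$(printf '%s\n' 'cannot lock file because it is already locked by pid 14121' | lock_holder_pid)"
echo "lock holder pid: $pid"

# Show what that process is, if it still exists on this host.
if [ -n "$pid" ] && ps -p "$pid" > /dev/null 2>&1; then
    ps -p "$pid" -o pid,comm,args
fi
```

If the process turns out to be the ovsdb-server already started by the system service, then the command above was trying to start a second server on the same database file rather than reconfiguring the running one.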


root@controller-khi01# pcs status
Cluster name: cluster1
Cluster Summary:
  * Stack: corosync
  * Current DC: controller-khi01 (version 2.0.3-4b1f869f0f) - partition with quorum
  * Last updated: Tue Dec 14 07:53:06 2021
  * Last change:  Tue Dec 14 06:16:21 2021 by hacluster via crmd on controller02-khi01
  * 2 nodes configured
  * 3 resource instances configured

Node List:
  * Online: [ controller-khi01 controller02-khi01 ]

Full List of Resources:
  * Clone Set: ovndb_servers-clone [ovndb_servers] (promotable):
    * Masters: [ controller-khi01 ]
    * Slaves: [ controller02-khi01 ]
  * ovn-virtual-ip      (ocf::heartbeat:IPaddr2):        Started controller-khi01

Failed Resource Actions:
  * ovndb_servers_monitor_10000 on controller02-khi01 'master' (8): call=17, status='complete', exitreason='', last-rc-change='2021-12-14 06:48:26Z', queued=0ms, exec=0ms

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
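
Since Pacemaker reports one master and one slave for ovndb_servers, one way to cross-check what the database servers themselves report is to ask each running server for its replication state. A sketch, assuming the OVN database control sockets are in /var/run/ovn and are named ovnnb_db.ctl and ovnsb_db.ctl (the location may differ between packages); `ovn_sync_status` is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: ask each OVN database server whether it is active or backup.
# Assumption: control sockets are <rundir>/ovnnb_db.ctl and <rundir>/ovnsb_db.ctl.
OVN_RUNDIR="${OVN_RUNDIR:-/var/run/ovn}"

# ovn_sync_status is a hypothetical helper; $1 is the directory to look in.
ovn_sync_status() {
    rundir="$1"
    for db in ovnnb_db ovnsb_db; do
        ctl="$rundir/$db.ctl"
        if [ -S "$ctl" ]; then
            echo "== $db =="
            # sync-status reports the replication state of a running ovsdb-server
            ovs-appctl -t "$ctl" ovsdb-server/sync-status
        else
            echo "$db: no control socket at $ctl"
        fi
    done
}

ovn_sync_status "$OVN_RUNDIR"
```

Running this on both controllers would show whether each server considers itself active or backup, independently of what the cluster manager reports.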

I would really appreciate any input in this regard.

Best regards,

Faisal Sheikh