DC DR high availability - Snapshot taking feature and user experience
kkchn.in at gmail.com
Thu Sep 30 05:36:03 UTC 2021
I am in the process of setting up DC and DR sites with
OpenStack (Ussuri, QEMU/KVM, with Debian as the base OS).
I would like to know what options are available at the DC and DR sites for
high availability.
What architecture should we follow at the DC and DR sites so that, if a VM
crashes on one physical machine at the DC (for example, 3 controllers in
the DC), or a host machine crashes (RAM or other hardware failure, etc.),
or a machine's power cable is accidentally detached, no services go down
and the VMs and applications stay up?
How can we achieve this? Please share the best practices.
Also, if I take snapshots of running VMs, what will the user experience
be? Will the VMs freeze, or will users be logged out of the applications
they are currently logged in to? Can we avoid service unavailability for
users while taking snapshots?
We rsync these snapshots to DR after converting each VM image to qcow2.
(We perform all of this on the controller node. A 100 GB VM typically
produces a 16 GB qcow2 image, for example.) It takes 2 minutes to create
a snapshot and 20 to 30 minutes for the qcow2 conversion.
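To put the timings above in perspective, here is a rough back-of-the-envelope
sketch in Python. The function name and the assumption that parallel workers
do not slow each other down are mine; in practice the conversions share disk
I/O, so real numbers will likely be worse.

```python
import math


def batch_minutes(num_vms, workers, snapshot_min=2.0, convert_min=30.0):
    """Rough wall-clock estimate for snapshotting and converting num_vms VMs.

    Assumes each worker handles one VM at a time (snapshot, then convert)
    and that workers do not slow each other down. That is optimistic,
    because conversions usually compete for the same storage.
    """
    if num_vms < 0 or workers < 1:
        raise ValueError("num_vms must be >= 0 and workers must be >= 1")
    # VMs are processed in rounds of `workers` at a time.
    rounds = math.ceil(num_vms / workers)
    return rounds * (snapshot_min + convert_min)
```

With the worst-case figures quoted (2 minutes per snapshot, 30 minutes per
conversion), batch_minutes(10, 1) gives 320 minutes for ten VMs done one
after another, and batch_minutes(10, 4) gives 96 minutes under the
optimistic no-contention assumption.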
How many snapshots and qcow2 conversions (qemu-img convert) can be
performed on the controller machine where we run this operation? (Can we
apply parallel processing to this, and how?) See the controller specs
below.
Controller spec: 20-core CPUs, 1.5 TB RAM, 160 processors in total.
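As an illustration of the kind of parallel processing being asked about, a
minimal sketch in Python follows. It is not what we currently run: the
helper names, the default of 4 workers, and the injectable runner argument
are assumptions for illustration. It relies only on the standard
"qemu-img convert -O qcow2 SRC DST" invocation and lets qemu-img detect the
source format.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def convert_cmd(src, dst):
    """Build the qemu-img command that writes a qcow2 copy of src to dst."""
    return ["qemu-img", "convert", "-O", "qcow2", src, dst]


def convert_all(jobs, max_workers=4, runner=subprocess.run):
    """Run one conversion per (src, dst) pair, at most max_workers at once.

    Threads are sufficient here because the heavy work happens in the
    qemu-img child processes. Returns a dict mapping each dst to True
    (conversion succeeded) or False (it failed or could not be started).
    """
    def one(job):
        src, dst = job
        try:
            runner(convert_cmd(src, dst), check=True)
            return dst, True
        except (subprocess.CalledProcessError, OSError):
            return dst, False

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(one, jobs))
```

A call such as convert_all([("/snap/vm1.raw", "/out/vm1.qcow2")]) (paths
hypothetical) would run the conversions concurrently. The sensible value
for max_workers probably depends more on storage throughput than on the
core count, so it would need to be measured rather than guessed.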
Copying to DR (which is 200 km away from our DC) then takes 4 minutes with
We then create each VM at DR from the copied qcow2 images and attach the
network IPs and NAT there. Is this the usual practice, or is there a
better way to do this for a robust DC and DR setup?
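For concreteness, the copy-and-restore steps described above could be
scripted roughly as below. This is only a sketch: the helper names are
hypothetical, and the host, directory, flavor and network ID values are
placeholders the caller must supply. It only builds the rsync and
openstack CLI command lines and does not execute anything.

```python
def rsync_cmd(qcow2_path, dr_host, dr_dir):
    """Command to copy one qcow2 file to the DR host.

    --partial keeps partially transferred files so a retry can resume.
    """
    return ["rsync", "-a", "--partial", qcow2_path, f"{dr_host}:{dr_dir}/"]


def image_create_cmd(image_name, qcow2_path):
    """Command to register the copied qcow2 file as an image at the DR site."""
    return [
        "openstack", "image", "create",
        "--disk-format", "qcow2",
        "--container-format", "bare",
        "--file", qcow2_path,
        image_name,
    ]


def server_create_cmd(server_name, image_name, flavor, net_id, fixed_ip=None):
    """Command to boot a VM from the image, optionally with a fixed IPv4."""
    nic = f"net-id={net_id}"
    if fixed_ip:
        nic += f",v4-fixed-ip={fixed_ip}"
    return [
        "openstack", "server", "create",
        "--image", image_name,
        "--flavor", flavor,
        "--nic", nic,
        server_name,
    ]
```

Each returned list can be passed to subprocess.run(cmd, check=True) on the
appropriate node once the placeholder values are filled in.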
Kindly share your thoughts.