DC DR high availability - Snapshot taking feature and user experience - openstack-discuss

30 Sep 2021

List,

I am in the process of setting up DC and DR sites with OpenStack
(Ussuri, QEMU/KVM, with Debian as the base OS).

I would like to know what options are available for high availability
across the DC and DR sites.

What architecture should we follow at the DC and DR sites so that no
services, VMs, or applications go down if a VM crashes on one physical
machine at the DC (for example, with 3 controllers in the DC), if a host
machine crashes (RAM or other hardware failure, etc.), or if a machine's
power cable is accidentally detached?

How can we achieve this? Please share the best practices.

Also, if I take snapshots of running VMs, what will the user experience
be? Will the VMs freeze, or will users be logged out of the applications
they are currently logged in to? Can we avoid service unavailability for
users while taking snapshots?

To get the data to DR, we take a snapshot of each VM, convert the image
to qcow2, and then rsync it to the DR site. (We perform all of this on
the controller node. As an example, a 100 GB VM typically produces a
16 GB qcow2 image. Snapshot creation takes about 2 minutes, and the
qcow2 conversion takes 20 to 30 minutes.)

How many snapshots and qcow2 conversions (qemu-img convert) can be
performed at once on the controller machine where we run this operation?
(Can we apply parallel processing to this, and how?) See the specs of
the controller where we perform this.

Controller spec: 20-core CPUs (160 processors in total) with 1.5 TB RAM.
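
To make the parallel processing question concrete, here is a minimal
sketch of what I have in mind, not something we run today. The paths,
the function names, and the worker count are hypothetical. It only
wraps "qemu-img convert -O qcow2" and runs several conversions at once.
I assume the real limit will be disk and network I/O on the controller
rather than CPU, so the right worker count would have to be measured.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_convert_cmd(src, dst):
    """Return the qemu-img command line that converts src to a qcow2 file at dst."""
    return ["qemu-img", "convert", "-O", "qcow2", src, dst]


def convert_all(jobs, max_workers=4, runner=subprocess.run):
    """Run several conversions concurrently and return {dst: return code}.

    jobs is a list of (src, dst) path pairs. Threads are enough here
    because each worker only waits for an external qemu-img process.
    runner can be replaced with a stub for a dry run.
    """
    def convert_one(job):
        src, dst = job
        result = runner(build_convert_cmd(src, dst))
        return dst, result.returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(convert_one, jobs))


if __name__ == "__main__":
    # Hypothetical snapshot paths, for illustration only; nothing is converted here.
    jobs = [(f"/var/tmp/vm{i}.raw", f"/var/tmp/vm{i}.qcow2") for i in range(3)]
    for src, dst in jobs:
        print(" ".join(build_convert_cmd(src, dst)))
```

Calling convert_all(jobs, max_workers=8) would start up to 8 qemu-img
processes at a time; a non-zero return code in the result marks a
failed conversion.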

We then copy the image to DR, which is 200 km away from our DC; this
takes 4 minutes with rsync.
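
Adding up the figures above gives a rough per-VM time for the whole
pipeline. The small calculation below is only an estimate: the VM count
and worker count in the last line are made-up numbers, and it assumes
the steps scale ideally when run in parallel, which I have not verified.

```python
import math

# Per-VM timings in minutes, as quoted in this mail.
SNAPSHOT_MIN = 2
CONVERT_MIN = (20, 30)  # best and worst case for qemu-img convert
RSYNC_MIN = 4


def per_vm_minutes(convert_min):
    """Snapshot + conversion + rsync for one VM, run back to back."""
    return SNAPSHOT_MIN + convert_min + RSYNC_MIN


def batch_minutes(vm_count, workers, convert_min):
    """Wall-clock estimate when `workers` VMs are processed at a time.

    Assumes ideal scaling, i.e. no slowdown from shared disk or network.
    """
    rounds = math.ceil(vm_count / workers)
    return rounds * per_vm_minutes(convert_min)


best, worst = (per_vm_minutes(c) for c in CONVERT_MIN)
print(best, worst)               # 26 36
print(batch_minutes(40, 8, 30))  # 180 (hypothetical: 40 VMs, 8 at a time)
```

So each VM needs roughly 26 to 36 minutes end to end, and the copy at DR
is at least that much behind the running VM.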

Finally, we create each VM at DR from the copied qcow2 image and attach
the network IPs and NAT there. Is this the usual practice, or is there a
better way to do this for a robust DC and DR setup?
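
For reference, the restore step at DR could be sketched roughly as
below. This is an illustration, not our exact procedure: the VM name,
image path, flavor, network, and address are placeholders, and I am
assuming that "NAT" here means an OpenStack floating IP. The sketch only
builds the openstack CLI commands and prints them; it does not run
anything.

```python
def restore_commands(vm_name, qcow2_path, flavor, network, floating_ip):
    """Build the openstack CLI commands to recreate one VM at the DR site."""
    image_name = f"{vm_name}-dr"
    return [
        # Upload the copied qcow2 file to Glance.
        ["openstack", "image", "create",
         "--disk-format", "qcow2", "--container-format", "bare",
         "--file", qcow2_path, image_name],
        # Boot a server from that image and wait until the build finishes.
        ["openstack", "server", "create",
         "--image", image_name, "--flavor", flavor,
         "--network", network, "--wait", vm_name],
        # Attach the floating IP (the NAT address) to the new server.
        ["openstack", "server", "add", "floating", "ip", vm_name, floating_ip],
    ]


# Placeholder values, for illustration only.
for cmd in restore_commands("web01", "/srv/dr/web01.qcow2",
                            "m1.medium", "dr-net", "203.0.113.10"):
    print(" ".join(cmd))
```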

Kindly share your thoughts.

Regards,
Kris

KK CHN
