Data Center Survival in case of Disaster / HW Failure in DC
KK CHN
kkchn.in at gmail.com
Thu May 5 09:16:11 UTC 2022
List,
We have an old cloud setup running OpenStack Ussuri on Debian
(QEMU/KVM). I know it's very old, but we can't upgrade to a newer version
right now.
The deployment is as follows.
A. 3 controller nodes in HA mode (they also act as compute nodes; VMs are
running on the controllers too)
B. 6 separate compute nodes
C. 3 separate storage nodes with Ceph RBD
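To make the failure question concrete, here is a rough sketch of how many simultaneous node failures a tier could tolerate, assuming its clustered services rely on a strict majority quorum. That is an assumption on my part and not something I have verified for this setup; the function name and the tier table are only illustrative. The compute nodes are left out because they do not form a quorum.

```python
# Rough sketch: failures tolerated by a strict-majority-quorum cluster.
# Assumption: the tier's services need more than half of the members alive.

def tolerable_failures(members: int) -> int:
    """Return how many members can fail while a strict majority survives."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return (members - 1) // 2

# Node counts taken from the deployment list above.
tiers = {"controller": 3, "storage": 3}

for name, count in tiers.items():
    print(f"{name}: {count} nodes, tolerates {tolerable_failures(count)} failure(s)")
```

Under that assumption, losing a second controller or a second storage node at the same time would already be a full outage for that tier.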
Questions:
1. In case of a sudden hardware failure of one or more controller nodes,
compute nodes, or storage nodes, what immediate redundancy and recovery
setup needs to be in place?
2. In case of a H/W failure, our recovery needs to be as fast as possible,
for example less than 30 minutes after the first failure occurs.
3. Are there setup options such as a hot standby or similar, or what else
do we need to employ?
4. We need to meet both the RTO (< 30 minutes of downtime) and the RPO (all
applications and data must be consistent as of the exact point of the
crash).
5. Please share your thoughts on reliable crash/fault-resistant
configuration options in the DC.
We have a DR setup at a remote location right now. I would also
like to know if there is a recommended way to bring the remote DR site
up and running automatically, or how to automate serving from the DR site
so that we meet the RTO and RPO exactly.
Any thoughts are most welcome.
Regards,
Krish