DR options with openstack
Burak Hoban
Burak.Hoban at iag.com.au
Thu Jan 16 22:41:56 UTC 2020
Hey Tony,
Keep in mind that if you're looking to run OpenStack but aren't comfortable relying on community support, there's always the option of going with a vendor-backed version. These are usually a good option for those who are a little more risk-averse, or who don't have the time or skills to maintain upstream releases. Going down that path usually means you can do less with OpenStack (depending on the vendor), but you get a large pool of resources to help troubleshoot and answer questions. We use both approaches internally for different clusters, and each has its pros and cons.
You touched on a few points in your original email...
> If you had two OpenStack clusters, one in "site 1" and another in "site 2", then you could look at below for backup/restore of instances cross-cluster:
- Freezer -> https://wiki.openstack.org/wiki/Freezer
- Trilio (basically just a series of nova snapshots under the covers) -> https://www.trilio.io/
On top of that, you could roll out a file-level backup tool on each instance; this would give you most of what replication offers without having to do any block-level tinkering.
> Failover of OpenStack controller/computes
If you have two sites, you can always go for a 3x Controller deployment spanning both sites. It depends on latency, obviously, but all you really need is a link good enough for RabbitMQ and Galera to talk reliably.
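One thing worth checking before spanning controllers across sites is what happens to cluster quorum when a whole site is lost. The sketch below applies the simple majority rule that Galera uses by default; the site names and the 2+1 controller split are assumptions for illustration, and it ignores node weights and arbitrators (garbd).

```python
def has_quorum(surviving: int, total: int) -> bool:
    """Return True if the surviving nodes form a strict majority."""
    return surviving * 2 > total


# Hypothetical layout: 3 controllers spread over two sites.
controllers = {"site1": 2, "site2": 1}
total = sum(controllers.values())

for lost_site, lost_nodes in controllers.items():
    surviving = total - lost_nodes
    state = "keeps quorum" if has_quorum(surviving, total) else "loses quorum"
    print(f"lose {lost_site}: {surviving}/{total} controllers left, cluster {state}")
```

With only two sites, one of them necessarily holds the majority, so losing that site would likely still need a manual bootstrap of the surviving controller.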
Failing that, I'd recommend backing up your Controller with ReaR. From there you can also schedule frequent automated jobs to take OpenStack DB backups. Recovery should then be a case of doing a ReaR restore, loading the latest OpenStack DB backup and starting everything up. You'll probably want to ensure your VLANs are spanned cross-site so you can reuse the same IP addresses.
https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/post_deployment/backup_and_restore/05_rear.html
https://superuser.openstack.org/articles/tutorial-rear-openstack-deployment/
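As a rough illustration of the scheduled DB backup job mentioned above, here is a minimal sketch that builds a timestamped `mysqldump` command. The backup directory and file naming are hypothetical, credentials are assumed to come from a client option file such as `~/.my.cnf`, and the TripleO guide linked above may well do this differently.

```python
import shlex
from datetime import datetime, timezone
from pathlib import PurePosixPath


def build_dump_command(backup_dir, now):
    """Build the mysqldump argv for one timestamped full-database dump."""
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    target = PurePosixPath(backup_dir) / f"openstack-db-{stamp}.sql"
    return [
        "mysqldump",
        "--all-databases",
        "--single-transaction",  # consistent snapshot of InnoDB tables
        f"--result-file={target}",
    ]


# Hypothetical backup directory; a fixed timestamp keeps the example reproducible.
argv = build_dump_command(
    "/var/backups/openstack",
    datetime(2020, 1, 16, 22, 41, 56, tzinfo=timezone.utc),
)
print(shlex.join(argv))
```

In a real job you would pass `datetime.now(timezone.utc)` and hand the list to `subprocess.run(argv, check=True)` from cron or a systemd timer, then copy the dump to the second site.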
In reality, the best solution would be to have two isolated clusters with your workloads spanned across both sites. Obviously that isn't always possible (speaking from personal experience), but pushing people down the Kubernetes path, and using automation/backup utilities for the rest, may cater for your needs.
Having said that, Albert's link does look promising -> https://docs.openstack.org/cinder/pike/contributor/replication.html
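For a feel of what that document describes, the sketch below renders a cinder.conf backend stanza with a `replication_device` entry. The section name, backend name and `backend_id` value are made up, the driver path is left as a placeholder, and whether a particular driver (nimble.py included) actually honours `replication_device` is an assumption that would need checking against the driver itself.

```python
import configparser
import io


def render_backend(section, backend_name, driver, target_id):
    """Render one cinder.conf backend stanza with a replication target."""
    cfg = configparser.ConfigParser(interpolation=None)
    cfg[section] = {
        "volume_backend_name": backend_name,
        "volume_driver": driver,
        # Drivers usually expect extra key:value pairs here (address, credentials).
        "replication_device": f"backend_id:{target_id}",
    }
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# All names below are hypothetical placeholders.
stanza = render_backend("nimble-site1", "site1", "<driver class path>", "site2-array")
print(stanza)
```

With replication configured on a driver that supports it, the failover itself is triggered per backend host, along the lines of `cinder failover-host <host>@nimble-site1 --backend_id site2-array`.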
Date: Thu, 16 Jan 2020 19:49:08 +0000
From: Albert Braden <Albert.Braden at synopsys.com>
To: Tony Pearce <tony.pearce at cinglevue.com>,
"openstack-discuss at lists.openstack.org"
<openstack-discuss at lists.openstack.org>
Subject: RE: DR options with openstack
Hi Tony,
It looks like Cheesecake didn’t survive but apparently some components of it did; details in https://docs.openstack.org/cinder/pike/contributor/replication.html
I’m not using Cinder now; we used it at eBay with Ceph and Netapp backends. Netapp makes it easy but is expensive; Ceph is free but you have to figure out how to make it work. You’re right about forking; we did it and then upgrading turned from an incredibly difficult ordeal to an impossible one. It’s better to stay with the “official” code so that upgrading remains an option.
I’m just an operator; hopefully someone more expert will reply with more useful info.
It’s true that our community lacks participation. It’s very difficult for a new operator to start using openstack and get help with the issues that they encounter. So far this mailing list has been the best resource for me. IRC and Ask Openstack are mostly unattended. I try to help out in #openstack when I can, but I don’t know a lot so I mostly end up telling people to ask on the list. On IRC sometimes I find help by asking in other openstack-* channels. Sometimes people complain that I’m asking in a developer channel, but sometimes I get help. Persistence is the key. If I keep asking long enough in enough places, eventually someone will answer. If all else fails, I open a bug.
Good luck and welcome to the Openstack community!
From: Tony Pearce <tony.pearce at cinglevue.com>
Sent: Wednesday, January 15, 2020 11:37 PM
To: openstack-discuss at lists.openstack.org
Subject: DR options with openstack
Hi all
My questions are:
1. How are people using iSCSI Cinder storage with Openstack to-date? For example a Nimble Storage array backend. I mean to say, are people using backend integration drivers for other hardware (like netapp)? Or are they using backend iscsi for example?
2. How are people managing DR with Openstack in terms of backend storage replication to another array in another location and continuing to use Openstack?
The environment which I am currently using;
1 x Nimble Storage array (iSCSI) with nimble.py Cinder driver
1 x virtualised Controller node
2 x physical compute nodes
This is Openstack Pike.
In addition, I have a 2nd Nimble Storage array in another location.
To explain the questions I’d like to put forward my thoughts for question 2 first:
For point 2 above, I have been searching for a way to utilise replicated volumes on the 2nd array from Openstack with existing instances. For example, if site 1 goes down how would I bring up openstack in the 2nd location and boot up the instances where their volumes are stored on the 2nd array. I found a proposal for something called “cheesecake” ref: https://specs.openstack.org/openstack/cinder-specs/specs/rocky/cheesecake-promote-backend.html
But I could not find if it had been approved or implemented. So I return to square 1. I have some thoughts about failing over the controller VM and compute node but I don’t think there’s any need to go into here because of the above blocker and for brevity anyway.
The nimble.py driver which I am using came with Openstack Pike, and it appears Nimble / HPE are no longer maintaining it. I saw a commit to remove nimble.py in the Openstack Train release. The driver uses the REST API to perform actions on the array, such as creating a volume, downloading the image, mounting the volume to the instance, snapshots, clones etc. This is great for me because to date I have around 10TB of openstack storage data allocated and the Nimble array shows the amount of data being consumed is <900GB. This is due to the compression and zero-byte snapshots and clones.
So coming back to question 2 – is it possible? Can you drop me some keywords that I can search for such as an Openstack component like Cheesecake? I think basically what I am looking for is a supported way of telling Openstack that the instance volumes are now located at the new / second array. This means a new cinder backend. Example, new iqn, IP address, volume serial number. I think I could probably hack the cinder db but I really want to avoid that.
So failing the above, it brings me back to question 1, which I asked before. How are people using Cinder volumes? Maybe I am going about this the wrong way and need to take a few steps backwards to go forwards? I need storage to be able to deploy instances onto. Snapshots and clones are desired. At the moment these operations take less time than the horizon dashboard takes to load because of the waiting API responses.
When searching for information about the above as an end-user / consumer I get a bit concerned. Is it right that Openstack usage is dropping? There’s no web forum to post questions. The chatroom on freenode is filled with ~300 ghosts. Ask Openstack questions go without response. Earlier this week (before I found this mail list) I had to use facebook to report that the Openstack.org website had been hacked. Basically it seems that if you’re a developer that can write code then you’re in but that’s it. I have never been a coder and so I am somewhat stuck.
Thanks in advance