[openstack-dev] [TROVE] Point in time recovery design

Denis Makogon dmakogon at mirantis.com
Tue Mar 4 17:05:05 UTC 2014


Hi, Daniel.

Let's make a few things clear:
 "restore" - spinning up a new instance from a backup.
 "recover" - recovering/restoring (in terms of database
management/administration) a healthy or corrupted instance from a backup
that has its own timestamp.
There's a difference between spinning up a new instance from a backup
and recovering an already running instance.
As I wrote, from both the administrative and the user perspective, Trove
should be able to do point-in-time recovery for healthy or corrupted servers.
Also, point-in-time recovery cannot be part of scheduled tasks
because it is a one-off operation.

So, point-in-time recovery is a "must have" feature for Trove. Basically,
Trove should support both independently: restoring/spinning up a new
instance, and recovering an already running instance from a backup.

Best regards,
Denis Makogon.


On Tue, Mar 4, 2014 at 6:36 PM, Daniel Morris
<daniel.morris at rackspace.com> wrote:

>   Nice write-up Denis.  Generally, I think this should merge with the
> work going on for scheduled tasks and scheduled backups.
>
>  https://wiki.openstack.org/wiki/Trove/scheduled-tasks
>
>  Point in time recovery was one of the original goals of the scheduled /
> automated backup work, but had not been fully worked out.  Currently this
> work is sitting idle - https://review.openstack.org/#/c/73702/
>
>  I believe that at the time this was originally discussed, the idea was
> that this would be handled at new instance creation (not on an active
> instance), with a request like the following:
>
>  POST /instances
>
> {
>     "instance": {
>         "flavorRef": "https://service/v1.0/1234/flavors/1",
>         "name": "myinstance",
>         "volume": {
>             "size": 2
>         },
>         "restorePoint": {
>             "point_in_time": "2012-03-28T22:00Z",
>             "instanceRef": "https://service/v1.0/1234/instances/2450c73f-7805-4afe-a42c-4094ab42666b"
>         }
>     }
> }
>
>  Regardless of the API design (up for debate), we need this capability
> integrated and just need to work out the best way to do it.
>
>  Daniel
>
>   From: Denis Makogon <dmakogon at mirantis.com>
> Reply-To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-dev at lists.openstack.org>
> Date: Tuesday, March 4, 2014 5:15 AM
> To: OpenStack Development Mailing List <openstack-dev at lists.openstack.org>
> Subject: [openstack-dev] [TROVE] Point in time recovery design
>
>      Trove. Point-in-Time recovery.
>
> <https://docs.google.com/a/mirantis.com/document/d/12qHHYCQ3BTOKCEcbfp-75NPJc15xPD01WEQe9OmyOxc/edit>
>
>    1. Introduction.
>    2. What is a point in time recovery?
>    3. What does it take to do a point in time recovery?
>    4. What to consider once you know your database is corrupted?
>    5. Trove and Point-in-time recovery.
>    6. Trove core ReST API and Point-in-Time Recovery/Restore flow.
>       1. ReST routes.
>       2. Request body.
>       3. Response object.
>    7. Trove taskmanager RPC API and Point-in-Time Recovery/Restore flow.
>       1. RPC message.
>       2. RPC message type.
>    8. Trove guestagent RPC API and Point-in-Time Recovery/Restore flow.
>       1. RPC message.
>       2. RPC message type.
>       3. Method implementation.
>    9. Proposed implementation for Trove and for Python-troveclient.
>    10. Useful links.
>
>  Introduction
>
> Every once in a while, an event might happen that corrupts a database. We
> have all made a stupid mistake at least once that trashed a database.
> When this happens, what do you do? If you do not have a database backup,
> then you had better own up to the problem you caused and tell your boss
> that you screwed up. If you do have at least a complete database backup,
> then you most likely will be able to recover the corrupted database up to
> the point at which you corrupted the data. This article discusses how to
> use a point-in-time restore to recover your databases.
>
> If you google "Point in time recovery" you will also find "Point in time
> restore". So, let's decide what to call it. Historically, databases have
> called this feature point-in-time recovery.
>
>  What is a point-in-time recovery?
>
> So what is a point in time recovery? A point in time recovery is restoring
> a database to a specified date and time. When you have completed a point in
> time recovery, your database will be in the state it was at the specific
> date and time you identified when restoring your database. A point in time
> recovery is a method to recover your database to any point in time since
> the last database backup.
>
>  What does it take to do a point-in-time recovery?
>
> In order to perform a point in time recovery you will need to have an
> entire series of backups (complete, differential, and transaction log
> backups) up to and/or beyond the point in time in which you want to
> recover. If you are missing any backups, or have truncated the transaction
> log without first performing a transaction log backup, then you will not be
> able to perform a point in time recovery. At a minimum, you will need a
> complete backup and all the transaction log backups taken following the
> complete backup. Optionally, if you are taking differential backups, you
> will need the complete backup, the last differential backup prior to
> the corruption, and then all the transaction log backups taken following
> the differential backup.
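
The selection rule above can be sketched in Python. This is only an illustration under stated assumptions: the `Backup` record and the `restore_chain` function are hypothetical names, not Trove code, and the sketch checks time coverage only. It does not detect a gap in the middle of the transaction log chain.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Backup:
    # Hypothetical record: kind is "full", "differential", or "log".
    kind: str
    finished_at: datetime


def restore_chain(backups: List[Backup], target: datetime) -> List[Backup]:
    """Return the backups to apply, in order, to reach `target`."""
    ordered = sorted(backups, key=lambda b: b.finished_at)

    # Start from the latest full backup taken at or before the target.
    fulls = [b for b in ordered if b.kind == "full" and b.finished_at <= target]
    if not fulls:
        raise ValueError("no full backup taken at or before the target time")
    base = fulls[-1]
    chain = [base]
    start = base.finished_at

    # Optionally shorten the log replay with the latest differential backup.
    diffs = [b for b in ordered
             if b.kind == "differential"
             and base.finished_at < b.finished_at <= target]
    if diffs:
        chain.append(diffs[-1])
        start = diffs[-1].finished_at

    # Add log backups up to the first one that reaches or passes the target.
    covered = target <= start
    for b in ordered:
        if covered:
            break
        if b.kind == "log" and b.finished_at > start:
            chain.append(b)
            covered = b.finished_at >= target
    if not covered:
        raise ValueError("transaction log backups do not cover the target time")
    return chain
```

The last log backup in the returned chain is the one that would be replayed only up to the target time.
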
>
>   Trove and Point-in-time recovery
>
> OpenStack DBaaS Trove is able to perform instance restoration (a whole new
> instance, from scratch) from a backup previously stored in remote storage
> (OpenStack Swift, Amazon AWS S3, etc). From the administrator's and the
> regular user's perspective, Trove should also be able to perform
> point-in-time recovery. Mechanically it is almost the same as restoring a
> new instance, but the difference between restore (in Trove terms) and
> recovery is significant.
>
> Restore makes it possible to spin up a new instance from a backup (as
> mentioned earlier), whereas recovery makes it possible to restore an
> already running instance from a backup. To begin with, Trove would be able
> to recover/restore a running instance from a full backup.
>
>  Trove core ReST API and Point-in-Time Recovery/Restore flow
>  ReST routes
>
> HTTP method    Routes
>
> POST           {tenant_id}/instances/{instance_id}/recover
>                or
>                {tenant_id}/instances/{instance_id}/restore
>
>    Request body
>
> "recovery": {
>     "instance": UUID,
>     "backup": UUID
> }
>
>  Response object
>
> "recovery": {
>     "id": "instance_id",
>     "name": "instance_name",
>     "status": "BUILDING",
>     "datastore": "mysql",
>     "recovered_from_backup": "backup_id",
>     "point_in_time": "2011-01-22T13:25:27-06:00"
> }
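
A minimal sketch of how a client could assemble this request, assuming the route and body exactly as proposed above. `build_recovery_request` is a hypothetical helper, not part of python-troveclient, and any API version prefix in front of `{tenant_id}` is left out.

```python
import json
import uuid


def build_recovery_request(tenant_id, instance_id, backup_id, action="recover"):
    """Return (method, path, body) for the proposed recovery call."""
    # The proposal leaves the final route name open: "recover" or "restore".
    if action not in ("recover", "restore"):
        raise ValueError("action must be 'recover' or 'restore'")

    # uuid.UUID raises ValueError on malformed input, so bad IDs fail early.
    instance = str(uuid.UUID(instance_id))
    backup = str(uuid.UUID(backup_id))

    path = "/{0}/instances/{1}/{2}".format(tenant_id, instance, action)
    body = json.dumps({"recovery": {"instance": instance, "backup": backup}})
    return "POST", path, body
```

The caller would send `body` with whatever HTTP client it already uses; the sketch deliberately stops short of choosing one.
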
>
>  Trove taskmanager RPC API and Point-in-Time Recovery/Restore flow
>  RPC message
>
> RPC method              Method parameters
>
> do_instance_recovery    instance_id, backup_id
>
>  RPC message type
>
>     CAST, then poll until the instance reaches ACTIVE status.
>
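
A cast returns no result, so the caller has to poll for completion. A minimal sketch of such a loop: `wait_until_active` and the `get_status` callable are hypothetical, BUILDING and ACTIVE come from the proposal, and the ERROR/FAILED states and the timeout defaults are assumptions made for illustration.

```python
import time


def wait_until_active(get_status, timeout=600.0, interval=5.0,
                      sleep=time.sleep, clock=time.monotonic):
    """Poll get_status() until it returns "ACTIVE" or the timeout expires."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == "ACTIVE":
            return status
        # Assumed failure states; stop early instead of waiting out the timeout.
        if status in ("ERROR", "FAILED"):
            raise RuntimeError("recovery ended in status %s" % status)
        if clock() >= deadline:
            raise TimeoutError("instance did not become ACTIVE in time")
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.
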
>  Trove guestagent RPC API and Point-in-Time Recovery/Restore flow
>  RPC message
>
> RPC method     Method parameters
>
> do_recovery    backup_info: {
>                    'id': backup_id,
>                    'location': location,
>                    'type': backup_type,
>                    'checksum': checksum,
>                }
>
>  RPC message type
>
>     CAST
>
>  Method implementation
>
> Re-used restore functionality (restore from full backup).
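
The `checksum` field in `backup_info` suggests the guest could verify the backup before handing it to the existing restore path. A sketch under stated assumptions: `fetch` and `restore` are hypothetical callables standing in for the storage download and the reused restore code, and the proposal does not name a hash algorithm, so SHA-256 is assumed here. For brevity the sketch buffers the whole backup in memory, which a real guest agent would avoid.

```python
import hashlib


def do_recovery(backup_info, fetch, restore, algorithm="sha256"):
    """Download a backup, verify its checksum, then run the restore step."""
    digest = hashlib.new(algorithm)
    chunks = []
    # fetch(location) is assumed to yield the backup as bytes chunks.
    for chunk in fetch(backup_info["location"]):
        digest.update(chunk)
        chunks.append(chunk)

    # Refuse to touch the running instance if the backup is damaged.
    if digest.hexdigest() != backup_info["checksum"]:
        raise ValueError("checksum mismatch for backup %s" % backup_info["id"])

    restore(b"".join(chunks))
```
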
>
>  Proposed implementation for Trove and for Python-troveclient
>
>    1. Trove: [1]
>    2. Python-troveclient: [2]
>
>  Useful links
>
>  [1] https://review.openstack.org/#/c/77222/
>  [2] https://review.openstack.org/#/c/77223/
>
>
>  Best Regards,
>  Denis Makogon
>  dmakogon at mirantis.com
>  www.mirantis.com
>