[openstack-dev] [TROVE] Point in time recovery design

Denis Makogon dmakogon at mirantis.com
Tue Mar 4 11:15:01 UTC 2014


Trove. Point-in-Time recovery.




   1. Introduction
   2. What is a point in time recovery?
   3. What does it take to do a point in time recovery?
   4. What to consider once you know your database is corrupted?
   5. Trove and Point-in-time recovery
   6. Trove core ReST API and Point-in-Time Recovery/Restore flow
      1. ReST routes
      2. Request body
      3. Response object
   7. Trove taskmanager RPC API and Point-in-Time Recovery/Restore flow
      1. RPC message
      2. RPC message type
   8. Trove guestagent RPC API and Point-in-Time Recovery/Restore flow
      1. RPC message
      2. RPC message type
      3. Method implementation
   9. Proposed implementation for Trove and for Python-troveclient
   10. Useful links

Introduction

Every once in a while, an event corrupts a database. We have all made a
stupid mistake at least once that trashed a database. When this happens,
what do you do? If you do not have a database backup, then you had better
own up to the problem you caused and tell your boss that you screwed up. If
you do have at least a complete database backup, then you will most likely
be able to recover the corrupted database up to the point at which you
corrupted the data. This article discusses how to use a point in time
restore to recover your databases.

If you google "Point in time recovery" you will also find "Point in time
restore", so let's decide what to call it. Historically, databases have
called this feature point in time recovery.

What is a point-in-time recovery?

So what is a point in time recovery? It is the restoration of a database to
a specified date and time. When a point in time recovery completes, your
database is in the state it was in at the date and time you specified when
restoring it. Point in time recovery lets you recover your database to any
point in time since the last database backup.

What does it take to do a point-in-time recovery?

In order to perform a point in time recovery you need an entire series of
backups (complete, differential, and transaction log backups) up to or
beyond the point in time to which you want to recover. If any backup is
missing, or the transaction log was truncated without first taking a
transaction log backup, you will not be able to perform a point in time
recovery. At a minimum, you need a complete backup and all the transaction
log backups taken after it. If you also take differential backups, you need
the complete backup, the last differential backup taken before the
corruption, and all the transaction log backups taken after that
differential backup.
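
The selection rule described above can be sketched in a few lines of
Python. This is only an illustration of the rule, not Trove code: the
Backup class, its "kind" values and the restore_chain function are
hypothetical names introduced here, and the sketch assumes each backup is
identified only by its completion time. Detecting a gap in the middle of
the log chain would need log sequence information that this sketch does not
model.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class Backup:
    kind: str            # hypothetical labels: "full", "differential", "log"
    taken_at: datetime   # time at which the backup completed


def restore_chain(backups: List[Backup], target: datetime) -> List[Backup]:
    """Return the backups to apply, in order, to reach `target`."""
    ordered = sorted(backups, key=lambda b: b.taken_at)

    # The chain always starts from the latest complete backup before target.
    fulls = [b for b in ordered if b.kind == "full" and b.taken_at <= target]
    if not fulls:
        raise ValueError("no complete backup taken before the target time")
    start = fulls[-1]
    chain = [start]

    # Optionally continue from the last differential taken after that backup.
    diffs = [b for b in ordered
             if b.kind == "differential"
             and start.taken_at < b.taken_at <= target]
    if diffs:
        start = diffs[-1]
        chain.append(start)

    if start.taken_at == target:
        return chain

    # Apply log backups until one of them reaches or passes the target.
    for log in ordered:
        if log.kind != "log" or log.taken_at <= start.taken_at:
            continue
        chain.append(log)
        if log.taken_at >= target:
            return chain
    raise ValueError("transaction log backups do not reach the target time")
```

The function raises ValueError in the two cases the text calls out: there is
no complete backup to start from, or the transaction log backups stop short
of the requested point in time.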

Trove and Point-in-time recovery

OpenStack DBaaS Trove is able to perform instance restoration (a whole new
instance, from scratch) from a backup previously stored in remote storage
(OpenStack Swift, Amazon AWS S3, etc.). From the perspective of an
administrator or a regular user, Trove should also be able to perform point
in time recovery. Mechanically it is almost the same as restoring to a new
instance, but in Trove terms the difference between restore and recovery is
significant.

Restore spins up a new instance from a backup (as mentioned earlier),
whereas recovery restores an already running instance from a backup. To
begin with, Trove would be able to recover/restore a running instance from a
full backup.

Trove core ReST API and Point-in-Time Recovery/Restore flow

ReST routes

HTTP method   Route
POST          {tenant_id}/instances/{instance_id}/recover
              or
              {tenant_id}/instances/{instance_id}/restore

Request body

{
    "recovery": {
        "instance": "<instance UUID>",
        "backup": "<backup UUID>"
    }
}

Response object

{
    "recovery": {
        "id": "instance_id",
        "name": "instance_name",
        "status": "BUILDING",
        "datastore": "mysql",
        "recovered_from_backup": "backup_id",
        "point_in_time": "2011-01-22T13:25:27-06:00"
    }
}
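
For illustration, a client could assemble the proposed call roughly as
follows. This is a sketch under assumptions, not python-troveclient code:
build_recovery_request is a hypothetical helper, the leading slash on the
path is assumed, and no API version prefix is added because the proposal
does not give one. Only the route and the two body fields come from the
proposal above.

```python
import json
import uuid


def build_recovery_request(tenant_id, instance_id, backup_id):
    """Return (method, path, body) for the proposed recover call."""
    # Both identifiers are described as UUIDs, so reject anything else early.
    # uuid.UUID raises ValueError on a malformed string.
    for value in (instance_id, backup_id):
        uuid.UUID(value)

    # Route from the proposal; the leading slash is an assumption.
    path = "/{0}/instances/{1}/recover".format(tenant_id, instance_id)

    # Body mirrors the "recovery" object shown in the request body section.
    body = {"recovery": {"instance": instance_id, "backup": backup_id}}
    return "POST", path, json.dumps(body, sort_keys=True)
```

The returned triple can be handed to any HTTP client; authentication
headers and the endpoint host are left out because they are outside the
scope of this proposal.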

Trove taskmanager RPC API and Point-in-Time Recovery/Restore flow

RPC message

RPC method             Method parameters
do_instance_recovery   instance_id, backup_id

RPC message type

    CAST, then poll until the instance reaches ACTIVE status.
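
Because a CAST returns nothing, the caller has to poll for the outcome. A
minimal sketch of such a polling loop is shown here, assuming a
caller-supplied get_status callable that returns the instance status as a
string. The function name, the timeout and interval defaults, and the
"ERROR"/"FAILED" statuses are assumptions made for this illustration; only
the ACTIVE status comes from the proposal.

```python
import time
from typing import Callable


def wait_until_active(get_status: Callable[[], str],
                      timeout: float = 600.0,
                      interval: float = 3.0,
                      sleep: Callable[[float], None] = time.sleep) -> None:
    """Poll get_status() until it returns ACTIVE or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "ACTIVE":
            return
        # Hypothetical failure statuses: stop early instead of waiting out
        # the full timeout when the instance cannot recover.
        if status in ("ERROR", "FAILED"):
            raise RuntimeError("recovery failed with status " + status)
        if time.monotonic() >= deadline:
            raise TimeoutError("instance did not reach ACTIVE in time")
        sleep(interval)
```

The sleep parameter exists only so the loop can be exercised without real
delays; in normal use the default time.sleep applies.
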

Trove guestagent RPC API and Point-in-Time Recovery/Restore flow

RPC message

RPC method    Method parameters
do_recovery   backup_info: {
                  'id': backup_id,
                  'location': location,
                  'type': backup_type,
                  'checksum': checksum,
              }

RPC message type

    CAST

Method implementation

Reuses the existing restore functionality (restore from a full backup).

Proposed implementation for Trove and for Python-troveclient

   1. Trove: [1]
   2. Python-troveclient: [2]

Useful links

[1] https://review.openstack.org/#/c/77222/
[2] https://review.openstack.org/#/c/77223/


Best Regards,
Denis Makogon
dmakogon at mirantis.com
www.mirantis.com