<div dir="ltr">Hi Pierre-Samuel,<div>at this point most of the OpenStack projects have their own way to archive/delete soft deleted records.</div><div>But one thing usually missing is the retention period of soft deleted records and then the archived data.</div><div><br></div><div>I'm interested to learn more about what you are doing.</div><div>Is there any link to access the code?</div><div><br></div><div>Belmiro</div><div>CERN</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, May 9, 2019 at 5:25 PM Pierre-Samuel LE STANG <<a href="mailto:pierre-samuel.le-stang@corp.ovh.com">pierre-samuel.le-stang@corp.ovh.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left-width:1px;border-left-style:solid;border-left-color:rgb(204,204,204);padding-left:1ex">Hi all,<br>

<br>

At OVH we needed to write our own tool that archives data from OpenStack<br>

databases to prevent some side effects related to huge tables (slower response<br>

times, changing MariaDB query plans) and to address some legal aspects.<br>

<br>

So we started writing a Python tool called OSArchiver, which I briefly<br>

presented in Denver a few days ago in the "Optimizing OpenStack at large scale"<br>

talk. We think that this tool could be helpful to others and are ready to open<br>

source it, but first we would like to get the opinion of the ops community about<br>

that tool.<br>

<br>

To sum up, OSArchiver is written to work regardless of the OpenStack project. The<br>

tool relies on the fact that soft deleted data are recognizable by<br>

their 'deleted' column, which is set to 1 or a uuid, and their 'deleted_at' column, which<br>

is set to the date of deletion.<br>
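<br>

As a rough illustration of that rule (a minimal sketch, not OSArchiver's actual code): the function names, the sample rows, and the way the retention period is applied are all assumptions made for this example.<br>

```python
from datetime import datetime, timedelta

# Values of 'deleted' treated as "not deleted" (assumption for this sketch).
NOT_DELETED = (None, 0, "0", "")


def is_soft_deleted(row):
    """Return True when 'deleted' is set (1 or a uuid) and 'deleted_at' is set."""
    return row.get("deleted") not in NOT_DELETED and row.get("deleted_at") is not None


def is_archivable(row, retention, now):
    """Return True when the row was soft deleted at least `retention` ago."""
    return is_soft_deleted(row) and now - row["deleted_at"] >= retention


# Hypothetical rows, as dictionaries keyed by column name.
now = datetime(2019, 5, 9)
rows = [
    {"id": 1, "deleted": 0, "deleted_at": None},
    {"id": 2, "deleted": 1, "deleted_at": datetime(2019, 1, 1)},
    {"id": 3, "deleted": "0b1d7a3e-6f1c-4c55-9a53-2f1f3c0a9e11",
     "deleted_at": datetime(2019, 5, 1)},
]

retention = timedelta(days=30)
archivable_ids = [r["id"] for r in rows if is_archivable(r, retention, now)]
```

Row 3 is soft deleted but still inside the assumed 30-day retention window, so only row 2 ends up in archivable_ids.<br>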

<br>

The points to keep in mind about OSArchiver:<br>

* There is no knowledge of business objects<br>

* A table can be archived if it contains a 'deleted' column<br>

* Child rows are archived before parent rows<br>

* A row is not deleted if it fails to be archived<br>
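<br>

The last two points can be sketched as follows. This is only an illustration under stated assumptions, not the tool's implementation: the foreign-key map and the helper names are hypothetical, and the ordering is simplified to whole tables rather than individual rows.<br>

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical foreign-key map: parent table -> tables whose rows reference it.
CHILDREN_OF = {
    "instances": {"instance_metadata", "block_device_mapping"},
    "instance_metadata": set(),
    "block_device_mapping": set(),
}


def archive_order(children_of):
    """Return table names with every child table before its parent."""
    # TopologicalSorter expects node -> predecessors, so listing children
    # as predecessors makes them come out first.
    return list(TopologicalSorter(children_of).static_order())


def archive_then_delete(rows, archive, delete):
    """Archive each row, then delete it; skip the delete when archiving fails."""
    removed = []
    for row in rows:
        try:
            archive(row)
        except Exception:
            continue  # archiving failed, so the source row is kept
        delete(row)
        removed.append(row)
    return removed
```

Here archive and delete are caller-supplied callables, so the same loop works whatever the destination is.<br>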

<br>

Here are the features already implemented:<br>

* Archive data in another database and/or file (currently SQL and CSV<br>

formats are supported) to be easily imported<br>

* Delete data from OpenStack databases<br>

* Customizable (retention, exclude DBs, exclude tables, bulk insert/delete)<br>

* Multiple archiving configurations<br>

* Dry-run mode<br>

* Easily extensible, you can add your own destination module (other file<br>

format, remote storage, etc.)<br>

* Archive-only and/or delete-only mode<br>

<br>

It also means that by design you can run osarchiver not only on OpenStack<br>

databases but also on archived OpenStack databases.<br>

<br>

Thanks in advance for your feedback.<br>

<br>

-- <br>

Pierre-Samuel Le Stang<br>

<br>

</blockquote></div>