<div dir="ltr">Thanks, Paul and Clay. <div><br></div><div>By "deleted one data fragment" I meant that I used "rm" on the data file only. I did not delete the hashes.pkl file in the outer directory. </div><div><br></div><div>I tried it again, this time deleting both the data file and the hashes.pkl file. The reconstructor was then able to restore the data file correctly.</div><div><br></div><div>But now I wonder: is it "by design" that EC does not handle an accidental deletion of just the data file? Deleting both the data file and the hashes.pkl file looks more like a deliberately created failure case than a normal one. To me, Swift EC repair seems different from triple-replication mode, where any copy of a data file that you delete is restored. </div><div> </div><div class="gmail_extra"><br clear="all"><div><div class="gmail_signature"><div><br></div><div>Thanks</div><div><br></div><div>Changbin<br></div></div></div>
<br><div class="gmail_quote">On Tue, Jul 21, 2015 at 5:28 PM, Luse, Paul E <span dir="ltr">&lt;<a href="mailto:paul.e.luse@intel.com" target="_blank">paul.e.luse@intel.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div lang="EN-US" link="blue" vlink="purple">
<div>
<p class="MsoNormal"><span style="font-family:"Calibri","sans-serif";color:#44546a">I was about to ask the very same thing. Could you also tell us whether you have seen errors in any logs and, if so, provide them as well? I’m hoping it is
simply that you didn’t delete the hashes.pkl file :)</span><span style="font-family:"Calibri","sans-serif";color:#44546a"><u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:"Calibri","sans-serif";color:#44546a"><u></u> <u></u></span></p>
<p class="MsoNormal"><span style="font-family:"Calibri","sans-serif";color:#44546a">-Paul<u></u><u></u></span></p>
<p class="MsoNormal"><span style="font-family:"Calibri","sans-serif";color:#44546a"><u></u> <u></u></span></p>
<p class="MsoNormal"><b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Clay Gerrard [mailto:<a href="mailto:clay.gerrard@gmail.com" target="_blank">clay.gerrard@gmail.com</a>]
<br>
<b>Sent:</b> Tuesday, July 21, 2015 2:22 PM<br>
<b>To:</b> OpenStack Development Mailing List (not for usage questions)<br>
<b>Subject:</b> Re: [openstack-dev] [Openstack] [Swift] Erasure coding reconstructor doesn't work<u></u><u></u></span></p><div><div class="h5">
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">How exactly did you "<span style="font-size:9.5pt">delete one data fragment"?</span><u></u><u></u></p>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt">Like replication, the EC consistency engine uses sub-directory hashing to speed up replication requests in a consistent system. So if you just rm a file down in a hashdir somewhere, you also need to
delete the hashes.pkl up in the part dir (or call the invalidate_hash method, as PUT, DELETE, POST, and quarantine do).</span><u></u><u></u></p>
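<p class="MsoNormal">To make the manual step above concrete, here is a minimal sketch of removing a fragment by hand together with the partition's hashes.pkl. It assumes the on-disk layout partition/suffix/hash/file, so the part dir is three levels above the data file. The helper names, the partition and hash values, and the fragment file name are made up for illustration and are not part of Swift's API.</p>
<pre>
```python
import tempfile
from pathlib import Path


def remove_fragment_and_hashes(data_file):
    """Delete a fragment file and the hashes.pkl of its partition directory.

    Assumption: the layout is part_dir/suffix_dir/hash_dir/data_file, so the
    partition directory is three levels above the data file.
    """
    data_file = Path(data_file)
    part_dir = data_file.parents[2]      # hash dir, then suffix dir, then part dir
    hashes_pkl = part_dir / "hashes.pkl"

    data_file.unlink()                   # the manual "rm" of the fragment
    had_hashes = hashes_pkl.exists()
    if had_hashes:
        hashes_pkl.unlink()              # drop the cached hashes, as described above
    return had_hashes


def demo():
    """Build a throwaway directory tree and run the helper against it."""
    with tempfile.TemporaryDirectory() as root:
        # Hypothetical partition, suffix, hash and file names.
        part_dir = Path(root) / "objects-1" / "1013"
        hash_dir = part_dir / "a83" / "0123456789abcdef0123456789abca83"
        hash_dir.mkdir(parents=True)
        data_file = hash_dir / "1437511084.12345#1.data"
        data_file.write_bytes(b"fragment payload")
        hashes_pkl = part_dir / "hashes.pkl"
        hashes_pkl.write_bytes(b"cached suffix hashes")

        removed = remove_fragment_and_hashes(data_file)
        return removed, data_file.exists(), hashes_pkl.exists()
```
</pre>
<p class="MsoNormal">Calling demo() only touches a temporary directory, so it is safe to try outside a real cluster. On a real object server the same idea applies to the fragment's actual path.</p>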
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt">Every so often someone discusses the idea of having the auditor invalidate a hash after "long enough" or take some action on empty hashdirs (mind the races!) - but it's really only an issue when someone deletes
something by hand, so we normally manage to get distracted with other things.</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:9.5pt">-Clay</span><u></u><u></u></p>
</div>
</div>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
<div>
<p class="MsoNormal">On Tue, Jul 21, 2015 at 1:38 PM, Changbin Liu &lt;<a href="mailto:changbin.liu@gmail.com" target="_blank">changbin.liu@gmail.com</a>&gt; wrote:<u></u><u></u></p>
<div>
<div>
<p class="MsoNormal">Folks,<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">To test the latest feature of Swift erasure coding, I followed this document (<a href="http://docs.openstack.org/developer/swift/overview_erasure_code.html" target="_blank">http://docs.openstack.org/developer/swift/overview_erasure_code.html</a>)
to deploy a simple cluster. I used Swift 2.3.0.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">I am glad that operations like object PUT/GET/DELETE worked fine. I can see that objects were correctly encoded/uploaded and downloaded at the proxy and object servers. <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">However, I noticed that swift-object-reconstructor did not seem to work as expected. Here is my setup: my cluster has three object servers, and I use this policy:<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<div>
<p class="MsoNormal">[storage-policy:1]<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">policy_type = erasure_coding<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">name = jerasure-rs-vand-2-1<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">ec_type = jerasure_rs_vand<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">ec_num_data_fragments = 2<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">ec_num_parity_fragments = 1<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">ec_object_segment_size = 1048576<u></u><u></u></p>
</div>
</div>
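<p class="MsoNormal">As a rough sanity check of what the policy above implies, the sketch below parses it with Python's configparser and derives the fragment count and storage overhead. The function name and the summary keys are invented for this example. The "tolerated_losses" figure assumes a Reed-Solomon style code in which any ec_num_data_fragments of the fragments are enough to rebuild the object.</p>
<pre>
```python
import configparser

# The storage policy from the message, copied verbatim.
POLICY_TEXT = """
[storage-policy:1]
policy_type = erasure_coding
name = jerasure-rs-vand-2-1
ec_type = jerasure_rs_vand
ec_num_data_fragments = 2
ec_num_parity_fragments = 1
ec_object_segment_size = 1048576
"""


def summarize_policy(text, section="storage-policy:1"):
    """Return fragment count, tolerated losses and overhead for an EC policy."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    policy = parser[section]
    k = policy.getint("ec_num_data_fragments")
    m = policy.getint("ec_num_parity_fragments")
    return {
        "total_fragments": k + m,        # one fragment per object server here
        "tolerated_losses": m,           # assumption: any k fragments suffice
        "storage_overhead": (k + m) / k, # bytes stored per byte of user data
    }


if __name__ == "__main__":
    print(summarize_policy(POLICY_TEXT))
```
</pre>
<p class="MsoNormal">For this 2+1 policy that gives three fragments per object, which matches the three object servers in the setup, at 1.5 times the raw object size.</p>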
<div>
<p class="MsoNormal"><br clear="all">
<u></u><u></u></p>
<div>
<div>
<div>
<p class="MsoNormal">After I uploaded one object, I verified that there was one data fragment on each of two object servers and one parity fragment on the third object server. However, when I deleted one data fragment, no matter how long I waited, it never
got repaired, i.e., the deleted data fragment was never regenerated by the swift-object-reconstructor process.<u></u><u></u></p>
</div>
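<p class="MsoNormal">For readers unfamiliar with what "regenerating" a fragment means in a 2+1 layout, here is a toy sketch. It deliberately uses plain XOR parity, which is not the jerasure_rs_vand code the policy selects and not what the reconstructor runs. It only illustrates that with two data fragments and one parity fragment, any single missing fragment can be rebuilt from the other two. All function names are made up for the example.</p>
<pre>
```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_2_plus_1(payload):
    """Split payload into two data fragments plus one XOR parity fragment."""
    if len(payload) % 2:
        payload += b"\0"                 # pad so both halves have equal length
    half = len(payload) // 2
    d0, d1 = payload[:half], payload[half:]
    return [d0, d1, xor_bytes(d0, d1)]


def rebuild_missing(fragments):
    """Rebuild the single missing fragment (marked None) from the other two."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) != 1:
        raise ValueError("exactly one fragment must be missing")
    present = [frag for frag in fragments if frag is not None]
    rebuilt = list(fragments)
    # With d0, d1 and p = d0 XOR d1, each fragment is the XOR of the other two.
    rebuilt[missing[0]] = xor_bytes(present[0], present[1])
    return rebuilt


if __name__ == "__main__":
    frags = encode_2_plus_1(b"swift erasure coding")
    damaged = [frags[0], None, frags[2]]  # simulate deleting one data fragment
    print(rebuild_missing(damaged) == frags)
```
</pre>
<p class="MsoNormal">The toy version rebuilds as soon as it is told which fragment is gone. It does not model how the reconstructor finds out that a fragment is missing.</p>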
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">My question: is swift-object-reconstructor supposed to be "NOT WORKING" given the current implementation status? Or, is there any configuration I missed in setting up swift-object-reconstructor?<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
<div>
<p class="MsoNormal">Thanks<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="color:#888888"><u></u> <u></u></span></p>
</div>
<div>
<p class="MsoNormal"><span style="color:#888888">Changbin<u></u><u></u></span></p>
</div>
</div>
</div>
</div>
</div>
<p class="MsoNormal" style="margin-bottom:12.0pt"><br>
__________________________________________________________________________<br>
OpenStack Development Mailing List (not for usage questions)<br>
Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">
OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><u></u><u></u></p>
</div>
<p class="MsoNormal"><u></u> <u></u></p>
</div>
</div></div></div>
</div>
<br></blockquote></div><br></div></div>