<div dir="ltr">Got it, thanks very much.</div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Nov 8, 2013 at 2:32 AM, Samuel Merritt <span dir="ltr"><<a href="mailto:sam@swiftstack.com" target="_blank">sam@swiftstack.com</a>></span> wrote:<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">On 11/7/13 5:59 AM, Daniel Li wrote:<br>

</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im">

<br>

Thanks very much for your help, and please see my inline comments/questions.<br>

<br>

On Thu, Nov 7, 2013 at 2:30 AM, Samuel Merritt &lt;<a href="mailto:sam@swiftstack.com" target="_blank">sam@swiftstack.com</a>&gt; wrote:<br></div><div><div class="h5">

<br>

    On 11/6/13 7:12 AM, Daniel Li wrote:<br>

<br>

        Hi,<br>

              I have a question about Swift: what does Swift do if the<br>

        auditor<br>

        finds that all 3 replicas are corrupt?<br>

        Will it notify the owner of the object (by email to the account owner)?<br>

        What will happen to a GET request for the corrupted object?<br>

        Will it return a special error saying that all the replicas are<br>

        corrupted?<br>

           Or will it just say that the object does not exist?<br>

           Or will it just return one of the corrupted replicas?<br>

           Or something else?<br>

<br>

<br>

    If all 3 (or N) replicas are corrupt, then the auditors will<br>

    eventually quarantine all of them, and subsequent GET requests will<br>

    receive 404 responses.<br>

<br>

    No notifications are sent, nor is it really feasible to start<br>

    sending them. "The auditor" is not a single process; there is one<br>

    Swift auditor process running on each node in a cluster. Therefore,<br>

    when an object is quarantined, there's no way for its auditor to<br>

    know if the other copies are okay or not.<br>

<br>

    Note that this is highly unlikely to ever happen, at least with the<br>

    default of 3 replicas. When an auditor finds a corrupt object, it<br>

    quarantines it (moves it to a "quarantines" directory).<br>

<br>

  Do you mean that when the auditor finds the corruption, it does not

copy a good replica from another object server to overwrite the corrupted

one, but just moves it to a quarantines directory?<br>

</div></div></blockquote>

<br>

That is correct. The object auditors don't perform any network IO, and in fact do not use the ring at all. All they do is scan the filesystems and quarantine bad objects in an infinite loop.<br>
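
<br>

As a rough illustration of that loop, here is a minimal sketch of a single local audit pass. It is not Swift's actual auditor code: the names file_md5 and audit_once, the flat directory layout, and the expected_md5 dict (a stand-in for whatever checksum metadata is recorded with each object) are all assumptions made for the example.<br>

```python
import hashlib
import os
import shutil


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_once(objects_dir, quarantine_dir, expected_md5):
    """One local audit pass: no network IO and no ring lookups.

    `expected_md5` maps file name -> recorded MD5 (hypothetical stand-in
    for the checksum stored with each object).
    Returns the names of the files that were quarantined.
    """
    os.makedirs(quarantine_dir, exist_ok=True)
    quarantined = []
    for name in sorted(os.listdir(objects_dir)):
        path = os.path.join(objects_dir, name)
        if not os.path.isfile(path):
            continue
        if file_md5(path) != expected_md5.get(name):
            # Corrupt copy: move it aside; nothing is fetched from peers.
            shutil.move(path, os.path.join(quarantine_dir, name))
            quarantined.append(name)
    return quarantined
```

A real auditor would call something like audit_once repeatedly, forever. The point of the sketch is only that the pass touches the local filesystem and nothing else.<br>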

<br>

(Of course, there are also container and account auditors that do similar things, but for container and account databases.)<div class="im"><br>

<br>

<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">

    Then, since that object is missing, the replication processes will<br>

    recreate the object by copying it from a node with a good copy.<br>

<br>

When do the replication processes recreate the object by copying it

from a node with a good copy? Does the auditor send a message to the

replicator so that the replicator does the copy immediately? And what is

a 'good' copy? Is the good copy's MD5 value checked before copying?<br>

</blockquote>

<br></div>

It'll happen whenever the other replicators, which are running on other nodes, get around to it.<br>

<br>

Replication in Swift is push-based, not pull-based; there is no receiver here to which a message could be sent.<br>

<br>

Currently, a "good" copy is one that hasn't been quarantined. Since replication uses rsync to push files around the network, there's no checking of MD5 at copy time. However, there is work underway to develop a replication protocol that avoids rsync entirely and uses the object server throughout the entire replication process, and that would give the object server a chance to check MD5 checksums on incoming writes.<br>
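
<br>

To make the "good means not quarantined" point concrete, here is a toy in-memory model of push-based replication. It is only a sketch under stated assumptions: push_replicate and the dict-of-dicts layout are hypothetical, and the real replicator uses rsync and the ring rather than anything like this.<br>

```python
def push_replicate(nodes):
    """Each node pushes every object it holds to peers that lack it.

    `nodes` maps node name -> {object name: bytes}. A copy counts as
    "good" simply because it is still present (not quarantined); nothing
    here verifies a checksum, mirroring the rsync-based behaviour
    described above.
    """
    for source in sorted(nodes):
        for obj, data in list(nodes[source].items()):
            for peer in sorted(nodes):
                if peer != source and obj not in nodes[peer]:
                    nodes[peer][obj] = data  # copied as-is, no MD5 check
    return nodes


# Node "a" has quarantined its copy; "b" holds a rotted copy that has not
# been audited yet; "c" holds an intact copy.
nodes = {"a": {}, "b": {"obj": b"rotted"}, "c": {"obj": b"good"}}
push_replicate(nodes)
```

In this toy run, node "b" happens to push first, so node "a" ends up with the rotted bytes. That is the cost of not checking MD5 at copy time.<br>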


<br>

Note that this is only important should 2 replicas experience near-simultaneous bitrot; in that case, there is a chance that bad-copy A will get quarantined and replaced with bad-copy B. Eventually, though, a bad copy will get quarantined and replaced with a good copy, and then you've got 2 good copies and 1 bad one, which reduces to a previously-discussed scenario.<div class="HOEnZb">

<div class="h5"><br>

<br>

_______________________________________________<br>

OpenStack-dev mailing list<br>

<a href="mailto:OpenStack-dev@lists.openstack.org" target="_blank">OpenStack-dev@lists.openstack.org</a><br>

<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

</div></div></blockquote></div><br></div>