<div dir="ltr"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, May 23, 2016 at 1:49 PM, Shrinand Javadekar <span dir="ltr"><<a href="mailto:shrinand@maginatics.com" target="_blank">shrinand@maginatics.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>

If objects are placed on different devices than the computed ones,<br>

they will be unavailable until the replication places them at the<br>

correct location. </blockquote><div><br></div><div>This part doesn't sound quite right to me, but the transaction logs will tell.</div><div><br></div><div>My guess is that if the nodes the data is being written to (primary or handoff) are so overloaded that requests to them time out, then even after the proxy has checked request_node_count backend storage nodes, the response can still end up looking like a 404, because none of the nodes that were able to respond had the data.</div><div><br></div><div>Imagine that on the original PUT all three primaries hit their connect timeout, so the request is streamed to the first three handoffs, only two of which complete successfully.  The storage nodes responded [Timeout, Timeout, Timeout, 201, 201, 503*] - so it's written successfully to two of the three handoff nodes (*ChunkWriteTimeout on the third; remember, this cluster is terribly overloaded).</div><div><br></div><div>Then on GET the responses might be [404, 404, 404, Timeout, Timeout, 404, 404, 404, 404] - the three primaries miss, of course, but if the first two handoffs then time out, it doesn't matter how many other handoff nodes are checked - the response has to be 404, because the only two places we wrote the data are both so hammered under load they can't respond.</div><div><br></div><div>But it's not because replication needs to "move" anything - yes, the data will eventually get moved from the handoffs to the primaries, but in the meantime the read path uses the same stable handoff order as the write path.</div><div><br></div><div>... but that's just my guess. It's a rather curious failure mode, and the transaction logs would have all the details.  Happy hunting!</div><div><br></div><div>-Clay</div><div><br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">And this could take a really long time. Is that<br>

right?<br>

<br>

-Shri<br>

<div class="HOEnZb"><div class="h5"><br>

On Fri, May 20, 2016 at 4:53 PM, Mark Kirkwood<br>

<<a href="mailto:mark.kirkwood@catalyst.net.nz">mark.kirkwood@catalyst.net.nz</a>> wrote:<br>

> On 21/05/16 05:27, Shrinand Javadekar wrote:<br>

>><br>

>> Hi,<br>

>><br>

>> I am troubleshooting a test setup where Swift returned a 201 for<br>

>> objects that were put in it but later when I tried to read it, I got<br>

>> back 404s.<br>

>><br>

>> The system has been under load. I see lots of connection errors,<br>

>> lock-timeouts, etc. However, I am not sure Swift should ever be<br>

>> returning a 404.<br>

>><br>

>> I tried simulating some of these on a different setup and always got<br>

>> the expected response (which wasn't a 404).<br>

>><br>

>> - Stopped memcached and did a blob get. This returned a 401 Unauthorized<br>

>> error.<br>

>><br>

>> - Stopped the object-server and did a blob get. This returned a 503<br>

>> internal server error.<br>

>><br>

>> - Stopped the container-server. This didn't have any effect. The<br>

>> container-server is not consulted on every GET.<br>

>> - Stopped the account-server. Same result as container-server.<br>

>><br>

>> Any ideas on when Swift might return a 404 even though the object was<br>

>> successfully written?<br>

>><br>

><br>

> In addition to what John said, I've seen that sort of behaviour on slow or<br>

> heavily loaded systems, e.g.:<br>

><br>

> - write an object (successful)<br>

> - immediately try to read it (404)<br>

> - a few minutes later try to read it (successful)<br>

><br>

> This is because the replication step can take some time to place the object<br>

> on all the devices where it is supposed to live (i.e. a read may not always<br>

> look at where the object has just been written).<br>

><br>

> Cheers<br>

><br>

> Mark<br>

><br>

><br>

> _______________________________________________<br>

> Mailing list: <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" rel="noreferrer" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a><br>

> Post to     : <a href="mailto:openstack@lists.openstack.org">openstack@lists.openstack.org</a><br>

> Unsubscribe : <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack" rel="noreferrer" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack</a><br>

<br>


</div></div></blockquote></div><br></div></div>
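<div><br></div><div>The PUT/GET scenario from the reply at the top of this thread can be sketched as a toy model. This is only an illustration under stated assumptions, not Swift's actual proxy code: simulated_get, TIMEOUT and the "most common status among nodes that answered" rule are all hypothetical simplifications, and the response lists are the ones guessed at in the reply.</div>
<pre>
```python
from collections import Counter

# Hypothetical marker for a node that did not answer in time.
TIMEOUT = "Timeout"


def simulated_get(node_responses, request_node_count):
    """Toy model of a proxy GET (an assumption, not Swift's real logic).

    Nodes are tried in ring order (primaries first, then handoffs), up to
    request_node_count of them. Any 200 wins; otherwise the most common
    status among the nodes that actually answered is reported.
    """
    tried = node_responses[:request_node_count]
    if 200 in tried:
        return 200
    answered = [r for r in tried if r != TIMEOUT]
    if not answered:
        # Nobody responded at all, so report the service as unavailable.
        return 503
    return Counter(answered).most_common(1)[0][0]


# Same node order for PUT and GET: three primaries, then handoffs.
put_responses = [TIMEOUT, TIMEOUT, TIMEOUT, 201, 201, 503]
get_responses = [404, 404, 404, TIMEOUT, TIMEOUT, 404, 404, 404, 404]

# Positions in the node order that really hold the object after the PUT.
nodes_holding_data = [i for i, r in enumerate(put_responses) if r == 201]

# True when every node holding the data timed out on the later GET.
holders_all_timed_out = all(
    get_responses[i] == TIMEOUT for i in nodes_holding_data
)

print(nodes_holding_data)
print(holders_all_timed_out)
print(simulated_get(get_responses, request_node_count=9))
```
</pre>
<div>In this model the only nodes holding the object are the first two handoffs, and both of them time out on the GET, so every node that answers says 404, however many further handoffs are checked.</div>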