<div dir="ltr">On 24 May 2016 at 05:46, John Griffith <span dir="ltr"><<a href="mailto:john.griffith8@gmail.com" target="_blank">john.griffith8@gmail.com</a>></span> wrote:<br><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><br><div class="gmail_extra"><div style="font-family:monospace,monospace">Just curious about a couple things: Is this attempting to solve a problem in the actual Cinder Volume Service or is this trying to solve problems with backends that can't keep up and deliver resources under heavy load? </div></div></div></blockquote><div><br></div><div>I would posit that no backend can cope with infinite load, and with things like A/A c-vol on the way, cinder is likely to get more efficient to the point it will start stressing more backends. It is certainly worth thinking about. <br><br>We've more than enough backend technologies that have different but entirely reasonable metadata performance limitations, and several pieces of code outside of backend's control (examples: FC zoning, iSCSI multipath) seem to have clear scalability issues.<br></div><div><br></div><div>I think I share a worry that putting limits everywhere becomes a bandaid that avoids fixing deeper problems, whether in cinder or on the backends themselves.<br></div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div style="font-family:monospace,monospace">I get the copy-image to volume, that's a special case that certainly does impact Cinder services and the Cinder node itself, but there's already throttling going on there, at least in terms of IO allowed.</div></div></div></blockquote><div><br></div><div>Which is probably not the behaviour we want - queuing generally gives a better user experience than fair sharing beyond a certain point, since you get to the point that *nothing* gets completed in a reasonable amount of time with only moderate loads.<br><br></div><div>It also seems to be a very common thing for customers to try to boot 300 instances from volume as an early smoke test of a new cloud deployment. I've no idea why, but I've seen it many times, and others have reported the same thing. While I'm not entirely convinced it is a reasonable test, we should probably make sure that the usual behaviour for this is not horrible breakage. The image cache, if turned on, certainly helps massively with this, but I think some form of queuing is a good thing for both image cache work and probably backups too eventually. <br></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div style="font-family:monospace,monospace">Also, I'm curious... would the exiting API Rate Limit configuration achieve the same sort of thing you want to do here? Granted it's not selective but maybe it's worth mentioning.</div></div></div></blockquote><div><br></div><div>Certainly worth mentioning, since I'm not sure how many people are aware it exists. My experiences of it were that it was too limited to be actually useful (it only rate limits a single process, and we've usually got more than enough enough API workers across multiple nodes that very significant loads are possible before tripping any reasonable per-process rate limit).<br></div><div><br> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div style="font-family:monospace,monospace">If we did do something like this I would like to see it implemented as a driver config; but that wouldn't help if the problem lies in the Rabbit or RPC space. That brings me back to wondering about exactly where we want to solve problems and exactly which. If delete is causing problems like you describe I'd suspect we have an issue in our DB code (too many calls to start with) and that we've got some overhead elsewhere that should be eradicated. Delete is a super simple operation on the Cinder side of things (and most back ends) so I'm a bit freaked out thinking that it's taxing resources heavily.</div></div></div></blockquote><div><br></div><div>I agree we should definitely do more analysis of where the breakage occurs before adding many limits or queues. Image copy stuff is an easy to analyse first case - i/o stat can tell you exactly where the problem is.<br></div><div><br></div><div>Using the fake backend and a large number of API workers / nodes with a pathological load trivially finds breakages currently, though it depends exactly which code version you're running as to where the issues are. The compare & update changes (aka race avoidance patches) have removed a bunch of these, but seem to have led to a significant increase in DB load that means it is easier to get DB timeouts and other issues.<br><br></div><div>As for delete being resource heavy, our reference driver provides a pathological example with the secure delete code. Now that we've got a high degree of confidence in the LVM thin code (specifically, I'm not aware of any instances where it is worse than the LVM-thick code and I don't see any open bugs that disagree), is it time to dump the LVM-thick support completely?<br></div><div><br> </div></div><div class="gmail_signature"><div dir="ltr"><div>-- <br>Duncan Thomas</div></div></div>
</div></div>