[openstack-dev] [openstack-operators][cinder] max_concurrent_builds in Cinder

Duncan Thomas duncan.thomas at gmail.com
Tue May 24 08:17:16 UTC 2016

On 24 May 2016 at 05:46, John Griffith <john.griffith8 at gmail.com> wrote:

> Just curious about a couple things: Is this attempting to solve a
> problem in the actual Cinder Volume Service or is this trying to solve
> problems with backends that can't keep up and deliver resources under heavy
> load?

I would posit that no backend can cope with infinite load, and with things
like A/A c-vol on the way, cinder is likely to get more efficient to the
point it will start stressing more backends. It is certainly worth thinking

We've more than enough backend technologies that have different but
entirely reasonable metadata performance limitations, and several pieces of
code outside of the backend's control (examples: FC zoning, iSCSI multipath)
seem to have clear scalability issues.

I think I share a worry that putting limits everywhere becomes a bandaid
that avoids fixing deeper problems, whether in cinder or on the backends.

> I get the copy-image to volume, that's a special case that certainly does
> impact Cinder services and the Cinder node itself, but there's already
> throttling going on there, at least in terms of IO allowed.

Which is probably not the behaviour we want - queuing generally gives a
better user experience than fair sharing beyond a certain point, since you
get to the point that *nothing* gets completed in a reasonable amount of
time with only moderate loads.
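
To make that concrete, here is a rough back-of-the-envelope model. It is
only a sketch: the function names and the numbers (300 copies of 10 GB
over a 1 GB/s pipe) are made up for illustration and ignore real-world
overheads. It compares when each job finishes under pure fair sharing
versus a simple one-at-a-time queue.

```python
def fair_share_finish_times(n_jobs, work_per_job, capacity):
    """All jobs run at once, each getting capacity / n_jobs.

    Every job finishes at the same moment: when all the work is done.
    """
    finish = n_jobs * work_per_job / capacity
    return [finish] * n_jobs


def fifo_finish_times(n_jobs, work_per_job, capacity):
    """Jobs run one at a time, each at full capacity."""
    per_job = work_per_job / capacity
    return [per_job * (i + 1) for i in range(n_jobs)]


def mean(values):
    return sum(values) / len(values)


# Hypothetical load: 300 copies, 10 GB each, 1 GB/s total throughput.
fair = fair_share_finish_times(300, 10.0, 1.0)
fifo = fifo_finish_times(300, 10.0, 1.0)

print("fair share: first done at %.0fs, mean %.0fs" % (fair[0], mean(fair)))
print("fifo queue: first done at %.0fs, mean %.0fs" % (fifo[0], mean(fifo)))
```

In this toy model the last job finishes at the same time either way, but
with the queue the first job is done almost immediately and the average
wait is roughly halved, whereas with fair sharing nobody gets anything
until the very end.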

It also seems to be a very common thing for customers to try to boot 300
instances from volume as an early smoke test of a new cloud deployment.
I've no idea why, but I've seen it many times, and others have reported the
same thing. While I'm not entirely convinced it is a reasonable test, we
should probably make sure that the usual behaviour for this is not horrible
breakage. The image cache, if turned on, certainly helps massively with
this, but I think some form of queuing is a good thing for both image cache
work and probably backups too eventually.
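
As an illustration of what "some form of queuing" could look like, here
is a minimal sketch using a fixed-size worker pool. This is not Cinder
code: MAX_CONCURRENT, fake_copy and run_with_queue are hypothetical
names, and the sleep stands in for real image-copy work. Requests beyond
the limit simply wait in the pool's internal queue.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONCURRENT = 4  # hypothetical limit, not a real Cinder option


def run_with_queue(n_requests, max_concurrent=MAX_CONCURRENT):
    """Run n_requests fake jobs, never more than max_concurrent at once.

    Returns (peak number of jobs seen running together, jobs completed).
    """
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def fake_copy(_request_id):
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.01)  # stand-in for the expensive operation
        with lock:
            state["active"] -= 1
        return True

    # The executor queues any work beyond max_workers until a slot frees up.
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        results = list(pool.map(fake_copy, range(n_requests)))
    return state["peak"], len(results)


peak, completed = run_with_queue(30)
print("peak concurrency:", peak, "completed:", completed)
```

All 30 requests complete, but the work in flight never exceeds the
limit, so the early ones finish quickly instead of everything crawling.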

> Also, I'm curious... would the existing API Rate Limit configuration
> achieve the same sort of thing you want to do here?  Granted it's not
> selective but maybe it's worth mentioning.

Certainly worth mentioning, since I'm not sure how many people are aware it
exists. My experiences of it were that it was too limited to be actually
useful (it only rate limits a single process, and we've usually got more
than enough API workers across multiple nodes that very significant
loads are possible before tripping any reasonable per-process rate limit).
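
A quick bit of arithmetic shows the effect. The numbers below are
invented for illustration, and the sketch assumes each API worker keeps
its own independent counter and that load is spread evenly across
workers.

```python
def aggregate_limit(per_process_limit, workers_per_node, nodes):
    """Upper bound on requests admitted across the whole deployment
    when each process enforces its limit independently."""
    return per_process_limit * workers_per_node * nodes


# Hypothetical deployment: 10 requests/minute per process,
# 8 API workers per node, 3 API nodes.
total = aggregate_limit(10, 8, 3)
print("effective limit: %d requests/minute" % total)
```

So a limit that looks conservative per process can still admit a load
many times larger in aggregate.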

> If we did do something like this I would like to see it implemented as a
> driver config; but that wouldn't help if the problem lies in the Rabbit or
> RPC space.  That brings me back to wondering about exactly where we want to
> solve problems and exactly which.  If delete is causing problems like you
> describe I'd suspect we have an issue in our DB code (too many calls to
> start with) and that we've got some overhead elsewhere that should be
> eradicated.  Delete is a super simple operation on the Cinder side of
> things (and most back ends) so I'm a bit freaked out thinking that it's
> taxing resources heavily.

I agree we should definitely do more analysis of where the breakage occurs
before adding many limits or queues. Image copy stuff is an easy-to-analyse
first case - iostat can tell you exactly where the problem is.

Using the fake backend and a large number of API workers / nodes with a
pathological load trivially finds breakages currently, though it depends
exactly which code version you're running as to where the issues are. The
compare & update changes (aka race avoidance patches) have removed a bunch
of these, but seem to have led to a significant increase in DB load that
means it is easier to get DB timeouts and other issues.

As for delete being resource heavy, our reference driver provides a
pathological example with the secure delete code. Now that we've got a high
degree of confidence in the LVM thin code (specifically, I'm not aware of
any instances where it is worse than the LVM-thick code and I don't see any
open bugs that disagree), is it time to dump the LVM-thick support?

Duncan Thomas