[Openstack-operators] [openstack-dev] [openstack-operators][cinder] max_concurrent_builds in Cinder
john.griffith8 at gmail.com
Tue May 24 02:46:14 UTC 2016
On Mon, May 23, 2016 at 8:32 AM, Ivan Kolodyazhny <e0ne at e0ne.info> wrote:
> Hi developers and operators,
> I would like to get your feedback on my idea before I start
> work on a spec.
> In Nova, we've got the max_concurrent_builds option to set 'Maximum number
> of instance builds to run concurrently' on each compute host. There is no
> equivalent in Cinder.
> Why do we need it for Cinder? IMO, it could help us to address the following:
> - Creating N volumes at the same time greatly increases resource
> usage by the cinder-volume service. The image caching feature could help a
> bit when we create a volume from an image, but we still have to upload N
> images to the volume backend at the same time.
> - Deleting N volumes in parallel. Usually this is not a very hard task
> for Cinder, but if you have to delete 100+ volumes at once, you can hit
> various issues with DB connections, CPU and memory usage. In the case of
> LVM, it may also use the 'dd' command to clean up volumes.
> - It would act as a kind of load balancing in HA mode: if a cinder-volume
> process is busy with its current operations, it will not pick up the
> message from RabbitMQ, and another cinder-volume service will handle it.
> - From the user's perspective, it seems better to create/delete
> N volumes a bit slower than to fail after X volumes were created/deleted.
> Ivan Kolodyazhny,
Just curious about a couple of things: Is this attempting to solve a
problem in the actual Cinder Volume Service or is this trying to solve
problems with backends that can't keep up and deliver resources under heavy
load? I get the copy-image to volume, that's a special case that certainly
does impact Cinder services and the Cinder node itself, but there's already
throttling going on there, at least in terms of IO allowed.
Also, I'm curious... would the existing API Rate Limit configuration achieve
the same sort of thing you want to do here? Granted it's not selective but
maybe it's worth mentioning.
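
To make the "not selective" point a bit more concrete, here is a rough
sketch of why a requests-per-window limit and a concurrency cap are not
the same thing. It is a generic fixed-window limiter written for
illustration only, not Cinder's actual rate-limit middleware; the class
name, the simulate() helper and all the numbers are made up. It admits
one request per second when the limit allows and tracks how many
long-running operations end up in flight at once.

```python
class FixedWindowRateLimit:
    """Allow at most `limit` requests per `window` seconds (illustrative only)."""

    def __init__(self, limit, window):
        self.limit = limit
        self.window = window
        self.window_start = None
        self.count = 0

    def allow(self, now):
        # Start a fresh window on the first call or once the old one expires.
        if self.window_start is None or now - self.window_start >= self.window:
            self.window_start = now
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False


def simulate(limit, window, op_duration, total_time):
    """Return the peak number of operations in flight.

    One request arrives every second; each admitted request starts an
    operation that runs for `op_duration` seconds.
    """
    limiter = FixedWindowRateLimit(limit, window)
    finish_times = []
    peak = 0
    for now in range(total_time):
        # Drop operations that have already finished.
        finish_times = [t for t in finish_times if t > now]
        if limiter.allow(now):
            finish_times.append(now + op_duration)
        peak = max(peak, len(finish_times))
    return peak
```

With a limit of 10 requests per 60 seconds and operations that each take
300 seconds, simulate(10, 60, 300, 300) reports a peak of 50 operations
in flight, so the rate limit alone does not bound how much work the
service is doing at any one moment.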
If we did do something like this I would like to see it implemented as a
driver config; but that wouldn't help if the problem lies in the Rabbit or
RPC space. That brings me back to wondering about exactly where we want to
solve problems and exactly which. If delete is causing problems like you
describe I'd suspect we have an issue in our DB code (too many calls to
start with) and that we've got some overhead elsewhere that should be
eradicated. Delete is a super simple operation on the Cinder side of
things (and most back ends) so I'm a bit freaked out thinking that it's
taxing resources heavily.
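
For reference, the kind of cap being proposed could be sketched roughly
as below: a semaphore that lets only a fixed number of operations run at
once and makes the rest wait. This is a minimal sketch using the Python
standard library, under the assumption that a per-service limit is what
is wanted. OperationLimiter, fake_delete and delete_many are
hypothetical names, not existing Cinder code or config options.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class OperationLimiter:
    """Caps how many operations run at once; extra callers block and wait."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak = 0  # highest concurrency observed, kept for illustration

    def run(self, operation, *args):
        with self._slots:  # blocks until one of the slots is free
            with self._lock:
                self.in_flight += 1
                self.peak = max(self.peak, self.in_flight)
            try:
                return operation(*args)
            finally:
                with self._lock:
                    self.in_flight -= 1


def fake_delete(volume_id):
    """Stand-in for a backend delete; sleeps instead of doing real work."""
    time.sleep(0.01)
    return volume_id


def delete_many(limiter, volume_ids, workers=20):
    """Submit every delete at once; the limiter decides how many actually run."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda v: limiter.run(fake_delete, v), volume_ids))
```

Calling delete_many(OperationLimiter(3), range(30)) still completes all
30 deletes, only never more than 3 at a time. Note that a cap like this
only limits work inside one process; it would not by itself help if the
bottleneck is in Rabbit, RPC or the DB layer.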