[openstack-dev] [oslo] rpc concurrency control rfc
Edward Hope-Morley
edward.hope-morley at canonical.com
Wed Nov 27 19:54:31 UTC 2013
On 27/11/13 19:34, Daniel P. Berrange wrote:
> On Wed, Nov 27, 2013 at 06:43:42PM +0000, Edward Hope-Morley wrote:
>> On 27/11/13 18:20, Daniel P. Berrange wrote:
>>> On Wed, Nov 27, 2013 at 06:10:47PM +0000, Edward Hope-Morley wrote:
>>>> On 27/11/13 17:43, Daniel P. Berrange wrote:
>>>>> On Wed, Nov 27, 2013 at 05:39:30PM +0000, Edward Hope-Morley wrote:
>>>>>> On 27/11/13 15:49, Daniel P. Berrange wrote:
>>>>>>> On Wed, Nov 27, 2013 at 02:45:22PM +0000, Edward Hope-Morley wrote:
>>>>>>>> Moving this to the ml as requested, would appreciate
>>>>>>>> comments/thoughts/feedback.
>>>>>>>>
>>>>>>>> So, I recently proposed a small patch to the oslo rpc code (initially in
>>>>>>>> oslo-incubator then moved to oslo.messaging) which extends the existing
>>>>>>>> support for limiting the rpc thread pool so that concurrent requests can
>>>>>>>> be limited based on type/method. The blueprint and patch are here:
>>>>>>>>
>>>>>>>> https://blueprints.launchpad.net/oslo.messaging/+spec/rpc-concurrency-control
>>>>>>>>
>>>>>>>> The basic idea is that if you have a server with limited resources you
>>>>>>>> may want to restrict operations that would impact those resources, e.g.
>>>>>>>> live migrations on a specific hypervisor or volume formatting on a
>>>>>>>> particular volume node. This patch allows you, admittedly in a very
>>>>>>>> crude way, to apply a fixed limit to a set of rpc methods. I would like
>>>>>>>> to know whether or not people think this sort of thing would be useful
>>>>>>>> or whether it alludes to a more fundamental issue that should be dealt
>>>>>>>> with in a different manner.
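
For context, the kind of limiting the patch proposes is roughly sketched
below. This is not the actual oslo.messaging code - the dispatcher wrapper
and the example limits are invented purely for illustration - but it shows
the idea of capping per-method concurrency with a semaphore:

    from eventlet import semaphore

    class MethodLimitingDispatcher(object):
        """Illustrative only: cap concurrent calls per rpc method."""

        def __init__(self, dispatcher, limits):
            # e.g. limits = {'live_migration': 1, 'delete_volume': 2}
            self._dispatcher = dispatcher
            self._semaphores = dict((m, semaphore.Semaphore(n))
                                    for m, n in limits.items())

        def dispatch(self, ctxt, method, **kwargs):
            sem = self._semaphores.get(method)
            if sem is None:
                # Methods without a configured limit pass straight through.
                return self._dispatcher.dispatch(ctxt, method, **kwargs)
            # Callers over the limit block here while still occupying a
            # thread from the shared rpc pool - the caveat discussed
            # further down the thread.
            with sem:
                return self._dispatcher.dispatch(ctxt, method, **kwargs)
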
>>>>>>> Based on this description of the problem I have some observations
>>>>>>>
>>>>>>> - I/O load from the guest OS itself is just as important to consider
>>>>>>> as I/O load from management operations Nova does for a guest. Both
>>>>>>> have the capability to impose denial-of-service on a host. IIUC, the
>>>>>>> flavour specs have the ability to express resource constraints for
>>>>>>> the virtual machines to prevent a guest OS initiated DOS-attack
>>>>>>>
>>>>>>> - I/O load from live migration is attributable to the running
>>>>>>> virtual machine. As such I'd expect that any resource controls
>>>>>>> associated with the guest (from the flavour specs) should be
>>>>>>> applied to control the load from live migration.
>>>>>>>
>>>>>>> Unfortunately life isn't quite this simple with KVM/libvirt
>>>>>>> currently. For networking we've associated each virtual TAP
>>>>>>> device with traffic shaping filters. For migration you have
>>>>>>> to set a bandwidth cap explicitly via the API. For network
>>>>>>> based storage backends, you don't directly control network
>>>>>>> usage, but instead I/O operations/bytes. Ultimately though
>>>>>>> there should be a way to enforce limits on anything KVM does;
>>>>>>> similarly, I expect other hypervisors can do the same.
>>>>>>>
>>>>>>> - I/O load from operations that Nova does on behalf of a guest
>>>>>>> that may be running, or may be yet to be launched. These are not
>>>>>>> directly known to the hypervisor, so existing resource limits
>>>>>>> won't apply. Nova however should have some capability for
>>>>>>> applying resource limits to I/O intensive things it does and
>>>>>>> somehow associate them with the flavour limits or some global
>>>>>>> per user cap perhaps.
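
Just to make the migration bandwidth point above concrete: with libvirt
that cap is an explicit per-domain API call, something along the lines of
the sketch below (libvirt-python; the domain name and the 50 MiB/s figure
are only examples):

    import libvirt

    conn = libvirt.open('qemu:///system')
    dom = conn.lookupByName('instance-00000001')

    # Cap the bandwidth that a live migration of this domain may use,
    # in MiB/s. Note this limits only the migration stream itself, not
    # guest-initiated I/O.
    dom.migrateSetMaxSpeed(50, 0)
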
>>>>>>>
>>>>>>>> Thoughts?
>>>>>>> Overall I think that trying to apply caps on the number of API calls
>>>>>>> that can be made is not really a credible way to avoid users inflicting
>>>>>>> a DOS attack on the host OS. Not least because it does nothing to control
>>>>>>> what a guest OS itself may do. If you do caps based on the number of API
>>>>>>> calls in a time period, you end up having to do an extremely pessimistic
>>>>>>> calculation - basically you have to consider the worst case for any single
>>>>>>> API call, even if most don't hit the worst case. This is going to hurt
>>>>>>> scalability of the system as a whole IMHO.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Daniel
>>>>>> Daniel, thanks for this. These are all valid points and essentially tie
>>>>>> in with the fundamental issue of dealing with DOS attacks, but for this
>>>>>> bp I actually want to stay away from that area, i.e. this is not
>>>>>> intended to solve any tenant-based attack issues in the rpc layer
>>>>>> (although that definitely warrants a discussion, e.g. how do we stop a
>>>>>> single tenant from consuming the entire thread pool with requests).
>>>>>> Rather, I'm thinking more from a QOS perspective, i.e. to allow an admin
>>>>>> to account for a resource bias, e.g. a slow RAID controller, on a given
>>>>>> node (not necessarily Nova/HV) which could be alleviated with this sort
>>>>>> of crude rate limiting. Of course one problem with this approach is that
>>>>>> blocked/limited requests still reside in the same pool as other
>>>>>> requests, so if we did want to use this it may be worth considering
>>>>>> offloading blocked requests or giving them their own pool altogether.
>>>>>>
>>>>>> ...or maybe this is just pie in the sky after all.
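
On the "their own pool" idea above, what I have in mind is roughly the
following - spawn the limited methods into a small dedicated greenpool so
that callers queuing on them cannot occupy threads in the main rpc pool.
Again, just a sketch with invented names and sizes, not the proposed patch:

    import eventlet

    # The main pool serves normal rpc traffic; the small pool is reserved
    # for the rate-limited methods, so callers queuing on them cannot
    # exhaust the main pool.
    main_pool = eventlet.GreenPool(size=64)
    limited_pool = eventlet.GreenPool(size=2)

    LIMITED_METHODS = set(['live_migration', 'delete_volume'])

    def handle(dispatcher, ctxt, method, **kwargs):
        pool = limited_pool if method in LIMITED_METHODS else main_pool
        pool.spawn_n(dispatcher.dispatch, ctxt, method, **kwargs)
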
>>>>> I don't think it is valid to ignore tenant-based attacks in this. You
>>>>> have a single resource here and it can be consumed by the tenant
>>>>> OS, by the VM associated with the tenant or by Nova itself. As such,
>>>>> IMHO adding rate limiting to Nova APIs alone is a non-solution because
>>>>> you've still left it wide open to starvation by any number of other
>>>>> routes which are arguably even more critical to address than the API
>>>>> calls.
>>>>>
>>>>> Daniel
>>>> Daniel, maybe I have misunderstood you here, but with this optional
>>>> extension I am (a) not intending to solve DOS issues and (b) not
>>>> "ignoring" DOS issues, since I do not expect to be adding to or
>>>> accentuating those that already exist. The issue here is QOS, not DOS.
>>> I consider QOS & DOS to be two sides of the same coin here. A denial of
>>> service is anything which affects the quality of service of the host.
>>> It doesn't have to be done with malicious intent either. I don't think
>>> your proposal provides significant QOS benefits except under some very
>>> narrowly constrained scenarios, which I'm yet to be convinced are
>>> very applicable to the bigger picture / real-world deployment scenarios.
>>>
>> Let's flip this coin for a second and say I have a cinder-volume node
>> that is using LVM with comparatively slow disks relative to some other
>> nodes, and I therefore want to prevent too many concurrent volume delete
>> requests (which zero out the whole volume) from starving tenant
>> reads/writes on those disks of IO. Do we have a way to prevent this
>> currently? If not, then I think this extension may come in handy.
> The volume zeroing code is really dumb currently. It just invokes 'dd',
> which will write data as fast as the underlying storage can accept it.
> Even if you set the RPC limit to only allow one single volume delete
> request at a time, this is still going to have a noticeable impact on
> other tenant I/O performance. Replacing the dd code with something that
> rate-limits its own writes would be a more effective & admin-friendly
> way to avoid tenant I/O starvation IMHO. Sure, this involves writing
> new code, but it is long-term useful work on the root cause.
>
> Regards,
> Daniel
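
For reference, the kind of self-rate-limited zeroing Daniel describes could
look something like the following - write fixed-size chunks and sleep so the
throughput stays under a configured bytes-per-second cap. Purely
illustrative; this is not what cinder does today:

    import time

    def zero_volume(path, size_bytes, bps_limit, chunk_size=1024 * 1024):
        """Overwrite size_bytes of the device at 'path' with zeros,
        at no more than bps_limit bytes per second. Illustrative only."""
        zeros = b'\0' * chunk_size
        with open(path, 'wb') as dev:
            written = 0
            while written < size_bytes:
                start = time.time()
                n = min(chunk_size, size_bytes - written)
                dev.write(zeros[:n])
                written += n
                # Sleep long enough that this chunk averages out to
                # the configured rate.
                min_duration = float(n) / bps_limit
                elapsed = time.time() - start
                if elapsed < min_duration:
                    time.sleep(min_duration - elapsed)
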
Ok, well anyway, I think I am increasingly in agreement that this
extension (in its current form at least) could lead to undesirable
results. For example, if the API received a load of requests that block
and the API then crashes/gets restarted, it is possible that bad things
will happen: in the case of synchronous calls, the AMQP ack may already
have been sent back to the producer(s), indicating that the calls were
successfully accepted, and then you would end up with resources getting
stuck because the api would no longer have any knowledge of those
requests having been sent.

This may well vary from service to service but is IMO a sufficient
reason not to pursue this further in its current form.