[openstack-dev] Nova quota statistics counting issue

Matt Riedemann mriedem at linux.vnet.ibm.com
Fri Apr 15 19:35:26 UTC 2016



On 4/14/2016 3:07 PM, Andrew Laski wrote:
> On Wed, Apr 13, 2016, at 12:27 PM, Dmitry Stepanenko wrote:
>> Hi Team,
>> I worked on the nova quota statistics issue
>> (https://bugs.launchpad.net/nova/+bug/1284424), which happens when nova-*
>> processes are restarted while instances are being removed, and was able
>> to reproduce it. For the repro I used devstack and started nova-api and
>> nova-compute in separate screen windows. To kill them I used
>> ctrl+c. I found that the issue occurs if nova-* processes are killed
>> after the instance has been deleted but right before the quota commit
>> procedure finishes.
>> We discussed these results with Markus Zoeller and decided that even
>> though killing nova processes is a somewhat exotic event, this should
>> still be fixed because quota counting affects billing and is very
>> important for us.
> +1. This is very important to get right. And while killing Nova
> processes is exotic during normal operation, it could happen for
> upgrades, and that should not cause quota issues.
>> So, we need to introduce some mechanism that will prevent us from
>> reaching inconsistent states in terms of quotas. In other words, this
>> mechanism should ensure that the instance create/remove operation and
>> the quota usage recount operation either both happen or both do not
>> happen.
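As an illustration only (not code from Nova), the "both happen or neither happens" requirement can be sketched with a single database transaction. The schema, the `delete_instance` function, and the `crash` flag below are all hypothetical, and the sketch assumes the instance row and the usage counter live in the same database and can be updated in one transaction, which may not match how Nova splits the work between services.

```python
import sqlite3


def make_db():
    # Hypothetical, heavily simplified schema: one instance, one usage counter.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, project TEXT)")
    conn.execute("CREATE TABLE quota_usages (project TEXT PRIMARY KEY, in_use INTEGER)")
    conn.execute("INSERT INTO instances VALUES (1, 'demo')")
    conn.execute("INSERT INTO quota_usages VALUES ('demo', 1)")
    conn.commit()
    return conn


def delete_instance(conn, instance_id, project, crash=False):
    # "with conn" commits if the block succeeds and rolls back if an
    # exception escapes, so the two statements are applied together or not at all.
    with conn:
        conn.execute("DELETE FROM instances WHERE id = ?", (instance_id,))
        if crash:
            # Stand-in for the process dying between the two steps.
            raise RuntimeError("simulated process kill")
        conn.execute(
            "UPDATE quota_usages SET in_use = in_use - 1 WHERE project = ?",
            (project,),
        )


def state(conn):
    # Returns (number of instance rows, recorded in_use counter).
    count = conn.execute("SELECT COUNT(*) FROM instances").fetchone()[0]
    in_use = conn.execute("SELECT in_use FROM quota_usages").fetchone()[0]
    return count, in_use
```

A real kill would not raise an exception, but the effect on the data should be the same: the uncommitted transaction is discarded, so the instance row and the counter stay in agreement.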
> There's been some discussion around this, and there are other ML threads
> somewhat discussing it in the context of moving quota enforcement into a
> centralized service/library. There are a couple of approaches that could
> be taken for tackling quotas, but a larger issue is that we have no good
> way of knowing if some change helps the situation. What we need before
> making any changes is a functional test that reproduces the issue.
> Once that is in place I would love to see the removal of the
> quota_usages table and reservations and have quota be based on actual
> usage represented in the instances table. But there are a lot of other
> viewpoints and I think work in this area is going to have to start by
> making small incremental improvements.
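To make the idea of basing quota on actual usage concrete, here is a minimal sketch that derives usage by counting rows in an instances table instead of maintaining a separate counter. The table below is a simplified stand-in and not the real Nova schema; the `usage_from_instances` helper and the sample data are hypothetical.

```python
import sqlite3


def make_db():
    # Simplified stand-in for an instances table; "deleted" marks soft-deleted rows.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances ("
        "id INTEGER PRIMARY KEY, project_id TEXT, vcpus INTEGER, "
        "memory_mb INTEGER, deleted INTEGER DEFAULT 0)"
    )
    rows = [
        (1, "demo", 2, 2048, 0),
        (2, "demo", 1, 512, 0),
        (3, "demo", 4, 8192, 1),   # soft-deleted, must not count
        (4, "other", 8, 4096, 0),  # different project, must not count
    ]
    conn.executemany("INSERT INTO instances VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def usage_from_instances(conn, project_id):
    # Usage is computed on demand from live rows, so there is no counter
    # that can drift if a process dies between two updates.
    row = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(vcpus), 0), COALESCE(SUM(memory_mb), 0) "
        "FROM instances WHERE project_id = ? AND deleted = 0",
        (project_id,),
    ).fetchone()
    return {"instances": row[0], "cores": row[1], "ram": row[2]}
```

The trade-off in this sketch is that every quota check costs a query over the instances table, in exchange for not having a second record of usage to keep in sync.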
>> Any ideas how to do that properly?
>> Kind regards,
>> Dmitry
>> ____________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

I've tried to start that here [1] but it needs work. I have a messier 
local version too that was (I think) reproducing a failure, but because 
it's a weird race condition mess, it's kind of hard to test and know 
when to assert the thing and stop the test.
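One way to sidestep the timing problem in a test, sketched here against a toy in-memory model rather than Nova itself, is to inject the failure at a fixed point between the delete and the quota commit instead of racing a real process kill. `FakeDB`, `delete_instance`, the `before_quota_commit` hook, and `quota_drift` are all hypothetical names invented for this sketch.

```python
class ProcessKilled(Exception):
    """Stand-in for the service being killed mid-operation."""


class FakeDB:
    # Toy model: one instance owned by project "demo", usage counter at 1.
    def __init__(self):
        self.instances = {"inst-1": "demo"}
        self.in_use = {"demo": 1}


def delete_instance(db, instance_id, before_quota_commit=None):
    # Step 1: the instance record is removed.
    project = db.instances.pop(instance_id)
    # Injection point: the test decides exactly when the "kill" happens.
    if before_quota_commit is not None:
        before_quota_commit()
    # Step 2: the quota usage is committed as a separate step.
    db.in_use[project] -= 1


def kill():
    raise ProcessKilled()


def quota_drift(db, project):
    # Difference between the recorded usage and the instances that really exist.
    actual = sum(1 for owner in db.instances.values() if owner == project)
    return db.in_use[project] - actual
```

Because the failure point is fixed, the test can assert on the drift immediately after the injected kill, with no need to guess when the race has played out.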

Maybe I'll just push up the latest WIP of what I have locally and then 
someone else can take it over if they want.

[1] https://review.openstack.org/#/c/293800/

-- 

Thanks,

Matt Riedemann



