[openstack-dev] [nova][quotas] quota accounting for failed resource creation

Vishvananda Ishaya vishvananda at gmail.com
Fri Dec 14 23:06:33 UTC 2012


On Dec 14, 2012, at 4:24 AM, Eoghan Glynn <eglynn at redhat.com> wrote:

> 
> Folks,
> 
> I'd like to get some clarity on how exactly we intend to do quota
> accounting when a resource creation fails *outside* the API layer.
> 
> It seems that there is/was at least an aspiration to try to
> ensure that the resource has been successfully created before
> committing the reservation.
> 
> However, following through on the code paths for several
> different resource types (instances in nova, volumes in cinder),
> it seems that this failure detection is usually limited to the API
> layer. For example, if the attempt to create the DB record for the
> new resource fails, then we roll back the reservation, whereas if,
> say, the compute node fails to spin up the instance, then the
> quota headroom will already have been consumed even though the
> resource never springs into life.
> 
> Interestingly, nova *used to* ensure that the instance could at
> least be scheduled before committing the reservation. However
> IIUC this was changed prior to the Folsom release in:
> 
>  https://github.com/openstack/nova/commit/8718f8e4
> 
> I'm wondering whether this change in policy was deliberate in the
> above commit, or just an unintended side-effect?
> 
> Whereas in another case, we do attempt to follow through the
> resource allocation path to ensure success before committing - the
> quota accounting associated with an instance resize is contingent
> on successfully reaching the FINISH_RESIZE state.
> 
> So it seems at the very least we have some asymmetry going on
> here.
> 
> Another related issue is that the healing/sync logic does not
> seem to take the resource state into account (it excludes deleted
> instances but not those in ERROR). So even if we successfully
> avoid over-counting quota on a still-born instance, the sync logic
> may kick in later and overwrite this intention.
> 
> So the main point here is whether we really want to ensure that
> quota isn't consumed for failed resource creation attempts?

I don't think we do. If something goes into ERROR it should still
count against quota until it is explicitly deleted by the user.
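
To make that policy concrete, here is a rough sketch in toy, in-memory
Python. It is not nova's actual quota code: ProjectQuota, Instance,
create_instance, delete_instance and sync_usage are all hypothetical
names invented for illustration. The reservation is rolled back only
when the API-layer DB write fails; a later boot failure leaves the
instance in ERROR and still counted, and a sync recount excludes only
deleted instances.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    uuid: str
    state: str = "BUILDING"  # BUILDING, ACTIVE, ERROR or DELETED


@dataclass
class ProjectQuota:
    limit: int
    in_use: int = 0    # committed usage
    reserved: int = 0  # reserved but not yet committed

    def reserve(self, count=1):
        # Refuse the reservation if it would exceed the limit.
        if self.in_use + self.reserved + count > self.limit:
            raise RuntimeError("quota exceeded")
        self.reserved += count

    def commit(self, count=1):
        # Turn a reservation into real usage.
        self.reserved -= count
        self.in_use += count

    def rollback(self, count=1):
        # Give the reserved headroom back.
        self.reserved -= count


def create_instance(quota, instances, uuid, db_ok=True, boot_ok=True):
    """Reserve, write the DB record, commit, then try to boot."""
    quota.reserve()
    if not db_ok:
        # Failure inside the API layer: roll the reservation back.
        quota.rollback()
        raise RuntimeError("could not create DB record")
    inst = Instance(uuid)
    instances.append(inst)
    quota.commit()
    # Failure outside the API layer: the instance goes to ERROR,
    # but the committed quota is deliberately left alone.
    inst.state = "ACTIVE" if boot_ok else "ERROR"
    return inst


def delete_instance(quota, inst):
    """Explicit user delete: only now is the quota released."""
    inst.state = "DELETED"
    quota.in_use -= 1


def sync_usage(quota, instances):
    """Recount usage; ERROR instances count, DELETED ones do not."""
    quota.in_use = sum(1 for i in instances if i.state != "DELETED")
```

Under these assumptions, an instance that fails to boot keeps
consuming one unit of quota through any number of sync passes, and
the unit is returned only when the user deletes it.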

Vish



