[openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

Matthew Booth mbooth at redhat.com
Wed Nov 12 16:52:27 UTC 2014

Hash: SHA1

On 12/11/14 16:22, Dan Smith wrote:
>> An initial inconsistency I have noticed is that some objects
>> refresh themselves from the database when calling save(), but
>> others don't.
> I agree that it would be ideal for all objects to behave the same
> in this regard. I expect that in practice, it's not necessary for
> all objects to do this, and since it is an extra step/query in some
> places, it's not without cost to just always refresh (there is no
> reason to refresh a Flavor object AFAIK, for instance).
>> The lack of consistency in behaviour is obviously a problem, and
>> I can't think of any good reason for a second select for objects
>> which do that. However, I don't think it is good design for
>> save() to refresh the object at all, and the reason is
>> concurrency. The cached contents of a Nova object are *always*
>> potentially stale. A refresh does nothing to change that, because
>> the contents are again potentially stale as soon as it returns.
>> Handling this requires concurrency primitives which we don't
>> currently have (see the larger context I mentioned above).
>> Refreshing an object's contents might reduce the probability of a
>> race, but it doesn't fix it. Callers who want a refresh after
>> save can always call object.refresh(), but for others it's just
>> wasted hits on the db.
> Doing a refresh() after a save() is more than just another DB hit,
> it's another RPC round-trip, which is far more expensive.
> There is a lot of code in the compute manager (and elsewhere I'm
> sure) that expects to get back the refreshed object after a save
> (our instance update functions have always exhibited this behavior,
> so there is code built to expect it). Any time something could
> change async on the object that might affect what we're about to do
> will benefit from this implicit refreshing. A good example is when
> we're doing something long-running on an instance and it is deleted
> underneath us. If we didn't get the deleted=True change after a
> save(), we might go continue doing a lot of extra work before we
> notice it the next time.
> It's not that doing so prevents the object's contents from being
> stale, it's that it reduces the amount of time before we notice a
> change, and avoids us needing to explicitly check. Any code we have
> that can't tolerate the object being stale is broken anyway.
>> Refresh on save() is also arbitrary. Why should the object be
>> updated then rather than at any other time? The timing of an
>> update in thread X is unrelated to the timing of an update in
>> thread Y, but it's a problem whenever it happens.
> Because it's not about cache consistency, it's about the expense 
> required to do any of these things. To save(), we *have* to make
> the round-trip, so why not get a refresh at the same time? In cases
> where we explicitly want to refresh, we call refresh(), but
> otherwise we use natural synchronization points like save() to do
> that.

Ok, it makes more sense in this context.

>> Can anybody see a problem if we didn't fetch the row at all, and
>>  simply updated it? Absent locking or compare-and-swap this is 
>> effectively what we're already doing, and it reduces the db cost
>> of save to a single update statement. The difference would be
>> that the object would remain stale without an explicit refresh().
>> Value munging would remain unaffected.
> At least for instance, I don't want to do away with the implicit 
> refreshing. I would be up for any of these options:
> 1. Documenting which objects do and don't auto-refresh 2. Making
> the case for non-Instance objects to not auto-refresh 3. Making
> them all auto-refresh for consistency, with the appropriate 
> re-tooling of the db/api code to minimize performance impact.

I personally favour having consistent behaviour across the board. How
about updating them all to auto-refresh by default for consistency,
but adding an additional option to save() to disable it for particular
calls? Ideally we'd have the opposite default and fix all callers, but
I suspect that's more likely to bite us in the short term unless we're
confident we can identify all the critical callers.

I also suggest a tactical fix to any object which fetches itself twice
on update (e.g. Aggregate).

>> Additionally, Instance, InstanceGroup, and Flavor perform
>> multiple updates on save(). I would apply the same rule to the
>> sub-updates, and also move them into a single transaction such
>> that the updates are atomic.
> Yep, no complaints about fixing these non-atomic updates, of course
> :)


- -- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
Version: GnuPG v1


More information about the OpenStack-dev mailing list