[openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()

Dan Smith dms at danplanet.com
Wed Nov 12 16:22:22 UTC 2014

> An initial inconsistency I have noticed is that some objects refresh 
> themselves from the database when calling save(), but others don't.

I agree that it would be ideal for all objects to behave the same in
this regard. I expect that in practice, it's not necessary for all
objects to do this, and since it is an extra step/query in some places,
it's not without cost to just always refresh (there is no reason to
refresh a Flavor object AFAIK, for instance).

> The lack of consistency in behaviour is obviously a problem, and I 
> can't think of any good reason for a second select for objects which 
> do that. However, I don't think it is good design for save() to 
> refresh the object at all, and the reason is concurrency. The cached 
> contents of a Nova object are *always* potentially stale. A refresh 
> does nothing to change that, because the contents are again 
> potentially stale as soon as it returns. Handling this requires 
> concurrency primitives which we don't currently have (see the larger 
> context I mentioned above). Refreshing an object's contents might 
> reduce the probability of a race, but it doesn't fix it. Callers who 
> want a refresh after save can always call object.refresh(), but for 
> others it's just wasted hits on the db.

Doing a refresh() after a save() is more than just another DB hit, it's
another RPC round-trip, which is far more expensive.

There is a lot of code in the compute manager (and elsewhere I'm sure)
that expects to get back the refreshed object after a save (our instance
update functions have always exhibited this behavior, so there is code
built to expect it). Any time something could change async on the object
that might affect what we're about to do will benefit from this implicit
refreshing. A good example is when we're doing something long-running on
an instance and it is deleted underneath us. If we didn't get the
deleted=True change after a save(), we might go continue doing a lot of
extra work before we notice it the next time.

It's not that doing so prevents the object's contents from being stale,
it's that it reduces the amount of time before we notice a change, and
avoids us needing to explicitly check. Any code we have that can't
tolerate the object being stale is broken anyway.

> Refresh on save() is also arbitrary. Why should the object be updated
> then rather than at any other time? The timing of an update in thread
> X is unrelated to the timing of an update in thread Y, but it's a
> problem whenever it happens.

Because it's not about cache consistency, it's about the expense
required to do any of these things. To save(), we *have* to make the
round-trip, so why not get a refresh at the same time? In cases where we
explicitly want to refresh, we call refresh(), but otherwise we use
natural synchronization points like save() to do that.

> Can anybody see a problem if we didn't fetch the row at all, and 
> simply updated it? Absent locking or compare-and-swap this is 
> effectively what we're already doing, and it reduces the db cost of 
> save to a single update statement. The difference would be that the 
> object would remain stale without an explicit refresh(). Value 
> munging would remain unaffected.

At least for instance, I don't want to do away with the implicit
refreshing. I would be up for any of these options:

1. Documenting which objects do and don't auto-refresh
2. Making the case for non-Instance objects to not auto-refresh
3. Making them all auto-refresh for consistency, with the appropriate
   re-tooling of the db/api code to minimize performance impact.

> Additionally, Instance, InstanceGroup, and Flavor perform multiple 
> updates on save(). I would apply the same rule to the sub-updates, 
> and also move them into a single transaction such that the updates 
> are atomic.

Yep, no complaints about fixing these non-atomic updates, of course :)


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 473 bytes
Desc: OpenPGP digital signature
URL: <http://lists.openstack.org/pipermail/openstack-dev/attachments/20141112/58bb70d9/attachment.pgp>

More information about the OpenStack-dev mailing list