[openstack-dev] [nova] Consistency, efficiency, and safety of NovaObject.save()
mbooth at redhat.com
Wed Nov 12 15:56:40 UTC 2014
I'm currently investigating the feasibility of a generic
compare-and-swap feature for NovaObject.save(). This post isn't about
that, but that's the larger context.
As a preliminary step towards that goal, I've started by investigating
how Nova objects are saved today. Ideally objects would be saved in a
consistent way, but unfortunately that isn't the case today. The first
inconsistency I noticed is that some objects refresh themselves from
the database when save() is called, but others don't.
For brevity, I have conflated what happens in object.save() with what
happens in db.api. Where the code lives isn't relevant here: I'm only
looking at what happens.
Specifically, the following objects refresh themselves on save:
whereas the following objects do not:
Excluding irrelevant complexity, the general model for objects which
refresh on update is:
object = <select row from object table>
<update row in object table with the new values>
return <select row from object table again>
Some objects skip the second select and return the freshly saved
object. That is, a save involves an update plus either 1 or 2 selects.
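To make the statement count concrete, here is a rough sketch of that model against an in-memory SQLite database. It is not Nova code: the things table, make_db() and save_with_refresh() are names I made up for illustration, and the reselect flag only stands in for the difference between the 2-select and 1-select variants.

```python
import sqlite3


def make_db():
    # Hypothetical one-table schema, only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE things (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO things (id, name) VALUES (1, 'old')")
    conn.commit()
    return conn


def save_with_refresh(conn, row_id, values, reselect=True):
    """Select the row, update it, then optionally select it again."""
    statements = []

    def run(sql, params):
        # Record the verb of every statement sent to the database.
        statements.append(sql.split()[0])
        return conn.execute(sql, params)

    select = "SELECT id, name FROM things WHERE id = ?"
    row = run(select, (row_id,)).fetchone()
    if row is None:
        raise LookupError(row_id)
    run("UPDATE things SET name = ? WHERE id = ?", (values["name"], row_id))
    if reselect:
        # Second select: re-read the row that was just written.
        row = run(select, (row_id,)).fetchone()
    else:
        # Return the freshly saved values without another round trip.
        row = (row[0], values["name"])
    conn.commit()
    return row, statements
```

With reselect=True the recorded statements are a select, an update and a second select; with reselect=False the second select disappears.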
The lack of consistency in behaviour is obviously a problem, and I can't
think of any good reason for a second select for objects which do that.
However, I don't think it is good design for save() to refresh the
object at all, and the reason is concurrency. The cached contents of a
Nova object are *always* potentially stale. A refresh does nothing to
change that, because the contents are again potentially stale as soon as
it returns. Handling this requires concurrency primitives which we don't
currently have (see the larger context I mentioned above). Refreshing an
object's contents might reduce the probability of a race, but it doesn't
fix it. Callers who want a refresh after save can always call
object.refresh(), but for others it's just wasted hits on the db.
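A small illustration of why the refresh doesn't help, again using a made-up things table in SQLite rather than anything from Nova. Writer X saves and refreshes, writer Y updates the same row immediately afterwards, and X's refreshed copy is already out of date. Both writers share one connection here purely to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE things (id INTEGER PRIMARY KEY, state TEXT)")
conn.execute("INSERT INTO things (id, state) VALUES (1, 'building')")


def save_and_refresh(conn, row_id, state):
    # Hypothetical save(): write the new value, then re-read the row.
    conn.execute("UPDATE things SET state = ? WHERE id = ?", (state, row_id))
    query = "SELECT state FROM things WHERE id = ?"
    return conn.execute(query, (row_id,)).fetchone()[0]


# Writer X saves and keeps the refreshed value as its cached copy.
cached_x = save_and_refresh(conn, 1, "active")

# Writer Y updates the same row right after X's refresh returns.
conn.execute("UPDATE things SET state = 'deleted' WHERE id = 1")

# X's cached copy no longer matches what is in the database.
current = conn.execute("SELECT state FROM things WHERE id = 1").fetchone()[0]
```

After this runs, cached_x holds the value X wrote while current holds Y's value, so the extra select bought X nothing it could rely on.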
Refresh on save() is also arbitrary. Why should the object be refreshed
then rather than at any other time? An update in thread Y can land at
any point relative to a save in thread X, and it's a problem whenever
it happens.
Can anybody see a problem if we didn't fetch the row at all, and simply
updated it? Absent locking or compare-and-swap this is effectively what
we're already doing, and it reduces the db cost of save to a single
update statement. The difference would be that the object would remain
stale without an explicit refresh(). Value munging would remain unaffected.
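For comparison, a sketch of what an update-only save could look like, assuming a made-up things table in SQLite. make_db() and save_update_only() are hypothetical names, not Nova or oslo.db API. The only thing the caller learns is whether a row matched; nothing is read back.

```python
import sqlite3


def make_db():
    # Hypothetical schema, only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE things (id INTEGER PRIMARY KEY, name TEXT, state TEXT)"
    )
    conn.execute("INSERT INTO things VALUES (1, 'vm1', 'building')")
    conn.commit()
    return conn


def save_update_only(conn, row_id, changes):
    """Write the changed columns with one UPDATE and no SELECT."""
    # Column names are interpolated into the SQL, so they must come from
    # trusted code (for example the object's field list), never from users.
    columns = sorted(changes)
    assignments = ", ".join(f"{column} = ?" for column in columns)
    params = [changes[column] for column in columns] + [row_id]
    cursor = conn.execute(
        f"UPDATE things SET {assignments} WHERE id = ?", params
    )
    if cursor.rowcount == 0:
        # No row matched: the object no longer exists in the database.
        raise LookupError(row_id)
    conn.commit()
```

The in-memory object keeps whatever values it had before the call, so a caller that needs current data has to ask for it explicitly.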
Additionally, Instance, InstanceGroup, and Flavor perform multiple
updates on save(). I would apply the same rule to the sub-updates, and
also move them into a single transaction such that the updates are atomic.
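As a sketch of the single-transaction idea, assuming two made-up tables (instances and instance_details, not the real Nova schema) and Python's sqlite3 module: using the connection as a context manager commits both sub-updates together, or rolls both back if either one fails.

```python
import sqlite3


def make_db():
    # Hypothetical parent/child tables, only for illustration.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instances (id INTEGER PRIMARY KEY, vm_state TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE instance_details "
        "(instance_id INTEGER PRIMARY KEY, info TEXT NOT NULL)"
    )
    conn.execute("INSERT INTO instances VALUES (1, 'building')")
    conn.execute("INSERT INTO instance_details VALUES (1, 'v1')")
    conn.commit()
    return conn


def save_atomically(conn, instance_id, vm_state, info):
    """Apply both sub-updates in one transaction, with no selects."""
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        conn.execute(
            "UPDATE instances SET vm_state = ? WHERE id = ?",
            (vm_state, instance_id),
        )
        conn.execute(
            "UPDATE instance_details SET info = ? WHERE instance_id = ?",
            (info, instance_id),
        )
```

If the second update violates a constraint, the first one is rolled back as well, so other readers never see a half-saved object.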
Red Hat Engineering, Virtualisation Team
Phone: +442070094448 (UK)
GPG ID: D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490