[openstack-dev] [neutron][oslo.db] db retry madness

Michael Bayer mbayer at redhat.com
Thu Sep 7 15:11:59 UTC 2017


I'm trying to get a handle on neutron's retry decorators.    I'm
assisting a developer in the networking_odl project who wishes to make
use of neutron's retry_if_session_inactive decorator.

The developer has run into several problems, including that
networking_odl methods here seem to accept a "session" directly,
rather than the "context" as is standard throughout neutron, and
retry_if_session_inactive assumes the context calling pattern.
Additionally, the retry_db_errors() decorator which is also used by
retry_if_session_inactive is doing a deepcopy operation on arguments,
which when applied to ORM-mapped entities, makes a copy of them which
are then detached from the Session and they fail to function beyond
that.

For context, the developer's patch is at:
https://review.openstack.org/#/c/500584/3

and the neutron features I'm looking at can be seen when they were
reviewed at: https://review.openstack.org/#/c/356530/22


So I have a bunch of questions, many are asking the same things: "is
networking_odl's pattern legacy or not?" and "is this retry decorator
buggy?"   There's overlap in these questions.  But "yes" / "no" to
each would still help me figure out that this is a giraffe and not a
duck :) ("does it have a long neck?" "does it quack?" etc)

1. since the retry_if_session_inactive decorator assumes functions
that receive a "context", what does it mean that networking_odl uses
functions that receive a "session" and not a "context" ?   Should
networking_odl be modified such that it accepts the same "context"
that Neutron passes around?

2. Alternatively, should the retry_if_session_inactive decorator be
modified (or a new decorator added) that provides the identical
behavior for functions that receive a "session" argument rather than a
"context" argument ?  (or should this be done anyway even if
networking_odl's pattern is legacy?)

3. I notice that the session.is_active flag is considered to be
meaningful.   Normally, this flag is of no use, as the enginefacade
pattern only provides for a Session that has had the begin() method
called.  However, it looks like in neutron_lib/context.py, the Context
class is providing for a default session
"db_api.get_writer_session()", which I would assume is how we get a
Session with is_active is false:

    3a.  Is this a temporary measure to work around legacy patterns?
    3b.  If 3a., is the pattern in use by networking_odl, receiving a
"session" and not "context", the specific legacy pattern in question?
    3c.  if 3a and 3b, are we expecting all the various plugins to
start using "context" at some point ?  (and when they come to me with
breakage, I can push them in that direction?)

4. What is the bug we are fixing by using deepcopy() with the
arguments passed to the decorated function:

    4a.  Is it strictly that simple mutable lists and dicts of simple
types (ints, strings, etc.) are preserved across retries, assuming the
receiving functions might be modifying them?

    4b. Does this deepcopy() approach have any support for functions
that are being passed not just simple structures but ORM-mapped
objects?   Was this use case considered?

    4c. Does Neutron's model base have any support for deepcopy()?  I
notice that the base classes in model_base do **not** seem to have a
__deepcopy__ present.  This would mean that this deepcopy() will
definitely break any ORM mapped objects because you need to use an
approach like what Nova uses for copy():
https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/models.py#L46
.  This would be why the developer's objects are all getting kicked
out of the session.   (this is the "is this a bug?" part)

    4d. How frequent is the problem of mutable lists and dicts, and
should this complex and actually pretty expensive behavior be
optional, only for those functions that are known to receieve mutable
structures and change them in place?



More information about the OpenStack-dev mailing list