[openstack-dev] [all][oslo] Dealing with database connection sharing issues

Doug Hellmann doug at doughellmann.com
Fri Feb 20 15:17:21 UTC 2015



On Thu, Feb 19, 2015, at 09:45 PM, Mike Bayer wrote:
> 
> 
> Doug Hellmann <doug at doughellmann.com> wrote:
> 
> >> 5) Allow this sort of connection sharing to continue for a deprecation
> >> period with apppropriate logging, then make it a hard failure.
> >> 
> >> This would provide services time to find and fix any sharing problems
> >> they might have, but would delay the timeframe for a final fix.
> >> 
> >> 6-ish) Fix oslo-incubator service.py to close all file descriptors after
> >> forking.
> >> 
> > 
> > I'm not sure why 6 is "slower", can someone elaborate on that?
> 
> So, option “A”, they call engine.dispose() the moment they’re in a fork,
> the activity upon requesting a connection from the pool is: look in pool,
> no connections present, create a connection and return it.

This feels like something we could do in the service manager base class,
maybe by adding a "post fork" hook or something.

Josh's patch to forcibly close all file descriptors may be something
else we want, but if we can reset open connections "cleanly" when we
know how, that feels better than relying on detecting broken sockets.

> 
> Option “5”, the way the patch is right now to auto-invalidate on
> detection of new fork, the activity upon requesting a connection is from
> the pool is: look in pool, connection present, check that os.pid()
> matches what we’ve associated with the connection record, if not, we
> raise an exception indicating “invalid”, this is immediately caught, sets
> the connection record as “invalid”, the connection record them
> immediately disposes that file descriptor, makes a new connection and
> returns that.
> 
> Option “6”, the new fork starts, the activity upon requesting a
> connection from the pool is: look in pool, connection present, perform
> the oslo.db “ping” event, ping event emits “SELECT 1” to the MySQLdb
> driver, driver attempts to emit this statement on the socket, socket
> communication fails, MySQLdb converts to an exception, exception is
> raised, SQLAlchemy catches the exception, sends it to a parser to
> determine the nature of the exception, we see that it’s a “disconnect”
> exception, we set the “invalidate” flag on the exception, we re-raise,
> oslo.db’s exc_filters then catch the exception, more string parsers get
> involved, we determine we need to raise an oslo.db.DBDisconnect
> exception, we raise that, the “SELECT 1” ping handler catches that, we
> then emit “SELECT 1” again so that it reconnects, we then hit the
> connection record that’s in “invalid” state so it knows to reconnect, it
> reconnects and the “SELECT 1” continues on the new connection and we
> start up.
> 
> So essentially option “5” (the way the gerrit is right now) has a subset
> of the components of “6”; “6” has the additional steps of: emit a doomed
> statement on the closed socket, then when it fails raise / catch / parse
> / reraise / catch / parse / reraise that exception.   Option “5” just
> has, check the pid, raise / catch an exception.
> 
> IMO the two options are: “5”, check the pid and recover or “3” make it a
> hard failure.

And I don't think we want the database library doing anything with this
case at all. Recovery code is tricky, and often prevents valid use cases
(perhaps the parent *meant* for the child to reuse the open connection
and isn't going to continue using it so there won't be a conflict).

The bug here is in the way the application, using Oslo's service module,
is forking. We should fix the service module to make it possible to fork
correctly, and to have that be the default behavior. The db library
shouldn't be concerned with whether or not it's in a forked process --
that's not its job.

Doug

>  
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe:
> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



More information about the OpenStack-dev mailing list