[openstack-dev] [oslo][oslo.db][keystone] A POC of Keystone over CockroachDB

Ronan-Alexandre Cherrueau ronan-alexandre.cherrueau at inria.fr
Mon Sep 4 16:06:43 UTC 2017


Hi folks,

Recently, as part of Inria's Discovery initiative[1], we got in touch
with the Cockroach Labs folks with an idea: make Keystone support
CockroachDB. So we gave it a try, and you can find a first result on
our GitHub[2]. The GitHub project consists of a Vagrantfile that spawns
a VM with a CockroachDB database and then installs Keystone with
Devstack, using CockroachDB as the backend.

CockroachDB claims to support the PostgreSQL wire protocol. It also
provides a SQLAlchemy dialect that mostly inherits from the PostgreSQL
one. So making Keystone work with CockroachDB should be easy peasy,
right? Almost! The rest of this email describes what we had to do to
make it work.


sqlalchemy-migrate doesn't work with CockroachDB
================================================

Keystone uses a database migration tool named sqlalchemy-migrate[3].
This tool is called during the deployment of Keystone to migrate the
database from its initial version to its current version.
Unfortunately, the migration failed with the following (partial)
stack trace:

,----
| DEBUG migrate.versioning.repository [-] Config: OrderedDict([('db_settings', OrderedDict([('__name__', 'db_settings'), ('repository_id', 'keystone'), ('version_table', 'migrate_version'), ('required_dbs', '[]'), ('use_timestamp_numbering', 'False')]))]) __init__ /usr/local/lib/python2.7/dist-packages/migrate/versioning/repository.py:83
| INFO migrate.versioning.api [-] 66 -> 67...
| CRITICAL keystone [-] KeyError: 'cockroachdb'
| ...
| TRACE keystone     visitorcallable = get_engine_visitor(engine, visitor_name)
| TRACE keystone   File "/usr/local/lib/python2.7/dist-packages/migrate/changeset/databases/visitor.py", line 47, in get_engine_visitor
| TRACE keystone     return get_dialect_visitor(engine.dialect, name)
| TRACE keystone   File "/usr/local/lib/python2.7/dist-packages/migrate/changeset/databases/visitor.py", line 62, in get_dialect_visitor
| TRACE keystone     migrate_dialect_cls = DIALECTS[sa_dialect_name]
| TRACE keystone KeyError: 'cockroachdb'
`----

As we understand it, sqlalchemy-migrate contains dedicated code for
each SQL backend, and there is no such code for CockroachDB. As a
workaround, we performed a second OpenStack deployment with PostgreSQL
as the backend and dumped the database after the migrations had run.
We then bypassed migrations in our first deployment by loading the
dump into CockroachDB. Note that we had to edit the pg_dump script a
little because some features are missing in CockroachDB.
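
To make the failure mode concrete, here is a minimal sketch of a
registry lookup keyed by dialect name, which is what the `DIALECTS'
line in the trace suggests. Everything except the `DIALECTS' name is a
hypothetical stand-in written for this email, not sqlalchemy-migrate's
real code. Reusing the PostgreSQL visitors for CockroachDB is an
untested assumption, shown only as a possible starting point.

```python
# Hypothetical stand-in for a dialect registry; names other than DIALECTS
# are illustrative and are not sqlalchemy-migrate's real API.


class PostgresVisitors(object):
    """Placeholder for a backend-specific set of DDL visitors."""
    name = "postgresql"


# Registry keyed by SQLAlchemy dialect name.
DIALECTS = {"postgresql": PostgresVisitors}


def get_dialect_visitors(dialect_name):
    # Raises KeyError for any dialect that was never registered.
    return DIALECTS[dialect_name]


def register_alias(new_name, existing_name):
    # Assumption: the new backend is close enough to an existing one
    # that its visitors can be reused as a first approximation.
    DIALECTS[new_name] = DIALECTS[existing_name]


try:
    get_dialect_visitors("cockroachdb")
    lookup_failed = False
except KeyError:
    lookup_failed = True

register_alias("cockroachdb", "postgresql")
visitors = get_dialect_visitors("cockroachdb")
```

In this sketch the first lookup fails just like in the trace, and the
second one succeeds once the alias is registered. Whether PostgreSQL's
DDL would actually run unchanged on CockroachDB is a separate question.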

This workaround is an obstacle to testing CockroachDB with other
OpenStack core services. We would be grateful if some of you could
help us add CockroachDB support to sqlalchemy-migrate.


oslo.db contains backend-specific code for error handling
=========================================================

The `oslo.db' library contains one file with backend-specific code
(`oslo_db/sqlalchemy/exc_filters.py'[4]). Each backend reports the
same class of error with its own exception and its own error message,
and this file hides those differences: intercepted exceptions are
wrapped into a common OpenStack exception so that the rest of
`oslo.db' does not have to deal with backend-specific errors.
CockroachDB doesn't produce the same errors as PostgreSQL, so we had
to update this file. Note that our POC is not exhaustive, since we
only added the error messages we saw during Tempest/Rally tests.
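
As an illustration of the idea, here is a minimal sketch of
message-based exception filtering. It is not the code from
`exc_filters.py': the `FILTERS' table, the `translate' helper and the
`DuplicateEntry' class are hypothetical names, and the CockroachDB
message pattern is an assumption made for the example rather than an
authoritative list.

```python
import re


class DuplicateEntry(Exception):
    """Backend-neutral error that the rest of the application can catch."""


# (dialect name, message pattern, common exception class).
# The patterns are illustrative assumptions, not an exhaustive list.
FILTERS = [
    ("postgresql",
     re.compile(r'duplicate key value violates unique constraint '
                r'"(?P<name>[^"]+)"'),
     DuplicateEntry),
    ("cockroachdb",
     re.compile(r'duplicate key value .* violates unique constraint '
                r'"(?P<name>[^"]+)"'),
     DuplicateEntry),
]


def translate(dialect, message):
    """Return a common exception for a backend error message, or None."""
    for name, pattern, exc_class in FILTERS:
        if name == dialect:
            match = pattern.search(message)
            if match:
                return exc_class(match.group("name"))
    return None


pg_error = translate(
    "postgresql",
    'duplicate key value violates unique constraint "ixu_user_name"')
crdb_error = translate(
    "cockroachdb",
    'duplicate key value (name)=(admin) violates unique constraint '
    '"ixu_user_name"')
unknown = translate("cockroachdb", "some unrelated failure")
```

Both backend messages end up as the same `DuplicateEntry', while a
message that no pattern recognises is left untranslated. That last
case is the reason a new backend needs its own entries.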

You can look at the differences between OpenStack/oslo.db/stable/pike
and our fork on GitHub[5]. We only added two lines!


CockroachDB manages isolation without locks
===========================================

CockroachDB lets you write transactions that respect ACID properties.
Regarding the "I" (Isolation), CockroachDB offers snapshot and
serializable isolation but doesn't rely on locks to provide them. So
whenever concurrent writing transactions end in a conflict, you have
to retry them. Fortunately, `oslo.db' already offers a decorator[6] to
do that automatically. However, based on the tests we ran with
Tempest/Rally, we figured out that some of Keystone's SQL requests
did not have the decorator and needed it.
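
For readers unfamiliar with the pattern, here is a minimal sketch of a
retry decorator. It is not the `oslo.db' decorator[6]: the
`retry_on_conflict' name, its parameters and the `ConflictError'
exception are hypothetical and only illustrate the general technique
of re-running a function when its transaction is aborted.

```python
import functools
import time


class ConflictError(Exception):
    """Hypothetical marker for a transaction aborted by a conflict."""


def retry_on_conflict(max_retries=5, delay=0.0):
    """Re-run the decorated function when it raises ConflictError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            attempt = 0
            while True:
                try:
                    return func(*args, **kwargs)
                except ConflictError:
                    attempt += 1
                    if attempt > max_retries:
                        raise  # give up and surface the conflict
                    time.sleep(delay * attempt)  # simple linear backoff
        return wrapper
    return decorator


calls = []


@retry_on_conflict(max_retries=3)
def update_user():
    # Stand-in for a Keystone SQL operation: conflicts twice, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise ConflictError()
    return "committed"


result = update_user()
```

The caller of `update_user' never sees the two conflicts. The flip
side is that any operation left undecorated surfaces the conflict as
an error, which matches what we observed in the tests.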

You can look at the differences between OpenStack/keystone/stable/pike
and our fork on GitHub[7].


What's next?
============

You can log into the VM as the stack user and run Rally tests (see
README). Tempest is disabled because we have an issue with its
installation[8]. Note that the modifications we made to make it work
are minimal. This is promising for the adoption of CockroachDB by
other core services. Nonetheless:

- Fixing the problem with sqlalchemy-migrate would ease the
  deployment process. If you are willing to help, we will be really
  grateful.

- We have to look at performance: Keystone over CockroachDB vs.
  Keystone over Galera. This part may involve more modifications of
  `oslo.db'. One thing we have in mind is the handling of transaction
  retries, since CockroachDB's documentation suggests an approach[9]
  that doesn't match the current `oslo.db' implementation.
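
To illustrate the kind of mismatch mentioned in the last item, here is
a minimal sketch of a savepoint-based client-side retry loop. It
assumes the `SAVEPOINT cockroach_restart' protocol and SQLSTATE 40001
as the retry signal, which is our reading of [9] and should be checked
against the documentation. `FakeCursor', `FakeRetryError' and
`run_transaction' are hypothetical names, and the fake cursor only
records statements so the sketch runs without a database.

```python
class FakeRetryError(Exception):
    """Hypothetical driver error carrying a SQLSTATE code."""
    pgcode = "40001"  # serialization failure, assumed to mean "retry"


class FakeCursor(object):
    """Records statements; fails the first RELEASE to simulate a conflict."""

    def __init__(self):
        self.statements = []
        self.failed_once = False

    def execute(self, sql):
        self.statements.append(sql)
        if sql.startswith("RELEASE") and not self.failed_once:
            self.failed_once = True
            raise FakeRetryError()


def run_transaction(cursor, work, max_retries=3):
    """Run work(cursor) in a transaction, retrying from the savepoint."""
    cursor.execute("BEGIN")
    cursor.execute("SAVEPOINT cockroach_restart")
    for _ in range(max_retries + 1):
        try:
            result = work(cursor)
            cursor.execute("RELEASE SAVEPOINT cockroach_restart")
            cursor.execute("COMMIT")
            return result
        except Exception as exc:
            if getattr(exc, "pgcode", None) != "40001":
                cursor.execute("ROLLBACK")
                raise  # not a retryable error
            # Retryable: rewind to the savepoint and run the work again.
            cursor.execute("ROLLBACK TO SAVEPOINT cockroach_restart")
    cursor.execute("ROLLBACK")
    raise RuntimeError("transaction still conflicting after retries")


def enable_user(cur):
    cur.execute("UPDATE user SET enabled = true")
    return "done"


cursor = FakeCursor()
outcome = run_transaction(cursor, enable_user)
```

Here the retry happens inside a single open transaction, at the SQL
level, whereas a decorator re-runs the whole Python function and opens
a new transaction each time. That difference is why adopting this
approach may require changes in `oslo.db'.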

Best,

[1] https://beyondtheclouds.github.io/
[2] https://github.com/BeyondTheClouds/openstack-cockroachdb-dev/
[3] https://github.com/openstack/sqlalchemy-migrate/tree/master/
[4] https://github.com/openstack/oslo.db/blob/stable/pike/oslo_db/sqlalchemy/exc_filters.py
[5] https://github.com/openstack/oslo.db/compare/openstack:stable/pike...BeyondTheClouds:cockroachdb/pike
[6] https://github.com/openstack/oslo.db/blob/stable/pike/oslo_db/api.py#L84
[7] https://github.com/openstack/keystone/compare/stable/pike...BeyondTheClouds:cockroachdb/pike
[8] https://bugs.launchpad.net/devstack/+bug/1714981
[9] https://www.cockroachlabs.com/docs/stable/transactions.html#transaction-retries

-- 
Ronan-A. Cherrueau
https://rcherrueau.github.io


