[openstack-dev] [tc] revised Postgresql support status patch for governance

Mike Bayer mbayer at redhat.com
Thu May 18 17:02:45 UTC 2017

On 05/17/2017 02:38 PM, Sean Dague wrote:
> Some of the concerns/feedback has been "please describe things that are
> harder by this being an abstraction", so examples are provided.

so let's go through this list:

- OpenStack services taking a more active role in managing the DBMS

mmmm, "managing" is vague to me, are we referring to the database 
service itself, e.g. starting / stopping / configuring?   installers 
like tripleo do this now, pacemaker is standard in HA for control of 
services, I think I need some background here as to what the more active 
role would look like.

- The ability to have zero down time upgrade for services such as

So "zero down time upgrades" seems to have broken into:

* "expand / contract with the code carefully dancing around the 
existence of two schema concepts simultaneously", e.g. nova, neutron. 
AFAIK there is no particular issue supporting multiple backends on this 
because we use alembic or sqlalchemy-migrate to abstract away basic 
ALTER TABLE types of feature.

* "expand / contract using server side triggers to reconcile the two 
schema concepts", e.g. keystone.   This is more difficult because there 
is currently no "trigger" abstraction layer.   Triggers represent more 
of an imperative programming model vs. typical SQL,  which is why I've 
not taken on trying to build a one-size-fits-all abstraction for this in 
upstream Alembic or SQLAlchemy.   However, it is feasible to build a 
"one-size-that-fits-openstack-online-upgrades" abstraction.  I was 
trying to gauge interest in helping to create this back in the 
"triggers" thread, in my note at 
which also referred to some very raw initial code examples.  However, it 
received strong pushback from a wide range of openstack veterans, which 
led me to believe this was not a thing that was happening.   Apparently 
Keystone has gone ahead and used triggers anyway, however I was not 
pulled into that process.   But if triggers are to be "blessed" by at 
least some projects, I can likely work on this problem for MySQL / 
Postgresql agnosticism.  If keystone is using triggers right now for 
online upgrades, I would ask, are they currently working on Postgresql 
as well with PG-specific triggers, or does Postgresql degrade into a 
"non-online" migration scenario if you're running Keystone?

- Consistent UTF8 4 & 5 byte support in our APIs

"5 byte support" appears to refer to utf-8's ability to be...well a 
total of 6 bytes.    But in practice, unicode itself only needs 4 bytes 
and that is as far as any database supports right now since they target 
unicode (see https://en.wikipedia.org/wiki/UTF-8#Description).  That's 
all any database we're talking about supports at most.  So...lets assume 
this means four bytes.

 From the perspective of database-agnosticism with regards to database 
and driver support for non-ascii characters, this problem has been 
solved by SQLAlchemy well before Python 3 existed when many DBAPIs would 
literally crash if they received a u'' string, and the rest of them 
would churn out garbage; SQLAlchemy implemented a full encode/decode 
layer on top of the Python DBAPI to fix this.  The situation is vastly 
improved now that all DBAPIs support unicode natively.

However, on the MySQL side there is this complexity that their utf-8 
support is a 3-byte only storage model, and you have to use utf8mb4 if 
you want the four byte model.   I'm not sure right now what projects are 
specifically hitting issues related to this.

Postgresql doesn't have such a limitation.   If your Postgresql server 
or specific database is set up for utf-8 (which should be the case), 
then you get full utf-8 character set support.

So I don't see the problem of "consistent utf8 support" having much to 
do with whether or not we support Posgtresql - you of course need your 
"CREATE DATABASE" to include the utf8 charset like we do on MySQL, but 
that's it.

- The requirement that Postgresql libraries are compiled for new users
   trying to just run unit tests (no equiv is true for mysql because of
   the pure python driver).

I would suggest that new developers for whom the presence of things like 
postgresql client libraries is a challenge (but somehow they are running 
a MySQL server for their pure python driver to talk to?)  don't actually 
have to worry about running the tests against Postgresql, this is how 
the "opportunistic" testing model in oslo.db has always worked; it only 
runs for the backends that you have set up.

Also, openstack got all the way through Kilo approximately using the 
native python-MySQL driver which required a compiled client library as 
well as the MySQL dependencies be installed.  The psycopg2 driver has a 
ton of whl's up on pypi (https://pypi.python.org/pypi/psycopg2) and all 
linux distros supply it as a package in any case, so an actual "compile" 
should not be needed.   Also, this is Openstack....it's basic existence 
is a kind of this vastly enormous glue between thousands of native 
(ultimately C-compiled) libraries and packages, and it runs only 
on...linux.    So this is just a weird point to bring up.  Seems a 
little red herring to me.

- Consistency around case sensitivity collation defaults that lead to
   strange bugs around searching/updating names in resources.

Finally, a real issue-ish thing that requires a resolution.    So the 
good news here is that while MySQL is defaulting to case-insensitive 
collations (which I assume we like) but Postgresql has almost no support 
for case-insensitive collations (unless you use this odd CITEXT 
datatype), it is possible to make a case-sensitive collation style 
become case-insensitive at SQL query time much more easily than it would 
be to go the other way.

SQLAlchemy already has some case-insensitive operators, most notably 
"ilike()", which is a case-insensitive "LIKE" that is backend agnostic. 
  If these search queries are just using LIKE then they only need use 
ilike() from SQLAlchemy instead of like().

If we are talking about the full range of operators like ==, !=, <, > 
etc., and/or if we are also concerned that developers may use like() 
when they really need to use ilike(), the issue can be addressed at the 
typing level as well.  Using SQLAlchemy, a String datatype that 
guarantees case-insensitive comparisons is straightforward to construct. 
   This would be a Python side replacement for the String type, and 
possibly Text, Unicode, etc. as needed.   It does **not** imply any 
database schema migrations.   The hypothetical CaseInsensitiveString 
would override all the comparison operators to ensure that on the 
Postgresql (or other case-sensitive) backend, both sides of the 
expression are embedded within the SQL LOWER() function, so that these 
comparisons act case insensitively.    The trick then is to get the 
downstream projects to use this type (which would be in oslo.db) in 
their models, which is the usual herding cats job.  But this is a pretty 
solvable problem.

so to sum up:

1. "more active management role" - not totally sure what that refers to 
beyond what is already present

2. "zero downtime upgrades" - to the extent that projects use server 
side constructs like triggers, effort is needed to produce 
implementations for both backends.   I can help with this effort but the 
downstream projects need to care and be open to reviewing / merging 
gerrits and things.   Not sure what the state of trigger-based online 
upgrades is for Postgresql right now, or if there is a degradation mode
present in Keystone w/ Postgresql.

3. utf-8 - no problem

4. native clients - red herring ?

5. case-insensitive - both query-level and model-level solutions exist 
that work within our existing frameworks and patterns

> A statement around support was also put in there, because support only
> meant QA jobs, or only developers for some folks. I think it's important
> to ensure we paint the whole picture with how people get support in an
> Open Source project.
> There seems to be general agreement that we need to be more honest with
> users, and that we've effectively been lying to them.
> I feel like the current sticking points come down to whether:
> * it's important that the operator community largely is already in one
> camp or not
> * future items listed that are harder are important enough to justify a
> strict trade off here
> * it's ok to have the proposal have a firm lean in tone, even though
> it's set of concrete actions are pretty reversible and don't commit to
> future removal of postgresql
> Also, as I stated on IRC, if some set of individuals came through and
> solved all the future problems on the list for us as a community, my
> care on how many DBs supported would drastically decrease. Because its
> the fact that it's costing us solving real problems that we want to
> solve (by making them too complex for anyone to take on), is my key
> concern. For folks asking the question about what they could do to make
> pg a first class citizen, that's a pretty good starting point.
> 	-Sean

More information about the OpenStack-dev mailing list