[openstack-dev] [nova][ceilometer][postgresql][gate][telemetry] PostgreSQL gate failure (again)

Monty Taylor mordred at inaugust.com
Wed Feb 1 16:42:53 UTC 2017

On 02/01/2017 10:06 AM, gordon chung wrote:
> On 01/02/17 10:22 AM, Monty Taylor wrote:
>> I personally continue to be of the opinion that without an explicit
>> vocal and well-staffed champion, supporting postgres is more trouble
>> than it is worth. The vast majority of OpenStack deployments are on
>> MySQL - and what's more, the code is written with MySQL in mind.
>> Postgres and MySQL have different trade offs, different things each are
>> good at and different places in which each has weakness. By attempting
>> to support Postgres AND MySQL, we prevent ourselves from focusing
>> adequate attention on making sure that our support for one of them is
>> top-notch and in keeping with best practices for that database.
> is there actually mysql only code or do you mean that once things pass 
> on gate with mysql, devs are happy? if the former, i understand not 
> wanting to have to hack postgresql to pass gate.
>> So let me state my opinion slightly differently. I think we should
>> support one and only one RDBMS backend for OpenStack, and we should open
>> ourselves up to use advanced techniques for that backend. I don't
>> actually care whether that DB is MySQL or Postgres - but the corpus of
>> existing deployments on MySQL and the existing gate jobs I think make
>> the choice one way or the other simple.
> i just want to point out that the current bug passes in mysql because 
> it's more lax in how it parses sql and allows for some 
> non-deterministism. the postgresql gate arguably caught bad sql which is 
> a good thing? ideally, i'd like to have postgresql tested as well but it 
> may be because most/all telemetry's sql backend hacks were required 
> because of mysql and not postgres.

MySQL has the ability to be stricter in what how it parses and what it
accepts. If that is a thing we value, we could totally do some work to
bump up the MySQL strictness mode and keep it there.

That said - 'bad sql' isn't a construct that exists in a vacuum. SQL is
merely a tool to talk to an engine. So while MySQL may allow some SQL
that Postgres would refuse to parse, the ultimate goal is getting data
in and out of the backend database in a manner that is efficient. That's
important (and the real thing I'm on about with all of this) because
MySQL and Postgres have very different query optimizers and deal with
indexing differently. So a query that's written "well" for one of them
can be terrible in the other and vice veras. That doesn't make either
query "invalid" - it makes that query a bad query for that engine.

For instance, InnoDB with MySQL stores rows in a B+ tree clustered by
primary key. Secondary indexes store references to the primary key,
because the clustered primary key nature makes the primary key index a
covering index so a secondary index can get to the data with two
operations. However, since the secondary index holds the primary key, in
many cases it can be much more efficient to add an artificial autoinc
primary key to the table than to use a natural primary key like a UUID,
otherwise the seconary index is going to take much more memory and scans
will have to swap in and out of buffers more. That's a table design idea
that comes from knowing how the underlying storage engine works and also
how index scans use buffers and caches during query processing. The same
design may not be a good choice in Postgres.

While it's quite awesome that SQL is somewhat of a standard-ish (not
that anyone actually cares about the actual standard, which MySQL
implements, as opposed to what is considered the de-facto standard which
is Oracle's fork that Postgres implements) - the underlying engine
differences make writing code that is portable across database engines
not worth the effort except in small or trivial applications, IMO. For
things that aim for actual massive scale, designing the database to take
advantage of the actual characteristics of the database engine is paramount.

As I mentioned before, I don't think it matters which of the two we pick
- although I know _way_ more about MySQL personally and it has a much
more proven track record at absurdly large scale - I just argue that we
should pick one and then actually design our database interactions based
on how that one works.

I don't actually expect us to decide to do that, of course. Why would we
ever say no to more places for people to express their desire for
difference? But I strongly believe it would be in the best interests of
OpenStack if we did.

> full disclaimer: i use postgres for local dev and mysql for my 
> 'production' (because packstack didn't do postgres). aka i use whatever 
> is easier to install.
> cheers,

More information about the OpenStack-dev mailing list