<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Comments below...<br>
<br>
<div class="moz-cite-prefix">On 5/21/2017 1:38 PM, Monty Taylor
wrote:<br>
</div>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">Hi all!
<br>
<br>
As the discussion around PostgreSQL has progressed, it has become
clear to me that there is a decently deep philosophical question
on which we do not currently share either definition or agreement.
I believe that the lack of clarity on this point is one of the
things that makes the PostgreSQL conversation difficult.
<br>
<br>
I believe the question is between these two things:
<br>
<br>
* Should OpenStack assume the existence of an external database
service that it treats as a black box on the other side of a
connection string?
<br>
<br>
* Should OpenStack take an active and/or opinionated role in
managing the database service?
<br>
<br>
A potentially obvious question about that (asked by Mike Bayer in
a different thread) is: "what do you mean by managing?"
<br>
<br>
What I mean by managing is doing all of the things you can do
related to database operational controls short of installing the
software, writing the basic db config files to disk and stopping
and starting the services. It means being much more prescriptive
about what types of config we support, validating config settings
that cannot be overridden at runtime and refusing to operate if
they are unworkable.
<br>
</blockquote>
<br>
I think it's helpful and important for us to have automation tooling
like TripleO, Puppet, etc. that can stand up a MySQL database. But
we also have to realize that as shops mature, they will deploy more
complicated database topologies, clustered configurations, and
replication scenarios. So I think we shouldn't go overboard with
being prescriptive. We also have to realize that in the enterprise
space, databases are usually deployed and managed by a separate
database team, which means less control over that layer. So we
shouldn't force people into this model. We should provide best
practice documentation and examples (TripleO, Puppet, Ansible,
etc.), and leave it up to the operator. <br>
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
Why would we want to be 'more active'? When managing and tuning
databases, there are some things that are driven by the
environment and some things that are driven by the application.
<br>
<br>
Things that are driven by the environment include things like the
amount of RAM actually available, whether or not the machines
running the database are dedicated or shared, firewall settings,
selinux settings and what versions of software are available.
<br>
</blockquote>
<br>
This is a good example of an area where we should focus on
documenting best practices and leave the implementation to the
operator. Guidelines around CPU, memory, security settings,
tunables, etc. are what's needed here. Today, there is very little
guidance or best practice on even sizing the database(s) for a
given deployment. <br>
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
Things that are driven by the application are things like
character set and collation, schema design, data types, schema
upgrade and HA strategies.<br>
</blockquote>
<br>
These are things that we can have a bit more control or direction
on.<br>
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
One might argue that HA strategies are an operator concern, but in
reality the set of workable HA strategies is tightly constrained
by how the application works, and pairing an application
expecting one HA strategy with a deployment implementing a
different one can have negative results ranging from unexpected
downtime to data corruption.
<br>
<br>
For example: An HA strategy using slave promotion and a VIP that
points at the current write master, paired with an application
incorrectly configured for that strategy, can lead to writes to
the wrong host after a failover event and an application that
seems to be running fine until the data turns up weird after a
while.
<br>
</blockquote>
<br>
This is definitely a more complicated area, one that becomes more
and more specific to the clustering technology being used. Galera
vs. MySQL Cluster is a good example. Galera has an active/passive
architecture where the above issues are certainly a concern, while
MySQL Cluster (NDB) has an active/active architecture, so losing a
node only affects uncommitted transactions - something that can
easily be addressed with a retry (see the sketch below). These
topologies will become more complicated as people start looking at
cross-regional replication and DR.<br>
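<br>
As a rough illustration of the NDB retry point - a minimal sketch
using oslo.db's retry decorator (the quotas table and the helper
function here are hypothetical stand-ins, not code from any
service):<br>
<pre>
# Sketch: retry a write whose first attempt dies because the node
# servicing it was lost; oslo.db re-runs it on a surviving node.
from oslo_db import api as oslo_db_api
from oslo_db.sqlalchemy import enginefacade
from sqlalchemy import text

@oslo_db_api.wrap_db_retry(max_retries=5,
                           retry_on_deadlock=True,
                           retry_on_disconnect=True)
def bump_quota(context, project_id, delta):
    # Each attempt is a fresh transaction; a failed attempt is
    # rolled back before the decorator retries it.
    with enginefacade.writer.using(context) as session:
        session.execute(
            text("UPDATE quotas SET in_use = in_use + :d "
                 "WHERE project_id = :p"),
            {"d": delta, "p": project_id})
</pre>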
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
For the areas in which the characteristics of the database are
tied closely to the application behavior, there is a constrained
set of valid choices at the database level. Sometimes that
constrained set only has one member.
<br>
<br>
The approach to those is what I'm talking about when I ask the
question about "external" or "active".
<br>
<br>
In the "external" approach, we document the expectations and then
write the code assuming that the database is set up appropriately.
We may provide some helper tools, such as 'nova-manage db sync'
and documentation on the sequence of steps the operator should
take.
<br>
<br>
In the "active" approach, we still document expectations, but we
also validate them. If they are not what we expect but can be
changed at runtime, we change them overriding conflicting
environmental config, and if we can't, we hard-stop indicating an
unsuitable environment. Rather than providing helper tools, we
perform the steps needed ourselves, in the order they need to be
performed, ensuring that they are done in the manner in which they
need to be done.
<br>
</blockquote>
<br>
This might be a trickier situation, especially if the database(s)
are in a separate or dedicated environment that the OpenStack
service processes don't have access to. For SQL commands this
isn't a problem, of course, but expecting to change the database's
configuration files and restart it is a harder ask.<br>
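<br>
To make the 'active' behavior concrete - a minimal sketch of the
validate/override/hard-stop loop, assuming only SQL access (the
variable list is illustrative, and SET GLOBAL of course requires
sufficient privileges):<br>
<pre>
from sqlalchemy import text

# (variable, expected value, changeable at runtime?)
EXPECTED = [
    ("wait_timeout", "3600", True),
    ("lower_case_table_names", "0", False),  # read-only at runtime
]

def validate_database(engine):
    with engine.connect() as conn:
        for name, want, dynamic in EXPECTED:
            got = str(conn.execute(
                text("SELECT @@GLOBAL." + name)).scalar())
            if got == want:
                continue
            if dynamic:
                # Override the conflicting environmental config.
                conn.execute(text("SET GLOBAL " + name + " = " + want))
            else:
                # Hard-stop: the environment is unsuitable.
                raise RuntimeError(name + " is " + got +
                                   ", expected " + want +
                                   "; refusing to operate")
</pre>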
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
Some examples:
<br>
<br>
* Character Sets / Collations
<br>
<br>
We currently enforce at testing time that all database migrations
are explicit about InnoDB. We also validate in oslo.db that table
character sets have the string 'utf8' in them (MySQL only). We do
not have any check for case-sensitive or case-insensitive
collations (these affect sorting and comparison operations).
Because we don't, different server config settings or different
database backends for different clouds can actually behave
differently through the REST API.
<br>
<br>
To deal with that:
<br>
<br>
First we'd have to decide whether case sensitive or case
insensitive was what we wanted. If we decided we wanted case
sensitive, we could add an enforcement of that in oslo.db, and
write migrations to get from case insensitive indexes to case
sensitive indexes on tables where we detected that a case
insensitive collation had been used. If we decided we wanted to
stick with case insensitive we could similarly add code to enforce
it on MySQL. To enforce it actively on PostgreSQL, we'd need to
either switch our code that's using comparisons to use the
sqlalchemy case-insensitive versions explicitly, or maybe write
some sort of overloaded driver for PG that turns all comparisons
into case-insensitive, which would wrap both sides of comparisons
in lower() calls (which has some indexing concerns, but let's
ignore that for the moment). We could also take the 'external'
approach and just document it, then define API tests and try to
tie the insensitive behavior in the API to Interop Compliance. I'm
not 100% sure how a db operator would remediate this - but PG has
some fancy computed index features - so maybe it would be
possible.
<br>
</blockquote>
<br>
I think that abstraction with oslo.db would be the right path here.
But you are also right that we need a consistent compliance policy
at the API layer. We may fix things down at the DB level with
oslo.db, but everything on top of that needs to fall in line as
well. There is a very high chance that there are hard-coded
workarounds or assumptions in the services and APIs today.<br>
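<br>
For the comparison half of this, the explicit-lower() option Monty
mentions is simple enough at the query layer - a sketch, with a
hypothetical model standing in for a real one:<br>
<pre>
from sqlalchemy import func

def find_by_name(session, model, name):
    # Forces case-insensitive matching regardless of the backend's
    # collation, so MySQL and PostgreSQL behave the same through
    # the REST API. The indexing concern is real: a plain index on
    # model.name won't help unless the backend supports an index
    # on lower(name), e.g. PostgreSQL's expression indexes.
    return (session.query(model)
            .filter(func.lower(model.name) == name.lower())
            .all())
</pre>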
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
A similar issue lurks with the fact that MySQL unicode storage is
3-byte by default and 4-byte is opt-in. We could take the
'external' approach and document it and assume the operator has
configured their my.cnf with the appropriate default, or take an
'active' approach where we override it in all the models and write
migrations to get us from 3-byte to 4-byte.
<br>
</blockquote>
<br>
I think an active approach on this would be ideal, just like the
utf8 and InnoDB enforcement today. FYI, not all services are
enforcing these in a consistent manner today. This is another
example of something that should be abstracted at the oslo.db layer
to take the human element out of the way.<br>
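<br>
For reference, the 'active' version of this is mostly a one-liner
per model plus a conversion migration - a sketch, with a
hypothetical table:<br>
<pre>
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Instance(Base):
    __tablename__ = 'instances'
    # Pin 4-byte unicode instead of inheriting whatever default the
    # operator's my.cnf happens to set.
    __table_args__ = {'mysql_charset': 'utf8mb4'}
    id = sa.Column(sa.Integer, primary_key=True)
    display_name = sa.Column(sa.String(255))

# The migration for existing 3-byte tables would be roughly
# (index length limits permitting):
#   ALTER TABLE instances CONVERT TO CHARACTER SET utf8mb4;
</pre>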
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
* Schema Upgrades
<br>
<br>
The way you roll out online schema changes is highly dependent on
your database architecture.
<br>
<br>
Just limiting to the MySQL world:
<br>
<br>
If you do Galera, you can roll them out in Total Order or
Rolling fashion. Total Order locks basically everything while it's
happening, so it isn't a candidate for "online". In Rolling, you
apply the schema change to one node at a time. If you do that, the
application has to be able to deal with both forms of the table,
and you have to deal with ensuring that data can replicate
appropriately while the schema change is happening.
<br>
<br>
If you do DRBD active/passive or a single-node deployment, you only
have one upgrade operation to perform, but you will only lock
certain things, depending on which schema change operations you
are performing.
<br>
<br>
If you do master/slave, you can roll out the schema change to your
slaves one at a time, wait for them all to catch up, then promote
a slave, taking the current master out of commission - update the
old master, then put it into the slave pool. Like Galera
Rolling, the app needs to be able to handle old and new versions
and the replication stream needs to be able to replicate between
the versions.
<br>
<br>
Making sure that the stream is able to replicate puts a set of
limitations on the types of schema changes you can perform, but it
is an understandable constrained set.
<br>
<br>
In either approach the OpenStack service has to be able to talk to
both old and new versions of the schema. And in either approach we
need to make sure to limit the schema change operations to the set
that can be accomplished in an online fashion. We also have to be
careful to not start writing values to new columns until all of
the nodes have been updated, because the replication stream can't
replicate the new column value to nodes that don't have the new
column.
<br>
</blockquote>
<br>
This is another area where something like MySQL Cluster (NDB) would
operate differently because of its active/active architecture, so
limiting the number of online changes while a table is locked
across the cluster would be very important. There are also the
application timeouts to consider, something that could again be
abstracted with oslo.db.<br>
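<br>
For the Galera Rolling case specifically, driving it from code is
not much more than flipping wsrep_OSU_method per node - a sketch,
with hypothetical node URLs and DDL (a real implementation would
also need to desync/resync nodes and handle failures):<br>
<pre>
import sqlalchemy
from sqlalchemy import text

DDL = "ALTER TABLE instances ADD COLUMN locked_reason VARCHAR(255)"

def rolling_schema_change(node_urls):
    for url in node_urls:
        engine = sqlalchemy.create_engine(url)
        with engine.connect() as conn:
            # RSU applies the DDL locally without replicating it,
            # so only this node is blocked while the change runs.
            conn.execute(text("SET SESSION wsrep_OSU_method = 'RSU'"))
            conn.execute(text(DDL))
            conn.execute(text("SET SESSION wsrep_OSU_method = 'TOI'"))
        engine.dispose()
</pre>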
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
In either approach we can decide to limit the number of
architectures we support for "online" upgrades.
<br>
<br>
In an 'external' approach, we make sure to do those things, we
write documentation and we assume the database will be updated
appropriately. We can document that if the deployer chooses to do
Total Order on Galera, they will not have online upgrades. There
will also have to be a deployer step to let the services know that
they can start writing values to the new schema format once the
upgrade is complete.
<br>
<br>
In an 'active' approach, we can notice that we have an update
available to run, and we can drive it from code. We can check for
Galera, and if it's there we can run the upgrade in Rolling
fashion one node at a time with no work needed on the part of the
deployer. Since we're driving the upgrade, we know when it's done,
so we can signal ourselves to start using the new version. We'd
obviously have to pick the set of acceptable architectures we can
handle consistently orchestrating.
<br>
</blockquote>
<br>
This would be an interesting idea to expand into an autonomic
orchestration framework within the control plane that handles the
database upgrades online and restarts the dependent services in the
correct order. If we only focus on the database piece, it may not
be as interesting for operators.<br>
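<br>
The detection piece, at least, is cheap - a sketch of checking for
Galera before picking an upgrade strategy (the wsrep status
variable is real; the strategy names are illustrative):<br>
<pre>
from sqlalchemy import text

def choose_upgrade_strategy(engine):
    with engine.connect() as conn:
        row = conn.execute(text(
            "SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size'")).fetchone()
    # Non-Galera servers return no wsrep_% status rows at all.
    if row and int(row[1]) > 1:
        return 'rolling'    # Galera: upgrade one node at a time
    return 'in_place'       # single node: one upgrade operation
</pre>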
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
* Versions
<br>
<br>
It's worth noting that behavior for schema updates and other
things changes over time with the backend database version. We set
minimum versions of other things, like libvirt and OVS - so we
might also want to set minimum versions for what we can support in
the database. That way we can know for a given release of
OpenStack what DDL operations are safe to use for a rolling
upgrade and what are not. That means detecting such a version and
potentially refusing to perform an upgrade if the version isn't
acceptable. That reduces the operator's ability to choose what
version of the database software to run, but increases our ability
to be able to provide tooling and operations that we can be
confident will work.
<br>
</blockquote>
<br>
Validating the MySQL database version is a good idea. The features
do change over time. A good example is how, in 5.7, you get
warnings that duplicate indexes will be dropped in a future
release, which will definitely affect multiple services today.<br>
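<br>
A version gate like that is only a few lines - a sketch, with an
illustrative (not proposed) minimum version:<br>
<pre>
from sqlalchemy import text

MIN_MYSQL = (5, 6, 0)

def assert_min_version(conn):
    raw = conn.execute(text("SELECT VERSION()")).scalar()
    # VERSION() returns e.g. '5.7.18-log'; keep the numeric prefix.
    parts = tuple(int(p) for p in raw.split('-')[0].split('.')[:3])
    if parts >= MIN_MYSQL:
        return
    raise RuntimeError("MySQL " + raw + " is below the minimum "
                       "supported version; refusing to upgrade")
</pre>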
<br>
<blockquote
cite="mid:522b78dc-c8af-eeeb-5593-da580c93cc8b@inaugust.com"
type="cite">
<br>
== Summary ==
<br>
<br>
These are just a couple of examples - but I hope they're at least
mildly useful to explain some of the sorts of issues at hand - and
why I think we need to clarify what our intent is separate from
the issue of what databases we "support".
<br>
<br>
Some operations have one and only one "right" way to be done. For
those operations if we take an 'active' approach, we can implement
them once and not make all of our deployers and distributors each
implement and run them. However, there is a cost to that.
Automatic and prescriptive behavior has a higher dev cost that is
proportional to the number of supported architectures. This then
implies a need to limit deployer architecture choices.
<br>
<br>
On the other hand, taking an 'external' approach allows us to
federate the work of supporting the different architectures to the
deployers. This means more work on the deployer's part, but also
potentially a greater amount of freedom on their part to deploy
supporting services the way they want. It means that some of the
things that have been requested of us - such as easier operation
and an increase in the number of things that can be upgraded with
no downtime - might become prohibitively costly for us to
implement.
<br>
<br>
I honestly think that both are acceptable choices we can make and
that for any given topic there are middle grounds to be found at
any given moment in time.
<br>
<br>
BUT - without a decision on our long-term philosophical intent in
this space that is clear and understandable to everyone, we cannot
have successful discussions about the impact
of implementation choices, since we will not have a shared
understanding of the problem space or the solutions we're talking
about.
<br>
<br>
For my part - I hear complaints that OpenStack is 'difficult' to
operate and requests for us to make it easier. This is why I have
been advocating some actions that are clearly rooted in an
'active' worldview.
<br>
<br>
Finally, this is focused on the database layer but similar
questions arise in other places. What is our philosophy on
prescriptive/active choices on our part, coupled with automated
action and ease of operation, versus expanded choices for the
deployer at the expense of configuration and operational
complexity? For
now let's see if we can answer it for databases, and see where
that gets us.
<br>
<br>
Thanks for reading.
<br>
<br>
Monty
<br>
<br>
</blockquote>
<br>
<div class="moz-signature"><br>
<font color="#666666" size="2" face="Verdana, Arial, Helvetica,
sans-serif"></font></div>
</body>
</html>