[openstack-dev] [oslo][oslo.db] MySQL Cluster support

Mike Bayer mbayer at redhat.com
Thu Feb 2 19:52:46 UTC 2017



On 02/02/2017 02:16 PM, Octave J. Orgeron wrote:
> Hi Doug,
>
> Comments below..
>
> Thanks,
> Octave
>
> On 2/2/2017 11:27 AM, Doug Hellmann wrote:
>> It sounds like part of the plan is to use the configuration setting
>> to control how the migration scripts create tables. How will that
>> work? Does each migration need custom logic, or can we build helpers
>> into oslo.db somehow? Or will the option be passed to the database
>> to change its behavior transparently?
>
> These are good questions. For each service, when the db sync or db
> manage operation is done it will call into SQL Alchemy or Alembic
> depending on the methods used by the given service. For example, most
> use SQL Alchemy, but there are services like Ironic and Neutron that use
> Alembic. It is within these scripts under the <service>/db/* hierarchy
> that the logic exist today to configure the database schema for any
> given service. Both approaches will look at the schema version in the
> database to determine where to start the create, upgrade, heal, etc.
> operations. What my patches do is that in the scripts where a table
> needs to be modified, there will be custom IF/THEN logic to check the
> cfg.CONF.database.mysql_storage_engine setting to make the required
> modifications. There are also use cases where the api.py or model(s).py
> under the <service>/db/ hierarchy needs to look at this setting as well
> for API and CLI operations where mysql_engine is auto-inserted into DB
> operations. In those use cases, I replace the hard coded "InnoDB" with
> the mysql_storage_engine variable.

can you please clarify "replace the hard coded "InnoDB" " ?    Are you 
proposing to send reviews for patches against all occurrences of 
"InnoDB" in files like 
https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py 
?    The "InnoDB" keyword is hardcoded in hundreds of migration files 
across all openstack projects that use MySQL.   Are all of these going 
to be patched with some kind of conditional?


>
> It would be interesting if we could develop some helpers to automate
> this, but it would probably have to be at the SQL Alchemy or Alembic
> levels.

not really, you can build a hook that intercepts operations like 
CreateTable, or that intercepts SQL as it is emitted over a connection, 
in order to modify these values on the fly.  But that is a specific kind 
of approach with it's own set of surprises.   Alternatively you can make 
an alternate SQLAlchemy dialect that no longer recognizes "mysql_*" as 
the prefix for these arguments.   There's ways to do this part.

But more critically I noticed you referred to altering the names of 
columns to suit NDB.  How will this be accomplished?   Changing a column 
name in an openstack application is no longer trivial, because online 
upgrades must be supported for applications like Nova and Neutron.  A 
column name can't just change to a new name, both columns have to exist 
and logic must be added to keep these columns synchronized.

Unfortunately, throughout all of the OpenStack services today we
> are hard coding things like mysql_engine, using InnoDB specific features
> (savepoints, nested operations, etc.), and not following the strict SQL
> orders for modifying table elements (foreign keys, constraints, and
> indexes).

Savepoints aren't InnoDB specific, they are a standard SQL feature and 
also their use is not widespread right now.   I'm not sure what you mean 
by "the strict SQL orders", we use ALTER TABLE as is standard in MySQL 
for this and it's behind an abstraction layer that supports other 
databases such as Postgresql.


>
>   * Many of the SQL Alchemy and Alembic scripts only import the minimal
>     set of python modules. If we imported others, we would also have to
>     initialize those name spaces which means a lot more code :(

I'm not sure what this means, can you clarify ?


    * Reduces the amount of overhead required to make these changes.

What sort of "overhead", do you mean code complexity, performance ?



>
>
>>
>> Keep in mind that we do not encourage code outside of libraries to
>> rely on configuration settings defined within libraries, because
>> that limits our ability to change the names and locations of the
>> configuration variables.  If migration scripts need to access the
>> configuration setting we will need to add some sort of public API
>> to oslo.db to query the value. The function can simply return the
>> configured value.
>
> Configuration parameters within any given service will make use of a
> large namespace that pulls in things from oslo and the .conf files for a
> given service. So even when an API, CLI, or DB related call is made,
> these namespaces are key for things to work. In the case of the SQL
> Alchemy and Alembic scripts, they also make use of this namespace with
> oslo, oslo.db, etc. to figure out how to connect to the database and
> other database settings. I don't think we need a public API for these
> kinds of calls as the community already makes use of the libraries to
> build the namespace. My oslo.db setting and patches for each service
> just make use of the cfg.CONF.database namespace to determine the
> correct behavior to execute.
>
>>
>> What other behaviors are likely to be changed by the new option?
>> Will application runtime behavior need to know about the storage
>> engine?
>
> The changes will be transparent to the application runtime behavior. The
> APIs and CLI tools call into the <service>/db/api.py as the entry point
> for database calls. Behind this you usually have a models.py that is
> aware of the database schema to understand the layout of things. So the
> underlining structure is abstracted away from the run-time. These entry
> points sometimes do require minor modifications to handle any hard coded
> issues or intercept functions like savepoints and nested operations.
> Again I use the cfg.CONF.database namespace to check for the appropriate
> behavior and implement IF/THEN logic to do the right thing.
>
> Some of my design objectives for all of these patches are:
>
>   * Zero impact on OpenStack functionality and usability (API, CLI, user
>     experience)
>   * No loss in database structure. Consistent foreign keys, constraints,
>     indexes, etc.
>   * Minimal impact on column size and/or types to fit within NDB table
>     row limits. Many columns are over-sized today.
>   * Validate functionality of APIs, service processes, and CLI. Tempest
>     is our friend :)
>   * Zero impact for users not using MySQL Cluster (NDB).
>
>
>>
>> Doug
>>
>>> Thanks,
>>> Octave
>>>
>>> On 2/2/2017 6:46 AM, Doug Hellmann wrote:
>>>> Excerpts from Octave J. Orgeron's message of 2017-02-01 20:33:38 -0700:
>>>>> Hi Folks,
>>>>>
>>>>> I'm working on adding support for MySQL Cluster to the core OpenStack
>>>>> services. This will enable the community to benefit from an
>>>>> active/active, auto-sharding, and scale-out MySQL database. My approach
>>>>> is to have a single configuration setting in each core OpenStack service
>>>>> in the oslo.db configuration section called mysql_storage_engine that
>>>>> will enable the logic in the SQL Alchemy or Alembic upgrade scripts to
>>>>> handle the differences between InnoDB and NDB storage engines
>>>>> respectively. When enabled, this logic will make the required table
>>>>> schema changes around:
>>>>>
>>>>>    * Row character length limits 65k -> 14k
>>>>>    * Proper SQL ordering of foreign key, constraints, and index operations
>>>>>    * Interception of savepoint and nested operations
>>>>>
>>>>> By default this functionality will not be enabled and will have no
>>>>> impact on the default InnoDB functionality. These changes have been
>>>>> tested on Kilo and Mitaka in previous releases of our OpenStack
>>>>> distributions with Tempest. I'm working on updating these patches for
>>>>> upstream consumption. We are also working on a 3rd party CI for
>>>>> regression testing against MySQL Cluster for the community.
>>>>>
>>>>> The first change set is for oslo.db and can be reviewed at:
>>>>>
>>>>> https://review.openstack.org/427970
>>>>>
>>>>> Thanks,
>>>>> Octave
>>>>>
>>>> Is it possible to detect the storage engine at runtime, instead of
>>>> having the operator configure it?
>>>>
>>>> Doug
>>>>
>>>> __________________________________________________________________________
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> --
>
> Oracle <http://www.oracle.com/>
> Octave J. Orgeron | Sr. Principal Architect and Software Engineer
> Oracle Linux OpenStack
> Mobile: +1-720-616-1550 <tel:+17206161550>
> 500 Eldorado Blvd. | Broomfield, CO 80021
> Certified Oracle Enterprise Architect: Systems Infrastructure
> <http://www.oracle.com/us/solutions/enterprise-architecture/index.html>
> Green Oracle <http://www.oracle.com/commitment> Oracle is committed to
> developing practices and products that help protect the environment
>
>
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



More information about the OpenStack-dev mailing list