[openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

Octave J. Orgeron octave.orgeron at oracle.com
Wed Jul 26 23:01:50 UTC 2017


Hi Michael,

Yeah, the data types are the same database-wise, it's just the total row 
size that differs between InnoDB and NDB when it comes to table 
structure. So it's more of a decision point of:

A. Change the column sizes or types so that the same definitions work 
across all database backends.
B. Have a mechanism to dynamically change the size or type of a specific 
column based on the database engine.

While I do like A, I know that it requires agreement and jumping through 
hoops to get there. B, on the other hand, requires some overhead in 
oslo.db and functions/args to make the dynamic change, but it reduces 
the impact, politics, and hoop jumping. Either way, we'll have to 
modify the migration scripts, deal with foreign keys, and still have to 
deal with any migrations that don't make proper use of oslo.db.
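
To make option B concrete, here's a minimal sketch of the kind of helper I 
have in mind. This is illustration only, not the actual oslo.db API; the 
use_ndb flag and the ndb_size/ndb_type arguments are hypothetical names:

    from sqlalchemy import String
    from sqlalchemy.dialects.mysql import TEXT, TINYTEXT

    def sized_string(length, use_ndb=False, ndb_size=None, ndb_type=None):
        """Pick a column type based on the active backend (sketch only)."""
        if not use_ndb:
            # InnoDB, PostgreSQL, etc. keep the declared size
            return String(length)
        if ndb_type is not None:
            # e.g. TINYTEXT or TEXT, so the column stops counting against
            # NDB's fixed row size
            return ndb_type()
        # otherwise just shrink the VARCHAR for NDB
        return String(ndb_size or length)

    # e.g., with a mysql_enable_ndb style flag in hand:
    #   sized_string(255, use_ndb, ndb_type=TINYTEXT) -> VARCHAR(255) / TINYTEXT
    #   sized_string(4096, use_ndb, ndb_type=TEXT)    -> VARCHAR(4096) / TEXT
    #   sized_string(255, use_ndb, ndb_size=64)       -> VARCHAR(255) / VARCHAR(64)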

Let me know which way we should proceed. I'd like to get things moving 
forward again.

Thanks,
Octave

On 7/26/2017 4:02 PM, Michael Bayer wrote:
> I realize now that we are in fact going for a total "row size", when I
> was under the impression that ndb had a simple limit of 64 characters
> for a VARCHAR.
>
> As I was going on completely wrong assumptions, I'd like to
> rethink the approach to datatypes.
>
> I do think a real migration that simply reduces the sizes of selected
> columns is the best approach in this case, and that the types like
> AutoStringXYZ should go away completely.
>
> To that end I've proposed reverting the one ndb patchset that has
> merged, which is the one in Cinder:
>
> https://review.openstack.org/#/c/487603/
>
> However, if Cinder declines to revert this, the "AutoXYZ" types in
> oslo.db (which have also been released) will have to go through a
> deprecation cycle.
>
> Additionally, my concern that projects will not have any way to guard
> against ever going over a 14K row size remains, and I still think that
> checks need to be put in place in oslo.db that would sum the total row
> size of any given table and raise an error if the limit is surpassed.
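
For illustration, the kind of row-size guard described above might look 
roughly like the sketch below. This is not actual oslo.db code; the 
14336-byte limit and the per-column byte estimates are rough assumptions:

    from sqlalchemy import String, Table

    NDB_MAX_ROW_BYTES = 14 * 1024  # the "14K" limit referenced above

    def check_ndb_row_size(table: Table) -> None:
        """Raise if a table's estimated in-row footprint exceeds the NDB limit."""
        total = 0
        for column in table.columns:
            if isinstance(column.type, String) and column.type.length:
                # utf8mb4 VARCHARs can take up to 4 bytes per character in-row
                total += column.type.length * 4
            else:
                # crude placeholder estimate for numeric/date/LOB-pointer columns
                total += 8
        if total > NDB_MAX_ROW_BYTES:
            raise ValueError(
                "table %s needs ~%d bytes per row, over the NDB limit of %d"
                % (table.name, total, NDB_MAX_ROW_BYTES))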
>
>
>
>
> On Wed, Jul 26, 2017 at 5:40 PM, Octave J. Orgeron
> <octave.orgeron at oracle.com> wrote:
>> Hi Michael,
>>
>> Comments below..
>>
>> On 7/26/2017 1:08 PM, Michael Bayer wrote:
>>
>>
>>
>> On Jul 25, 2017 3:38 PM, "Octave J. Orgeron" <octave.orgeron at oracle.com>
>> wrote:
>>
>> Hi Michael,
>>
>> I understand that you want to abstract this completely away inside of
>> oslo.db. However, the reality is that making column changes based purely on
>> the size and type of that column, without understanding what that column is
>> being used for, is extremely dangerous. You could end up clobbering a column
>> that needs a specific length for a value,
>>
>>
>>
>> Nowhere in my example is the current length truncated.   Also, if two
>> distinct lengths truly must be maintained, we add a field "minimum_length".
>>
>>
>> ... prevent an index from working, etc.
>>
>>
>> That's what the indexable flag would achieve.
>>
>> It wouldn't make sense to just do global changes on a column based on the
>> size.
>>
>>
>> This seems to be what your patches are doing, however.
>>
>>
>> This is incorrect. I only change columns that meet my criteria for being
>> changed. I'm not globally changing columns across every table and service.
>> So to be clear and make sure we are on the same page:
>>
>> Are you proposing that we continue to select specific columns and adjust
>> their size by using the below, instead of the ndb.Auto* functions?
>>
>> oslo_db.sqlalchemy.String(<value>, indexable=<value>, ndb_size=<value>,
>> ndb_type=<value>)
>>
>> i.e.
>>
>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for most
>> dbs, TINYTEXT for ndb
>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>> dbs, TEXT for ndb
>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>> VARCHAR(64) on ndb
>>
>> So if I need to change a column that today says:
>>
>> sa.String(4096)
>>
>> I would modify it to:
>>
>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT)
>>
>> OR
>>
>> Are you proposing that we change every single column across every single
>> database blindly using some logic in oslo.db, where even if a column doesn't
>> need to be changed, it gets changed based on the database engine type and
>> the size of the column?
>>
>> So even if we have a table that doesn't need to be changed or touched, we
>> would end up with:
>>
>> mysql_enable_ndb = True
>>
>> sa.String(255) -> TINYTEXT
>>
>> If that is the type of behavior you are aiming for, I don't think that
>> makes sense.
>>
>>
>>
>>
>>
>> There are far more tables that fit in both InnoDB and NDB already than those
>> that don't. As I've stated many times before, the columns that I make
>> changes to are evaluated to understand:
>>
>> 1. What populates it?
>> 2. Who consumes it?
>> 3. What are the possible values and required lengths?
>> 4. What is the impact of changing the size or type?
>> 5. Evaluated against the other columns in the table, which one makes the
>> most sense to adjust?
>>
>> I don't see a way of automating that and making it maintainable without a
>> lot more overhead in code and people.
>>
>>
>> My proposal is intended to *reduce* the great verbosity in the current
>> patches I see and remove the burden of every project having to be aware of
>> "ndb" every time a column is added.
>>
>>
>> I agree with using as few arguments as possible to the
>> oslo_db.sqlalchemy.String function. But at the same time, if a column needs
>> to be adjusted, someone has to put the right arguments there. As far as the
>> burden goes, Oracle is already taking ownership of making MySQL Cluster work
>> across services, which means maintaining patches and creating new ones as
>> projects evolve.
>>
>> Also, if we want one behavior for NDB, another for PostgreSQL, and yet
>> another for DB2 or Oracle DB, wouldn't we need to be somewhat verbose on
>> what we want?
>>
>> i.e.
>>
>> String(8192, ndb_type=TEXT, pgs_type=text, db2_type=CLOB, ora_type=CLOB)
>>
>>
>>
>>
>> If we really want to remove the complexity here, why don't we just change
>> the sizes and types on this handful of table columns so that they fit within
>> both InnoDB and NDB?
>>
>>
>>
>> Because that requires new migrations which are a great risk and
>> inconvenience to projects.
>>
>>
>> When it comes to projects that need table columns adjusted, so far we are
>> only talking about Cinder, Neutron, Nova, and Magnum. Also, let's keep in
>> mind that it's only a handful of tables that are being touched. I still feel
>> this is being blown out of proportion. Here are some metrics to consider
>> for the 4 services with tables that need to be adjusted:
>>
>> Service:    # of tables with columns changed:
>> Cinder      1
>> Neutron     5
>> Nova        1
>> Magnum      2
>>
>> With the exception of Magnum, those services tend to have over 75 or 100
>> tables. So I ask, are we blowing this out of proportion compared to the
>> normal churn on tables in these services? For example, Neutron dropped 30+ tables
>> and changed dozens. These databases are not so static over time to begin
>> with.
>>
>>
>>
>> That way we don't need these functions and the tables are exactly the same?
>> That would only leave us with the createtable, savepoint/rollback, etc.
>> stuff to address, which is already taken care of in the ndb module in
>> oslo.db? Then we just fix the foreign key stuff as I've been doing, since it
>> has zero impact on InnoDB deployments and if anything ensures things are
>> consistent. That would then leave us to really focus on fixing migrations to
>> use oslo.db and pass the correct flags, which is a more lengthy process than
>> the rest of this.
>>
>> I don't see the point in trying to make this stuff any more complicated.
>>
>>
>> The proposal is to make it simpler than it is right now.
>>
>> Run through every column change you've proposed and show me which ones
>> don't fit into my proposed ruleset. I will add additional declarative flags
>> to ensure those use cases are covered.
>>
>>
>>
>>
>>
>> Octave
>>
>>
>> On 7/25/2017 12:20 PM, Michael Bayer wrote:
>>> On Mon, Jul 24, 2017 at 5:41 PM, Michael Bayer <mbayer at redhat.com> wrote:
>>>>> oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for
>>>>> most
>>>>> dbs, TINYTEXT for ndb
>>>>> oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most
>>>>> dbs, TEXT for ndb
>>>>> oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs,
>>>>> VARCHAR(64) on ndb
>>>>>
>>>>> This way, we can override the String with TINYTEXT or TEXT or change the
>>>>> size for ndb.
>>>>>> oslo_db.sqlalchemy.String(255)     -> VARCHAR(255) on most dbs,
>>>>>> TINYTEXT() on ndb
>>>>>> oslo_db.sqlalchemy.String(255, ndb_size=64)     -> VARCHAR(255) on
>>>>>> most dbs, VARCHAR(64) on ndb
>>>>>> oslo_db.sqlalchemy.String(50)     -> VARCHAR(50) on all dbs
>>>>>> oslo_db.sqlalchemy.String(64)     -> VARCHAR(64) on all dbs
>>>>>> oslo_db.sqlalchemy.String(80)     -> VARCHAR(80) on most dbs,
>>>>>> TINYTEXT()
>>>>>> on ndb
>>>>>> oslo_db.sqlalchemy.String(80, ndb_size=55)     -> VARCHAR(80) on most
>>>>>> dbs, VARCHAR(55) on ndb
>>>>>>
>>>>>> don't worry about implementation, can the above declaration ->
>>>>>> datatype mapping work ?
>>>>>>
>>>>>>
>>>>> In my patch for Neutron, you'll see a lot of the AutoStringText() calls
>>>>> to
>>>>> replace exceptionally long String columns (4096, 8192, and larger).
>>>> MySQL supports large VARCHAR now, OK. Yeah, this could be
>>>> String(8192, ndb_type=TEXT) as well.
>>> OK, no, sorry. Each time I think of this I keep seeing the verbosity of
>>> imports etc. in the code, because if we had:
>>>
>>> String(80, ndb_type=TEXT)
>>>
>>> then we have to import both String and TEXT, and then what if there's
>>> ndb.TEXT, the code is still making an ndb-specific decision, etc.
>>>
>>> I still see that this can be mostly automated from a simple ruleset
>>> based on the size:
>>>
>>> length <= 64 :    VARCHAR(length) on all backends
>>> length > 64, length <= 255:   VARCHAR(length) for most backends,
>>> TINYTEXT for ndb
>>> length > 4096:  VARCHAR(length) for most backends, TEXT for ndb
>>>
>>> the one case that seems outside of this is:
>>>
>>> String(255)  where they have an index or key on the VARCHAR, and in
>>> fact they only need < 64 characters to be indexed.  In that case you
>>> don't want to use TINYTEXT, right?   So one exception:
>>>
>>> oslo_db.sqlalchemy.types.String(255, indexable=True)
>>>
>>> e.g. a declarative hint to the oslo_db backend to not use a LOB type.
>>>
>>> then we just need oslo_db.sqlalchemy.types.String, and virtually
>>> nothing except the import has to change, and a few keywords.
>>>
>>> What we're trying to do in oslo_db is as much as possible state the
>>> intent of a structure or datatype declaratively, and leave as much of
>>> the implementation up to oslo_db itself.
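
For illustration, the size-based ruleset and the indexable hint above could 
boil down to something like the sketch below. This is not the oslo.db 
implementation; the oslo_string name and the use_ndb flag are hypothetical, 
and the treatment of indexable columns and of the 255-4096 range is just one 
reading of the rules:

    from sqlalchemy import String
    from sqlalchemy.dialects.mysql import TEXT, TINYTEXT

    def oslo_string(length, indexable=False, use_ndb=False):
        """Declarative String factory following the ruleset above (sketch only)."""
        if not use_ndb or length <= 64:
            # VARCHAR(length) on all backends
            return String(length)
        if indexable:
            # the declarative hint: never hand back a LOB type; one reading is
            # to cap the VARCHAR so it stays index-friendly on ndb
            return String(min(length, 64))
        if length <= 255:
            # 64 < length <= 255 becomes TINYTEXT on ndb
            return TINYTEXT()
        # the 255 < length <= 4096 band is not spelled out above; this sketch
        # folds it into the TEXT case along with length > 4096
        return TEXT()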
