[openstack-dev] [oslo.db] [ndb] ndb namespace throughout openstack projects

Octave J. Orgeron octave.orgeron at oracle.com
Mon Jul 24 21:10:40 UTC 2017


Hi Michael,

Comments below..

On 7/24/2017 2:49 PM, Michael Bayer wrote:
> On Mon, Jul 24, 2017 at 3:37 PM, Octave J. Orgeron
> <octave.orgeron at oracle.com> wrote:
>> For these, here is a brief synopsis:
>>
>> AutoStringTinyText, will convert a column to the TinyText type. This is used
>> for cases where a 255 varchar string needs to be converted to a text blob to
>> make the row fit within the NDB limits. If you are using ndb, it'll convert
>> it to TinyText, otherwise it leaves it alone. The reason that TinyText type
>> was chosen is because it'll hold the same 255 varchars and saves on space.
>>
>> AutoStringText, does the same as the above, but converts the type to Text
>> and is meant for use cases where you need more than 255 varchar worth of
>> space. Good examples of these uses are where outputs of hypervisor and OVS
>> commands are dumped into the database.
>>
>> AutoStringSize, you pass two parameters, one being the non-NDB size and the
>> second being the NDB size. The point here is where you need to reduce the
>> size of the column to fit within the NDB limits, but you want to preserve
>> the String varchar type because it might be used in a key, index, etc. I
>> only use these in cases where the impacts are very low.. for example where a
>> column is used for keeping track of status (up, down, active, inactive,
>> etc.) that don't require 255 varchars.
> Can the "auto" that is supplied by AutoStringTinyText and
> AutoStringSize be merged?

I don't think it makes sense to make these global. We don't need to 
change all occurrences of String(255) to TinyText for example. We make 
that determination through understanding the table structure and usage. 
But I do like the idea of changing the second option to ndb_size=, I 
think that makes things very clear. If you want to collapse the use 
cases.. what about?:

oslo_db.sqlalchemy.String(255, ndb_type=TINYTEXT) -> VARCHAR(255) for 
most dbs, TINYTEXT for ndb
oslo_db.sqlalchemy.String(4096, ndb_type=TEXT) -> VARCHAR(4096) for most 
dbs, TEXT for ndb
oslo_db.sqlalchemy.String(255, ndb_size=64) -> VARCHAR(255) on most dbs, 
VARCHAR(64) on ndb

This way, we can override the String with TINYTEXT or TEXT or change the 
size for ndb.

>
> oslo_db.sqlalchemy.String(255)     -> VARCHAR(255) on most dbs,
> TINYTEXT() on ndb
> oslo_db.sqlalchemy.String(255, ndb_size=64)     -> VARCHAR(255) on
> most dbs, VARCHAR(64) on ndb
> oslo_db.sqlalchemy.String(50)     -> VARCHAR(50) on all dbs
> oslo_db.sqlalchemy.String(64)     -> VARCHAR(64) on all dbs
> oslo_db.sqlalchemy.String(80)     -> VARCHAR(64) on most dbs, TINYTEXT() on ndb
> oslo_db.sqlalchemy.String(80, ndb_size=55)     -> VARCHAR(64) on most
> dbs, VARCHAR(55) on ndb
>
> don't worry about implementation, can the above declaration ->
> datatype mapping work ?
>
> Also where are we using AutoStringText(), it sounds like this is just
> what SQLAlchemy calls the Text() datatype?   (e.g. an unlengthed
> string type, comes out as CLOB etc).
>
In my patch for Neutron, you'll see a lot of the AutoStringText() calls 
to replace exceptionally long String columns (4096, 8192, and larger).



>
>
>> In many cases, the use of these could be removed by simply changing the
>> columns to more appropriate types and sizes. There is a tremendous amount of
>> wasted space in many of the databases. I'm more than willing to help out
>> with this if teams decide they would rather do that instead as the long-term
>> solution. Until then, these functions enable the use of both with minimal
>> impact.
>>
>> Another thing to keep in mind is that the only services that I've had to
>> adjust column sizes for are:
>>
>> Cinder
>> Neutron
>> Nova
>> Magnum
>>
>> The other services that I'm working on like Keystone, Barbican, Murano,
>> Glance, etc. only need changes to:
>>
>> 1. Ensure that foreign keys are dropped and created in the correct order
>> when changing things like indexes, constraints, etc. Many services do these
>> proper steps already, there are just cases where this has been missed
>> because InnoDB is very forgiving on this. But other databases are not.
>> 2. Fixing the database migration and sync operations to use oslo.db, pass
>> the right parameters, etc. Something that should have been done in the first
>> place, but hasn't. So this more of a house cleaning step to insure that
>> services are using oslo.db correctly.
>>
>> The only other oddball use case is deal with disabling nested transactions,
>> where Neutron is the only one that does this.
>>
>> On the flip side, here is a short list of services that I haven't had to
>> make ANY changes for other than having oslo.db 4.24 or above:
>>
>> aodh
>> gnocchi
>> heat
>> ironic
>> manila
>>
>>> 3. it's not clear (I don't even know right now by looking at these
>>> reviews) when one would use "AutoStringTinyText" or "AutoStringSize".
>>> For example in
>>> https://review.openstack.org/#/c/446643/10/nova/db/sqlalchemy/migrate_repo/versions/216_havana.py
>>> I see a list of String(255)'s changed to one type or the other without
>>> any clear notion why one would use one or the other.  Having names
>>> that define simply the declared nature of the type would be most
>>> appropriate.
>>
>> One has to look at what the column is being used for and decide what
>> appropriate remediation steps are. This takes time and one must research
>> what kind of data goes in the column, what puts it there, what consumes it,
>> and what remediation would have the least amount of impact.
>>
>>> I can add these names up to oslo.db and then we would just need to
>>> spread these out through all the open ndb reviews and then also patch
>>> up Cinder which seems to be the only ndb implementation that's been
>>> merged so far.
>>>
>>> Keep in mind this is really me trying to correct my own mistake, as I
>>> helped design and approved of the original approach here where
>>> projects would be consuming against the "ndb." namespace.  However,
>>> after seeing it in reviews how prevalent the use of this extremely
>>> backend-specific name is, I think the use of the name should be much
>>> less frequent throughout projects and only surrounding logic that is
>>> purely to do with the ndb backend and no others.   At the datatype
>>> level, the chance of future naming conflicts is very high and we
>>> should fix this mistake (my mistake) before it gets committed
>>> throughout many downstream projects.
>>>
>>>
>>> [1] https://review.openstack.org/#/c/427970/
>>>
>>> [2] https://review.openstack.org/#/c/446643/
>>>
>>> [3] https://review.openstack.org/#/c/446136/
>>>
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev




More information about the OpenStack-dev mailing list