[openstack-dev] UTF-8 required charset/encoding for openstack database?
Zhi Yan Liu
lzy.dev at gmail.com
Wed Mar 19 03:16:56 UTC 2014
On Wed, Mar 19, 2014 at 6:08 AM, Doug Hellmann
<doug.hellmann at dreamhost.com> wrote:
> On Mon, Mar 10, 2014 at 4:02 PM, Ben Nemec <openstack at nemebean.com> wrote:
>> On 2014-03-10 12:24, Chris Friesen wrote:
>>> I'm using havana and recently we ran into an issue with heat related to
>>> character sets.
>>> In heat/db/sqlalchemy/api.py in user_creds_get() we call
>>> _decrypt() on an encrypted password stored in the database and then
>>> try to convert the result to unicode. Today we hit a case where this
>>> errored out with the following message:
>>> UnicodeDecodeError: 'utf8' codec can't decode byte 0xf2 in position 0:
>>> invalid continuation byte
>>> We're using postgres and currently all the databases are using
>>> SQL_ASCII as the charset.
>>> I see that in icehouse heat will complain if you're using mysql and
>>> not using UTF-8. There don't seem to be any checks for other
>>> databases though.
>>> It looks like devstack creates most databases as UTF-8 but uses latin1
>>> for nova/nova_bm/nova_cell. I assume this is because nova expects to
>>> migrate the db to UTF-8 later. Given that those migrations specify a
>>> character set only for mysql, when using postgres should we explicitly
>>> default to UTF-8 for everything?
>> We just had a discussion about this in #openstack-oslo too. See the
>> discussion starting at 2014-03-10T16:32:26
>> While it seems Heat does require utf8 (or at least matching character
>> sets) across all tables, I'm not sure the current solution is good. It
>> seems like we may want a migration to help with this for anyone who might
>> already have mismatched tables. There's a lot of overlap between that
>> discussion and how to handle Postgres with this, I think.
>> I don't have a definite answer for any of this yet but I think it is
>> something we need to figure out, so hopefully we can get some input from
>> people who know more about the encoding requirements of the Heat and other
>> projects' databases.
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
> Based on the discussion from the project meeting today, the Glance team
> is going to write a migration to fix the database as the other projects have
> (we have not seen issues with corrupted data, so we believe this to be
> safe). However, there is one snag. In a follow-up conversation with Ben in
> #openstack-oslo, he pointed out that no migrations will run until the
> encoding is correct, so we do need to make some changes to the db code in
This is exactly right and that's why I proposed
> Here's what I think we need to do:
> 1. In oslo, db_sync() needs a boolean to control whether
> _db_schema_sanity_check() is called. This is an all-or-nothing flag (not the
> "for some tables" implementation that was proposed).
I'd like to use https://review.openstack.org/#/c/75356/ to handle this.
Doug, it would be great if you could remove your -2 from it, thanks.
> 2. Glance needs a migration to change the encoding of their tables.
I'm going to use https://review.openstack.org/#/c/75898/ to cover this.
> 3. In glance-manage, the code that calls upgrade migrations needs to look at
> the current state and figure out if the requested state is before or after
> the migration created in step 2. If it is before, it passes False to disable
> the sanity check. If it is after, it passes True to enforce the sanity
> check.
I will use https://review.openstack.org/#/c/75865/ to handle this.
What do you think about exposing the sanity-check-skipping flag to the
Glance deployer instead of handling it internally in db_sync?
I think that would be more flexible and would help deployers reach the
final DB migration target they need.
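For illustration, here is a minimal sketch of how steps 1-3 could fit
together. It is only one possible reading of the proposal, not the code in
any of the reviews above. The names plan_upgrade, run_upgrade and
CHARSET_FIX_VERSION are hypothetical, the version number is made up, and
db_sync here is any callable that accepts a target version plus a boolean
sanity_check argument (an assumed signature, not the actual oslo one).

```python
from typing import Callable, List, Tuple

# Hypothetical version number of the charset-fixing migration from step 2.
CHARSET_FIX_VERSION = 30


def plan_upgrade(current: int, requested: int,
                 fix_version: int = CHARSET_FIX_VERSION
                 ) -> List[Tuple[int, bool]]:
    """Return (target_version, sanity_check) phases for an upgrade.

    The check is disabled for any phase that ends at or before the
    charset fix, because the tables may still be mis-encoded there.
    It is enforced for any phase that starts at or after the fix.
    """
    if requested <= current:
        return []  # nothing to upgrade
    if current >= fix_version:
        # Tables were already converted, so the check can be enforced.
        return [(requested, True)]
    if requested <= fix_version:
        # We never get past the fix, so the check has to stay off.
        return [(requested, False)]
    # Crossing the fix: run up to it unchecked, then enforce the check.
    return [(fix_version, False), (requested, True)]


def run_upgrade(db_sync: Callable[..., None],
                current: int, requested: int) -> None:
    """Call db_sync once per phase with the matching flag."""
    for version, sanity_check in plan_upgrade(current, requested):
        db_sync(version, sanity_check=sanity_check)


if __name__ == "__main__":
    calls = []
    # Stand-in for the real sync function; it only records its arguments.
    run_upgrade(lambda v, sanity_check: calls.append((v, sanity_check)),
                current=10, requested=40)
    print(calls)
```

If the flag were exposed to the deployer instead, the deployer would in
effect be choosing these phases by hand.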
> Ben, did I miss any details?