[openstack-dev] [oslo] i18n Message improvements

Luis A. Garcia luis at linux.vnet.ibm.com
Thu Oct 17 19:29:52 UTC 2013


On 10/17/2013 12:24 PM, John Dennis wrote:
> On 10/17/2013 12:22 PM,  Luis A. Garcia wrote:
>> On 10/16/2013 1:11 PM, Doug Hellmann wrote:
>>>
>>> [snip]
> What you're describing sounds a lot like problems that result from the
> fact that Python's default encoding is ASCII as opposed to the more
> sensible UTF-8. I have a long write up on this issue from a few years ago
> but I'll cut to the chase. Python will attempt to automatically encode
> Unicode objects into ASCII during output, which will fail if there are
> non-ASCII code points in the Unicode. Python does this in two
> distinct contexts depending on whether the destination of the output is
> a file or terminal. If it's a terminal it attempts to use the encoding
> associated with the TTY. Hence you can get different results if you
> output to a TTY or a file handle.
>
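
For anyone following along, here is a small sketch of the failure John describes. It is only an illustration: the sample text and the helper names (try_encode, write_with) are made up for this example, and the streams are in-memory stand-ins for a file or TTY configured with a given encoding.

```python
import io

# Sample text with one non-ASCII code point (illustrative only).
text = u"caf\u00e9"


def try_encode(value, encoding):
    """Return the encoded bytes, or None if the codec cannot represent value."""
    try:
        return value.encode(encoding)
    except UnicodeEncodeError:
        return None


def write_with(encoding):
    """Write text to an in-memory stream that uses the given encoding.

    Returns True if the write succeeded, False if encoding failed.
    """
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
        return True
    except UnicodeEncodeError:
        return False


ascii_result = try_encode(text, "ascii")   # ASCII cannot represent the text
utf8_result = try_encode(text, "utf-8")    # UTF-8 can

print("ascii:", ascii_result)
print("utf-8:", utf8_result)
print("ascii stream ok:", write_with("ascii"))
print("utf-8 stream ok:", write_with("utf-8"))
```

The same text succeeds or fails depending only on the encoding attached to the destination, which is why output to a TTY and output to a file handle can behave differently.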

Hi John, yeah, that is pretty much what is happening. Text is encoded to
a 'utf-8' byte string (that encoding is hardcoded, which is bad in this
case) and then the log.Formatter tries to decode it using the default
encoding, which is 'ascii'. Bad stuff.
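
A minimal sketch of that mismatch, not the actual oslo or logging code: the message text and the decode_with helper are hypothetical, and the decode is written out explicitly here, whereas in the real case it happens implicitly with the default encoding.

```python
# Hypothetical translated message containing a non-ASCII character.
message = u"Instancia no v\u00e1lida"

# Step 1: the text is encoded with a hardcoded 'utf-8'.
encoded = message.encode("utf-8")


def decode_with(data, encoding):
    """Decode data with the given encoding; return None if decoding fails."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None


# Step 2: decoding with 'ascii' (the default in our case) fails,
# while decoding with the encoding used in step 1 round-trips.
print("ascii:", decode_with(encoded, "ascii"))
print("utf-8:", decode_with(encoded, "utf-8"))
```

The point is just that the encoding used on the way in has to match the one used on the way out; hardcoding one side and defaulting the other breaks as soon as a non-ASCII character shows up.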

We documented some specific details about the problem with some proposed 
solutions here: https://etherpad.openstack.org/p/bug-1225099

> The simple solution to many of the encoding exceptions that Python will
> throw is to override the default encoding and change it to UTF-8. But
> the default encoding is locked by site.py due to internal Python string
> optimizations which cache the default encoded version of the string so
> the encoding happens only once. Changing the default encoding would
> invalidate cached strings and there is no mechanism to deal with that,
> that's why the default encoding is locked. But you can change the
> default encoding using this trick if you do it early enough during the
> module loading process:
>

This is something we considered adding to the list of proposed solutions
in the etherpad above, but we thought it was too drastic given the
problem and the alternatives, and afaik changing the default encoding is
discouraged. It would be good to hear your comments on the proposed
solutions.

Thank you,

-- 
Luis A. García
Cloud Solutions & OpenStack Development
IBM Systems and Technology Group
Ph: (915) 307-6568 | T/L: 363-6276

"Everything should be made as simple as possible, but not simpler."
                                         - Albert Einstein

"Simple can be harder than complex: You have to work hard to get
your thinking clean to make it simple."
                                         – Steve Jobs



