[openstack-dev] [Heat][oslo-incubator][oslo-log] Logging Unicode characters

Ben Nemec openstack at nemebean.com
Wed Dec 24 16:50:48 UTC 2014


On 12/24/2014 03:48 AM, Qiming Teng wrote:
> Hi,
> 
> While trying to let stack names in Heat use unicode strings, I got
> stuck on some odd logging behavior.
> 
> Suppose a stack's name is a non-ASCII string. When the stack then
> tries to log something here:
> 
> heat/engine/stack.py:
> 
>     LOG.info(_LI('Stack %(action)s %(status)s (%(name)s): '
>                  '%(reason)s'),
>              {'action': action,
>               'status': status,
>               'name': self.name,   # type(self.name)==unicode here
>               'reason': reason})
> 
> I'm seeing the following errors from h-eng session:
> 
> Traceback (most recent call last):
>   File "/usr/lib64/python2.6/logging/__init__.py", line 799, in emit
>     stream.write(fs % msg.decode('utf-8'))
>   File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
>     return codecs.utf_8_decode(input, errors, True)
> UnicodeEncodeError: 'ascii' codec can't encode characters in position 114-115: ordinal not in range(128)
> 
> Does this mean logging cannot handle Unicode correctly?  No.  I ran the
> following experiment:
> 
> $ cat logtest
> 
> #!/usr/bin/env python
> 
> import sys
> 
> from oslo.utils import encodeutils
> from oslo import i18n
> 
> from heat.common.i18n import _LI
> from heat.openstack.common import log as logging
> 
> i18n.enable_lazy()
> 
> LOG = logging.getLogger('logtest')
> logging.setup('heat')
> 
> print('sys.stdin.encoding: %s' % sys.stdin.encoding)
> print('sys.getdefaultencoding: %s' % sys.getdefaultencoding())
> 
> s = sys.argv[1]
> print('s is: %s' % type(s))
> 
> stack_name = encodeutils.safe_decode(unis)

I think you may have a typo in your sample here because unis isn't
defined as far as I can tell.

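In any case, I suspect this line is why your example works and Heat
doesn't.  I can reproduce the same error if I put raw UTF-8 bytes into a
unicode string and then call decode() on it.  On Python 2, decoding a
unicode object first encodes it with the default ascii codec, and that
implicit step is what fails: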

>>> test = u'\xe2\x82\xac'
>>> test.decode('utf8')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-2: ordinal not in range(128)
>>> test = '\xe2\x82\xac'
>>> test.decode('utf8')
u'\u20ac'

Whether that's what is going on here I can't say for sure though.
Trying to figure out unicode in Python usually gives me a headache. :-)
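
For anyone who wants to poke at this, here is a minimal sketch of the two
cases above, written so that it should run on both Python 2 and Python 3.
It performs the ascii encode explicitly, since that is the hidden step
Python 2 takes when decode() is called on a unicode object.  Note that
to_text is a hypothetical helper I made up for illustration; it is not
part of Heat or oslo, and I am not claiming it is the right fix for Heat.

```python
text_type = type(u'')  # unicode on Python 2, str on Python 3


def to_text(value, encoding='utf-8'):
    """Hypothetical helper: decode byte strings, pass text through as-is."""
    if isinstance(value, text_type):
        # Already text; decoding it again is what blows up on Python 2.
        return value
    if isinstance(value, bytes):
        return value.decode(encoding)
    raise TypeError('expected text or bytes, got %r' % type(value))


# UTF-8 bytes wrongly stored as code points in a text string.
mangled = u'\xe2\x82\xac'
try:
    # Python 2 does this implicitly inside unicode.decode().
    mangled.encode('ascii')
    implicit_encode_failed = False
except UnicodeEncodeError:
    implicit_encode_failed = True

euro_from_bytes = to_text(b'\xe2\x82\xac')  # real bytes decode cleanly
euro_from_text = to_text(u'\u20ac')         # text is returned unchanged
```

The point of the type check is that a value is only decoded when it is
actually a byte string, so passing something that is already unicode
never reaches the implicit ascii encode.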

> print('stack_name is: %s' % type(stack_name))
> 
> # stack_name is unicode here
> LOG.error(_LI('stack name: %(name)s') % {'name': stack_name})
> 
> $ ./logtest <some Chinese here>
> 
> [tengqm@node1 heat]$ ./logtest 中文
> sys.stdin.encoding: UTF-8
> sys.getdefaultencoding: ascii
> s is: <type 'str'>
> stack_name is: <type 'unicode'>
> 2014-12-24 17:51:13.799 29194 ERROR logtest [-] stack name: 中文
> 
> It worked.  
> 
> After spending more than one day on this, I'm seeking help from people
> here.  What's wrong with Unicode stack names here?
> 
> Any hints are appreciated.
> 
> Regards,
>   - Qiming