[openstack-dev] [Heat][oslo-incubator][oslo-log] Logging Unicode characters
Qiming Teng
tengqim at linux.vnet.ibm.com
Thu Dec 25 13:54:09 UTC 2014
On Wed, Dec 24, 2014 at 10:50:48AM -0600, Ben Nemec wrote:
> On 12/24/2014 03:48 AM, Qiming Teng wrote:
> > Hi,
> >
> > While trying to let stack names in Heat use unicode strings, I got
> > stuck on some odd logging behavior.
> >
> > Suppose a stack name is assigned a non-ASCII string. When the stack
> > then tries to log something here:
> >
> > heat/engine/stack.py:
> >
> > LOG.info(_LI('Stack %(action)s %(status)s (%(name)s): '
> >              '%(reason)s'),
> >          {'action': action,
> >           'status': status,
> >           'name': self.name,  # type(self.name)==unicode here
> >           'reason': reason})
> >
> > I'm seeing the following error from the h-eng session:
> >
> > Traceback (most recent call last):
> > File "/usr/lib64/python2.6/logging/__init__.py", line 799, in emit
> > stream.write(fs % msg.decode('utf-8'))
> > File "/usr/lib64/python2.6/encodings/utf_8.py", line 16, in decode
> > return codecs.utf_8_decode(input, errors, True)
> > UnicodeEncodeError: 'ascii' codec can't encode characters in position 114-115:
> > ordinal not in range(128)
> >
> > Does this mean logging cannot handle Unicode correctly? No. I ran the
> > following experiment:
> >
> > $ cat logtest
> >
> > #!/usr/bin/env python
> >
> > import sys
> >
> > from oslo.utils import encodeutils
> > from oslo import i18n
> >
> > from heat.common.i18n import _LI
> > from heat.openstack.common import log as logging
> >
> > i18n.enable_lazy()
> >
> > LOG = logging.getLogger('logtest')
> > logging.setup('heat')
> >
> > print('sys.stdin.encoding: %s' % sys.stdin.encoding)
> > print('sys.getdefaultencoding: %s' % sys.getdefaultencoding())
> >
> > s = sys.argv[1]
> > print('s is: %s' % type(s))
> >
> > stack_name = encodeutils.safe_decode(unis)
>
> I think you may have a typo in your sample here because unis isn't
> defined as far as I can tell.
You are right, it was a typo. It should be s here.
> In any case, I suspect this line is why your example works and Heat
> doesn't. I can reproduce the same error if I stuff some unicode data
> into a unicode string without decoding it first:
>
> >>> test = u'\xe2\x82\xac'
> >>> test.decode('utf8')
> Traceback (most recent call last):
> File "<stdin>", line 1, in <module>
> File "/usr/lib64/python2.7/encodings/utf_8.py", line 16, in decode
> return codecs.utf_8_decode(input, errors, True)
> UnicodeEncodeError: 'ascii' codec can't encode characters in position
> 0-2: ordinal not in range(128)
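The above didn't work because test is already a unicode object. On
Python 2, calling decode() on a unicode object first encodes it with the
default ASCII codec, and that implicit step is what fails.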
> >>> test = '\xe2\x82\xac'
> >>> test.decode('utf8')
> u'\u20ac'
This one works because test is of type 'str' (a byte string), so
decoding it is the right operation.
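
To spell out the difference between the two cases, here is a minimal
sketch. It is written in Python 3 syntax, where the ASCII encode that
Python 2 performs implicitly has to be written out by hand. Note that
to_text is a hypothetical helper made up for illustration; it is not the
oslo.utils safe_decode implementation, only the same general idea of
decoding byte strings and leaving text alone.

```python
# Hypothetical helper for illustration; not the oslo.utils implementation.
def to_text(value, encoding='utf-8'):
    """Return text: decode byte strings, pass text through unchanged."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


already_text = u'\xe2\x82\xac'   # three separate code points, already text
raw_bytes = b'\xe2\x82\xac'      # the UTF-8 bytes of the euro sign

# Python 2 runs this ASCII encode implicitly inside unicode.decode().
implicit_step_failed = False
failed_range = None
try:
    already_text.encode('ascii')
except UnicodeEncodeError as exc:
    implicit_step_failed = True
    failed_range = (exc.start, exc.end - 1)   # first and last bad position

print(implicit_step_failed, failed_range)     # True (0, 2)
print(to_text(raw_bytes) == u'\u20ac')        # bytes get decoded
print(to_text(already_text) is already_text)  # text is returned untouched
```

The (0, 2) range matches the "position 0-2" in the traceback quoted
above, which is why a decode call ends up reporting an *encode* error.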
> Whether that's what is going on here I can't say for sure though.
> Trying to figure out unicode in Python usually gives me a headache. :-)
Right. In Heat's case it is not just unicode conversion; quoting is
involved as well. The test string above has to be quoted when it becomes
part of a URI, which complicates the whole process further.
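
As a rough sketch of the quoting step (not Heat's actual code path, which
I have not reproduced here), the round trip looks like the following. It
assumes Python 3's urllib.parse; on Python 2 the same functions live in
the urllib module, and unquote returns a byte string that still needs to
be decoded.

```python
from urllib.parse import quote, unquote

stack_name = u'\u4e2d\u6587'  # same name as in the logtest run

# Percent-encode the UTF-8 bytes so the name is safe inside a URI path.
quoted = quote(stack_name.encode('utf-8'))

# Reverse the step on the way back in; Python 3 decodes as UTF-8 by default.
restored = unquote(quoted)

print(quoted)                  # %E4%B8%AD%E6%96%87
print(restored == stack_name)  # True
```

Every place that skips one of these steps, or applies it twice, hands a
differently typed value to the logging call.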
> > print('stack_name is: %s' % type(stack_name))
> >
> > # stack_name is unicode here
> > LOG.error(_LI('stack name: %(name)s') % {'name': stack_name})
> >
> > $ ./logtest <some Chinese here>
> >
> > [tengqm@node1 heat]$ ./logtest 中文
> > sys.stdin.encoding: UTF-8
> > sys.getdefaultencoding: ascii
> > s is: <type 'str'>
> > stack_name is: <type 'unicode'>
> > 2014-12-24 17:51:13.799 29194 ERROR logtest [-] stack name: 中文
> >
> > It worked.
> >
> > After spending more than one day on this, I'm seeking help from people
> > here. What's wrong with Unicode stack names here?
> >
> > Any hints are appreciated.
> >
> > Regards,
> > - Qiming
> >