[openstack-dev] [oslo] log message translations

Doug Hellmann doug.hellmann at dreamhost.com
Mon Jan 27 17:42:28 UTC 2014

We have a blueprint open for separating translated log messages into
different domains so the translation team can prioritize them differently
(focusing on errors and warnings before debug messages, for example) [1].
Some concerns were raised related to the review [2], and I would like to
address those in this thread and see if we can reach consensus about how to
proceed.

The implementation in [2] provides a set of new marker functions similar to
_(), one for each log level (we have _LE, _LW, _LI, _LD, etc.). These would
be used in conjunction with _(), and reserved for log messages. Exceptions,
API messages, and other user-facing messages all would still be marked for
translation with _() and would (I assume) receive the highest priority work
from the translation team.

When the string extraction CI job is updated, we will have one "main"
catalog for each app or library, and additional catalogs for the log
levels. Those show up in transifex separately, but will be named in a way
that they are obviously related. Each translation team will be able to
decide, based on the requirements of their users, how to set priorities for
translating the different catalogs.
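
To make the catalog split concrete, here is a small stand-alone sketch of an
extractor that routes strings to catalogs based on the marker function that
wraps them. It uses only the standard library ast module and is not how the
real CI job works; the catalog names and the sample source are assumptions
for illustration.

```python
import ast
from collections import defaultdict

# Hypothetical mapping from marker function name to catalog name.
MARKER_TO_CATALOG = {
    "_": "myapp",
    "_LI": "myapp-log-info",
    "_LW": "myapp-log-warning",
    "_LE": "myapp-log-error",
}


def extract(source):
    """Group literal strings by the catalog of the marker wrapping them."""
    catalogs = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id in MARKER_TO_CATALOG
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            catalog = MARKER_TO_CATALOG[node.func.id]
            catalogs[catalog].append(node.args[0].value)
    return dict(catalogs)


SAMPLE = '''
raise ValueError(_("Invalid size"))
LOG.error(_LE("Resize failed for %s"), volume_id)
LOG.info(_LI("Resize complete"))
'''

if __name__ == "__main__":
    for name, strings in sorted(extract(SAMPLE).items()):
        print(name, strings)
```

Changing a marker from _() to _LE() in the source is all it takes to move
that string from the main catalog to the error catalog on the next run.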

Existing strings being sent to the log and marked with _() will be removed
from the main catalog and moved to the appropriate log-level-specific
catalog when their marker function is changed. My understanding is that
transifex is smart enough to recognize the same string from more than one
source, and to suggest previous translations when it sees the same text.
This should make it easier for the translation teams to "catch up" by
reusing the translations they have already done, in the new catalogs.

One concern that was raised was the need to mark all of the log messages by
hand. I investigated using extraction patterns like "LOG.debug(" and
"LOG.info(", but because of the way the translation actually works
internally we cannot do that. There are a few related reasons.

In other applications, the function _() translates a string at the point
where it is invoked, and returns a new string object. OpenStack has a
requirement that messages be translated multiple times, whether in the API
or the LOG (there is already support for logging in more than one language,
to different log files). This requirement means we delay the translation
operation until right before the string is output, at which time we know
the target language. We could update the log functions to create Message
objects dynamically, except...

Each app or library that uses the translation code will need its own
"domain" for the message catalogs. We get around that right now by not
translating many messages from the libraries, but that's obviously not what
we want long term (we at least want exceptions translated). If we had a
special version of a logger in oslo.log that knew how to create Message
objects for the format strings used in logging (the first argument to
LOG.debug for example), it would also have to know what translation domain
to use so the proper catalog could be loaded. The wrapper functions defined
in the patch [2] include this information, and can be updated to be
application or library specific when oslo.log eventually becomes its own
library.

Further, as part of moving the logging code from oslo-incubator to
oslo.log, and making our logging something we can use from other OpenStack
libraries, we are trying to change the implementation of the logging code
so it is no longer necessary to create loggers with our special wrapper
function. That would mean that oslo.log will be a library for *configuring*
logging, but the actual log calls can be handled with Python's standard
library, eliminating a dependency between new libraries and oslo.log. (This
is a longer, and separate, discussion, but I mention it here as background.
We don't want to change the API of the logger in oslo.log because we don't
want to be using it directly in the first place.)

Another concern raised was the use of a prefix _L for these functions,
since it ties the priority definitions to "logs." I chose that prefix as an
explicit indication that these *are* just for logs. I am not associating any
actual priority with them. The translators want us to move the log messages
out of the main catalog. Having them all in separate catalogs is a
refinement that gives them what they want -- some translators don't care
about log messages at all, some only care about errors, etc. We decided
that the translators should set priorities, and we would make that possible
by separating the catalogs into logical groups. Everything marked with _()
will still go into the main catalog, but beyond that it isn't up to the
developers to indicate "priority" for translations.

The alternative approach of using babel translator comments would, under
other circumstances, help because each message could have some indication
of its relative importance. However, it does not meet the requirement that
the translators (and not the developers) set those priorities. It also
doesn't help the translators because the main catalog does not shrink to
hold only the user-facing messages. So the comments might be useful in
addition to this proposed change, but they don't solve the original
problem.

If we all agree on the approach, I think the patches already in progress
should be pretty easy to land in the incubator. The next step is to update
the CI jobs that extract the messages and interact with transifex. After
that, changes to the applications and existing libraries are likely to take
longer, and could be done in batches. They may not happen until the next
cycle, but I would like to have the infrastructure in place by the end of
this one.



[2] https://review.openstack.org/#/c/65518/