<div dir="ltr"><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Wed, Jan 29, 2014 at 11:52 AM, Ben Nemec <span dir="ltr"><<a href="mailto:openstack@nemebean.com" target="_blank">openstack@nemebean.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><u></u>
<div style="font-family:Verdana,Geneva,sans-serif">
<p>Okay, I think you've convinced me. Specific comments below.</p>
<p>-Ben</p><div class="im">
<p>On 2014-01-29 07:05, Doug Hellmann wrote:</p>
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">
<div dir="ltr">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Jan 28, 2014 at 8:47 PM, Ben Nemec <span><<a href="mailto:openstack@nemebean.com" target="_blank">openstack@nemebean.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="font-family:Verdana,Geneva,sans-serif">
<div>
<div>
<p>On 2014-01-27 11:42, Doug Hellmann wrote:</p>
<blockquote style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">
<div dir="ltr">
<div style="font-size:small">We have a blueprint open for separating translated log messages into different domains so the translation team can prioritize them differently (focusing on errors and warnings before debug messages, for example) [1]. Some concerns were raised related to the review [2], and I would like to address those in this thread and see if we can reach consensus about how to proceed.</div>
<div style="font-size:small">The implementation in [2] provides a set of new marker functions similar to _(), one for each log level (we have _LE, LW, _LI, _LD, etc.). These would be used in conjunction with _(), and reserved for log messages. Exceptions, API messages, and other user-facing messages all would still be marked for translation with _() and would (I assume) receive the highest priority work from the translation team.</div>
<div style="font-size:small">When the string extraction CI job is updated, we will have one "main" catalog for each app or library, and additional catalogs for the log levels. Those show up in transifex separately, but will be named in a way that they are obviously related. Each translation team will be able to decide, based on the requirements of their users, how to set priorities for translating the different catalogs.</div>
<div style="font-size:small">Existing strings being sent to the log and marked with _() will be removed from the main catalog and moved to the appropriate log-level-specific catalog when their marker function is changed. My understanding is that transifex is smart enough to recognize the same string from more than one source, and to suggest previous translations when it sees the same text. This should make it easier for the translation teams to "catch up" by reusing the translations they have already done, in the new catalogs.</div>
<div style="font-size:small">One concern that was raised was the need to mark all of the log messages by hand. I investigated using extraction patterns like "LOG.debug(" and "LOG.info(", but because of the way the translation actually works internally we cannot do that. There are a few related reasons.</div>
<div style="font-size:small">In other applications, the function _() translates a string at the point where it is invoked, and returns a new string object. OpenStack has a requirement that messages be translated multiple times, whether in the API or the LOG (there is already support for logging in more than one language, to different log files). This requirement means we delay the translation operation until right before the string is output, at which time we know the target language. We could update the log functions to create Message objects dynamically, except...</div>
<div style="font-size:small">Each app or library that uses the translation code will need its own "domain" for the message catalogs. We get around that right now by not translating many messages from the libraries, but that's obviously not what we want long term (we at least want exceptions translated). If we had a special version of a logger in oslo.log that knew how to create Message objects for the format strings used in logging (the first argument to LOG.debug for example), it would also have to know what translation domain to use so the proper catalog could be loaded. The wrapper functions defined in the patch [2] include this information, and can be updated to be application or library specific when oslo.log eventually becomes its own library.</div>
<div style="font-size:small">Further, as part of moving the logging code from oslo-incubator to oslo.log, and making our logging something we can use from other OpenStack libraries, we are trying to change the implementation of the logging code so it is no longer necessary to create loggers with our special wrapper function. That would mean that oslo.log will be a library for *configuring* logging, but the actual log calls can be handled with Python's standard library, eliminating a dependency between new libraries and oslo.log. (This is a longer, and separate, discussion, but I mention it here as backround. We don't want to change the API of the logger in oslo.log because we don't want to be using it directly in the first place.)</div>
<div style="font-size:small">Another concern raised was the use of a prefix _L for these functions, since it ties the priority definitions to "logs." I chose that prefix as an explicit indicate that these *are* just for logs. I am not associating any actual priority with them. The translators want us to move the log messages out of the main catalog. Having them all in separate catalogs is a refinement that gives them what they want -- some translators don't care about log messages at all, some only care about errors, etc. We decided that the translators should set priorities, and we would make that possible by separating the catalogs into logical groups. Everything marked with _() will still go into the main catalog, but beyond that it isn't up to the developers to indicate "priority" for translations.</div>
<div style="font-size:small">The alternative approach of using babel translator comments would, under other circumstances, help because each message could have some indication of its relative importance. However, it does not meet the requirement that the translators (and not the developers) set those priorities. It also doesn't help the translators because the main catalog does not shrink to hold only the user-facing messages. So the comments might be useful in addition to this proposed change, but they doesn't solve the original problem.</div>
<div style="font-size:small">If we all agree on the approach, I think the patches already in progress should be pretty easy to land in the incubator. The next step is to update the CI jobs that extract the messages and interact with transifex. After that, changes to the applications and existing libraries are likely to take longer, and could be done in batches. They may not happen until the next cycle, but I would like to have the infrastructure in place by the end of this one.</div>
<div style="font-size:small">Feedback?</div>
<div style="font-size:small">Doug</div>
<div style="font-size:small">[1] <a href="https://blueprints.launchpad.net/oslo/+spec/log-messages-translation-domain" target="_blank">https://blueprints.launchpad.net/oslo/+spec/log-messages-translation-domain</a></div>
<div style="font-size:small">[2] <a href="https://review.openstack.org/#/c/65518/" target="_blank">https://review.openstack.org/#/c/65518/</a></div>
</div>
</blockquote>
</div>
</div>
<p>I guess my thoughts are still largely the same as on the original review. This is already going to be an additional burden on developers and reviewers (who love i18n so much already ;-) and ideally I'd prefer that we be a little less granular with our designations. Something like _IMPORTANT and _OPTIONAL instead of separate translation domains for each individual log level. Maybe that can't get the translation load down to a manageable level though. I'm kind of guessing on that point.</p>
</div>
</blockquote>
<div> </div>
<div>
<div style="font-size:small">We did consider something like that at the summit, IIRC. However, we wanted to leave the job of setting the priority for doing the translation up to the translators, rather than the developers, because the priorities vary by language. Using designators that match the log output level lowers the review burden, because you don't have to think about the importance of translation, only whether or not the translator tag matches the log function. </div>
<div style="font-size:small"> </div>
</div>
</div>
</div>
</div>
</blockquote>
</div><div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>
<div style="font-size:small">Hmm, hadn't thought about it that way, but it does actually make more work for reviewers. I guess that means I'm good with the 1:1 log level:translation domain mapping. :-)</div>
<div style="font-size:small"> </div>
<div style="font-size:small">I wonder if we could add something into our log wrappers to check that Message domains match the log level in use. It wouldn't be able to catch everything, but maybe we could turn it on in the gate and at least verify anything that gets logged during those runs. Something to consider once we've implemented this, I guess.</div>
</div></div></div></div></div></blockquote><div><br></div><div><div class="gmail_default" style="font-size:small">I like that. I found a few cases in the incubator where we have something like:</div><div class="gmail_default" style="font-size:small">
<br></div><div class="gmail_default" style="font-size:small"> msg = _('some text: %s') % args</div><div class="gmail_default" style="font-size:small"> LOG.alevel(msg)</div><div class="gmail_default" style="font-size:small">
raise Exception(msg)</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">So we do have some known cases where the levels won't match. Because the message is used in an exception, I left it wrapped in _() for now, but we can think about whether that's the best approach once we have the basic pieces all working.</div>
</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="font-family:Verdana,Geneva,sans-serif"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">
<div>
</div>
</div>
</div>
</div><div class="im">
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="font-family:Verdana,Geneva,sans-serif">
<p><br>For reference, I grepped the nova source to see how many times we're logging at each of the different levels. It's a very rough estimate since I'm sure I'm missing some things and there are almost certainly some dupes, but I would expect it to be relatively close to reality. Here were the results:<br>
<br>[fedora@openstack nova]$ grep -ri log.error | wc -l<br>190<br>[fedora@openstack nova]$ grep -ri log.warn | wc -l<br>286<br>[fedora@openstack nova]$ grep -ri <a href="http://log.info" target="_blank">log.info</a> | wc -l<br>
254<br>[fedora@openstack nova]$ grep -ri log.debug | wc -l<br>849<br><br>It seems like debug is the low-hanging fruit here - getting rid of that eliminates more translations than the rest of the log levels combined (since it looks like Nova is translating the vast majority of their debug messages). I don't know if that's helpful (enough) though.</p>
</div>
</blockquote>
<div>
<div style="font-size:small">I'm not sure either. Daisy, would it solve your team's needs if we just removed translation markers from debug log messages and left everything in the same catalog? It's not what we talked about at the summit, but maybe it's an alternative?</div>
<div style="font-size:small"> </div>
</div>
</div>
</div>
</div>
</blockquote>
</div><div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div>
<div style="font-size:small">A lot of my motivation for getting these numbers was finding a "simpler" way to break down translation domains, but since I seem to have changed my mind on that I'm not as hung up on this. If we can accomplish what we need by dropping debug translations that would be great, but since those numbers don't include non-log translations I'm guessing it won't be enough. Still interested to hear from Daisy though.</div>
</div></div></div></div></div></blockquote><div><br></div><div><div class="gmail_default" style="font-size:small">Yes, I'd like to make sure our plans meet their needs. We could take both approaches -- not translate debug messages at all, and mark the others based on their level.</div>
</div><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div style="font-family:Verdana,Geneva,sans-serif"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">
<div>
</div>
</div>
</div>
</div><div class="im">
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="font-family:Verdana,Geneva,sans-serif">
<p><br>I suppose my biggest concern is getting reviewers to buy in to whatever we do. It's going to be some additional workload for them since we likely can't enforce this through a hacking rule, and some people basically refuse to touch anything to do with translation as it is. It's also one more hurdle for new contributors since it's a non-standard way of handling translation. And, as I noted on the review, it's almost certainly going to get out of sync over time as people adjust log message priorities and such. Maybe those are all issues we just have to accept, but they are issues.</p>
</div>
</blockquote>
<div>
<div style="font-size:small">I expect we'll need to set some project-wide standards, as Sean is doing with the meanings of the various log levels.</div>
</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="font-family:Verdana,Geneva,sans-serif">
<p><br>Oh, one other thing I wanted to ask about was what the status of Transifex is as far as OpenStack is concerned. My understanding was that we were looking for alternatives because Transifex had pretty much abandoned their open source version. Does that have any impact on this?</p>
</div>
</blockquote>
<div style="font-size:small">If we replace it, we will replace it with another tool. The file formats are standardized, so I wouldn't expect a tool change at that level to affect our decision on this question.</div>
<div style="font-size:small"> </div>
</div>
</div>
</div>
</blockquote>
</div><div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div style="font-size:small">Fair enough. Handling this gracefully would just become a requirement on any new tool we adopted.</div></div></div></div></div></blockquote><div><br></div><div class="gmail_default" style="font-size:small">
Agreed.</div><div class="gmail_default" style="font-size:small"><br></div><div class="gmail_default" style="font-size:small">Doug</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="font-family:Verdana,Geneva,sans-serif"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">
</div>
</div>
</div><div class="im">
<blockquote type="cite" style="padding-left:5px;border-left:#1010ff 2px solid;margin-left:5px">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div style="font-size:small"> </div>
<div style="font-size:small">Doug</div>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div style="font-family:Verdana,Geneva,sans-serif">
<p><br>Anyway, it's getting late and my driveway won't shovel itself, so those are my slightly rambling thoughts on this. :-)<span><span style="color:#888888"><br><br>-Ben</span></span></p>
</div>
<br>_______________________________________________<br> OpenStack-dev mailing list<br><a href="mailto:OpenStack-dev@lists.openstack.org" target="_blank">OpenStack-dev@lists.openstack.org</a><br><a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br></blockquote>
</div>
</div>
</div>
</blockquote>
<p> </p>
<div> </div>
</div></div>
</blockquote></div><br></div></div>