I've come across this review for automated translations coming into Horizon: https://review.openstack.org/#/c/91523/

It concerns me that it has over half a million LOC in it, and many of these languages have near 0 translation. (I checked here: https://www.transifex.com/projects/p/horizon/). 0 translation languages have just as many lines as full translations; they are just shorter.

I can't give you a specific well-defined issue about why this concerns me, but I'm worried about things like how big it might make our distributions, or adding time to how long it takes to clone a new Horizon instance, or use devstack.

I've chatted a bit about this in #openstack-horizon. ajaeger has pointed out to me that this isn't the biggest patch we've had for translations. This one, for example, is 5 million lines: https://review.openstack.org/#/c/67521/ This doesn't entirely satisfy me because git clone https://review.openstack.org/openstack/openstack-manuals is taking several minutes for me this morning (full disclosure: I am having some network flakiness today). I'd hate for Horizon to take anywhere near that long to clone.

Also in #openstack-horizon jpich has shared that we intend to prune some of these languages out of Horizon as we approach the release. Maybe it would be best to begin that pruning now, to avoid late dev cycle churn?

Does anyone else share my concern about the large size of these translations? Particularly the translations that have near 0 translated strings.

Doug Fish
On 08/05/14 16:46, Douglas Fish wrote:
I've come across this review for automated translations coming into Horizon https://review.openstack.org/#/c/91523/ It concerns me that it has over a half million LOC in it, and many of these languages have near 0 translation. (I checked here: https://www.transifex.com/projects/p/horizon/). 0 translation languages have just as many lines as full translations; they are just shorter.
I can't give you a specific well-defined issue about why this concerns me, but I'm worried about things like how big it might make our distributions, or adding time to how long it takes to clone a new Horizon instance, or use devstack.
I've chatted a bit about this in #openstack-horizon. ajaeger has pointed out to me that this isn't the biggest patch we've had for translations. This one, for example, is 5 million lines: https://review.openstack.org/#/c/67521/ This doesn't entirely satisfy me because git clone https://review.openstack.org/openstack/openstack-manuals is taking several minutes for me this morning (full disclosure: I am having some network flakiness today). I'd hate for Horizon to take anywhere near that long to clone.
Also in #openstack-horizon jpich has shared that we intend to prune some of these languages out of Horizon as we approach the release. Maybe it would be best to begin that pruning now, to avoid late dev cycle churn?
One of the suggestions that came up on IRC was to maybe only pull translations for languages that are at least 50% complete, or some other arbitrary number (not only for Horizon but for all projects).

We've been careful about which languages we add the po files for in Horizon, so it does seem strange to suddenly add back everything, especially if we will remove them again at release time.

I wonder if there might be a way to only update languages for which we already have the po files in the repo, rather than pull all po files? Or if that is desirable at all.

I guess I'd also like to understand what would be the preferred way to proceed, from the translation team's perspective :) Is there any use in pulling in languages that only have a few unreviewed translated words? Then we can try to figure out the technical concerns around the string churn and repo size.

Thank you,

Julie
Does anyone else share my concern about the large size of these translations? Particularly the translations that have near 0 translated strings.
Doug Fish
_______________________________________________ Openstack-i18n mailing list Openstack-i18n@lists.openstack.org http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-i18n
On 05/08/2014 05:59 PM, Julie Pichon wrote:
On 08/05/14 16:46, Douglas Fish wrote:
I've come across this review for automated translations coming into Horizon https://review.openstack.org/#/c/91523/ It concerns me that it has over a half million LOC in it, and many of these languages have near 0 translation. (I checked here: https://www.transifex.com/projects/p/horizon/). 0 translation languages have just as many lines as full translations; they are just shorter.
I can't give you a specific well-defined issue about why this concerns me, but I'm worried about things like how big it might make our distributions, or adding time to how long it takes to clone a new Horizon instance, or use devstack.
I've chatted a bit about this in #openstack-horizon. ajaeger has pointed out to me that this isn't the biggest patch we've had for translations. This one, for example, is 5 million lines: https://review.openstack.org/#/c/67521/ This doesn't entirely satisfy me because git clone https://review.openstack.org/openstack/openstack-manuals is taking several minutes for me this morning (full disclosure: I am having some network flakiness today). I'd hate for Horizon to take anywhere near that long to clone.
Also in #openstack-horizon jpich has shared that we intend to prune some of these languages out of Horizon as we approach the release. Maybe it would be best to begin that pruning now, to avoid late dev cycle churn?
One of the suggestions that came up on IRC was to maybe only pull translations for languages that are at least 50% complete, or some other arbitrary number (not only for Horizon but for all projects).
We've been careful about which languages we add the po files for in Horizon, so it does seem strange to suddenly add back everything, especially if we will remove them again at release time.
None of the other projects removes the files. With horizon, you can just change "settings.py" to only enable those languages that are translated properly.
I wonder if there might be a way to only update languages for which we already have the po files in the repo, rather than pull all po files? Or if that is desirable at all.
Yes, that is possible as well but then we would need to regularly check which languages should go in and which not. This is IMO too much manual work.
I guess I'd also like to understand what would be the preferred way to proceed, from the translation team's perspective :) Is there any use in pulling in languages that only have a few unreviewed translated words? Then we can try to figure out the technical concerns around the string churn and repo size.
Andreas -- Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg) GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
On 08/05/14 17:08, Andreas Jaeger wrote:
On 05/08/2014 05:59 PM, Julie Pichon wrote:
We've been careful about which languages we add the po files for in Horizon, so it does seem strange to suddenly add back everything, especially if we will remove them again at release time.
None of the other projects removes the files. With horizon, you can just change "settings.py" to only enable those languages that are translated properly.
I thought so too initially, but the language setting is only used if the user explicitly sets one after logging in. Horizon first uses the browser locale to determine which language to display. So if you are an Italian user, if the MO files for Italian are present, Horizon will use them instead of English, even though the translation is only 16% complete and has not been reviewed yet according to Transifex. This is not an ideal experience.
I wonder if there might be a way to only update languages for which we already have the po files in the repo, rather than pull all po files? Or if that is desirable at all.
Yes, that is possible as well but then we would need to regularly check which languages should go in and which not. This is IMO too much manual work.
This is what has been going on so far though (in Horizon): Daisy checks with the language coordinators for languages that are 100% complete or close enough to see if they are happy with the quality, and if the language should be included in the release.

The translation reviews and quality checks need to happen somewhere. Developers don't have the skills to review a huge patch update in multiple languages, however Transifex seems to let you review the translations. I'm not sure that it makes sense to ignore the translation status and quality review percentages, and just pull everything into the repository, even if it was a half-hearted effort abandoned early. Translators want to show off their best work too.

I'm expressing my concerns but in the end I defer entirely to the translation/i18n team. The process is set up for you and to enable you all to work, and if I'm missing the point I'm happy to be enlightened and leave it at that :)

Thanks,

Julie
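To make the browser-locale fallback described in this message concrete, here is a minimal Python sketch. It is a simplified model of the behaviour, not Horizon's or Django's actual code: the names `parse_accept_language`, `pick_language` and `shipped_catalogs` are hypothetical, and the Accept-Language parsing is deliberately reduced to the basics.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first.

    Simplified: only the q parameter is understood, and tags with q=0
    (explicitly refused by the browser) are dropped.
    """
    weighted = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by original position.
            weighted.append((-quality, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(weighted)]


def pick_language(header, shipped_catalogs, default="en"):
    """Pick the first requested language for which a catalog is shipped.

    shipped_catalogs is an assumed set of language codes whose compiled
    MO files are present, regardless of how complete they are.
    """
    for tag in parse_accept_language(header):
        base = tag.split("-")[0]
        for candidate in (tag, base):
            if candidate in shipped_catalogs:
                return candidate
    return default
```

In this model an Italian browser gets `it` as soon as an Italian catalog is shipped, however incomplete it is, and falls back to English only when the catalog is absent. That is why the presence of the files, rather than a settings entry, decides what the user sees.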
On 05/08/2014 06:41 PM, Julie Pichon wrote:
On 08/05/14 17:08, Andreas Jaeger wrote:
On 05/08/2014 05:59 PM, Julie Pichon wrote:
We've been careful about which languages we add the po files for in Horizon, so it does seem strange to suddenly add back everything, especially if we will remove them again at release time.
None of the other projects removes the files. With horizon, you can just change "settings.py" to only enable those languages that are translated properly.
I thought so too initially, but the language setting is only used if the user explicitly sets one after logging in. Horizon first uses the browser locale to determine which language to display. So if you are an Italian user, if the MO files for Italian are present, Horizon will use them instead of English, even though the translation is only 16% complete and has not been reviewed yet according to Transifex. This is not an ideal experience.
Ah, good to know - wasn't aware of it.
I wonder if there might be a way to only update languages for which we already have the po files in the repo, rather than pull all po files? Or if that is desirable at all.
Yes, that is possible as well but then we would need to regularly check which languages should go in and which not. This is IMO too much manual work.
This is what has been going on so far though (in Horizon): Daisy checks with the language coordinators for languages that are 100% complete or close enough to see if they are happy with the quality, and if the language should be included in the release.
The translation reviews and quality checks need to happen somewhere. Developers don't have the skills to review a huge patch update in multiple languages, however Transifex seems to let you review the translations. I'm not sure that it makes sense to ignore the translation
Note that only a few translation teams do the review in transifex.
status and quality review percentages, and just pull everything into the repository, even if it was a half-hearted effort abandoned early. Translators want to show off their best work too.
I'm expressing my concerns but in the end I defer entirely to the translation/i18n team. The process is set up for you and to enable you all to work, and if I'm missing the point I'm happy to be enlightened and leave it at that :)
I'm also new to some of this - and helped with the German horizon translation and fixing the infrastructure to run again. I'm happy to change it so that it meets all our needs. My main lesson learned is that we need to make it work with as little manual interaction as possible...

Andreas
The one use case that comes to mind where untranslated files might be nice is if you translate the files locally and check out.

BUT this checkout can also be done with the transifex client into a fresh directory.

Do you have any use cases where the untranslated or partially translated files would be beneficial?

Btw. just have a look at the unmerged translation proposal patches to see how large these patches are - and how many untranslated files are in there:

https://review.openstack.org/#/q/status:open++branch:master+topic:transifex/...

Andreas
On 05/08/2014 07:05 PM, Andreas Jaeger wrote:
The one use case that comes to mind where untranslated files might be nice is if you translate the files locally and check out.
BUT this checkout can be done also with the transifex client into a fresh directory.
Do you have any use cases where the untranslated or partially translated files would be beneficial?
Btw. just have a look at the unmerged translation proposal patches to see how large these patches are - and how many untranslated files are in there:
https://review.openstack.org/#/q/status:open++branch:master+topic:transifex/...
FYI, patches for all projects to remove "mostly untranslated files" are merged now.

Andreas
On 05/08/2014 05:46 PM, Douglas Fish wrote:
I've come across this review for automated translations coming into Horizon https://review.openstack.org/#/c/91523/ It concerns me that it has over a half million LOC in it, and many of these languages have near 0 translation. (I checked here: https://www.transifex.com/projects/p/horizon/). 0 translation languages have just as many lines as full translations; they are just shorter.
I can't give you a specific well-defined issue about why this concerns me, but I'm worried about things like how big it might make our distributions, or adding time to how long it takes to clone a new Horizon instance, or use devstack.
I've chatted a bit about this in #openstack-horizon. ajaeger has pointed out to me that this isn't the biggest patch we've had for translations. This one, for example, is 5 million lines: https://review.openstack.org/#/c/67521/ This doesn't entirely satisfy me because git clone https://review.openstack.org/openstack/openstack-manuals is taking several minutes for me this morning (full disclosure: I am having some network flakiness today). I'd hate for Horizon to take anywhere near that long to clone.
Also in #openstack-horizon jpich has shared that we intend to prune some of these languages out of Horizon as we approach the release. Maybe it would be best to begin that pruning now, to avoid late dev cycle churn?
Does anyone else share my concern about the large size of these translations? Particularly the translations that have near 0 translated strings.
I was wondering as well whether it makes sense to download and track completely untranslated files.

What about downloading only files that are at least 50 % translated? I can enhance the scripts for all projects if that is consensus moving forward.

Andreas
Hmm, I was under the impression that we have decided not to include translation below a certain threshold in Horizon for Havana, so it may be wise to download only those above that. If only transifex would allow to download translations without login we could include that in run_tests or tox, for those who want to have complete ones... Best regards, -- Łukasz [DeeJay1] Jernas On Thu, May 8, 2014 at 6:01 PM, Andreas Jaeger <aj@suse.com> wrote:
On 05/08/2014 05:46 PM, Douglas Fish wrote:
I've come across this review for automated translations coming into Horizon https://review.openstack.org/#/c/91523/ It concerns me that it has over a half million LOC in it, and many of these languages have near 0 translation. (I checked here: https://www.transifex.com/projects/p/horizon/). 0 translation languages have just as many lines as full translations; they are just shorter.
I can't give you a specific well-defined issue about why this concerns me, but I'm worried about things like how big it might make our distributions, or adding time to how long it takes to clone a new Horizon instance, or use devstack.
I've chatted a bit about this in #openstack-horizon. ajaeger has pointed out to me that this isn't the biggest patch we've had for translations. This one, for example, is 5 million lines: https://review.openstack.org/#/c/67521/ This doesn't entirely satisfy me because git clone https://review.openstack.org/openstack/openstack-manuals is taking several minutes for me this morning (full disclosure: I am having some network flakiness today). I'd hate for Horizon to take anywhere near that long to clone.
Also in #openstack-horizon jpich has shared that we intend to prune some of these languages out of Horizon as we approach the release. Maybe it would be best to begin that pruning now, to avoid late dev cycle churn?
Does anyone else share my concern about the large size of these translations? Particularly the translations that have near 0 translated strings.
I was wondering as well whether it makes sense to download and track completely untranslated files.
What about downloading only files that are at least 50 % translated? I can enhance the scripts for all projects if that is consensus moving forward,
Andreas
On 05/08/2014 07:22 PM, Łukasz Jernaś wrote:
Hmm, I was under the impression that we have decided not to include translation below a certain threshold in Horizon for Havana, so it may be wise to download only those above that.
I can change our jobs to download only files that are at least X % translated. My proposal would be for X=50. IMO we should keep our translation setup across projects the same, so I suggest making this change not only for horizon but also for all other projects - and removing files that are currently below the threshold.
If only transifex would allow to download translations without login we could include that in run_tests or tox, for those who want to have complete ones...
But those can also create an account...

Andreas
Yes, downloading files that are at least 50% translated would alleviate my concern.

FWIW I had hoped this might become the process, but I was thinking more along the lines of 75% - it turns out for horizon both 50% and 75% would choose the same languages anyway!

Would this determination be on a per file basis? Is there any concern that projects could pick up different translations? (that is, en_US_mn might be 80% translated in Horizon, but not translated at all in other languages). I don't think it's a problem - just making an observation.

Doug Fish

From: Andreas Jaeger <aj@suse.com>
To: Łukasz Jernaś <deejay1@srem.org>,
Cc: Douglas Fish/Rochester/IBM@IBMUS, "openstack-i18n@lists.openstack.org" <openstack-i18n@lists.openstack.org>, Clark Boylan <clark.boylan@gmail.com>
Date: 05/08/2014 12:32 PM
Subject: Re: [Openstack-i18n] Complete translations are big

On 05/08/2014 07:22 PM, Łukasz Jernaś wrote:
Hmm, I was under the impression that we have decided not to include translation below a certain threshold in Horizon for Havana, so it may be wise to download only those above that.
I can change our jobs to download only files that are at least X % translated. My proposal would be for X=50. IMO we should keep our translation setup across projects the same, so I suggest making this change not only for horizon but also for all other projects - and removing files that are currently below the threshold.
If only transifex would allow to download translations without login we could include that in run_tests or tox, for those who want to have complete ones...
But those can also create an account...

Andreas
On 05/08/2014 08:56 PM, Douglas Fish wrote:
Yes, downloading files that are at least 50% translated would alleviate my concern.
FWIW I had hoped this might become the process, but I was thinking more along the lines of 75% - it turns out for horizon both 50% and 75% would choose the same languages anyway!
Would be fine for me as well.
Would this determination be on a per file basis? Is there any concern that projects could pick up different translations? (that is, en_US_mn might be 80% translated in Horizon, but not translated at all in other languages). I don't think it's a problem - just making an observation.
Yes, it would be on a per file basis. We currently have no language that translates all projects completely, so this wouldn't be a change.

Andreas
Hi,

+1 for Douglas's proposal. The limit should be closer to 75% - 80% than 50%.

Best regards,

François Bureau
E-mail : francois.bureau@cloudwatt.com
Mobile : 06 40 76 18 01
http://www.cloudwatt.com

-----Original Message-----
From: Andreas Jaeger [mailto:aj@suse.com]
Sent: Thursday, May 8, 2014 21:08
To: Douglas Fish
Cc: Clark Boylan; openstack-i18n@lists.openstack.org
Subject: Re: [Openstack-i18n] Complete translations are big

On 05/08/2014 08:56 PM, Douglas Fish wrote:
Yes, downloading files that are at least 50% translated would alleviate my concern.
FWIW I had hoped this might become the process, but I was thinking more along the lines of 75% - it turns out for horizon both 50% and 75% would choose the same languages anyway!
Would be fine for me as well.
Would this determination be on a per file basis? Is there any concern that projects could pick up different translations? (that is, en_US_mn might be 80% translated in Horizon, but not translated at all in other languages). I don't think it's a problem - just making an observation.
Yes, it would be on a per file basis. We currently have no language that translates all projects completely, so this wouldn't be a change.

Andreas
So, here's what I heard so far as suggestions:

1) Our translation jobs will only download files that are at least 75 % translated.
   Patch at: https://review.openstack.org/92997

2) Projects can delete (and I'm going to propose patches) files that are currently less than 75 % translated.
   To track this, I've filed a bug: https://bugs.launchpad.net/horizon/+bug/1317794

Andreas
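The per-file 75 % rule summarised above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the script used by the translation jobs: `translated_percent` and `files_to_keep` are hypothetical names, the PO parsing is hand-rolled and simplified (plural forms are counted as untranslated, and fuzzy flags are ignored), and the catalogs are passed in as plain strings rather than read from a repository.

```python
import re

# A line such as: msgid "text"   or: msgstr "text"
_ENTRY_START = re.compile(r'^(msgid|msgstr)\s+"(.*)"\s*$')
# A continuation line such as: "more text"
_CONTINUATION = re.compile(r'^"(.*)"\s*$')


def translated_percent(po_text):
    """Return the percentage of non-header entries with a non-empty msgstr."""
    entries = []   # each entry is [msgid, msgstr]
    field = None   # index of the field that continuation lines extend
    for raw in po_text.splitlines():
        line = raw.strip()
        start = _ENTRY_START.match(line)
        if start:
            keyword, text = start.groups()
            if keyword == "msgid":
                entries.append(["", ""])
                field = 0
            elif entries:
                field = 1
            else:
                field = None
                continue
            entries[-1][field] += text
            continue
        cont = _CONTINUATION.match(line)
        if cont and field is not None:
            entries[-1][field] += cont.group(1)
        else:
            # Blank lines, comments and unsupported keywords end the field.
            field = None
    # The header entry has an empty msgid and is not a real message.
    real = [(msgid, msgstr) for msgid, msgstr in entries if msgid]
    if not real:
        return 0.0
    done = sum(1 for _, msgstr in real if msgstr)
    return 100.0 * done / len(real)


def files_to_keep(catalogs, threshold=75.0):
    """Return the names of catalogs at or above the threshold, sorted.

    catalogs maps a file name to the text of that PO file; each file is
    judged on its own, so projects may end up with different languages.
    """
    return sorted(
        name for name, text in catalogs.items()
        if translated_percent(text) >= threshold
    )
```

Because the decision is made per file, a language can pass the threshold in one project and be dropped in another, which matches the per-file answer given earlier in the thread.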
On 05/09/2014 10:25 AM, Andreas Jaeger wrote:
So, here's what I heard so far as suggestion:
1) Our translation jobs will only download files that have at least 75 % translated
Patch at: https://review.openstack.org/92997
The patch is merged.
2) Projects can delete (and I'm going to propose patches) files that currently have less than 75 % translated
To track this, I've filed a bug: https://bugs.launchpad.net/horizon/+bug/1317794
I've sent a first few patches for these - for openstack-manuals, operations-guide and keystone.

Note that removing the non-translated files will leave some projects without any po files for now.

Andreas
Douglas Fish <drfish@us.ibm.com> writes:
This doesn't entirely satisfy me because git clone https://review.openstack.org/openstack/openstack-manuals is taking several minutes for me this morning (full disclosure: I am having some network flakiness today). I'd hate for Horizon to take anywhere near that long to clone.
This seems like a strange comparison to make. I would be very hesitant to jump to the conclusion that translated strings were responsible for the bulk of the transfer time on a repository that has historically had massive binary files (including fonts and entire rendered PDFs) checked into it. -Jim
participants (6): Andreas Jaeger, Douglas Fish, François Bureau, jeblair@openstack.org, Julie Pichon, Łukasz Jernaś