The problem
We released the Polish version of Ututi a week ago, and that taught me a couple of lessons:
- I18n support in Pylons is lacking
- You must have tests for your translations just as you test your code
There are some flaws in Pylons I18n that got me longing for Zope3 I18n.
- There is no ‘default’ translation. When I used Zope3 I was able to say
  _('ok-button-text', default='OK')
  With Pylons I have to have an English to English translation, which means that translators cannot see the default text, which leads to mistakes like 'ok-mygtuko-tekstas' instead of 'Gerai' or 'OK'.
- Using Python formatting directives leads to tracebacks if there are bugs in the translations. If someone translates 'Hi %(fullname)s!' into 'Labas %(fullname)', all the pages that try showing this message will end up as error pages, because of the missing 's'.
- Mako templates are very, very, very translation unfriendly. Simple translatable texts look like:
  ${_('Hi!')}
  and more complex texts in our templates end up looking like:
  ${_('A new file %(link_to_file)s was uploaded for the subject %(subject_title)s') % dict(
      link_to_file=h.link_to(c.filename, c.file_url),
      subject_title=c.subject_title)}
  It looks like Perl already, and I am not even talking about what happens when you have something like an email, with multiple lines of text and multiple embedded links, to translate. Though the Zope’ish
  <span i18n:translate="">A new file
    <a tal:attributes="href view/file_url" i18n:name="link_to_file" tal:content="view/filename" />
    was uploaded for subject
    <tal:block i18n:name="subject_title" tal:content="view/subject_title" />
  </span>
  is just as ugly for short pieces of text, I would surely prefer it for something that is more than 2 lines of text. An Emacs macro that wraps any selected text in ${_('<text here>')} helps to reduce the strain, but I’d prefer dedicated markers for translatable text, ones that would be easier on the Shift-pressing hand than what we have now.
- Babel sometimes has problems extracting some strings from Mako templates. The workaround:
  - run tests (yay, almost 100% coverage),
  - copy the template cache into your ‘src’,
  - extract the translations,
  - remove the copy,
  - and then remove all the fuzzy markers from plural strings that are marked as
    #, python-format
    by Babel, which, as you can guess, is all the plural strings. We don’t want to guess the position of the hours in a sentence like 'uploaded %(hours)s ago'.
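The second flaw above is easy to reproduce outside of Pylons. The following is only a minimal sketch, not something taken from Ututi, Babel or pofilter: the helper names and the regular expression are made up for illustration, and it assumes translations only use simple named directives like %(name)s. It shows the error that the broken translation causes, and a crude way to compare the directives of a message and its translation.

```python
import re

# Matches complete named directives such as %(fullname)s or %(count)d.
# Assumption: no flags, widths or precisions are used in the directives.
NAMED_DIRECTIVE = re.compile(r"%\((\w+)\)([a-zA-Z])")


def placeholders(text):
    """Return the set of (name, conversion) pairs found in text."""
    return set(NAMED_DIRECTIVE.findall(text))


def translation_is_safe(msgid, msgstr):
    """True if msgstr uses exactly the same named directives as msgid."""
    return placeholders(msgid) == placeholders(msgstr)


msgid = 'Hi %(fullname)s!'
broken = 'Labas %(fullname)'     # the conversion character is missing
fixed = 'Labas, %(fullname)s!'   # hypothetical corrected translation

# Formatting the broken translation raises instead of rendering the page.
try:
    broken % {'fullname': 'Jonas'}
    broken_raises = False
except ValueError:
    broken_raises = True
```

A check like translation_is_safe(msgid, broken) returns False here, which is the kind of result a translation test should turn into a failure before the string ever reaches a template.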
The solution
Now that we’re done explaining what’s wrong, let’s talk about something more constructive: making sure that typos in translations do not cause tracebacks. First, we need a nice translation tester. Candidates:
- potest
- gettext-lint
- pofilter from translate-toolkit
potest
Last commit 78 weeks ago. Can’t parse plural forms. Verdict — unusable.
gettext-lint
Seems to be written in Python, but the packaging uses autoconf (!?), which generates a Makefile that does nothing. It seems cumbersome to use, can’t check HTML tags (some translators get the idea of translating <strong> into the target language), and does not handle the %(foo)s syntax.
pofilter
The tarball has all the parts that are needed to package this tool as an egg, but it is not easy_installable. So what do we do? The same thing we do every night: extract, then run
python setup.py sdist
scp dist/translate-toolkit.tar.gz pow.lt:~/www/eggs/
Now we just make a new virtualenv and easy_install translate-toolkit in it:
virtualenv translations
cd translations
bin/easy_install translate-toolkit --find-links=http://pow.lt/eggs
And test it on one of the projects in my src that has a lot of translations — SchoolTool:
bin/pofilter ../trunk/schooltool/src/schooltool/locales \
-o ./out -t printf -t xmltags -t variables --openoffice
(I pass the --openoffice parameter, so that it would recognize
Zope3 translation markers, like ${calendar_title}, as variables)
This results in a bunch of PO files in ./out, each file containing the errors for the corresponding translation file:
# (pofilter) variables: do not translate: ${event_title}
#: /src/schooltool/app/browser/templates/recevent_delete.pt:4
#: /src/schooltool/app/browser/templates/recevent_delete.pt:14
msgid "Deleting a repeating event (${event_title})"
msgstr "Šalinamas pasikartojantis įvykis (${event title}"
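To make it clearer what the report above is complaining about, here is a rough sketch of a variables check for Zope3-style ${name} markers. It is not how pofilter implements its check; the function name and the regular expression are assumptions made purely for illustration.

```python
import re

# Zope3-style markers look like ${name}.
# Assumption: marker names never contain a closing brace.
MARKER = re.compile(r"\$\{([^}]*)\}")


def changed_variables(msgid, msgstr):
    """Return (missing, unexpected) marker names, each as a sorted list."""
    original = set(MARKER.findall(msgid))
    translated = set(MARKER.findall(msgstr))
    return sorted(original - translated), sorted(translated - original)


# The message pair from the pofilter report above.
missing, unexpected = changed_variables(
    'Deleting a repeating event (${event_title})',
    'Šalinamas pasikartojantis įvykis (${event title}',
)
```

For this pair, the underscore that the translator replaced with a space makes event_title show up as missing and "event title" as unexpected.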
Pretty cool, eh? pofilter has most of the functions I need; I just have to integrate it into my sandbox and extend it a little bit. So first I add:
[test_translations]
find-links = http://pow.lt/eggs/
recipe = zc.recipe.egg
eggs = translate-toolkit
       lxml
entry-points = pofilter=translate.filters.pofilter:main
to my buildout.cfg.
(I added an entry point to the [test_translations] section, because all the translate-toolkit scripts seem to be defined as plain scripts and not registered as console_scripts entry points.)
Customizing pofilter is slightly difficult. I could not find any defined hooks that would allow me to customize the functionality, and “xmltags” seems to be picking up all the translated “title” attributes on links, which is annoying. So after reporting this as a bug, I just found the “main” function in translate.filters.pofilter, copied it and produced this:
from translate.filters.pofilter import cmdlineparser
from translate.filters.checks import StandardChecker
from translate.filters.checks import CheckerConfig

# Do not warn about translated "title" attributes on links.
ututiconfig = CheckerConfig(
    canchangetags=[("a", "title", None)]
)

class UtutiChecker(StandardChecker):

    def __init__(self, **kwargs):
        # Merge the Ututi settings into whatever config was passed in.
        checkerconfig = kwargs.get("checkerconfig", None)
        if checkerconfig is None:
            checkerconfig = CheckerConfig()
            kwargs["checkerconfig"] = checkerconfig
        checkerconfig.update(ututiconfig)
        StandardChecker.__init__(self, **kwargs)

def main():
    parser = cmdlineparser()
    # --ututi selects UtutiChecker as the filter class.
    parser.add_option("", "--ututi", dest="filterclass",
                      action="store_const", default=None, const=UtutiChecker,
                      help="use the standard checks for Ututi translations")
    parser.run()
Then I registered this new function as an entry point instead of the old one:
[test_translations]
find-links = http://pow.lt/eggs/
recipe = zc.recipe.egg
eggs = ututi
entry-points = pofilter=ututi.tests.translations:main
Now if I pass --ututi to pofilter, it will not raise warnings for title attributes anymore.
Icing on the cake
Tests are pretty useless if they are not run, and we want to run our tests after every modification to the code, and after every commit to our git server. As I am using make as my tool to run everything, I just added these two targets to the Makefile.
.PHONY: test_translations
test_translations: bin/pofilter
	bin/pofilter --progress=none -t xmltags -t printf --ututi src/ututi/i18n/ -o parts/test_translations/
	diff -r -u src/ututi/tests/expected_i18n_errors/ parts/test_translations/

.PHONY: update_expected_translations
update_expected_translations: bin/pofilter
	bin/pofilter --progress=none -t xmltags -t printf --ututi src/ututi/i18n/ -o parts/test_translations/
	rm -rf src/ututi/tests/expected_i18n_errors/
	mv parts/test_translations/ src/ututi/tests/expected_i18n_errors/
Even after these changes pofilter is still reporting 3-4 false positives that I will have to resolve with our translators, so instead of expecting absolutely no output, I am just asking for the output to be identical to the saved, expected one. If it is a known/accepted failure, we let it be.
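The same "compare against the expected output" idea can be sketched in Python for anyone who does not run things through make. This is only an illustration of the technique, not what Ututi uses: the function names are made up, and the demo at the bottom works on temporary directories with a hypothetical pl.po file instead of the real paths.

```python
import difflib
import os
import tempfile


def read_tree(root):
    """Map each file's path relative to root to its list of lines."""
    tree = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, encoding='utf-8') as f:
                tree[os.path.relpath(path, root)] = f.readlines()
    return tree


def tree_diff(expected_dir, actual_dir):
    """Return unified diff lines; an empty list means the trees match."""
    expected, actual = read_tree(expected_dir), read_tree(actual_dir)
    lines = []
    for rel in sorted(set(expected) | set(actual)):
        lines.extend(difflib.unified_diff(
            expected.get(rel, []), actual.get(rel, []),
            fromfile=os.path.join(expected_dir, rel),
            tofile=os.path.join(actual_dir, rel)))
    return lines


# Demo on throwaway directories (file name and contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    expected_dir = os.path.join(tmp, 'expected')
    actual_dir = os.path.join(tmp, 'actual')
    for directory in (expected_dir, actual_dir):
        os.mkdir(directory)
        with open(os.path.join(directory, 'pl.po'), 'w', encoding='utf-8') as f:
            f.write('# known false positive\n')
    same = tree_diff(expected_dir, actual_dir)

    # A new error appears in the fresh output only.
    with open(os.path.join(actual_dir, 'pl.po'), 'a', encoding='utf-8') as f:
        f.write('# new error\n')
    changed = tree_diff(expected_dir, actual_dir)
```

An empty result means only the known failures are present; anything else is either a new problem or a reason to refresh the expected files.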
And of course, I made our dear Hudson run this after every commit, for when I forget to do it myself.
Fin!


