Talk:Pootle
Preliminary
The goal is to integrate Pootle with CVS and an automatic converter between XML files and PO at contribs.org :) Bug: http://bugs.contribs.org/show_bug.cgi?id=3782
Once we have everything worked out how to interact with pootle and the formmagick stuff then I'll get something up on contribs.org that everyone can use. It would be really nice if we could automate the extraction/import of files that need to be translated into pootle but first things first. — Slords (talk • contribs). 18:23, 24 January 2008 (MST)
Test pootle site: http://www.unixlan.com.ar:8888
Formmagick
Normando suggested a few tools, I'm using XML2PO, see the others in the history http://wiki.contribs.org/index.php?title=Talk:Pootle&oldid=7649#I_need_your_help
XML2PO
( http://linux.die.net/man/1/xml2po )
I have packaged it for easier installation. You can download it from
http://mirror.contribs.org/smeserver/contribs/nhall/sme7/contribs/pootle/rpm/gnome-doc-utils-0.12.0-1.noarch.rpm
Before trying it, you must edit a few lines in:
/usr/share/xml2po/empty.py
- Line 27: change "return []" to "return ['base']"
- Line 31: change "return []" to "return ['trans']"
- Line 35: leave "return []"
- Line 39: change "return []" to "return ['trans']"
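The four edits above could also be applied with sed. This is only a sketch: the function name patch_empty_mode is made up for illustration, it assumes GNU sed (for -i), and it assumes the line numbers are exactly those of the packaged gnome-doc-utils 0.12.0 mentioned above. Line 35 is deliberately left alone.

```shell
#!/bin/bash
# Hypothetical helper: apply the edits listed above to xml2po's "empty" mode.
# Assumption: lines 27, 31 and 39 each contain "return []", as in the
# packaged gnome-doc-utils 0.12.0. Check the file first on other versions.
patch_empty_mode() {
    local file="${1:-/usr/share/xml2po/empty.py}"

    # Keep a copy of the unpatched file next to it.
    cp "$file" "$file.orig"

    # Each expression is restricted to one line number, so line 35
    # (which must stay "return []") is not touched.
    sed -i \
        -e "27s/return \[\]/return ['base']/" \
        -e "31s/return \[\]/return ['trans']/" \
        -e "39s/return \[\]/return ['trans']/" \
        "$file"
}
```

Calling patch_empty_mode with no argument edits /usr/share/xml2po/empty.py in place, so it needs to be run as root; passing a path lets you try it on a copy first.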
Create .po and export xml
To test the lexicons
xml2po -m empty -e -o backup.po backup
View the new backup.po file in the new PO format. Excellent. Now you can translate the PO file with Pootle, and convert it back to a formmagick panel lexicon with this command:
xml2po -p backup.po backup > backup.new
As you can see, if you do not translate backup.po, the backup.new file is identical to the original backup file, with one difference: backup.new has an extra line at the top of the file. Do we have to remove this line?
<?xml version="1.0" encoding="utf-8"?>
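If that line does turn out to need removing, one cautious way is to drop the first line only when it really is an XML declaration, rather than deleting line 1 unconditionally. A minimal sketch; the function name strip_xml_declaration is an assumption made for this example:

```shell
#!/bin/bash
# Hypothetical helper: print a file without its leading XML declaration.
# The first line is deleted only if it looks like <?xml ... ?>, so a file
# that never had the declaration is passed through unchanged.
strip_xml_declaration() {
    sed '1{/^<?xml .*?>[[:space:]]*$/d;}' "$1"
}
```

For example, strip_xml_declaration backup.new > backup.clean writes the cleaned lexicon to a new file and leaves backup.new as it was.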
bugs
xml2po ignores CDATA sections, so their content does not appear in the PO file.
The workaround is to find and replace the problem code
this needs work, when creating the first po files
- br maybe shouldn't be used testing
newpo
#!/bin/bash
#
function usage {
    echo ""
    echo "Create po"
    echo ""
    echo "Not enough parameters provided."
    echo "Usage: $0 LexiconFilename"
    echo ""
    echo "Optional: tail end of file"
    echo "Usage: $0 filename check"
    echo ""
}

# check for required parameters
if [ ${#1} -gt 0 ]
then
    # work on the lexicon in place, keeping a backup to restore afterwards
    cp "$1" "$1.bak"

    #echo "Remove CDATA in $1"
    perl -pi -e 's/<!\[CDATA\[/STARTCDATA/g' "$1"
    perl -pi -e 's/\]\]>/ENDCDATA/g' "$1"
    perl -pi -e 's/\&/AMP/g' "$1"
    # lowercase closing P> and A> so the tags are consistent
    perl -pi -e 's/P\>/p\>/g' "$1"
    perl -pi -e 's/A\>/a\>/g' "$1"

    #echo "Create $1.po"
    xml2po -m empty -e -o "$1.po" "$1"

    #echo "Replacing CDATA in $1"
    perl -pi -e 's/STARTCDATA/<!\[CDATA\[/g' "$1.po"
    perl -pi -e 's/ENDCDATA/\]\]>/g' "$1.po"
    perl -pi -e 's/AMP/\&/g' "$1.po"

    # restore the original lexicon
    mv "$1.bak" "$1"

    if [ ${#2} -gt 0 ]
    then
        #echo "#tail $1.po"
        tail "$1.po"
    fi
else
    # print usage information
    usage
fi
newxml
#!/bin/bash
#
function usage {
    echo ""
    echo "Create .xml from .po"
    echo ""
    echo "Not enough parameters provided."
    echo "Usage: $0 filename (dont add .po)"
    echo ""
    echo "Optional: compare against original"
    echo "Usage: $0 filename check"
    echo ""
}

# check for required parameters
if [ ${#1} -gt 0 ]
then
    #echo "Remove CDATA in $1.po"
    cp "$1.po" "$1.bak"
    perl -pi -e 's/<!\[CDATA\[/STARTCDATA/g' "$1.bak"
    perl -pi -e 's/\]\]>/ENDCDATA/g' "$1.bak"
    perl -pi -e 's/\&/AMP/g' "$1.bak"

    #echo "Create xml"
    xml2po -p "$1.bak" "$1" > "$1.xml"

    #echo "Replacing CDATA"
    perl -pi -e 's/STARTCDATA/<!\[CDATA\[/g' "$1.xml"
    perl -pi -e 's/ENDCDATA/\]\]>/g' "$1.xml"
    perl -pi -e 's/AMP/\&/g' "$1.xml"

    # this is added at line 367 of xml2po, it needs to be removed or better not added
    #perl -pi -e 's/\<\?xml version="1.0" encoding="utf-8"\?\>//' "$1.xml"

    # remove first line (keeps the previous version as $1.xml.old)
    perl -i.old -ne 'print unless 1 .. 1' "$1.xml"

    #rm "$1.bak" "$1.xml.old"

    if [ ${#2} -gt 0 ]
    then
        #echo "#diff -n $1 $1.xml"
        diff -n "$1" "$1.xml"
    fi
else
    # print usage information
    usage
fi
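One weakness of the placeholder trick in both scripts is that a plain token such as AMP can also occur in ordinary text (for example inside "LAMP"), and would then be turned into "&" on the way back. A sketch of the same idea with less collision-prone tokens follows. The function names and the @@...@@ tokens are assumptions made for this example, and whether xml2po passes such tokens through untouched has not been checked here.

```shell
#!/bin/bash
# Hypothetical helpers: hide CDATA markers and ampersands behind tokens
# that are unlikely to appear in real lexicon text, then restore them.
# Both read the file given as $1 and write the result to stdout.

protect_markup() {
    sed -e 's/<!\[CDATA\[/@@CDATA_START@@/g' \
        -e 's/]]>/@@CDATA_END@@/g' \
        -e 's/&/@@AMP@@/g' \
        "$1"
}

restore_markup() {
    # In a sed replacement "&" means "the matched text", so it is escaped.
    sed -e 's/@@CDATA_START@@/<![CDATA[/g' \
        -e 's/@@CDATA_END@@/]]>/g' \
        -e 's/@@AMP@@/\&/g' \
        "$1"
}
```

In newpo, for instance, protect_markup "$1" > "$1.tmp" would replace the first three perl lines, and restore_markup would be run on the generated .po file.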
yum
- newxml needs a parameter to set: lang="fr"
- why doesn't this work ?
- xml2po -l fr -p yum.po yum > yum.xml
- -l --language=LANG Set language of the translation to LANG
- you can't have a base equal to a trans; this needs a new bug filed against the yum lexicon (if Pootle goes ahead)
<lexicon lang="en-us">
    <entry>
        <base>FORM_TITLE</base>
        <trans>Software installer</trans>
    </entry>
    <entry>
        <base>Configuration</base>
        <trans>Configuration</trans>
    </entry>
    <entry>
        <base>Software installer</base>
        <trans>Software installer</trans>
    </entry>

<lexicon lang="en-us">
    <entry>
        <base>FORM_TITLE</base>
        <trans>Mise à jour logicielle</trans>
    </entry>
    <entry>
        <base>Configuration</base>
        <trans>Configuration</trans>
    </entry>
    <entry>
        <base>Mise à jour logicielle</base>
        <trans>Mise à jour logicielle</trans>
    </entry>
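To find the entries that would trigger this problem before converting a lexicon, something like the following could be used. It is a rough sketch, not an XML parser: it assumes each <base> and <trans> element sits on a single line with <base> coming first, and the function name find_identical_entries is invented for this example.

```shell
#!/bin/bash
# Hypothetical helper: list every entry whose <base> text is identical
# to its <trans> text. Assumes one <base> and one <trans> per line pair.
find_identical_entries() {
    awk '
        /<base>/ {
            b = $0
            sub(/.*<base>/, "", b)      # drop everything up to the tag
            sub(/<\/base>.*/, "", b)    # drop the closing tag and the rest
        }
        /<trans>/ {
            t = $0
            sub(/.*<trans>/, "", t)
            sub(/<\/trans>.*/, "", t)
            if (b == t) print b
        }
    ' "$1"
}
```

Run against the English yum lexicon quoted above, this would list "Configuration" and "Software installer".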
subversion
This is a test case,
Something is wrong with the encoding in some places, e.g. when viewing the .po file.
After you have a clean English .po, you can use it as a template and copy and paste the translation in. Doing this in Pootle may give fewer errors; at the very least, use Pootle for multiline and CDATA strings. I don't see an easier way.
Console
update all .po files
- Templates have been implemented with up to date strings,
- now it's over to users to translate them
.po file names are inconsistent, SV & FR add .tmpl.po
- .tmpl.po is correct, other languages have been updated
revision control using CVS/SVN
Today I had a quick look at the pootle pages and found a wiki as well, which has some valuable information like for instance revision control using CVS/SVN: http://translate.sourceforge.net/wiki/pootle/version_control - Cactus 03:26, 25 January 2008 (MST)
progress
we'll look at this later, perhaps starting with contribs
Pootle Usage
untranslated words
Click "Show Editing Functions" and then "Quick Translate", either for a single file or for the whole language.
You can also see the suggestions by clicking "Review Suggestions".
checks
click "show checks" to see a list of syntax errors
- acronyms: 6 strings (2%) failed
- brackets: 17 strings (6%) failed
- doublequoting: 4 strings (1%) failed
- doublespacing: 1 string (0%) failed
- endpunc: 11 strings (3%) failed
- endwhitespace: 4 strings (1%) failed
- numbers: 2 strings (0%) failed
- puncspacing: 2 strings (0%) failed
- sentencecount: 2 strings (0%) failed
- simplecaps: 16 strings (5%) failed
- startcaps: 6 strings (2%) failed
- unchanged: 7 strings (2%) failed
- untranslated: 53 strings (19%) failed
click on one of the links offered and fix them
See http://translate.sourceforge.net/wiki/guide/pofilter_examples and http://translate.sourceforge.net/wiki/guide/translation/commonerrors
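The same checks can probably be run offline with pofilter from the Translate Toolkit, which is what the first link above documents. A sketch, assuming pofilter is installed and on the PATH; the wrapper name run_checks is made up for this example, and the check names are the ones shown in the list above.

```shell
#!/bin/bash
# Hypothetical wrapper: run only the named pofilter checks on one PO file.
# Usage: run_checks input.po failures.po endpunc brackets
# Strings that fail a check are written to the second file.
run_checks() {
    local src="$1" dst="$2"
    shift 2

    # Build one "-t NAME" option per requested check.
    local args=()
    local check
    for check in "$@"; do
        args+=(-t "$check")
    done

    pofilter "${args[@]}" "$src" "$dst"
}
```

For example, run_checks fr/backup.po backup-failures.po endpunc brackets would collect the strings failing those two checks, ready to be reviewed in a text editor.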
Merging new strings
Add, remove or modify strings in the template/*.pot file then click 'update from template'.
- New strings are added
- modified strings reuse the existing translation and are marked fuzzy (I think)
- deleted strings are moved to the bottom of the file and commented out with #~
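Outside Pootle, gettext's msgmerge does a comparable merge: new strings are added untranslated, close matches are marked fuzzy, and strings no longer in the template are kept at the end as #~ comments. This is only a sketch for trying the behaviour locally. It assumes gettext is installed, the function name update_from_template is invented here, and it is not a claim about what Pootle runs internally.

```shell
#!/bin/bash
# Hypothetical helper: bring one language's .po file up to date with a
# template, in place. Usage: update_from_template fr/backup.po templates/backup.pot
update_from_template() {
    local po="$1" pot="$2"

    # --update rewrites the .po file; --backup=none skips the "~" backup copy.
    msgmerge --update --backup=none "$po" "$pot"
}
```

Running it on a copy of a .po file and then diffing against the original shows which strings were added, made fuzzy or commented out.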
Remarks
duplicate translation work
Some phrases appear in multiple panels, so they are translated more than once.
Let's create a list and add them to 'general'. All panels would check 'general' if the tag isn't found in their own lexicon.
- Yes
- No
- Save
- Success
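The proposed fallback could work roughly as follows. This is an illustration of the lookup order only, not how formmagick resolves tags: the function names lexicon_get and lookup_trans are invented, and the awk code assumes each <base> and <trans> is on its own line.

```shell
#!/bin/bash
# Hypothetical helper: print the <trans> text for tag $1 in lexicon file $2.
# Returns non-zero if the tag is not in that file.
lexicon_get() {
    awk -v want="$1" '
        /<base>/ {
            b = $0
            sub(/.*<base>/, "", b)
            sub(/<\/base>.*/, "", b)
        }
        /<trans>/ {
            t = $0
            sub(/.*<trans>/, "", t)
            sub(/<\/trans>.*/, "", t)
            if (b == want) { print t; found = 1; exit }
        }
        END { exit !found }
    ' "$2"
}

# Look in the panel's own lexicon first, then fall back to "general".
# Usage: lookup_trans TAG panel_lexicon general_lexicon
lookup_trans() {
    lexicon_get "$1" "$2" || lexicon_get "$1" "$3"
}
```

With this order a panel can still override a shared phrase such as "Save" by defining the tag in its own lexicon.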
Pootle issues
UTF-8
Bug created, 'Use UTF-8 for console .po files' http://bugs.contribs.org/show_bug.cgi?id=3858
While testing Pootle I found a few issues, which I will try to describe.
Pootle (and the Translate Toolkit used by Pootle) uses UTF-8 as its default charset. Every time I create a new language translation, Pootle parses the templates as UTF-8. Every time I run "Update from templates", the target PO files are updated as UTF-8, and if the target has charset=iso-8859-1 set, it is then corrupted. I couldn't find a way to make Pootle use ISO-8859-1 as the default. Because of this, and to avoid these issues, I have decided to use UTF-8 as the charset in Pootle. Please see http://translate.sourceforge.net/wiki/guide/locales/glibc?s=charset#editing
It is not only Pootle: every desktop translation tool sets UTF-8 as its default charset.
If you want to merge or overwrite a PO file, convert it to UTF-8 first (if it isn't already). Then you can upload it without corrupting the target file.
One note for admins once Pootle runs at contribs.org: I suggest using a tool to convert from UTF-8 to ISO-8859-1 before building a new language rpm update package (if gettext can't support UTF-8 encoding).
I have used this great script to convert all the files in a directory from one charset to another:
#!/bin/bash
# ./dir_iconv.sh dir cp1251 utf8 - converts all files in directory dir from cp1251 (windows-1251) to utf8.
ICONVBIN='/usr/bin/iconv' # path to iconv binary

if [ $# -lt 3 ]
then
    echo "$0 dir from_charset to_charset"
    exit
fi

for f in "$1"/*
do
    if test -f "$f"
    then
        echo -e "\nConverting $f"
        # keep the original as $f.old and write the converted text to $f
        /bin/mv "$f" "$f.old"
        $ICONVBIN -f "$2" -t "$3" "$f.old" > "$f"
    else
        echo -e "\nSkipping $f - not a regular file";
    fi
done
This also applies to panels (if formmagick can't support UTF-8 encoding).
It may be possible to modify Pootle or the Translate Toolkit to do this automatically, but I don't know how. --Normando Hall 12:45, 2 February 2008 (MST)
- Formmagick supports UTF-8: http://search.cpan.org/~mitel/CGI-FormMagick-0.89/lib/CGI/FormMagick.pm
- GetText does support UTF-8 as well: http://www.gnu.org/software/gettext/manual/gettext.html#Charset-conversion
- It is quite common for translations/lexicons to be in UTF8 as this is a well known locale independent character set, probably also the reason why Pootle is sticking to UTF-8. - Cactus 15:01, 2 February 2008 (MST)