Changes

11,947 bytes removed ,  18:44, 10 March 2008
no edit summary
Line 1: Line 1: −
== Preliminary ==
+
Managing SME translations with Pootle has been moved to [[Translations]].
*http://bugs.contribs.org/show_bug.cgi?id=3782
  −
The Goal is to manage SME translations with Pootle hosted on contribs.org.
     −
Once we have worked out how to interact with Pootle and the FormMagick files, I'll get something up on contribs.org that everyone can use. It would be really nice if we could automate the extraction and import of files that need to be translated into Pootle, but first things first.
+
== Please use common packages and default package names (where possible) ==
   −
[[User:Slords|Slords]] 18:23, 24 January 2008 (MST)
+
A lot of the RPMs are already in the DAG repository, so there is no point in packaging them yourself, or in copying them from the providing website and sharing them under a different name.
   −
== Pootle Usage ==
+
:What packages exactly? All packages are built from the source code with the command "python setup.py bdist_rpm" (under Python 2.4), and the package itself creates the RPM. The only package whose name I have changed is smeserver-pylucene, because it has a lot of problems building as part of the RPM build. Instead, I have only packaged the installed files. But the next release of the Translate Toolkit supports PyLucene 2.x, and I have that package built correctly.
   −
Test pootle site: http://www.unixlan.com.ar:8888
+
Please point users to download python-kid (replacing your kid RPM), python-lxml (replacing your lxml RPM), python-sqllite (replacing your pysqlite RPM), python_Levenshtein and python-elementtree (which you called elementtree), which is also available in the DAG repository and AFAIK is already installed on SME Server by default.
   −
===untranslated words===
+
:These packages don't run under python2.4.
Click "Show Editing Functions" and then "Quick Translate" for each file or for the whole language.
     −
You can also see the suggestions by clicking "Review Suggestions".
+
On top of that, not everything is required to install Pootle; the only requirements I could find on the Pootle site are listed [http://translate.sourceforge.net/wiki/pootle/installation#pre-requisite_software here]. Because jToolkit requires pythonabi-2.4, you have problems installing against pythonabi-2.3, which is what is installed on SME Server 7.3. Perhaps you can find an older jToolkit, or recompile jToolkit from source and see whether it also works with pythonabi-2.3; this would drop the hack you have to do to make it work on SME Server 7.x.
   −
===checks===
+
:Yes, the requirements say Python 2.3 is supported but 2.4 is preferable. That page is out of date. If you search through the Pootle mailing lists you will find a lot of problems with Python 2.3. Of course I would have preferred Python 2.3; I made my own packages and tested them with Python 2.3, but without success. I had a lot of headaches just trying to run Pootle under Python 2.3. So Python 2.4 is mandatory. It is also much more efficient than 2.3 and has some new functions that Pootle uses.
Click "show checks" to see a list of syntax errors:
+
:See http://translate.svn.sourceforge.net/viewvc/translate/src/trunk/Pootle/README?r1=6098&r2=6144
   −
acronyms  6 strings (2%) failed
+
:I have not finished this howto yet. Sorry for not warning about that. I have now added a template box with a warning.
brackets 17 strings (6%) failed
+
:Only when I have finished this howto will I include it in smecontribs.
doublequoting 4 strings (1%) failed
  −
doublespacing 1 string (0%) failed
  −
endpunc 11 strings (3%) failed
  −
endwhitespace 4 strings (1%) failed
  −
numbers 2 strings (0%) failed
  −
puncspacing 2 strings (0%) failed
  −
sentencecount 2 strings (0%) failed
  −
simplecaps 16 strings (5%) failed
  −
startcaps 6 strings (2%) failed
  −
unchanged 7 strings (2%) failed
  −
untranslated 53 strings (19%) failed
     −
Click one of the links offered and fix the failing strings.
     −
See http://translate.sourceforge.net/wiki/guide/pofilter_examples and http://translate.sourceforge.net/wiki/guide/translation/commonerrors
+
- [[User:Cactus|Cactus]] 14:09, 19 February 2008 (MST)
 
  −
===Merging new strings===
  −
Add, remove or modify strings in the template/*.pot file then click 'update from template'.
  −
 
  −
:New strings are added
  −
:Modified strings keep their existing data and are marked fuzzy (I think)
  −
:deleted strings are moved to the bottom of the file and commented out with #~
  −
 
  −
== FormMagick ==
  −
Normando suggested a few tools; I'm using XML2PO. See the others in the history:
  −
http://wiki.contribs.org/index.php?title=Talk:Pootle&oldid=7649#I_need_your_help
  −
 
  −
===XML2PO===
  −
( http://linux.die.net/man/1/xml2po )
  −
 
  −
I have packaged it for easier installation. You can download it from
  −
http://mirror.contribs.org/smeserver/contribs/nhall/sme7/contribs/pootle/rpm/gnome-doc-utils-0.12.0-1.noarch.rpm
  −
 
  −
Before trying it, you must edit a few lines in this file:
  −
 
  −
/usr/share/xml2po/empty.py
  −
Line 27 from "return ['base', 'A', 'a', 'i', 'I', 'B', 'b' ,'P' ,'p' ,'h2' ,'H2', 'div', 'DIV', 'font']
  −
Line 31 from "return []" to "return ['trans']"
  −
Line 35 leave "return []"
  −
Line 39 from "return []" to "return ['trans']" // or ['lang'] testing
  −
 
  −
===Create .po and export xml===
  −
To test the lexicons
  −
xml2po -m empty -e -o backup.po backup
  −
 
  −
View the new backup.po file in PO format. Excellent. Now you can translate the PO file with Pootle and convert it back to a FormMagick panel lexicon with this command:
  −
 
  −
xml2po -p backup.po backup > backup.new
  −
 
  −
As you can see, if you do not translate backup.po, the backup.new file is identical to the original backup file, with one difference: backup.new has an extra line added at the top. We remove this line with newxml:
  −
<?xml version="1.0" encoding="utf-8"?>
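The newxml script further down removes the first line unconditionally. As a minimal sketch of a more defensive variant, the helper below drops line 1 only when it really is an XML declaration. The function name strip_xml_decl is hypothetical and is not part of xml2po or of the scripts on this page.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of xml2po or newxml):
# delete line 1 only if it starts with "<?xml ", pass everything else through.
strip_xml_decl() {
  sed '1{/^<?xml /d;}'
}

# Example: the declaration added by xml2po is removed, the lexicon tag is kept.
printf '%s\n' '<?xml version="1.0" encoding="utf-8"?>' '<lexicon lang="en-us">' \
  | strip_xml_decl
```

It could be used as "xml2po -p backup.po backup | strip_xml_decl > backup.new", so a file that has no declaration line is left untouched.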
  −
 
  −
===shell scripts===
  −
xml2po ignores sections such as CDATA, so CDATA content does not appear in the PO file.
  −
 
  −
The workaround is to find and replace the problem code
  −
 
  −
Create a clean .po file with newpo. To work around new problems, either edit your lexicon or add s///g commands to the script.
  −
 
  −
After you have a clean English .po you could use that as a template and copy and paste the translation in. A better idea is to write a merge tool; see newpomerge below.
  −
 
  −
====newpo====
  −
#!/bin/bash
  −
#
  −
#SME Server Create lexicon .po
  −
 
  −
if  [ -f "$1" ]
  −
then
  −
  #convert to UTF-8 while working on pootle
  −
  mv $1 $1.bak
  −
  /usr/bin/iconv -f ISO-8859-1 -t UTF-8 $1.bak > $1
  −
  −
  #echo "Remove CDATA and reformat problem codes in $1"
  −
  perl -pi -e 's/<!\[CDATA\[/STARTCDATA/g' $1
  −
  perl -pi -e 's/\]\]>/ENDCDATA/g' $1
  −
  perl -pi -e 's/\&/AMP/g' $1
  −
  perl -pi -e 's/P\>/p\>/g' $1
  −
  perl -pi -e 's/A\>/a\>/g' $1
  −
  perl -pi -e 's/\<(br|BR)\>/breeak/g' $1
  −
  perl -pi -e 's/\<\/(font|FONT)\>//g' $1
  −
  −
  #echo "Create $1.po"
  −
  xml2po -m empty -e -o $1.po $1
  −
  −
  #echo "Replacing CDATA in $1"
  −
  perl -pi -e 's/STARTCDATA/<!\[CDATA\[/g' $1.po
  −
  perl -pi -e 's/ENDCDATA/\]\]>/g' $1.po
  −
  perl -pi -e 's/AMP/\&/g' $1.po
  −
  perl -pi -e 's/breeak/\<bXr\>/g' $1.po  #wiki display error, fixme
  −
  −
  #basic testing
  −
  A=`cat $1.bak |grep '<entry>' |wc -l`
  −
  B=`cat $1.po |grep msgid |wc -l`
  −
  C=`expr $B - 1`
  −
  echo "entries $A, msgid $C"
  −
  −
  if  [ $A -ne $C ]
  −
  then
  −
    echo "Errors in formatting"
  −
    tail $1.po
  −
  fi
  −
 
  −
  #restore original
  −
  mv $1.bak $1
  −
  −
  if  [ ${#2} -gt 0 ]
  −
  then
  −
    echo "<base> entries"
  −
    cat $1 |grep base |sort
  −
  fi
  −
  −
  else
  −
  #print usage information
  −
  echo "Usage: $0 LexiconFilename"
  −
  echo "Usage: $0 LexiconFilename check"
  −
fi
  −
 
  −
:I am wondering why we are using perl for the replacements when we could also use sed. If readability is not an issue, we can do the whole CDATA tag conversion at once:
  −
: <pre>sed -e 's/<!\[CDATA\[/STARTCDATA/g' -e 's/\]\]>/ENDCDATA/g' -e 's/&/AMP/g' < original.file > conversion.file</pre>
  −
: And back
  −
: <pre>sed -e 's/STARTCDATA/<!\[CDATA\[/g' -e 's/ENDCDATA/\]\]>/g' -e 's/AMP/\&/g' < conversion.file > new.file</pre>
  −
 
  −
::Yes, but there are more conversions, so it was easier to operate on the one file; efficiency isn't a concern.
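The placeholder round trip discussed in this thread can be sketched as a pair of filters. The @@...@@ token names are an assumption, chosen only because a bare AMP placeholder would also be rewritten inside ordinary words such as "LAMP" on the way back; the newpo and newxml scripts on this page use the plain names.

```shell
#!/bin/bash
# Sketch only: the @@...@@ placeholder names are assumptions, not what newpo uses.

# Hide CDATA markers and ampersands from xml2po.
protect() {
  sed -e 's/<!\[CDATA\[/@@STARTCDATA@@/g' \
      -e 's/]]>/@@ENDCDATA@@/g' \
      -e 's/&/@@AMP@@/g'
}

# Put the original markup back after xml2po has run.
restore() {
  sed -e 's/@@STARTCDATA@@/<![CDATA[/g' \
      -e 's/@@ENDCDATA@@/]]>/g' \
      -e 's/@@AMP@@/\&/g'
}

# Round trip on one sample line; the output should equal the input.
sample='<trans><![CDATA[Backup &amp; restore on the LAMP host]]></trans>'
printf '%s\n' "$sample" | protect | restore
```

Both filters read standard input, so they can wrap the xml2po call on either side without modifying the lexicon file in place.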
  −
 
  −
====newpomerge====
  −
 
  −
The aim of this is to take a blank en .po and merge in the values for a translation .po
  −
 
  −
en.po has
  −
msgid "Yes"
  −
msgstr ""
  −
 
  −
fr.po has
  −
msgid "Oui"
  −
msgstr ""
  −
 
  −
we want fr.po to look like
  −
msgid "Yes"
  −
msgstr "Oui"
  −
 
  −
We can't assume the translation will be in the same order,
  −
so it involves looking back at the lexicon base field to do the match
  −
 
  −
or rewriting the fr lexicon first so it is in the same order, with empty fields if necessary,
  −
then you can just go through in order picking out records.
  −
 
  −
Sample .po output attached, it will always be in this format (but translation msgid's may be missing or out of order)
  −
msgid ""
  −
msgstr ""
  −
"Project-Id-Version: PACKAGE VERSION\n"
  −
"POT-Creation-Date: 2008-02-04 02:31+1100\n"
  −
"PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
  −
"Last-Translator: FULL NAME <EMAIL@ADDRESS>\n"
  −
"Language-Team: LANGUAGE <LL@li.org>\n"
  −
"MIME-Version: 1.0\n"
  −
"Content-Type: text/plain; charset=UTF-8\n"
  −
"Content-Transfer-Encoding: 8bit\n"
  −
  −
#: functions/useraccounts:6(trans)
  −
msgid "Create, modify, or remove user accounts"
  −
msgstr ""
  −
  −
#: functions/useraccounts:33(trans)
  −
msgid "Create or modify"
  −
msgstr ""
  −
  −
#: functions/useraccounts:66(trans)
  −
msgid "Modify the admin account"
  −
msgstr ""
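The merge described above has not been written yet. As a rough sketch of the "match on the lexicon base field" approach, the function below reads the English lexicon and a translated lexicon and prints one PO entry per English base, in English order, leaving msgstr empty when the translation is missing. It assumes that every <base> and <trans> element sits on a single line, as in the yum lexicon further down; the function name, the "#. base:" comment line and the file names are all hypothetical.

```shell
#!/bin/bash
# Hypothetical sketch of newpomerge; assumes one <base> or <trans> element per line.
# Usage: newpomerge en_lexicon translated_lexicon > merged.po
newpomerge() {
  awk '
    # Return the text between <tag> and </tag> on the current line.
    function inner(line, tag,    s) {
      s = line
      sub(".*<" tag ">", "", s)
      sub("</" tag ">.*", "", s)
      return s
    }
    /<base>/  { base = inner($0, "base") }
    /<trans>/ {
      text = inner($0, "trans")
      gsub(/"/, "\\\"", text)                                  # escape quotes for PO
      if (FNR == NR) { en[base] = text; order[++n] = base }    # first file: English
      else           { tr[base] = text }                       # second file: translation
    }
    END {
      for (i = 1; i <= n; i++) {
        b = order[i]
        printf "#. base: %s\nmsgid \"%s\"\nmsgstr \"%s\"\n\n", b, en[b], tr[b]
      }
    }
  ' "$1" "$2"
}

# Demo with two tiny made-up lexicons; the translation lacks the NO entry.
tmp=$(mktemp -d)
printf '%s\n' '<base>YES</base>' '<trans>Yes</trans>' \
              '<base>NO</base>'  '<trans>No</trans>'  > "$tmp/en"
printf '%s\n' '<base>YES</base>' '<trans>Oui</trans>' > "$tmp/fr"
newpomerge "$tmp/en" "$tmp/fr"
rm -r "$tmp"
```

Because the output follows the English order, the result lines up with the clean English .po produced by newpo. Multi-line <trans> values and the CDATA handling from newpo are deliberately left out of this sketch.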
  −
 
  −
====newxml====
  −
#!/bin/bash
  −
#
  −
 
  −
  function usage {
  −
  echo ""
  −
  echo "Create .xml from .po"
  −
  echo ""
  −
  echo "Not enough parameters provided."
  −
  echo "Usage: $0 filename (don't add .po)"
  −
  echo ""
  −
  echo "Optional: compare against original"
  −
  echo "Usage: $0 filename check"
  −
  echo ""
  −
  }
  −
  −
  #check for required parameters
  −
  if  [ ${#1} -gt 0  ]
  −
  then
  −
  −
  #echo "Remove CDATA in $1.po"
  −
  cp $1.po $1.bak
  −
  perl -pi -e 's/<!\[CDATA\[/STARTCDATA/g' $1.bak
  −
  perl -pi -e 's/\]\]>/ENDCDATA/g' $1.bak
  −
  perl -pi -e 's/\&/AMP/g' $1.bak
  −
  −
  #echo "Create xml"
  −
  xml2po -p $1.bak $1 > $1.xml
  −
  −
  #echo "Replacing CDATA"
  −
  perl -pi -e 's/STARTCDATA/<!\[CDATA\[/g' $1.xml
  −
  perl -pi -e 's/ENDCDATA/\]\]>/g' $1.xml
  −
  perl -pi -e 's/AMP/\&/g' $1.xml
  −
  −
  #this is added at line 367 xml2po, it needs to be removed or better not added
  −
  #perl -pi -e 's/\<\?xml version="1.0" encoding="utf-8"\?\>//' $1.xml
  −
  −
  #remove first line
  −
  perl -i.old -ne 'print unless 1 .. 1' $1.xml
  −
  −
  rm $1.bak $1.xml.old
  −
  −
  if  [ ${#2} -gt 0  ]
  −
  then
  −
    #echo "#diff -n $1 $1.xml"
  −
    diff -n $1 $1.xml
  −
  fi
  −
  −
  else
  −
  #print usage information
  −
  usage
  −
fi
  −
 
  −
===yum===
  −
:newxml needs a parameter to set:  lang="fr"
  −
:: why doesn't this work ?
  −
:: xml2po -l fr -p yum.po yum > yum.xml
  −
::    -l    --language=LANG      Set language of the translation to LANG
  −
 
  −
:you can't have a base equal to a trans, needs a new bug (if pootle goes ahead)
  −
:most en base lexicons do this,
  −
:easy fix is to tweak the trans for Form_Title, capitalise or punctuate
  −
:-<trans>Software installer</trans>
  −
:+<trans>Software Installer</trans>
  −
 
  −
<lexicon lang="en-us">
  −
    <entry>
  −
        <base>FORM_TITLE</base>
  −
        <trans>Software installer</trans>
  −
    </entry>
  −
 
  −
    <entry>
  −
      <base>Configuration</base>
  −
      <trans>Configuration</trans>
  −
    </entry>
  −
 
  −
    <entry>
  −
      <base>Software installer</base>
  −
      <trans>Software installer</trans>
  −
    </entry>
  −
 
  −
<lexicon lang="en-us">
  −
    <entry>
  −
      <base>FORM_TITLE</base>
  −
      <trans>Mise à jour logicielle</trans>
  −
    </entry>
  −
  −
    <entry>
  −
      <base>Configuration</base>
  −
      <trans>Configuration</trans>
  −
    </entry>
  −
  −
    <entry>
  −
      <base>Mise à jour logicielle</base>
  −
      <trans>Mise à jour logicielle</trans>
  −
    </entry>
  −
 
  −
==Console==
  −
 
  −
*http://bugs.contribs.org/show_bug.cgi?id=3833
  −
update all .po files
  −
: Templates have been implemented with up to date strings,
  −
: now it's over to users to translate them
  −
 
  −
 
  −
*http://bugs.contribs.org/show_bug.cgi?id=3834
  −
.po file names are inconsistent, SV & FR add .tmpl.po
  −
: .tmpl.po is correct, other languages have been updated
  −
 
  −
 
  −
*http://bugs.contribs.org/show_bug.cgi?id=3858
  −
Use UTF-8 for console .po files
  −
: Pootle has problems with the current charset=iso-8859-1
  −
: we now use UTF-8
  −
 
  −
== revision control using CVS/SVN ==
  −
 
  −
Today I had a quick look at the pootle pages and found a wiki as well, which has some valuable information like for instance revision control using CVS/SVN: http://translate.sourceforge.net/wiki/pootle/version_control - [[User:Cactus|Cactus]] 03:26, 25 January 2008 (MST)
  −
 
  −
===progress===
  −
we'll look at this later, perhaps starting with contribs
  −
:A few links for future implementation:
  −
:http://subversion.tigris.org/tools_contrib.html#po_update_sh
  −
:http://subversion.tigris.org/tools_contrib.html#verify_po_py
  −
:http://subversion.tigris.org/tools_contrib.html#svnmerge_py
  −
:http://subversion.tigris.org/tools_contrib.html#svnmerge_sh
  −
 
  −
== Remarks ==
  −
 
  −
===Translation Workflow===
  −
 
  −
Draft suggestions ...
  −
 
  −
The original files are in cvs.
  −
 
  −
====.po files====
  −
*gettext strings are in various files in cvs, currently we think we have them all in pootle as .pot template files
  −
: we need a script to search a maintained list of files to extract current gettext strings
  −
: this may find missing (or modified) strings or remove old ones saving on pointless translations
  −
 
  −
*existing translations have been placed in pootle and merged with the templates
  −
:this shows missing translations very nicely and is ready for user testing
  −
 
  −
*all .po files have been converted to UTF-8, they can be converted back easily if necessary
  −
 
  −
*changes to original files, ie changed gettext strings, have to be tracked as usual in the bug tracker
  −
:these are imported into the templates, other languages are updated with a click
  −
 
  −
*at some point we diff against cvs po files, the patch is checked and applied.
  −
 
  −
 
  −
====FormMagick files====
  −
*these are XML files with a <base> and <trans> pair of strings, these can be converted to .po files with some work
  −
 
  −
*some rough scripts are being worked on above, there may be others on the net waiting to be found...
  −
: we need to clean up some inconsistencies in the English lexicons and they will be our .pot templates
  −
: we need to script the conversion of existing translations into .po files, discussed above
  −
 
  −
*all .po files have been converted to UTF-8, they can be converted back easily if necessary
  −
 
  −
*changes to original files, have to be tracked as usual in the bug tracker
  −
:the following is speculative ...
  −
:these are imported into the .pot templates, other languages are updated with a click
  −
:I think any new <base> strings have to be added to the cvs xml file for each language
  −
 
  −
*at some point we run xml2po to apply the new translation back to the xml file,
  −
:then diff against cvs, make patch, check and apply
  −
 
  −
== Pootle issues==
  −
=== UTF-8 ===
  −
Bug created, 'Use UTF-8 for console .po files'
  −
http://bugs.contribs.org/show_bug.cgi?id=3858
  −
:: SME Translation files currently use iso-8859-1
  −
:: we currently convert any file in pootle to UTF-8
  −
:: we can convert back when returning to SME if necessary, by adding one line in a script.
  −
 
  −
While testing Pootle I found a few issues, which I will try to describe.
  −
 
  −
Pootle (and the Translate Toolkit used by Pootle) uses UTF-8 as its default charset. Every time I create a new language translation, Pootle parses the templates as UTF-8. Every time I run "Update from templates", the target PO files are updated as UTF-8, and if the target is set to charset=iso-8859-1 it is then corrupted. I couldn't find a way to make Pootle use ISO-8859-1 by default. Because of this, and to avoid these issues, I have decided to use UTF-8 as the charset in Pootle. Please see http://translate.sourceforge.net/wiki/guide/locales/glibc?s=charset#editing
  −
 
  −
This is not only Pootle: every desktop translation tool sets UTF-8 as its default charset.
  −
 
  −
If you want to merge or overwrite a PO file, convert it to UTF-8 first (if it is not already). Then you can upload it without corrupting the target file.
  −
 
  −
One note for admins once Pootle runs at contribs.org: I suggest using a tool to convert from UTF-8 to ISO-8859-1 before building a new language RPM update package (if gettext can't support UTF-8 encoding).
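For a single file, the conversion back could look like the sketch below. It assumes that the charset named in the PO header should be rewritten together with the bytes, so that header and content stay consistent; the function name and the file names are placeholders.

```shell
#!/bin/bash
# Sketch, not a tested packaging step: convert one UTF-8 .po file back to
# ISO-8859-1 and update the charset named in its header.
# Usage: po_to_latin1 input.po output.po   (both names are placeholders)
po_to_latin1() {
  # Rewrite the header while the text is still UTF-8, then convert the bytes.
  sed 's/charset=[Uu][Tt][Ff]-8/charset=ISO-8859-1/' "$1" |
    iconv -f UTF-8 -t ISO-8859-1 > "$2"
}
```

iconv stops with an error if a translation contains a character that ISO-8859-1 cannot represent, so a failed conversion is itself a useful check before packaging.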
  −
 
  −
I have used this great script to convert all files in a directory from one charset to another:
  −
 
  −
#!/bin/bash
  −
  −
#./dir_iconv.sh dir cp1251 utf8 - converts all files from directory dir .. cp1251 (windows-1251) to utf8.
  −
  −
ICONVBIN='/usr/local/bin/iconv' # path to iconv binary
  −
  −
if [ $# -lt 3 ]
  −
then
  −
    echo "$0 dir from_charset to_charset"
  −
    exit
  −
fi
  −
  −
for f in "$1"/*
  −
do
  −
    if test -f "$f"
  −
    then
  −
        echo -e "\nConverting $f"
  −
        /bin/mv "$f" "$f.old"
  −
        "$ICONVBIN" -f "$2" -t "$3" "$f.old" > "$f"
  −
    else
  −
        echo -e "\nSkipping $f - not a regular file";
  −
    fi
  −
done
  −
 
  −
This also applies to panels (if FormMagick can't support UTF-8 encoding).
  −
 
  −
It may be possible to modify Pootle or the Translate Toolkit to do this automatically, but I don't know how.
  −
--[[User:PicsOne|Normando Hall]] 12:45, 2 February 2008 (MST)
  −
 
  −
:#. Formmagick supports UTF-8: http://search.cpan.org/~mitel/CGI-FormMagick-0.89/lib/CGI/FormMagick.pm
  −
:#. GetText does support UTF-8 as well: http://www.gnu.org/software/gettext/manual/gettext.html#Charset-conversion
  −
 
  −
:It is quite common for translations/lexicons to be in UTF-8, as this is a well-known locale-independent character set, which is probably also the reason why Pootle is sticking to UTF-8. - [[User:Cactus|Cactus]] 15:01, 2 February 2008 (MST)
 