Bugzilla – Bug 100116
encoding bug
Last modified: 2005-11-18 15:39:36 UTC
I'm on Mandriva 10.2 using ISO-8859-15.

When my tjp file is encoded in UTF-8 and I write a task name in French such as "Amélioration du contrôle", I get "Amélioration du contrÃŽle" in the HTML report. In the rawhead section, however, I can write accented characters and everything is OK.

When my tjp file is encoded in ISO-8859-15 and I write the same task name, I get "Amélioration du contrôle" in the HTML report, but accented characters in the rawhead section come out garbled.

So I can't have both the task names and the rawhead displayed correctly at the same time.
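The garbled task name above is consistent with UTF-8 bytes being read as ISO-8859-15. A minimal Python sketch of that effect follows. It is not TaskJuggler code, and the helper name `misread` is made up for illustration.

```python
def misread(text: str, written_as: str, read_as: str) -> str:
    """Encode text in one charset, then decode the same bytes in another."""
    return text.encode(written_as).decode(read_as, errors="replace")


# "ô" is the two bytes C3 B4 in UTF-8. Read as ISO-8859-15, those bytes
# are the two separate characters "Ã" (C3) and "Ž" (B4).
assert misread("contrôle", "utf-8", "iso-8859-15") == "contrÃŽle"
```

The same bytes read back as UTF-8 give the original word again, so the file itself is intact; only the assumed encoding is wrong.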
TaskJuggler does not try to detect the encoding of your .tjp files. It assumes that the file is encoded in the same encoding as your locale. So if your file is UTF-8, you need to set your locale to UTF-8 as well. Then the encoding problems should be gone.
When encoding my .tjp file in ISO-8859-15 I still have a bug. I can't write

    rawhead '<table border="1"> <tr> <td><a href="Tasks-Overview.html">Tâches hebdo</a></td> ... '

Instead I have to write the "â" in "Tâches" as an HTML entity. Otherwise the word "Tâches" appears as "T�ches hebdo", which is how the browser displays it when using UTF-8. The browser is forced to use this encoding because you write the following at the top of the HTML reports:

    <head>
    <title>Task Report</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>
    </head>

But the task name is converted correctly: you turn the "ô" in "contrôle" into a character reference, which displays correctly in UTF-8 or any other encoding. For my task:

    task CT "contrôle" { }
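Both observations in this comment can be reproduced with a short Python sketch. It is only an illustration: the helper names are hypothetical, and the hexadecimal reference format is an assumption, since the thread does not show the exact markup TaskJuggler emits.

```python
def shown_as_utf8(text: str, source_encoding: str = "iso-8859-15") -> str:
    """Decode bytes written in source_encoding as if they were UTF-8."""
    raw = text.encode(source_encoding)
    return raw.decode("utf-8", errors="replace")


def with_char_refs(text: str) -> str:
    """Replace each non-ASCII character with a hexadecimal character reference."""
    return "".join(c if ord(c) < 128 else f"&#x{ord(c):X};" for c in text)


# "â" is the single byte E2 in ISO-8859-15. On its own that byte is not
# valid UTF-8, so a UTF-8 reader substitutes U+FFFD (the "�" sign).
assert shown_as_utf8("Tâches hebdo") == "T\ufffdches hebdo"

# A character reference is pure ASCII, so it survives any charset declaration.
assert with_char_refs("contrôle") == "contr&#xF4;le"
```

This is why text passed through unchanged (rawhead) breaks under the charset=utf-8 declaration, while text converted to character references (task names) does not.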
Can you send me the output of 'locate' and 'file <yourproject.tjp>'?
Created attachment 46126
My tjp file
Well, I suppose that it was "locale" and not "locate", so:

    [Max]$ locale
    LANG=fr_FR
    LC_CTYPE=fr_FR
    LC_NUMERIC=fr_FR
    LC_TIME=fr_FR
    LC_COLLATE=fr_FR
    LC_MONETARY=fr_FR
    LC_MESSAGES=fr_FR
    LC_PAPER=fr_FR
    LC_NAME=fr_FR
    LC_ADDRESS=fr_FR
    LC_TELEPHONE=fr_FR
    LC_MEASUREMENT=fr_FR
    LC_IDENTIFICATION=fr_FR
    LC_ALL=
That all looks fine and I can't reproduce the problem here on my SUSE Linux 9.3. When I set LC_ALL to fr_FR and process the ISO-8859-15 file (attachment id 46126), the HTML reports look fine. I could not find an accented character that was wrong. When I try the UTF-8 file and set LC_ALL to fr_FR.UTF-8, the HTML reports again look fine. So as long as your locale matches the file encoding you should be good.
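A rough way to check whether a file's bytes fit the encoding implied by the current locale is sketched below in Python. The function names are hypothetical, and this is only a plausibility test, not something TaskJuggler does.

```python
import locale


def decodes_cleanly(data: bytes, encoding: str) -> bool:
    """Return True if data can be decoded in the given encoding without errors."""
    try:
        data.decode(encoding)
    except (UnicodeDecodeError, LookupError):
        # LookupError covers an encoding name that Python does not know.
        return False
    return True


def matches_locale(data: bytes) -> bool:
    """Check data against the encoding implied by the current locale."""
    return decodes_cleanly(data, locale.getpreferredencoding())


# The result depends on the locale the script runs under.
print(locale.getpreferredencoding(), matches_locale("contrôle".encode("utf-8")))
```

The check is mainly useful for spotting non-UTF-8 data under a UTF-8 locale. A single-byte charset such as ISO-8859-15 accepts almost any byte sequence, so a clean decode there does not prove the file was really written in that charset.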
Created attachment 46590
the right one

I removed the HTML entities from the rawhead section.
I think I've run into this problem when using a Russian (koi8-r) encoding for a project. It is caused by latin-1 output transformations being used, at least in the macro and HTML report code. Try "grep -i latin1 *.cpp" to see what is going on. I've hacked around this problem by replacing

    - s.setEncoding(QTextStream::Latin1);

with

    + s.setEncoding(QTextStream::UnicodeUTF8);

everywhere in the reports and report elements. This is not a clean solution (some HTML post-processing is needed), but it lets me use Cyrillic in tjp files and get readable reports.
The HTML files are always latin1 since non-ASCII characters are hex encoded. If you want to use non-ASCII characters you have to use a UTF8 locale and encode your files in UTF8. Other encodings will not work properly.
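For an existing ISO-8859-15 project file, following this advice means re-encoding the file as UTF-8. A minimal Python sketch, assuming the source encoding is known in advance (the function name is made up for illustration):

```python
def reencode_to_utf8(data: bytes, source_encoding: str = "iso-8859-15") -> bytes:
    """Decode bytes from source_encoding and re-encode them as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


# "ô" is the single byte F4 in ISO-8859-15 and the two bytes C3 B4 in UTF-8.
legacy = 'task CT "contrôle" { }'.encode("iso-8859-15")
assert reencode_to_utf8(legacy) == b'task CT "contr\xc3\xb4le" { }'
```

The conversion is only correct if the declared source encoding is the one the file was actually written in; applying it to a file that is already UTF-8 would garble it.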
This bug is not closed! You are making a mistake. Have you read comment #2? If you produce HTML reports with

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8"/>

you must make sure that the user's data follows this charset. But when you work on a machine whose locale uses ISO-8859-15, the user's data is in that charset! You must provide a conversion. Otherwise your software is broken, I'm afraid. Of course there are ways to work around it, but that's not the right way.