Bug 121943 - pine: broken encoding handling (mail headers truncated in UTF-8 mode)
Status: RESOLVED FIXED
Alias: None
Product: SUSE Linux 10.1
Classification: openSUSE
Component: Other
Version: Beta 1
Hardware: Other All
Priority: P5 - None; Severity: Normal
Target Milestone: Beta 4
Assignee: Gary Ekker
QA Contact: E-mail List
Reported: 2005-10-10 12:07 UTC by Michal Svec
Modified: 2006-02-11 09:16 UTC (History)

Found By: Development


Attachments
screenshot (24.00 KB, image/png)
2005-12-13 12:59 UTC, Michal Svec
Pine decoding sample email (196 bytes, text/plain)
2006-01-18 09:38 UTC, Björn Voigt
patches diff (1.42 KB, text/plain)
2006-01-20 09:55 UTC, Michal Svec

Description Michal Svec 2005-10-10 12:07:58 UTC
Encoding handling seems to have broken in some places since 9.3; I noticed it
in mail subjects and From names. It looks like a length problem: the text
"Český text" (Cesky text) in a subject is displayed as "Český t" (Cesky t).
The original in the mailbox is 'Subject: =?iso-8859-2?Q?=C8esk=FD_text?='.
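
For reference, the Subject quoted above is an RFC 2047 encoded word. The following is a minimal sketch of what a correct decoder should produce, using Python's standard `email.header` module. It is not Pine's code, and the helper name `decode_subject` is made up for illustration.

```python
from email.header import decode_header


def decode_subject(raw: str) -> str:
    """Decode an RFC 2047 header value into a Unicode string."""
    parts = []
    for payload, charset in decode_header(raw):
        if isinstance(payload, bytes):
            # Encoded words come back as bytes plus their declared charset.
            parts.append(payload.decode(charset or "ascii", errors="replace"))
        else:
            # Plain, unencoded text may already be a str.
            parts.append(payload)
    return "".join(parts)


subject = decode_subject("=?iso-8859-2?Q?=C8esk=FD_text?=")
print(subject)                        # the full decoded subject
print(len(subject))                   # number of characters
print(len(subject.encode("utf-8")))   # number of bytes once converted to UTF-8
```

The decoded subject is "Český text": 10 characters, but 12 bytes in UTF-8, so any code that mixes up character counts and byte counts will disagree about the string's length.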
Comment 1 Gary Ekker 2005-12-12 17:46:57 UTC
Michal, do you still see this with pine-4.64-5 in STABLE? Or Rudi, I know you reported it to me as well.

I am unable to duplicate it with the info you have provided.
Comment 2 Michal Svec 2005-12-13 12:57:19 UTC
I built current pine sources for 10.0-i386 (pine-4.64-4.1.rpm) and the problem still persists.
Comment 3 Michal Svec 2005-12-13 12:59:57 UTC
Created attachment 60404
screenshot

Here is a screenshot of the problem. Note the subject in the window: it should read 'Český text', but the last three letters are truncated.

The same happens when international letters are used in a sender's name.

I can send you a test mail if you like.
Comment 4 Gary Ekker 2005-12-13 16:30:41 UTC
Yes, please send me a test mail. I copied the subject of the email you mentioned above and couldn't reproduce the problem.
Comment 5 Michal Svec 2005-12-14 10:41:00 UTC
Done.

BTW I have in .pinerc:
  character-set="utf-8"
Comment 6 Ruediger Oertel 2005-12-14 11:55:04 UTC
here:
        no-pass-control-characters-as-is,
        no-pass-c1-control-characters-as-is,
character-set=UTF-8
Comment 7 Walter Haidinger 2005-12-23 17:29:03 UTC
I can confirm this bug with German umlauts and pine-4.63-9 after upgrading 
to 10.0.

FYI, I did not have any obvious problems with pine-4.61 from
http://www.suse.de/~bk/pine/FAQ.html on 9.3.
Comment 8 Michal Svec 2006-01-09 12:02:25 UTC
Any news on this?
Comment 9 Gary Ekker 2006-01-10 22:18:59 UTC
No news yet. Pine is my lowest priority, and right now I'm swamped with the feature freeze upon us. I hope to take a look in the next couple of weeks.
Comment 10 Walter Haidinger 2006-01-10 22:30:17 UTC
No news is bad news for my _primary_ MUA...
I wonder what changes broke it, as it worked before 
(see my previous comment #7).
Comment 11 Björn Voigt 2006-01-18 09:38:30 UTC
Created attachment 63743
Pine decoding sample email
Comment 12 Björn Voigt 2006-01-18 09:41:09 UTC
I noticed the problem too, but with more recent software:
- SuSE Linux 10.1 Beta1
- pine-4.64-5 

Here is a sample mail header:

  Subject: =?ISO-8859-1?Q?Hall=F6chen?=

Correct decoding should be:

  Subject: [ISO-8859-1] Hallöchen
  (from Pine 4.64 on FreeBSD ports, ISO-8859-1 terminal)

SuSE Linux instead cuts off the last letter "n" on an ISO-8859-1 terminal:

  Subject: *** GMX Spamverdacht *** Hallöche
  (pine-4.64-5, SuSE Linux, ISO-8859-1 terminal)

With a UTF-8 terminal, the last two letters "en" are missing:

  Subject: *** GMX Spamverdacht *** Hallöch
  (pine-4.64-5, SuSE Linux, UTF-8 terminal)

Because the decoding in the nearly unpatched Pine 4.64 on FreeBSD is correct, I suspect that the decoding problem on SuSE Linux is caused by a SuSE patch.

I attached the sample mail (with content and e-mail addresses changed) for further testing.
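
The two truncation lengths reported above can be set against the byte length of the decoded word in each terminal charset. The sketch below only does that arithmetic; the variable names are illustrative, the "shown" strings are copied from the report, and nothing here identifies which patch is at fault.

```python
from email.header import decode_header

# Decode the sample header value from this comment.
payload, charset = decode_header("=?ISO-8859-1?Q?Hall=F6chen?=")[0]
word = payload.decode(charset)

# Tail of the Subject line as displayed on each terminal, per the report.
shown = {
    "iso-8859-1": "Hallöche",
    "utf-8": "Hallöch",
}

lost = {}
for terminal_charset, displayed in shown.items():
    n_bytes = len(word.encode(terminal_charset))
    lost[terminal_charset] = len(word) - len(displayed)
    print(terminal_charset, "chars:", len(word), "bytes:", n_bytes,
          "chars lost:", lost[terminal_charset])
```

The word is 9 characters long and takes 9 bytes in ISO-8859-1 but 10 bytes in UTF-8. One more character is lost on the UTF-8 terminal than on the ISO-8859-1 terminal, which matches the one extra byte needed for "ö". That is consistent with a length being computed in one unit and applied in another, though it is only a hint.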
Comment 13 Walter Haidinger 2006-01-18 13:11:43 UTC
The spec file of pine-4.64-5.src.rpm shows a long patch list:

Patch10:      http://www.math.washington.edu/~chappa/pine/patches/pine4.64/all.patch.gz
Patch11:      charset-editorial.diff
Patch12:      all.patch-recallpat.diff
Patch14:      pico-ucs4all.patch
Patch15:      pico-ucs4GetKey.patch
Patch16:      pico-ucs4doublewidthchars.diff
Patch18:      pico-ucs4isspace.patch
Patch21:      pine-utf8-1b.patch
Patch22:      pine-utf8-1a-pine.h.patch
Patch23:      pine-utf8-1a-GFHP_HANDLES.patch
Patch24:      send-charset.patch
Patch26:      config-options.patch
Patch28:      iconv-no-explain.patch
Patch29:      utf8-mailindx.patch
Patch30:      mailindx-plusdraw.patch
Patch31:      gf_wrap-UTF8.patch
Patch32:      pine-no-stripwhitespace.patch
Patch33:      strings-iconv.patch
Patch34:      filter-iconv.patch
Patch35:      rfc1522_decode.patch
Patch36:      rfc1522_valid.patch
Patch41:      warnings.patch
# SuSE-specific patches last:
Patch50:      pine4.61.dif
Patch51:      pine-passfile.patch
Patch52:      pine-urlquote.patch
Patch53:      pine-body.patch
Patch55:      quell-displaying-flowed-text.patch.4.60
Patch56:      quell-flowed-text-default.patch
Patch59:      pine-talk_disallow.patch
Patch60:      pine-gcc4.patch
Patch61:      pine-missing-protos.patch
Patch62:      pine-few_arguments.patch
Patch63:      pine-use-rpm_opt_flags.patch

Which patches are included in the FreeBSD release?
Perhaps we can work out a list of "working" patches.

Or does anybody have a hint as to which of the patches above
might cause the truncation in the Subject line?
Comment 14 Michal Svec 2006-01-20 09:44:40 UTC
BTW, it also affects attachments: when you save an attachment with international characters in its name, the name is truncated as well.
Comment 15 Michal Svec 2006-01-20 09:55:53 UTC
Created attachment 64166
patches diff

Moreover, since it worked OK in 9.3 and does not in 10.0, something changed either in pine or, more likely, in our patches. Here is what changed, if anyone wants to have a look.
Comment 16 Walter Haidinger 2006-01-20 14:15:47 UTC
Just curious: Is there any problem compiling the 9.3 source rpm for SuSE 10?
Comment 17 Michal Svec 2006-01-20 14:52:32 UTC
Yeah, 10.0 contains GCC4 which would require additional patches/fixes.
Comment 18 Ruediger Oertel 2006-01-27 14:18:29 UTC
Hi, bk was just here. He has a patch for the problem and will supply it next week.
Comment 19 Walter Haidinger 2006-01-27 14:44:47 UTC
Great news! However, if he'd post the patch here, we could beta-test it!
Comment 20 Bernhard Kaindl 2006-02-09 14:17:17 UTC
The fix is in the current tree for the next betas/releases.
Until you install one of them, you can get it here:

A binary RPM (i386-10.1) plus the source for building with rpmbuild -bb is at ftp://ftp.suse.com/pub/people/bk/pine/4.64/2006-02-07/

The same package is also currently on the openSUSE mirrors at:
http://ftp.tu-chemnitz.de/pub/linux/opensuse/distribution/SL-OSS-factory/inst-source/suse/i586/pine-4.64-6.i586.rpm

Have fun! (And of course report any errors. There should not be any, but just in case.)
Comment 21 Walter Haidinger 2006-02-10 10:37:40 UTC
Thanks! Got the sources and built a binary RPM for 10.0:
http://members.kstp.at/wh/suse-10.0/pine-4.64-6.i586.rpm

Btw, pine.specs still has release=5. 
I think you've simply forgotten to bump it to 6, right?

Anyway, the Subject isn't truncated anymore and shows umlaut characters. :-)
Saving attachments with umlaut characters in the name also works, but only when the
filename is not modified:
When editing the filename in the save attachment dialog, the cursor is too far to the right, i.e. it has a wrong offset. I guess this is because each multi-byte Unicode character is not counted as a single character.
Comment 22 Bernhard Kaindl 2006-02-10 14:51:29 UTC
Right.

Regarding editing of file names which contain umlauts:

Thanks for pointing it out. Your guess is right for all European characters I know of. In addition, most Asian characters are double-width, so they take up two columns on screen.
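
To illustrate the difference between bytes, characters and screen columns described above, here is a minimal sketch in Python. It is not the Pine code; `display_width` is a hypothetical helper built on the standard `unicodedata` module, and the file names are made-up examples.

```python
import unicodedata


def display_width(text: str) -> int:
    """Approximate the number of terminal columns a string occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Combining marks are drawn on top of the previous character.
            continue
        # Wide (W) and fullwidth (F) characters take two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width


for name in ("Hallöchen.txt", "日本語.txt"):
    print(name,
          "bytes:", len(name.encode("utf-8")),
          "chars:", len(name),
          "columns:", display_width(name))
```

For "Hallöchen.txt" this gives 14 bytes, 13 characters and 13 columns; a cursor positioned by byte count lands one column too far to the right. For "日本語.txt" it gives 13 bytes, 7 characters and 10 columns, so neither the byte count nor the character count gives the right cursor column.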

Unfortunately, I have not had a chance to touch the functions used for this, which handle __all__ editing in the status line area. This means these functions have no awareness of UTF-8.

Changing this by adding UTF-8 awareness to this function would be a major undertaking, possibly as complex as adding UTF-8 awareness to the mail header editor of the composer. It could touch up to 800 lines of code whose workings I don't yet know in detail (I looked at the functions for the first time...), and the function is called from 98 places inside pine...

The function is optionally_enter() with its lower half line_paint() in pine/osdep/termin.gen

The function also has to allow editing strings which are longer than the editable area available on screen, so it wraps the text horizontally.

Putting this all together with the fact that we are quite late in the beta phase, and considering that it could need some test generations to mature, it makes no sense for me to attempt this during the release cycle.

It would be nice if you opened a new follow-up bug, since the description of this bug is about the truncated RFC2047/1522 mail headers and not about status line editing.

You can enter me as assignee (I have taken over maintaining pine again), with a severity of enhancement, to be considered for further research and implementation after the current release. I would like this comment to be referred to in the bug or pasted into it.
Comment 23 Walter Haidinger 2006-02-11 09:16:44 UTC
Created a follow-up bug regarding the editing problem: see bug #150076