Bug 150076

Summary: pine: editing in status line is not UTF-8 aware, cursor moves more than one position if entering non-ascii (e.g. umlaut) characters
Product: [openSUSE] SUSE Linux 10.1 Reporter: Walter Haidinger <walter.haidinger>
Component: Other Assignee: Bernhard Kaindl <bk>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None    
Version: Beta 3   
Target Milestone: Beta 4   
Hardware: All   
OS: SuSE Linux 10.1   
Whiteboard:
Found By: Other Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---

Description Walter Haidinger 2006-02-10 21:29:54 UTC
This is a follow-up of bug #121943, still existing in pine-4.64-6 for SuSE 10.1.

If you edit in the status line (e.g. when editing a filename for an attachment to save), the cursor jumps by more than one position when you enter non-ASCII characters (e.g. umlauts).

Quote from https://bugzilla.novell.com/show_bug.cgi?id=121943#c22

"Regarding editing of file names which contain umlauts:

Thanks for pointing it out, your guess is right for all European characters
which I know. In addition, most Asian characters are double-width, so they take
up two columns on screen.

Unfortunately, I had no chance to touch the functions which are used for this,
which are used for __all__ editing in the status line area. This means that
there is no awareness for UTF-8 in these functions.

Changing this, adding awareness of UTF-8 to this function would be a major
undertaking, with a possible complexity similar to adding the UTF-8 awareness
to the mail header editor of the composer, touching possibly up to 800 lines of
code which I don't yet understand in detail (I looked at the functions for
the first time...) and which is called from 98 places inside pine...

The function is optionally_enter() with its lower half line_paint() in
pine/osdep/termin.gen

The function also has to allow to edit strings which are longer than the
editable area available on screen, so it wraps the text horizontally."
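
To illustrate why a byte-oriented line editor moves the cursor too far, here is a minimal sketch in C. It is not taken from pine; the helper names utf8_char_count() and cursor_overshoot() are hypothetical, and the input is assumed to be well-formed UTF-8. It only covers the byte-versus-character mismatch, not double-width characters.

```c
#include <stddef.h>
#include <string.h>

/* Count characters in a UTF-8 string by skipping continuation
 * bytes (bit pattern 10xxxxxx). Assumes well-formed input. */
size_t utf8_char_count(const char *s)
{
    size_t count = 0;

    for (; *s; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}

/* Number of extra positions a cursor advances if it moves one
 * position per byte instead of one position per character. */
size_t cursor_overshoot(const char *s)
{
    return strlen(s) - utf8_char_count(s);
}
```

For an umlaut such as "ä" (bytes 0xC3 0xA4), strlen() reports 2 while utf8_char_count() reports 1, so a cursor that advances per byte moves one position too far for each such character.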

Also from bug #121943, the binary RPM and source to verify this bug:

Binary RPM (i386-10.1) plus source for building with rpmbuild -bb is at
ftp://ftp.suse.com/pub/people/bk/pine/4.64/2006-02-07/

Binary RPM built from sources above for SuSE 10.0:
http://members.kstp.at/wh/suse-10.0/pine-4.64-6.i586.rpm
Comment 1 Bernhard Kaindl 2006-02-14 14:24:23 UTC
An apology first: I think I was wrong by asking for this follow-up
bug to have severity enhancement (since it's not a new feature or so)
so I have set it to normal.

I had a chance to look closely at the two concerned functions and
found that a clean approach was possible (which I didn't expect at
first look) without having to rewrite the whole thing.

This clean approach allowed me to add the UTF-8 support in the same
way as I did for pico, pine's integrated (and standalone) editor.

I could also use the same UTF-8/UCS4 functions which I implemented
for pico, so the number of code lines which I had to implement from
scratch turned out to be pretty low. It was mostly applying the
same concepts and making similar changes.

Basically, because the main screen redraw function addressed
the editor buffer only as an array of characters (not relying
on string-specific functions like strlen, strcpy, strcmp too
heavily) I could turn the character array into an integer
array which stores 32-bit UCS-4 characters instead of 8-bit
bytes. Using the routines which I developed for pico, I could
quickly convert all input to the editor from UTF-8 to UCS-4
and all output from UCS-4 to UTF-8, for the rest of pine.
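
The general idea of converting at the editor boundary can be sketched as follows. This is not the code from the pine/pico patch; the function names utf8_to_ucs4() and ucs4_to_utf8() are hypothetical, and the decoder is simplified (it replaces malformed sequences with U+FFFD but does not reject overlong forms or surrogates).

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a NUL-terminated UTF-8 string into an array of UCS-4
 * code points, one array element per character. Returns the
 * number of code points stored (at most max). */
size_t utf8_to_ucs4(const char *src, uint32_t *dst, size_t max)
{
    const unsigned char *s = (const unsigned char *)src;
    size_t n = 0;

    while (*s && n < max) {
        uint32_t cp;
        int extra;                      /* continuation bytes expected */

        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else                          { cp = 0xFFFD;    extra = 0; }
        s++;

        while (extra > 0) {
            if ((*s & 0xC0) != 0x80) {  /* truncated sequence */
                cp = 0xFFFD;
                break;
            }
            cp = (cp << 6) | (*s & 0x3F);
            s++;
            extra--;
        }
        dst[n++] = cp;
    }
    return n;
}

/* Encode n UCS-4 code points as NUL-terminated UTF-8 into dst,
 * which holds cap bytes. Stops early if the next character would
 * not fit. Returns the number of bytes written, excluding the NUL. */
size_t ucs4_to_utf8(const uint32_t *src, size_t n, char *dst, size_t cap)
{
    size_t out = 0;

    if (cap == 0)
        return 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = src[i] > 0x10FFFF ? 0xFFFD : src[i];
        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;

        if (out + need >= cap)          /* keep room for the NUL */
            break;
        if (need == 1) {
            dst[out++] = (char)cp;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (cp >> 6));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (cp >> 12));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (cp >> 18));
            dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

With the buffer held as one integer per character, cursor movement, insertion and deletion can index array elements directly, and only the entry and exit points of the editor need to know about UTF-8.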

I could use valgrind to verify that I converted all places
from character-based access to integer-based UCS-4 code,
so it seems to work well, and I even dared to submit it
for the next beta, but additional testing would of course
be beneficial to have, just to confirm that there are no
bugs left.

Note: The functions newly converted to Unicode are called
from all places where the cursor enters the 3rd line from
the bottom (status line) for a prompt where you can edit
a string. This includes password entries, for example, and
also entering "W" (for Whereis) to search the currently
displayed mail index (or other pine screen) for a string
which you enter at the prompt. It is a generic prompting
function, so all places where you enter something at the
prompt line (or elsewhere, since in theory the line could be
somewhere else) should support Unicode. Other charsets
also work here.

Basically, you should notice nothing wrong, but please report
immediately if anything happens which should not happen, even
if you are not able to reproduce it completely. Again, it appears
to work for me, but it's better if more people test it.

Here is the source and a binary RPM for 10.0-i386:
http://suse.de/~bk/pine/4.64/2006-02-14

It also contains a quick fix for bug #150774, where the
mail recipients are replaced with
UNEXPECTED_DATA_AFTER_ADDRESS@.SYNTAX-ERROR.
on attempts to send mail. More on that in bug #150774.

Source without the quick fix for 150774 is in:
http://suse.de/~bk/pine/4.64/2006-02-13

I leave the bug open until we know that the fix indeed works.

Please report anything which you find during use/test.
Comment 2 Walter Haidinger 2006-02-14 22:25:22 UTC
Well... what can I say?
Nothing, other than: Wow, everything works! :-)

Installed http://www.suse.de/~bk/pine/4.64/2006-02-14/10.0-i386/pine-4.64-5.1.i586.rpm
and tested mails having umlaut characters in subject and attachment names.
Even searching for umlauts works!

IMHO really great work, should really deserve more than a tiny bump to -5.1 (!) release, by far!

Thanks, Walter
Comment 3 Bernhard Kaindl 2006-02-27 14:36:46 UTC
The fix is integrated into the current codebase for SUSE Linux 10.1, it's
the last changelog entry in this quoted changelog:
-------------------------------------------------------------------
rpm -q --changelog pine | head
* Mon Feb 20 2006 - bk@suse.de
- fix missing conversion when forwarding multipart/alternative.

* Tue Feb 14 2006 - bk@suse.de
- fix missing chars in base64-encoded headers (leftover of 121943)
- add UTF-8 support to the single-line editor which is used for
  prompts in the status line, e.g. for editing filenames (150076)
-------------------------------------------------------------------
Tests and feedback from testers have been positive, closing as fixed. Thanks!