Bugzilla – Bug 150076
pine: editing in status line is not UTF-8 aware, cursor moves more than one position if entering non-ascii (e.g. umlaut) characters
Last modified: 2006-02-27 14:36:46 UTC
This is a follow-up of bug #121943; the problem still exists in pine-4.64-6 for SuSE 10.1. If you edit in the status line (e.g. when editing the filename for an attachment to save), the cursor moves by more than one position when you enter non-ASCII characters (e.g. umlauts).

Quote from https://bugzilla.novell.com/show_bug.cgi?id=121943#c22

"Regarding editing of file names which contain umlauts: Thanks for pointing it out, your guess is right for all European characters which I know. In addition, most Asian characters are double-width, so they take up two columns on screen. Unfortunately, I had no chance to touch the functions which are used for this, which are used for __all__ editing in the status line area. This means that there is no awareness of UTF-8 in these functions. Changing this by adding UTF-8 awareness to this function would be a major undertaking, with a possible complexity similar to adding the UTF-8 awareness to the mail header editor of the composer, touching possibly up to 800 lines of code, of which I don't know yet how it works in detail (looked at the functions for the first time...) and which is called from 98 places inside pine... The function is optionally_enter() with its lower half line_paint() in pine/osdep/termin.gen The function also has to allow editing strings which are longer than the editable area available on screen, so it wraps the text horizontally."

Also from bug #121943, the binary RPM and source to verify this bug:
Binary RPM (i386-10.1) plus source for building with rpmbuild -bb is at ftp://ftp.suse.com/pub/people/bk/pine/4.64/2006-02-07/
Binary RPM built from sources above for SuSE 10.0: http://members.kstp.at/wh/suse-10.0/pine-4.64-6.i586.rpm
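The symptom reported here comes from three different "lengths" of the same string: bytes in UTF-8, characters, and screen columns. The following is a minimal Python sketch of that mismatch, not pine's code (which is C). The helper display_width() and the sample filenames are hypothetical, and the width rule is only an approximation based on the East Asian Width property.

```python
import unicodedata

def display_width(text: str) -> int:
    """Approximate number of terminal columns needed for text (hypothetical helper)."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        # East Asian Wide (W) and Fullwidth (F) characters take two columns
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width

# Sample filenames (assumed for illustration): ASCII, umlaut, double-width
samples = ("Anhang.txt", "\u00dcbung.txt", "\u65e5\u672c\u8a9e.txt")
for name in samples:
    raw = name.encode("utf-8")
    print(f"bytes={len(raw):2d} chars={len(name):2d} columns={display_width(name):2d}")
```

An editor that advances the cursor by one column per byte moves two columns for an umlaut (two bytes, one column) and three for a typical CJK character (three bytes, two columns), which matches the jumping cursor described above.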
An apology first: I think I was wrong to ask for this follow-up bug to have severity enhancement (since it's not a new feature or the like), so I have set it to normal.

I had a chance to look closely at the two functions concerned and found that a clean approach was possible (which I didn't expect at first look) without having to rewrite the whole thing. This clean approach allowed me to add the UTF-8 support in the same way as I did for pico, pine's integrated (and standalone) editor. I could also use the same UTF-8/UCS-4 functions which I implemented for pico, so the number of code lines which I had to write from scratch turned out to be pretty low. It was mostly applying the same concepts and making similar changes.

Basically, because the main screen redraw function addressed the editor buffer only as an array of characters (not relying too heavily on string-specific functions like strlen, strcpy, strcmp), I could turn the character array into an integer array which stores 32-bit UCS-4 characters instead of 8-bit bytes. Using the routines which I developed for pico, I could quickly convert all input to the editor from UTF-8 to UCS-4 and all output from UCS-4 back to UTF-8 for the rest of pine.

I could use valgrind to verify that I converted all places from character-based access to integer-based UCS-4 code, so it seems to work well, and I even dared to submit it for the next beta. Additional testing would of course be beneficial, just to confirm that there are no bugs left.
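To make the approach described in this comment concrete, here is a small Python sketch of an edit buffer that holds one integer (a UCS-4 code point) per character and converts from and to UTF-8 only at its boundaries. It is an illustration under assumptions, not the actual patch: the class PromptBuffer and its methods are hypothetical names, and the real code is C in pine/osdep/termin.gen.

```python
class PromptBuffer:
    """Hypothetical single-line edit buffer holding one integer per character."""

    def __init__(self, initial: bytes = b"") -> None:
        # Input boundary: decode UTF-8 bytes into UCS-4 code points.
        self.cells = [ord(ch) for ch in initial.decode("utf-8")]
        self.cursor = len(self.cells)  # cursor counts characters, not bytes

    def insert(self, utf8_input: bytes) -> None:
        """Insert UTF-8 encoded input at the cursor position."""
        points = [ord(ch) for ch in utf8_input.decode("utf-8")]
        self.cells[self.cursor:self.cursor] = points
        self.cursor += len(points)

    def move_left(self) -> None:
        """Move one whole character to the left, whatever its byte length."""
        self.cursor = max(0, self.cursor - 1)

    def backspace(self) -> None:
        """Delete the whole character before the cursor."""
        if self.cursor > 0:
            self.cursor -= 1
            del self.cells[self.cursor]

    def to_utf8(self) -> bytes:
        # Output boundary: encode the code points back to UTF-8.
        return "".join(chr(cp) for cp in self.cells).encode("utf-8")


buf = PromptBuffer(b"Gr")
buf.insert("\u00fc\u00dfe".encode("utf-8"))  # u-umlaut, sharp s, "e"
buf.backspace()                              # removes "e" as one unit
print(buf.cursor, len(buf.to_utf8()))        # character position vs. byte length
```

Because every cell is a full character, cursor movement and deletion never split a multi-byte sequence; only the two conversion points need to know about UTF-8.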
Note: The functions newly converted to Unicode are called from all places where the cursor enters the 3rd line from the bottom (the status line) for a prompt where you can edit a string. This includes password entries, for example, and even the "W" (Whereis) command, which searches the currently displayed mail index (or other pine screen) for a string which you enter at the prompt. It is a generic prompting function, so all places where you enter something at the prompt line (in theory the line could also be somewhere else) should support Unicode. If you use other charsets, this also works here.

Basically, you should notice nothing wrong, but please report immediately if anything happens which should not happen, even if you are not able to reproduce it completely. Again, it appears to work for me, but it's better if more people test it.

Here is the source and a binary RPM for 10.0-i386: http://suse.de/~bk/pine/4.64/2006-02-14

It also contains a quick fix for bug #150774, where the mail recipients are replaced with UNEXPECTED_DATA_AFTER_ADDRESS@.SYNTAX-ERROR. on attempts to send mail. More on that in bug #150774. Source without the quick fix for 150774 is in: http://suse.de/~bk/pine/4.64/2006-02-13

I am leaving the bug open until we know that the fix indeed works. Please report anything which you find during use/test.
Well... what can I say? Nothing, other than: Wow, everything works! :-) Installed http://www.suse.de/~bk/pine/4.64/2006-02-14/10.0-i386/pine-4.64-5.1.i586.rpm and tested mails having umlaut characters in subject and attachment names. Even searching for umlauts works! IMHO really great work, which deserves far more than a tiny bump to the -5.1 (!) release! Thanks, Walter
The fix is integrated into the current codebase for SUSE Linux 10.1; it is the last entry in the changelog quoted here:

-------------------------------------------------------------------
rpm -q --changelog pine | head
* Mon Feb 20 2006 - bk@suse.de
- fix missing conversion when forwarding multipart/alternative.
* Tue Feb 14 2006 - bk@suse.de
- fix missing chars in base64-encoded headers (leftover of 121943)
- add UTF-8 support to the single-line editor which is used for prompts in the status line, e.g. for editing filenames (150076)
-------------------------------------------------------------------

Tests and feedback from the tester have been positive, closing as fixed. Thanks!