Bug 145069

Summary: xterm || glibc || vi + KDE = Crash
Product: [openSUSE] SUSE Linux 10.1
Reporter: Dave Jarvis <dave>
Component: Basesystem
Assignee: Michael Matz <matz>
Status: RESOLVED INVALID
QA Contact: E-mail List <qa-bugs>
Severity: Critical
Priority: P5 - None
CC: kukuk
Version: Beta 1
Target Milestone: ---
Hardware: i586
OS: SuSE Linux 10.0
Whiteboard:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Dave Jarvis 2006-01-24 01:16:45 UTC
This is going to be a bit long-winded, but please bear with me.

------------------------------------------------
THE NIGHTMARISH INTRO
------------------------------------------------
I upgraded from Linux SUSE 9.3 to SUSE 10.0 a while back because my Linux box kept crashing (usually while editing a syntax-highlighted file with vim, or doing a colourful "ls -laR /"). I can also segfault "ls" sometimes, but that's another matter entirely.

The upgrade did not help. Whenever I scroll, type, or do anything short of staring at Konsole, it disappears, leaving me, at best, with a KDE debug report window.

At worst, the computer either locks up entirely, or X/KDE does a warm restart. When this happens three times a day it seriously hampers my development progress. In the ten years I have used Linux, I have never had to reboot so many times as I have these last two years. Depressing!

So I switched from Konsole to xterm and another clue surfaced ...

------------------------------------------------
THE SORDID CONSOLES
------------------------------------------------
Although xterm never disappears, the computer still locks up, or X/KDE does a warm restart. The xterm app is set to approximately the same window size, font, font size, and colours as Konsole. However, I do not believe either xterm or Konsole is at fault. Since both applications produce the same end result, I believe the error may be at a lower level.

Two side issues:
    1. I cannot crash xterm as I can Konsole, using "ls -laR /".
    2. Sometimes "ls" crashes during "ls -laR /".

I received a clue that partially confirms my suspicions about the error.

------------------------------------------------
THE MYSTERIOUS CLUES
------------------------------------------------
A) At first, I believed it was Konsole that was causing the computer to go astray. As I am still seeing this behaviour with xterm, I believe it is safe to eliminate Konsole as the source. (Konsole still has some bugs, but no nasty-bads.)

B) Today, as I was launching vim, my computer locked up. BUT! It locked up scant milliseconds after vim cleared the screen to load the source file. Before vim could do anything more, the lock up occurred, and I read this at the bottom of the terminal window:

*** glibc detected free() ***: invalid next size (fast): 0x08164c58

C) In summary, here are the clues to this mystery:

    a. Konsole crashes regularly for me.
    b. xterm does not crash on me.
    c. Computer locks up (or restarts X/KDE) ~thrice daily.
    d. Lock up now occurs primarily when using vim.
    e. Error message was captured.

D) This "glibc" error message leaves me with a feeling that it might not be konsole, xterm, or vim with a nasty-bad. If glibc has the bug, that might help explain the lock ups. As neither xterm nor vim is directly tied to KDE (though xterm is tied indirectly via X), having the glibc library fail could cause a bad-nasty, no?

------------------------------------------------
THE DISPARATE CONCLUSION
------------------------------------------------
I have no idea how to look for this bug, how to concisely describe this bug, where to submit this bug, how to start debugging it, or who to ask for help to track it down.

If any of you gents could point me in the right direction, I'd be mighty, mighty, mighty glad to help rid the world of this nasty-bad.

Find below the versions of the software packages I am using.

Sincerely,
Dave Jarvis

P.S.
I could not figure out how to compile glibc 2.3.6 with the linuxthreads option enabled. This should be clarified in the documentation. Steps and examples would be wonderful ...

------------------------------------------------
THE SOFTWARE VERSIONS
------------------------------------------------
$ konsole --version
Qt: 3.3.5
KDE: 3.5.0 Level "a"
Konsole: 1.6

$ xterm -version
X.Org 6.8.2(208)

$ vim --version
VIM - Vi IMproved 6.3 (2004 June 7, compiled Jan 23 2006 14:59:06)
Compiled by jarvisd@jaguar
Normal version with GTK2 GUI.  Features included (+) or not (-):
-arabic +autocmd +balloon_eval +browse +builtin_terms +byte_offset +cindent +clientserver +clipboard +cmdline_compl +cmdline_hist +cmdline_info +comments +cryptv -cscope +dialog_con_gui +diff +digraphs +dnd -ebcdic -emacs_tags +eval +ex_extra +extra_search -farsi +file_in_path +find_in_path +folding -footer +fork() -gettext -hangul_input +iconv +insert_expand +jumplist -keymap -langmap +libcall +linebreak +lispindent +listcmds +localmap +menu +mksession +modify_fname +mouse +mouseshape -mouse_dec +mouse_gpm -mouse_jsbterm -mouse_netterm +mouse_xterm +multi_byte +multi_lang +netbeans_intg -osfiletype +path_extra -perl +postscript +printer -python +quickfix -rightleft -ruby +scrollbind +signs +smartindent -sniff +statusline -sun_workshop +syntax +tag_binary +tag_old_static -tag_any_white -tcl +terminfo +termresponse +textobjects +title +toolbar +user_commands +vertsplit +virtualedit +visual +visualextra +viminfo +vreplace +wildignore +wildmenu +windows +writebackup +X11 -xfontset +xim +xsmp_interact +xterm_clipboard -xterm_save
   system vimrc file: "$VIM/vimrc"
     user vimrc file: "$HOME/.vimrc"
      user exrc file: "$HOME/.exrc"
  system gvimrc file: "$VIM/gvimrc"
    user gvimrc file: "$HOME/.gvimrc"
    system menu file: "$VIMRUNTIME/menu.vim"
  fall-back for $VIM: "/usr/share/vim"
Compilation: gcc -c -I. -Iproto -DHAVE_CONFIG_H -DFEAT_GUI_GTK -I/usr/include/cairo -I/usr/include/freetype2 -I/usr/X11R6/include -I/usr/include/libpng12 -I/opt/gnome/include/gtk-2.0 -I/opt/gnome/lib/gtk-2.0/include -I/opt/gnome/include/atk-1.0 -I/opt/gnome/include/pango-1.0 -I/opt/gnome/include/glib-2.0 -I/opt/gnome/lib/glib-2.0/include     -g -O2  -I/usr/X11R6/include
Linking: gcc  -L/usr/X11R6/lib   -L/usr/local/lib -o vim -L/usr/X11R6/lib -L/opt/gnome/lib -lgtk-x11-2.0 -lgdk-x11-2.0 -latk-1.0 -lgdk_pixbuf-2.0 -lpangocairo-1.0 -lpango-1.0 -lcairo -lgobject-2.0 -lgmodule-2.0 -lglib-2.0 -lfreetype -lfontconfig -lXrender -lXext -lpng12 -lz -lglitz -lm   -lXt -lncurses -lgpm

# ldconfig -v | grep glibc
/lib:
        libc.so.6 -> libc-2.3.5.so
/lib/tls: (hwcap: 0x8000000000000000)
        libc.so.6 -> libc-2.3.5.so
/lib/i686: (hwcap: 0x8000000000000)
        libc.so.6 -> libc-2.3.5.so

$ kde --version
KDE seems to be already running on this display.

DOH! Running KDE version 3.5

------------------------------------------------
THE HARDWARE STUFF
------------------------------------------------
CPU: AMD 1.4GHz
LAN: ndiswrapper for D-Link 650+ Wireless NIC
AUDIO: Onboard SoundBlaster compatible
VIDEO: Radeon 7500 LE
MONITOR: ViewSonic VX2000 20.1" LCD
MOUSE: Logitech MarbleMouse (USB -> PS/2)
Comment 1 Thorsten Kukuk 2006-01-24 02:34:37 UTC
The glibc message means that your memory was corrupted. This could be an application problem or a memory problem. Since nobody else is seeing or reporting this, and given the kind of message, it looks like a hardware problem, most likely bad memory.
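To illustrate what an "invalid next size" complaint is about, here is a toy model of the kind of check an allocator can make when a block is freed. It is only a sketch and not glibc's actual code: the heap layout, the 4-byte size header, and the names make_heap, write_user_data, and toy_free are all made up for this example. The point is that the allocator keeps a size field next to the user's data, so an application writing past the end of its block (or a flipped bit in RAM) can damage the neighbouring block's size, and the damage is only noticed later, at free() time.

```python
import struct

HEADER = 4       # toy size field: one 32-bit little-endian integer (assumption)
HEAP_SIZE = 64   # total bytes in the toy heap (assumption)


def make_heap():
    """Lay out two adjacent 16-byte chunks, each starting with its own size."""
    heap = bytearray(HEAP_SIZE)
    struct.pack_into("<I", heap, 0, 16)    # chunk A: header at offset 0
    struct.pack_into("<I", heap, 16, 16)   # chunk B: header at offset 16
    return heap


def write_user_data(heap, chunk_offset, data):
    """Copy data into a chunk's payload with no bounds check."""
    start = chunk_offset + HEADER
    heap[start:start + len(data)] = data


def toy_free(heap, chunk_offset):
    """Refuse to free a chunk whose neighbour has an implausible size."""
    size = struct.unpack_from("<I", heap, chunk_offset)[0]
    next_size = struct.unpack_from("<I", heap, chunk_offset + size)[0]
    if next_size <= 2 * HEADER or next_size >= HEAP_SIZE:
        raise RuntimeError("free(): invalid next size: %#x" % next_size)


if __name__ == "__main__":
    heap = make_heap()
    write_user_data(heap, 0, b"A" * 12)   # fits in chunk A's 12-byte payload
    toy_free(heap, 0)                     # passes the check

    write_user_data(heap, 0, b"A" * 16)   # 4 bytes too many: hits chunk B's header
    try:
        toy_free(heap, 0)
    except RuntimeError as err:
        print(err)
```

In this model the check cannot tell who damaged the header. A buggy program and faulty memory produce the same message, which is why the message alone does not settle the question.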
Comment 2 Dave Jarvis 2006-01-24 10:27:06 UTC
After running memtest86 (version 3.2) for 1.5 hours, my computer showed no problems of any kind with memory.

I have used this computer, without changing the RAM, to run a few different operating systems and Linux flavours, including:

  FreeBSD
  Mandrake Linux
  RedHat Linux
  SUSE 9.3 Professional
  SUSE 10.1 Personal

I have had it for a few years now, and none of the previously mentioned operating systems, except the SUSE distros, would lock up.

Also, the fact that the lock up only occurs while I'm editing in 'vi', or scrolling an xterm/Konsole window could be a strong indicator of a software issue, not a hardware problem. This very computer was running 24/7 while I was in Japan last year, with Mandrake Linux. I could VNC in and out to my heart's content. Every night it would reset the network connection, but in the three weeks I was there, I never once had to remotely reboot the machine.

So, the fact that I never had these issues before installing SUSE and the fact that memtest86 ran for 1.5 hours without detecting a hardware problem lead me to conclude there is a bug.

Where do we go from here?
Comment 3 Dr. Werner Fink 2006-01-24 10:59:57 UTC
Hmmm ... besides hardware problems, it could also be a problem
with gcc, which should produce code for i586 but tune it
for i686. Besides this, the string/memory assembler
functions could also be a problem if there is only one
version, which is for i686 only.  But AFAIK there is no such
problem today.
Comment 4 Andreas Kleen 2006-01-24 12:43:32 UTC
It sounds very much like a hardware problem to me. There can be many hardware problems that are not detected by memtest86. 

The glibc overflow problem likely isn't hardware-related, but it shouldn't cause kernel hangs.
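For a quick check from inside the running system, a user-space pattern test can be sketched roughly as below. This is an illustration only, and the function names pattern_pass and run_passes are made up here. It writes a few byte patterns into a buffer and reads them back, so it only exercises whatever pages the operating system happens to hand to the process. A clean result rules out faulty hardware no more than a clean memtest86 run does.

```python
def pattern_pass(buf, pattern):
    """Fill buf with one byte value, read it back, return mismatching offsets."""
    size = len(buf)
    expected = bytes([pattern]) * size
    buf[:] = expected
    if buf == expected:
        return []
    # Slow path, only taken when something did not read back as written.
    return [i for i in range(size) if buf[i] != pattern]


def run_passes(size_bytes, passes=1):
    """Run the classic 0x00 / 0xFF / 0x55 / 0xAA patterns; return the error count."""
    buf = bytearray(size_bytes)
    errors = 0
    for _ in range(passes):
        for pattern in (0x00, 0xFF, 0x55, 0xAA):
            errors += len(pattern_pass(buf, pattern))
    return errors


if __name__ == "__main__":
    # 1 MiB keeps the demo fast; a real test would use far more memory
    # and run for much longer.
    print("mismatches:", run_passes(1024 * 1024))
```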

Comment 5 Michael Matz 2006-01-24 14:51:24 UTC
I'm sorry, but this relatively clearly is a hardware problem.  It might
have surfaced only recently which would explain why the machine was working
fine with earlier operating systems.  Additionally it might also be a hardware
problem which can only be triggered by certain hardware settings, i.e.
theoretically a work-around could exist, which could have been used in older
software by accident (for instance by simply not using newer settings),
but that's really just speculation.

If it were a problem in either the compiler or the glibc, or any software
problem at all, then it would crash/break/lock-up deterministically, each time
at the same place.  So this is nothing we can help with, except giving advice
to buy new hardware, as unhelpful as it may sound :-/
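One way to gather evidence on how reproducible a failure is would be to run the same command many times and tally how each run ends. The sketch below assumes a POSIX system, and the helper name tally_exits is made up for this example. It only records the exit status, not where in the program a crash happened, so a core dump or debugger backtrace would still be needed to compare crash locations.

```python
import subprocess
import sys
from collections import Counter


def tally_exits(cmd, runs=5):
    """Run cmd repeatedly and count how often each exit status occurs.

    On POSIX, subprocess reports a child killed by signal N as return
    code -N, so a segfault (SIGSEGV, signal 11) shows up as -11.
    """
    counts = Counter()
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        counts[result.returncode] += 1
    return counts
```

For example, tally_exits(["ls", "-laR", "/"], runs=20) would count how many of twenty runs of the listing from the report exit normally and how many die from a signal.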