Bugzilla – Bug 340459
fglrx Driver Locks System
Last modified: 2008-06-19 20:53:16 UTC
Ok Stefan, here goes:
    system: Toshiba P35-S629, P4 3.33GHz, 1G RAM (64 Meg shared video)
    video card: ATI MOBILITY RADEON 9600/9700 Series
    kernel: 2.6.22.12-0.1-default
    Xorg 7.3: Yast install, download.opensuse.org/repositories/xorg73/openSUSE_10.3
    Related Bugs: 338930; 338947
The fglrx driver will run and drive the display beautifully the first time. Any subsequent attempt to load the graphics system hardlocks the system.
With xorg 7.2, kdm login was possible, but logout would hardlock the system prior to return to the kdm login screen. Compiz would run with fusion-icon.
With xorg 7.3, kdm login is not possible; the screen freezes before the login screen is reached. Booting to runlevel 3 and then issuing 'startx' will start kde, and kde will run in a stable manner. Compiz will not run on xorg 7.3. Exiting kde to runlevel 3 yields the following error:
    QApplication::postEvent: Unexpected null receiver
    startkde: Shutting down...
    warning: leaving MCOP Dispatcher and still 9 object references alive.
     - Arts::SampleStorage
     - Arts::Synth_MULTI_ADD
     - Arts::Synth_MULTI_ADD
     - Arts::Synth_PLAY
     - Arts::StereoVolumeControl
     - Arts::StereoEffectStack
     - Arts::Synth_BUS_DOWNLINK
     - Arts::SoundServerV2
     - Arts::MidiManager
    warning: leaving MCOP Dispatcher and still 75 types alive.
    sound server terminated
    klauncher: Exiting on signal 1
    startkde: Running shutdown scripts...
    startkde: Done.
    waiting for X server to shut down
    Synaptics DeviceOff called
    (EE) fglrx(0): [drm] failed to remove DRM signal handler
    21:48 Rankin-P35a~/linux>
The fglrx driver gives good frame rates, 150% of that given by the stock radeon driver:
    21:45 Rankin-P35a~> glxgears
    13317 frames in 5.0 seconds = 2663.297 FPS
    21:45 Rankin-P35a~> fgl_glxgears
    Using GLX_SGIX_pbuffer
    2623 frames in 5.0 seconds = 524.600 FPS
That is a fair summary of the symptoms and scenarios involved. Bugs 338930 and 338947 contain additional detail about the libraries involved and problems with /usr/lib/libIndirectGL.so.1.2, /usr/lib/libGL.so.1.2 and /usr/X11R6/lib/libGL.so.1.2. I'm attaching the Xorg.0.log and xorg.conf under a separate post. Take a look and let me know what you want me to send and test. I'm more than ready to help you get to the bottom of this one too. Also, if it becomes necessary, I'll be glad to provide you direct access to this box via remote X or ssh, your call. Thanks for your hard work.
Created attachment 182740 [details] xorg.conf for fglrx driver
Created attachment 182741 [details] Xorg.0.log for fglrx driver
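For anyone triaging a log like the attached Xorg.0.log, a quick first pass is to pull out only the error and warning lines. This is a minimal sketch and not part of the original report: the function name xorg_problems is made up for illustration, and it assumes the (EE)/(WW) markers sit at the start of each line, as in the excerpts quoted in this report.

```shell
# Print only the error (EE) and warning (WW) lines of an X server log.
# xorg_problems is a hypothetical helper name, not an X.Org tool.
# Assumption: the marker is the first thing on the line.
xorg_problems() {
    grep -E '^\((EE|WW)\)' "$1"
}
```

It would typically be called as: xorg_problems /var/log/Xorg.0.log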
The symptoms are exactly the same regardless of whether the fglrx driver used is from the Yast install of the www2.ati.com drivers:
    22:07 Rankin-P35a~/linux> rpm -qa | grep fglrx
    ati-fglrxG01-kmp-default-8.42.3_2.6.22.9_0.4-1.1
    x11-video-fglrxG01-8.42.3-2.1
or the driver built with:
    sh ati-driver-installer-8.42.3-x86.x86_64.run --buildpkg SuSE/SUSE103-IA32
---
Additionally, ATI Control Center does not work with either the radeon or fglrx driver.
Here is an additional tidbit. While trying to start compiz from the command line, I get the following error:
    # compiz --replace --sm-disable --ignore-desktop-hints ccp --indirect-rendering --no-libgl-fallback
    compiz (core) - Fatal: GLX_EXT_texture_from_pixmap is missing
    compiz (core) - Error: Failed to manage screen: 0
    compiz (core) - Fatal: No manageable screens found on display :0.0
Evidently this is a libGL issue. Just thought I would pass it along.
This is what happens when you specify --no-libgl-fallback. Background: Bug #234154.
(In reply to comment #5 from Stefan Dirsch)
> This is what happens when you specify --no-libgl-fallback. Background: Bug
> #234154.
>
I did. See comment #4 and the command line I was using. It really doesn't matter though. Seg faults or errors no matter what:
    # compiz --replace ccp &
    12:31 Rankin-P35a~> compiz --replace ccp &
    [1] 4518
    12:34 Rankin-P35a~> compiz: Trying '/usr/$LIB/libIndirectGL.so.1'
    ERROR: ld.so: object '/usr/$LIB/libIndirectGL.so.1' from LD_PRELOAD cannot be preloaded: ignored.
    [1]+ Segmentation fault compiz --replace ccp
    12:37 Rankin-P35a~>
(Keyboard input is dead, only the mouse will work. fusion-icon fails on the first attempt; a second attempt using the run dialog, with fusion-icon in the history list, restores kwin and the keyboard.)
-- NEXT TEST --
    12:37 Rankin-P35a~> compiz --replace ccp --no-libgl-fallback &
    [1] 4566
    12:40 Rankin-P35a~> compiz (core) - Fatal: GLX_EXT_texture_from_pixmap is missing
    compiz (core) - Error: Failed to manage screen: 0
    compiz (core) - Fatal: No manageable screens found on display :0.0
    [1]+ Exit 1 compiz --replace ccp --no-libgl-fallback
-- NEXT TEST --
    12:41 Rankin-P35a~> compiz --replace ccp --indirect-rendering --no-libgl-fallback
    compiz (core) - Fatal: GLX_EXT_texture_from_pixmap is missing
    compiz (core) - Error: Failed to manage screen: 0
    compiz (core) - Fatal: No manageable screens found on display :0.0
    12:42 Rankin-P35a~>
-- NEXT TEST --
    12:42 Rankin-P35a~> compiz --replace --sm-disable ccp --no-libgl-fallback
    compiz (core) - Fatal: GLX_EXT_texture_from_pixmap is missing
    compiz (core) - Error: Failed to manage screen: 0
    compiz (core) - Fatal: No manageable screens found on display :0.0
-- NEXT TEST --
    12:44 Rankin-P35a~> compiz --replace --sm-disable --ignore-desktop-hints ccp --no-libgl-fallback
    compiz (core) - Fatal: GLX_EXT_texture_from_pixmap is missing
    compiz (core) - Error: Failed to manage screen: 0
    compiz (core) - Fatal: No manageable screens found on display :0.0
-- NEXT TEST --
    12:46 Rankin-P35a~> compiz --replace ccp --indirect-rendering
    compiz: Trying '/usr/$LIB/libIndirectGL.so.1'
    ERROR: ld.so: object '/usr/$LIB/libIndirectGL.so.1' from LD_PRELOAD cannot be preloaded: ignored.
    Segmentation fault
-- NEXT TEST --
    # compiz --replace --sm-disable --ignore-desktop-hints ccp --indirect-rendering --no-libgl-fallback
This command line was derived from looking at the output of 'ps ax' when compiz was running under xorg 7.2. Do you want me to try rearranging the options in another fashion? Both the radeon and ati drivers are broken and were just as broken when I was running xorg 7.2. The big mystery to me is why I could get compiz running under xorg 7.2 and it chokes under 7.3.
Sorry, I misread your post, but the additional examples are hopefully helpful. The point I wanted to make was that THIS is the command line that WORKED with xorg 7.2: compiz --replace --sm-disable --ignore-desktop-hints ccp --indirect-rendering --no-libgl-fallback
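The "NEXT TEST" series above can be repeated less painfully with a small loop that runs each candidate command line and records its exit status. This is only a sketch under stated assumptions: try_each is a hypothetical helper name, each argument is assumed to be one complete command line with no quoted arguments inside it, and a compiz invocation that starts successfully will stay in the foreground until it is stopped.

```shell
# Run each argument as one complete command line and print its exit status.
# try_each is a hypothetical helper. In bash, a status of 139 means the
# process died from SIGSEGV (128 + signal 11).
try_each() {
    for te_cmdline in "$@"; do
        # Left unquoted on purpose so the string is split into words.
        $te_cmdline >/dev/null 2>&1
        te_status=$?
        printf '%3d  %s\n' "$te_status" "$te_cmdline"
    done
}
```

A run over two of the variants from this report might look like: try_each "compiz --replace ccp" "compiz --replace ccp --no-libgl-fallback"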
I think you switched (better say tried to ...) from Xgl to AIGLX with X.Org 7.2. Xgl is still available on openSUSE 10.3. Anyway, this issue should be investigated in Bug #338947.
Stefan, As usual you are right on the AIGLX. However, (as my limited synapses connect to recent memory) XGL never worked on my system. It wasn't until the 8.42 driver or xorg 7.2 came out that allowed using AIGLX that I ever got compiz to work. No matter what I tried with XGL and no matter how much coaching I got on the list (Ben and CyberOrg particularly), XGL would never work on this Toshiba laptop. With respect to the fglrx issue, I saw that the bug was forwarded to ATI on November 8, 2007, but to date I have neither seen nor received any post or inquiry from the ATI side of the house. Is this the standard modus operandi for ATI? Do I need to send them something directly to help foster a closer look? Lastly, can you think of any other output I could post or any additional hoop I can jump through to dig further into the fglrx issue?
ATI works on at most 5 bugs in Novell's bugzilla at a time. Given that there are currently about 45 bugs assigned to ATI, it can take some time until your issue gets addressed. It's unlikely that you will ever see a comment by ATI on this bug report. The best you can do is to try different driver versions to get one of them working, always give the latest version a try, and update this bug report from time to time.
*** Bug 341701 has been marked as a duplicate of this bug. ***
Could you verify whether this issue is still reproducible with release 8.44?
Driver download: http://ati.amd.com/support/drivers/linux/linux-radeon.html
Installation instructions: http://www.suse.de/~sndirsch/ati-installer-HOWTO.html#manual
I downloaded ati-driver-installer-8.443.1-x86.x86_64.run and built the rpm fglrx64_7_1_0_SUSE103-8.443.1-1.x86_64.rpm. After installation and a reboot to runlevel 3 I ran "sax2 -r -m 0=fglrx". I got an error message telling me to look in /var/log/Xorg.99.log. This file will be attached.
Created attachment 188612 [details] file is /var/log/Xorg.99.log
To make it all clear: this driver completely fails.
Indeed.
    [...]
    (II) fglrx(0): POWERplay version 3. 1 power state available:
    (II) fglrx(0): 1. 250/196MHz @ 50Hz [enable load balancing]
    Backtrace:
    0: /usr/sbin/xw(xf86SigHandler+0x6d) [0x492aad]
    1: /lib64/libc.so.6 [0x2b2dafa5dbd0]
    Fatal server error:
    Caught signal 11. Server aborting
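To find a crash like the one quoted above without reading a whole log, the crash section can be cut out with awk. This is a sketch only: xorg_backtrace is a hypothetical helper name, and it assumes the log contains a line starting with "Backtrace:" and later a "Fatal server error:" line followed by the signal message, as the attached Xorg.99.log appears to.

```shell
# Print the crash section of an X server log: from the "Backtrace:" line
# through the line that follows "Fatal server error:".
# xorg_backtrace is a hypothetical helper name.
xorg_backtrace() {
    awk '/^Backtrace:/          { show = 1 }
         show                   { print }
         /^Fatal server error:/ { last = 1; next }
         last                   { exit }' "$1"
}
```

For the log mentioned in this comment the call would be: xorg_backtrace /var/log/Xorg.99.log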
Oh well, the bug has been hijacked. Setting NEEDINFO again to original reporter.
We have downgraded xorg to 7.2 and installed the fglrx 8.43 drivers, and compiz is working again. However, the SONAME problem is still with us. The config for getting compiz to work remains the same, requiring altering libGL and libIndirect. The following script is needed each time suseConfig is run (a pain, but better than entering it manually every time):
    #!/bin/bash
    # Show the current libGL / libIndirectGL layout, then optionally relink it.
    echo -e "\n *** /usr/lib/libGL Config \n"
    ls -al /usr/lib/libGL.so*
    echo -e "\n *** /usr/lib/libIndirect Config \n"
    ls -al /usr/lib/libIn*
    echo -e '\n'
    read -p "Alter libGL and libIndirect to fix SONAME? [y/n]: " key
    echo -e "\n"
    if [ "$key" = "y" ]; then
        thepwd=$(pwd)
        cd /usr/lib || exit 1
        # Remove the existing libGL.so.1 and libGL.so.1.2 entries, then
        # recreate them as links ending at /usr/X11R6/lib/libGL.so.1.2.
        unlink libGL.so.1
        unlink libGL.so.1.2
        ln -s libGL.so.1.2 libGL.so.1
        ln -s /usr/X11R6/lib/libGL.so.1.2 libGL.so.1.2
        # Drop the libIndirectGL.so.1 link if it exists.
        if [ -h libIndirectGL.so.1 ]; then
            unlink libIndirectGL.so.1
        fi
        cd "$thepwd" || exit 1
        # Show the result and the tail of the X configuration.
        echo -e "\n *** /usr/lib/libGL Config \n"
        ls -al /usr/lib/libGL.so*
        echo -e "\n *** /usr/lib/libIndirect Config \n"
        ls -al /usr/lib/libIn*
        tail -n24 /etc/X11/xorg.conf
    else
        echo -e "libGL and libIndirect remain UNCHANGED \n"
    fi
After adjusting the libraries, compiz works just fine if started with fusion-icon. (A slight pain, but the benefit is that it keeps all the taskbar icons from being scattered all over multiple desktops.)
The driver lock issue remains exactly the same. However, with the 8.43 driver I can set runlevel 5 as default in inittab and boot to the kdm menu and then launch kde or kde/compiz. You can NOT log out without the display hardlocking the system, but you can shut down cleanly if you remember to "shutdown" all sessions and never "logout" of a session. Hopefully this will be fixed with the 8.44 driver. Where are we on this issue? Can I send you anything else? Keep up the great work, we are getting there. Thanks!
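Before relinking anything, it can help to see which SONAME each libGL copy actually carries. The sketch below assumes that the "SONAME problem" in this report refers to the ELF SONAME entry recorded in the library, which is an assumption on my part; it relies on the dynamic section printed by objdump -p from binutils, and soname_from_dump is a hypothetical helper name.

```shell
# Read "objdump -p" output on stdin and print the SONAME entry, if any.
# soname_from_dump is a hypothetical helper name.
soname_from_dump() {
    awk '$1 == "SONAME" { print $2 }'
}
```

With binutils installed, the two copies discussed here could be compared with: objdump -p /usr/lib/libGL.so.1.2 | soname_from_dump and objdump -p /usr/X11R6/lib/libGL.so.1.2 | soname_from_dump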
David, I asked you to test 8.44. And now you send me the results for 8.43. :-(
Sorry Stefan, I'll go install and try right now. I've been busy assembling Pink Barbie cars, airsoft guns, Butterscotch the mechanical pony, etc.... I'll have results within the hour!
Stefan, 2 steps forward, but 3 steps back. These are the results of the 8.44 installs (installed both by generating the packages with --buildpkg and also with automatic install).
The 2 improvements: catalyst control center works!, and I can log in and log out without a lockup (only after using the automated install).
The 3 steps back: The driver will not run at 1440x900 even though ccc sees 1440x900 as the max resolution for the monitor on my laptop. (see: http://www.3111skyline.com/download/screenshot/compiz/ccc_displayInfo.jpg ) However, the driver resolution is now limited to 1280x800. (see: http://www.3111skyline.com/download/screenshot/compiz/ccc_display1280.jpg ) Another step back is that the new driver install removes /usr/X11R6/lib/libGL.so.1.2 and all other libs leaving only the "modules" directory under /usr/X11R6/lib/. I don't know if this is a bug or feature, but it doesn't work.
I have reverted back to 8.43 and I have 1440x900 and compiz working as usual. Let me know what tests you want me to run to find out more. You guys are definitely on the right track with the lockup issue, but now we need to get the 1440x900 resolution back. At 1280x800, the fonts are all squished and all window sizes and positions are hosed. Give me your thoughts and any requests and I'll get you whatever info you would like. Thanks!
Ok. So the initial issue has been fixed. The resolution problem is a known issue. See http://ati.cchtml.com/show_bug.cgi?id=939. Let's track it there.
>Another step back is that the new driver install removes
>/usr/X11R6/lib/libGL.so.1.2 and all other libs leaving only the "modules"
>directory under /usr/X11R6/lib/. I don't know if this is a bug or feature,
>but it doesn't work.
I don't think this is the SUSE RPM install. It's not recommended to use the non-SUSE RPM install on openSUSE.
David Rankin wrote:
>There were no 8.44 SuSE rpms. That's why I had to download the ATI
>installer for the test. I would have much preferred the new rpms. Do you
>have a link to where they are hidden? They are not on www2.
I'm talking about the SUSE RPMs you can create with the installer.
Stefan, I guess it was using the ATI installer that wiped the libs out. Recall I "(installed both by generating the packages with --buildpkg and also with automatic install)" I built the suse rpms for the first install and then removed them and did the auto install to see if it made any difference on the resolution issue. No difference, but I bet you are right and it was the auto install that caused the problems.
The driver lock issue remains in the 8.45 fglrx driver. Why logout worked in 8.44 and not 8.45 is a mystery unless fixing the resolution bug reintroduced the lock issue. There is good news to report though, the SONAME problem is fixed! I can now run with all libs in place:
    *** /usr/lib/libGL Config
    lrwxrwxrwx 1 root root 19 2008-01-26 12:31 /usr/lib/libGL.so -> /usr/lib/libGL.so.1
    lrwxrwxrwx 1 root root 12 2008-01-26 12:47 /usr/lib/libGL.so.1 -> libGL.so.1.2
    lrwxrwxrwx 1 root root 27 2008-01-29 12:55 /usr/lib/libGL.so.1.2 -> /usr/X11R6/lib/libGL.so.1.2
    -rwxr-xr-x 1 root root 391344 2008-01-29 11:53 /usr/lib/libGL.so.1.2.sav
    *** /usr/lib/libIndirect Config
    lrwxrwxrwx 1 root root 20 2008-01-26 13:50 /usr/lib/libIndirectGL.so.1 -> libIndirectGL.so.1.2
    -rwxr-xr-x 1 root root 440676 2007-09-21 20:34 /usr/lib/libIndirectGL.so.1.2
Similar to the 8.43 driver, I can boot to runlevel 5 and then launch kde or kde/compiz, but the hardlock problem remains. You can NOT log out without the display hardlocking the system just before the kdm menu appears, but you can shut down cleanly if you remember to always "shutdown" and never "logout" of a session.
The hardlock occurs: (1) regardless of whether only a default kde session is started before logout or whether compiz is started during the session. It makes absolutely no difference; and (2) regardless of whether /usr/lib/libIndirectGL.so.1.2 or /usr/X11R6/lib/libGL.so.1.2 is used.
Of significant note, both the original /usr/lib/libGL.so.1.2 and the /usr/X11R6/lib/libGL.so.1.2 provide virtually the exact same performance/frame rates and compiz will happily run under each. I guess this is due to resolution of the soname bug.
I have marked this bug as reopened and I will be glad to work with you to get to the bottom of this issue. As before, I'll send you anything you need and perform any test you think is needed. Thank you and the rest of the team for all the good work!
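The listing above is easier to check when each link is followed hop by hop to the file it finally lands on. A minimal sketch, assuming readlink is available (it is part of GNU coreutils); link_chain is a hypothetical helper name, and the limit of 16 hops is an arbitrary guard against circular links.

```shell
# Print every hop of a symlink chain, then the final path it resolves to.
# link_chain is a hypothetical helper name; the 16-hop limit is arbitrary.
link_chain() {
    lc_path=$1
    lc_hops=0
    while [ -L "$lc_path" ] && [ "$lc_hops" -lt 16 ]; do
        lc_target=$(readlink "$lc_path")
        printf '%s -> %s\n' "$lc_path" "$lc_target"
        case $lc_target in
            /*) lc_path=$lc_target ;;                        # absolute target
            *)  lc_path=$(dirname "$lc_path")/$lc_target ;;  # relative target
        esac
        lc_hops=$((lc_hops + 1))
    done
    printf '%s\n' "$lc_path"
}
```

On the system described here the call would be: link_chain /usr/lib/libGL.so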
Created attachment 192249 [details] Current Xorg.0.log 20080129
Created attachment 192250 [details] Current xorg.conf 20080129
Guys, Just a follow-up with tests on the latest drivers: ati-fglrxG01-kmp-default-8.45.5_2.6.22.17_0.1-1.1 x11-video-fglrxG01-8.45.5-1.1 Same lockup issue still present. Let me know if I can provide anything else regarding the latest drivers.
Could you verify whether this issue is still reproducible with Catalyst 8.4?
Driver download: http://ati.amd.com/support/drivers/linux/linux-radeon.html
Installation instructions: http://www.suse.de/~sndirsch/ati-installer-HOWTO.html#manual
Thanks Stefan, Already done on two different machines with Radeon 9600 series cards. The lockup is still present on the Toshiba P35 laptop with either an "end-session" to log out of kde or on "reboot". Both conditions result in the hard lock that originally started this bug. The driver on the laptop:
    15:32 Rankin-P35a~> rpm -qa | grep fglrx
    fglrx_7_1_0_SUSE103-8.476-1
The hwinfo associated with the laptop card is:
    15:27 Rankin-P35a~> sudo hwinfo --gfxcard
    20: PCI(AGP) 105.0: 0300 VGA compatible controller (VGA)
      [Created at pci.301]
      UDI: /org/freedesktop/Hal/devices/pci_1002_4e50
      Unique ID: ul7N.nAcBsmLlWn2
      Parent ID: vSkL.klZjF5pEvx9
      SysFS ID: /devices/pci0000:00/0000:00:01.0/0000:01:05.0
      SysFS BusID: 0000:01:05.0
      Hardware Class: graphics card
      Model: "Toshiba America Info RV350 NP"
      Vendor: pci 0x1002 "ATI Technologies Inc"
      Device: pci 0x4e50 "RV350 NP"
      SubVendor: pci 0x1179 "Toshiba America Info Systems"
      SubDevice: pci 0xff01
      Driver: "fglrx_pci"
      Driver Modules: "fglrx"
      Memory Range: 0xf0000000-0xf7ffffff (rw,prefetchable)
      I/O Ports: 0x9000-0x9fff (rw)
      Memory Range: 0xe8100000-0xe810ffff (rw,non-prefetchable)
      Memory Range: 0xe8120000-0xe813ffff (ro,prefetchable,disabled)
      IRQ: 16 (4 events)
      I/O Ports: 0x3c0-0x3df (rw)
      Module Alias: "pci:v00001002d00004E50sv00001179sd0000FF01bc03sc00i00"
      Driver Info #0:
        XFree86 v4 Server Module: radeon
      Driver Info #1:
        XFree86 v4 Server Module: radeon
        3D Support: yes
        Extensions: dri
      Config Status: cfg=no, avail=yes, need=no, active=unknown
      Attached to: #10 (PCI bridge)
    Primary display adapter: #20
The second machine is a desktop box running an MSI KM2M motherboard and with a virtually identical Radeon 9600 series card. This machine works FINE with the fglrx driver and has since at least 8.3.
The driver on the desktop (the same):
    15:29 trinity~> rpm -qa | grep fglrx
    fglrx_7_1_0_SUSE103-8.476-1
The hwinfo for the desktop card is:
    15:29 trinity~> sudo hwinfo --gfxcard
    21: PCI(AGP) 100.0: 0300 VGA compatible controller (VGA)
      [Created at pci.301]
      UDI: /org/freedesktop/Hal/devices/pci_1002_4e51
      Unique ID: VCu0.qkQfUxyGw61
      Parent ID: vSkL.ClyJU5Ffrr5
      SysFS ID: /devices/pci0000:00/0000:00:01.0/0000:01:00.0
      SysFS BusID: 0000:01:00.0
      Hardware Class: graphics card
      Model: "PC Partner RV350 NQ"
      Vendor: pci 0x1002 "ATI Technologies Inc"
      Device: pci 0x4e51 "RV350 NQ"
      SubVendor: pci 0x174b "PC Partner Limited"
      SubDevice: pci 0x0200
      Driver: "fglrx_pci"
      Driver Modules: "fglrx"
      Memory Range: 0xc0000000-0xcfffffff (rw,prefetchable)
      I/O Ports: 0x9000-0x9fff (rw)
      Memory Range: 0xe8020000-0xe802ffff (rw,non-prefetchable)
      Memory Range: 0xe8000000-0xe801ffff (ro,prefetchable,disabled)
      IRQ: 20 (no events)
      I/O Ports: 0x3c0-0x3df (rw)
      Module Alias: "pci:v00001002d00004E51sv0000174Bsd00000200bc03sc00i00"
      Driver Info #0:
        XFree86 v4 Server Module: radeon
      Driver Info #1:
        XFree86 v4 Server Module: radeon
        3D Support: yes
        Extensions: dri
      Config Status: cfg=new, avail=yes, need=no, active=unknown
      Attached to: #12 (PCI bridge)
    22: PCI 100.1: 0380 Display controller
      [Created at pci.301]
      UDI: /org/freedesktop/Hal/devices/pci_1002_4e71
      Unique ID: NXNs.reexNDFCPo4
      Parent ID: vSkL.ClyJU5Ffrr5
      SysFS ID: /devices/pci0000:00/0000:00:01.0/0000:01:00.1
      SysFS BusID: 0000:01:00.1
      Hardware Class: graphics card
      Model: "PC Partner M10 NQ [Radeon Mobility 9600] (Secondary)"
      Vendor: pci 0x1002 "ATI Technologies Inc"
      Device: pci 0x4e71 "M10 NQ [Radeon Mobility 9600] (Secondary)"
      SubVendor: pci 0x174b "PC Partner Limited"
      SubDevice: pci 0x0201
      Memory Range: 0xd0000000-0xdfffffff (rw,prefetchable,disabled)
      Memory Range: 0xe8030000-0xe803ffff (rw,non-prefetchable,disabled)
      Module Alias: "pci:v00001002d00004E71sv0000174Bsd00000201bc03sc80i00"
      Config Status: cfg=new, avail=yes, need=no, active=unknown
      Attached to: #12 (PCI bridge)
    Primary display adapter: #21
There is still a bug that needs to be addressed. Why the issue is present with the laptop card and not the desktop would be a good place to start addressing the problem. Why the lockup is present on the [ Device: pci 0x4e50 "RV350 NP" ] and not on the [ Device: pci 0x4e51 "RV350 NQ" ] is a great clue to where the problem is. I'm just certainly not skilled enough in the driver area to make any headway. Thank you so much for your continued investigation of this issue. One of these days, I know we will find out what the culprit is.
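To line up the two cards side by side, the identifying fields can be pulled out of saved hwinfo output and compared. This is a sketch under stated assumptions: gfx_ids is a hypothetical helper name, the output of "hwinfo --gfxcard" is assumed to have been saved to a file on each machine with one field per line, and only the Model, Device, SubVendor and SubDevice fields are kept.

```shell
# Print the Model, Device, SubVendor and SubDevice lines from saved
# "hwinfo --gfxcard" output, with leading indentation removed.
# gfx_ids is a hypothetical helper name.
gfx_ids() {
    awk '/^ *(Model|Device|SubVendor|SubDevice):/ { sub(/^ +/, ""); print }' "$1"
}
```

With the output saved as, say, laptop.txt and desktop.txt (file names made up here), running gfx_ids on each and passing the two results to diff shows at a glance where the RV350 NP and RV350 NQ entries differ.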
Ok. Thanks for giving it another try.
.
Oops.
New Top5 Bug.
Stefan, Woohoo, we finally hit the big time! This bug's days are numbered. Let me know if I can send in ANY information that will help. Thanks
Stefan, 8.5 driver (fglrx_7_1_0_SUSE103-8.493-1) installed today and the same lockup issue is still present. Keep the faith. As always, let me know if I can send you anything extra. Especially since I still have both boxes available containing one of each: Model: "Toshiba America Info RV350 NP" (lockup problem) Device: pci 0x4e50 "RV350 NP" and Model: "PC Partner RV350 NQ" (no lockup problem) Device: pci 0x4e51 "RV350 NQ" Thanks for your help.
Could you verify whether this issue is still reproducible with Catalyst 8.5?
Driver download: http://ati.amd.com/support/drivers/linux/linux-radeon.html
Installation instructions: http://www.suse.de/~sndirsch/ati-installer-HOWTO.html#manual
(In reply to comment #36 from Stefan Dirsch)
> Could you verify, if this issue is still reproducable with Catalyst 8.5?
>
> Driver download:
> http://ati.amd.com/support/drivers/linux/linux-radeon.html
>
> Installation instructions:
> http://www.suse.de/~sndirsch/ati-installer-HOWTO.html#manual
>
Sure Stefan, see comment 35 ;-) The hardlock on the RV350 NP is still just as vicious as ever...
Indeed. I'm sorry.
I finally decided to no longer track proprietary ATI/fglrx driver bugs against openSUSE. Therefore I'm closing these now as WONTFIX. In case you're using our SLES/SLED products and can reproduce this issue on these products as well, feel free to reopen. These are still tracked, since customers of these products depend on the proprietary driver for newer ATI hardware. Be aware that you need a privileged account to track anything against our SLES/SLED products. So if this is not an option for you, I suggest reporting the problem to the official ATI driver feedback channels (email/unofficial public bugzilla; see ATI driver download site) and referring to this bug report.
This can't be the same Stefan that penned "New Top5 Bug." less than a month and a half ago? Novell, whatever you did with him, I want that Stefan back. How in the world does a bug go from "Top5" to swept under the rug in 40 days? Is Novell the problem? This would never have happened with the SuSE I remember. I never shoot the messenger. So, thank you Stefan for your hard individual work on this bug, that was greatly appreciated. Your type of ethic and dedication to solving problems and making openSuSE better for all is what has always set SuSE apart from the crowd. To the powers-that-be that made the decision to "give up", this is a poor reflection on what SuSE is becoming. I certainly hope this is "the exception" and not the "new norm" at bugzilla.novell.
Great work Stefan, It is FIXED !!! in 8.501! This bug can honestly be closed now!
Credit goes to ATI, not me. Apparently you became a beta tester. I didn't know this.