Bug 390272

Summary: nvidia: X11 hangs in busy loop (GeForce 6200), system unusable
Product: [openSUSE] openSUSE 11.0 Reporter: Klaus Kämpf <kkaempf>
Component: X11 3rd Party Driver Assignee: Stefan Dirsch <sndirsch>
Status: RESOLVED FIXED QA Contact: Stefan Dirsch <sndirsch>
Severity: Blocker    
Priority: P2 - High CC: andras.barna, bob_l_lewis, forgotten_PJpAC5DKqq, hmuelle, marcelovborro
Version: Beta 2   
Target Milestone: ---   
Hardware: x86-64   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: nvidia-debug-report
screenshot

Description Klaus Kämpf 2008-05-14 14:05:03 UTC
After update to Beta2, X11 failed to start up (monitor stayed dark).

Remote login was possible but extremely slow, 'top' showed X taking all CPU cycles.

Update to Beta3 didn't improve the situation.

It seems as if the 'G01' driver doesn't properly support this card any more:
01:00.0 VGA compatible controller: nVidia Corporation NV44A [GeForce 6200] (rev a1) (prog-if 00 [VGA controller])
01:00.0 0300: 10de:0221 (rev a1) (prog-if 00 [VGA controller])
        Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 16
        Memory at e8000000 (32-bit, non-prefetchable) [size=16M]
        Memory at d0000000 (32-bit, prefetchable) [size=256M]
        Memory at e9000000 (32-bit, non-prefetchable) [size=16M]
        Capabilities: <access denied>
        Kernel driver in use: nvidia
        Kernel modules: nvidia, nvidiafb
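
The vendor:device pair in the numeric lspci line above (10de:0221) is the ID the rest of this report refers to. A minimal sketch of pulling it out of such a line, assuming the `lspci -n` layout shown here (no PCI domain prefix); the function name `parse_pci_id` is made up for illustration:

```python
import re

# Matches the numeric form shown above, e.g.
# "01:00.0 0300: 10de:0221 (rev a1) (prog-if 00 [VGA controller])"
# Assumption: no leading PCI domain ("0000:") in front of the slot.
LSPCI_N_LINE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+"
    r"(?P<cls>[0-9a-f]{4}):\s+"
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})",
    re.IGNORECASE,
)


def parse_pci_id(line):
    """Return (vendor, device) as lowercase hex strings, or None if no match."""
    m = LSPCI_N_LINE.match(line.strip())
    if m is None:
        return None
    return m.group("vendor").lower(), m.group("device").lower()


if __name__ == "__main__":
    sample = "01:00.0 0300: 10de:0221 (rev a1) (prog-if 00 [VGA controller])"
    print(parse_pci_id(sample))
```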
Comment 1 Klaus Kämpf 2008-05-14 14:06:08 UTC
Please remove this card from the device list of the nvidia binary packages.

Using 'nv' works fine.

Upgrade from previous versions to 11.0 should probably rewrite xorg.conf to use 'nv' instead of 'nvidia'
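
For illustration only, the rewrite suggested here could look roughly like the sketch below. It assumes the usual `Driver "nvidia"` line inside a Device section of xorg.conf; the function name `switch_to_nv` and the sample section are hypothetical, and nothing in this report says such a step was ever implemented.

```python
import re

# Matches a line such as:   Driver       "nvidia"   (optional trailing comment)
# xorg.conf keywords are case-insensitive, hence IGNORECASE.
DRIVER_LINE = re.compile(r'^(\s*Driver\s+)"nvidia"(\s*(?:#.*)?)$', re.IGNORECASE)


def switch_to_nv(conf_text):
    """Return conf_text with every Driver "nvidia" entry changed to "nv"."""
    lines = [DRIVER_LINE.sub(r'\1"nv"\2', line) for line in conf_text.splitlines()]
    trailing_newline = "\n" if conf_text.endswith("\n") else ""
    return "\n".join(lines) + trailing_newline


if __name__ == "__main__":
    # Hypothetical Device section, not taken from the reporter's machine.
    sample = (
        'Section "Device"\n'
        '  Identifier   "Device[0]"\n'
        '  Driver       "nvidia"\n'
        'EndSection\n'
    )
    print(switch_to_nv(sample), end="")
```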
Comment 2 Stefan Dirsch 2008-05-14 17:09:01 UTC
I do not agree. Instead, this issue should be fixed by NVIDIA. Please attach an nvidia-bug-report.log, which is generated by running "nvidia-bug-report.sh" after you have installed the NVIDIA driver and tried running it.

> Upgrade from previous versions to 11.0 should probably rewrite xorg.conf to
> use 'nv' instead of 'nvidia'

Usually the nvidia driver is uninstalled during update, since there is no package update available for the kernel. So the nvidia driver entry will
point to the "nvidia" dummy driver, which is a copy of the nv driver. So patching xorg.conf is not required here.
Comment 3 Klaus Kämpf 2008-05-15 08:44:47 UTC
Ok, I'll try to provide this information.
But the system usually is *very* busy and only a hard reset helps :-(
Comment 4 Klaus Kämpf 2008-05-20 09:47:23 UTC
Running 'startx' as root gives me

Markers: (--) probed, (**) from config file, (==) default setting,
         (++) from command line, (!!) notice, (II) informational,
        (WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Tue May 20 11:38:49 2008
(==) Using config file: "/etc/X11/xorg.conf"
(II) Module "ramdac" already built-in

and then it hangs.
'strace' doesn't show anything.
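
When a log stops this early, it can help to sort the few lines that did get written by their marker. A rough sketch, assuming the plain `(XX) message` layout shown above (no timestamp prefix); the marker meanings are copied from the legend in the log, and `classify` is a name made up for this example:

```python
import re

# Marker legend as printed at the top of the Xorg log above.
MARKERS = {
    "--": "probed",
    "**": "from config file",
    "==": "default setting",
    "++": "from command line",
    "!!": "notice",
    "II": "informational",
    "WW": "warning",
    "EE": "error",
    "NI": "not implemented",
    "??": "unknown",
}

LOG_LINE = re.compile(r"^\((--|\*\*|==|\+\+|!!|II|WW|EE|NI|\?\?)\)\s+(.*)$")


def classify(line):
    """Return (marker meaning, message) for a log line, or None if unmarked."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    return MARKERS[m.group(1)], m.group(2)


if __name__ == "__main__":
    for text in (
        '(==) Using config file: "/etc/X11/xorg.conf"',
        '(II) Module "ramdac" already built-in',
    ):
        print(classify(text))
```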

And it's actually worse - even the keyboard numlock doesn't react any more. I wonder how to get the nvidia-bug-report.sh script running ...
Comment 5 Klaus Kämpf 2008-05-20 09:51:02 UTC
And it's not X11 but apparently the nvidia kernel module. Even after killing X, the system still hangs completely. I cannot run the bug report script :-(

Raising to blocker for now. 
Comment 6 Stefan Dirsch 2008-05-20 21:59:25 UTC
Ok. I will remove this device ID (10de:0221) from the list of supported devices for the openSUSE 11.0 packages.
Comment 7 Stefan Dirsch 2008-05-21 07:22:12 UTC
done.
Comment 8 Harald Mueller-Ney 2008-06-16 16:24:10 UTC
Klaus, the driver is working for me, but I need to compile it manually because we blocked the PCI ID for this driver.

Could you check if the driver is working for you now too?
I used the latest driver from NVIDIA's website.
You can access it internally from my home: 

http://w3.suse.de/~hmuelle/NVIDIA-Linux-x86_64-173.14.05-pkg2.run

Would you be so kind as to test if this driver works for you? If it is working we could unblock the PCI ID.
Comment 9 Stefan Dirsch 2008-06-17 17:45:46 UTC
*** Bug 357366 has been marked as a duplicate of this bug. ***
Comment 10 Marcelo Borro 2008-06-17 18:12:40 UTC
I guess something went wrong here:
Bug 357366 has been marked as a duplicate of this bug, but it is due to a wrong kernel version - it is built on kernel 2.6.22.18, which has not been released yet.

I do not have any of the symptoms above.  The only problem is that the X server does not work with this kernel module.  I do not have any decrease in performance.  The X server just doesn't start.

The problem occurs on openSUSE 10.3 on all machines that have a GeForce 6200 (0x0221). And this GPU is officially supported by the latest driver:
http://us.download.nvidia.com/XFree86/Linux-x86_64/173.14.09/README/appendix-a.html

I guess bug 357366 has nothing to do with 390272.

Could you please check that?

Comment 11 Stefan Dirsch 2008-06-17 19:01:03 UTC
I closed Bug #357366 as duplicate of this one, because this bugreport is the reason why support for GeForce 6200 device ID (10de:0221) has been removed from the driver RPMs. See also comment #6.
Comment 12 Stefan Dirsch 2008-06-17 19:04:34 UTC
*** Bug 357366 has been marked as a duplicate of this bug. ***
Comment 13 Marcelo Borro 2008-06-17 19:18:26 UTC
Thanks Stefan, but we're talking about openSUSE 10.3 in 357366, not 11.0.
And I've also tried the legacy driver and it has the same problem.

Support for all 6200 GPUs is removed for all versions of openSUSE because there's a problem with openSUSE 11.0 and the GeForce 6200? And whoever has a GeForce 6200 should not use the openSUSE nvidia driver RPM package and should use the nvidia installer script instead?
Again, the 173.14.05_2.6.22.18_0.2-1.1 versions (G01 and legacy) DO NOT work on kernel 2.6.22.17.
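
The mismatch described here can be read straight off the package version string. A small sketch, assuming the layout `<driver>_<kernel, with "-" written as "_">-<package release>`; that layout is an assumption made for this example, and both function names are hypothetical:

```python
def split_kmp_version(pkg_version):
    """Split e.g. '173.14.05_2.6.22.18_0.2-1.1' into (driver, kernel, release).

    Assumed layout: <driver>_<kernel with '-' written as '_'>-<package release>.
    """
    driver, _, rest = pkg_version.partition("_")
    kernel_part, _, release = rest.rpartition("-")
    return driver, kernel_part.replace("_", "-"), release


def matches_running_kernel(pkg_version, running_release):
    """True if the running kernel release is the one the package was built for."""
    _, kernel, _ = split_kmp_version(pkg_version)
    return running_release == kernel or running_release.startswith(kernel + "-")


if __name__ == "__main__":
    pkg = "173.14.05_2.6.22.18_0.2-1.1"
    print(split_kmp_version(pkg))
    # The running release below is a made-up 2.6.22.17 example; on a live
    # system platform.release() would supply the real value.
    print(matches_running_kernel(pkg, "2.6.22.17-0.1-default"))
```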

I am waiting for you to answer these questions before reopening bug 357366.

Thanks
Comment 14 Stefan Dirsch 2008-06-17 20:00:13 UTC
There are a couple of GeForce 6200 chips supported by the driver.

0x00F3 GeForce 6200
0x0146 GeForce Go 6600 TE/6200 TE
0x014F GeForce 6200
0x0161 GeForce 6200 TurboCache(TM)
0x0162 GeForce 6200SE TurboCache(TM)
0x0163 GeForce 6200 LE
0x0164 GeForce Go 6200
0x0167 GeForce Go 6200
0x0221 GeForce 6200
0x0222 GeForce 6200 A-LE

Only support for the last but one (0x0221) has been removed. Why not blame the person who more or less forced me to remove support for this chip by setting this bug report to blocker?

The legacy driver does not support this device yet. Adding the device to the
kernel module would not help.

It's unlikely that the driver behaves differently on openSUSE 10.3. Therefore
it's handled consistently from 11.0 down to SLES10.

Yes, you need to use the installer if you have the GeForce 6200 with 0x0221
Device ID. 
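
The list and the single exclusion described in this comment amount to a simple lookup. A sketch under the assumption that only the GeForce 6200 IDs quoted above matter (the real driver supports many more chips); `rpm_supports_6200` and the two table names are made up for this example:

```python
# GeForce 6200 device IDs as listed in this comment (vendor 0x10de).
GEFORCE_6200_IDS = {
    0x00F3: "GeForce 6200",
    0x0146: "GeForce Go 6600 TE/6200 TE",
    0x014F: "GeForce 6200",
    0x0161: "GeForce 6200 TurboCache(TM)",
    0x0162: "GeForce 6200SE TurboCache(TM)",
    0x0163: "GeForce 6200 LE",
    0x0164: "GeForce Go 6200",
    0x0167: "GeForce Go 6200",
    0x0221: "GeForce 6200",
    0x0222: "GeForce 6200 A-LE",
}

# The one ID removed from the driver RPMs because of this bug report.
REMOVED_FROM_RPM = {0x0221}


def rpm_supports_6200(device_id):
    """True if the ID is a listed GeForce 6200 that the RPMs still cover."""
    return device_id in GEFORCE_6200_IDS and device_id not in REMOVED_FROM_RPM


if __name__ == "__main__":
    for dev in sorted(GEFORCE_6200_IDS):
        status = "RPM" if rpm_supports_6200(dev) else "NVIDIA installer only"
        print(f"0x{dev:04X} {GEFORCE_6200_IDS[dev]}: {status}")
```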
Comment 15 Klaus Kämpf 2008-06-18 08:28:05 UTC
(In reply to comment #8 from Harald Mueller-Ney)
> Klaus, the driver is working for me, but I need to compile it manually because
> we blocked the PCI ID for this driver.
> 
> Could you check if the driver is working for you now too?
> I used the latest driver from NVIDIA's website.
> You can access it internally from my home: 
> 
> http://w3.suse.de/~hmuelle/NVIDIA-Linux-x86_64-173.14.05-pkg2.run
> 
> Would you be so kind as to test if this driver works for you? If it is working
> we could unblock the PCI ID.
> 

Sorry, no, this still has the same effect as described in the initial comment. I have to powercycle the system.
Comment 16 Stefan Dirsch 2008-06-18 08:59:24 UTC
Thanks for testing, Klaus. Meanwhile NVIDIA released a new driver (173.14.09).

  http://www.nvidia.com/object/linux_display_amd64_173.14.09.html
  http://www.nvidia.com/object/linux_display_ia32_177.13.html

This change might be related.

  * Fixed a regression that prevented the X driver from starting
    on some GeForce FX, 6 and 7 mobile GPUs.

Would be nice if you could test this one as well.
Comment 17 Klaus Kämpf 2008-06-18 13:48:20 UTC
Nope, 173.14.09 does not improve the situation for me :-(
Comment 18 Stefan Dirsch 2008-06-18 14:00:21 UTC
>It's unlikely that the driver behaves differently on openSUSE 10.3. Therefore
>it's handled consistently from 11.0 down to SLES10.

And once you're updating to openSUSE >= 11.0, you'll be bitten by this issue again. So this is not a solution either.
Comment 19 Andras Barna 2008-06-28 06:49:54 UTC
sndirsch, it works OK here.
please reenable it, thanks
Comment 20 Andras Barna 2008-06-28 06:50:52 UTC
Created attachment 224954 [details]
nvidia-debug-report
Comment 21 Andras Barna 2008-06-28 06:51:28 UTC
Created attachment 224955 [details]
screenshot
Comment 22 Robert Lewis 2008-06-28 14:37:47 UTC
Just FYI.  If I download the driver from the NVIDIA site and compile it myself, then I get full support in 3D etc. for this card.  I am not sure if I should keep the card in the machine so that I can help test on 11.0 when a fix surfaces, or try another card in the meantime.  What gets ugly is that if I select the driver from the RPMs after adding the NVIDIA repository, I have a complete failure of graphics and can only use the command line mode from F1.  It's not too difficult to recover, but I have no such problems when I sniff at Ubuntu.
Comment 23 Stefan Dirsch 2008-06-28 20:03:56 UTC
Bob, please read comment #6 and comment #11. Thanks.
Comment 24 Forgotten User PJpAC5DKqq 2008-07-08 21:19:40 UTC
I don't get this. Why did you remove support for Device ID 0x0221? Drivers from the nvidia site support it.
Comment 25 Stefan Dirsch 2008-07-09 02:25:42 UTC
*** Bug 407270 has been marked as a duplicate of this bug. ***
Comment 26 Stefan Dirsch 2008-07-09 02:29:00 UTC
(In reply to comment #24 from Jan-Olof Eriksson)
> I don't get this. Why did you remove support for Device ID 0x0221?
> Drivers from the nvidia site support it.
Same for you. Please read comment #6 and comment #11. Thanks.

Comment 27 Forgotten User PJpAC5DKqq 2008-07-09 07:07:13 UTC
(In reply to comment #26 from Stefan Dirsch)
> (In reply to comment #24 from Jan-Olof Eriksson)
> > I don't get this. Why did you remove support for Device ID 0x0221?
> > Drivers from the nvidia site support it.
> Same for you. Please read comment #6 and comment #11. Thanks.
> 

Yes, I read those, but I didn't understand: are you going to fix that, or has openSUSE just dropped support for that card?
Comment 28 Forgotten User PJpAC5DKqq 2008-07-09 07:10:07 UTC
Btw, this bug is resolved, but not fixed. Ending support for some nvidia cards isn't fixing things imo ;)
Comment 29 Stefan Dirsch 2008-07-09 13:10:58 UTC
I cannot fix the driver. It freezes the system for the reporter. He reported this as a blocker. Therefore I needed to disable this Device ID in the kernel module. How often do I need to repeat this?
Comment 30 Andras Barna 2008-07-09 13:22:20 UTC
it seems only for him...
hardware problem?
so.. WORKSFORME..
Comment 31 Marcelo Borro 2008-07-09 13:29:46 UTC
We understood that part, but after the nvidia driver updates, is Klaus's system still freezing?

I'm migrating several machines from 10.3 to 11.0 and all of them have the 0x0221 card.  All of them are working perfectly with the latest Nvidia drivers.

Nvidia documentation claims that this card is still supported by the "g01", and the nvidia installer works ok, so I guess Klaus's initial assertion is wrong now.
Comment 32 Stefan Dirsch 2008-07-09 13:51:17 UTC
Klaus tested with the latest driver release. I opened the comments, which were marked as private unfortunately.
Comment 33 Forgotten User PJpAC5DKqq 2008-07-10 14:01:05 UTC
I don't know what the situation is on Klaus's PC, but those Nvidia packages work just fine for me. No freezing or anything. I don't understand this: one user has problems, and openSUSE stops supporting the whole card? Nvidia still supports it.
Comment 34 Klaus Kämpf 2008-07-10 14:12:38 UTC
(In reply to comment #31 from Marcelo Borro)
> 
> Nvidia documentation claims that this card is still supported by the "g01", and
> the nvidia installer works ok, so I guess Klaus's initial assertion is wrong
> now.
> 

Well, my system freezing is very real.

That said, ymmv.
Comment 35 Klaus Kämpf 2008-07-10 14:14:37 UTC
(In reply to comment #33 from Jan-Olof Eriksson)
> I don't know what the situation is on Klaus's PC, but those Nvidia packages
> work just fine for me. No freezing or anything. I don't understand this: one
> user has problems, and openSUSE stops supporting the whole card? Nvidia still
> supports it.
> 

So we have a draw now. It works for you, it does not for me.
Comment 36 Klaus Kämpf 2008-07-10 14:15:50 UTC
Stefan, it's your decision in the end.

Reenabling the card might make a lot of users happy or get you a lot of bug reports ;-)
Comment 37 Stefan Dirsch 2008-07-10 14:22:50 UTC
(In reply to comment #36 from Klaus Kämpf)
> Stefan, it's your decision in the end. 

Thanks. I'll enable it again for the next driver release.
 
> Reenabling the card might make a lot of users happy or get you a lot of bug
> reports ;-)

Not more than now, I'm rather sure. It could well be that this issue with your card only happens in your machine.