Bugzilla – Full Text Bug Listing

| Summary: | Xorg freezes during boot with nv driver | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Ryan Fitzgerald <ryanfitz> |
| Component: | X.Org | Assignee: | Stefan Dirsch <sndirsch> |
| Status: | VERIFIED FIXED | QA Contact: | Stefan Dirsch <sndirsch> |
| Severity: | Normal | ||
| Priority: | P2 - High | CC: | eich, Josephine_k, lars, netllama |
| Version: | Beta 3 | ||
| Target Milestone: | --- | ||
| Hardware: | i386 | ||
| OS: | SUSE Other | ||
| Whiteboard: | |||
| Found By: | Customer | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |

Attachments:
Xorg log file after x freeze
Xorg configuration file
Updated Xorg log
Patch from CVS mentioned above
Patch from CVS mentioned above
NVIDIA finally resolved the problem :-)
prevent_endless_loop

Description
Ryan Fitzgerald
2005-08-26 02:12:06 UTC
Please let the system crash again and reboot into runlevel 3 afterwards. Then attach the current /etc/X11/xorg.conf and /var/log/Xorg.0.log. Thanks.

Created attachment 47720 [details]
Xorg log file after x freeze
Created attachment 47721 [details]
Xorg configuration file

You can see in the Xorg log file that the only error reported was about the glx extension, so I removed the glx module from the modules loaded in xorg.conf and started X again; however, it froze once again. I checked the Xorg log file and this time no error was reported, but the same warnings still occurred. If you want me to upload that Xorg log file, just let me know.

You're mixing an nv/nvidia driver configuration. Please try to uninstall the nvidia driver configuration first:

  nvidia-installer --uninstall

Created attachment 47730 [details]
Updated Xorg log

I ran nvidia-installer --uninstall and everything went fine. I tried using the nv driver and again X froze. I ran a diff between my xorg.conf files and they were the same before and after the nvidia uninstall; however, my Xorg.0.log did change, so I uploaded that.

The logfile looks OK now. The nvidia driver installation has been uninstalled cleanly. Please add

  Option "XaaNoScreenToScreenCopy"

to 'Section "Device"' of your /etc/X11/xorg.conf and try again. If this doesn't help, try

  Option "noaccel"

instead. If "XaaNoScreenToScreenCopy" does help, also try

  Option "XaaNoPixmapCache"
  Option "XaaNoOffScreenPixmaps"

Please report the results. Thanks.

Using Option "XaaNoScreenToScreenCopy" worked and so did using Option "noaccel". The only problem is that scrolling is unbearably slow when using either of these options. I also added the options "XaaNoPixmapCache" and "XaaNoOffScreenPixmaps" when using "XaaNoScreenToScreenCopy", but scrolling was still extremely slow. I did not notice any problems other than that.

Please try only

  Option "XaaNoPixmapCache"
  Option "XaaNoOffScreenPixmaps"

If this also works, it should be usable again.

When using only "XaaNoPixmapCache" and "XaaNoOffScreenPixmaps", the system starts to load GNOME and gets as far as displaying the desktop, but then it freezes.

Ouch. Probably hit the livelock problem again. :-(

I have experienced the same problem a couple of times with the NV driver. It seems to be a problem with using 24bit colordepth. When using 16bit the driver runs perfectly stable (even without disabling any Xaa stuff) for me.

(In reply to comment #13)
> I have experienced the same problem a couple of times with the NV driver. It

What do you mean by 'a couple of times'? With different graphics cards, or with different SuSE versions? Do you remember when you hit it first?
BTW - one way to circumvent this is to use the binary NVidia drivers, they usually work. You will want to use them anyway (otherwise a 6800 doesn't really make sense ;)

I don't remember exactly when I first saw it, as SuSE 9.3 defaults to 16bit colordepth and the first thing I did afterwards was to install the commercial driver. But the nv driver is unstable in 24 bit mode, at least with my GeForce 6600. No idea how well or badly it works with other HW. I experienced the hang quite often, as I'm at the moment switching back and forth between nv (to add EXA support there) and the nvidia driver (for other work).
And the binary driver doesn't help you at all as long as you don't ship it together with openSuSE (which you obviously can't) and your system hangs after the default install because of this. I know how to get around it, but Joe User probably won't.

I know. Unfortunately, this is a bug in the nv driver that we thought was present in some 6200 cards only, but it seems to be more widespread. Could both of you send us the vendor/device ids, i.e. the output of

  hwinfo --gfxcard

Then we will add XaaNoScreenToScreenCopy to these cards. I'm sorry that there is no better solution ATM. I'm already bugging NVidia for driver improvement here, but their focus is on the binary driver, of course.
Ryan, can you try running your X server with 16bit as well? Does it work without XaaNoScreenToScreenCopy as well? Then this would be a viable solution.

dhcp234:~ # hwinfo --gfxcard
23: PCI 100.0: 0300 VGA compatible controller (VGA)
[Created at pci.277]
UDI: /org/freedesktop/Hal/devices/pci_10de_140
Unique ID: VCu0._BdnBIAclaC
Parent ID: vSkL.akG_2l700s2
SysFS ID: /devices/pci0000:00/0000:00:01.0/0000:01:00.0
SysFS BusID: 0000:01:00.0
Hardware Class: graphics card
Model: "nVidia GeForce 6600 GT"
Vendor: pci 0x10de "nVidia Corporation"
Device: pci 0x0140 "GeForce 6600 GT"
Revision: 0xa2
Driver: "nvidiafb"
Memory Range: 0xd0000000-0xd3ffffff (rw,non-prefetchable)
Memory Range: 0xc8000000-0xcfffffff (rw,prefetchable)
Memory Range: 0xd4000000-0xd4ffffff (rw,non-prefetchable)
Memory Range: 0x40000000-0x4001ffff (ro,prefetchable,disabled)
IRQ: 137 (371025 events)
I/O Ports: 0x3c0-0x3df (rw)
Module Alias: "pci:v000010DEd00000140sv00000000sd00000000bc03sc00i00"
Driver Info #0:
XFree86 v4 Server Module: nv
XF86Config Entry: Option "XaaNoPixmapCache"\nOption "XaaNoOffScreenPixmaps"
Driver Info #1:
XFree86 v4 Server Module: nvidia
3D Support: yes
Config Status: cfg=new, avail=yes, need=no, active=unknown
Attached to: #10 (PCI bridge)
Primary display adapter: #23
Cheers,
Lars

16bit doesn't work for the card we have for reproduction. :-( This seems like a hardware race condition to me, which is obviously triggered more often when using 24bit.

23: PCI 100.0: 0300 VGA compatible controller (VGA)
[Created at pci.277]
UDI: /org/freedesktop/Hal/devices/pci_10de_42
Unique ID: VCu0.HI5+P2cWJE8
Parent ID: vSkL.K3WJKbXW3V7
SysFS ID: /devices/pci0000:00/0000:00:01.0/0000:01:00.0
SysFS BusID: 0000:01:00.0
Hardware Class: graphics card
Model: "LeadTek GeForce 6800 LE"
Vendor: pci 0x10de "nVidia Corporation"
Device: pci 0x0042 "GeForce 6800 LE"
SubVendor: pci 0x107d "LeadTek Research Inc."
SubDevice: pci 0x299b
Revision: 0xa1
Memory Range: 0xd5000000-0xd5ffffff (rw,non-prefetchable)
Memory Range: 0xd8000000-0xdfffffff (rw,prefetchable)
Memory Range: 0xd4000000-0xd4ffffff (rw,non-prefetchable)
Memory Range: 0xd7f00000-0xd7f1ffff (ro,prefetchable,disabled)
IRQ: 185 (4 events)
I/O Ports: 0x3c0-0x3df (rw)
Module Alias: "pci:v000010DEd00000042sv0000107Dsd0000299Bbc03sc00i00"
Driver Info #0:
XFree86 v4 Server Module: nv
Driver Info #1:
XFree86 v4 Server Module: nvidia
3D Support: yes
Config Status: cfg=yes, avail=yes, need=yes, active=unknown
Attached to: #9 (PCI bridge)
Primary display adapter: #23

I've added XaaNoScreenToScreenCopy now to both gfx boards. I'm afraid that the Open Source nv driver is getting more and more unusable. At least for 6x00 boards. :-(

Sorry, but without ScreenToScreen copy the driver is more or less unusable, as reads from the framebuffer on NV hardware are way too slow (I get max 2 MB/sec on my hardware). So e.g. moving windows is unbearable without an accelerated copy. It would be a lot better to use the ShadowFB option for the HW that causes problems.

That actually works for the broken card we have here as well! Thanks, Lars. Yet another option I hadn't thought of WRT this bug... Stefan, please change the database entries to

  Option "ShadowFB" "on"

Shadow FB is completely unaccelerated. So it is conceivable that it works. Why don't you just send me a card with which I can reproduce this? But it is not clear that it is *that* much faster than just disabling ScreenToScreenCopy.

We needed the card to reproduce several issues. I think Stefan can now send the card to you.

I'll send this card to Egbert today. Since it's unclear whether we'll get a fix in time, I'll change XaaNoScreenToScreenCopy to ShadowFB for now.

Trust me, as long as you're not using a Pentium 100 it's a lot faster. When you get max 2.5 MB/sec in framebuffer read speed, a simple calculation shows that moving a 200x200 pixel window takes about 60ms. If you have a 500x500 window, you're up at 400ms. That's unusable. Using a shadowFB, the blit is done in main memory using memcpy, which has a bandwidth of more than 1GB/s on my hardware. After that you need to write the damaged region to the framebuffer (these writes are rather fast). Altogether that probably takes 5ms to complete for the 500x500 window. I tried using shadowFB on my HW, and with the state the driver is currently in, it's by far the fastest option if you have a halfway modern CPU. On a P4 or similar you can even run a composition manager on top and keep a usable desktop.
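
The timings quoted in the comment above can be checked with a short calculation. This is only a sketch: it assumes 4 bytes per pixel (24-bit depth stored as 32 bpp), which the thread does not state, and the helper name `transfer_time_ms` is made up for illustration. With that assumption, the 2.5 MB/s read speed gives roughly 64 ms for a 200x200 window and 400 ms for a 500x500 window, which matches the numbers in the comment. The shadow-framebuffer figure covers only the in-memory copy, not the write-back of the damaged region.

```python
# Back-of-the-envelope check of the window-move timings quoted in the thread.
# Assumption (not stated in the thread): 4 bytes per pixel, i.e. 24-bit
# colour depth stored in a 32 bpp framebuffer format.
BYTES_PER_PIXEL = 4

FB_READ_BYTES_PER_S = 2.5e6  # 2.5 MB/s framebuffer read speed from the thread
MEMCPY_BYTES_PER_S = 1e9     # ~1 GB/s main-memory copy speed from the thread


def transfer_time_ms(width, height, bandwidth_bytes_per_s):
    """Milliseconds needed to move a width x height pixel region once."""
    size_bytes = width * height * BYTES_PER_PIXEL
    return size_bytes * 1000.0 / bandwidth_bytes_per_s


if __name__ == "__main__":
    for side in (200, 500):
        slow = transfer_time_ms(side, side, FB_READ_BYTES_PER_S)
        fast = transfer_time_ms(side, side, MEMCPY_BYTES_PER_S)
        print(f"{side}x{side}: framebuffer read {slow:.0f} ms, "
              f"in-memory copy {fast:.2f} ms")
```

The in-memory copy of the 500x500 region comes out at about 1 ms under these assumptions, so the estimate of about 5 ms in total leaves a few milliseconds for writing the result back to the framebuffer.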

That's strange, because I got 7 MB/s framebuffer read speed in userspace on a TNT2 over AGP... Back in those old days... I thought that readback is much faster in PCIe than in AGP.

The 2.5 MB/s were measured with an AGP card. But even with my PCIe card I don't get more than 7.5 MB/s. Maybe the kernel support for PCIe is still lacking something?

Egbert will investigate this.

This will definitely be investigated by nvidia. The workaround for now was to set "ShadowFB" for the affected chipsets we could test. The problem is also mentioned in the Release Notes. Setting to Normal.

Tracked on developer.nvidia.com now as #187822.

*** Bug 66744 has been marked as a duplicate of this bug. ***

Looks related.

Date: Tue, 13 Sep 2005 19:28:04 -0700 (PDT)
From: Mark Vojkovich <mvojkovi@XFree86.Org>
To: cvs-commit@xfree86.org
Subject: CVS Update: xc (branch: trunk)

CVSROOT: /home/x-cvs
Module name: xc
Changes by: mvojkovi@public.xfree86.org. 05/09/13 19:28:03

Log message:
  Fix a potential problem with pixmap cache corruption on GeForce 6xxx and 7xxx parts.

Modified files:
  xc/programs/Xserver/hw/xfree86/drivers/nv/:
    nv_driver.c nv_hw.c nv_setup.c

Revision Changes Path
1.137 +19 -7 xc/programs/Xserver/hw/xfree86/drivers/nv/nv_driver.c
1.16 +39 -12 xc/programs/Xserver/hw/xfree86/drivers/nv/nv_hw.c
1.48 +1 -2 xc/programs/Xserver/hw/xfree86/drivers/nv/nv_setup.c

eich > It will at least fix the pixmap cache problem.
eich > The lockup problem still remains to be looked at.

Created attachment 50076 [details]
Patch from CVS mentioned above

This one looks interesting:

Date: Thu, 22 Sep 2005 13:34:42 -0700 (PDT)
From: Mark Vojkovich <mvojkovi@XFree86.Org>
To: cvs-commit@xfree86.org
Subject: CVS Update: xc (branch: trunk)

CVSROOT: /home/x-cvs
Module name: xc
Changes by: mvojkovi@public.xfree86.org. 05/09/22 13:34:42

Log message:
  Fix possible cause of some acceleration instability on some GeForce6xxx parts.

Modified files:
  xc/programs/Xserver/hw/xfree86/drivers/nv/:
    nv_hw.c

Revision Changes Path
1.17 +14 -4 xc/programs/Xserver/hw/xfree86/drivers/nv/nv_hw.c

Created attachment 50764 [details]
Patch from CVS mentioned above

WRT comment #36: Unfortunately it's unrelated. :-(

aritger: "We found this change while debugging the hang problem, but this fix does not solve the problem in our tests."

Created attachment 51051 [details]
NVIDIA finally resolved the problem :-)

xorg-x11 package with all patches applied submitted to STABLE and 10.0. I'll make a YOU update after testing.

See also:
--> https://bugs.freedesktop.org/show_bug.cgi?id=3333

BTW, a new NVIDIA developer, who's working on the Open Source driver?

Created attachment 51219 [details]
prevent_endless_loop
From X.Org CVS, committed by Aaron Plattner:
* Don't hang if j is zero. This should never happen, but it's better to be
safe than sorry.

Above patch applied for 10.0.

Reopening to acquire a SWAMPID. Andreas, could you create a SWAMP entry for this?

Description:
- fixes "nv" video driver freeze and acceleration problems (#113203)

Please fix this together with Bug# 114490.

Ok.

me> I wanted to do the YOU nv driver update for 10.0 ASAP. But I would
me> like to have it tested first. We just noticed that you now have all
me> the 6200 cards on which these two problems (a) livelock, b) corrupted
me> pixmap cache) occur. So either you would have to test it, or you send
me> the cards back to us.

egbert> I had already tested the pixmap patch some time ago with a card
egbert> on which the problem occurred, and it worked.
egbert> Today I tested again with Josephine's card. The new patch
egbert> solved the problem.
egbert> I think that should be enough testing.

released