Bug 127757

Summary: solid x11 lockups
Product: [openSUSE] SUSE Linux 10.1 Reporter: Dirk Mueller <dmueller>
Component: X.Org Assignee: Stefan Dirsch <sndirsch>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Critical    
Priority: P2 - High CC: aj, behlert, eich, evan_ochs, forgotten_OS1JNCFbCX, gchristensen, j.glisse, marc.ruehrschneck, wwlinuxengineering
Version: unspecified   
Target Milestone: ---   
Hardware: Other   
OS: All   
Whiteboard:
Found By: Other Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: xorg.0.log
xorg.conf
xorg.0.log of 10.0's xorg-x11-server
Radeon driver CVS 2005-11-07 backported to 10.0
Radeon driver, unstripped
Radeon driver CVS 2005-11-07 backported to 10.0, stripped
Radeon driver 10.0 compiled for CVS 2005-11-02 aka STABLE aka xorg-x11-server-6.8.2-119
new log file Xorg.0.log
log of lockup when using plain STABLE xorg-x11
Patch mentioned by dmuelr
New radeon driver RPM with latest patch by Benjamin
New Radeon driver RPM for x86_64
xorg-x11-driver-video-6.9.0-14.i586.rpm
xorg-x11-driver-video-6.9.0-14.x86_64.rpm

Description Dirk Mueller 2005-10-12 07:15:28 UTC
after loading ath_pci the complete machine freezes within a few seconds.  
 
2.6.13.3-2-default here. the one before worked fine.
Comment 1 Dirk Mueller 2005-10-12 07:57:18 UTC
ok, it doesn't seem to be related to ath_pci itself.. it just randomly locks 
up 
 
Comment 3 Dirk Mueller 2005-10-13 14:50:33 UTC
seems x server related.. during startup I find this in the Xorg.0.log:  
 
(II) Loading sub module "radeon" 
(II) LoadModule: "radeon" 
(II) Reloading /usr/X11R6/lib/modules/drivers/radeon_drv.so 
(WW) ****INVALID MEM ALLOCATION**** b: 0xe0000000 e: 0xe8000000 correcting^G 
(II) window: 
        [0] -1  0       0xe0000000 - 0xe7ffffff (0x8000000) MX[B] 
(II) resSize: 
        [0] -1  0       0x00000000 - 0xffffffff (0x0) MX[B] 
(II) window fixed: 
        [0] -1  0       0xe0000000 - 0xe7ffffff (0x8000000) MX[B] 
(WW) ****INVALID IO ALLOCATION**** b: 0x3000 e: 0x3100 correcting^G 
(II) window: 
        [0] -1  0       0x00003000 - 0x000030ff (0x100) IX[B] 
        [1] -1  0       0x00003400 - 0x000034ff (0x100) IX[B] 
        [2] -1  0       0x00003800 - 0x000038ff (0x100) IX[B] 
        [3] -1  0       0x00003c00 - 0x00003cff (0x100) IX[B] 
(II) resSize: 
        [0] -1  0       0x00000000 - 0xffffffff (0x0) IX[B] 
(II) window fixed: 
        [0] -1  0       0x00003000 - 0x000030ff (0x100) IX[B] 
        [1] -1  0       0x00003400 - 0x000034ff (0x100) IX[B] 
        [2] -1  0       0x00003800 - 0x000038ff (0x100) IX[B] 
        [3] -1  0       0x00003c00 - 0x00003cff (0x100) IX[B] 
Requesting insufficient memory window!: start: 0x3000 end: 0x30ff size 
0xc0120100 
Requesting insufficient memory window!: start: 0x3400 end: 0x34ff size 
0xc0120100 
Requesting insufficient memory window!: start: 0x3800 end: 0x38ff size 
0xc0120100 
Requesting insufficient memory window!: start: 0x3c00 end: 0x3cff size 
0xc0120100 
(EE) Cannot find a replacement memory range 
 
 
 
with xorg-x11 from 10.0 but kernel from STABLE it seems to run fine 
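
A quick way to pull the warnings and errors out of a log like the one quoted above is to filter on the (WW) and (EE) markers. The following is a minimal Python sketch, not part of the original report; the function name and the embedded sample are illustrative only, and the sample is a shortened excerpt of the lines quoted in this comment.

```python
import re

# Xorg log lines carry a marker such as (II), (WW) or (EE) at the start.
# Only warnings and errors are of interest here.
MARKER = re.compile(r"^\((WW|EE)\)\s*(.*)$")


def warnings_and_errors(log_text):
    """Return (marker, message) pairs for every (WW) and (EE) line."""
    hits = []
    for line in log_text.splitlines():
        match = MARKER.match(line.strip())
        if match:
            hits.append((match.group(1), match.group(2).strip()))
    return hits


# Shortened excerpt of the log lines quoted in this comment.
SAMPLE = """\
(II) LoadModule: "radeon"
(WW) ****INVALID MEM ALLOCATION**** b: 0xe0000000 e: 0xe8000000 correcting
(II) window:
(WW) ****INVALID IO ALLOCATION**** b: 0x3000 e: 0x3100 correcting
(EE) Cannot find a replacement memory range
"""

for marker, message in warnings_and_errors(SAMPLE):
    print(marker, message)
```

Running the same filter over the logs from a working and a failing setup would show at a glance whether the allocation warnings appear only in the failing one.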
 
 
Comment 4 Dirk Mueller 2005-10-14 13:13:21 UTC
22: PCI(AGP) 100.0: 0300 VGA compatible controller (VGA)         
  [Created at pci.277] 
  UDI: /org/freedesktop/Hal/devices/pci_1002_4e54 
  Unique ID: VCu0.Bl6sNg5xBX0 
  Parent ID: vSkL.1o+Z33xgwU4 
  SysFS ID: /devices/pci0000:00/0000:00:01.0/0000:01:00.0 
  SysFS BusID: 0000:01:00.0 
  Hardware Class: graphics card 
  Model: "IBM RV350 NT" 
  Vendor: pci 0x1002 "ATI Technologies Inc" 
  Device: pci 0x4e54 "RV350 NT" 
  SubVendor: pci 0x1014 "IBM" 
  SubDevice: pci 0x054f  
  Revision: 0x80 
  Memory Range: 0xe0000000-0xe7ffffff (rw,prefetchable) 
  I/O Ports: 0x3000-0x3fff (rw) 
  Memory Range: 0xc0100000-0xc010ffff (rw,non-prefetchable) 
  Memory Range: 0xc0120000-0xc013ffff (ro,prefetchable,disabled) 
  IRQ: 11 (4770992 events) 
  I/O Ports: 0x3c0-0x3df (rw) 
  Module Alias: "pci:v00001002d00004E54sv00001014sd0000054Fbc03sc00i00" 
  Driver Info #0: 
    XFree86 v4 Server Module: radeon 
    XF86Config Entry: Option "DynamicClocks" "on" 
  Config Status: cfg=yes, avail=yes, need=yes, active=unknown 
  Attached to: #11 (PCI bridge) 
 
 
Comment 5 Stefan Dirsch 2005-10-14 20:42:16 UTC
f122.suse.de is not available. Could you attach /etc/X11/xorg.conf and /var/log/Xorg.0.log? Thanks.
Comment 6 Dirk Mueller 2005-10-17 07:31:49 UTC
weekend :) it's there again.. you can also have a look directly if you want to



Comment 7 Dirk Mueller 2005-10-17 07:32:30 UTC
Created attachment 54232
xorg.0.log
Comment 8 Dirk Mueller 2005-10-17 07:33:11 UTC
Created attachment 54233
xorg.conf
Comment 9 Stefan Dirsch 2005-10-17 08:21:20 UTC
Could you also add the Xorg.0.log of 10.0 xorg-x11, so I can check whether the freeze is related at all to the warnings mentioned in comment #1? Commenting out the experimental DynamicClocks line would be worth a try (for STABLE xorg-x11 I mean).
Comment 10 Dirk Mueller 2005-10-17 09:20:19 UTC
Created attachment 54250
xorg.0.log of 10.0's xorg-x11-server
Comment 11 Dirk Mueller 2005-10-17 09:21:57 UTC
the only effect that disabling "DynamicClocks" has is that the machine immediately locks up solid after startup instead of randomly during running

Comment 12 Stefan Dirsch 2005-10-17 09:54:54 UTC
> diff -u Xorg.0.log.10_0 Xorg.0.log
> [...]
> +(II) RADEON(0): Will try to use DMA for Xv image transfers

Could you also attach the lines of /var/log/messages of the session, in which the random lockup happens?

Try also 

  Option "DMAForXv" "off"

Only a wild guess. :-(
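
The log comparison quoted above can also be done without the diff tool. A minimal Python sketch using the standard difflib module follows; the function name is hypothetical, and the two log strings in the usage lines are shortened stand-ins rather than the real attachments.

```python
import difflib


def added_lines(old_log, new_log):
    """Return the lines that a unified diff marks as added in new_log."""
    diff = list(difflib.unified_diff(
        old_log.splitlines(), new_log.splitlines(), lineterm="", n=0))
    # The first two entries are the '---' / '+++' file headers; skip them.
    return [line[1:] for line in diff[2:] if line.startswith("+")]


# Stand-in logs: one placeholder line common to both, plus the line
# that the diff in this comment reports as new.
old = "(II) placeholder line present in both logs\n"
new = old + "(II) RADEON(0): Will try to use DMA for Xv image transfers\n"

print(added_lines(old, new))
```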
Comment 13 Dirk Mueller 2005-10-17 10:23:36 UTC
it locks up.. there is nothing in /var/log/messages after the cold boot..

the extra option didn't make a difference

Comment 14 Matthias Hopf 2005-10-17 16:33:14 UTC
Ok, please try

  Option "ShadowFB"

and if that one helps (will get pretty slow)

  Option "NoAccel"

, both in the Device section.
Comment 15 Dirk Mueller 2005-10-18 09:30:55 UTC
no difference, still locks up immediately after start
Comment 16 Egbert Eich 2005-10-18 10:36:39 UTC
This is plausible after looking at the log messages.
Did this work before? With 9.3?
Comment 17 Stefan Dirsch 2005-10-18 10:40:30 UTC
It even works with 10.0. Dirk is already using STABLE (X.Org CVS).
Comment 18 Dirk Mueller 2005-10-18 12:00:28 UTC
Egbert, is it also plausible to find out where the bug comes from? :) 

I'd really like to have a working X server.. and this is standard hardware, nothing exotic.

Comment 19 Cameron Meadors 2005-10-24 18:17:00 UTC
I have a similar issue.  I have the same hardware as listed above.  I can start X but it is using the fb driver.  I get a hard crash when I start Sax2.  I have tried the proprietary drivers from ATI as well.  They are uninstallable (different bug, but makes this one more severe).
Comment 20 Stefan Dirsch 2005-10-26 15:04:13 UTC
Is this still an issue with the latest xorg-x11-server update in STABLE?
Comment 21 Dirk Mueller 2005-10-26 15:11:29 UTC
it's not in dist/install/stable-x86 yet..
Comment 22 Dirk Mueller 2005-10-28 09:42:48 UTC
ok, tried brute-force installing from /work/CDs. same issue as before, now that I have sorted out where to find the radeon_drv.so..

Comment 23 Robert Love 2005-10-28 19:54:31 UTC
I also see this bug.  Random hard locks -- unable to get anything out of the system -- in 5 to 30 minutes after logging into X.  The console is fine; it is X.

Running STABLE as of today.

System is an IBM T42p, 1600x1200, Radeon Fire GL (Mobility M10 NT Fire T2).

Tried different kernels, no luck.

Logs show nothing of interest.  Similar to Dirk's.

Cameron (Comment #19) tried the frame buffer driver today and has seen no lockups, so radeon driver seems suspect.
Comment 24 Robert Love 2005-11-02 14:32:02 UTC
A data point: jpr rebuilt the SUSE 10.0 xorg-x11 packages on STABLE and they work _fine_ (as do the 10.0 packages directly).

So we now know that the problem is in the new packages (a patch?) and not gcc or some other change.

Should be easy to solve now!
Comment 25 Dirk Mueller 2005-11-02 14:37:06 UTC
well, the suse stable package includes a "superradeon" patch which is deemed very fragile and broken (at least that's what I heard from hallway chatting). 

it's really getting tiresome ... can we please revert those experimental patches as long as nobody has time to fix them?

Comment 26 Stefan Dirsch 2005-11-02 14:38:31 UTC
> Should be easy to solve now!
Nice joke! Ever looked at the radeon driver source changes between SUSE 10.0 and current X.Org CVS ???
Comment 27 Stefan Dirsch 2005-11-02 14:40:23 UTC
> well, the suse stable package includes a "superradeon" patch which is deemed
> very fragile and broken (at least that's what I heard from hallway 
> chatting). 

Another wrong assumption. It's plain X.Org CVS, which does not include the superradeon patch, because of the reasons you mentioned above.
Comment 28 Dirk Mueller 2005-11-02 15:01:53 UTC
ok, I didn't actually look at it myself. Anyway, can you try producing an intermediate X.org package that doesn't contain so many radeon changes at once? maybe we can isolate the issue somehow by binary search. 

I also have the hardware here (office 3.2.30) if you want to tackle it directly. 
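
The binary search suggested here could look roughly like the following Python sketch. It assumes an ordered list of driver snapshots in which the first is known good and the last is known bad, and that once the lockup appears it stays in all later snapshots. The snapshot labels and the is_bad callback are hypothetical; in practice each check would be a manual test run on the affected laptop.

```python
def first_bad(snapshots, is_bad):
    """Find the first bad snapshot by bisection.

    Assumes snapshots[0] is good, snapshots[-1] is bad, and that every
    snapshot after the first bad one is bad as well.
    """
    lo, hi = 0, len(snapshots) - 1   # lo is always good, hi is always bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(snapshots[mid]):
            hi = mid
        else:
            lo = mid
    return snapshots[hi]


# Hypothetical snapshot labels; "s5" is assumed to introduce the lockup.
snapshots = ["s0", "s1", "s2", "s3", "s4", "s5", "s6", "s7"]
print(first_bad(snapshots, lambda s: s >= "s5"))
```

With eight snapshots this needs three test runs instead of up to six, which matters when each run means rebuilding a package and waiting for a random lockup.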
Comment 29 Matthias Hopf 2005-11-02 15:34:19 UTC
This is almost impossible, as the new driver contains all support for newer chipsets as well. They are also from different branches (6.8.2 + patches vs. CVS), so changes are nontrivial. Especially as your card probably wasn't supported in 6.8.2 without the patches.

These patches have been back-ported features of the advancing CVS, but also introduced a lot of issues reported to not exist in the CVS at all.
Comment 30 Dirk Mueller 2005-11-02 22:34:11 UTC
well, it sounded to me to still be the best idea to track down this regression. 

Comment 31 Andreas Jaeger 2005-11-04 15:02:02 UTC
How can we move forward on this?
Comment 32 Matthias Hopf 2005-11-07 16:05:51 UTC
Created attachment 56590
Radeon driver CVS 2005-11-07 backported to 10.0

I just backported the current CVS radeon driver to 10.0.  Please try installing the 10.0 xorg packages and replace the radeon.o driver in /usr/X11R6/lib/modules/drivers with this one.

If the lockups occur again, we have a driver problem; if they don't, we have an issue in the general Xserver sources.
Comment 33 Dirk Mueller 2005-11-07 16:32:37 UTC
installing that radeon_drv.o gives only this error message: 

"(EE) LoadModule: Module radeon does not have a radeonModuleData data object"

Comment 34 Matthias Hopf 2005-11-07 17:28:41 UTC
Created attachment 56606
Radeon driver, unstripped

???
Ok, please try this one, which is not stripped.
If it still fails, please provide the complete log.
Comment 35 Matthias Hopf 2005-11-07 17:33:46 UTC
Ok, forget about that, my fault...
Wait, I'll have to update the module.
Comment 36 Matthias Hopf 2005-11-07 17:45:18 UTC
Created attachment 56610
Radeon driver CVS 2005-11-07 backported to 10.0, stripped

This is the fixed driver. The unstripped one would work as well, but is much larger to download.

The first attempt stripped too many symbols from the driver.
Comment 37 Dirk Mueller 2005-11-07 18:14:23 UTC
briefly testing (10min) it seems this works fine..

Comment 38 Robert Love 2005-11-07 18:46:57 UTC
I also confirm success: Running STABLE, xorg from 10.0, with this radeon.o and no lock ups.
Comment 39 Stefan Dirsch 2005-11-07 19:08:29 UTC
Ok. So this seems to be a general Xserver problem. Do we still see these 
"INVALID MEM ALLOCATION" lines in the logfile? In other words, is the lockup related to them?
Comment 40 Dirk Mueller 2005-11-07 20:31:05 UTC
I don't see those log messages anymore with Hopf's radeon_drv.o
Comment 41 Stefan Dirsch 2005-11-07 22:35:34 UTC
Ok. I think that's somewhat deep in the Xserver and related to the changes in xc/programs/Xserver/hw/xfree86/common/xf86pciBus.c. To make absolutely sure we could build the old driver for the new Xserver ...
Comment 42 Matthias Hopf 2005-11-08 14:50:32 UTC
Ok, will do that.
Comment 43 Stefan Dirsch 2005-11-08 14:55:30 UTC
Or simply back out the changes in xf86pciBus.c. I did hope to see a comment about this by Egbert, since he's really very familiar with the PCI scan code in xf86pciBus.c.
Comment 44 Dirk Mueller 2005-11-10 09:33:27 UTC
any news?
Comment 45 Egbert Eich 2005-11-10 12:07:24 UTC
> I don't see those log messages anymore with the hopf's radeon_drv.o
This is indeed strange as they don't come from the driver.
However this seems to be wrong:
[9] -1	0	0xe0000000 - 0xe8000000 (0x8000001) MX[B](B)
The range is one byte too big.
The same here:
0x00003000 - 0x00003100 (0x101) IX[B](B)
I don't think this is related to the hang and certainly should not change when
you only exchange the driver.
What I would like to have is:
a. a log file with the old driver.
b. access to a box that exhibits these problems.
Stefan, this seems to be built from the sources in stable, right? 6.9 or 7.0?
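
The off-by-one noted in this comment follows from treating the end address as inclusive. A small Python check, using only the addresses quoted in this thread (the helper name is hypothetical):

```python
def inclusive_size(start, end):
    """Size of an address range whose end address is included."""
    return end - start + 1


# Correct windows from the log in comment 3: end is the last valid address.
assert inclusive_size(0xE0000000, 0xE7FFFFFF) == 0x8000000
assert inclusive_size(0x3000, 0x30FF) == 0x100

# The suspicious entries: end is one past the last valid address,
# so the computed size is one byte too large.
assert inclusive_size(0xE0000000, 0xE8000000) == 0x8000001
assert inclusive_size(0x3000, 0x3100) == 0x101
```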
Comment 46 Dirk Mueller 2005-11-10 12:39:49 UTC
https://bugzilla.novell.com/attachment.cgi?id=54250

this one is the logfile from the old driver (plain 10.0 unmodified)

machine is still available under the same ip
Comment 47 Matthias Hopf 2005-11-10 17:10:35 UTC
Egbert: the driver is CVS, backported to 10.0 (small interface changes, no semantical change), the xorg it has been compiled for is plain 10.0.

Sorry for the delay for reference test (STABLE with 10.0 radeon driver), have been busy... Will create that one tomorrow.
Comment 48 Matthias Hopf 2005-11-11 16:34:22 UTC
Created attachment 57134
Radeon driver 10.0 compiled for CVS 2005-11-02 aka STABLE aka xorg-x11-server-6.8.2-119 

This is the old (6.8.2) radeon driver, compiled for the current (STABLE/Alpha2) Xserver.

Please test. If you see the lockups again, we don't have a driver issue, but a general Xserver problem.

I didn't test it so far, because I don't have a machine with a radeon card equipped right now. But it should work.
Comment 49 Dirk Mueller 2005-11-11 18:00:27 UTC
hmm, right now even the x server from STABLE appears to work, but your radeon_drv.so works as well. 

*confused*



Comment 50 Dirk Mueller 2005-11-11 23:09:44 UTC
hmm, no. after reboot plain STABLE locks up as expected. STABLE + the old radeon driver works fine. 

Comment 51 JP Rosevear 2005-11-15 13:48:52 UTC
STABLE+old driver+psmisc 21.8 working for me (see also bug 132676 and bug 132180)
Comment 52 Matthias Hopf 2005-11-15 14:12:09 UTC
Dirk, could you please try this combination, as well as STABLE Xorg + STABLE driver + psmisc 21.8?

It's getting strange, as you said that STABLE + the old driver works fine w/o lockup. This shouldn't happen...
Comment 53 Dirk Mueller 2005-11-15 14:22:12 UTC
see comment #50
Comment 54 JP Rosevear 2005-11-15 23:52:20 UTC
6.9rc seems to at least load now with the included radeon driver.
Comment 55 Stefan Dirsch 2005-11-16 10:06:30 UTC
Hmm ... someone removed me from this bugreport. :-(
Comment 56 Matthias Hopf 2005-11-16 15:04:16 UTC
JFI:

On my T42p with M10 NT [FireGL Mobility T2] I have Alpha2Plus installed, but I just realized I didn't check the fglrx driver.

I'll install Alpha3 on it now and test fglrx. As I'll be on the road on Friday and Saturday, chances are high that I will see those freezes myself if they are typical for this chipset.
Comment 57 Egbert Eich 2005-11-16 15:44:58 UTC
Created attachment 57527
new log file Xorg.0.log

This new log file shows that the problem with the resource sizes which caused remapping has been fixed in the latest 6.9 test version.
Also no more lockups seem to occur.
Comment 58 Egbert Eich 2005-11-16 15:45:52 UTC
Closing as 'fixed'. See comment #57.
Comment 59 Matthias Hopf 2005-11-16 17:51:57 UTC
Egbert, please don't close bugs until packages have been committed.

At least I got the bug reproduced on Alpha3, so I can test fixes here. Alpha2Plus seemed to work fine, though...

Forget my comments about fglrx. Another bug report.
Comment 60 Dirk Mueller 2005-11-16 21:21:05 UTC
the package is from STABLE, so it is committed. 

anyway, the bug is still there, my laptop again locked up, and there was literally nothing running except a plain xterm where I used less to scroll through the xsession-errors file. and bang, it's dead. 

Comment 61 Robert Love 2005-11-16 21:35:28 UTC
I too am still seeing the bug, using latest Xorg from STABLE.

Copying the radeon driver provided above fixes the problem.
Comment 62 Egbert Eich 2005-11-17 09:12:45 UTC
Matthias now tells me that the problem still exists. Interestingly, it goes away after an older version of the server that doesn't lock up has been started. Then not even the latest server causes a lockup.
I therefore asked him to regenerate the log file for a server that later locks up.
This way I can see if we still see the erroneous remappings.
The remappings may be the reason for the lockups if they happen *after* the drm module has been loaded.
Comment 63 Dirk Mueller 2005-11-17 10:32:42 UTC
Created attachment 57618
log of lockup when using plain STABLE xorg-x11

this is the lockup with current STABLE. it immediately locks up after ".. are not fatal to the x server". it looks like it tries to switch video mode, then nothing happens, then the LCD backlight turns off, and the machine is dead.
Comment 64 Matthias Hopf 2005-11-17 17:23:23 UTC
This is different from my machine. When I get a deadlock, the screen still shows the SUSE gdm screen, and everything is frozen.
Comment 65 Matthias Hopf 2005-11-24 17:35:32 UTC
For the sake of completeness: On the mailing list:

Actually, I just verified that the lock up occurs even with dri
completely disabled. Even SysRQ doesn't do anything anymore.

I also tried to start the 6.8.2 radeon driver again, it works like a
charm. After using one session with the 6.8.2 driver, I used the CVS
driver again, and the session froze again.

Sometimes, the machine freezes immediately (only got a black screen,
machine dead), sometimes I can work a few minutes.
Comment 68 Egbert Eich 2005-11-25 10:03:42 UTC
Let's find out what HW is involved:

Dirk uses:
ATI Technologies Inc M10 NT [FireGL Mobility T2] rev 128 
   chip 1002,4e54 card 1014,054f
Robert uses:
Radeon Fire GL (Mobility M10 NT Fire T2) (same thing?)

Matthias: 
  - What HW do you see the problem on?
  - Have you guys tried one of the obvious things (ie disabling HW 2D accel)?
Comment 69 Dirk Mueller 2005-11-25 12:16:13 UTC
we tried Option "NoAccel" already, no difference. 

it must be something in the initialisation I think, and it only occurs when using the new X server with the new radeon driver. 

both the old x server combined with the new radeon driver as well as the new x server with the old radeon driver work fine. that means that there must be something that is only executed in the new radeon driver when the new x server is present. 

if you can give me something to test, for example by disabling one initialisation function after the other or something like that, so that we can track down where the bug comes from then that would be already a great help. 
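
The reasoning in this comment can be written down as a small table of server/driver combinations. The Python sketch below records the outcomes reported in this thread and checks whether a single component explains the failure; the function and variable names are hypothetical, and "old"/"new" stand for the 10.0 and STABLE versions respectively.

```python
# Outcomes reported in this thread, keyed by (server, driver).
observed = {
    ("old", "old"): "ok",      # plain 10.0
    ("old", "new"): "ok",      # 10.0 server with the backported CVS driver
    ("new", "old"): "ok",      # STABLE server with the 10.0 driver
    ("new", "new"): "lockup",  # plain STABLE
}


def culprit(results):
    """Name the component whose new version alone predicts the failure."""
    bad = {combo for combo, outcome in results.items() if outcome == "lockup"}
    for index, name in enumerate(("server", "driver")):
        # A single component is the culprit if exactly the combinations
        # using its new version fail.
        if bad == {combo for combo in results if combo[index] == "new"}:
            return name
    return "interaction" if bad else "none"


print(culprit(observed))
```

For the outcomes listed here neither component alone predicts the lockup, which matches the conclusion that something in the new driver only misbehaves when the new server is present.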

Comment 70 Egbert Eich 2005-11-25 13:02:30 UTC
This sounds interesting. There are only a couple of calls into the driver.
I thought of EXA - but the driver that we use doesn't seem to use EXA as default yet - as far as I can tell from the logs.
Comment 71 JP Rosevear 2005-11-28 07:00:01 UTC
For #68, my thinkpad has a Radeon Fire GL (Mobility M10 NT Fire T2) like Robert's.

My desktop that started working fine in #51 is Radeon RV100 QY [Radeon 7000/VE]
Comment 72 Egbert Eich 2005-11-28 12:16:08 UTC
OK, thanks.
I would like to get resolution on this ASAP.
1. Matthias is going on vacation on Nov 30 (I think).
   Matthias, do you think you can come up with something before you leave?
2. I will be gone to the desktop arch meeting from Nov. 30 thru Dec. 5th 
   and I would like to take vacation in Dec. also.
If I need to look at this I need to be able to reproduce this. Since this only happens on M10's on mobile devices (as it looks like) I need such a box (it needs to arrive here on Dec. 6) as my flight will arrive on the 5th and I don't expect to make it here on that day.
Comment 73 Matthias Hopf 2005-11-28 14:36:54 UTC
Egbert,

I have this reproduced on an IBM T42p. This is an M10 NT FireGL Mobility T2 as well. However, reproducing might be difficult sometimes. Sometimes it will only crash after several minutes. It seems to easily crash on kdm startup if the system is warm. Acceleration helps to crash it, but it crashes with NoAccel as well.

Maybe something noteworthy is that I was *not* able to reproduce the bug on SL 10.0 with the new Xserver. At least it has run a complete x11perf run w/o lockup.

No, I don't think I will have a solution until tomorrow. I'll come up with a way to send the laptop to you.
Comment 75 Dirk Mueller 2005-11-28 14:45:56 UTC
this sounds like a gcc 4.0/4.1 issue then. did anyone try to recompile stable xorg-x11 with gcc 4.0?

Comment 76 Stefan Dirsch 2005-11-28 14:48:10 UTC
> this sounds like a gcc 4.0/4.1 issue then. did anyone try to recompile
> stable xorg-x11 with gcc 4.0?

No.
Comment 77 Matthias Hopf 2005-11-28 16:44:06 UTC
Ok, I couldn't reproduce the bug w/o acceleration.

Additionally, it seems now (with my latest tests) that it *IS* the dri driver
that's the culprit. I thought it is enough to NOT load "dri" and "glx"
in xorg.conf, but it seems the dri driver has been loaded anyway (I
remember something about using the ring buffer for command queuing if
dri is loaded)!

If I rename the radeon.ko module, I do not get any lockups!

So this narrows down the search range a lot.


Just a side remark:
The bug also occurs if the Radeon driver is used after the Xserver had been started with the fglrx driver (for initialization).
Comment 78 Matthias Hopf 2005-11-28 18:01:23 UTC
I know now why the dri driver has already been loaded: I only used a statically bound Xorg for testing... Oops.

However, in Dirk's config dri is not loaded, and he still had the effect.

This is getting strange.
Comment 79 Matthias Hopf 2005-11-29 14:58:05 UTC
One more comment:

This is not gcc 4.1 related, as I got the freeze reproduced with a statically linked X built on a 10.0 (gcc 4.0.2). Maybe it is a good idea to try compiling with gcc 3.4 first.

Egbert will continue investigation, he will receive the laptop on December 6th.
I'll be on vacation until December 27th.
Comment 80 Dirk Mueller 2005-11-29 15:04:52 UTC
I guess you forgot to reassign..
Comment 81 Stefan Dirsch 2005-11-29 15:11:10 UTC
Nobody needs to get nervous. In case we can't resolve this problem, we could add a second radeon driver ("radeon_old"), which we use for the affected chipsets.
Comment 82 Matthias Hopf 2005-11-29 15:52:00 UTC
(In reply to comment #80)
> I guess you forgot to reassign..

Yes... :]

Ok, this seems to be the related X.org bugzilla entry:

https://bugs.freedesktop.org/show_bug.cgi?id=4847
Comment 83 Dirk Mueller 2005-11-29 16:02:22 UTC
lockup happens without radeon.ko as well. 

radeon.ko is not loaded by default.
Comment 84 Dirk Mueller 2005-11-29 16:11:31 UTC
  Option        "RenderAccel" "off"
  Option       "DynamicClocks" "on"
  Option       "AGPFastWrite" "off"

in Section "Device" fixes the issue for me.
Comment 85 Dirk Mueller 2005-11-29 16:36:23 UTC
hmm, no. just took longer to lock up. 

are there any further options to try?

Comment 86 Stefan Dirsch 2005-11-29 22:56:40 UTC
*** Bug 135959 has been marked as a duplicate of this bug. ***
Comment 87 Peter Bowen 2005-12-02 20:07:06 UTC
Running the latest fglrx driver also locks up after a time.  So it appears that using the radeon drivers is not a requirement.  Using xorg-x11-6.9rc2-9 and fglrx_6_9_0_SUSE101-8.19.10-1 on NLD 10 alpha 2.
Comment 88 Robert Love 2005-12-02 20:20:05 UTC
Well, I am fine running the older radeon driver on STABLE, but the included radeon driver locks up, so the problem _does_ seem to be the driver.

I have not tried the fglrx recently (last month or so), but it used to work fine.

Weird.
Comment 89 Forgotten User OS1JNCFbCX 2005-12-02 20:21:27 UTC
Hmm, interesting. On my system (Thinkpad T42p, ATI Technologies Inc M10 NT [FireGL Mobility T2] (rev 80)) the radeon driver also crashes the system immediately after X server start but fglrx (even with DRI) runs rock solid.
Comment 90 Dirk Mueller 2005-12-08 13:58:49 UTC
PING
Comment 92 Pete Goodall 2005-12-13 15:03:59 UTC
First of all, why is this bug still marked as NEW when it has been confirmed for over a month?  Second, why have we still not put the old radeon drivers in the xorg-x11-drivers-video package?  We are still getting consistent X lockups, and using the driver from Comment #48 works fine.  We have Preview 2 coming out on Thursday, and once again anyone with a laptop that needs the radeon driver will have a difficult time with this.  I know that Matthias is on vacation, but is there no one else that can do this?

I cannot bump the priority of this bug any higher.  It is already at "Blocker".  What does it take to get the workaround implemented until Matthias returns to fix the radeon driver?
Comment 93 Stefan Dirsch 2005-12-13 15:23:30 UTC
1) Usually Egbert never sets his bugs to assigned. This doesn't mean anything!
   He's on vacation until the end of this year as well.
2) Only a few Radeon chipsets are affected!
3) The "old" radeon driver has different bugs, which will pop up again when
   using it for Alpha4.

I agree that we have a problem here, but there is no solution I can see here
for Alpha4. I would propose to document this in the release notes of Alpha4.

---
Use "fbdev" driver instead of "radeon" driver, when you get consistent X lockups.

  --> sax2 -r -m 0=fbdev

This problem is currently being investigated.
---
Comment 94 Stefan Dirsch 2005-12-14 22:59:32 UTC
I've now added an "atiold/atimiscold/r128old/radeonold" driver combo to the xorg-x11-driver-video package (this driver is a bit special when it comes to duplicating it!). You can configure the old radeon driver with 

  sax2 -r -m 0=radeonold

Of course this is too late for Alpha4.
Comment 95 Dirk Mueller 2005-12-19 11:49:44 UTC
http://lists.freedesktop.org/archives/xorg/2005-December/011678.html

Stefan, could you build an updated xorg-x11 with this patch included? It could have some effect on this bug..

Comment 96 Dirk Mueller 2005-12-19 11:50:30 UTC
ah, sndirsch is on holidays as well.. *sigh*
Comment 98 Dirk Mueller 2006-01-03 10:42:06 UTC
the upstream patch works just great here. Egbert's patch not tested. 

Comment 99 Stefan Dirsch 2006-01-03 14:33:20 UTC
Created attachment 61888
Patch mentioned by dmuelr

This patch is by Benjamin Herrenschmidt and has been adjusted to our sources by dmueller. See
http://lists.freedesktop.org/archives/xorg/2005-December/011678.html 
for details.
Comment 100 Stefan Dirsch 2006-01-03 15:02:36 UTC
Unfortunately Egbert's patch cannot be applied/adjusted easily on top of Benjamin's/Dirk's patch. It might already be obsolete, or it handles the stuff completely differently and therefore cannot/should not be merged with it, I think. Egbert, could you have a look at Benjamin's/Dirk's patch? Thanks. For now I will only apply Benjamin's/Dirk's patch.
Comment 101 Egbert Eich 2006-01-03 15:21:51 UTC
This patch looks OK. It still does not explain why the engine stops working a long time after the registers have been reprogrammed.
Comment 102 Egbert Eich 2006-01-03 15:23:12 UTC
Yeah, go ahead. I will still hold the company accountable for the two days of vacation that I wasted on this.
Comment 103 Stefan Dirsch 2006-01-03 16:30:47 UTC
A new xorg-x11-driver-video package with Benjamin's/Dirk's patch is now submitted for STABLE (Beta1). Egbert, it would be good to know whether this patch fixes the problem on your notebook as well.
Comment 104 Pete Goodall 2006-01-06 15:38:05 UTC
I have been using the new drivers for about two days now, and have not had any lockups.  AFAICT everything works great.  Dual-head mode now works very well.  Will need to test power management as we were nearly there at one point.  Thank you all very much for getting this done.  This will help tremendously in Preview 4.
Comment 105 Matthias Hopf 2006-01-11 11:49:32 UTC
Egbert, do the packages from STABLE fix the issue for your laptop as well?
Comment 106 Dirk Mueller 2006-01-17 08:42:34 UTC
lets consider the issue fixed. 
Comment 107 Stefan Dirsch 2006-01-17 09:44:14 UTC
Why? Egbert did not test it yet.
Comment 108 Egbert Eich 2006-01-20 12:39:30 UTC
I've tested it, but have not yet understood why it fixes the problem.
Given the limited amount of testing I can give this compared to the others here you should not wait for my word on this.
I will however dig into it to understand better what exactly it does differently.
Comment 109 Dirk Mueller 2006-01-20 13:19:18 UTC
interested parties are welcome to reopen the bug report if they find further problems. And it seems the guy I stole the patch from is pretty active upstream, so it will be solved by upstream anyway. 

Comment 110 Stefan Dirsch 2006-01-26 17:06:44 UTC
*** Bug 145735 has been marked as a duplicate of this bug. ***
Comment 111 Stefan Dirsch 2006-01-26 17:07:29 UTC
*** Bug 145596 has been marked as a duplicate of this bug. ***
Comment 112 Stefan Dirsch 2006-01-26 17:08:06 UTC
time to reopen. Too many still open bugs. :-(
Comment 113 JP Rosevear 2006-01-26 20:55:56 UTC
FWIW radeonold has not hardlocked in 6+ hours.  radeon locked 8-10 times in about 4 hours of use.
Comment 114 Dirk Mueller 2006-01-27 12:49:16 UTC
sure, we know already that radeonold works around the issue. It doesn't fix it though. 
Comment 115 Stefan Dirsch 2006-01-27 13:43:08 UTC
Benjamin released some new patches, but I didn't find the time yet to look at them. :-(
Comment 116 Stefan Dirsch 2006-01-29 11:59:43 UTC
Created attachment 65552
New radeon driver RPM with latest patch by Benjamin

- radeon-memmap-7.0-2.diff 
  (http://gate.crashing.org/~benh/)
  * obsoletes p_radeon-memmap.diff/
    p_radeon-memmap-fix.diff (Bug #127757)

Please test!
Comment 117 Dirk Mueller 2006-01-30 14:02:59 UTC
looking good so far..
Comment 118 Stefan Dirsch 2006-01-30 14:22:26 UTC
Thanks. I've submitted this patch now for STABLE/Beta3.
Comment 119 Stefan Dirsch 2006-01-30 20:29:00 UTC
I'll take it again. Setting to critical ...
Comment 120 Stefan Dirsch 2006-01-30 20:30:20 UTC
Now finally ...
Comment 121 Stefan Dirsch 2006-01-31 13:42:42 UTC
Created attachment 65848
New Radeon driver RPM for x86_64
Comment 122 Stefan Dirsch 2006-02-06 20:44:32 UTC
I decided to go back to the old radeon driver fixes by B. Herrenschmidt. In case the driver still does not work for you whereas "radeonold" does, please contact me. I'll then add your gfx card to the database to use this driver instead. I'll attach RPMs for testing.
Comment 123 Stefan Dirsch 2006-02-06 21:03:17 UTC
Created attachment 66610
xorg-x11-driver-video-6.9.0-14.i586.rpm
Comment 124 Stefan Dirsch 2006-02-06 21:04:01 UTC
Created attachment 66611
xorg-x11-driver-video-6.9.0-14.x86_64.rpm
Comment 125 Stefan Dirsch 2006-02-06 21:04:52 UTC
Closing as FIXED.
Comment 126 Stefan Dirsch 2006-02-11 10:55:34 UTC
I will likely try another updated patch to address the radeon driver issues. Please subscribe to Bug #150146 if you're interested in testing the new driver. (No, I don't want to reopen this bug report, with more than 120 comments, again.)
Comment 127 Egbert Eich 2006-02-20 11:30:01 UTC
Just for the record: 
the change that really fixed the lockups was the switch from RADEON_CONFIG_APER_SIZE to RADEON_CONFIG_MEMSIZE for the calculation of mc_fb_location. 
This can be observed in attachment #61888 for example.
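
To see why the choice of size matters for mc_fb_location, the Python sketch below packs an address range into a single 32-bit value. The register layout used here (low 16 bits for the start, high 16 bits for the last covered address, both in 64 KiB units) is an assumption for illustration and is not taken from this report or the attached patch. The base address and aperture size are borrowed from the window shown in the logs earlier in this thread; the video memory size is a hypothetical value chosen to be smaller than the aperture.

```python
def pack_fb_range(start, size):
    """Pack an address range into one 32-bit value.

    Assumed layout (illustration only): the low 16 bits hold start >> 16,
    the high 16 bits hold the last covered address >> 16.
    """
    last = start + size - 1
    return (last & 0xFFFF0000) | (start >> 16)


FB_START = 0xE0000000       # base of the window seen in the logs
APERTURE_SIZE = 0x08000000  # size of that window per the logs
VRAM_SIZE = 0x04000000      # hypothetical: less memory than aperture

# Range derived from the aperture size versus the actual memory size.
print(hex(pack_fb_range(FB_START, APERTURE_SIZE)))
print(hex(pack_fb_range(FB_START, VRAM_SIZE)))
```

Under these assumptions the two sizes yield different end addresses, so a range computed from the aperture size would claim more framebuffer than the card actually has.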
Comment 128 Joe Harmon 2006-02-21 18:56:06 UTC
Seems like the old driver made it into beta 4, but that we are back to the lockups in beta 4.2 of NLD 10 again.
Comment 129 Stefan Dirsch 2006-02-21 20:56:55 UTC
Which old driver? Why do you think we have an old radeon driver in Beta4? NLD10 Beta 4.2? Never heard of such a thing ...
Comment 131 Stefan Dirsch 2006-02-21 21:08:58 UTC
what's the latest RPM changelog entry in xorg-x11-driver-video of Beta4 and Beta 4.2? I would like to see which patch could be responsible for this.
Comment 132 Joe Harmon 2006-02-21 21:19:29 UTC
Stefan

Where do I find the latest change log for these?
Comment 133 Stefan Dirsch 2006-02-21 21:24:10 UTC
"rpm --changelog -qp xorg-x11-driver-video<whatever>.rpm", but I have no idea where to find these RPMs for NLD. :-(
Comment 134 Joe Harmon 2006-02-21 22:44:35 UTC
Here you go. This is from beta 4.2. The rpm from beta 4 ended with the Feb 15th message. So everything after that was from the 4.2 rpm.

rpm --changelog -qp xorg-x11-driver-video-6.9.0-19.i586.rpm
* Fri Feb 17 2006 - sndirsch@suse.de

- p_mga.diff:
  * fixes graphics corruption on Mystique revision 2 (Bug #151306)
- xf86-video-ati-disable-mc-clients.diff:
  * fixes video corruption on ATI ES1000 (Bug #151631)

* Fri Feb 17 2006 - sndirsch@suse.de

- rn50-pixelclock-limit.diff:
  * Limit pixel clock to model memory bandwidth limits
  (Bug #139361, X.Org Bug #5766)

* Thu Feb 16 2006 - sndirsch@suse.de

- radeon-memmap-7.0-3.diff:
  * obsoletes p_radeon-memmap.diff/p_radeon-memmap-fix.diff
  (Bug #150146)

* Thu Feb 16 2006 - sndirsch@suse.de

- avoid_random_bogus_dma.diff:
  * avoids random bugs bugs (Bug #140420, comments #45/46/47/48)
  * obsoletes r300-sn2-piowrite_v7.diff
  * this also happened on non-ia64 machines, so apply it everywhere

* Wed Feb 15 2006 - sndirsch@suse.de

- p_bug148696.diff:
  * fixes regression bug #148696 (IA64 related)
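
Picking out what changed between two package builds, as described at the top of this comment, can be automated. The Python sketch below lists the patch names from changelog entries dated after a cutoff. It assumes the entry header format shown above ("* Day Mon DD YYYY - address") and English day and month names; the function name is hypothetical.

```python
import re
from datetime import date, datetime

# Entry header as it appears in the changelog quoted in this comment.
HEADER = re.compile(r"^\* (\w{3} \w{3} \d{1,2} \d{4}) - \S+$")


def patches_after(changelog, cutoff):
    """Return top-level '- name:' items from entries dated after cutoff."""
    names, include = [], False
    for line in changelog.splitlines():
        match = HEADER.match(line.strip())
        if match:
            entry_date = datetime.strptime(match.group(1), "%a %b %d %Y").date()
            include = entry_date > cutoff
        elif include and line.startswith("- "):
            # Indented '* ...' detail lines do not start with '- ' and are skipped.
            names.append(line[2:].strip().rstrip(":"))
    return names


# Two entries copied from the changelog in this comment.
excerpt = """\
* Thu Feb 16 2006 - sndirsch@suse.de

- radeon-memmap-7.0-3.diff:
  * obsoletes p_radeon-memmap.diff/p_radeon-memmap-fix.diff

* Wed Feb 15 2006 - sndirsch@suse.de

- p_bug148696.diff:
  * fixes regression bug #148696 (IA64 related)
"""

print(patches_after(excerpt, date(2006, 2, 15)))
```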
Comment 135 Stefan Dirsch 2006-02-22 06:30:23 UTC
Ok. Did you ever check the RPMs I added to Bug #150146? Probably this change did break it again for you. :-(

* Thu Feb 16 2006 - sndirsch@suse.de
- radeon-memmap-7.0-3.diff:
  * obsoletes p_radeon-memmap.diff/p_radeon-memmap-fix.diff
  ( Bug #150146)
Comment 136 Joe Harmon 2006-02-22 15:38:20 UTC
The only RPM that I ever tried was the one from comment #123 in this bug, and that was with beta 2, because beta 2 was not working properly for me. Beta 2 wasn't locking up on me, though; it just wasn't allowing me to adjust the resolution properly. I was told that it was locking up for other people, and that this was then fixed in beta 4. I was able to work around the issue with the RPM from comment #123. Then with beta 4 I was able to adjust and test the resolution. With beta 4.2 I started experiencing the lockups and could only get around them with the switch to revert to the old radeon driver. However, I didn't need to install the old RPM before using the switch, which I did have to do with beta 2.
Comment 137 Stefan Dirsch 2006-02-22 15:51:05 UTC
Meanwhile I got a bugreport, that T41p no longer works (Bug #152473). Maybe you want to subscribe to this bugreport. I'm currently attaching RPMs for testing to this bugreport.
Comment 138 Joe Harmon 2006-02-22 17:20:07 UTC
Yep, I will continue in that bug. Thanks