Bug 327064

Summary: intel: xserver crashes machine during VT switch after having used xrandr once
Product: [openSUSE] openSUSE 11.0 Reporter: Dirk Mueller <dmueller>
Component: X.Org Assignee: Stefan Dirsch <sndirsch>
Status: RESOLVED FIXED QA Contact: E-mail List <xorg-maintainer-bugs>
Severity: Critical    
Priority: P5 - None CC: coolo, dmueller, eich, felix, sndirsch
Version: Alpha 2 Flags: coolo: SHIP_STOPPER-
Target Milestone: Alpha 0   
Hardware: i386   
OS: Other   
Whiteboard:
Found By: Development Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Bug Depends on: 282256    
Bug Blocks:    
Attachments: Xorg.0.log
commit-eecd3cc.diff

Description Dirk Mueller 2007-09-21 10:04:46 UTC
the intel driver crashes randomly just before suspending, which is due to suspend switching to text console. switching to text console is enough after having used xrandr at least once. 


+++ This bug was initially created as a clone of Bug #282256 +++
Comment 1 Stefan Dirsch 2007-09-21 10:23:16 UTC
So could you attach config and logfile? Especially the lines in logfile during VT switch?
Comment 2 Dirk Mueller 2007-09-21 11:24:30 UTC
in which logfile?
Comment 3 Luc Verhaegen 2007-09-21 12:00:38 UTC
Is this an Xorg bug or not?

If it is an Xorg bug, then the logfile referred to is /var/log/Xorg.0.log

If this is not an Xorg bug, then do reassign.
Comment 4 Dirk Mueller 2007-09-21 12:03:58 UTC
there is no logfile, the machine locks solid. 
Comment 5 Luc Verhaegen 2007-09-21 12:09:28 UTC
We are not asking for a log of the machine hanging, just a log of your X working so that we can see what it is that you are actually running and which subsystems you are using.
Comment 6 Dirk Mueller 2007-09-21 12:31:36 UTC
Created attachment 173858
Xorg.0.log
Comment 7 Felix Möller 2007-09-21 13:16:42 UTC
Dirk does this by any chance look like my screenshot of the initial bug #301101 ? (i.e. a kind of yellow screen?)
Comment 8 Stefan Dirsch 2007-09-21 14:38:55 UTC
Reassigning to Egbert.
Comment 11 Forgotten User ZhJd0F0L3x 2007-09-24 07:43:50 UTC
I gave a hp compaq nx5000 which was pretty plagued by this error to the Xorg team, so it should be in Luc's office. Egbert played around with it on Thursday.
Comment 12 Stefan Dirsch 2007-09-24 14:05:54 UTC
It is quite hard to reproduce this issue on the hp compaq nx5000 and according to Egbert it's running very unstable (unrelated to this issue).
Comment 13 Egbert Eich 2007-09-26 11:27:59 UTC
This problem seems to only surface under very special conditions. Dirk and Stefan tried to reproduce it and it looks like it only happens when the VT is switched after pulling the power plug. 
Stefan has tried several other scenarios but was unable to lock up this system.
Since the situations under which this error occurs are so special the current severity doesn't seem to be justified.
This bug is somewhat along the lines of the infamous NX5000 bug: #290219.
Comment 14 Dirk Mueller 2007-09-26 11:33:14 UTC
so using a laptop without AC power plugged in is a special scenario?
Comment 15 Egbert Eich 2007-09-26 12:26:43 UTC
Dirk, Stefan tried that. He was only able to trigger the problem when he vt-switched within a second or so after pulling the plug.
Comment 16 Stefan Dirsch 2007-09-26 12:32:03 UTC
Dirk, the reason to set this to LATER is just the amount of time this would take to investigate. It's not possible for 10.3 any more. Please do not take this personally. This issue won't be forgotten. I appreciate the time you invested yesterday to show me how to reproduce this issue. I could reproduce this issue only once today by suspending the machine before, but no longer remember the details. :-(
Comment 17 Dirk Mueller 2007-09-28 14:13:04 UTC
this is not the only circumstance when it happens, it merely makes it easier to trigger. yesterday my machine crashed twice even though the power was connected all the time. 

this is unacceptable, so let's restart the bug for 11.0.
Comment 18 Egbert Eich 2007-10-08 09:10:31 UTC
Dirk, I understand that this one is annoying. I will eventually look at it. But it's really hard to fix without a reliable way to trigger it.
Also, system hangs are difficult to understand without intimate knowledge of the system itself.
We can try to find out where it happens. Does this box still have a parallel port?
Then we can attach a tool and scatter outb()s all over the code until we have narrowed down where it happens.
Otherwise this is hard to find if the system is dead afterwards.
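
The outb() tracing idea above could be sketched roughly as follows. This is only an illustration, not code from the driver: the names checkpoint(), set_checkpoint_writer() and recording_writer() are hypothetical, and 0x378 is assumed as the conventional LPT1 data port. On the target machine the writer would wrap outb() and needs I/O privileges; the recording writer exists only for dry runs.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed debug port: conventional LPT1 data register. Any port whose
 * last written value can be observed externally would do. */
#define DEBUG_PORT 0x378

/* Writer callback, so markers can go to a real I/O port on the target
 * machine or to a stub during a dry run. */
typedef void (*port_writer)(unsigned char value, unsigned short port);

static port_writer checkpoint_writer = NULL;   /* NULL disables output */
static unsigned char last_checkpoint = 0;      /* 0 means "none reached yet" */

void set_checkpoint_writer(port_writer w)
{
    checkpoint_writer = w;
}

/* Emit a one-byte marker. After a hard hang, the value still latched on
 * the port shows which checkpoint was passed last. */
void checkpoint(unsigned char id)
{
    assert(id != 0);   /* 0 is reserved for "no checkpoint reached" */
    last_checkpoint = id;
    if (checkpoint_writer != NULL)
        checkpoint_writer(id, DEBUG_PORT);
}

/* Dry-run writer: remembers the last write instead of touching hardware. */
static unsigned char recorded_value = 0;
static unsigned short recorded_port = 0;

void recording_writer(unsigned char value, unsigned short port)
{
    recorded_value = value;
    recorded_port = port;
}

#if defined(__linux__) && (defined(__i386__) || defined(__x86_64__))
#include <sys/io.h>
/* Real writer for Linux/x86. The caller must already hold I/O
 * privileges for DEBUG_PORT (for example via ioperm() as root). */
void outb_writer(unsigned char value, unsigned short port)
{
    outb(value, port);
}
#endif
```

Calls such as checkpoint(1), checkpoint(2), ... would be placed along the VT-switch path, and the markers moved closer together on each run until the hang is bracketed between two of them.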
Comment 20 Dirk Mueller 2007-10-15 12:56:41 UTC
ok, two weeks later and still nothing has happened. 

some observations: 

a) it happens even without plugging/unplugging AC

b) DISPLAY=:0 xrandr trashes display when run from text mode while X server is running under :0. this only happens with intel, other chipsets work fine. it doesn't seem all right that the X server trashes my display on that command. 

c) smells like something is uninitialized. some X server runs are remarkably stable. I can switch at least a dozen times between console/X without a hitch. Sometimes it's extremely unstable, and the very first switch after booting already hangs the machine. 

d) running an arbitrary application that keeps the X server saturated with drawing seems like a good way to provoke the hang during vt switch. 

i believe that b) could be related to the actual problem, given that KDE does intensive xrandr queries, and if it does that at the wrong time during switch from/to vt, it could cause this machine hang

my hardware doesn't have a parallel port anymore but it has firewire and USB, if that helps. 

given that this has more an ultrablocker state for me than being a normal bug (essentially on bad days I have to reboot my machine 10 times a day, with lots of unsaved work that is essentially lost), is there any chance that anyone could work on this pretty pretty please?
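
For anyone trying to reproduce observations c) and d), the repeated console/X switching can be automated. The following is a minimal sketch under stated assumptions, not a tool used in this report: the helper names switch_vt() and toggle_vts() are hypothetical, and the VT numbers are whatever the local setup uses. It relies on the Linux VT_ACTIVATE and VT_WAITACTIVE ioctls and needs a descriptor opened on a console device with sufficient privileges.

```c
#include <assert.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/vt.h>

/* Activate virtual terminal `vt` through console descriptor `fd` and
 * wait until the switch has completed. Returns 0 on success, -1 on
 * failure (for example an invalid descriptor or missing privileges). */
int switch_vt(int fd, int vt)
{
    if (ioctl(fd, VT_ACTIVATE, vt) < 0)
        return -1;
    if (ioctl(fd, VT_WAITACTIVE, vt) < 0)
        return -1;
    return 0;
}

/* Switch from the X server's VT to a text VT and back, `rounds` times,
 * pausing one second after each round trip. Returns the number of
 * completed round trips. */
int toggle_vts(int fd, int text_vt, int x_vt, int rounds)
{
    int done = 0;

    assert(rounds >= 0);
    for (int i = 0; i < rounds; i++) {
        if (switch_vt(fd, text_vt) < 0 || switch_vt(fd, x_vt) < 0)
            break;
        done++;
        sleep(1);
    }
    return done;
}
```

A caller would open the console (for instance /dev/tty0, as root) and run something like toggle_vts(fd, 1, 7, 20) while a drawing-heavy client keeps the X server busy; the VT numbers 1 and 7 are assumptions and differ between setups.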



Comment 21 Egbert Eich 2007-10-16 11:11:02 UTC
Dirk,
I seem to recall from Stefan's investigations that this problem happens during VT switch (independently from suspend).
I'm not sure if b) is related - but if it is, it seems to be a valuable hint. We had a bug report a while ago reporting a similar problem - but for some reason I lost sight of it.
Stefan, do you still have this laptop you tried to reproduce Dirk's problem with? If so, I may be able to look at this remotely (with your help).
Comment 22 Stefan Dirsch 2007-10-16 12:32:39 UTC
Currently I don't have this laptop, but I think I can get it back from seife without any problems.
Comment 23 Dirk Mueller 2007-10-16 15:57:24 UTC
I don't think it's specific to this particular laptop though, it happens with any intel hardware I've seen. 
Comment 24 Dirk Mueller 2007-10-16 16:08:04 UTC
daniel gollub has the same issue, but has a totally different laptop (although also intel gfx card)

Comment 25 Felix Möller 2007-10-17 10:00:26 UTC
I updated to today's factory:
# cat /etc/SuSE-release
openSUSE 10.3.1 (i586) Alpha0
VERSION = 10.3.1

and it seems like the VT switching behavior really changed, I have now switched successfully 20 times.
Comment 26 Stefan Dirsch 2007-10-17 11:11:51 UTC
Ok. Maybe updating xorg-x11-server, xorg-x11-driver-video and xorg-x11-driver-input already fixes this issue for Dirk as well.
Comment 27 Stefan Dirsch 2007-10-22 23:52:06 UTC
Same issue?

  http://lists.freedesktop.org/archives/xorg/2007-October/029534.html
Comment 28 Dirk Mueller 2007-10-23 11:21:59 UTC
regarding comment #25/#26: no, the bug is not fixed, it still happens with xorg 7.3. 


Comment 29 Dirk Mueller 2007-10-23 13:04:04 UTC
regarding comment #27: one reboot later I can confirm that the patch only makes it more reliable to crash. 
Comment 30 Forgotten User ZhJd0F0L3x 2007-10-25 16:50:57 UTC
The machine is now actively used by Frank, so I'm not sure if we can give it to you.
Comment 31 Stefan Dirsch 2007-10-25 17:54:09 UTC
Ok. If he doesn't stumble across this issue it doesn't make sense to investigate this issue on this machine anyway. Otherwise he probably will be happy to switch to another machine, so we can have it again for investigating. ;-)
Comment 32 Dirk Mueller 2007-10-26 11:41:15 UTC
oh, was there any interest already in debugging the issue? I'd be happy to debug it here if I know how..
Comment 33 Dirk Mueller 2007-11-05 12:32:17 UTC
another week without reply... ping..
Comment 35 Stefan Dirsch 2007-11-09 03:30:10 UTC
Dirk, could you try current git of the intel driver? I'm asking since a lot of VT switch fixes have been pushed upstream now.
Comment 36 Dirk Mueller 2007-11-09 11:16:11 UTC
do you have one that compiles? last time I tried it didn't work on the first try. 
Comment 37 Stefan Dirsch 2007-11-09 15:22:23 UTC
Hmm. I submitted 2.1.99 today to STABLE. It might already contain this patch.
Comment 38 Stefan Dirsch 2007-11-10 15:53:30 UTC
It contains the patch. If 2.1.99 / STABLE is not an option, please try git commit eecd3ccedee6c4acf101591f7e60673660379e62. I'll attach the patch for your convenience.

Comment 39 Stefan Dirsch 2007-11-10 15:54:22 UTC
Created attachment 182906
commit-eecd3cc.diff
Comment 41 Dirk Mueller 2007-11-12 15:42:34 UTC
ah, that gives a clue on why the previous patch was not working.. I'm testing the new one. 

Comment 42 Dirk Mueller 2007-11-12 17:09:36 UTC
I'm unable to get a crash after applying this typo fix to STABLE: 

--- src/i830_driver.c
+++ src/i830_driver.c
@@ -2056,8 +2056,8 @@
     * Make sure the DPLL is active and not in VGA mode or the
     * write of PIPEnCONF may cause a crash
     */
-   if ((pI830->saveDPLL_B & DPLL_VCO_ENABLE) &&
-       (pI830->saveDPLL_B & DPLL_VGA_MODE_DIS))
+   if ((pI830->saveDPLL_A & DPLL_VCO_ENABLE) &&
+       (pI830->saveDPLL_A & DPLL_VGA_MODE_DIS))
           OUTREG(PIPEACONF, pI830->savePIPEACONF);
    i830WaitForVblank(pScrn);
    OUTREG(DSPACNTR, pI830->saveDSPACNTR);

so indeed the "ubuntu fix" fixes my bug. any chance of a backport?
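
The effect of the one-letter change in the patch above can be illustrated in isolation. This is a sketch only: the bit positions are assumed for illustration (the driver's register header is authoritative), and struct saved_regs and the function names are hypothetical stand-ins for the saved state kept in pI830. The point is that the guard before writing PIPEACONF must look at pipe A's saved DPLL value; consulting pipe B's value gives the wrong answer whenever the two pipes were saved in different states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions assumed for illustration only. */
#define DPLL_VCO_ENABLE   (1u << 31)
#define DPLL_VGA_MODE_DIS (1u << 28)

/* Hypothetical stand-in for the saved register state. */
struct saved_regs {
    uint32_t dpll_a;
    uint32_t dpll_b;
};

/* The pipe configuration register may only be written when that pipe's
 * DPLL is running and not in VGA mode. Returns 1 if so, otherwise 0. */
int dpll_allows_pipeconf_write(uint32_t dpll)
{
    return (dpll & DPLL_VCO_ENABLE) && (dpll & DPLL_VGA_MODE_DIS);
}

/* Before the fix: the guard for pipe A consulted pipe B's DPLL. */
int may_restore_pipe_a_buggy(const struct saved_regs *s)
{
    assert(s != NULL);
    return dpll_allows_pipeconf_write(s->dpll_b);
}

/* After the fix: the guard for pipe A consults pipe A's DPLL. */
int may_restore_pipe_a_fixed(const struct saved_regs *s)
{
    assert(s != NULL);
    return dpll_allows_pipeconf_write(s->dpll_a);
}
```

With pipe A's DPLL saved as inactive and pipe B's saved as active, the old guard would allow the PIPEACONF write that the comment in the driver warns may crash, while the corrected guard skips it.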
Comment 43 Stefan Dirsch 2007-11-12 17:55:29 UTC
I'll take care of this bugreport now.
Comment 44 Stefan Dirsch 2007-11-12 17:57:24 UTC
> any chance of a backport?
Would you test this, which means going back to the X.Org packages of openSUSE 10.3 + this patch?
Comment 45 Stefan Dirsch 2007-11-12 18:25:10 UTC
fixed for STABLE now.

xorg-x11-driver-video.changes:
-------------------------------------------------------------------
Mon Nov 12 19:00:52 CET 2007 - sndirsch@suse.de

- xf86-video-intel
  * updated to git commit 10988c5, which fixes Bug #327064
Comment 46 Dirk Mueller 2007-11-13 13:07:41 UTC
bad news, the 10.3+patch combo does not help. there must have been other fixes in the driver which are necessary in addition. 

however, stable is stable again. (yippie!)

Comment 47 Stefan Dirsch 2007-11-13 13:24:42 UTC
Ok. Closing as fixed for STABLE for now. We can still consider updating the intel driver for openSUSE 10.3 once driver release 2.2 is available. A lot of issues have been fixed since 2.1.1.