Bug 213840

Summary: Wrong module loaded for agp support
Product: [openSUSE] openSUSE 10.2 Reporter: Tamás Cservenák <t.cservenak>
Component: X.Org Assignee: Christian Zoz <zoz>
Status: VERIFIED FIXED QA Contact: Stefan Dirsch <sndirsch>
Severity: Normal    
Priority: P2 - High CC: sndirsch
Version: Alpha 5 plus   
Target Milestone: ---   
Hardware: i686   
OS: Other   
Whiteboard:
Found By: Other Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: X.Org config
X.Org log (current session, still running)
hwinfo -gfx
X.Org packages installed (rpm -qa | grep xorg | sort)
hwinfo-all
X.Org log (working)
dmesg of boot into init 3, with wrong module load order
lsmod after boot into init 3, with wrong module load order
dmesg after modifications (rmmod insmod in proper order)
lsmod after modifications (rmmod insmod in proper order)
modules.alias
dmesg - with Stefan comment #25
lsmod - with Stefan comment #25
Xorg.0.log - with Stefan comment #25

Description Tamás Cservenák 2006-10-19 22:34:55 UTC
X driver "Radeon" cannot initialize AGP, thus it drops DRI support for screen.

The AGP modules agpgart and intel_agp load properly and are loaded before X starts.

(BTW: the same happens with the latest ATI fglrx driver. The AGP-related modules are loaded, but fglrx cannot initialize AGP, so it enables 2D acceleration only, not 3D.)
Comment 1 Tamás Cservenák 2006-10-19 22:35:56 UTC
Created attachment 102088
X.Org config
Comment 2 Tamás Cservenák 2006-10-19 22:37:19 UTC
Created attachment 102089
X.Org log (current session, still running)
Comment 3 Tamás Cservenák 2006-10-19 22:41:07 UTC
Created attachment 102091
hwinfo -gfx
Comment 4 Tamás Cservenák 2006-10-19 22:43:18 UTC
Created attachment 102092
X.Org packages installed (rpm -qa | grep xorg | sort)
Comment 5 Stefan Dirsch 2006-10-20 01:47:36 UTC
> (WW) RADEON(0): [agp] AGP not available
> (EE) RADEON(0): [agp] AGP failed to initialize. Disabling the DRI.

Could you check with testgart (same package name) if agp really doesn't work?
BTW, you're setting pretty optimistic options:

  Option "AGPMode" "8"
  Option "AGPFastWrite" "on"

I would propose to try without them first.
Comment 6 Tamás Cservenák 2006-10-20 02:10:27 UTC
>  Option "AGPMode" "8"
>  Option "AGPFastWrite" "on"
>
> I would propose to try without them first.

No go, same error as in attached xorg log.


TESTGART:

five:~ # rpm -qa | grep testgart
testgart-0.1-209
five:~ #
five:~ # testgart
open: No such file or directory
five:~ #
five:~ # cat /proc/mtrr
reg00: base=0x00000000 (   0MB), size=1024MB: write-back, count=1
reg01: base=0xd0000000 (3328MB), size= 128MB: write-combining, count=1
five:~ #

I do not know testgart, and Google did not help me. Is the result NEGATIVE? Does it need some parameters, maybe?

Attached full hwinfo output.
Comment 7 Tamás Cservenák 2006-10-20 02:12:07 UTC
Created attachment 102098
hwinfo-all
Comment 8 Tamás Cservenák 2006-10-20 02:29:20 UTC
dmesg upon init 5 (from 3)

Linux agpgart interface v0.101 (c) Dave Jones
[drm] Initialized drm 1.0.1 20051102
[drm] Initialized radeon 1.25.0 20060524 on minor 0
mtrr: 0xd0000000,0x10000000 overlaps existing 0xd0000000,0x8000000
mtrr: 0xd0000000,0x10000000 overlaps existing 0xd0000000,0x8000000
mtrr: 0xd0000000,0x10000000 overlaps existing 0xd0000000,0x8000000
[drm:radeon_cp_init] *ERROR* radeon_cp_init called without lock held
[drm:drm_unlock] *ERROR* Process 9890 using kernel context 0


Modules loaded:
five:~ # lsmod
Module                  Size  Used by
radeon                109216  0
drm                    71828  1 radeon
agpgart                35656  1 drm
...

Module intel-agp is NOT loaded.

There is no "agpgart: Detected an Intel xxx Chipset." line from agpgart in dmesg.
Comment 9 Tamás Cservenák 2006-10-20 02:35:50 UTC
And one final note: device

/dev/agpgart

does NOT exist.

This hardware ran openSUSE 10.1 successfully with fglrx, without problems (AGP was working). The system was UPDATED yesterday from 10.1 to 10.2 Factory.
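
For reference, the two symptoms reported so far (intel_agp missing from lsmod, no /dev/agpgart) can be checked together. This is a minimal sketch, not part of any package: the helper name agp_ready is made up, and its two optional arguments (a modules list and a device node path) exist only so it can be tried against test files.

```shell
#!/bin/sh
# Sketch only: report whether the AGP modules are loaded and whether the
# /dev/agpgart device node exists. agp_ready is a hypothetical helper.
agp_ready() {
    modules=${1:-/proc/modules}
    devnode=${2:-/dev/agpgart}
    rc=0
    for mod in agpgart intel_agp; do
        # /proc/modules lists one module per line, name first.
        if grep -q "^$mod " "$modules" 2>/dev/null; then
            echo "$mod: loaded"
        else
            echo "$mod: NOT loaded"
            rc=1
        fi
    done
    if [ -e "$devnode" ]; then
        echo "$devnode: present"
    else
        echo "$devnode: missing"
        rc=1
    fi
    return $rc
}

# Check the running system; a non-zero result is expected on non-AGP hosts.
agp_ready || true
```

On the affected system this would be expected to report intel_agp as not loaded and /dev/agpgart as missing.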
Comment 10 Stefan Dirsch 2006-10-20 03:03:21 UTC
Hmm. Does it help to load intel-agp manually? What's the output in dmesg then?
Andy, could you help investigate here?

PCI_DEVICE_ID_INTEL_82875_HB (0x2578) is registered to be supported via intel-agp.
Comment 11 Stefan Dirsch 2006-10-20 03:11:02 UTC
Hell, what's this?

16: PCI(AGP) 00.0: 0600 Host bridge
  [Created at pci.281]
  UDI: /org/freedesktop/Hal/devices/pci_8086_2578
  Unique ID: qLht.On0fbN4EXAA
  SysFS ID: /devices/pci0000:00/0000:00:00.0
  SysFS BusID: 0000:00:00.0
  Hardware Class: bridge
  Model: "ABIT 82875P/E7210 Memory Controller Hub"
  Vendor: pci 0x8086 "Intel Corporation"
  Device: pci 0x2578 "82875P/E7210 Memory Controller Hub"
  SubVendor: pci 0x147b "ABIT Computer Corp."
  SubDevice: pci 0x1022 
  Revision: 0x02
  Driver: "i82875p_edac"
  Driver Modules: "i82875p_edac"
  Memory Range: 0xf0000000-0xf7ffffff (rw,prefetchable)
  Module Alias: "pci:v00008086d00002578sv0000147Bsd00001022bc06sc00i00"
  Driver Info #0:
    Driver Status: i82875p_edac is active
    Driver Activation Cmd: "modprobe i82875p_edac"
  Driver Info #1:
    Driver Status: intel_agp is not active
    Driver Activation Cmd: "modprobe intel_agp"
  Config Status: cfg=no, avail=yes, need=no, active=unknown

Please try to unload i82875p_edac and load intel-agp instead.
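
To double-check which driver has claimed the host bridge without a full hwinfo run, the driver symlink in sysfs can be read. A rough sketch, assuming the usual /sys/bus/pci/devices/<BusID>/driver layout; bound_driver is a made-up helper name and the second argument (an alternative sysfs root) is there only for testing.

```shell
#!/bin/sh
# Sketch only: print the driver currently bound to a PCI device by reading
# the "driver" symlink in sysfs. bound_driver is a hypothetical helper.
bound_driver() {
    dev=$1
    root=${2:-/sys}
    link=$root/bus/pci/devices/$dev/driver
    if [ -L "$link" ]; then
        # The link target ends in the driver name.
        basename "$(readlink "$link")"
    else
        echo "(none)"
    fi
}

# SysFS BusID of the host bridge from the hwinfo output above.
bound_driver 0000:00:00.0
```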
Comment 12 Tamás Cservenák 2006-10-20 03:25:49 UTC
five:~ # init 3
five:~ # lsmod | grep edac
i82875p_edac           10500  0
edac_mc                26704  1 i82875p_edac
five:~ # rmmod i82875p_edac edac_mc
five:~ # modprobe intel_agp
five:~ # dmesg
...EDAC MC: Removed device 0 for i82875p_edac i82875p: DEV 0000:00:00.0
agpgart: Detected an Intel i875 Chipset.
agpgart: AGP aperture is 128M @ 0xf0000000

Blacklisted module i82875p_edac, rebooted.

Xorg.log:

(II) RADEON(0): Using 8 MB GART aperture
(II) RADEON(0): Using 1 MB for the ring buffer
(II) RADEON(0): Using 2 MB for vertex/indirect buffers
(II) RADEON(0): Using 5 MB for GART textures
(II) RADEON(0): Memory manager initialized to (0,0) (1280,8191)
(II) RADEON(0): Reserved area from (0,1024) to (1280,1026)
(II) RADEON(0): Largest offscreen area available: 1280 x 7165
(II) RADEON(0): Will use back buffer at offset 0x1400000
(II) RADEON(0): Will use depth buffer at offset 0x1900000
(II) RADEON(0): Will use 100352 kb for textures at offset 0x1e00000
(**) RADEON(0): Initializing backing store
(==) RADEON(0): Backing store disabled
(**) RADEON(0): DRI Finishing init !
(II) RADEON(0): X context handle = 0x1
(II) RADEON(0): [drm] installed DRM signal handler
(II) RADEON(0): [DRI] installation complete
(**) RADEON(0): EngineRestore (32/32)
(II) RADEON(0): [drm] Added 32 65536 byte vertex/indirect buffers
(II) RADEON(0): [drm] Mapped 32 vertex/indirect buffers
(II) RADEON(0): [drm] dma control initialized, using IRQ 185
(II) RADEON(0): [drm] Initialized kernel GART heap manager, 5111808
(WW) RADEON(0): DRI init changed memory map, adjusting ...
(WW) RADEON(0):   MC_FB_LOCATION  was: 0xd7ffd000 is: 0xd7ffd000
(WW) RADEON(0):   MC_AGP_LOCATION was: 0xffffffc0 is: 0xf07ff000
(**) RADEON(0): GRPH_BUFFER_CNTL from 30004c4c to 20217c7c
(II) RADEON(0): Direct rendering enabled

SUCCESS!

Attached good Xorg.log.

Apropos, what are i82875p_edac and edac_mc? Some unsupported HW that is being recognized by 10.2?

Comment 13 Tamás Cservenák 2006-10-20 03:27:00 UTC
Created attachment 102101
X.Org log (working)
Comment 14 Stefan Dirsch 2006-10-20 09:00:08 UTC
Steffen, see my comment #11. i82875p_edac is a kernel module for core system error reporting.

What we need instead is the intel-agp module. Where do we need to fix this issue: in hwinfo, in the kernel, or by blacklisting this module (how is this done?)?
Comment 15 Steffen Winterfeldt 2006-10-20 10:43:14 UTC
I guess the idea is to load both modules, probably via some trickery in
/etc/modprobe.d/.
Comment 16 Stefan Dirsch 2006-10-20 11:00:52 UTC
zoz told me that loading both modules should be the default. Tamás, could you please try 

  modprobe -vn $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)

and add the output? Thanks.
Comment 17 Tamás Cservenák 2006-10-20 11:26:23 UTC
five:~ # modprobe -vn $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)
insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/edac_mc.ko 
insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/i82875p_edac.ko 

If you look at the initial posts, you will see that the i82875p_edac module loaded properly on my system, but it seems to me that the load order is the problem. By default (before I blacklisted it) it was loaded automatically, but it "hid" (disabled?) AGP from intel-agp.

Now, WITH the changes from comment #12 (so intel-agp is LOADED and AGP is working properly, runlevel 5), I executed:
five:~ # modprobe i82875p_edac

dmesg:
edac_mc: module not supported by Novell, setting U taint flag.
EDAC MC: Ver: 2.0.1 Oct  9 2006
i82875p_edac: module not supported by Novell, setting U taint flag.
EDAC i82875p: i82875p init one
PCI: Unable to reserve mem region #1:1000@fecf0000 for device 0000:00:06.0
EDAC MC0: Giving out device to i82875p_edac i82875p: DEV 0000:00:00.0

It seems OK.

As you saw before in comment #8, when I loaded intel-agp WITH i82875p_edac LOADED, dmesg was silent about AGP detection (and no /dev/agpgart existed), as if AGP was disabled. I could only "fire it up" by blacklisting i82875p_edac and rebooting.
Comment 18 Stefan Dirsch 2006-10-20 12:15:23 UTC
> five:~ # modprobe -vn $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)
> insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/edac_mc.ko 
> insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/i82875p_edac.ko 
IMHO 

  insmod /lib/modules/2.6.18-11-default/kernel/drivers/char/agp/intel-agp.ko

is missing here. 

# grep 2578 modules.alias
modules.alias:alias pci:v00008086d00002578sv*sd*bc*sc*i* i82875p_edac
modules.alias:alias pci:v00008086d00002578sv*sd*bc06sc00i00* intel_agp

I'm not sure what's wrong here. Is it modprobe or pci hotplug?
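
To illustrate why both entries apply to this device: the alias patterns are shell-style wildcards, and the Module Alias string reported by hwinfo in comment #11 matches both of them. A small sketch (matching_modules is a made-up name; the two patterns are copied from the grep output above):

```shell
#!/bin/sh
# Sketch only: list which of the two modules.alias entries quoted above
# match a given modalias string. matching_modules is a hypothetical helper.
matching_modules() {
    modalias=$1
    # Generic entry: any class/subclass/interface on device 8086:2578.
    case $modalias in
        pci:v00008086d00002578sv*sd*bc*sc*i*) echo i82875p_edac ;;
    esac
    # More specific entry: only the host bridge class (bc06sc00i00).
    case $modalias in
        pci:v00008086d00002578sv*sd*bc06sc00i00*) echo intel_agp ;;
    esac
}

# Module Alias of the host bridge as reported by hwinfo in comment #11.
matching_modules "pci:v00008086d00002578sv0000147Bsd00001022bc06sc00i00"
```

Both module names are printed for this modalias, so the alias file itself does name intel_agp as a candidate.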
Comment 19 Tamás Cservenák 2006-10-20 12:26:09 UTC
It seems that changing the module load order solves the problem.

Use case:
1. removed module i82875p_edac from the blacklist, thus reverting the system to
unmodified openSUSE 10.2 Alpha 5

2. rebooted to init 3; dmesg and lsmod attached. All modules loaded (order was
i82875p_edac, intel-agp); looks OK, but /dev/agpgart does not exist.

3. rmmod i82875p_edac intel-agp agpgart

4. modprobe intel-agp. /dev/agpgart appears, dmesg contains lines about AGP
discovery.

5. modprobe i82875p_edac. dmesg contains EDAC lines, seems all OK.

6. init 5, AGP works OK, radeon initialized DRI.
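
Steps 3 to 5 above can be sketched as a small script. This is only an illustration: reload_agp_first and the RUN variable are made up, and by default the script just prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: the manual reload sequence from steps 3-5, as a script.
# reload_agp_first is a hypothetical helper. By default it only prints the
# commands; set RUN=1 (as root, from runlevel 3) to actually execute them.
reload_agp_first() {
    for cmd in \
        "rmmod i82875p_edac intel-agp agpgart" \
        "modprobe intel-agp" \
        "modprobe i82875p_edac"
    do
        if [ "${RUN:-0}" = 1 ]; then
            # Word splitting of $cmd is intentional here.
            $cmd || return 1
        else
            echo "$cmd"
        fi
    done
}

reload_agp_first
```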
Comment 20 Tamás Cservenák 2006-10-20 12:28:15 UTC
Created attachment 102123
dmesg of boot into init 3, with wrong module load order
Comment 21 Tamás Cservenák 2006-10-20 12:28:47 UTC
Created attachment 102124
lsmod after boot into init 3, with wrong module load order
Comment 22 Tamás Cservenák 2006-10-20 12:29:25 UTC
Created attachment 102125
dmesg after modifications (rmmod insmod in proper order)
Comment 23 Tamás Cservenák 2006-10-20 12:29:50 UTC
Created attachment 102126
lsmod after modifications (rmmod insmod in proper order)
Comment 24 Tamás Cservenák 2006-10-20 12:30:52 UTC
> # grep 2578 modules.alias

Hmm, I cannot locate modules.alias.

I repeat, this system is UPGRADED from openSUSE 10.1 to openSUSE 10.2 Alpha 5....
Maybe the upgrade is broken?
Comment 25 Stefan Dirsch 2006-10-20 12:37:58 UTC
> I'm not sure what's wrong here. Is it modprobe or pci hotplug?
Question is still open.

Looks like we need a special modprobe rule in /etc/modprobe.d. Does this one work for you?

cat > /etc/modprobe.d/xorg-x11-driver-video << EOF
# Load i82875p_edac after intel-agp
# should be one line !!! (stupid bugzilla!)
install i82875p_edac /sbin/modprobe -i intel-agp && /sbin/modprobe i82875p_edac
EOF
Comment 26 Stefan Dirsch 2006-10-20 12:38:48 UTC
> > # grep 2578 modules.alias
>
> Hmm, i cannot locate modules.alias.
This is in /lib/modules/<kernel-version>
Comment 27 Stefan Dirsch 2006-10-21 08:46:04 UTC
(In reply to comment #18)
> > five:~ # modprobe -vn $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)
> > insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/edac_mc.ko 
> > insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/i82875p_edac.ko 
> IMHO 
> 
>   insmod /lib/modules/2.6.18-11-default/kernel/drivers/char/agp/intel-agp.ko
> 
> is missing here. 

If intel-agp is already loaded, then modprobe -vn will not show it,
because it does not need to load it anymore. Use --show-depends
additionally in this case:
modprobe -vn --show-depends $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)

> # grep 2578 modules.alias
> modules.alias:alias pci:v00008086d00002578sv*sd*bc*sc*i* i82875p_edac
> modules.alias:alias pci:v00008086d00002578sv*sd*bc06sc00i00* intel_agp
> 
> I'm not sure what's wrong here. Is it modprobe or pci hotplug?

What does the modalias look like?

Comment 28 Tamás Cservenák 2006-10-23 18:30:45 UTC
This looks OK to me:

five:~ # modprobe -vn --show-depends $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)
insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/edac_mc.ko 
insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/i82875p_edac.ko 
insmod /lib/modules/2.6.18-11-default/kernel/drivers/char/agp/agpgart.ko 
insmod /lib/modules/2.6.18-11-default/kernel/drivers/char/agp/intel-agp.ko 

Will attach modalias.

Will try this too:
> cat > /etc/modprobe.d/xorg-x11-driver-video << EOF
> # Load i82875p_edac after intel-agp
> # should be one line !!! (stupid bugzilla!)
> install i82875p_edac /sbin/modprobe -i intel-agp && 
> /sbin/modprobe > i82875p_edac
> EOF
Comment 29 Tamás Cservenák 2006-10-23 18:37:46 UTC
Created attachment 102346
modules.alias
Comment 30 Tamás Cservenák 2006-10-23 21:57:01 UTC
While I was away, the system was updated to the upcoming openSUSE 10.2 Beta1 (the last entry in the changelog is dated "2006-10-20").
Comment 31 Tamás Cservenák 2006-10-23 22:02:29 UTC
Tested Stefan's suggestion:
> cat > /etc/modprobe.d/xorg-x11-driver-video << EOF
> # Load i82875p_edac after intel-agp
> # should be one line !!! (stupid bugzilla!)
> install i82875p_edac /sbin/modprobe -i intel-agp && 
> /sbin/modprobe > i82875p_edac
> EOF

It _partially_ works: the boot procedure HANGS for about 1.5-2 minutes after showing line 344 ("ieee1394: Host added...") but proceeds normally after this pause. intel-agp gets loaded (see attached lsmod), but now the i82875p_edac module does not load. X successfully initializes AGP.

Attached files with "-xorg-x11-driver-video" suffix: dmesg, lsmod, Xorg.0.log
Comment 32 Tamás Cservenák 2006-10-23 22:03:45 UTC
Created attachment 102376
dmesg - with Stefan comment #25
Comment 33 Tamás Cservenák 2006-10-23 22:04:47 UTC
Created attachment 102377
lsmod - with Stefan comment #25
Comment 34 Tamás Cservenák 2006-10-23 22:05:39 UTC
Created attachment 102378
Xorg.0.log - with Stefan comment #25
Comment 35 Stefan Dirsch 2006-10-24 04:00:53 UTC
Please attach /etc/modprobe.d/xorg-x11-driver-video. Probably you simply did a cut & paste, which would explain the hang.
Comment 36 Olaf Hering 2006-10-24 06:29:24 UTC
two drivers can't bind to the same device.
does anyone seriously use edac?
CONFIG_EDAC=n is the correct fix.
Comment 37 Tamás Cservenák 2006-10-24 11:54:22 UTC
(In reply to comment #35)
> Please attach /etc/modprobe.d/xorg-x11-driver-video. Probably you simply did a
> cut & paste, which would explain the hang.
> 

Yes, I realized that. My current /etc/modprobe.d/xorg-x11-driver-video is below (I added "-i" to the second modprobe). It works without a hang.

install i82875p_edac /sbin/modprobe -i intel-agp ; /sbin/modprobe -i i82875p_edac
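
A note on why the extra "-i" matters, as far as I understand modprobe: -i (--ignore-install) tells modprobe to skip install commands for the module it is asked to load. Without it, the modprobe call inside the install rule would look up the same rule again and re-enter itself, which may be what the boot hang looked like. A sketch that prints the corrected rule (print_agp_first_rule is a made-up helper name):

```shell
#!/bin/sh
# Sketch only: print the one-line install rule from this comment.
# print_agp_first_rule is a hypothetical helper; it writes nothing itself.
print_agp_first_rule() {
    # -i on the second modprobe keeps it from running this same install
    # rule again when it loads i82875p_edac.
    cat << 'EOF'
# Load intel-agp before i82875p_edac
install i82875p_edac /sbin/modprobe -i intel-agp ; /sbin/modprobe -i i82875p_edac
EOF
}

print_agp_first_rule
```

To try it, redirect the output into /etc/modprobe.d/xorg-x11-driver-video as root.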
Comment 38 Tamás Cservenák 2006-10-24 12:01:31 UTC
(In reply to comment #36)
> two drivers can't bind to the same device.
> does anyone seriously use edac?
> CONFIG_EDAC=n is the correct fix.
> 

I think this statement is correct. Even though i82875p_edac is now loaded with Stefan's fix (comment #25), it does not print anything in dmesg. It simply gets loaded but not activated, as in the initial case.

When EDAC announces itself in dmesg (gets activated), AGP is not activated.
Comment 39 Stefan Dirsch 2006-10-24 12:12:28 UTC
Ok. I will add /etc/modprobe.d/xorg-x11-driver-video to the xorg-x11-driver-video package to make sure i82875p_edac no longer causes any trouble. I'm not sure whether we really should disable the build of i82875p_edac or even edac in general. Maybe someone wants to use it. Greg, what do you think?

Short summary for Greg:
intel-agp/i82875p_edac conflict. i82875p_edac is preferred. Therefore intel-agp doesn't work. Should we disable the build of i82875p_edac or even edac in general? For now I will add a workaround to make sure intel-agp is loaded before i82875p_edac.
Comment 40 Stefan Dirsch 2006-10-24 13:09:33 UTC
workaround enabled for buildservice and STABLE (openSUSE 10.2 Beta2).
Comment 41 Christian Zoz 2006-10-24 13:14:43 UTC
What about
   blacklist i82875p_edac
somewhere in modprobe.conf?
Comment 42 Stefan Dirsch 2006-10-24 13:24:01 UTC
somewhere means what?
Comment 43 Christian Zoz 2006-10-24 13:48:26 UTC
modprobe.d/<somewhere>

If any of your packages already owns such a file, you may add it there. Or add it to modprobe.d/blacklist.
   agadez:~ # rpm -qf /etc/modprobe.d/blacklist
   sysconfig-0.50.9-13.8

Assign the bug to me if I should add it.
Comment 44 Stefan Dirsch 2006-10-24 13:56:38 UTC
So you think I should replace

  install i82875p_edac /sbin/modprobe -i intel-agp && /sbin/modprobe i82875p_edac

in /etc/modprobe.d/xorg-x11-driver-video with

  blacklist i82875p_edac 

?
Comment 45 Andreas Kleen 2006-10-24 14:09:09 UTC
Blacklisting EDAC is the right solution right now. We should probably also write
a script to check for more such duplicated IDs and blacklist the less
important one in each case.

Longer term PCI subsystem needs to be fixed to allow this kind of coexistence.
I guess that is something for Greg.
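
A rough sketch of what such a check could look like, assuming the usual "alias <pattern> <module>" line format of modules.alias. The helper name find_shared_pci_ids is made up, and it only compares the vendor/device prefix of each PCI alias, ignoring the subsystem and class fields, so it will over-report where the patterns do not actually overlap.

```shell
#!/bin/sh
# Sketch only: report PCI vendor/device IDs that more than one module claims
# in a modules.alias file. find_shared_pci_ids is a hypothetical helper;
# it reads the file given as $1, or standard input if no file is given.
find_shared_pci_ids() {
    awk '
        $1 == "alias" && $2 ~ /^pci:v[0-9A-Fa-f]+d[0-9A-Fa-f]+sv/ {
            id = $2
            sub(/sv.*$/, "", id)      # keep only "pci:v<vendor>d<device>"
            key = id SUBSEP $3
            if (!(key in seen)) {     # count each module once per ID
                seen[key] = 1
                if (id in mods)
                    mods[id] = mods[id] " " $3
                else
                    mods[id] = $3
                n[id]++
            }
        }
        END {
            for (id in n)
                if (n[id] > 1)
                    print id ": " mods[id]
        }
    ' "${1:--}" | sort
}

# Demo on the two entries quoted in comment #18.
find_shared_pci_ids << 'EOF'
alias pci:v00008086d00002578sv*sd*bc*sc*i* i82875p_edac
alias pci:v00008086d00002578sv*sd*bc06sc00i00* intel_agp
EOF
```

Against a real system it would be run as find_shared_pci_ids /lib/modules/<kernel-version>/modules.alias.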
Comment 46 Christian Zoz 2006-10-24 14:23:00 UTC
to comment 44:

Yes. Or let me add it to the blacklist file.
Comment 47 Stefan Dirsch 2006-10-24 14:27:41 UTC
Ok. I propose to add it to the general blacklist file.
Comment 48 Stefan Dirsch 2006-10-24 14:28:37 UTC
And I will remove /etc/modprobe.d/xorg-x11-driver-video again.
Comment 49 Stefan Dirsch 2006-10-25 13:13:11 UTC
(In reply to comment #48)
> And I will remove /etc/modprobe.d/xorg-x11-driver-video again.

done.

Comment 50 Christian Zoz 2006-10-26 09:14:26 UTC
Added the module to the blacklist in svn. Will be in the next beta.