Bugzilla – Bug 213840
Wrong module loaded for agp support
Last modified: 2007-06-05 09:37:19 UTC
X driver "Radeon" cannot initialize AGP, so it drops DRI support for the screen. The AGP modules agpgart and intel_agp load properly and are loaded before starting X. (BTW: the same happens with the latest ATI fglrx driver. The AGP-related modules are loaded, but fglrx cannot initialize AGP, so it enables 2D acceleration only, not 3D.)
Created attachment 102088 [details] X.Org config
Created attachment 102089 [details] X.Org log (current session, still running)
Created attachment 102091 [details] hwinfo -gfx
Created attachment 102092 [details] X.Org packages installed (rpm -qa | grep xorg | sort)
> (WW) RADEON(0): [agp] AGP not available
> (EE) RADEON(0): [agp] AGP failed to initialize. Disabling the DRI.
Could you check with testgart (same package name) if agp really doesn't work? BTW, you're setting pretty optimistic options:
    Option "AGPMode" "8"
    Option "AGPFastWrite" "on"
I would propose to try without them first.
> Option "AGPMode" "8"
> Option "AGPFastWrite" "on"
>
> I would propose to try without them first.
No go, same error as in the attached xorg log.
TESTGART:
    five:~ # rpm -qa | grep testgart
    testgart-0.1-209
    five:~ #
    five:~ # testgart
    open: No such file or directory
    five:~ #
    five:~ # cat /proc/mtrr
    reg00: base=0x00000000 ( 0MB), size=1024MB: write-back, count=1
    reg01: base=0xd0000000 (3328MB), size= 128MB: write-combining, count=1
    five:~ #
I do not know testgart, and Google did not help me. Is the result NEGATIVE? Does it need some parameters, maybe? Attached full hwinfo output.
Created attachment 102098 [details] hwinfo-all
dmesg upon init 5 (from 3):
    Linux agpgart interface v0.101 (c) Dave Jones
    [drm] Initialized drm 1.0.1 20051102
    [drm] Initialized radeon 1.25.0 20060524 on minor 0
    mtrr: 0xd0000000,0x10000000 overlaps existing 0xd0000000,0x8000000
    mtrr: 0xd0000000,0x10000000 overlaps existing 0xd0000000,0x8000000
    mtrr: 0xd0000000,0x10000000 overlaps existing 0xd0000000,0x8000000
    [drm:radeon_cp_init] *ERROR* radeon_cp_init called without lock held
    [drm:drm_unlock] *ERROR* Process 9890 using kernel context 0
Modules loaded:
    five:~ # lsmod
    Module                  Size  Used by
    radeon                109216  0
    drm                    71828  1 radeon
    agpgart                35656  1 drm
    ...
Module intel-agp is NOT loaded. There is no "agpgart: Detected an Intel xxx Chipset." line from agpgart in dmesg.
And one final note: the device /dev/agpgart does NOT exist. This HW was running openSUSE 10.1 successfully using fglrx without problems (AGP was working). The system was UPDATED yesterday from 10.1 to 10.2 Factory.
Hmm. Does it help to load intel-agp manually? What's the output in dmesg then? Andy, could you help investigate here? PCI_DEVICE_ID_INTEL_82875_HB (0x2578) is registered to be supported via intel-agp.
Hell, what's this?
    16: PCI(AGP) 00.0: 0600 Host bridge
      [Created at pci.281]
      UDI: /org/freedesktop/Hal/devices/pci_8086_2578
      Unique ID: qLht.On0fbN4EXAA
      SysFS ID: /devices/pci0000:00/0000:00:00.0
      SysFS BusID: 0000:00:00.0
      Hardware Class: bridge
      Model: "ABIT 82875P/E7210 Memory Controller Hub"
      Vendor: pci 0x8086 "Intel Corporation"
      Device: pci 0x2578 "82875P/E7210 Memory Controller Hub"
      SubVendor: pci 0x147b "ABIT Computer Corp."
      SubDevice: pci 0x1022
      Revision: 0x02
      Driver: "i82875p_edac"
      Driver Modules: "i82875p_edac"
      Memory Range: 0xf0000000-0xf7ffffff (rw,prefetchable)
      Module Alias: "pci:v00008086d00002578sv0000147Bsd00001022bc06sc00i00"
      Driver Info #0:
        Driver Status: i82875p_edac is active
        Driver Activation Cmd: "modprobe i82875p_edac"
      Driver Info #1:
        Driver Status: intel_agp is not active
        Driver Activation Cmd: "modprobe intel_agp"
      Config Status: cfg=no, avail=yes, need=no, active=unknown
Please try to unload i82875p_edac and load intel-agp instead.
    five:~ # init 3
    five:~ # lsmod | grep edac
    i82875p_edac 10500 0
    edac_mc 26704 1 i82875p_edac
    five:~ # rmmod i82875p_edac edac_mc
    five:~ # modprobe intel_agp
    five:~ # dmesg
    ...
    EDAC MC: Removed device 0 for i82875p_edac i82875p: DEV 0000:00:00.0
    agpgart: Detected an Intel i875 Chipset.
    agpgart: AGP aperture is 128M @ 0xf0000000
Blacklisted module i82875p_edac, reboot.
Xorg.log:
    (II) RADEON(0): Using 8 MB GART aperture
    (II) RADEON(0): Using 1 MB for the ring buffer
    (II) RADEON(0): Using 2 MB for vertex/indirect buffers
    (II) RADEON(0): Using 5 MB for GART textures
    (II) RADEON(0): Memory manager initialized to (0,0) (1280,8191)
    (II) RADEON(0): Reserved area from (0,1024) to (1280,1026)
    (II) RADEON(0): Largest offscreen area available: 1280 x 7165
    (II) RADEON(0): Will use back buffer at offset 0x1400000
    (II) RADEON(0): Will use depth buffer at offset 0x1900000
    (II) RADEON(0): Will use 100352 kb for textures at offset 0x1e00000
    (**) RADEON(0): Initializing backing store
    (==) RADEON(0): Backing store disabled
    (**) RADEON(0): DRI Finishing init !
    (II) RADEON(0): X context handle = 0x1
    (II) RADEON(0): [drm] installed DRM signal handler
    (II) RADEON(0): [DRI] installation complete
    (**) RADEON(0): EngineRestore (32/32)
    (II) RADEON(0): [drm] Added 32 65536 byte vertex/indirect buffers
    (II) RADEON(0): [drm] Mapped 32 vertex/indirect buffers
    (II) RADEON(0): [drm] dma control initialized, using IRQ 185
    (II) RADEON(0): [drm] Initialized kernel GART heap manager, 5111808
    (WW) RADEON(0): DRI init changed memory map, adjusting ...
    (WW) RADEON(0): MC_FB_LOCATION was: 0xd7ffd000 is: 0xd7ffd000
    (WW) RADEON(0): MC_AGP_LOCATION was: 0xffffffc0 is: 0xf07ff000
    (**) RADEON(0): GRPH_BUFFER_CNTL from 30004c4c to 20217c7c
    (II) RADEON(0): Direct rendering enabled
SUCCESS! Attached good Xorg.log. Apropos, what are i82875p_edac and edac_mc? Some unsupported HW that is being recognized by 10.2?
Created attachment 102101 [details] X.Org log (working)
Steffen, see my comment #11. i82875p_edac is a kernel module for core system error reporting. What we need instead is the intel-agp module. Where do we need to fix this issue? In hwinfo, in the kernel, or do we need to blacklist this module (how is this done?)?
I guess the idea is to load both modules, probably via some trickery in /etc/modprobe.d/.
zoz told me that loading both modules should be the default. Tamás, could you please try modprobe -vn $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias) and add the output? Thanks.
    five:~ # modprobe -vn $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)
    insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/edac_mc.ko
    insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/i82875p_edac.ko
If you look at the initial posts, you will see that the i82875p_edac module loaded properly on my system, but it seems to me that the load order is problematic. By default (before I blacklisted it) it was loaded automatically, but it "hid" (disabled?) AGP from intel-agp.
Now, WITH the changes from comment #12 (so intel-agp is LOADED and AGP is working properly, init is 5), I executed:
    five:~ # modprobe i82875p_edac
dmesg:
    edac_mc: module not supported by Novell, setting U taint flag.
    EDAC MC: Ver: 2.0.1 Oct 9 2006
    i82875p_edac: module not supported by Novell, setting U taint flag.
    EDAC i82875p: i82875p init one
    PCI: Unable to reserve mem region #1:1000@fecf0000 for device 0000:00:06.0
    EDAC MC0: Giving out device to i82875p_edac i82875p: DEV 0000:00:00.0
It seems OK. As you saw before, in comment #8, when I loaded intel-agp WITH i82875p_edac already LOADED, dmesg was silent about AGP detection (and no /dev/agpgart existed), as if AGP was disabled, and I could not "fire it up" except by blacklisting i82875p_edac and rebooting.
> five:~ # modprobe -vn $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)
> insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/edac_mc.ko
> insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/i82875p_edac.ko
IMHO
    insmod lib/modules/2.6.18-11-default/kernel/drivers/char/agp/intel-agp.ko
is missing here.
    # grep 2578 modules.alias
    modules.alias:alias pci:v00008086d00002578sv*sd*bc*sc*i* i82875p_edac
    modules.alias:alias pci:v00008086d00002578sv*sd*bc06sc00i00* intel_agp
I'm not sure what's wrong here. Is it modprobe or pci hotplug?
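One possible reading of the two modules.alias entries quoted above: both patterns match the host bridge's modalias, so the alias lookup by itself does not single out one module. A minimal sketch in Python, assuming that shell-style wildcard matching (as done by `fnmatch`) is a fair approximation of how the aliases are compared. The modalias string is the one from the hwinfo output in this report; `matching_modules` is a hypothetical helper for illustration, not part of modprobe.

```python
from fnmatch import fnmatchcase

# Alias patterns as quoted from modules.alias in this report.
ALIASES = [
    ("pci:v00008086d00002578sv*sd*bc*sc*i*", "i82875p_edac"),
    ("pci:v00008086d00002578sv*sd*bc06sc00i00*", "intel_agp"),
]

# Modalias of device 0000:00:00.0, taken from the hwinfo output.
MODALIAS = "pci:v00008086d00002578sv0000147Bsd00001022bc06sc00i00"


def matching_modules(modalias, aliases):
    """Return every module whose alias pattern matches the modalias.

    Assumption: shell-style globbing stands in for the real alias lookup.
    """
    return [module for pattern, module in aliases
            if fnmatchcase(modalias, pattern)]


if __name__ == "__main__":
    # Both modules are expected to show up for this device.
    print(matching_modules(MODALIAS, ALIASES))
```

If both names are printed, the alias file offers two candidates for the same device, which fits the symptom that the two modules compete for the host bridge.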
It seems that the changed module load order solves the problem. Use case:
1. Removed module i82875p_edac from the blacklist, thus reverting the system to unmodified openSUSE 10.2 Alpha 5; reboot.
2. Rebooted to init 3; dmesg and lsmod attached. All modules loaded (order was i82875p_edac, intel-agp); looks OK, but /dev/agpgart does not exist.
3. rmmod i82875p_edac intel-agp agpgart
4. modprobe intel-agp. /dev/agpgart appears, dmesg contains lines about AGP discovery.
5. modprobe i82875p_edac. dmesg contains EDAC lines, all seems OK.
6. init 5, AGP works OK, radeon initialized DRI.
Created attachment 102123 [details] dmesg of boot into init 3, with wrong module load order
Created attachment 102124 [details] lsmod after boot into init 3, with wrong module load order
Created attachment 102125 [details] dmesg after modifications (rmmod insmod in proper ordr)
Created attachment 102126 [details] lsmod after modifications (rmmod insmod in proper ordr)
> # grep 2578 modules.alias
Hmm, I cannot locate modules.alias. I repeat, this system was UPGRADED from openSUSE 10.1 to openSUSE 10.2 Alpha 5... Maybe the upgrade is broken?
> I'm not sure what's wrong here. Is it modprobe or pci hotplug?
Question is still open. Looks like we need a special modprobe rule in /etc/modprobe.d. Does this one work for you?
    cat > /etc/modprobe.d/xorg-x11-driver-video << EOF
    # Load i82875p_edac after intel-agp
    # should be one line !!! (stupid bugzilla!)
    install i82875p_edac /sbin/modprobe -i intel-agp && /sbin/modprobe i82875p_edac
    EOF
> > # grep 2578 modules.alias
>
> Hmm, i cannot locate modules.alias.
This is in /lib/modules/<kernel-version>
(In reply to comment #18)
> > five:~ # modprobe -vn $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)
> > insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/edac_mc.ko
> > insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/i82875p_edac.ko
> IMHO
>
> insmod lib/modules/2.6.18-11-default/kernel/drivers/char/agp/intel-agp.ko
>
> is missing here.
If intel-agp is already loaded, then modprobe -vn will not show it, because it does not need to load it anymore. Use --show-depends additionally in this case:
    modprobe -vn --show-depends \
        $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)
> # grep 2578 modules.alias
> modules.alias:alias pci:v00008086d00002578sv*sd*bc*sc*i* i82875p_edac
> modules.alias:alias pci:v00008086d00002578sv*sd*bc06sc00i00* intel_agp
>
> I'm not sure what's wrong here. Is it modprobe or pci hotplug?
What does the modalias look like?
This looks OK to me:
    five:~ # modprobe -vn --show-depends $(cat /sys/bus/pci/devices/0000\:00\:00.0/modalias)
    insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/edac_mc.ko
    insmod /lib/modules/2.6.18-11-default/kernel/drivers/edac/i82875p_edac.ko
    insmod /lib/modules/2.6.18-11-default/kernel/drivers/char/agp/agpgart.ko
    insmod /lib/modules/2.6.18-11-default/kernel/drivers/char/agp/intel-agp.ko
Will attach modalias. Will try this too:
> cat > /etc/modprobe.d/xorg-x11-driver-video << EOF
> # Load i82875p_edac after intel-agp
> # should be one line !!! (stupid bugzilla!)
> install i82875p_edac /sbin/modprobe -i intel-agp &&
> /sbin/modprobe
> i82875p_edac
> EOF
Created attachment 102346 [details] modules.alias
While I was away, the system was updated to the upcoming openSUSE 10.2 Beta1 (the last entry in the changelog is dated "2006-10-20").
Tested Stefan's suggestion:
> cat > /etc/modprobe.d/xorg-x11-driver-video << EOF
> # Load i82875p_edac after intel-agp
> # should be one line !!! (stupid bugzilla!)
> install i82875p_edac /sbin/modprobe -i intel-agp &&
> /sbin/modprobe
> i82875p_edac
> EOF
It _partially_ works: the boot procedure HANGS for about 1.5-2 minutes after showing line 344 ("ieee1394: Host added...") but proceeds normally after this pause. intel-agp gets loaded (see attached lsmod), but now the i82875p_edac module does not load. X successfully initializes AGP. Attached files with the "-xorg-x11-driver-video" suffix: dmesg, lsmod, Xorg.0.log
Created attachment 102376 [details] dmesg - with Stefan comment #25
Created attachment 102377 [details] lsmod - with Stefan comment #25
Created attachment 102378 [details] Xorg.0.log - with Stefan comment #25
Please attach /etc/modprobe.d/xorg-x11-driver-video. Probably you simply did a cut & paste, which would explain the hang.
two drivers can't bind to the same device. does anyone seriously use edac? CONFIG_EDAC=n is the correct fix.
(In reply to comment #35)
> Please attach /etc/modprobe.d/xorg-x11-driver-video. Probably you simply did a
> cut & paste, which would explain the hang.
Yes, I realized that. My current /etc/modprobe.d/xorg-x11-driver-video is below (I added "-i" after the second modprobe). It works without a hang.
    install i82875p_edac /sbin/modprobe -i intel-agp ; /sbin/modprobe -i i82875p_edac
(In reply to comment #36)
> two drivers cant bind to the same device.
> does anyone seriously use edac?
> CONFIG_EDAC=n is the correct fix.
I think this statement is correct. Even when i82875p_edac is loaded with Stefan's fix (comment #25), it prints nothing in dmesg. It simply gets loaded but not activated, as in the initial case. When EDAC does announce itself (gets activated), AGP is not activated.
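Since only one driver can be bound to the host bridge at a time, it can help to check which one actually holds the device instead of inferring it from dmesg. A minimal sketch in Python, assuming the kernel exposes the usual `driver` symlink under the device's sysfs directory (worth verifying on the kernel in question); `bound_driver` and its `sysfs_root` parameter are hypothetical names introduced for illustration.

```python
import os


def bound_driver(bus_id, sysfs_root="/sys"):
    """Return the name of the driver bound to a PCI device, or None.

    Assumption: <sysfs_root>/bus/pci/devices/<bus_id>/driver is a symlink
    pointing at the driver's directory whenever a driver is bound.
    """
    link = os.path.join(sysfs_root, "bus", "pci", "devices", bus_id, "driver")
    if not os.path.islink(link):
        # No symlink: no driver is currently bound to this device.
        return None
    # The last path component of the link target is the driver name.
    return os.path.basename(os.readlink(link))


if __name__ == "__main__":
    # 0000:00:00.0 is the host bridge from the hwinfo output in this report.
    print(bound_driver("0000:00:00.0"))
```

On the affected system one would expect this to report the EDAC driver in the broken case and the AGP driver after the load order is changed, but that expectation is an inference from the comments above, not something measured here.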
Ok. I will add /etc/modprobe.d/xorg-x11-driver-video to the xorg-x11-driver-video package to make sure i82875p_edac no longer causes any trouble. I'm not sure whether we really should disable the build of i82875p_edac, or even of edac in general. Maybe someone wants to use it. Greg, what do you think?
Short summary for Greg: intel-agp/i82875p_edac conflict. i82875p_edac is preferred, therefore intel-agp doesn't work. Should we disable the build of i82875p_edac or even edac in general? For now I will add a workaround to make sure intel-agp is loaded before i82875p_edac.
workaround enabled for buildservice and STABLE (openSUSE 10.2 Beta2).
What about blacklisting i82875p_edac somewhere in modprobe.conf?
somewhere means what?
modprobe.d/<somewhere>. If any of your packages already owns such a file, you may add it there. Or add it to modprobe.d/blacklist.
    agadez:~ # rpm -qf /etc/modprobe.d/blacklist
    sysconfig-0.50.9-13.8
Assign the bug to me if I should add it.
So you think I should replace
    install i82875p_edac /sbin/modprobe -i intel-agp && /sbin/modprobe i82875p_edac
in /etc/modprobe.d/xorg-x11-driver-video with
    blacklist i82875p_edac
?
Blacklisting EDAC is the right solution right now. We should probably also write a script to check for more such duplicated IDs and blacklist the less important module in each case. Longer term, the PCI subsystem needs to be fixed to allow this kind of coexistence. I guess that is something for Greg.
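The check for duplicated IDs suggested above could look roughly like the following. This is only a sketch in Python, not an existing tool: it groups `alias pci:...` lines by vendor and device ID and reports IDs claimed by more than one module. `duplicated_pci_ids` is a hypothetical name, aliases with a wildcard vendor or device are deliberately skipped, and the third sample line (`example_mod`) is made up for illustration.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   alias pci:v00008086d00002578sv*sd*bc*sc*i* i82875p_edac
# Only aliases with a fixed 8-digit vendor and device ID are considered.
ALIAS_RE = re.compile(
    r"^alias\s+pci:v([0-9A-Fa-f]{8})d([0-9A-Fa-f]{8})\S*\s+(\S+)$"
)


def duplicated_pci_ids(lines):
    """Map (vendor, device) to its modules, for IDs claimed more than once."""
    claims = defaultdict(set)
    for line in lines:
        match = ALIAS_RE.match(line.strip())
        if match:
            vendor, device, module = match.groups()
            claims[(vendor.lower(), device.lower())].add(module)
    return {ids: sorted(mods) for ids, mods in claims.items() if len(mods) > 1}


SAMPLE = [
    "alias pci:v00008086d00002578sv*sd*bc*sc*i* i82875p_edac",
    "alias pci:v00008086d00002578sv*sd*bc06sc00i00* intel_agp",
    # Hypothetical entry, present only to show a non-conflicting ID.
    "alias pci:v0000ABCDd00001234sv*sd*bc*sc*i* example_mod",
]

if __name__ == "__main__":
    for (vendor, device), modules in duplicated_pci_ids(SAMPLE).items():
        print(vendor, device, ", ".join(modules))
```

To run it against a real system, the lines would come from the modules.alias file under /lib/modules/<kernel-version>, as mentioned earlier in this report. Deciding which module is "less important" remains a manual step.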
to comment 44: Yes. Or let me add it to the blacklist file.
Ok. I propose to add it to the general blacklist file.
And I will remove /etc/modprobe.d/xorg-x11-driver-video again.
(In reply to comment #48) > And I will remove /etc/modprobe.d/xorg-x11-driver-video again. done.
Added module to blacklist in svn. Will be in next beta.