Bugzilla – Full Text Bug Listing

| Summary: | PCI: Multiple Domains Are Not Supported ( kernel warning/error ) | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Ric Johnson <fhj-52> |
| Component: | Kernel | Assignee: | Greg Kroah-Hartman <gregkh> |
| Status: | RESOLVED WONTFIX | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P5 - None | CC: | fhj-52 |
| Version: | Final | Keywords: | release_note |
| Target Milestone: | --- | ||
| Hardware: | x86-64 | ||
| OS: | SuSE Linux 10.0 | ||
| Whiteboard: | |||
| Found By: | Other | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | List of supported OS for GA-2CEWH-RH mainboard | ||
|
Description
Ric Johnson
2006-03-01 08:47:52 UTC
Hi Ric, this should be supported in the upcoming SL10.1/SLES10 kernel. We do not backport hardware support to older SuSE Linux kernels, though, so I need to close this as WONTFIX. Please do give the SL10.1 beta a try.

Olaf, sorry, but no: this is not supported in SL10.1 or SLES10, so trying out a new kernel will not help at all. The problem is that we have some initial attempts at doing this, but when run they break other types of systems (see lkml for the bug reports), so the majority of this patch has been backed out in the -mm series. It will take access to one of these machines (local access preferred) and some time to get it all working properly, and that will have to happen after SLES10 is released. So sorry, Linux does not yet support this kind of hardware very well.

First, thanks. You guys are great. :) Did you know that I initially wanted SuSE as my Linux of choice 6 years ago but did not choose it, primarily because the license model was too confusing for a newbie? I am sure you did not, but I mention it because I have some years of Linux experience (primarily RPM distros), have successfully built RPMs and kernels in the past, and am available to work on this very significant problem. I don't claim to be a kernel expert and my "SuSE" experience is extremely low, but I think I still have enough enthusiasm to make up the difference.

I am rather upset with Gigabyte because they advertise this mobo as compatible with both 64-bit and 32-bit flavors of Linux, in particular SuSE (as well as RH). I mention that because in email conversations last week they said they would work on it. They should provide some assistance if/when needed.

Olaf Kirch: Please do not close as won't fix. This is a current as well as future problem for Novell & SuSE, as well as for the Linux operating system on x86 in general. I am willing to post or transfer this as a 10.1 bug if that is what is needed (but I am not running the beta at this time).
Closing as 'wont fix' has ramifications that are not good.

Greg Kroah-Hartman: I had hoped that Gigabyte would contact you, as well as Garzik and Morton, about getting the support into the kernel. I guess not yet, huh? I recognize that part of the problem is that these are new, cutting-edge mainboards and few in number in the Linux world. This one is available, but it is the only 64-bit system here. There are other 32-bit x86 (P, PII & PIII) systems that could be used for testing... I read about AM's problem on the older system in his mailing list post, and that it might possibly be buggy ACPI(?). I have read everything I could find about this issue, which, frankly, is not much, so I have been hesitant to just jump into the water, so to speak. I still do not really know what is under the surface. I am not a developer (although I do C) and certainly not a kernel hacker. The point here, really, is that I am asking: what is the best way to proceed to get this done? (Please email me, if needed.)

Ok, I'll take this one.

Yes, it would be best if Gigabyte were to contact me (my email address is very easy to find as the kernel PCI maintainer). I would be more than willing to work together with them to get Linux working properly on their machines. The fact that they are advertising that it all works is odd; I'd be interested in finding out what they are doing to get that to work for these devices. And no, the problem isn't buggy ACPI; the patches break other platforms that have multiple PCI domains (NUMA boxes from IBM) and that currently work just fine with Linux. We can't break them with this patch, as you can understand. I'll mark this LATER, to remind me if someone contacts me about this in the future.

marking LATER to remind me in the future.

Thanks, :). Gigabyte (GBT) advertises this mobo on their front page, http://us.giga-byte.com/ and just ended a February promotion.
It is, presumably, their flagship AMD Dual Opteron(TM) graphics workstation/server product. It supports NUMA too. There is a PDF doc of "supported" OSes via http://us.giga-byte.com/Server/Support/OSSupport/OSSupport_ServerBoard_GA-2CEWH.htm which I am going to upload here (for reference). I did not see the IBM problem (I'm not receiving lkml) and agree, of course, that a patch should not, generally, bust other things. GBT support has been responsive to other issues. I'll ping GBT ...

Created attachment 70864
List of supported OS for GA-2CEWH-RH mainboard
OS Compatibility for GA-2CEWH
(Updated PDF version)
I did ping GBT and they have responded that this info is being forwarded to "HQ" for the BIOS team to use. I hope they have made contact. As for me, I have the recent kernels (2.6.15.5 & 2.6.16-rc5) as well as the -mm patchset (2.6.16-rc5-mm2) to build and use for testing. It has been delayed because of a little setup problem with SuSE, due to a lack of clear and valuable info on the *proper* setup for building in the SuSE environment (i.e., I have newbie-itis, :) ) ... a couple more days, probably.

I am speaking from #> uname -r 2.6.16-rc6-mm1-smp and it has made NO difference. (I did not have any better results with previous kernels ... this is the freshest.) Something is a bit haywire because CONFIG_PCI_DOMAINS was NOT available during the config. There were multiple configs for multiple attempts, thanks (mostly) to a missing object file that was not really missing, because it was not supposed to be in the make in the first place, and because I kept trying to find the sucker. It is in the patch. Why is it not in the config? I'll forgo printing out the confirmation data that it does not work (PCI: Multiple domains not supported) to provide PCI-X access as well as the rest of my mainboard, since it is probably a (kernel) config oversight. What could it be? I have saved logs if needed... kernel as well as PCI debug is in too.

I took a (closer) look at what we built. CONFIG_PCI_DOMAINS=y seems to be available for just about every arch on the planet Earth EXCEPT i386 and x86_64 (amd64). Why, and how do I fix that so I can test it on this Dual Opteron workstation?

Andrew has dropped the patches from his tree (well, he has a revert for them) as they caused too many problems. I'm about to drop them too. If you want, I can email them to you offline, or point you to them on the web.

Thanks for answering.
:) Any others besides these?: gregkh-pci-pci-fix-the-x86-pci-domain-support-fix.patch, gregkh-pci-x86-pci-domain-support-a-humble-fix.patch, gregkh-pci-x86-pci-domain-support-struct-pci_sysdata.patch, gregkh-pci-x86-pci-domain-support-the-meat.patch. I have those from the 'Broken-out' at Morton's site. You said there are "too many problems"? Is that on the lkml? Please do point me to any public discussion; I have to make some kind of informed decision. I am a little confused as to why a patch, the only patch I could find that ever existed for x86 Multiple PCI Domains (MPD), is just going to be dropped rather than fixed. It seems rather important since even Win2k supports MPD. I suppose that also means that GBT never contacted you. Yes/no? And thanks for the clarification. I was a bit bugged since I thought I had done all that work incorrectly. You may certainly email me if you prefer.

There are some odd things with the comments in the sysdata & meat patches. They seem to be missing the closing "*/" in some places, so I guess I'll need the real thing to make sure there is not a C&P error or something else. (I could not find them at the ' .../people/jgarzik/patches ' ...) I am a simple, by-the-book, sometimes-if/when-necessary programmer so I do not know what/why the sysdata & meat patches do that.

I got it. (I have not been doing enough patches to be familiar with the format.) If you will verify to me that the complete patchset is as I listed, I will try to do something here... (this GBT mainboard is _worthless_ to me w/o Linux at full tilt.)

No, no one has contacted me. And the patches are being dropped, as no one is working on them to fix the remaining issues. They can all be found at: http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/patches/bad/pci-domain/ Look at the file "series" in that directory for the order in which the patches should be applied. Good luck.

Thanks, :). Are the "remaining issues" collated anywhere?
I cannot predict the future, but MS Win2k (& later) supports multiple PCI domains, so I am relatively sure that this will be an ongoing issue, at least with advanced mobos using multiple types of PCI. And thanks for the luck, too; I will need it, as it seems that GBT is being more difficult than necessary. Vendors ... :rollseyes:

No, the issues aren't collected anywhere, except that some boxes that work fine today are still crashing with these patches, and we can't allow that. And sure, other operating systems probably support this just fine, I know, but unless someone does the work here, it's not going to happen for Linux. That is just how this project works :)

mass reopening all SuSE Linux bugs that are set to REMIND+LATER to change the resolution to WONTFIX (adapting to new policy)

Closing old LATER+REMIND bugs as WONTFIX - if you still plan to work on it, feel free to reopen and set to ASSIGNED. In case the report saw repeated reopen comments, it's due to bugzilla timing out on the huge request ;(
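
For anyone who picks the patches up again: the advice earlier in this report to apply them in the order given by the "series" file could be scripted roughly as below. This is only a sketch. It assumes the file follows the usual quilt-style layout (one patch name per line, optional trailing options such as -p1, `#` comments), which was not verified against the actual directory. The patch names, the `patches` directory, and the helper names `read_series` and `patch_commands` are hypothetical.

```python
def read_series(series_text: str) -> list[str]:
    """Return patch names in the order they appear in a series file.

    Assumes a quilt-style layout: one patch per line, blank lines and
    '#' comments ignored, anything after the file name (e.g. -p1) dropped.
    """
    patches = []
    for raw in series_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # strip comments and whitespace
        if not line:
            continue
        patches.append(line.split()[0])  # keep only the patch file name
    return patches


def patch_commands(patches: list[str], patch_dir: str = "patches") -> list[str]:
    """Build dry-run patch commands, one per patch, preserving series order."""
    return [f"patch -p1 --dry-run < {patch_dir}/{name}" for name in patches]


if __name__ == "__main__":
    # Hypothetical series content; the real file lists the real patch names.
    sample = "# example only\nfirst.patch\nsecond.patch -p1\n"
    for command in patch_commands(read_series(sample)):
        print(command)
```

The commands are printed rather than executed, and use `--dry-run`, so the order can be checked against a kernel tree before anything is modified.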