Bugzilla – Bug 246701
ahci SATA RAID - mdraid segment fault
Last modified: 2007-11-10 16:46:58 UTC
System
Mainboard: S5000VSA
CPU: 2x Xeon dual-core 64-bit 2.0 GHz
RAM: 2 GB
HD: 4x SATA 500 GB in RAID 10 configuration
Other: PATA DVD writer

Installation aborts (drops the user to a text console) after probing the hard drives, right after loading the kernel modules for them (presumably the ahci module). Significant error message before dumping data to a USB stick (/dev/sde):

dmraid[3786]: segfault at 00002b51bafeea00 rip 00002b51bafeea00 rsp 00007ffff0439a18 error 15

Note: I had to pass the kernel parameter startshell=1 to be able to mount the USB stick after the installation aborted. Data to follow in attachments.
Created attachment 119885 [details] dmesg log
Created attachment 119886 [details] hwinfo --storage
Created attachment 119888 [details] hwinfo --all
Created attachment 119893 [details] lspci -v output
Created attachment 119896 [details] YaST logs
Please do not set any bug to PO-Critsit; those are for internal use only!
Please boot the rescue system and try running dmraid manually. Provide the output of: dmraid -r -vvv -d. If that fails, please also provide the strace output of the same command.
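The requested diagnostics could be run in the rescue system roughly as follows (the trace file name is just an example; the whole thing is guarded so it is a no-op on a machine without dmraid):

```shell
# List all discovered RAID metadata at maximum verbosity and with debug
# output; if that crashes, capture a syscall trace of the same invocation.
if command -v dmraid >/dev/null 2>&1; then
    dmraid -r -vvv -d
    strace -o dmraid-r.trace dmraid -r -vvv -d
fi
```

The trace file (dmraid-r.trace here) would then be attached to the bug.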
Created attachment 120274 [details] dmraid output Output provided by dmraid -r -vvv -d Note: the raid logical drive size is 1 TB.
Forgot to tick the option to remove the NEEDINFO status from this bug
Ok, thanks. Next please try the same with the command dmraid -s -ccc -d
Created attachment 120444 [details] dmraid output Using dmraid -s -ccc -d
Now let's try the same with the RAID set as argument, because this seems to be the one that fails according to the YaST log. Please try this: dmraid -s -ccc -d ddf1_4c534920202020201000005500000000330adc0c00000a28 If this fails, please also provide the strace output of this command.
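This step can be sketched as below (the trace file name is an example, and the call is guarded so the sketch does nothing on systems without dmraid):

```shell
# Query the specific RAID set that the YaST log shows failing; if the
# query segfaults, rerun it under strace to capture the syscall trace.
if command -v dmraid >/dev/null 2>&1; then
    SET=ddf1_4c534920202020201000005500000000330adc0c00000a28
    dmraid -s -ccc -d "$SET"
    strace -o dmraid-set.trace dmraid -s -ccc -d "$SET"
fi
```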
Created attachment 120483 [details] strace output Running the above command caused a segmentation fault. Ran: strace -o outputFilename command
It would be very helpful to get more debugging information. Do you think you could provide a gdb backtrace of the command in comment #12?
Obtained the following response: -bash: gdb: command not found How do I boot the installation CD to get this application?
Unfortunately gdb and the debuginfo packages are not in the rescue system, so this will take some more work. I am not sure if you have the resources to do this, but if you want to give it a try, here are some hints: you would have to use another 10.2 system of the same architecture which has dmraid, the dmraid-debuginfo package, and gdb installed, and then mount this system into your rescue system. I think it will be enough to export /usr and /sbin and mount them into the rescue system at the same locations. It should also be possible to copy /usr and /sbin to a USB stick and mount that.

Independently from the steps above: I currently have no way to test the ddf1 metadata format; as soon as this changes I can try to set up your RAID configuration. Could you please dump your RAID metadata with dmraid -rD and provide all resulting files.
Ah, please forget the complicated stuff about setting up a debugging environment. All I need is a core file of the command in comment #12. You can create this core file by setting the resource limit for the maximum core size, for example: ulimit -c 10000 and then running the command. The core file will then be written to the current directory.
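Put together, the core-dump procedure looks roughly like this (the ulimit value is the one suggested above; the dmraid call is guarded so the sketch only runs where dmraid exists):

```shell
# Raise the core-file size limit for this shell (the unit is 512-byte
# blocks), then run the crashing command; on a segfault the kernel
# writes a file named `core` into the current working directory.
ulimit -c 10000
ulimit -c                     # show the limit now in effect
if command -v dmraid >/dev/null 2>&1; then
    dmraid -s -ccc -d ddf1_4c534920202020201000005500000000330adc0c00000a28
fi
ls -l core 2>/dev/null || true   # present only after a crash
```

Note that the limit applies per shell session, so ulimit must be run in the same shell that launches the crashing command.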
Created attachment 121722 [details] Output of dmraid -rD No core dump was generated and no segmentation fault occurred.
Sorry, I think we have a misunderstanding here; maybe my instructions were not clear enough.

1. You need to create the core dump for the command that failed in comment #12: dmraid -s -ccc -d ddf1_4c534920202020201000005500000000330adc0c00000a28 With this core file I am able to debug dmraid in the state of the failure.

2. dmraid -rD will write files containing the metadata to the current directory. Please provide these files, not the stdout output of the program.
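The two steps can be sketched together like this (the directory choice is arbitrary as long as it is writable, and the exact metadata file names depend on the dmraid version):

```shell
# Run both requested steps from one writable directory, so the core file
# and the metadata dump files all land in the same place.
cd /tmp
ulimit -c unlimited 2>/dev/null || true   # step 1 precondition: allow core files
if command -v dmraid >/dev/null 2>&1; then
    # Step 1: reproduce the segfault; the kernel writes ./core.
    dmraid -s -ccc -d ddf1_4c534920202020201000005500000000330adc0c00000a28
    # Step 2: dump the on-disk RAID metadata; dmraid -rD writes one set
    # of files per member disk into the current directory.
    dmraid -rD
fi
ls                                        # core plus the per-disk files
```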
Created attachment 121759 [details] gz of requested files for 2 dmraid commands Changed my current directory from /root to /var/log, then ran the 2 commands. This time I ended up with a core file (327680 bytes) and 4 sets of sd* files (41926656, 13, 10 bytes each).
A major problem is that the RAID 10 configuration is currently not supported by dmraid with the DDF1 metadata format. So even if the segfault is fixed, you will not be able to use the RAID in this configuration with dmraid.
If needed, I can fall back to using 2 logical drives in RAID 1 configuration instead of RAID 10, until such time as RAID 10 is properly supported.
Did you try to configure 2 RAIDs in RAID 1 configuration and then try to install? Does the segfault still happen?
Built 2 RAIDs in RAID 1 configuration. The segfault still occurs. What data do you require for this segfault with RAID 1?
Thanks, there is no more data required.
Could you provide a status update on the progress toward supporting at least RAID 1? RAID 10 would still be preferred, though.
I am sorry, I currently do not have the resources to debug this further. It will still take some time.
I thought I had posted this reply I received from Intel. Sorry about that. PS: He still does not have the drivers.
---------------------------
Dear Mr. Peck,
Unfortunately we don't have source code for SuSE* Linux* 10 available as yet. I have requested the drivers but don't have an ETA for when we'll have them available. Unfortunately the current drivers we have on the web are for Linux* kernel 2.6.16. If you still have any questions, please feel free to contact us.
Kind regards,
Herbert H
Senior Customer Support Engineer · Enterprise Products
Intel® Customer Support (EMEA)
Intel Corporation (UK) Limited
I have seen a possibly related segfault issue on the ataraid list. I got the following response from Intel:

"The S5000VSA board should have BIOS/orom support for ISW metadata, any reason why DDF is being used? Jason Gaston and Ying Fang released a patch fixing a segfault issue when using ISW metadata, perhaps there is a similar issue with DDF. http://marc.info/?l=ataraid-list&m=118315445123823&w=2"

I am wondering why your RAID is detected as DDF1 format; according to this information it should be ISW format. There has been work by Intel on better ISW support for dmraid. Can you figure out if it is possible to configure your RAID in a way that puts it in ISW format? Or can you describe how you configured your RAID and why it is ddf1 and not isw? Do you use the onboard RAID functionality, or do you have an additional add-in card?

I got another reply from Intel, which might help:

"If you can risk losing the array set, you could enable the S5000VSA's onboard RAID support (assuming that there is not an add-in card being used). Buried in the BIOS of the S5000VSA is an option to enable RAID. Select Advanced -> ATA Controller Configuration -> set On board SATA Controller as Enabled -> set SATA Mode as Enhanced -> set Configure SATA as RAID as Enabled. This just enables the 'Intel RAID option ROM' which can then be used to create an ISW RAID set. Then you can use the ISW segfault patch..."

In this case I could apply the Intel patches, which might resolve this issue.
Shall we close this bug? There seems to be little interest.
No objections, it seems.