Bugzilla – Full Text Bug Listing
| Summary: | kernel crash/oops with XFS | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE Linux 10.1 | Reporter: | Adrian Schröter <adrian.schroeter> |
| Component: | Kernel | Assignee: | Forgotten User f0K9NrX7su <forgotten_f0K9NrX7su> |
| Status: | RESOLVED DUPLICATE | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Critical | ||
| Priority: | P5 - None | CC: | ajones, dmueller, forgotten_aHtZ2osk0j, forgotten_f0K9NrX7su |
| Version: | unspecified | ||
| Target Milestone: | --- | ||
| Hardware: | Other | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | Other | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
|
Description
Adrian Schröter
2005-10-14 15:54:26 UTC
SGI, this was on a machine with an XFS root, with an almost-stock 2.6.14-rc4-git4 kernel. Are you aware of any problems that look like this? Thanks for checking!

Hi Andreas,

As a matter of fact we do have one such bug reported recently from the community (pv#945029 - sorry no bugworks access:) which has the same stack callback. However, their problem occurs when they run out of space (and they say it happens when testing with default ACLs and inheriting ACLs). I plan to try out their scenario. However, are there any unusual circumstances in your situation which would provide a clue to reproduce it locally?

Things are going wrong when the in-memory log buffer makes it to disk: we get a callback and then call our xfs_trans_committed routine. This adds the items in the transaction to the active-item-list, which is a list of items (for metadata) which are in the on-disk log but whose metadata has not been written to disk yet. If the item already exists then it just updates its position in the list.

For pv#945029, they reported that xfs_ail_insert fails because the lip->li_ail.ail_forw field is NULL, which is a problem when it is linking the next item's back ptr to our new item. The insert works by scanning back from the end of the list, so we traverse using only the back ptrs. Somehow the back ptrs are intact but the forward ptr isn't. The active item list (AIL) is locked prior to this call, so there shouldn't be a race problem.

--Tim

Unfortunately I cannot provide further information as I reinstalled the corrupted partition with a different filesystem. I cannot immediately trigger it, but running autobuild (which does a lot of compilation, file reads and writes) on the machine for several days appears to have caused this problem.

Tim, I'm assigning this bug to you until we have a fix.

Traceback looks the same as the 133990 traceback. Was the FS full and was it using default ACLs?

--Tim

*** This bug has been marked as a duplicate of 133990 ***

I don't think the file system was full, but it could have happened.

Looks similar indeed.
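
For readers following Tim's description of the AIL insert, here is a minimal sketch of inserting into a sorted, circular doubly linked list by scanning back from the tail. It is not the kernel source: `struct item`, `ail_init`, `ail_insert` and `ail_consistent` are hypothetical names, and the sentinel-head layout and LSN sort key are assumptions made for illustration. It only shows why a forward pointer that is NULL can go unnoticed during the backward scan and then fail at the final link step.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified log item; not the kernel's log item type. */
struct item {
    unsigned long long lsn;   /* log sequence number, used as sort key */
    struct item *forw;        /* next item (towards higher LSN) */
    struct item *back;        /* previous item (towards lower LSN) */
};

/* An empty circular list: the sentinel head points at itself. */
static void ail_init(struct item *head)
{
    head->lsn = 0;
    head->forw = head;
    head->back = head;
}

/* Insert lip in LSN order, scanning backwards from the tail. */
static void ail_insert(struct item *head, struct item *lip)
{
    struct item *prev = head->back;

    /* Only back pointers are followed here, so a damaged forward
     * pointer is not noticed during the scan. */
    while (prev != head && prev->lsn > lip->lsn)
        prev = prev->back;

    /* The forward pointer is read for the first time here. If it were
     * NULL, the last statement below would dereference NULL. */
    assert(prev->forw != NULL);

    lip->forw = prev->forw;
    lip->back = prev;
    prev->forw = lip;
    lip->forw->back = lip;    /* link the next item's back ptr to lip */
}

/* Return 1 if forward and back pointers agree all the way round. */
static int ail_consistent(const struct item *head)
{
    const struct item *p = head;

    do {
        if (p->forw == NULL || p->forw->back != p)
            return 0;
        p = p->forw;
    } while (p != head);
    return 1;
}
```

A check along the lines of `ail_consistent` is one way such a list could be validated in a debugging build; whether it would have caught this particular corruption is not known from the report.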