Bugzilla – Full Text Bug Listing
| Summary: | oops in reiserfs_writepage | | |
|---|---|---|---|
| Product: | [openSUSE] SUSE LINUX 10.0 | Reporter: | Andreas Kleen <ak> |
| Component: | Kernel | Assignee: | Chris L Mason <mason> |
| Status: | RESOLVED INVALID | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | | |
| Priority: | P5 - None | CC: | jeffm, trenn |
| Version: | Beta 1 | | |
| Target Milestone: | --- | | |
| Hardware: | Other | | |
| OS: | All | | |
| Whiteboard: | | | |
| Found By: | Other | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
While testing with AIM7 on Adams (16 core Opteron) I hit the following
lockup too. Two CPUs ran into the same backtrace while spinning on the BKL
in reiserfs_setattr. This was a mainline 2.6.13rc6-git3 kernel with some x86-64
patches, but should be near HEAD.
Is reiserfs to blame here too?
(Same lockup on CPU 11; after that the panic reboot stopped things.)
NMI Watchdog detected LOCKUP on CPU14
CPU 14
Modules linked in:
Pid: 19381, comm: reaim Not tainted 2.6.13-rc6-git7
RIP: 0010:[<ffffffff80416989>] <ffffffff80416989>{_spin_lock_irqsave+9}
RSP: 0018:ffff81013be31c40 EFLAGS: 00000002
RAX: 0000000000000000 RBX: ffffffff804bcd20 RCX: ffff81013be30000
RDX: ffff81013ef90000 RSI: ffff81013be0d0b0 RDI: ffffffff804bcd28
RBP: 0000000000000282 R08: ffff81013be30000 R09: 0000000000000002
R10: 00000000ffffffff R11: ffff810180e235e0 R12: ffffffff804bcd28
R13: ffff81013be0d0b0 R14: ffff81013be31c50 R15: 00000000000001ff
FS: 00002aaaaaf3b0a0(0000) GS:ffffffff805f7f00(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00002aaaaaf1f7e8 CR3: 000000013be2f000 CR4: 00000000000006a0
Process reaim (pid: 19381, threadinfo ffff81013be30000, task ffff81013be0d0b0)
Stack: 0000000000000282 ffffffff80414860 0000000000000001 ffff81013be0d0b0
 ffffffff80131d60 ffff81007d84fc68 ffff81007a8e1c68 ffff81013be31dd8
 ffff81013be31dd8 ffff8100cdfb9cb0
Call Trace:<ffffffff80414860>{__down+160} <ffffffff80131d60>{default_wake_function+0}
 <ffffffff80416789>{__down_failed+53} <ffffffff80416d56>{.text.lock.kernel_lock+25}
 <ffffffff801c385c>{reiserfs_setattr+44} <ffffffff80131d43>{try_to_wake_up+1043}
 <ffffffff80416613>{__down_write+51} <ffffffff8019a574>{notify_change+340}
 <ffffffff8017d011>{do_truncate+65} <ffffffff8018e2c4>{may_open+468}
 <ffffffff8018fc0e>{open_namei+734} <ffffffff80415b05>{thread_return+0}
 <ffffffff8017cc57>{filp_open+39} <ffffffff8017ca0b>{get_unused_fd+219}
 <ffffffff8017ccd4>{sys_open+84} <ffffffff8010d95e>{system_call+126}
The Oops in the description maps to the buffer_dirty check in the mapping loop
of reiserfs_write_full_page():
    bh = head;
    block = page->index << (PAGE_CACHE_SHIFT - s->s_blocksize_bits);
    /* first map all the buffers, logging any direct items we find */
    do {
        /*              v----- oops */
        if ((checked || buffer_dirty(bh)) &&
            (!buffer_mapped(bh) ||
             (buffer_mapped(bh) && bh->b_blocknr == 0))) {
            /* not mapped yet, or it points to a direct item, search
             * the btree for the mapping info, and log any direct
             * items found
             */
            if ((error = map_block_for_writepage(inode, bh, block))) {
                goto fail;
            }
        }
        bh = bh->b_this_page;
        block++;
    } while (bh != head);
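To make the shape of that loop easier to follow, here is a minimal user-space sketch of the same traversal pattern. It is only a model: `struct model_bh`, `MODEL_PAGE_SHIFT`, `first_block_of_page()` and `count_buffers()` are hypothetical names invented for illustration, and a 4 KiB page size is assumed. The point is that the buffers of a page form a circular list, so the walk dereferences every pointer in the ring once; a single corrupted link is enough to fault on the next state check.

```c
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct buffer_head. */
struct model_bh {
    unsigned long state;          /* models b_state */
    struct model_bh *this_page;   /* models b_this_page (circular list) */
};

/* Assumption for this sketch: 4 KiB pages. */
#define MODEL_PAGE_SHIFT 12

/* First filesystem block covered by a page, as in
 * block = page->index << (PAGE_CACHE_SHIFT - s->s_blocksize_bits). */
static unsigned long first_block_of_page(unsigned long page_index,
                                         unsigned int blocksize_bits)
{
    return page_index << (MODEL_PAGE_SHIFT - blocksize_bits);
}

/* Walk the ring exactly once, starting and ending at head. */
static size_t count_buffers(struct model_bh *head)
{
    size_t n = 0;
    struct model_bh *bh = head;

    do {
        n++;                  /* the real loop tests bh->b_state here */
        bh = bh->this_page;   /* a garbage pointer here faults next round */
    } while (bh != head);

    return n;
}
```

With 1 KiB blocks (`blocksize_bits` of 10) each page carries four buffers, so page index 3 starts at block 12.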
More detail:
b748: 41 8b 07 mov (%r15),%eax # bh->b_state
b74b: 89 c0 mov %eax,%eax
b74d: a8 02 test $0x2,%al # BH_Dirty
r15 contains garbage, definitely not a kernel address: 068f4832bb77ee4a
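One way to see why that value cannot be a valid pointer: on x86-64 with 48-bit virtual addresses (an assumption that fits Opteron-class hardware of this era), bits 63..47 of any usable address must be all zeros or all ones. Dereferencing anything else raises a general protection fault rather than a page fault, which is consistent with the "general protection fault" header of the oops. The helper names below are hypothetical and this is only a sketch of the check, not kernel code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48-bit virtual addresses, so bits 63..47 (17 bits)
 * must be a sign extension of bit 47. */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;

    return top == 0 || top == 0x1ffff;
}

/* Kernel addresses live in the upper canonical half (top bit set). */
static bool looks_like_kernel_address(uint64_t addr)
{
    return is_canonical_48(addr) && (addr >> 63) != 0;
}
```

Applied to the register dump, R15 (068f4832bb77ee4a) fails the canonical check, while RDI (ffff81015e0d05b0) and the RIP (ffffffff880d2748) pass as kernel addresses.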
This looks like memory corruption. Can you try with a more recent kernel?
2.6.13-rc5-git3 is two weeks old already.
The machine has been running rc7 fine for some days, but I haven't retried with the AIM7 stress test yet.
Reproduced lots of deadlocks (tracked in another bug), but not the memory corruption. So it might have been a one-off.
alfaro (running SLES9, but with a HEAD kernel) just threw this nice oops.
alfano login: general protection fault: 0000 [1] SMP
CPU 0
Modules linked in: freq_table edd autofs4 ipv6 thermal processor fan button battery ac af_packet tg3 i2c_i801 i2c_core ehci_hcd generic uhci_hcd usbcore shpchp pci_hotplug parport_pc lp parport video1394 ohci1394 raw1394 ieee1394 dm_mod reiserfs ata_piix ahci libata piix ide_disk ide_cd ide_core sr_mod cdrom sd_mod scsi_mod
Pid: 384, comm: pdflush Not tainted 2.6.13-rc5-git3-3-smp
RIP: 0010:[<ffffffff880d2748>] <ffffffff880d2748>{:reiserfs:reiserfs_writepage+392}
RSP: 0018:ffff81015f511a38 EFLAGS: 00010246
RAX: 000000008fb6c9fb RBX: 0000000000000000 RCX: 0000000000000000
RDX: 00000000ffffffff RSI: 0000000000000000 RDI: ffff81015e0d05b0
RBP: ffff810143397c70 R08: 0000000000001000 R09: 0000000000000019
R10: 0000000000000258 R11: 0000000000000019 R12: ffff81014ba92270
R13: 0000000000000000 R14: ffff810150c22920 R15: 068f4832bb77ee4a
FS: 0000000000000000(0000) GS:ffffffff80588800(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000594170 CR3: 0000000155aab000 CR4: 00000000000006e0
Process pdflush (pid: 384, threadinfo ffff81015f510000, task ffff81015f814760)
Stack: 0000000000000000 ffff81015f511e38 ffff810005566e00 00000000805d2800
 ffff81015e1a6800 000000015f4f80a8 00000000000001cd 000001cc00000000
 00000000001cc001 0000000000000000
Call Trace:<ffffffff8036e217>{thread_return+145} <ffffffff80166888>{find_get_pages_tag+152}
 <ffffffff801b3267>{mpage_writepages+455} <ffffffff880d25c0>{:reiserfs:reiserfs_writepage+0}
 <ffffffff801b189c>{__writeback_single_inode+428} <ffffffff8016d770>{pdflush+0}
 <ffffffff801b1ec2>{generic_sync_sb_inodes+546} <ffffffff8016d770>{pdflush+0}
 <ffffffff80152cc0>{keventd_create_kthread+0} <ffffffff801b220d>{writeback_inodes+125}
 <ffffffff8016cde6>{wb_kupdate+214} <ffffffff8016d8a5>{pdflush+309}
 <ffffffff8016cd10>{wb_kupdate+0} <ffffffff80152f73>{kthread+243}
 <ffffffff80137e30>{schedule_tail+64} <ffffffff8010fa52>{child_rip+8}
 <ffffffff80152cc0>{keventd_create_kthread+0} <ffffffff80152e80>{kthread+0}
 <ffffffff8010fa4a>{child_rip+0}
Code: 41 8b 07 89 c0 a8 02 0f 84 a2 05 00 00 41 8b 07 89 c0 a8 20
RIP <ffffffff880d2748>{:reiserfs:reiserfs_writepage+392} RSP <ffff81015f511a38>
Workload was probably just autobuild/icecream.