Bugzilla – Full Text Bug Listing
| Summary: | LTC21180-Unable to initialize qeth/dasd devices with CONFIG_DEBUG_SLAB enabled | ||
|---|---|---|---|
| Product: | [openSUSE] SUSE Linux 10.1 | Reporter: | Jan Blunck <jblunck> |
| Component: | Kernel | Assignee: | Frank Pavlic <pavlic> |
| Status: | VERIFIED FIXED | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Normal | ||
| Priority: | P5 - None | CC: | big-iron, gjlynx, hannsj_uhl, ihno |
| Version: | Beta 2 | ||
| Target Milestone: | --- | ||
| Hardware: | S/390-64 | ||
| OS: | Other | ||
| Whiteboard: | |||
| Found By: | Development | Services Priority: | |
| Business Priority: | | Blocker: | --- |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | slab2616.diff, x3270 log | ||
Description
Jan Blunck
2006-01-27 12:12:03 UTC
changed:
What |Removed |Added
----------------------------------------------------------------------------
Owner|gjlynx@us.ibm.com |h.carstens@de.ibm.com
------- Additional Comments From pavlic@de.ibm.com 2006-01-31 10:48 EDT -------
Reassigning this bugzilla to Heiko ...
Frank
changed:
What |Removed |Added
----------------------------------------------------------------------------
Owner|h.carstens@de.ibm.com |pavlic@de.ibm.com
Severity|normal |low
------- Additional Comments From h.carstens@de.ibm.com(prefers email via heiko.carstens@de.ibm.com) 2006-02-14 14:09 EDT -------
Without deeper knowledge of QDIO I'm not able to debug this. Frank, why are
these check conditions generated?
changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |cborntra@de.ibm.com
------- Additional Comments From cborntra@de.ibm.com 2006-04-10 07:55 EDT -------
I have no clue about the dasds (you don't mean fcp devices?) but it seems that I
have found the alignment problem in qdio.c.
The qib structure must be aligned to 256 bytes AND is embedded into the
qdio_irq structure. The alignment cannot be guaranteed with slab debugging.
If you force the qdio_irq structure to a page boundary, qeth works for me with
CONFIG_DEBUG_SLAB. See the patch below (the whitespace is broken due to cut
and paste into this bugzilla).
diff -u -p -r1.3 qdio.c
--- drivers/s390/cio/qdio.c 4 Apr 2006 07:25:26 -0000 1.3
+++ drivers/s390/cio/qdio.c 10 Apr 2006 11:51:54 -0000
@@ -1637,7 +1637,7 @@ next:
}
kfree(irq_ptr->qdr);
- kfree(irq_ptr);
+ free_page((unsigned long) irq_ptr);
}
static void
@@ -2984,7 +2984,7 @@ qdio_allocate(struct qdio_initialize *in
qdio_allocate_do_dbf(init_data);
/* create irq */
- irq_ptr=kmalloc(sizeof(struct qdio_irq), GFP_KERNEL | GFP_DMA);
+ irq_ptr=(void *) get_zeroed_page(GFP_KERNEL | GFP_DMA);
QDIO_DBF_TEXT0(0,setup,"irq_ptr:");
QDIO_DBF_HEX0(0,setup,&irq_ptr,sizeof(void*));
@@ -2994,14 +2994,13 @@ qdio_allocate(struct qdio_initialize *in
return -ENOMEM;
}
- memset(irq_ptr,0,sizeof(struct qdio_irq));
init_MUTEX(&irq_ptr->setting_up_sema);
/* QDR must be in DMA area since CCW data address is only 32 bit */
irq_ptr->qdr=kmalloc(sizeof(struct qdr), GFP_KERNEL | GFP_DMA);
if (!(irq_ptr->qdr)) {
- kfree(irq_ptr);
+ free_page((unsigned long) irq_ptr);
QDIO_PRINT_ERR("kmalloc of irq_ptr->qdr failed!\n");
return -ENOMEM;
}
Let me know if this patch works.
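For illustration only, the alignment argument can be sketched in user-space C. This is not kernel code: `qib_is_aligned`, `debug_shifted`, `page_aligned` and the 8-byte `REDZONE_BYTES` are hypothetical names and values, and the sketch assumes that the debug allocator places bookkeeping bytes in front of the object, which shifts its start address. It also assumes the qib sits at an offset inside the structure that is itself a multiple of 256.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QIB_ALIGN     256u  /* alignment the qib needs, per the comment above */
#define PAGE_BYTES    4096u /* page size on s390 */
#define REDZONE_BYTES 8u    /* hypothetical debug padding in front of the object */

/* Nonzero if a qib at qib_offset inside an object starting at base is aligned. */
static int qib_is_aligned(uintptr_t base, size_t qib_offset)
{
    return ((base + qib_offset) % QIB_ALIGN) == 0;
}

/* Models a debug allocator that shifts the object behind a leading redzone. */
static uintptr_t debug_shifted(uintptr_t base)
{
    return base + REDZONE_BYTES;
}

/* Models a page allocation: round the address up to the next page boundary. */
static uintptr_t page_aligned(uintptr_t addr)
{
    return (addr + PAGE_BYTES - 1) & ~(uintptr_t)(PAGE_BYTES - 1);
}
```

With these helpers, `qib_is_aligned(page_aligned(0x12345), 256)` holds, while `qib_is_aligned(debug_shifted(0x10000), 256)` does not. That is the reason the patch switches from kmalloc to a whole page: a page boundary is a multiple of 256 regardless of what the slab debugger adds.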
Created attachment 77517 [details]
slab2616.diff
changed:
What |Removed |Added
----------------------------------------------------------------------------
Owner|pavlic@de.ibm.com |cborntra@de.ibm.com
------- Additional Comments From cborntra@de.ibm.com 2006-04-10 07:58 EDT -------
Patch against 2.6.16 which makes qeth work with CONFIG_DEBUG_SLAB.
Please test this patch and let me know if it works. We can then make an official
patch.
Hello Jan, I am assigning this bugzilla back to you. Can you please test whether the attached patch resolves the problem in this bugzilla? Thanks in advance for your support.
Created attachment 80349 [details]
x3270 log
Thanks for the patch. I have good and bad news. The QDIO issue seems to be fixed. But the SLAB debugger found the following bug in our latest kernel:
Unable to handle kernel pointer dereference at virtual kernel address 6b6b6b6b6b6b6000
Oops: 0038 [#1]
CPU: 0 Not tainted
Process ifup (pid: 1211, task: 00000000011a2150, ksp: 000000000d56bd88)
Krnl PSW : 0704200180000000 0000000010a983f6 (qeth_hard_start_xmit+0x1dda/0x2218 [qeth])
Krnl GPRS: 0000000000000006 6b6b6b6b6b6b6ba5 0000000000000001 0000000000000000
0000000010a97c94 0000000000000001 000000000f8a8000 000000000f8a9c10
0000000000e57000 000000000f8a0000 0000000000000000 000000000f5b9bd0
0000000010a8e000 0000000010abb948 0000000010a97c94 00000000012b1b20
Krnl Code: bf 43 10 06 a7 84 00 13 58 50 f0 f0 12 55 a7 84 00 0e 58 10
Call Trace:
([<0000000010a97c94>] qeth_hard_start_xmit+0x1678/0x2218 [qeth])
[<00000000003907fa>] qdisc_restart+0x13e/0x280
[<0000000000375376>] dev_queue_xmit+0x496/0x718
[<0000000010a443f8>] mld_sendpack+0x32c/0x4ec [ipv6]
[<0000000010a49196>] mld_ifc_timer_expire+0x316/0x3c0 [ipv6]
[<00000000001549f4>] run_timer_softirq+0x660/0x704
[<0000000000149950>] __do_softirq+0x6c/0x108
[<000000000010f226>] do_softirq+0xba/0xf4
[<0000000000110034>] ext_no_vtime+0x16/0x1a
[<00000000001c01ae>] do_wp_page+0x10e/0x4e0
([<00000000001c0184>] do_wp_page+0xe4/0x4e0)
[<00000000001c753c>] __handle_mm_fault+0xcc4/0xdcc
[<0000000000101a98>] do_protection_exception+0x1c0/0x450
[<000000000010f95a>] sysc_return+0x0/0x10
[<0000020000180138>] 0x20000180138
<0>Kernel panic - not syncing: Fatal exception in interrupt
01: HCPGSP2629I The virtual machine is placed in CP mode due to a SIGP stop from CPU 00.
00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 00103A44
First look at the problem:
This seems to be in qeth_send_packet() (line 4506 in qeth_main.c):
	rc = qeth_do_send_packet_fast(card, queue, skb, hdr,
				      elements_needed, ctx);
	if (!rc){
		card->stats.tx_packets++;
		card->stats.tx_bytes += tx_bytes;
#ifdef CONFIG_QETH_PERF_STATS
		if (skb_shinfo(skb)->tso_size &&        <======= here
		    !(large_send == QETH_LARGE_SEND_NO)) {
			card->perf_stats.large_send_bytes += skb->len;
			card->perf_stats.large_send_cnt++;
		}
		if (skb_shinfo(skb)->nr_frags > 0){
			card->perf_stats.sg_skbs_sent++;
			/* nr_frags + skb->data */
			card->perf_stats.sg_frags_sent +=
				skb_shinfo(skb)->nr_frags + 1;
		}
#endif /* CONFIG_QETH_PERF_STATS */
	}
	if (ctx != NULL) {
		/* drop creator's reference */
		qeth_eddp_put_context(ctx);
I looked into all the skb handling in qeth_do_send_packet(_fast) but I don't see why the shinfo is already freed.
Frank, can you take a look?
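For context, the faulting address 6b6b6b6b6b6b6000 is built from the 0x6b byte that the slab debugger writes into freed objects, so the statistics code is reading the skb after the send path has already released it. A common way to avoid this class of bug is to copy the needed fields before handing the buffer over. The user-space sketch below shows that ordering only; `fake_skb`, `fake_stats`, `fake_send` and `send_and_account` are hypothetical stand-ins, and this is not necessarily what the eventual qeth fix did.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define POISON_FREE 0x6b /* byte the slab debugger writes into freed objects */

/* Hypothetical, heavily reduced stand-ins for the skb and the card statistics. */
struct fake_skb   { unsigned int len; unsigned int nr_frags; };
struct fake_stats { unsigned long tx_packets, tx_bytes, sg_skbs_sent; };

/* Hypothetical send routine: on success it owns the buffer and frees it. */
static int fake_send(struct fake_skb *skb)
{
    memset(skb, POISON_FREE, sizeof(*skb)); /* mimic slab poisoning */
    free(skb);
    return 0;
}

/* Snapshot everything the accounting needs BEFORE giving the buffer away. */
static int send_and_account(struct fake_skb *skb, struct fake_stats *st)
{
    unsigned int len = skb->len;
    unsigned int nr_frags = skb->nr_frags;
    int rc = fake_send(skb); /* skb must not be dereferenced after this call */

    if (!rc) {
        st->tx_packets++;
        st->tx_bytes += len;
        if (nr_frags > 0)
            st->sg_skbs_sent++;
    }
    return rc;
}
```

The caller allocates a `fake_skb`, fills in `len` and `nr_frags`, and passes it to `send_and_account` together with a zeroed `fake_stats`. The statistics are then updated from the local copies, never from the freed buffer.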
Christian, thanks for the fix. The problem with CONFIG_DEBUG_SLAB is fixed by your patch. The second problem is fixed by IBM Codestream linux-2.6.16 october2005 patch 02-19, thanks to Frank. Both patches are in CVS.
------- Additional Comments From cborntra@de.ibm.com 2006-05-08 06:57 EDT -------
Yes, I have found the same problem in qeth with slab debugging. qeth should work now with slab debugging. The slab debugging fix is now upstream as well.
changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|ACCEPTED |CLOSED
Impact|------ |RAS
------- Additional Comments From cborntra@de.ibm.com 2006-05-16 11:27 EDT -------
slab debugging fix is in SLES10 RC1.
Closed.