Bug 612081

Summary: PV DomU virtual machines can no longer mount their hard disks when using the tap:aio driver
Product: [openSUSE] openSUSE 11.2
Reporter: Henry Laurent <laurent.henry>
Component: Xen
Assignee: Jan Beulich <jbeulich>
Status: RESOLVED DUPLICATE
QA Contact: E-mail List <qa-bugs>
Severity: Critical
Priority: P2 - High
CC: laurent.henry
Version: Final
Target Milestone: ---
Hardware: x86-64
OS: openSUSE 11.2
Whiteboard:
Found By: Community User
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: xen-hotplug.log
screenshot while trying to reboot a domU configured with tap:aio
Dom0 xend.log after a reboot attempt

Description Henry Laurent 2010-06-07 09:46:15 UTC
Created attachment 367414 [details]
xen-hotplug.log

User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; fr; rv:1.9.1.9) Gecko/20100317 SUSE/3.5.9-0.1 Firefox/3.5.9

In my production environment (openSUSE 11.2 x64 Dom0), for the past two weeks I have been unable to use VMs configured with the tap:aio driver (VMs using the "file" driver work fine).
During installation, the installer complains that the system has no hard disk.
When rebooting a previously working VM, the boot fails while attempting to mount the disks, and the system can no longer be booted.
VMs that were already running are functional and keep working as long as I don't reboot them.
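
For reference, the two disk configurations compared above differ only in the prefix of the disk specification. The following is a minimal sketch, assuming the usual xm disk syntax of `backend:path,device,mode` (with an extra subtype such as `aio` for tap). The helper name `parse_disk_spec` and the image paths are hypothetical and not taken from the affected VMs.

```python
def parse_disk_spec(spec):
    """Split an xm-style disk entry into its parts.

    Assumed formats: 'tap:aio:/path,xvda,w' and 'file:/path,xvda,w'.
    """
    uname, device, mode = spec.split(",")
    if uname.startswith("tap:"):
        # tap entries carry a subtype (here: aio) between backend and path
        backend, subtype, path = uname.split(":", 2)
    else:
        backend, path = uname.split(":", 1)
        subtype = None
    return {"backend": backend, "subtype": subtype, "path": path,
            "device": device, "mode": mode}


# Hypothetical image paths, for illustration only.
failing = parse_disk_spec("tap:aio:/var/lib/xen/images/example/disk0,xvda,w")
working = parse_disk_spec("file:/var/lib/xen/images/example/disk0,xvda,w")
print(failing["backend"], failing["subtype"])
print(working["backend"], working["subtype"])
```

Only the entries whose backend is `tap` match the failing case described in this report.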

xen-tools-3.4.1_19718_04-2.1.x86_64
kernel-xen-base-2.6.31.8-0.1.1.x86_64
xen-libs-3.4.1_19718_04-2.1.x86_64
kernel-xen-2.6.31.8-0.1.1.x86_64
xen-kmp-desktop-3.4.1_19718_04_2.6.31.5_0.1-2.1.x86_64
xen-3.4.1_19718_04-2.1.x86_64


Reproducible: Always

Steps to Reproduce:
1. I don't know


Expected Results:  
The VM boots and mounts its disks.
Comment 1 Henry Laurent 2010-06-07 09:47:44 UTC
Created attachment 367415 [details]
screenshot while trying to reboot a domU configured with tap:aio
Comment 2 Henry Laurent 2010-06-07 10:13:50 UTC
Created attachment 367423 [details]
Dom0 xend.log after a reboot attempt
Comment 3 Henry Laurent 2010-06-07 14:10:10 UTC
I think the problem is that no blktap process is associated with the VM's domain ID (135 here):

alca2:/etc/xen/vm # xm start cas-11.2
alca2:/etc/xen/vm # xm list|grep cas
cas-11.2                                   135   384     1     -b----      0.2
alca2:/etc/xen/vm # ps axu|grep blk|grep 135
alca2:/etc/xen/vm # ps axu|grep blk
root      2021  0.0  0.0   9468   916 pts/1    S+   16:07   0:00 grep blk
root      3946  0.0  0.0  86184   632 ?        Ssl  Jan13   0:00 blktapctrl
root      5152  0.0  0.0      0     0 ?        S<   May18   0:07 [blktap.97.xvda]
root      5153  0.0  0.0      0     0 ?        S<   May18   0:00 [blktap.97.xvdb]
root      8318  0.0  0.0      0     0 ?        S<   Apr06   0:02 [blktap.90.xvda]
root      8319  0.0  0.0      0     0 ?        S<   Apr06   0:00 [blktap.90.xvdb]
root      8321  0.0  0.0      0     0 ?        S<   Apr06   0:00 [blktap.90.xvdc]
root      8406  0.0  0.0      0     0 ?        S<   Apr02   0:00 [blktap.89.xvda]
root      8407  0.0  0.0      0     0 ?        S<   Apr02   0:00 [blktap.89.xvdb]
root      8408  0.0  0.0      0     0 ?        S<   Apr02   0:00 [blktap.89.xvdc]
root      8410  0.0  0.0      0     0 ?        S<   Apr02   0:00 [blktap.89.xvdd]
root     11311  0.0  0.0      0     0 ?        S<   Feb23   0:10 [blkback.77.xvda]
root     11312  0.0  0.0      0     0 ?        S<   Feb23   0:01 [blkback.77.xvdb]
root     11313  0.0  0.0      0     0 ?        S<   Feb23   0:00 [blkback.77.xvdc]
root     14419  0.0  0.0      0     0 ?        S<   May18   0:00 [blktap.99.xvda]
root     14420  0.0  0.0      0     0 ?        S<   May18   0:00 [blktap.99.xvdb]
root     14920  0.0  0.0      0     0 ?        S<   Jun04   0:00 [blkback.130.xvd]
root     14921  0.0  0.0      0     0 ?        S<   Jun04   0:00 [blkback.130.xvd]
root     14922  0.0  0.0      0     0 ?        S<   Jun04   0:00 [blkback.130.xvd]
root     17048  0.0  0.0      0     0 ?        S<   Mar24   0:00 [blktap.88.xvda]
root     17050  0.0  0.0      0     0 ?        S<   Mar24   0:00 [blktap.88.xvdb]
root     19189  0.0  0.0      0     0 ?        S<   Feb26   0:00 [blktap.78.sda1]
root     19190  0.0  0.0      0     0 ?        S<   Feb26   0:17 [blktap.78.sdb1]
root     19191  0.0  0.0      0     0 ?        S<   Feb26   0:00 [blktap.78.sdc1]
root     22376  0.0  0.0      0     0 ?        S<   May26   0:00 [blkback.103.xvd]
root     22377  0.0  0.0      0     0 ?        S<   May26   0:00 [blkback.103.xvd]
root     22378  0.0  0.0      0     0 ?        S<   May26   0:00 [blkback.103.xvd]
root     24137  0.0  0.0      0     0 ?        S<   May26   0:00 [blkback.104.xvd]
root     24138  0.0  0.0      0     0 ?        S<   May26   0:00 [blkback.104.xvd]
root     24139  0.0  0.0      0     0 ?        S<   May26   0:00 [blkback.104.xvd]
root     24140  0.0  0.0      0     0 ?        S<   May26   0:00 [blkback.104.xvd]
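
The manual check above (grepping the `ps` output for the domain ID) can be sketched as a small script. This is only an illustration, assuming the kernel thread names follow the `[blktap.<domid>.<dev>]` and `[blkback.<domid>.<dev>]` pattern visible in the listing. The function name `backend_threads_by_domain` is hypothetical, and the sample is a short excerpt of the output above.

```python
import re
from collections import defaultdict

# Thread names as they appear in the listing, e.g. "[blktap.97.xvda]" or
# "[blkback.130.xvd]" (ps truncates long names).
THREAD_RE = re.compile(r"\[(blktap|blkback)\.(\d+)\.([^\]]*)\]")


def backend_threads_by_domain(ps_output):
    """Map domain ID -> list of (backend, device) found in ps output."""
    found = defaultdict(list)
    for line in ps_output.splitlines():
        match = THREAD_RE.search(line)
        if match:
            backend, domid, device = match.groups()
            found[int(domid)].append((backend, device))
    return dict(found)


SAMPLE = """\
root      3946  0.0  0.0  86184   632 ?        Ssl  Jan13   0:00 blktapctrl
root      5152  0.0  0.0      0     0 ?        S<   May18   0:07 [blktap.97.xvda]
root      5153  0.0  0.0      0     0 ?        S<   May18   0:00 [blktap.97.xvdb]
root     14920  0.0  0.0      0     0 ?        S<   Jun04   0:00 [blkback.130.xvd]
"""

threads = backend_threads_by_domain(SAMPLE)
print(threads.get(97))   # backend threads of domain 97
print(135 in threads)    # False: nothing was created for domain 135
```

On a live Dom0 the same function could be fed the output of `ps axu`.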
Comment 4 Henry Laurent 2010-06-07 14:13:40 UTC
Jun  7 16:11:41 alca2 kernel: [12537879.848530] blk_tap: Error initialising /dev/xen/blktap - No more devices
Jun  7 16:11:41 alca2 logger: /etc/xen/scripts/blktap: Writing backend/tap/136/51712/hotplug-status connected to xenstore.
Jun  7 16:11:41 alca2 logger: /etc/xen/scripts/blktap: Writing backend/tap/136/51728/hotplug-status connected to xenstore.
Jun  7 16:11:44 alca2 kernel: [12533839.409637] blktap: ring-ref 8, event-channel 12, protocol 1 (x86_64-abi)
Jun  7 16:11:44 alca2 kernel: [12533839.412361] blktap: ring-ref 9, event-channel 13, protocol 1 (x86_64-abi)
Jun  7 16:12:09 alca2 logger: /etc/xen/scripts/blktap: remove XENBUS_PATH=backend/tap/136/51712
Jun  7 16:12:09 alca2 logger: /etc/xen/scripts/blktap: remove XENBUS_PATH=backend/tap/136/51728
Jun  7 16:12:09 alca2 logger: /etc/xen/scripts/blktap: remove XENBUS_PATH=backend/tap/136/51728
Jun  7 16:12:09 alca2 logger: /etc/xen/scripts/blktap: remove XENBUS_PATH=backend/tap/136/51712
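
To make the log excerpt above easier to read: the two numbers in each xenstore path are the domain ID (136) and the virtual device number. The sketch below assumes the usual Xen numbering for xvd disks (block major 202, 16 minors per disk), under which 51712 and 51728 decode to xvda and xvdb. The names `decode_xvd` and `summarize` are hypothetical helpers, and the sample is a shortened copy of the log lines above.

```python
import re

XVD_MAJOR = 202        # assumption: Linux block major used for Xen xvd* disks
MINORS_PER_DISK = 16   # assumption: each xvd disk reserves 16 minor numbers

PATH_RE = re.compile(r"backend/tap/(\d+)/(\d+)")


def decode_xvd(devid):
    """Turn a virtual device number such as 51712 into a name like 'xvda'."""
    major, minor = devid >> 8, devid & 0xFF
    if major != XVD_MAJOR:
        raise ValueError("not an xvd device number: %d" % devid)
    disk, partition = divmod(minor, MINORS_PER_DISK)
    name = "xvd" + chr(ord("a") + disk)
    return name + str(partition) if partition else name


def summarize(log_text):
    """Return the blk_tap error lines and the (domid, device) pairs seen."""
    errors = [line for line in log_text.splitlines()
              if "blk_tap: Error initialising" in line]
    devices = {(int(m.group(1)), decode_xvd(int(m.group(2))))
               for m in PATH_RE.finditer(log_text)}
    return errors, sorted(devices)


SAMPLE = """\
Jun  7 16:11:41 alca2 kernel: blk_tap: Error initialising /dev/xen/blktap - No more devices
Jun  7 16:11:41 alca2 logger: /etc/xen/scripts/blktap: Writing backend/tap/136/51712/hotplug-status connected to xenstore.
Jun  7 16:11:41 alca2 logger: /etc/xen/scripts/blktap: Writing backend/tap/136/51728/hotplug-status connected to xenstore.
Jun  7 16:12:09 alca2 logger: /etc/xen/scripts/blktap: remove XENBUS_PATH=backend/tap/136/51712
"""

errors, devices = summarize(SAMPLE)
print(len(errors), devices)
```

The hotplug script reports both devices as connected even though the kernel logged the "No more devices" error at the same timestamp.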
Comment 5 Henry Laurent 2010-06-11 09:44:58 UTC
I wrote a better summary in bug #613490.
Comment 6 Jan Beulich 2010-06-14 14:53:10 UTC
Is there any difference between this report and the later one? If not, can we close this one?
Comment 7 Henry Laurent 2010-06-14 15:01:24 UTC
You can keep whichever one you find easier to understand.
Both describe the same issue.
Comment 8 Jan Beulich 2010-06-14 15:28:42 UTC
Let's use the other one then.

*** This bug has been marked as a duplicate of bug 613490 ***