|
Bugzilla – Full Text Bug Listing |
| Summary: | data corruption on DVD-RAM | ||
|---|---|---|---|
| Product: | [openSUSE] openSUSE 10.2 | Reporter: | Stanislav Brabec <sbrabec> |
| Component: | Kernel | Assignee: | Tony Jones <tonyj> |
| Status: | RESOLVED WORKSFORME | QA Contact: | E-mail List <qa-bugs> |
| Severity: | Major | ||
| Priority: | P5 - None | CC: | jack, jeffm, nadvornik |
| Version: | Final | ||
| Target Milestone: | --- | ||
| Hardware: | i586 | ||
| OS: | All | ||
| Whiteboard: | |||
| Found By: | Other | Services Priority: | |
| Business Priority: | Blocker: | --- | |
| Marketing QA Status: | --- | IT Deployment: | --- |
| Attachments: | blktrace.out.bz2 | ||
|
Description
Stanislav Brabec
2005-06-16 12:12:55 UTC
I don't believe this bug is related to #74402. That bug is an issue with vfat not honoring -osync and getting corruption during an eject-before-umount case. I suspect this bug has to do with inode/dentry refcounting in udffs. Since this is a CDROM device, it won't eject unless it can umount the filesystem first. I've been playing with the drive here. The main result is that without 'subfs' everything seems to be working fine and no data is lost. I'll try to find out more later. Can you confirm your second result, that some drives report disk change, even if medium removal is prevented and medium unload fails? If it is true, it probably means, that kernel should track or check door lock status before sending eject. Yes, on my notebook (HP Omnibook 6000) media-changed for the CD-ROM drive is set even though the previous CDROMEJECT was unsuccessful and returned EIO. OTOH I have not been able to reproduce the behaviour on any other computer... It seems, that I am able to repeat this bug even without subfs by calling eject without previous sync. Vanilla kernel 2.6.12.1. Symptoms are the same (used inodes on mounted medium in syslog, corrupted directory or file). But I am not yet able to simply reproduce this corruption. When no eject method specified, eject tries them all in this order until one succeeds (or all fai): ioctl(fd, CDROMEJECT); series of ioctl(fd, SCSI_IOCTL_SEND_COMMAND, (void *)&scsi_cmd); ioctl(fd, FDEJECT); ioctl(fd, MTIOCTOP, &op); I would think the burner dislikes especialy the SCSI commands. Could you please try to reproduce with each eject mehod? I guess that before trying to send eject ioctl, eject calls umount and umount should write all buffers to medium. As I wrote, I am not able to reproduce this problem in a simple way without subfs. Yes, the problem with the eject command is, that it call umount(1) and that does sync before unmounting. I'll extend my ejecting program to not only call CDROMEJECT (which is the right command and works) but also call umount(2) so we avoid syncing. And just for record - I reproduced the problem with ext2+subfs so it's probably not a UDF problem. So the problem arise when ioctl CDROMEJECT is called before umount. The ioctl returns EIO but the medium is ejected anyway and hence kernel starts to spit errors and the rest is obvious. Jens, what is the expected behaviour? I think that returning EIO but ejecting the CD is wrong but that seems more like the drive's fault... Do you have some suggestion how to fix it? I guess you want NEEDINFO from Jens Axboe. Further investigation shows that the command send to DVD-RAM returns error 0x8000002 (and hence ioctl() subsequently returns EIO). Hello! Due to the proposal from Andreas Jaeger, this bug will be closed as a reminder, because it is in a NEEDINFO-state for longer than 4 weeks by now which indicates a very low activity for this bug. If the required information is offered by the respective person or the bug has in fact a wrong status, it should be reopened, changing/correcting the information and a reason should be provided. In any case there should be checked if this problem hasn't been already fixed or is at least of relevance for the current release, specially for the older bugs (<9.3). Sorry if this causes you any inconvenience. Kind regards, the BNC-Screening-Team Not yet fixed, reopening. Should be ok for 10.1. Reopening. Experienced again in SuSE Linux 10.1. This time reproduced on i586 machine. On cleanly formatted DVD-RAM (mkudffs --vid=BackupDVD1 --media-type=dvdram --utf8 /dev/hdc), ejected and inserted only as fresh formatted to automount it, then wrote 4GB, I again got: Jun 18 02:41:22 utx kernel: udf: udf_read_inode(ino 558554) failed !bh Jun 18 02:41:22 utx kernel: udf: udf_get_filelongad() invalidparms Jun 18 02:41:37 utx kernel: udf: udf_read_inode(ino 558554) failed !bh Jun 18 02:41:37 utx kernel: udf: udf_get_filelongad() invalidparms Jun 18 02:41:37 utx kernel: udf: udf_read_inode(ino 558554) failed !bh In 4th level of subdirectory I found corrupted directory. As root it returns: utx:/media/BackupDVD1/sb/SuSE_Atari_Amiga/Atari_130XE # ls /bin/ls: REST: Permission denied BASIC CHR COM ENT FRACTAL INDEX MARS PHOTO ROMS SRC TOS TOSMAN TXT And reading another corrupted medium with unfinished write, I got: Jun 17 15:51:40 utx kernel: Unable to handle kernel paging request at virtual address bc60a4c0 Jun 17 15:51:40 utx kernel: f12aefd0 Jun 17 15:51:40 utx kernel: *pde = 00000000 Jun 17 15:51:40 utx kernel: Oops: 0000 [#1] Jun 17 15:51:40 utx kernel: CPU: 0 Jun 17 15:51:40 utx kernel: EIP: 0060:[<f12aefd0>] Tainted: G U VLI Jun 17 15:51:40 utx kernel: EFLAGS: 00010217 (2.6.16.13-4-default #1) Jun 17 15:51:40 utx kernel: eax: 00000028 ebx: bc60a4c0 ecx: c4adbd5c edx: 00000030 Jun 17 15:51:40 utx kernel: esi: a6508878 edi: e8b2f214 ebp: c4adbd44 esp: c4adbccc Jun 17 15:51:40 utx kernel: ds: 007b es: 007b ss: 0068 Jun 17 15:51:40 utx kernel: Stack: <0>bc60a4c0 e8b2f1dc f12a58fc 00000001 c4adbd4c e8b2f214 c4adbd54 c4adbd44 Jun 17 15:51:40 utx kernel: f12a59f5 c4adbd44 c4adbd58 c4adbd54 00000001 c4adbd5c e8b2f2ff e8b2f33c Jun 17 15:51:40 utx kernel: e8b2f1dc df4ca564 f12ae6c0 c4adbd44 c4adbd58 c4adbd54 00000001 e8b2f214 Jun 17 15:51:40 utx kernel: Call Trace: Jun 17 15:51:40 utx kernel: [<f12a58fc>] udf_current_aext+0xab/0x12d [udf] Jun 17 15:51:40 utx kernel: [<f12a59f5>] udf_next_aext+0x77/0x87 [udf] Jun 17 15:51:40 utx kernel: [<f12ae6c0>] udf_discard_prealloc+0x153/0x266 [udf] Jun 17 15:51:40 utx kernel: [<f12a8d7d>] udf_clear_inode+0x16/0x2a [udf] Jun 17 15:51:40 utx kernel: [<c015e6c8>] clear_inode+0xd5/0x102 Jun 17 15:51:40 utx kernel: [<c015e8eb>] generic_drop_inode+0x101/0x114 Jun 17 15:51:40 utx kernel: [<c015e26b>] iput+0x63/0x65 Jun 17 15:51:40 utx kernel: [<f12a6d9c>] udf_iget+0x85/0x8c [udf] Jun 17 15:51:40 utx kernel: [<f12aa7cc>] udf_lookup+0x62/0x85 [udf] Jun 17 15:51:40 utx kernel: [<c01554a4>] do_lookup+0xa3/0x135 Jun 17 15:51:40 utx kernel: [<c0156e6f>] __link_path_walk+0x6c1/0xab6 Jun 17 15:51:40 utx kernel: [<c0164a2e>] setxattr+0xb0/0xbf Jun 17 15:51:40 utx kernel: [<c01572ab>] link_path_walk+0x47/0xb9 Jun 17 15:51:40 utx kernel: [<c0157600>] do_path_lookup+0x1cc/0x1e6 Jun 17 15:51:40 utx kernel: [<c015d3d4>] dput+0x31/0x122 Jun 17 15:51:40 utx kernel: [<c01607b9>] mntput_no_expire+0x11/0x59 Jun 17 15:51:40 utx kernel: [<c0164b08>] sys_setxattr+0x43/0x4f Jun 17 15:51:40 utx kernel: [<c01575cc>] do_path_lookup+0x198/0x1e6 Jun 17 15:51:40 utx kernel: [<c0157d0e>] __user_walk_fd+0x29/0x3a Jun 17 15:51:40 utx kernel: [<c0152326>] vfs_lstat_fd+0x12/0x39 Jun 17 15:51:40 utx kernel: [<c0157600>] do_path_lookup+0x1cc/0x1e6 Jun 17 15:51:40 utx kernel: [<c015d3d4>] dput+0x31/0x122 Jun 17 15:51:40 utx kernel: [<c01607b9>] mntput_no_expire+0x11/0x59 Jun 17 15:51:40 utx kernel: [<c0164b08>] sys_setxattr+0x43/0x4f Jun 17 15:51:40 utx kernel: [<c0152392>] sys_lstat64+0xf/0x23 Jun 17 15:51:40 utx kernel: [<c010580a>] do_syscall_trace+0x123/0x169 Jun 17 15:51:40 utx kernel: [<c0102a39>] syscall_call+0x7/0xb Jun 17 15:51:40 utx kernel: Code: c3 56 85 c0 53 89 d6 89 c3 74 04 85 c9 75 0f 68 40 0d 2b f1 e8 ba 8c e6 ce 31 c0 5e eb 21 8b 01 85 c0 78 19 8d 50 08 39 f2 77 12 <83> 3b 00 89 d8 74 0b 83 7c 24 0c 00 74 06 89 11 eb 02 31 c0 5b >>EIP; f12aefd0 <pg0+30f1ffd0/3fc6f400> <===== >>ebx; bc60a4c0 <phys_startup_32+bc50a4c0/c0000000> >>ecx; c4adbd5c <pg0+474cd5c/3fc6f400> >>esi; a6508878 <phys_startup_32+a6408878/c0000000> >>edi; e8b2f214 <pg0+287a0214/3fc6f400> >>ebp; c4adbd44 <pg0+474cd44/3fc6f400> >>esp; c4adbccc <pg0+474cccc/3fc6f400> Trace; f12a58fc <pg0+30f168fc/3fc6f400> Trace; f12a59f5 <pg0+30f169f5/3fc6f400> Trace; f12ae6c0 <pg0+30f1f6c0/3fc6f400> Trace; f12a8d7d <pg0+30f19d7d/3fc6f400> Trace; c015e6c8 <clear_inode+d5/102> Trace; c015e8eb <generic_drop_inode+101/114> Trace; c015e26b <iput+63/65> Trace; f12a6d9c <pg0+30f17d9c/3fc6f400> Trace; f12aa7cc <pg0+30f1b7cc/3fc6f400> Trace; c01554a4 <do_lookup+a3/135> Trace; c0156e6f <__link_path_walk+6c1/ab6> Trace; c0164a2e <setxattr+b0/bf> Trace; c01572ab <link_path_walk+47/b9> Trace; c0157600 <do_path_lookup+1cc/1e6> Trace; c015d3d4 <dput+31/122> Trace; c01607b9 <mntput_no_expire+11/59> Trace; c0164b08 <sys_setxattr+43/4f> Trace; c01575cc <do_path_lookup+198/1e6> Trace; c0157d0e <__user_walk_fd+29/3a> Trace; c0152326 <vfs_lstat_fd+12/39> Trace; c0157600 <do_path_lookup+1cc/1e6> Trace; c015d3d4 <dput+31/122> Trace; c01607b9 <mntput_no_expire+11/59> Trace; c0164b08 <sys_setxattr+43/4f> Trace; c0152392 <sys_lstat64+f/23> Trace; c010580a <do_syscall_trace+123/169> Trace; c0102a39 <syscall_call+7/b> This architecture has variable length instructions, decoding before eip is unreliable, take these instructions with a pinch of salt. Code; f12aefa5 <pg0+30f1ffa5/3fc6f400> 00000000 <_EIP>: Code; f12aefa5 <pg0+30f1ffa5/3fc6f400> 0: c3 ret Code; f12aefa6 <pg0+30f1ffa6/3fc6f400> 1: 56 push %esi Code; f12aefa7 <pg0+30f1ffa7/3fc6f400> 2: 85 c0 test %eax,%eax Code; f12aefa9 <pg0+30f1ffa9/3fc6f400> 4: 53 push %ebx Code; f12aefaa <pg0+30f1ffaa/3fc6f400> 5: 89 d6 mov %edx,%esi Code; f12aefac <pg0+30f1ffac/3fc6f400> 7: 89 c3 mov %eax,%ebx Code; f12aefae <pg0+30f1ffae/3fc6f400> 9: 74 04 je f <_EIP+0xf> Code; f12aefb0 <pg0+30f1ffb0/3fc6f400> b: 85 c9 test %ecx,%ecx Code; f12aefb2 <pg0+30f1ffb2/3fc6f400> d: 75 0f jne 1e <_EIP+0x1e> Code; f12aefb4 <pg0+30f1ffb4/3fc6f400> f: 68 40 0d 2b f1 push $0xf12b0d40 Code; f12aefb9 <pg0+30f1ffb9/3fc6f400> 14: e8 ba 8c e6 ce call cee68cd3 <_EIP+0xcee68cd3> Code; f12aefbe <pg0+30f1ffbe/3fc6f400> 19: 31 c0 xor %eax,%eax Code; f12aefc0 <pg0+30f1ffc0/3fc6f400> 1b: 5e pop %esi Code; f12aefc1 <pg0+30f1ffc1/3fc6f400> 1c: eb 21 jmp 3f <_EIP+0x3f> Code; f12aefc3 <pg0+30f1ffc3/3fc6f400> 1e: 8b 01 mov (%ecx),%eax Code; f12aefc5 <pg0+30f1ffc5/3fc6f400> 20: 85 c0 test %eax,%eax Code; f12aefc7 <pg0+30f1ffc7/3fc6f400> 22: 78 19 js 3d <_EIP+0x3d> Code; f12aefc9 <pg0+30f1ffc9/3fc6f400> 24: 8d 50 08 lea 0x8(%eax),%edx Code; f12aefcc <pg0+30f1ffcc/3fc6f400> 27: 39 f2 cmp %esi,%edx Code; f12aefce <pg0+30f1ffce/3fc6f400> 29: 77 12 ja 3d <_EIP+0x3d> This decode from eip onwards should be reliable Code; f12aefd0 <pg0+30f1ffd0/3fc6f400> 00000000 <_EIP>: Code; f12aefd0 <pg0+30f1ffd0/3fc6f400> <===== 0: 83 3b 00 cmpl $0x0,(%ebx) <===== Code; f12aefd3 <pg0+30f1ffd3/3fc6f400> 3: 89 d8 mov %ebx,%eax Code; f12aefd5 <pg0+30f1ffd5/3fc6f400> 5: 74 0b je 12 <_EIP+0x12> Code; f12aefd7 <pg0+30f1ffd7/3fc6f400> 7: 83 7c 24 0c 00 cmpl $0x0,0xc(%esp) Code; f12aefdc <pg0+30f1ffdc/3fc6f400> c: 74 06 je 14 <_EIP+0x14> Code; f12aefde <pg0+30f1ffde/3fc6f400> e: 89 11 mov %edx,(%ecx) Code; f12aefe0 <pg0+30f1ffe0/3fc6f400> 10: eb 02 jmp 14 <_EIP+0x14> Code; f12aefe2 <pg0+30f1ffe2/3fc6f400> 12: 31 c0 xor %eax,%eax Code; f12aefe4 <pg0+30f1ffe4/3fc6f400> 14: 5b pop %ebx Stanislav - Since 10.1 no longer uses subfs, I'd like to start at the beginning again. Are the steps to reproduce still the same? The lost buffer is not reproducible and not reproducible by an easy way in the top of this report. I wrote the same data second time and the error did not occur. For SuSE 10.0 this problem appeared during nearly every writing and the DVD-RAM with UDF was unusable. In 10.1, I have experienced this problem only once during writing of about 10 x 4G. But the filesystem corruption is the same: The directory should contain another directory. Instead of it, it contains undeletable object of unknown type. I am using rsync for creating DVD-RAM back-ups. This time the corruption appeared without ejecting the media (it was ejected only when it was empty). Syslog did not show any IDE or I/O error in time of writing. Corruption appeared in the middle of writing. Possible blind guesses: - race while renaming file and doing mkdir in the same directory at once - random buffer overflow for long or international file names - fuzzy buffers (trashed before write or written outdated) The Oops is reproducible, if you have properly corrupted filesystem. Making another back-up of the same directory, the error occurred nearly in the same place, now the one level upper is not writable ("permission denied" error) nor deletable ("directory not empty" error).
Now the error occurs one inode later.
Jul 3 02:34:00 utx kernel: udf: udf_read_inode(ino 558555) failed !bh
I moved two corrupted inodes to a separate directory and tried to delete them:
rm -rf /media/BackupDVD1/BAD
rm: cannot remove directory `/media/BackupDVD1/BAD/tool-options': Directory not empty
rm: cannot lstat `/media/BackupDVD1/BAD/SuSE_Atari_Amiga/Atari_130XE': Permission denied
Making a "ls -alRi --sort=none /media/BackupDVD1", corrupted item was the last directory in the list of each "session". "sync" command was called before ejecting the disc. It seems, that the corruption is caused by a "sync" command in some circumstances. After removing it from the backup script, I have not experienced the corruption. Reopen for 10.1 if the bug still exists. Yes, it still exists in 10.1 and comments #15 and below are related to 10.1. Note to the previous comments: The corruption occurs much more frequent, if "sync" command is called before ejecting the disc, but it was already experienced several times even without calling the "sync". Corruption experienced once again. This time it was after "disc full" error using rsync. Does this still happen on 10.2? If so, please reopen with the new log messages. Still happens in openSUSE 10.2: hammer:~ # ls -al "/media/BackupDVD1/sb_deleted/.wine/" ls: cannot access /media/BackupDVD1/sb_deleted/.wine/dosdevices: Permission denied total 430 drwxr-xr-x 4 root users 240 Oct 4 11:04 . drwxr-xr-x 10 root users 4028 Feb 27 10:52 .. ?????????? ? ? ? ? ? dosdevices drwxr-xr-x 4 root users 204 Dec 28 21:20 drive_c -rw------- 1 root users 366067 Jan 10 23:51 system.reg -rw------- 1 root users 52825 Jan 11 01:08 user.reg hammer:~ # ls -al "/media/BackupDVD1/sb_deleted/.wine/drive_c/Program Files" ls: cannot access /media/BackupDVD1/sb_deleted/.wine/drive_c/Program Files/Winwap Technologies: Permission denied ls: cannot access /media/BackupDVD1/sb_deleted/.wine/drive_c/Program Files/Neat Image: Permission denied ls: cannot access /media/BackupDVD1/sb_deleted/.wine/drive_c/Program Files/InstallShield Installation Information: Permission denied ls: cannot access /media/BackupDVD1/sb_deleted/.wine/drive_c/Program Files/FunTom: Permission denied ls: cannot access /media/BackupDVD1/sb_deleted/.wine/drive_c/Program Files/FreeUndelete: Permission denied ls: cannot access /media/BackupDVD1/sb_deleted/.wine/drive_c/Program Files/DiMAGE Viewer: Permission denied ls: cannot access /media/BackupDVD1/sb_deleted/.wine/drive_c/Program Files/ASUS: Permission denied total 0 drwxr-xr-x 10 root users 480 Dec 28 21:20 . drwxr-xr-x 4 root users 204 Dec 28 21:20 .. ?????????? ? ? ? ? ? ASUS drwxr-xr-x 3 root users 92 Dec 28 21:20 Common Files ?????????? ? ? ? ? ? DiMAGE Viewer ?????????? ? ? ? ? ? FreeUndelete ?????????? ? ? ? ? ? FunTom ?????????? ? ? ? ? ? InstallShield Installation Information ?????????? ? ? ? ? ? Neat Image ?????????? ? ? ? ? ? Winwap Technologies Notes: - Medium was never inserted to any OS except openSUSE 10.2. - Backup process was interrupted, after reboot I continued with backing-up. - My backup script uses rsync --delete -b --backup-dir=/media/BackupDVD1/sb_deleted -v -aHS - I did "mount -oremount,noatime /dev/sr0" during work. Syslog shows: Mar 5 12:08:18 hammer kernel: UDF-fs INFO UDF 0.9.8.1 (2004/29/09) Mounting volume 'BackupDVD1', timestamp 2007/02/26 07:14 (103c) Mar 5 12:16:45 hammer kernel: udf: udf_read_inode(ino 1935807) failed !bh Mar 5 12:16:46 hammer kernel: udf: udf_read_inode(ino 1935983) failed !bh Mar 5 12:16:46 hammer kernel: udf: udf_read_inode(ino 1935811) failed !bh Mar 5 12:16:46 hammer kernel: udf: udf_read_inode(ino 1935844) failed !bh Mar 5 12:16:46 hammer kernel: udf: udf_read_inode(ino 1935848) failed !bh Mar 5 12:16:46 hammer kernel: udf: udf_read_inode(ino 1935880) failed !bh Mar 5 12:16:46 hammer kernel: udf: udf_read_inode(ino 1935919) failed !bh Mar 5 12:16:46 hammer kernel: udf: udf_read_inode(ino 1935920) failed !bh Mar 5 12:16:46 hammer kernel: udf: udf_read_inode(ino 1935954) failed !bh Mar 5 12:16:46 hammer kernel: udf: udf_read_inode(ino 1935956) failed !bh Mar 5 12:16:47 hammer kernel: udf: udf_read_inode(ino 1936221) failed !bh Mar 5 12:18:19 hammer kernel: udf: udf_read_inode(ino 1935807) failed !bh Mar 5 12:18:39 hammer kernel: udf: udf_read_inode(ino 1935807) failed !bh Mar 5 12:18:48 hammer kernel: udf: udf_read_inode(ino 1935807) failed !bh Mar 5 12:28:59 hammer kernel: udf: udf_read_inode(ino 1935807) failed !bh No other directories except upper mentioned were corrupted during backup of full 4.6GB and several thousand directories. This bug is hard to reproduce, but easily to experience. If I want to experience it, I am simply using my home-backup script. If you use it daily, often stop by Ctrl+C if and then continue while still working on the desktop, you will experience this corruption in a few days. rsync --delete -b --backup-dir=/media/test_dvdram/home_deleted -v -aHS . /media/test/dvdram/home I'm going to try taking a look at it next week. Thanks for the testcase, this is pretty much what I'd distilled from the previous comments also. And an additional info: Both DVD+-RW/RAM units where I experienced this bug are manufactured by LG. There is known problem of some LG firmwares, that in some special conditions they may trash buffers, e. g. problems reported with cdrecord sg0/sr0 locking clash problem (status check commands issued to the drive break running burning process). Stanislav, have you tried using a different form of drive? I note in comment #31 that there is possible concern over bad LG firmware. I don't have this drive and have been unable to reproduce using the hardware I have, No. All drives I am using, are LG. I can provide you access to one. VladimĂr may know more about the LG drives problem in cdrecord. It was bug #226019 > I can provide you access to one.
I'm located in the US, might be problematic but I'll let you decide this.
Stanislav, since I'm unable to reproduce this on my hardware, should I reassign to someone geographically in same area as you? Don't see you sending the drive to the United States as being particularly practical. I just encountered very similar breakage on USB flash while working around bug 388248 on openSUSE 10.3 i586. How to reproduce: 1. Insert USB flash 2. Mount manually (I used "mount -tvfat -onoexec,nosuid,async,nodev,gid=100,umask=0002,utf8=true,quiet,rw /dev/sda1 /mnt/", but it seems to be reproducible with other arguments as well.) 3. mkdir /mnt/testdir 4. sync 5. umount /mnt 6. sync 7. Wait for sync finished. Wait few additional seconds. Remove the USB flash Now insert again and you see corrupted data. Note that only manual umount is affected. If I am using GNOME volume manager to unmount the medium, I never seen the corruption. I don't know GNOME volume manager exactly. Is there any other action, which is not performed when doing steps above? I'm reluctant to believe this ;) Can you show me on Monday at work? I'd suspect some HW or userspace issue because I haven't seen any other reports of similar problems... I was not able to reproduce the problem with an arbitrary flash, but when I restored the original image back, the I got the corruption again. Steps to reproduce: see comment 37 ls -al /mnt/ ls: cannot access /mnt/testdir: Input/output error total 8 drwxrwxr-x 3 root users 4096 Jan 1 1970 . drwxr-xr-x 25 root root 4096 May 12 20:58 .. d????????? ? ? ? ? ? testdir Note that writing large amount of data to this flash works. Only last few blocks are lost. Syslog contents: May 12 21:26:38 utx kernel: usb 4-6: new high speed USB device using ehci_hcd and address 11 May 12 21:26:38 utx kernel: usb 4-6: new device found, idVendor=1307, idProduct=0163 May 12 21:26:38 utx kernel: usb 4-6: new device strings: Mfr=1, Product=2, SerialNumber=3 May 12 21:26:38 utx kernel: usb 4-6: Product: USB Flash Drive May 12 21:26:38 utx kernel: usb 4-6: Manufacturer: USB 2.0 May 12 21:26:38 utx kernel: usb 4-6: SerialNumber: 789485f5f19059 May 12 21:26:38 utx kernel: usb 4-6: configuration #1 chosen from 1 choice May 12 21:26:38 utx kernel: scsi7 : SCSI emulation for USB Mass Storage devices May 12 21:26:38 utx kernel: usb-storage: device found at 11 May 12 21:26:38 utx kernel: usb-storage: waiting for device to settle before scanning May 12 21:26:39 utx kernel: scsi 7:0:0:0: Direct-Access USB 2.0 USB Flash Drive 0.00 PQ: 0 ANSI: 2 May 12 21:26:39 utx kernel: sd 7:0:0:0: [sda] 7897087 512-byte hardware sectors (4043 MB) May 12 21:26:39 utx kernel: sd 7:0:0:0: [sda] Write Protect is off May 12 21:26:39 utx kernel: sd 7:0:0:0: [sda] Mode Sense: 00 00 00 00 May 12 21:26:39 utx kernel: sd 7:0:0:0: [sda] Assuming drive cache: write through May 12 21:26:39 utx kernel: sd 7:0:0:0: [sda] 7897087 512-byte hardware sectors (4043 MB) May 12 21:26:39 utx kernel: sd 7:0:0:0: [sda] Write Protect is off May 12 21:26:39 utx kernel: sd 7:0:0:0: [sda] Mode Sense: 00 00 00 00 May 12 21:26:39 utx kernel: sd 7:0:0:0: [sda] Assuming drive cache: write through May 12 21:26:40 utx kernel: sda: sda1 May 12 21:26:40 utx kernel: sd 7:0:0:0: [sda] Attached SCSI removable disk May 12 21:26:40 utx kernel: sd 7:0:0:0: Attached scsi generic sg0 type 0 May 12 21:26:40 utx kernel: usb-storage: device scan complete May 12 21:28:01 utx kernel: FAT: Did not find valid FSINFO signature. May 12 21:28:01 utx kernel: Found signature1 0x00000000 signature2 0x00000000 (sector = 1) May 12 21:28:03 utx kernel: FAT: Did not find valid FSINFO signature. May 12 21:28:03 utx kernel: Found signature1 0x00000000 signature2 0x00000000 (sector = 1) May 12 21:28:11 utx kernel: FAT: Did not find valid FSINFO signature. May 12 21:28:11 utx kernel: Found signature1 0x00000000 signature2 0x00000000 (sector = 1) May 12 21:28:28 utx kernel: usb 4-6: USB disconnect, address 11 May 12 21:28:37 utx kernel: usb 4-6: new high speed USB device using ehci_hcd and address 12 May 12 21:28:37 utx kernel: usb 4-6: new device found, idVendor=1307, idProduct=0163 May 12 21:28:37 utx kernel: usb 4-6: new device strings: Mfr=1, Product=2, SerialNumber=3 May 12 21:28:37 utx kernel: usb 4-6: Product: USB Flash Drive May 12 21:28:37 utx kernel: usb 4-6: Manufacturer: USB 2.0 May 12 21:28:37 utx kernel: usb 4-6: SerialNumber: 789485f5f19059 May 12 21:28:37 utx kernel: usb 4-6: configuration #1 chosen from 1 choice May 12 21:28:37 utx kernel: scsi8 : SCSI emulation for USB Mass Storage devices May 12 21:28:37 utx kernel: usb-storage: device found at 12 May 12 21:28:37 utx kernel: usb-storage: waiting for device to settle before scanning May 12 21:28:38 utx kernel: scsi 8:0:0:0: Direct-Access USB 2.0 USB Flash Drive 0.00 PQ: 0 ANSI: 2 May 12 21:28:38 utx kernel: sd 8:0:0:0: [sdb] 7897087 512-byte hardware sectors (4043 MB) May 12 21:28:38 utx kernel: sd 8:0:0:0: [sdb] Write Protect is off May 12 21:28:38 utx kernel: sd 8:0:0:0: [sdb] Mode Sense: 00 00 00 00 May 12 21:28:38 utx kernel: sd 8:0:0:0: [sdb] Assuming drive cache: write through May 12 21:28:38 utx kernel: sd 8:0:0:0: [sdb] 7897087 512-byte hardware sectors (4043 MB) May 12 21:28:38 utx kernel: sd 8:0:0:0: [sdb] Write Protect is off May 12 21:28:38 utx kernel: sd 8:0:0:0: [sdb] Mode Sense: 00 00 00 00 May 12 21:28:38 utx kernel: sd 8:0:0:0: [sdb] Assuming drive cache: write through May 12 21:28:38 utx kernel: sdb: sdb1 May 12 21:28:38 utx kernel: sd 8:0:0:0: [sdb] Attached SCSI removable disk May 12 21:28:38 utx kernel: sd 8:0:0:0: Attached scsi generic sg0 type 0 May 12 21:28:38 utx kernel: usb-storage: device scan complete May 12 21:28:52 utx kernel: FAT: Did not find valid FSINFO signature. May 12 21:28:52 utx kernel: Found signature1 0x00000000 signature2 0x00000000 (sector = 1) May 12 21:28:55 utx smartd[3593]: Device: /dev/hda, SMART Usage Attribute: 190 Temperature_Celsius changed from 67 to 56 May 12 21:28:55 utx smartd[3593]: Device: /dev/hda, SMART Usage Attribute: 194 Temperature_Celsius changed from 33 to 44 May 12 21:28:55 utx smartd[3593]: Device: /dev/hda, SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 70 to 66 May 12 21:28:56 utx kernel: FAT: Filesystem panic (dev sdb1) May 12 21:28:56 utx kernel: fat_get_cluster: invalid cluster chain (i_pos 0) May 12 21:28:56 utx kernel: File system has been set read-only May 12 21:31:40 utx su: (to nobody) root on none May 12 21:31:43 utx syslog-ng[2072]: last message repeated 2 times Created attachment 214511 [details]
blktrace.out.bz2
blktrace output (binary form).
Just seen another VFAT corruptions on openSUSE 11.0 (i586 and x86_64). Reproduced on another vendor formatted medium (A-Data 2G SD+USB card in USB mode). Medium was accepted by HAL and mounted, files were copied. When reading them, tails of copied archives were corrupted (truncated). Device was umounted via GNOME desktop device icon context menu in both cases. Syslog shows no errors. I did a research on this corruption: - copy file to the problematic flash - compare files - the result is OK - umount ; sync (or use GNOME desktop unmount) - dump the whole contents of the flash - eject and re-insert - mount (or wait for mounting) - dump the whole contents of the flash - compare files - somewhere in the last few megabytes, there is a corrupted block - compare dumps - there is 2048 different bytes So I am starting to believe, that it is not a problem of the filesystem. The flash problem was added to the kernel database as http://bugzilla.kernel.org/show_bug.cgi?id=11012 I did not reproduce the DVD-RAM problem since update to openSUSE 11.0. Based on comment #43 "I did not reproduce the DVD-RAM problem since update to openSUSE 11.0" I'm proposing closing this bug. You are free to reopen if it recoccurs. I agree. |