Bug 1211898 - stress-ng fsize failure on NFSv3
Summary: stress-ng fsize failure on NFSv3
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Kernel
Version: Current
Hardware: Other Other
Importance: P5 - None : Normal
Target Milestone: ---
Assignee: openSUSE Kernel Bugs
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-06-01 09:12 UTC by Richard Palethorpe
Modified: 2023-06-26 06:55 UTC

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Description Richard Palethorpe 2023-06-01 09:12:40 UTC
While setting up some OpenQA NFS testing I came across the following error with stress-ng.

$ stress-ng --sequential -1 --timeout 3 --class filesystem
...
stress-ng: fail:  [3866] fsize: fallocate unexpectedly succeeded at offset 262144 (0x40000), expecting EFBIG error
stress-ng: fail:  [3866] fsize: expected a SIGXFSZ signal at offset 262144 (0x40000), nothing happened
stress-ng: info:  [3866] fsize: fallocate unexpectedly succeeded at offset 106797 (0x1a12d), expecting EFBIG error
stress-ng: info:  [3866] fsize: fallocate unexpectedly succeeded at offset 1 (0x1), expecting EFBIG error
stress-ng: info:  [3866] fsize: fallocate unexpectedly succeeded at offset 3 (0x3), expecting EFBIG error
stress-ng: info:  [3866] fsize: fallocate unexpectedly succeeded at offset 7 (0x7), expecting EFBIG error
stress-ng: info:  [3866] fsize: fallocate unexpectedly succeeded at offset 15 (0xf), expecting EFBIG error
stress-ng: info:  [3866] fsize: fallocate unexpectedly succeeded at offset 31 (0x1f), expecting EFBIG error
stress-ng: info:  [3866] fsize: fallocate unexpectedly succeeded at offset 63 (0x3f), expecting EFBIG error
...

This does not happen with NFSv4. It happens with both NFSv3 sync and async. I don't see any other errors.

The tests have not been merged or scheduled on the main OpenQA instance yet. When they are, I can post a link. The pull request is https://github.com/os-autoinst/os-autoinst-distri-opensuse/pull/17181
Comment 1 Neil Brown 2023-06-26 06:55:42 UTC
The timeout setting is too small.  Use a bigger number.

NFSv3 (and NFSv4.1, but not NFSv4.2) does not implement fallocate().
So stress-ng uses a "shim_emulate_fallocate()" instead, which writes data.

shim_emulate_fallocate() stops trying to write as soon as keep_stressing_flag() returns false.
One of the things that makes it return false is delivery of SIGALRM, which happens when the timeout expires.

So after the timeout, a fallocate attempt will appear to succeed.  This is arguably a bug in stress-ng.

NFS is behaving correctly.  stress-ng is not getting an error, because the way it emulates fallocate is not reliable.
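
To illustrate the mechanism, here is a minimal sketch. It is not the stress-ng source: stress-ng is written in C, and Python is used here only for brevity. The names emulate_fallocate(), keep_running and demo(), the 4 KiB chunk size, and the use of a local temporary file instead of an NFSv3 mount are all assumptions made for the example. The limit of 262144 bytes is taken from the log above. The sketch shows that a write-based emulation reports EFBIG when it really writes past RLIMIT_FSIZE, but reports success when the "keep running" check is already false, because then no write is attempted at all.

```python
import errno
import os
import resource
import signal
import tempfile

LIMIT = 262144   # file size limit in bytes, the offset seen in the log
CHUNK = 4096     # assumed chunk size for the emulation


def emulate_fallocate(fd, offset, length, keep_running):
    """Hypothetical write-based fallocate emulation.

    Returns 0 on apparent success, or an errno value on failure.
    """
    buf = bytes(CHUNK)
    pos = offset
    end = offset + length
    while pos < end:
        if not keep_running():
            # Stop early (e.g. after the timeout). Nothing was written,
            # so no EFBIG can be seen, yet the caller gets "success".
            return 0
        count = min(CHUNK, end - pos)
        try:
            written = os.pwrite(fd, buf[:count], pos)
        except OSError as exc:
            return exc.errno
        if written == 0:
            return errno.EIO  # defensive: avoid looping forever
        pos += written
    return 0


def demo():
    # Ignore SIGXFSZ so that the failing write returns EFBIG instead of
    # terminating the process.
    signal.signal(signal.SIGXFSZ, signal.SIG_IGN)
    soft, hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    resource.setrlimit(resource.RLIMIT_FSIZE, (LIMIT, hard))
    try:
        with tempfile.TemporaryFile() as tmp:
            fd = tmp.fileno()
            # Still running: the write at the limit is attempted and fails.
            running = emulate_fallocate(fd, LIMIT, CHUNK, lambda: True)
            # After the timeout: the loop exits before any write.
            stopped = emulate_fallocate(fd, LIMIT, CHUNK, lambda: False)
    finally:
        resource.setrlimit(resource.RLIMIT_FSIZE, (soft, hard))
    return running, stopped


if __name__ == "__main__":
    running, stopped = demo()
    print("while running:", errno.errorcode.get(running, running))
    print("after timeout:", errno.errorcode.get(stopped, stopped))
```

With a real fallocate() (as on NFSv4.2) the size check does not depend on any such flag, which would be consistent with the failure only showing up on NFSv3.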