Bug 66013 (suse51013)

Summary: pthread_rwlock_rdlock: blocking error
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Gernot Payer <gpayer>
Component: Basesystem
Assignee: Andreas Schwab <schwab>
Status: RESOLVED NORESPONSE
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P3 - Medium
CC: chrubis, gpayer, hmuelle, matz
Version: Current
Flags: schwab: needinfo? (gpayer)
Target Milestone: ---
Hardware: x86-64
OS: Linux
See Also: http://sourceware.org/bugzilla/show_bug.cgi?id=13701
Whiteboard:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
Attachments: pthread_rwlock_rdlock/2-1.c

Description Gernot Payer 2005-02-18 00:02:08 UTC
A low-priority reader is not blocked when acquiring the lock if a
medium-priority writer is waiting for the lock and a high-priority reader
holds the lock.
Comment 1 Gernot Payer 2005-02-18 00:02:08 UTC
Run attached test program.
Comment 2 Gernot Payer 2005-02-18 00:03:26 UTC
Created attachment 28607 [details]
pthread_rwlock_rdlock/2-1.c
Comment 3 Thorsten Kukuk 2005-02-18 21:09:00 UTC
. 
Comment 4 Thorsten Kukuk 2005-07-11 15:28:47 UTC
And this sounds like a duplicate of #66052 
Comment 5 Thorsten Kukuk 2005-07-11 15:29:18 UTC
Scheduler is implemented in kernel. 
Comment 6 Michael Gross 2005-10-04 12:40:21 UTC
Hello!

Following a proposal from Andreas Jaeger, this bug will be closed as WONTFIX
because it has not changed for more than 2 months, which indicates very low
activity.

If this bug is relevant for the current release or should be kept open for
another (not obvious) reason, please reopen it and state the reason for your
decision.

Please also check whether the status information for this bug is still
accurate, and correct it if you reopen the bug. Generally the product version
should be raised to the current release in this case. Sorry if this causes you
any inconvenience.

Kind regards,
the BNC-Screening-Team
Comment 7 Peter Morreale 2011-06-07 14:02:42 UTC
re-opening as the LTP test: .../pthread_rwlock_rdlock/2-1 clearly demonstrates that the issue still exists in (at least) OS-11.3.  

POSIX clearly states that if a writer is blocked, an additional reader entering into the lock must block until the writer has released the lock.  pthread_rwlock_rdlock() does not implement that behavior.  The additional reader obtains the lock while a thread wanting the write mode is blocked.

The behavior of the 2-1 test is:

main thread:   -> obtain read lock
writer thread: -> attempt to obtain write lock - block
reader thread: -> attempt to obtain read lock - succeeds.

The reader thread must block, since another thread wants a write lock.
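
The sequence above can be approximated with a small C sketch. This is not the attached 2-1.c: the function and variable names are hypothetical, the threads run under the default scheduling policy rather than with explicit priorities, and sleep() is used as a crude way to let each thread reach its lock call. It only reports whether the late reader got the lock while the writer was still waiting.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <unistd.h>

static pthread_rwlock_t rwlock = PTHREAD_RWLOCK_INITIALIZER;
static atomic_int writer_got_lock;   /* set once the writer holds the lock */
static atomic_int reader_got_lock;   /* set once the late reader holds it  */

static void *writer_fn(void *arg)
{
    (void)arg;
    /* Blocks for as long as the main thread holds its read lock. */
    pthread_rwlock_wrlock(&rwlock);
    atomic_store(&writer_got_lock, 1);
    pthread_rwlock_unlock(&rwlock);
    return NULL;
}

static void *reader_fn(void *arg)
{
    (void)arg;
    /* The call under discussion: does it wait behind the blocked writer? */
    pthread_rwlock_rdlock(&rwlock);
    atomic_store(&reader_got_lock, 1);
    pthread_rwlock_unlock(&rwlock);
    return NULL;
}

/* Hypothetical helper: returns 1 if the late reader acquired the lock
 * while the writer was still waiting, 0 if the reader was held back. */
int late_reader_overtakes_writer(void)
{
    pthread_t writer, reader;
    int rc, overtook;

    atomic_store(&writer_got_lock, 0);
    atomic_store(&reader_got_lock, 0);

    rc = pthread_rwlock_rdlock(&rwlock);      /* main thread: read lock */
    assert(rc == 0);

    rc = pthread_create(&writer, NULL, writer_fn, NULL);
    assert(rc == 0);
    sleep(1);                                 /* let the writer block */

    rc = pthread_create(&reader, NULL, reader_fn, NULL);
    assert(rc == 0);
    sleep(1);                                 /* let the reader try */

    /* Sample the flag before releasing, while the writer still waits. */
    overtook = atomic_load(&reader_got_lock);

    pthread_rwlock_unlock(&rwlock);           /* let everyone finish */
    pthread_join(writer, NULL);
    pthread_join(reader, NULL);
    return overtook;
}
```

Calling late_reader_overtakes_writer() from a main() built with -pthread and printing the result shows which way a given C library behaves. Because the sketch does not set scheduling priorities, it demonstrates the lock's default policy only, not the priority case the LTP test checks.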

There are, quite possibly, a lot of broken programs out there if they are dependent upon documented behavior.
Comment 8 Greg Kroah-Hartman 2011-08-31 21:09:28 UTC
Peter, I'd blame LTP, otherwise how was this test ever supposed to pass?

Is it file-system dependent?  Still an issue in openSUSE 11.4?
Comment 9 Peter Morreale 2011-08-31 21:58:49 UTC
(In reply to comment #8)
> Peter, I'd blame LTP, otherwise how was this test ever supposed to pass?
> 

:-)  The LTP code is functionally correct, unless I missed something.  Note I was testing with the current LTP version of this test, unsure whether anything changed from the original posting of this bug.

> Is it file-system dependent?  Still an issue in openSUSE 11.4?

Not that I'm aware of.  Not entirely sure how a file system would play into this.

Fails identically on OS 11.4
Comment 10 Cyril Hrubis 2011-09-01 08:07:42 UTC
Greg, we are actually working through the LTP testcases; we have proved many of them wrong and fixed them, but we couldn't find anything wrong with this one (apart from some coding style). So it would be great if somebody else could look at this as well.

The test source most likely hasn't been touched since 2002 (besides some minor whitespace cleanup), at least the git history says so.

And as far as I understand POSIX threads, the read lock should not be obtained if a writer holds the lock, or if a writer of higher or equal priority is blocked on it. Unfortunately the reader gets the lock here.
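
The priority rule only comes into play when the threads run under a real-time policy. As a rough sketch of how a test might request that, the helper below prepares thread attributes for SCHED_FIFO at a chosen priority. The name make_fifo_attr and the offset parameter are hypothetical and not taken from the LTP source.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: initialise *attr so that a thread created with it
 * asks for SCHED_FIFO at (minimum FIFO priority + offset).
 * Returns 0 on success or an error number from the pthread_attr_* calls. */
int make_fifo_attr(pthread_attr_t *attr, int offset)
{
    struct sched_param param = { 0 };
    int rc;

    assert(attr != NULL);

    rc = pthread_attr_init(attr);
    if (rc != 0)
        return rc;

    /* Without this, the new thread inherits the creator's policy and
     * the policy and priority set below are ignored. */
    rc = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED);
    if (rc != 0)
        return rc;

    /* Set the policy first so the priority is checked against it. */
    rc = pthread_attr_setschedpolicy(attr, SCHED_FIFO);
    if (rc != 0)
        return rc;

    param.sched_priority = sched_get_priority_min(SCHED_FIFO) + offset;
    return pthread_attr_setschedparam(attr, &param);
}
```

Passing such an attribute object to pthread_create() normally needs real-time privileges; without them the call can fail with EPERM, so a test using this approach usually has to run as root.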
Comment 11 Greg Kroah-Hartman 2011-09-01 23:24:32 UTC
So has this test ever worked?  If so, that's good perhaps we have a regression somewhere.

If not, I'd blame the test :)
Comment 12 Peter Morreale 2011-09-02 13:11:14 UTC
(In reply to comment #11)
> So has this test ever worked?  If so, that's good perhaps we have a regression
> somewhere.
> 
> If not, I'd blame the test :)

Unsure whether this has ever worked, given that the original bug report was 6 years ago, probably not.
Comment 13 Cyril Hrubis 2011-09-02 13:44:37 UTC
And there were tests that Never Worked but still were correct and resulted in kernel or glibc patches.

And as we are on the noble and difficult path of making LTP a stable test suite, we still need either to prove the test wrong or fix Linux.
Comment 14 Greg Kroah-Hartman 2011-09-02 16:15:38 UTC
Proving LTP wrong, or fixing it, or the kernel, is _very_ low on the list of things we need to work on at the moment.

Have you brought this issue up on the upstream LTP mailing list?  What do the developers there say about this?  Have they asked the kernel community about this issue?
Comment 15 Cyril Hrubis 2011-09-07 09:21:04 UTC
Okay, we have been LTP upstream for some time now (I tend to say "we", but it's more like I do most of the upstream work now). That leaves us the kernel community; do you think that I should bring this to LKML?
Comment 16 Greg Kroah-Hartman 2011-09-07 15:14:08 UTC
If you have an LTP test that has _never_ worked, and no one has ever noticed it before now, then odds are, the test is incorrect.

And read/write locks are a glibc thing, not a kernel thing, right?  So this might not even be a kernel issue in the first place...
Comment 17 Cyril Hrubis 2011-09-08 09:35:37 UTC
Argh, you are right, these locks are implemented in glibc using futexes. And I think I've pinpointed the problem: it seems that the Single Unix Specification and POSIX contradict each other in this case. The first one says the behavior is undefined while the latter defines it. I'll move this to glibc upstream.
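
For programs that need waiting writers to hold back new readers on glibc, one option that is not discussed in this report is the non-portable rwlock "kind" attribute. glibc read-write locks prefer readers by default, and PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP asks for writer preference instead. The sketch below is glibc-specific, the helper name is hypothetical, and it makes no claim about what the upstream bug decided.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>

/* Hypothetical helper: initialise *lock as a glibc rwlock that prefers
 * waiting writers over newly arriving readers.
 * Returns 0 on success or an error number. */
int init_writer_preferring_rwlock(pthread_rwlock_t *lock)
{
    pthread_rwlockattr_t attr;
    int rc;

    assert(lock != NULL);

    rc = pthread_rwlockattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Non-portable glibc extension; the _NP suffix marks it as such. */
    rc = pthread_rwlockattr_setkind_np(
        &attr, PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
    if (rc == 0)
        rc = pthread_rwlock_init(lock, &attr);

    /* The attribute object is no longer needed once the lock exists. */
    pthread_rwlockattr_destroy(&attr);
    return rc;
}
```

With this kind, a thread must not take the read lock recursively, since a nested read lock request can block behind a waiting writer and deadlock.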
Comment 18 Harald Mueller-Ney 2012-02-09 12:55:38 UTC
Reopen to set needinfo
Comment 19 Harald Mueller-Ney 2012-02-09 12:57:09 UTC
Michael, can you assign one of your guys to support Cyril in fixing the test case?
Comment 21 Michael Matz 2012-02-09 16:44:34 UTC
Cyril, please post the upstream bug id here, once you have one.
Comment 22 Cyril Hrubis 2012-02-16 17:04:39 UTC
Here it is:

http://sourceware.org/bugzilla/show_bug.cgi?id=13701
Comment 23 Jeff Mahoney 2012-03-06 07:55:30 UTC
Getting this out of the "unassigned kernel bugs" queue. Thanks for tracking it.
Comment 25 Michael Matz 2012-03-07 13:49:02 UTC
Assigning to Jan.  Jan: this is fairly low priority.
Comment 30 Andreas Schwab 2016-11-02 14:22:29 UTC
Moving to Tumbleweed.
Comment 52 Andreas Schwab 2019-10-14 09:32:24 UTC
Does this still happen?
Comment 53 Andreas Schwab 2020-02-26 11:43:39 UTC
Upstream bug has been closed.