Bug 1177060 - drm_calc_timestamping_constants error
Status: RESOLVED INVALID
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Kernel
Version: Current
Hardware: x86-64 openSUSE Tumbleweed
Priority: P5 - None
Severity: Normal
Assigned To: openSUSE Kernel Bugs
Reported: 2020-09-28 18:44 UTC by Therapon Sundoulos
Modified: 2022-01-14 14:24 UTC (History)

Attachments
logfile snippet showing start of dotclock error (14.36 KB, text/plain)
2020-09-29 12:40 UTC, Therapon Sundoulos
kernel log with sleep/wake cycles showing error (210.23 KB, text/x-log)
2020-10-03 11:33 UTC, Therapon Sundoulos

Description Therapon Sundoulos 2020-09-28 18:44:36 UTC
Dell Precision M4800 laptop
nVidia GK107GLM [Quadro K1100M]
Nouveau 20.1.8-264.1 driver

Once it starts happening, the following line is printed to the journald log:

[drm:drm_calc_timestamping_constants [drm]] *ERROR* crtc 50: Can't calculate constants, dotclock = 0!

It's a continuous stream as long as the mouse is moving. After just a few hours there are thousands of these lines in the log.
Comment 1 Therapon Sundoulos 2020-09-29 12:40:33 UTC
Created attachment 842069 [details]
logfile snippet showing start of dotclock error
Comment 2 Therapon Sundoulos 2020-09-29 12:48:36 UTC
Rebooted, ran fine for several hours with no dotclock errors in the logfile. 

Resumed from sleep, errors began - in the attachment is the journald logfile from wakeup until the stream of dotclock errors began. At times the errors hit the log at the rate of 1000 to 2000 lines per minute.
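
For anyone wanting to put a number on the log rate, here is a minimal sketch that counts the dotclock errors per minute. It assumes the log lines start with an ISO 8601 timestamp, as produced by `journalctl -o short-iso`; the function name `dotclock_per_minute` is made up for this example.

```shell
#!/bin/sh
# Count "dotclock = 0" errors per minute.
# Reads log lines on stdin. Assumes every line begins with an ISO 8601
# timestamp such as 2020-09-29T12:40:33+0000 (journalctl -o short-iso).
dotclock_per_minute() {
    # Keep only the error lines, cut the timestamp down to
    # YYYY-MM-DDTHH:MM (first 16 characters), then count duplicates.
    grep -F "Can't calculate constants, dotclock = 0" |
        cut -c1-16 |
        sort |
        uniq -c |
        awk '{ print $2, $1 }'
}
```

Possible usage for the kernel messages of the current boot: `journalctl -k -b -o short-iso | dotclock_per_minute`. Each output line is a minute followed by the number of errors logged in that minute.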
Comment 3 Takashi Iwai 2020-09-29 16:00:26 UTC
Could you try the older kernels found in the OBS history repo and check in which version the problem started?
  http://download.opensuse.org/history/

If this is a regression from a minor kernel version update, it's easier to pinpoint.  But if it came with a major kernel version update (say, 5.7 to 5.8), it's harder...

In any case, it'd be best to report this to the upstream bug tracker, as the issue is about nouveau.
Comment 4 Therapon Sundoulos 2020-09-30 19:13:10 UTC
Working on trying to figure out which kernel was last without the problem. So far it is present in all from 5.8.4-1.1 and later. Takes time to determine if the problem is still present - have to reboot, use it for a little while, then let it suspend. The issue doesn't appear until wake after first suspend.
Comment 5 Therapon Sundoulos 2020-10-01 16:31:46 UTC
All of the kernels I have tried from kernel-default-5.8.0-1.1 through 5.8.10-1.2 display the same behavior.

I am currently running kernel-default-5.7.12-1.1.g9c98feb with no issues thus far. So... it seems to be something introduced in 5.8.0

You said "it'd be best to report to upstream bug tracker". Where should I do that?
Comment 6 Therapon Sundoulos 2020-10-01 22:24:56 UTC
fwiw, the issue is still present in kernel-default-5.9.rc6-1.1.gc5644c3
Comment 7 Therapon Sundoulos 2020-10-01 23:28:06 UTC
Well... it just showed up in kernel-default-5.7.12-1.1.g9c98feb

Could be wrong but I don't believe I've had the problem since the beginning of August. Maybe a plasma/kde issue rather than a kernel one?
Comment 8 Therapon Sundoulos 2020-10-02 17:46:28 UTC
I downgraded libdrm_nouveau2 and libdrm_nouveau2-32bit from version 2.4.102-2.1 to 2.4.102-1.2 and the errors quit printing to the log. Downgrading might have introduced other issues but the dotclock error has stopped.

Where does it go from here to get fixed?
Comment 9 Therapon Sundoulos 2020-10-02 23:51:09 UTC
Darn! It came back. Grr, I don't have time to go any further with this.
Comment 10 Takashi Iwai 2020-10-03 07:50:43 UTC
Oh that sounds nasty...

The latest upstream bug tracker would be gitlab.freedesktop.org "issues", I suppose.

Does the issue happen with a plain X session like icewm, or only with KDE or some other compositor?

I looked for other similar bugs showing this error message, but the search hasn't turned up anything so far.
Comment 11 Therapon Sundoulos 2020-10-03 11:33:16 UTC
Created attachment 842228 [details]
kernel log with sleep/wake cycles showing error
Comment 12 Therapon Sundoulos 2020-10-03 11:47:50 UTC
I haven't tried any others, my workflow depends on kde features for some things. 

There is a repeatable scenario though:

1. fresh boot, all is quiet, do some stuff and then wait for the power save timer to trigger sleep.
2. wait a while, wake from sleep, errors begin quickly.
3. trigger sleep using the keyboard
4. wait a minute, wake from sleep, all is quiet.

One thing I noticed, the timestamp on entries in the kernel log doesn't change on wake. In the attached log, sleep triggered at 10/2/20 9:12 PM. The log shows wake and subsequent errors at the same time. The journal log shows those events as occurring at 10/3/20 7:03 AM. It's like the clock goes to sleep too and doesn't catch up after waking.
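
This would be consistent with kernel log timestamps being taken from a clock that does not advance while the machine is suspended, whereas the journal records wall-clock time. As a rough sketch, the gap between the two journal times quoted above can be computed as follows. The helper name `suspend_gap` is made up for this example, and it assumes a `date` that accepts `-d` with a "YYYY-MM-DD HH:MM" string (GNU date does).

```shell
#!/bin/sh
# Print the wall-clock time elapsed between two timestamps.
# Both arguments are "YYYY-MM-DD HH:MM" strings in the same time zone;
# they are parsed as UTC so a local DST change cannot skew the result.
suspend_gap() {
    start=$(date -u -d "$1" +%s) || return 1
    end=$(date -u -d "$2" +%s) || return 1
    gap=$(( end - start ))
    printf '%dh %02dm\n' $(( gap / 3600 )) $(( gap % 3600 / 60 ))
}

# Sleep at 10/2/20 9:12 PM, wake at 10/3/20 7:03 AM (times from the journal)
suspend_gap "2020-10-02 21:12" "2020-10-03 07:03"   # prints "9h 51m"
```

So roughly ten hours of wall-clock time would be missing from the kernel's own timestamps for that sleep/wake cycle.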
Comment 13 Therapon Sundoulos 2020-10-07 12:36:18 UTC
I did post this over on gitlab.freedesktop.org "issues", no response yet.

One more piece of the puzzle. The error only occurs when the power switch is depressed to wake the laptop up from sleep. It does not happen when the lid switch triggers wake. And it doesn't seem to matter how the sleep action is called - gui, timer, lid switch.

And that raises the question: is this a nouveau issue or a systemd one?
Comment 14 Takashi Iwai 2020-10-09 08:06:42 UTC
(In reply to Therapon Sundoulos from comment #13)
> I did post this over on gitlab.freedesktop.org "issues", no response yet.

Could you tell us the URL?  I'll watch it, too.

> One more piece of the puzzle. The error only occurs when the power switch is
> depressed to wake the laptop up from sleep. It does not happen when the lid
> switch triggers wake. And it doesn't seem to matter how the sleep action is
> called - gui, timer, lid switch.
> 
> And that raises the question: is this a nouveau issue or a systemd one?

It's hard to answer, but spamming the log with error messages must be avoided in any case.
Comment 15 Therapon Sundoulos 2020-10-09 12:10:57 UTC
The link is:  https://gitlab.freedesktop.org/drm/nouveau/-/issues/12

If it is truly not a consequential error (which it doesn't seem to be), it would be nice if it just didn't post to the log. Same is true for these: "qt.qpa.xcb: QXcbConnection: XCB error: 3 (BadWindow)". I get way too many of them at times, too.
Comment 16 Therapon Sundoulos 2020-10-14 13:19:57 UTC
Here's what I posted over at https://gitlab.freedesktop.org/drm/nouveau/-/issues/12

I'm giving powerdevil5 the credit for this one.

On further testing, if Screen Energy Saving is turned on in the powermanager module, the errors occur after wake from suspend. This is the section of suspendsession.cpp from powerdevil-5.20.0 that I'm looking at:

> void SuspendSession::onIdleTimeout(int msec)
> {
>     QVariantMap args{
>         {QStringLiteral("Type"), m_autoType}
>     };
> 
>     // we fade the screen to black 5 seconds prior to suspending to alert the user
>     if (msec == m_idleTime - 5000) {
>         args.insert(QStringLiteral("GraceFade"), true);
>     } else {
>         args.insert(QStringLiteral("SkipFade"), true);
>     }
> 
>     trigger(args);
> }

If the screen is already switched off and then the command is sent to fade the screen, it causes the errors after wake. There is logic to trigger the fade 5 seconds before suspend. There should also be logic to SkipFade if the screen is already switched off.

This should probably get posted somewhere else, maybe bugs.kde.org in the powerdevil section?
Comment 17 Takashi Iwai 2020-10-23 16:26:00 UTC
So the behavior of powerdevil does explain the weird error.
You might get a clearer picture by turning on the drm debug option to follow which operation is being hit there, too.
Comment 18 Therapon Sundoulos 2020-10-24 00:16:09 UTC
Ok, how do I do that?
Comment 19 Takashi Iwai 2020-10-24 07:39:05 UTC
You can adjust the value by writing to /sys/module/drm/parameters/drm file, e.g.
  echo 0x0e > /sys/module/drm/parameters/drm

Each bit represents a debug component, and 0x0e is a good value in general.  0x0f will show more, and you can set it before suspend / resume.
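
To see which components a given mask selects, here is a small decoding sketch. The bit names follow the DRM_UT_* debug categories in the kernel's drm_print.h as I understand them for kernels of this era; treat the table as an assumption and check it against the kernel you are running. The function name `decode_drm_debug` is made up for this example.

```shell
#!/bin/sh
# Decode a drm debug mask into category names.
# The bit-to-name table is an assumption based on the DRM_UT_* values
# in include/drm/drm_print.h; verify against your kernel source.
decode_drm_debug() {
    mask=$(( $1 ))
    out=""
    for entry in 0x01:CORE 0x02:DRIVER 0x04:KMS 0x08:PRIME 0x10:ATOMIC \
                 0x20:VBL 0x40:STATE 0x80:LEASE 0x100:DP 0x200:DRMRES; do
        bit=${entry%%:*}     # part before the colon, e.g. 0x04
        name=${entry#*:}     # part after the colon, e.g. KMS
        if [ $(( mask & $bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    printf '%s\n' "$out"
}

decode_drm_debug 0x0e   # prints "DRIVER KMS PRIME"
decode_drm_debug 0x0f   # prints "CORE DRIVER KMS PRIME"
```

Under that table, 0x0f differs from 0x0e only by the CORE bit, which is where the extra output comes from.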
Comment 20 Therapon Sundoulos 2020-10-24 13:49:10 UTC
/sys/module/drm/parameters/drm file does not exist and I get a permission denied error if I try to make any writes to the /sys/module/drm/parameters/ folder. Logged in as root, of course.
Comment 21 Takashi Iwai 2020-10-24 15:52:31 UTC
Sorry, it was a typo.  It must be /sys/module/drm/parameters/debug.
And it has to be written as root.
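
A minimal sketch of setting the value through the corrected path while keeping the old value around for restoring later. The function name `set_drm_debug` and the `DRM_DEBUG_PARAM` override variable are made up for this example; only the sysfs path comes from the comment above.

```shell
#!/bin/sh
# Path of the drm debug parameter; can be overridden for testing.
DRM_DEBUG_PARAM=${DRM_DEBUG_PARAM:-/sys/module/drm/parameters/debug}

# Write a new debug mask and print the previous one on stdout.
set_drm_debug() {
    value=$1
    if [ ! -w "$DRM_DEBUG_PARAM" ]; then
        echo "cannot write $DRM_DEBUG_PARAM (missing file, or not root?)" >&2
        return 1
    fi
    cat "$DRM_DEBUG_PARAM"              # previous value, for restoring later
    echo "$value" > "$DRM_DEBUG_PARAM"
}
```

Possible usage in a root shell: `old=$(set_drm_debug 0x0e)`, then suspend and resume, then `set_drm_debug "$old" > /dev/null` to put the previous value back. Note that `sudo echo 0x0e > ...` does not work from a normal user shell, because the redirection is performed by the unprivileged shell; `echo 0x0e | sudo tee /sys/module/drm/parameters/debug` is the usual way around that.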
Comment 22 Therapon Sundoulos 2020-10-24 19:43:15 UTC
Hope this is helpful to someone - it's the journal log with drm debug turned up.

https://drive.google.com/open?id=1ne_8dWsm1IHfebFckjjDdGph7fhBvEdZ
Comment 23 Miroslav Beneš 2022-01-14 13:43:56 UTC
Is there anything new on this? Therapon, have you reported the bug to powerdevil at bugs.kde.org?

Anyway, it does not seem to be a kernel bug, so we either change the component or close it.
Comment 24 Therapon Sundoulos 2022-01-14 14:19:07 UTC
No activity here: https://gitlab.freedesktop.org/drm/nouveau/-/issues/12
or here: https://bugs.kde.org/show_bug.cgi?id=427754

I didn't see any change in the relevant section of suspendsession.cpp
I don't use ScreenEnergySaving so it isn't an issue for me. Apparently the developers have bigger fish to fry.

You are correct, it doesn't seem to be a kernel error. Probably ok to close. Thanks.
Comment 25 Miroslav Beneš 2022-01-14 14:24:09 UTC
Ok then, thanks for the feedback.