Bug 1041129 - Complete freeze with "ata2: COMRESET failed" when on battery power
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Other
Version: Current
Hardware: x86-64 Other
Importance: P5 - None : Normal with 3 votes
Target Milestone: ---
Assignee: Alexei Sorokin
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-05-28 12:39 UTC by jean-christophe baptiste
Modified: 2017-09-17 09:33 UTC
0 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
screenshot of VTY during the crash (5.99 MB, image/jpeg)
2017-05-28 12:39 UTC, jean-christophe baptiste
Thinkpad UEFI hardware tests (313.12 KB, application/zip)
2017-05-28 13:10 UTC, jean-christophe baptiste

Description jean-christophe baptiste 2017-05-28 12:39:56 UTC
Created attachment 726715
screenshot of VTY during the crash

A few seconds after I unplug the power cord, my laptop starts lagging and freezing more and more until it becomes totally unresponsive.

The first symptoms: applications can no longer be launched or closed, gnome-shell starts losing GUI items, etc.
Then the gnome-shell session crashes, and the VTYs in turn become unreachable.

On the VTY, the following messages appear:

 - ata2: COMRESET failed (errno=16)
 - systemd-journald complaining it cannot write logs.

< A photo of the error messages is enclosed here >
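
To collect these messages from a saved kernel log rather than from a photo, something like the following could be used. This is only a sketch: the function name `find_comreset_failures` is made up for illustration, and the pattern assumes the message has the form quoted above (the kernel may print the errno with a minus sign, so both forms are accepted). On Linux, errno 16 is EBUSY.

```python
import errno
import re

# Matches kernel lines such as "ata2: COMRESET failed (errno=16)".
# The sign is optional because the errno may be printed as a negative value.
COMRESET_RE = re.compile(r"(ata\d+(?:\.\d+)?): COMRESET failed \(errno=(-?\d+)\)")


def find_comreset_failures(log_text):
    """Return (port, errno_number, errno_name) for each COMRESET failure line."""
    failures = []
    for line in log_text.splitlines():
        match = COMRESET_RE.search(line)
        if match:
            number = abs(int(match.group(2)))
            # errno.errorcode maps numbers to symbolic names on this platform.
            name = errno.errorcode.get(number, "unknown")
            failures.append((match.group(1), number, name))
    return failures
```

It could be fed the text of `dmesg` or of a journal export captured while the machine is still responsive. The journal on disk may be incomplete here, since journald itself reports that it cannot write.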

The same pattern occurs if I try to boot straight on battery power: it fails either early in the boot process or at the login screen.

After searching for the "COMRESET failed" message, I did the following:

 - updated the SSD firmware to the latest version;
 - updated the UEFI firmware (Thinkpad T460).

Also, note that I freshly installed Tumbleweed on Friday.

Before, I was on Fedora and never had this issue.

So, barring a weird coincidence, this is unlikely to be hardware related. Moreover, if my SSD were dying, why would SMART report no errors, and why would it fail only on battery power?
Comment 1 jean-christophe baptiste 2017-05-28 13:10:14 UTC
Created attachment 726717
Thinkpad UEFI hardware tests

Adding here screenshots of Thinkpad hardware tests, including R/W tests and bus checks.

All tests are shown as PASSED.
Comment 2 jean-christophe baptiste 2017-05-28 16:43:37 UTC
Well, I finally found the culprit: it is not the kernel, it is TLP.

I got this hint after finding a similar report:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/539467
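
This report does not say which TLP setting is responsible. One plausible candidate, and it is only an assumption, is the SATA link power management policy that TLP applies on battery (its `SATA_LINKPWR_ON_BAT` setting). The sketch below reads the policy currently in effect for each host from sysfs, so the value on AC can be compared with the value on battery. The function names are made up for illustration.

```python
from pathlib import Path


def read_link_policies(sysfs_root="/sys/class/scsi_host"):
    """Map each host name (e.g. "host1") to its SATA link power policy."""
    policies = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return policies
    for host in sorted(root.iterdir()):
        policy_file = host / "link_power_management_policy"
        # Hosts that are not AHCI/SATA do not expose this attribute.
        if policy_file.is_file():
            policies[host.name] = policy_file.read_text().strip()
    return policies


def hosts_saving_power(policies):
    """Return hosts whose policy is anything other than max_performance."""
    return [host for host, policy in policies.items() if policy != "max_performance"]
```

If a host switches away from `max_performance` when the power cord is unplugged and the freeze follows, that would support the assumption. If the policy stays the same, another TLP setting is more likely involved.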

I had also had bad experiences with a similar tool, Powertop.
This time, I was not even aware that TLP came installed with Tumbleweed.

My opinion is that such tools should NOT be provided by default with any distro.

They bring weird instability issues and are not worth it.
On modern platforms, most power saving takes place at the hardware/firmware level, and the tweaks enabled by these tools don't make much difference.

I don't know where to move this report... "installation" maybe?
Comment 3 Stefan Hundhammer 2017-05-29 08:50:57 UTC
It's not CRITICAL if it happened once to one single user. Readjusting severity to a reasonable level.
Comment 4 Stefan Hundhammer 2017-05-29 09:00:57 UTC
Not installer related at all.
Comment 5 jean-christophe baptiste 2017-05-29 17:17:16 UTC
Sorry for messing with triage and severity (I thought it was a single-impact estimate).

But I disagree with your assumption that only one user is affected. You cannot tell whether other people simply did not report it and moved on.

Several threads for distros like Arch and Ubuntu prove that other users were affected, and it is potentially harmful to rely on such tweaking programs (tlp or powertop).
Comment 6 jean-christophe baptiste 2017-09-02 14:18:08 UTC
So this bug affected a lot of people on other distros and is still not assigned?

To be on the safe side, openSUSE should NOT install TLP by default.

It is not efficient on modern hardware and it is buggy.