Bug 1212765

Summary: mysql-systemd-helper 60 second wait too short sometimes
Product: [openSUSE] openSUSE Tumbleweed
Reporter: Patrick Schaaf <patrick.schaaf>
Component: Other
Assignee: Danilo Spinella <danilo.spinella>
Status: NEW ---
QA Contact: E-mail List <qa-bugs>
Severity: Normal
Priority: P5 - None
Version: Current
Target Milestone: ---
Hardware: Other
OS: Other
Whiteboard:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---

Description Patrick Schaaf 2023-06-27 09:47:10 UTC
We recently ran into a glitch when updating Tumbleweed DB machines to the current state, with MariaDB moving from 10.10 to 10.11.

This happened on some dev DB VMs running on admittedly overbooked, older hardware hosts, where everything disk-related is a bit sluggish during reboot. As a result, the ExecStartPre step "mysql-systemd-helper upgrade" timed out while starting the protected upgrade instance of the daemon.

2023-06-26T15:17:08.118054+02:00 dev-db3 mysql-systemd-helper[885]: Running protected MySQL...
2023-06-26T15:17:08.118071+02:00 dev-db3 mysql-systemd-helper[885]: Waiting for MySQL to start
2023-06-26T15:18:14.437342+02:00 dev-db3 mysql-systemd-helper[885]: MySQL is still dead
2023-06-26T15:18:14.437359+02:00 dev-db3 mysql-systemd-helper[885]: MySQL didn't start, can't continue

Looking at the code, I see it hardcodes the 60 seconds:

mysql_wait() {
...
        for i in {1..60}; do

Maybe that could be increased a bit?
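
To illustrate what a configurable wait could look like, here is a minimal sketch. It is not the actual mysql-systemd-helper code: the function name wait_for_ready and the variable WAIT_TIMEOUT are hypothetical, and the readiness probe is simply whatever command is passed as arguments. The default of 60 mirrors the hardcoded loop count quoted above.

```shell
#!/bin/bash
# Hypothetical sketch only; names are assumptions, not taken from
# mysql-systemd-helper. Retries a probe command once per second.
wait_for_ready() {
    # WAIT_TIMEOUT is an assumed variable; it defaults to 60 tries.
    local timeout="${WAIT_TIMEOUT:-60}"
    local i
    # Brace expansion such as {1..$timeout} does not expand variables
    # in bash, so an arithmetic for loop is used instead.
    for ((i = 0; i < timeout; i++)); do
        # "$@" is the readiness probe; exit status 0 means "server is up".
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# Example call (assumes mysqladmin is installed and can reach the server):
#   WAIT_TIMEOUT=180 wait_for_ready mysqladmin ping
```

With something along these lines, slow hosts could raise the limit without patching the script, while the default behaviour stays unchanged.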
Comment 1 Patrick Schaaf 2023-06-27 09:50:05 UTC
Actually, once the server got into the state where the ExecStartPre step had failed, manual attempts at starting mysql specifically hung; only a reboot with luckier timing helped.

     Active: deactivating (stop-sigterm) (Result: exit-code)
       Docs: man:mysqld(8)
             https://mariadb.com/kb/en/library/systemd/
    Process: 873 ExecStartPre=/usr/libexec/mysql/mysql-systemd-helper install (code=exited, status=0/SUCCESS)
    Process: 885 ExecStartPre=/usr/libexec/mysql/mysql-systemd-helper upgrade (code=exited, status=1/FAILURE)

A "rcmysql start" attempt hung as well. The colleague who experienced this first hand says it took more than 10 minutes, after which he force-restarted the whole machine.