Bug 1217618

Summary: yast2 nfs-server hangs after clicking the "Finish" button
Product: [openSUSE] PUBLIC SUSE Linux Enterprise Server 15 SP6 Reporter: Huajian Luo <hluo>
Component: YaST2 Assignee: E-mail List <yast2-maintainers>
Status: RESOLVED FIXED QA Contact:
Severity: Normal    
Priority: P5 - None CC: hluo
Version: unspecified   
Target Milestone: ---   
Hardware: Other   
OS: Other   
URL: https://openqa.suse.de/tests/12915763/modules/nis_server/steps/52
Whiteboard:
Found By: openQA Services Priority:
Business Priority: Blocker: Yes
Marketing QA Status: --- IT Deployment: ---
Attachments: screenshot
nis_server config
yast2log
y2log for the yast nfs-server call

Description Huajian Luo 2023-11-29 03:12:27 UTC
## Description:
 - In the yast test, we have a test nis_server which sets up a NIS server with the NFS domain name 'nis.openqa.suse.de'. After clicking the Next button, the
'Directories to export' page should be shown, but only part of the page is shown
and the rest is still the same as the previous page.

 - I've rerun it with a branch that waits 300 seconds for the tag 'nfs-server-export',
but it still failed.
https://openqa.suse.de/tests/12923929#step/nis_server/52

 - I've attached a screenshot of the failure.


## Observation

openQA test in scenario sle-15-SP6-Online-x86_64-nis_server@64bit fails in
[nis_server](https://openqa.suse.de/tests/12915763/modules/nis_server/steps/52)

## Test suite description
https://progress.opensuse.org/issues/9900
this is only the working part, there are few bugs


## Reproducible

Fails since (at least) Build [40.1](https://openqa.suse.de/tests/12895161)


## Expected result

Last good: [39.1](https://openqa.suse.de/tests/12838368) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.suse.de/tests/latest?arch=x86_64&distri=sle&flavor=Online&machine=64bit&test=nis_server&version=15-SP6)
Comment 1 Huajian Luo 2023-11-29 03:13:43 UTC
Created attachment 871046 [details]
screenshot
Comment 2 Huajian Luo 2023-11-29 05:48:00 UTC
Created attachment 871047 [details]
nis_server config
Comment 3 Huajian Luo 2023-11-29 05:54:48 UTC
We have another test case that has a similar issue when refreshing the network card setup page.

https://openqa.suse.de/tests/12923972#step/yast2_lan_restart_bridge/26
Comment 4 Stefan Hundhammer 2023-11-29 09:14:12 UTC
As always for each and every YaST bug report, we need y2logs.
Comment 5 Stefan Hundhammer 2023-11-29 09:21:04 UTC
In the video at 0:18, I see that screenshot, and this is clearly not a refresh problem (and the button is "Finish", not "Next", so there is no further step after this); something is hanging. A refresh problem would be UI-related, which this is clearly not.

Probably it attempts to start something in the background, and that hangs.
Comment 6 Stefan Hundhammer 2023-11-29 09:22:09 UTC
ARGH. And it's an NFS SERVER that you are trying to configure here, not a NIS server. That's something completely different.
Comment 7 Stefan Hundhammer 2023-11-29 09:30:13 UTC
So that YaST NFS server module is called after you configured a NIS server.

And the last step (writing the configuration and starting the NFS server) hangs after you just configured a NIS server.

Do you know if the NIS server is now active? If you now have a working NIS server? Are there any tests if that worked? I don't see anything. A lot of weird things might happen if that didn't work.

What is this test actually testing with the NFS server after the NIS server?
Comment 8 Huajian Luo 2023-11-29 09:51:18 UTC
But when we use developer mode, it works: https://openqa.suse.de/tests/12924231 . So I think this might be a refresh issue in yast2.
Comment 9 Huajian Luo 2023-11-29 09:56:24 UTC
The NIS server is now active and working, and the test passes in openQA developer mode: https://openqa.suse.de/tests/12924231
Comment 10 Stefan Hundhammer 2023-11-29 10:14:17 UTC
(In reply to Huajian Luo from comment #8)
> but we use the dev mode and it works https://openqa.suse.de/tests/12924231 .
> so that might be some refresh issue for the yast2 I think.

All that is known at this time is that something hangs.

There is no refresh problem; that would mean partial screen refresh which is not what we see here.

And please attach y2logs, as requested. Without logs, we can't do anything.
Comment 11 Huajian Luo 2023-11-29 10:19:11 UTC
OK, I'll attach one tomorrow; it looks like there isn't one in the openQA job's logs: https://openqa.suse.de/tests/12923929#downloads. Thanks.
Comment 12 Stefan Hundhammer 2023-11-29 10:41:32 UTC
And please stop changing the bug subject back to the wrong thing.
It's NFS server that hangs, not NIS server.
Comment 13 Huajian Luo 2023-11-30 07:56:25 UTC
I got the y2log from a branch run and uploaded it as an attachment.
The job is: https://openqa.suse.de/tests/12931336#step/nis_server/53

thanks.
Comment 14 Huajian Luo 2023-11-30 07:56:46 UTC
Created attachment 871078 [details]
yast2log
Comment 15 Stefan Hundhammer 2023-11-30 09:11:27 UTC
Created attachment 871081 [details]
y2log for the yast nfs-server call

Extracted from the attached y2logs tarball with

  y2log-merge
  y2log-split

from https://github.com/shundhammer/y2log-util
Comment 16 Stefan Hundhammer 2023-11-30 09:22:43 UTC
From that partial y2log:

It starts with calling "yast2 nfs_server qt ...".

> 2023-11-30 02:49:03 <1> susetest(5695) [Ruby] bin/y2start(<main>):22
>   y2base called with
>   ["nfs_server", "qt", "-name", "YaST2", "-icon", "yast"]
> ...
> ...

This can be used to identify the last dialog that was visible:

> 2023-11-30 02:49:08 <0> susetest(5695) [ui-shortcuts] YShortcutManager.cc(checkShortcuts):66
>   Checking keyboard shortcuts
>   Shortcut conflict: 'C' used for YSelectionBox "Directories"
>   Shortcut conflict: 'E' used for YPushButton "Edit"
>   Shortcut conflict: 'H' used for YPushButton "Add Host"
>   Shortcut conflict: 'H' used for YQWizardButton "Help"
>   Shortcut conflict: 'E' used for YQWizardButton "Release Notes"
>   Shortcut conflict: 'C' used for YQWizardButton "Cancel"
> 2023-11-30 02:49:08 <0> susetest(5695) [ui-shortcuts] YShortcutManager.cc(resolveAllConflicts):166
>   Resolving shortcut conflicts
>   Keeping preferred shortcut 'H' for YQWizardButton "Help"
>   Keeping preferred shortcut 'C' for YQWizardButton "Cancel"
>   Keeping preferred shortcut 'E' for YQWizardButton "Release Notes"
>   Couldn't resolve shortcut conflict for YPushButton "Edit" at 0x7faa645a2bf0 - assigning no shortcut
>   Reassigning shortcut 'A' to YPushButton "Add Host"
>   Reassigning shortcut 'R' to YSelectionBox "Directories"
> ...
> ...
> 2023-11-30 02:49:06 <0> susetest(5695) [ui-wizard] YCPWizardCommandParser.cc(isCommand):223
>   Recognized wizard command SetNextButtonID(any) : `SetNextButtonID (`next)
> 2023-11-30 02:49:06 <0> susetest(5695) [ui-wizard] YCPWizardCommandParser.cc(isCommand):223
>   Recognized wizard command SetFocusToNextButton() : `SetFocusToNextButton ()
> ...  
> 2023-11-30 02:49:06 <0> susetest(5695) [Ruby] binary/Yast.cc(ycp_module_call_ycp_function):395
>   Call WaitForEvent
> ...


Finally, this is the last call in the log:

> 2023-11-30 02:49:08 <0> susetest(5695) [Ruby] binary/Yast.cc(ycp_module_call_ycp_function):395
>   Call WaitForEvent


This is simply waiting for an event - a mouse click, a key press, or a synthetic event sent by the libyui-rest-api which is heavily used by OpenQA (I don't know if that is the case for this test).

AFAICS simply no event ever arrives. That's not a "refresh problem" by any stretch of the imagination, nor is it hanging: It's normal UI operation.

I am pretty sure when you do the same thing interactively, you can simply press that "Finish" button, and it will continue.
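
To double-check this reading of the log, the last logged entry of the YaST process can be pulled out mechanically. The following is only a minimal Python sketch, assuming the entry layout quoted above (a header line with timestamp, log level, host(pid), component and source location, followed by indented message lines). `parse_entries` and `last_entry_for_pid` are made-up helper names and not part of y2log-util.

```python
import re

# Header layout as quoted in this report (an assumption about the general format):
# date time <level> host(pid) [component] location [optional message]
HEADER = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"<(?P<level>\d)> (?P<host>\S+)\((?P<pid>\d+)\) "
    r"\[(?P<component>[^\]]+)\] (?P<location>\S+)\s*(?P<rest>.*)$"
)


def parse_entries(text):
    """Group header lines and their continuation lines into entries."""
    entries = []
    for line in text.splitlines():
        match = HEADER.match(line)
        if match:
            entry = match.groupdict()
            rest = entry.pop("rest")
            entry["message"] = [rest] if rest else []
            entries.append(entry)
        elif entries and line.strip():
            # Non-header line: treat it as part of the previous entry's message.
            entries[-1]["message"].append(line.strip())
    return entries


def last_entry_for_pid(entries, pid):
    """Return the last entry logged by the given PID, or None."""
    matching = [e for e in entries if e["pid"] == str(pid)]
    return matching[-1] if matching else None


# Two entries copied from the excerpt in this comment.
SAMPLE = """\
2023-11-30 02:49:08 <0> susetest(5695) [ui-shortcuts] YShortcutManager.cc(resolveAllConflicts):166
  Resolving shortcut conflicts
  Reassigning shortcut 'A' to YPushButton "Add Host"
2023-11-30 02:49:08 <0> susetest(5695) [Ruby] binary/Yast.cc(ycp_module_call_ycp_function):395
  Call WaitForEvent
"""

entries = parse_entries(SAMPLE)
last = last_entry_for_pid(entries, 5695)
print(last["component"], last["message"])  # prints: Ruby ['Call WaitForEvent']
```

If the last entry for the PID is a `Call WaitForEvent`, the process is most likely idle in its event loop rather than stuck in a background task.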
Comment 17 Stefan Hundhammer 2023-11-30 10:19:54 UTC
https://openqa.suse.de/tests/12915763/modules/nis_server/steps/1/src

> sub run {
>     my ($self) = @_;
>     x11_start_program('xterm -geometry 155x45+5+5', target_match => 'xterm');
>     turn_off_gnome_screensaver if check_var('DESKTOP', 'gnome');
>     become_root;
>     setup_static_mm_network($setup_nis_nfs_x11{server_address});
>     zypper_call 'in yast2-nis-server yast2-nfs-server';
> 
>     # we have to stop the firewall, see bsc#999873 and bsc#1083487#c36
>     systemctl 'stop ' . $self->firewall;
> 
>     my $module_name = y2_module_consoletest::yast2_console_exec(yast2_module => 'nis_server');
>     nis_server_configuration();
>     wait_serial("$module_name-0", 360) || die "'yast2 nis server' didn't finish";
>     assert_screen 'yast2_closed_xterm_visible';
>     $module_name = y2_module_consoletest::yast2_console_exec(yast2_module => 'nfs_server');
>     nfs_server_configuration();
      ^^^^^^^^^^^^^^^^^^^^^^^^^^^
      This is where this is stuck

>     wait_serial("$module_name-0", 360) || die "'yast2 nfs server' didn't finish";
>     assert_screen 'yast2_closed_xterm_visible', 200;
>     # In order for the hostname to get the set value via yast2 nis_server, a restart is needed. Otherwise "make"
>     # command won't work as in Makefile, there is a variable that gets it's value from "domainname" command
>     systemctl 'restart network';
>     # NIS and NFS Server is configured and running, configuration continues on client side
>     mutex_create('nis_nfs_server_ready');
>     my $children = get_children();
>     my $child_id = (keys %$children)[0];
>     mutex_wait('nis_nfs_client_ready', $child_id);
>     # Read content of a file created by the client
>     setup_verification();
>     enter_cmd "killall xterm";    # game over -> xterm
> }
Comment 18 Huajian Luo 2023-11-30 10:34:41 UTC
We have now added 'SHIFT+F3' as a workaround for it.

```
sub apply_workaround_poo124652 {
    my ($mustmatch) = shift;
    my $timeout;
    # An odd number of remaining arguments means a positional timeout precedes the key/value pairs
    $timeout = shift if (@_ % 2);
    my %args = (timeout => $timeout // 0, @_);
    # Only apply the workaround if the expected screen is not already shown
    if (!check_screen($mustmatch, %args)) {
        record_info('poo#124652', 'poo#124652 - gtk glitch not showing dialog window decoration on openQA');
        # Open the style sheet selection popup and close it again
        send_key('shift-f3', wait_screen_change => 1);
        check_screen('style-sheet-selection-popup', 10);
        send_key('esc', wait_screen_change => 1);
        # in some verification tests this didn't work, so let's check
        if (!check_screen($mustmatch)) {
            record_info('Retry', "shift-f3 workaround did not solve the problem");
            send_key('alt-f10', wait_screen_change => 1);
            send_key('alt-f10', wait_screen_change => 1);
        }
    }
}

```
With this workaround, the test passes now.
https://openqa.suse.de/tests/12931367
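
The control flow of the helper above is "check, nudge the UI, check again, fall back once". A minimal Python sketch of just that flow, for readers who do not know the openQA test API: the three callables and the `FakeScreen` class are hypothetical stand-ins for `check_screen()` and the `send_key()` sequences, not real openQA functions.

```python
def apply_key_workaround(screen_matches, nudge, fallback):
    """Check, nudge the UI, check again, and fall back once if still unmatched.

    All three arguments are hypothetical callables standing in for
    check_screen() and the send_key() sequences of the Perl helper.
    """
    if screen_matches():
        return "matched"   # expected screen already visible, nothing to do
    nudge()                # stands in for the shift-f3 / esc key sequence
    if screen_matches():
        return "nudged"
    fallback()             # stands in for pressing alt-f10 twice
    return "fallback"


class FakeScreen:
    """Toy screen that only matches after a given number of key sequences."""

    def __init__(self, sequences_needed):
        self.sequences_needed = sequences_needed
        self.keys = []

    def matches(self):
        return self.sequences_needed <= 0

    def nudge(self):
        self.keys += ["shift-f3", "esc"]
        self.sequences_needed -= 1

    def fallback(self):
        self.keys += ["alt-f10", "alt-f10"]
        self.sequences_needed -= 1


screen = FakeScreen(sequences_needed=1)
outcome = apply_key_workaround(screen.matches, screen.nudge, screen.fallback)
print(outcome, screen.keys)
```

As in the Perl helper, the sketch does not check the screen again after the fallback; it only reports which path was taken.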
Comment 19 Stefan Hundhammer 2023-11-30 10:37:24 UTC
In the video at 0:16 I can see that the NIS server is configured to handle /etc/hosts, too. That may or may not affect the network for subsequent network operations (like the NFS server configuration).

At 0:17 I see the xterm where those commands were started; first a 'zypper -n in yast2-nis-server yast2-nfs-server' that refreshed repos and then installed new versions of those packages:

  nfs-kernel-server-2.1.1-150500.22.3.1.x86_64.rpm
  yast2-nis-server-4.6.0-150600.1.1.noarch.rpm
  yast2-nfs-server-4.6.0-150600.1.1.noarch.rpm

Then it started

  systemctl --no-pager stop firewalld
  yast2 nis-server     (no screen output at all)
  yast2 nfs-server

which immediately showed the screenshot of the first attachment here:

--------------------------------------------
  NFS Server Configuration

    NFS Server
      ( ) Start
      (x) Do Not Start

    Firewall not configurable
      ...
      ...

    Enable NFSv4
      [x] Enable NFSv4

      Enter NFSv4 domain name:
      [localdomain              ]

    [ ] Enable GSS Security

             [Cancel] [Back] [OK]

--------------------------------------------  

OpenQA changed the text for the domain name to
   [nfs.openqa.suse.de           ]

The [OK] button changed to [Next] a fraction of a second later, then to [Finish]. By 0:17, the changes were done, and from that point on, the screen was static - no more change.

Some of those details were NOT recorded with OpenQA screenshots, but the change of the button label and the input field content were:

https://openqa.suse.de/tests/12915763#step/nis_server/50
https://openqa.suse.de/tests/12915763#step/nis_server/51
https://openqa.suse.de/tests/12915763#step/nis_server/52
Comment 20 Stefan Hundhammer 2023-11-30 10:39:35 UTC
(In reply to Huajian Luo from comment #18)
> We have now added 'SHIFT+F3' as a workaround for it.
> 
> ```
> sub apply_workaround_poo124652 {
...
>     if (!check_screen($mustmatch, %args)) {
>         record_info('poo#124652', 'poo#124652 - gtk glitch not showing
> dialog window decoration on openQA');
> 

What does that mean? That a Gtk error changed the actual screenshot vs. the expected one?


> and it can pass now.
> https://openqa.suse.de/tests/12931367

So we can close this one now?
Comment 21 Huajian Luo 2023-11-30 10:42:01 UTC
You can close it if you have confirmed it's not a bug. We have now applied this workaround to all of these screenshot checks, and the tests pass.

Thanks.
Comment 22 Stefan Hundhammer 2023-11-30 11:57:17 UTC
If this is a bug, it's not on the YaST side; it's either in the networking setup or in REST API events not being sent.

Closing.