Bug 785240 - btrfs inode cache causes boot delays
btrfs inode cache causes boot delays
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE 12.2
Classification: openSUSE
Component: Kernel
Final
x86-64 openSUSE 12.2
: P5 - None : Major (vote)
: ---
Assigned To: David Sterba
E-mail List
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-10-16 09:01 UTC by Ákos Szőts
Modified: 2013-05-16 10:00 UTC (History)
5 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
Y2Logs attached (6.40 MB, application/x-gzip)
2012-11-01 08:36 UTC, Ákos Szőts
Details
Extra debug output got from /dev/.sysconfig/network (49.73 KB, application/x-xz)
2012-11-13 20:56 UTC, Ákos Szőts
Details
SuSEfirewall2 (33.88 KB, application/octet-stream)
2012-11-13 20:58 UTC, Ákos Szőts
Details
/sbin/SuSEfirewall2 with echo-s (66.67 KB, text/plain)
2012-11-20 09:54 UTC, Ákos Szőts
Details
/etc/init.d/SuSEfirewall2_init with echo-s (2.07 KB, text/plain)
2012-11-26 12:58 UTC, Ákos Szőts
Details
"/sbin/SuSEfirewall2" with lots of echos (91.26 KB, text/plain)
2012-11-26 20:22 UTC, Ákos Szőts
Details

Note You need to log in before you can comment on or make changes to this bug.
Description Ákos Szőts 2012-10-16 09:01:38 UTC
User-Agent:       Opera/9.80 (X11; Linux x86_64; Edition Next) Presto/2.12.388 Version/12.10

I have a btrfs partition with 3.6.0-1-desktop kernel.
The booting of the system takes four minutes...

From this time SuSEfirewall2_init.service shares in 2 minutes. According to "systemd-analyze blame":

122804ms SuSEfirewall2_init.service
[...]
307ms SuSEfirewall2_setup.service


I don't know how to debug this, but I know the following for sure:
- YaST says the network is pulled up with ifup
- NetworkManager.service is disabled (disabled; inactive; dead)
- IPv6 is disabled in YaST

Reproducible: Always

Steps to Reproduce:
1.
2.
3.
Comment 1 Jiaying Ren 2012-11-01 02:16:27 UTC
Ákos Szőts ,thank you for your report,could you please help to provide y2logs according to the following URL?
   http://en.opensuse.org/openSUSE:Bugreport_YaST
   Thanks.
Comment 2 Ákos Szőts 2012-11-01 08:36:59 UTC
Created attachment 511611 [details]
Y2Logs attached
Comment 3 Ákos Szőts 2012-11-05 11:29:18 UTC
When the system boots, the HDD is working very heavily. What IO operations do SuSEfirewall2_init.service have to do during the boot process?

Is there any tool with which I can observe the IO usage during the boot time (like the iotop)?
Comment 4 Ákos Szőts 2012-11-12 08:04:49 UTC
I made some inspection on the network and found this one:

$> systemctl status network.service

network.service - LSB: Configure the localfs depending network interfaces
          Loaded: loaded (/etc/init.d/network)
          Active: failed (Result: exit-code) since Mon, 12 Nov 2012 08:34:43 +0100; 27min ago
         Process: 973 ExecStart=/etc/init.d/network start (code=exited, status=7/NOTRUNNING)
          CGroup: name=systemd:/system/network.service
                  └ 4174 /sbin/dhcpcd --netconfig -L -E -HHH -c /etc/sysconfig/network/scripts/dhcpcd-hook -t 0 -h linux-suse eth0

Nov 12 08:34:43 linux-suse network[973]: eth0      is up
Nov 12 08:34:43 linux-suse network[973]: [1Awaiting
Nov 12 08:34:43 linux-suse network[973]: eth0      interface could not be set up until now
Nov 12 08:34:43 linux-suse network[973]: ..failedSetting up service (localfs) network  .  .  .  .  .  .  .  .  .  ...failed
Nov 12 08:34:43 linux-suse dhcpcd[2293]: eth0: timed out
Nov 12 08:34:43 linux-suse dhcpcd[2293]: eth0: trying to use old lease in `/var/lib/dhcpcd/dhcpcd-eth0.info'
Nov 12 08:34:43 linux-suse dhcpcd[2293]: eth0: adding IP address 192.168.1.110/24
Nov 12 08:34:43 linux-suse dhcpcd[2293]: eth0: adding default route via 192.168.1.254 metric 0
Nov 12 08:34:54 linux-suse dhcpcd-hook[3563]: ATTENTION: You have modified /etc/resolv.conf.  Leaving it untouched...
Nov 12 08:34:54 linux-suse dhcpcd[2293]: eth0: Failed to lookup hostname via DNS: Name or service not known
Comment 5 Marius Tomaschewski 2012-11-13 12:22:35 UTC
Please install most recent sysconfig update and move:
   /etc/sysconfig/network/if-up.d/21-cifs
and
   /etc/sysconfig/network/21-dhcpcd-hook-samba

away. The problem is, that the samba hooks make systemctl
calls which cause something like a deadlock in systemd.

You can set DEBUG=EXTRA in /etc/sysconfig/network/config and
then _reboot_ -- it will collect bash -x and oder traces in
/dev/.sysconfig/network/exdbg*. Perhaps it shows something.

I've no idea why SuSEFirewall2_init my need so long.

Ludwig, see initial comment.
Comment 6 Marius Tomaschewski 2012-11-13 14:41:13 UTC
I've made a patch to rcnetwork to not return !0 to systemd in the
dhcp-in-background "interface could not be set up until now" case,
that basically happens when something blocks ifup like in the well
known DHCLIENT_MODIFY_SMB_CONF=yes issue:

- Changed rcnetwork to not report dhcp in background to systemd
  (bnc#785240,bnc#780644). You may set DHCLIENT_MODIFY_SMB_CONF
  to "no" in /etc/sysconfig/network/dhcp as the samba/cifs hooks
  make systemctl calls to reload/restart nmb and other services
  which block (systemd deadlock) ifup scripts for a long time.
- Fixed to check and reject too long interface names or names
  with suspect characters, do not wait when creation of virtual
  interface name fails (bnc#784952).
- Fixed to not report failure while setting unsupported power
  management option in ifup-wireless as the support and features
  depends on the driver (bnc#716652).
- Fixed ifup-dhcp to not start dhcp clients too early or they may
  be unable to send packets and just increase resend time. Instead,
  load af_packet module early and wait for link ready (bnc#780644).
- Fixed netconfig/ntp-runtime to not use try-restart as its start
  may block for a while, so just add servers ourself. When ntpd is
  not running (not yet started), the init script will pickup our
  servers later (bnc#780644).

Requested (rq#141173) to release in openSUSE:12.2:Update.

=> Reassinging to Ludwig to check the SuSEfirewall2 problem.
Comment 7 Ludwig Nussel 2012-11-13 15:14:31 UTC
please attach /etc/sysconfig/SuSEfirewall2
Comment 8 Ákos Szőts 2012-11-13 20:56:21 UTC
Created attachment 512977 [details]
Extra debug output got from /dev/.sysconfig/network

The steps I have done now:

1.) Installed your sysconfig update (sysconfig-0.76.7-149.1) and rebooted:
132015ms SuSEfirewall2_init.service
 38540ms SuSEfirewall2_setup.service (not much changed)

2.) Moved the "21-cifs" and "21-dhcpcd-hook-samba" out of their directory and ran SuSEconfig. From the next reboot on I don't have any internet connection until I manually issue the command "dhclient eth0".

83488ms SuSEfirewall2_init.service
 1175ms SuSEfirewall2_setup.service (much better)

$> systemctl status network.service:
network.service - LSB: Configure the localfs depending network interfaces
          Loaded: loaded (/etc/init.d/network)
          Active: failed (Result: exit-code) since Tue, 13 Nov 2012 21:43:05 +0100; 2min 40s ago
         Process: 946 ExecStart=/etc/init.d/network start (code=exited, status=7/NOTRUNNING)
          CGroup: name=systemd:/system/network.service

Nov 13 21:42:35 linux-suse network[946]: Setting up (localfs) network interfaces:
Nov 13 21:42:35 linux-suse network[946]: lo
Nov 13 21:42:35 linux-suse network[946]: lo        IP address: 127.0.0.1/8
Nov 13 21:42:35 linux-suse network[946]: ..doneWaiting for mandatory devices:  eth0
Nov 13 21:43:05 linux-suse network[946]: 28 26 25 23 22 20 18 17 15 14 12 11 9 8 6 5 3 2 0
Nov 13 21:43:05 linux-suse network[946]: eth0      device: Realtek Semiconductor Co., Ltd. RTL8111/8168B
Nov 13 21:43:05 linux-suse network[946]: eth0      DHCP4 client NOT running
Nov 13 21:43:05 linux-suse network[946]: eth0      is down
Nov 13 21:43:05 linux-suse network[946]: eth0      interface could not be set up until now

3.) Rebooted with the DEBUG="EXTRA" flags and I attached the compressed version of the aforementioned directory.
Comment 9 Ákos Szőts 2012-11-13 20:58:35 UTC
Created attachment 512978 [details]
SuSEfirewall2

Attached the /etc/sysconfig/SuSEfirewall2 script.
If my memory servers me well, I haven't modified anything inside (aka. it should be factory default).
Comment 10 Bernhard Wiedemann 2012-11-14 13:43:12 UTC
This is an autogenerated message for OBS integration:
This bug (785240) was mentioned in
https://build.opensuse.org/request/show/141173
Comment 11 Ludwig Nussel 2012-11-14 15:13:48 UTC
The config file does not contain suspicous entries. When looking at the issue I found that iptables-batch is not included in 12.2 thow. That means SuSEfirewall2 has to issue hundreds of iptables calls which may explain the heavy disk usage. I've prepared a package with iptables-batch enabled again in home:lnussel:branches:OBS_Maintained:iptables. If Jan is fine with the change to the package (also for Factory) I'd submit the package as maintenance update.
Comment 12 Jan Engelhardt 2012-11-14 15:57:35 UTC
Just call autoreconf unconditionally for the time being, and let the file list be what it was.

# patched Makefile.am to get iptables-apply
if true || [ -e configure ]; then
      ./autogen.sh
fi;


To that end, I have started a thread with upstream to see if autoconf can fail more prominently (that would be ideal), rather than whispering a warning into the build log (which nobody reads on success).
Comment 13 Ákos Szőts 2012-11-15 09:44:54 UTC
I'm sorry, but I can't find the "iptables:" directory inside http://download.opensuse.org/repositories/home:/lnussel:/branches:/OBS_Maintained:/ and there's no iptables package update in the Update channel either.

Could you please provide the RPM package to test? :)
Comment 14 Jan Engelhardt 2012-11-15 14:18:08 UTC
Restored iptables-batch in the package. (See security:netfilter/iptables for when BS gets around to have it built.)

JFYI, upstream reports that automake-1.13 is planned to catch errors like these.
Comment 15 Ludwig Nussel 2012-11-15 14:48:47 UTC
Thanks. I still think that not using wildcards in the file list is a good way to avoid accidentally missing binaries though.
Comment 16 Jan Engelhardt 2012-11-15 14:51:16 UTC
Patches welcome to make rpm support the a{b,c} notation :)
Comment 17 Ludwig Nussel 2012-11-15 16:03:19 UTC
(In reply to comment #13)
> I'm sorry, but I can't find the "iptables:" directory inside
> http://download.opensuse.org/repositories/home:/lnussel:/branches:/OBS_Maintained:/
> and there's no iptables package update in the Update channel either.
> 
> Could you please provide the RPM package to test? :)

publishing was off. turned on now.
Comment 18 Ákos Szőts 2012-11-16 10:44:56 UTC
I upgraded the packages iptables, libxtables7 and libiptc0 from your repository. 

I rebooted twice, but unfortunately not much changed:

85332ms SuSEfirewall2_init.service
 5687ms systemd-vconsole-setup.service
 4946ms systemd-modules-load.service
 3058ms systemd-logind.service
 2054ms boot.mount
Comment 19 Jan Engelhardt 2012-11-16 14:06:20 UTC
Can you try security:netfilter/iptables instead? I'm pondering sending 1.4.16.3 to openSUSE:12.2:Update because it fixes http://bugs.debian.org/678499 .
Comment 20 Ákos Szőts 2012-11-16 18:43:44 UTC
I installed the latest iptables & co. from http://download.opensuse.org/repositories/security:/netfilter/openSUSE_12.2/ , rebooted twice, but unfortunately the results are the same:

96652ms SuSEfirewall2_init.service
 5435ms systemd-logind.service
 3700ms systemd-modules-load.service
 2747ms systemd-vconsole-setup.service
 2219ms boot.mount

Btw. what does YaST First Boot do exactly? I always see it, on every boot (not just on the first one).
Comment 21 Ludwig Nussel 2012-11-19 09:00:47 UTC
YaST2 Firstboot should only start if /var/lib/YaST2/reconfig_system exists.

I have no idea why SuSEfirewall2 would still take so long. One guess would be module loading. Try
echo ip_tables >> /etc/modules-load.d/iptables.conf
echo nf_conntrack >> /etc/modules-load.d/iptables.conf

Then you should see the module loading time not accounted to sfw2. If that doesn't help either you could try adding some "echo $LINENO $SECOND" to /sbin/SuSEfirewall2 to find out where all the time is spent. Search for "main mode" to get to the point where the main script action stats. From there on everything is in functions.
Comment 22 Ákos Szőts 2012-11-19 15:53:36 UTC
I added lots of "echo" lines under the section #main mode#, but could you tell me please where can I find their output? I see no clue in /var/log/boot.log.

But in the meantime I found this in dmesg:

[...]
[   23.502738] Adding 2104476k swap on /dev/sda5.  Priority:-1 extents:1 across:2104476k
[   25.912062] ip6_tables: (C) 2000-2006 Netfilter Core Team
[  168.220565] NVRM: GPU at 0000:02:00: GPU-d1a06ee5-33bd-5b52-0bec-c4875a0b6df9
[  168.220571] NVRM: Your system is not currently configured to drive a VGA console
[  168.220574] NVRM: on the primary VGA device. The NVIDIA Linux graphics driver
[  168.220576] NVRM: requires the use of a text-mode VGA console. Use of other console
[...]

As I mentioned earlier, I did disable IPv6 support in YaST network manager module (and now it's still disabled, I've just checked).
Comment 23 Ludwig Nussel 2012-11-20 09:46:05 UTC
systemctl status SuSEfirewall2_init.service
Comment 24 Ákos Szőts 2012-11-20 09:54:12 UTC
Created attachment 513768 [details]
/sbin/SuSEfirewall2 with echo-s

It says nothing special:

#> systemctl status SuSEfirewall2_init.service

SuSEfirewall2_init.service - LSB: SuSEfirewall2 phase 1
          Loaded: loaded (/etc/init.d/SuSEfirewall2_init)
          Active: active (exited) since Mon, 19 Nov 2012 21:06:53 +0100; 13h ago
         Process: 829 ExecStart=/etc/init.d/SuSEfirewall2_init start (code=exited, status=0/SUCCESS)
          CGroup: name=systemd:/system/SuSEfirewall2_init.service

I attached the /sbin/SuSEfirewall2 file with full of echos, to double-check I did the modifications aright.
Comment 25 Ludwig Nussel 2012-11-21 15:45:57 UTC
should be $SECONDS instead of $SECOND. I would have expected the output of SuSEfirewall2_init.service be recorded in the journal. Looks like that's only happening when restarting it in the running system. Is there anything special that needs to be done to enable logging to journal during boot?
Comment 26 Bernhard Wiedemann 2012-11-23 10:00:38 UTC
This is an autogenerated message for OBS integration:
This bug (785240) was mentioned in
https://build.opensuse.org/request/show/142549 Maintenance /
Comment 27 Bernhard Wiedemann 2012-11-23 11:00:39 UTC
This is an autogenerated message for OBS integration:
This bug (785240) was mentioned in
https://build.opensuse.org/request/show/142552 Maintenance /
Comment 28 Bernhard Wiedemann 2012-11-23 13:00:24 UTC
This is an autogenerated message for OBS integration:
This bug (785240) was mentioned in
https://build.opensuse.org/request/show/142576 Factory / sysconfig
Comment 29 Ákos Szőts 2012-11-23 23:57:37 UTC
The output was in /var/log/messages.
The interesting part is that the #main mode# section belongs to SuSEfirewall2_setup and not to SuSEfirewall2_init.

Here is the excerpt of the log:
Nov 24 00:47:56 linux-suse SuSEfirewall2: Firewall rules set to CLOSE.
Nov 24 00:47:56 linux-suse SuSEfirewall2_init[811]: Loading basic firewall rules WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
Nov 24 00:47:56 linux-suse SuSEfirewall2_init[811]: WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
Nov 24 00:47:56 linux-suse SuSEfirewall2_init[811]: last message repeated 3 times
Nov 24 00:47:56 linux-suse SuSEfirewall2_init[811]: WARNING: The NOTRACK target is obsolete. Use CT instead.
Nov 24 00:47:56 linux-suse SuSEfirewall2_init[811]: WARNING: The NOTRACK target is obsolete. Use CT instead.
Nov 24 00:47:56 linux-suse SuSEfirewall2_init[811]: WARNING: The NOTRACK target is obsolete. Use CT instead.
Nov 24 00:47:56 linux-suse SuSEfirewall2_init[811]: WARNING: The NOTRACK target is obsolete. Use CT instead.
Nov 24 00:47:56 linux-suse SuSEfirewall2_init[811]: ..done
Nov 24 00:48:28 linux-suse SuSEfirewall2: Setting up rules from /etc/sysconfig/SuSEfirewall2 ...
Nov 24 00:48:28 linux-suse SuSEfirewall2_setup[2907]: Loading firewall rules Start time: 2354 0
Nov 24 00:48:28 linux-suse SuSEfirewall2_setup[2907]: Start time: 2356 0
Nov 24 00:48:28 linux-suse SuSEfirewall2_setup[2907]: Start time: 2358 0
Nov 24 00:48:28 linux-suse SuSEfirewall2_setup[2907]: Start time: 2360 0
Nov 24 00:48:28 linux-suse SuSEfirewall2_setup[2907]: Start time: 2362 0
Nov 24 00:48:28 linux-suse SuSEfirewall2_setup[2907]: Start time: 2364 0
Nov 24 00:48:28 linux-suse SuSEfirewall2_setup[2907]: Start time: 2366 0

[...]

As you can see, only the line numbers are changing, the $seconds remain the same, 0.

Could you tell me please where can I find the "init" file of the SuSEfirewall2?
Comment 30 Swamp Workflow Management 2012-11-26 10:07:58 UTC
openSUSE-RU-2012:1551-1: An update that has one recommended fix can now be installed.

Category: recommended (low)
Bug References: 785240
CVE References: 
Sources used:
openSUSE 12.2 (src):    iptables-1.4.14-2.4.1
Comment 31 Swamp Workflow Management 2012-11-26 10:08:39 UTC
openSUSE-RU-2012:1553-1: An update that has one recommended fix can now be installed.

Category: recommended (low)
Bug References: 785240
CVE References: 
Sources used:
openSUSE 12.2 (src):    iptables-1.4.14-2.4.1
Comment 32 Swamp Workflow Management 2012-11-26 10:09:00 UTC
openSUSE-RU-2012:1560-1: An update that has one recommended fix can now be installed.

Category: recommended (low)
Bug References: 785240
CVE References: 
Sources used:
openSUSE 12.2 (src):    iptables-1.4.14-2.4.1
Comment 33 Ludwig Nussel 2012-11-26 10:31:34 UTC
(In reply to comment #29)
> The output was in /var/log/messages.
> The interesting part is that the #main mode# section belongs to
> SuSEfirewall2_setup and not to SuSEfirewall2_init.

Ah, of course. Sorry for that. SuSEfirewall2_init only causes the
section with $ACTION=close to be executed (around line 2312). It
only calls set_basic_rules().

> [...]
> 
> As you can see, only the line numbers are changing, the $seconds remain the
> same, 0.

Which means fast :)

> Could you tell me please where can I find the "init" file of the SuSEfirewall2?

/etc/init.d/SuSEfirewall2_init
Comment 34 Bernhard Wiedemann 2012-11-26 11:00:23 UTC
This is an autogenerated message for OBS integration:
This bug (785240) was mentioned in
https://build.opensuse.org/request/show/142883 Factory / sysconfig
Comment 35 Ákos Szőts 2012-11-26 12:58:54 UTC
Created attachment 514473 [details]
/etc/init.d/SuSEfirewall2_init with echo-s

I think we're getting closer.

The output of the attached file:
Nov 26 13:51:46 linux-suse SuSEfirewall2: Firewall rules set to CLOSE.
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 21 0
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 24 0
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 27 0
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 32 0
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 34 0
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 36 0
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Loading basic firewall rules Start time: 40 0
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 42 1
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 44 1
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: last message repeated 4 times
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: WARNING: The NOTRACK target is obsolete. Use CT instead.
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: last message repeated 3 times
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: /sbin/SuSEfirewall2 --bootlock -q close
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 47 172
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: ..done
Nov 26 13:51:46 linux-suse SuSEfirewall2_init[805]: Start time: 80 172
Nov 26 13:53:23 linux-suse SuSEfirewall2: Setting up rules from /etc/sysconfig/SuSEfirewall2 ...
Nov 26 13:53:23 linux-suse SuSEfirewall2_setup[2966]: Loading firewall rules WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
Nov 26 13:53:23 linux-suse SuSEfirewall2_setup[2966]: WARNING: Deprecated config file /etc/modprobe.conf, all config files belong into /etc/modprobe.d/.
Nov 26 13:53:24 linux-suse SuSEfirewall2_setup[2966]: last message repeated 4 times
Nov 26 13:53:24 linux-suse SuSEfirewall2_setup[2966]: WARNING: The NOTRACK target is obsolete. Use CT instead.
Nov 26 13:53:54 linux-suse SuSEfirewall2_setup[2966]: last message repeated 3 times
Nov 26 13:53:55 linux-suse SuSEfirewall2: Firewall rules successfully set
Nov 26 13:53:55 linux-suse SuSEfirewall2_setup[2966]: ..done

To sum up, it takes 3 minutes to execute `/sbin/SuSEfirewall2 --bootlock -q close`.
Comment 36 Ludwig Nussel 2012-11-26 13:06:01 UTC
Yes, and the interesting question is what inside /sbin/SuSEfirewall2 takes so long :)
Comment 37 Frederic Crozat 2012-11-26 17:59:46 UTC
(In reply to comment #25)
> should be $SECONDS instead of $SECOND. I would have expected the output of
> SuSEfirewall2_init.service be recorded in the journal. Looks like that's only
> happening when restarting it in the running system. Is there anything special
> that needs to be done to enable logging to journal during boot?

it should work fine but there was some bugs in the journal code which could not log in journal output from process which are no longer running (and SuSEfirewall might be one of those). It will be fixed with pending systemd update for 12.2
Comment 38 Ákos Szőts 2012-11-26 20:22:21 UTC
Created attachment 514535 [details]
"/sbin/SuSEfirewall2" with lots of echos

I attached the scary version of /sbin/SuSEfirewall2, with full of echos.

The interesting part is:
Nov 26 21:13:00 linux-suse SuSEfirewall2_init[793]: Start time: 126 1
Nov 26 21:13:00 linux-suse SuSEfirewall2_init[793]: /bin/logger  -p auth.info -t SuSEfirewall2 "/var/lock/SuSEfirewall2.booting exists which means system boot in progress, exit."
Nov 26 21:13:00 linux-suse SuSEfirewall2_init[793]: Start time: 129 102

Except for this one, every other line shows 1, 2 or 3 seconds, but according to this one it takes one and a half minute to execute the command `/bin/logger $dashs $pri -t SuSEfirewall2 "$*"`.
Comment 39 Ákos Szőts 2012-11-26 20:25:55 UTC
Ps.: while I was testing and made some syntax errors, the /bin/logger command ran in no time.
Just for this very command it acts so slow.
Comment 40 Ludwig Nussel 2012-11-27 10:01:19 UTC
Ok, now it gets interesting. There are two things going wrong
1. the logger call shouldn't take ages
2. SuSEfirewall2 shouldn't even get there


1. might be a systemd/journald bug. Maybe the syslog implementation takes time to start up? Which systlog do you  use?

2. SuSEfirweall2_init is supposed to run once and as first instance of SuSEfirewall2. The lock file is /var/lock/SuSEfirewall2.pid so on tmpfs. It cannot exist at SuSEfirewall2_init time unless some other process that runs in parallel executed SuSEfirewall2.

So to debug 2) further you may restore SuSEfirewall2 to it's original state and e.g. add an ls -l $LOCKFILE or cat $LOCKFILE to the start of setlock(). If a lock file shows up there you may add some "ps faux" call or something like that to see what's running in parallel there.
Comment 41 Ákos Szőts 2012-11-28 23:53:27 UTC
1.) I haven't changed anything related to syslog to the default install. Zypper says:

$> zypper se -i syslog:
i | rsyslog        | The enhanced syslogd for Linux and Unix
i | syslog-service | Syslog service files & scripts         

2.) I put the calls with $LOCKFILE into the restored SuSEfirewall2 but it didn't find anything:

Nov 29 00:38:24 linux-suse SuSEfirewall2: Firewall rules set to CLOSE.
Nov 29 00:38:24 linux-suse SuSEfirewall2_init[840]: Loading basic firewall rules LOCK BEGIN
Nov 29 00:38:24 linux-suse SuSEfirewall2_init[840]: ls: cannot access /var/lock/SuSEfirewall2.pid: No such file or directory
Nov 29 00:38:24 linux-suse SuSEfirewall2_init[840]: cat: /var/lock/SuSEfirewall2.pid: No such file or directory


I don't know whether is it normal or just indicates the previous that there is two "LOCK BEGIN [...] LOCK END" calls with these time stamps:

Nov 29 00:38:24 linux-suse SuSEfirewall2_init[840]: Loading basic firewall rules LOCK BEGIN
Nov 29 00:38:24 linux-suse SuSEfirewall2_init[840]: LOCK END
Nov 29 00:38:57 linux-suse SuSEfirewall2_setup[2941]: Loading firewall rules LOCK BEGIN
Nov 29 00:38:57 linux-suse SuSEfirewall2_setup[2941]: LOCK END

Maybe I should put the debug lines before the error message line.
Comment 42 Swamp Workflow Management 2012-12-04 13:11:05 UTC
openSUSE-RU-2012:1602-1: An update that has four recommended fixes can now be installed.

Category: recommended (low)
Bug References: 716652,780644,784952,785240
CVE References: 
Sources used:
openSUSE 12.2 (src):    sysconfig-0.76.4-1.4.1
Comment 43 Herbert Meier 2012-12-05 21:38:14 UTC
(In reply to comment #38)
> Nov 26 21:13:00 linux-suse SuSEfirewall2_init[793]: /bin/logger  -p auth.info
> this one it takes one and a half minute to execute the command `/bin/logger

(In reply to comment #40)
> Ok, now it gets interesting. There are two things going wrong
> 1. the logger call shouldn't take ages

I also saw this, when using btrfs for root filesystem. In my case it was postfix that took more than 40 seconds to start. The following is the strace output:

22:35:44 execve("/usr/sbin/postlog", ["/usr/sbin/postlog", "-t", "postfix/postfix-script", "-p", "info", "starting", "the", "Postfix", "mail", "system"], ["MAIL_CONFIG=/etc/postfix", "data_directory=/var/lib/postfix", "sample_directory=/usr/share/doc/packages/postfix-doc/samples", "setgid_group=maildrop", "sendmail_path=/usr/sbin/sendmail", "mailq_path=/usr/bin/mailq", "manpage_directory=/usr/share/man", "readme_directory=/usr/share/doc/packages/postfix-doc/README_FILES", "newaliases_path=/usr/bin/newaliases", "PATH=/bin:/usr/bin:/sbin:/usr/sbin", "PWD=/var/spool/postfix", "queue_directory=/var/spool/postfix", "LANG=C", "mail_owner=postfix", "daemon_directory=/usr/lib/postfix", "SHLVL=1", "config_directory=/etc/postfix", "MAIL_LOGTAG=postfix", "html_directory=/usr/share/doc/packages/postfix-doc/html", "command_directory=/usr/sbin", "OLDPWD=/etc/postfix", "_=/usr/sbin/postlog"]) = 0 <0.000116>

22:35:44 socket(PF_FILE, SOCK_DGRAM|SOCK_CLOEXEC, 0) = 3 <0.000010>
22:35:44 connect(3, {sa_family=AF_FILE, sun_path="/dev/log"}, 110) = 0 <0.000009>
22:35:44 sendto(3, "<22>Oct 26 22:35:44 postfix/postfix-script[2096]: starting the Postfix mail system", 82, MSG_NOSIGNAL, NULL, 0) = 82 <39.725261>
22:36:24 exit_group(0)                  = ?

I moved the root filesystem to an ext4 partition and this fixed the problem. Boot time was reduced from 
 Startup finished in 4917ms (kernel) + 123483ms (userspace) = 128401ms
to
 Startup finished in 4206ms (kernel) +  20974ms (userspace) =  25181ms
Comment 44 Ákos Szőts 2012-12-06 10:16:28 UTC
I tried to reproduce the "system boot in progress, exit" bug with strace, but as soon as I started to debug this, the bug disappeared...

Anyway when I started the system in "init 1" mode I observed the same heavy IO and I could run an iotop: according to it systemd-journald was the responsible for the 99% of the IO.

And an other observation: I have the inode_cache turned on and that also causes many seconds slow-down (btrfs-inodcache-writer or something like this is the name of the running process) in that time when X is starting.
Comment 45 Bernhard Wiedemann 2012-12-06 19:00:45 UTC
This is an autogenerated message for OBS integration:
This bug (785240) was mentioned in
https://build.opensuse.org/request/show/144440 Maintenance /
Comment 46 Ákos Szőts 2012-12-09 19:44:53 UTC
Well, I think I found out the root cause of this error: btrfs inode cache.
As soon as I remove the "inode_cache" from /etc/fstab, the system start time goes down to this:

 28865ms ntp.service
 16724ms systemd-readahead-replay.service
 16720ms systemd-readahead-collect.service
 14467ms remount-rootfs.service
 14303ms systemd-vconsole-setup.service
 14251ms systemd-modules-load.service
  2771ms udev-root-symlink.service
  2143ms iscsid-early.service
  2047ms SuSEfirewall2_init.service
  1984ms cycle.service

So now it's 2 seconds to start SuSEfirewall2_init instead of 2 minutes.

NTP is an other interesting thing, because after I removed the "21-dhcpcd-hook-samba" and "21-cifs" service files I don't have internet (because the lack of automatic DHCP setup) and it appears in this case NTP blocks the system start-up.
Comment 47 Ludwig Nussel 2012-12-10 08:44:52 UTC
so changing bug to kernel issue
Comment 48 Ákos Szőts 2012-12-10 11:59:23 UTC
Some questions:

1.) When Samba and CIFS service files are missing there's no DHCP. Is this a consequence or a bug?

2.) Should I open an other bug report for 30 sec NTPd start when there's no DHCP and internet connection?

3.) What can I do for debugging the btrfs inode cache bug? When I could log in into tty1, even iotop froze for about 40 seconds when btrfs was working.
Comment 49 Ákos Szőts 2012-12-16 16:59:58 UTC
After I upgraded to kernel v3.7.0-3-desktop, I cannot reproduce the long start up time with inode_cache turned on.

But, even after I moved the "21-dhcpcd-hook-samba" and "21-cifs" service files back to their original place, the DHCP still doesn't work and that's why NTPd still blocks the start up process.

#> systemd-analyze blame | head
 24750ms ntp.service
  8369ms remount-rootfs.service
  6028ms SuSEfirewall2_init.service
  5543ms systemd-logind.service
  4625ms systemd-vconsole-setup.service
Comment 50 Swamp Workflow Management 2012-12-17 12:09:58 UTC
openSUSE-RU-2012:1651-1: An update that has 5 recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 716652,775281,780644,784952,785240
CVE References: 
Sources used:
openSUSE 12.1 (src):    sysconfig-0.75.4-2.16.1
Comment 51 Ákos Szőts 2013-01-02 10:21:07 UTC
I was wrong previously when I said I couldn't reproduce it on a newer kernel. Using kernel 3.7 this bug still occurs, now when the xdm is starting.

Ps.: I created Bug 796331 for tracking the NTP issue.
Comment 52 Bernhard Wiedemann 2013-03-25 09:00:19 UTC
This is an autogenerated message for OBS integration:
This bug (785240) was mentioned in
https://build.opensuse.org/request/show/160871 Maintenance / 
https://build.opensuse.org/request/show/160872 Maintenance / 
https://build.opensuse.org/request/show/160873 Maintenance /
Comment 53 Benjamin Brunner 2013-04-02 13:51:44 UTC
Update released for openSUSE 12.1, 12.2 and 12.3. Resolved fixed.
Comment 54 Bernhard Wiedemann 2013-05-16 09:00:23 UTC
This is an autogenerated message for OBS integration:
This bug (785240) was mentioned in
https://build.opensuse.org/request/show/175838 Factory / sysconfig
Comment 55 Bernhard Wiedemann 2013-05-16 10:00:22 UTC
This is an autogenerated message for OBS integration:
This bug (785240) was mentioned in
https://build.opensuse.org/request/show/175839 Factory / sysconfig