Bug 1182834 - glibc: bad performance calling localtime() when TZ env var not set
Summary: glibc: bad performance calling localtime() when TZ env var not set
Status: NEW
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Basesystem (show other bugs)
Version: Current
Hardware: Other Other
: P5 - None : Normal (vote)
Target Milestone: ---
Assignee: Andreas Schwab
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-03-01 00:48 UTC by Dario Faggioli
Modified: 2021-04-15 11:47 UTC (History)
2 users (show)

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Dario Faggioli 2021-03-01 00:48:08 UTC
The other day, by pure coincidence, I run into this: https://bugzilla.redhat.com/show_bug.cgi?id=1608367#c7

It's basically about making sure that the TZ env variable is defined, to avoid doing syscalls for stat-ing /etc/localtime. Out of curiosity, I checked on my openSUSE and SLE systems, and we seem not to have it there (or is it just my installations?).

I went on, and run the quick experiments suggested in the bugreport. Here's me on openSUSE Tumbleweed:

$ gcc -Wall -O2 time_vdso.c -o time_vdso
$ ldd time_vdso
	linux-vdso.so.1 (0x00007fff71fde000)
	libc.so.6 => /lib64/libc.so.6 (0x00007f2e3a774000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f2e3a95e000)

$ echo "TZ currently is: " $TZ
TZ currently is:

$ time ./time_vdso 
real	0m6,687s
user	0m1,569s
sys	0m5,118s

$ strace -c ./time_vdso 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100,00   25,410273           5   5000003           newfstatat
  0,00    0,000000           0         3           read
  0,00    0,000000           0         3           close
  0,00    0,000000           0         1           lseek
  0,00    0,000000           0         8           mmap
  0,00    0,000000           0         3           mprotect
  0,00    0,000000           0         1           munmap
  0,00    0,000000           0         3           brk
  0,00    0,000000           0         4           pread64
  0,00    0,000000           0         1         1 access
  0,00    0,000000           0         1           execve
  0,00    0,000000           0         2         1 arch_prctl
  0,00    0,000000           0         3           openat
------ ----------- ----------- --------- --------- ----------------
100,00   25,410273           5   5000036         2 total

On the other hand, after having done an `export TZ=Europe/Rome`:

$ echo "TZ currently is: " $TZ
TZ currently is:  Europe/Rome

$ time ./time_vdso

real	0m0,845s
user	0m0,845s
sys	0m0,000s

$ strace -c ./time_vdso 
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
  0,00    0,000000           0         3           read
  0,00    0,000000           0         3           close
  0,00    0,000000           0         1           lseek
  0,00    0,000000           0         8           mmap
  0,00    0,000000           0         3           mprotect
  0,00    0,000000           0         1           munmap
  0,00    0,000000           0         3           brk
  0,00    0,000000           0         4           pread64
  0,00    0,000000           0         1         1 access
  0,00    0,000000           0         1           execve
  0,00    0,000000           0         2         1 arch_prctl
  0,00    0,000000           0         3           openat
  0,00    0,000000           0         4           newfstatat
------ ----------- ----------- --------- --------- ----------------
100,00    0,000000           0        37         2 total

I tried on a pretty freshly deployed SLE 15.3 too, this time using /usr/bin/time, just for a change and...

/usr/bin/time ./time_vdso 
2.36user 7.65system 0:10.02elapsed 99%CPU (0avgtext+0avgdata 1284maxresident)k
0inputs+0outputs (0major+62minor)pagefaults 0swaps

TZ=Europe/Rome /usr/bin/time ./time_vdso 
0.89user 0.00system 0:00.89elapsed 99%CPU (0avgtext+0avgdata 1504maxresident)k
0inputs+0outputs (0major+68minor)pagefaults 0swaps

So, again, the high 'system' value confirms we're drowning in syscalls.

Of course, this is a typical micro-benchmark and results should be interpreted as such. I mean, real programs do not sample the time _all_the_time_ in super-tight loops. And indeed, benchmarks and time-sensitive applications will use things like clock_gettime() for measuring time intervals, interact with timers, etc.

Still, localtime()/time() are probably pretty popular when it comes to logging (e.g., on webservers... and in fact, I gave a quick look at the code of ngnix, and they're there). Surely excessive logging would be bad for other reasons, but I think that something like this means that even "normal logging" has more impact on performance than it could/should have.
Comment 1 Dario Faggioli 2021-03-01 01:01:28 UTC
(In reply to Dario Faggioli from comment #0)
> Still, localtime()/time() are probably pretty popular when it comes to
> logging (e.g., on webservers... and in fact, I gave a quick look at the code
> of ngnix, and they're there). Surely excessive logging would be bad for
> other reasons, but I think that something like this means that even "normal
> logging" has more impact on performance than it could/should have.
>
Speaking about possible solutions, I'm not sure what to propose, TBH.

Setting TZ, according to documentation (https://www.gnu.org/software/libc/manual/html_node/TZ-Variable.html) should not be necessary in the vast majority of the case, so I don't think we can rely on users/admin doing that themselves that often.

We can just always set it up ourselves, but then what if the localtime really changes? This is probably rare enough, but it it happens, apps and services will start to log wrong timestamps (or will have to be rebooted) which we definitely don't want.

Speaking with performance people, the idea of rime and/or rate limit the stat() calls (e.g. no more than once a minute) came up.

So, thoughts?
Comment 2 Fabian Vogt 2021-03-01 08:49:04 UTC
I've seen applications which stat localtime pretty often during startup and operation, but avoiding them made no discernible difference. I don't have any actual data anymore, but I don't think that this will be a bottleneck in reality.
Comment 3 Markéta Machová 2021-03-05 12:41:34 UTC
As it is Fedora bug, it is probably not about packaging.

Looking into it, I guess our problem is to ensure TZ is set. Timezone Makefile does "TZDEFAULT = $(TOPDIR)/etc/localtime", which is the only reasonable option. All over, I think this bug is not about timezone. Screening folks, could you please ping it to someone more competent?

From my point of view, if we could append a line "TZ=':/etc/localtime'; export TZ" to our users' .profile file it would do the trick (given the workaround on the Fedora bug). But I don't know how complicated this is, I would think it is nearly impossible given the variety of users we have.
Comment 4 Markéta Machová 2021-04-13 16:19:25 UTC
Andreas, do you have anything to say, as a maintainer of glibc?
Comment 5 Andreas Schwab 2021-04-14 12:59:51 UTC
https://sourceware.org/bugzilla/show_bug.cgi?id=24004
Comment 6 Markéta Machová 2021-04-15 08:59:15 UTC
Thanks, perfect. So it is known glibc bug, in this case I am assigning it to you.