Bug 1219376

Summary: texlive-latex: LaTeX extremely slow for many small files (fixed in upstream latex2e)
Product: [openSUSE] openSUSE Tumbleweed Reporter: Tobias Burnus <burnus>
Component: Other Assignee: E-mail List <screening-team-bugs>
Status: RESOLVED UPSTREAM QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: burnus, werner
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
Whiteboard:
Found By: --- Services Priority:
Business Priority: Blocker: ---
Marketing QA Status: --- IT Deployment: ---
Attachments: boo1219376.patch

Description Tobias Burnus 2024-01-31 07:39:12 UTC
LaTeX is extremely slow when many small files are processed. The problem seems to be that it keeps looking up the same files without caching the results.

I think the fix is the following (merged):
https://github.com/latex3/latex2e/pull/1063

Please consider applying it. From the news file included in the commit:

----------------------------------------------
\section{Code improvements}

\subsection{Performance in checking file existence}

The addition of hooks, etc., to file operations had a side-effect in that
multiple checks were made that the file existed. In larger documents using
lots of files, these filesystem operations caused non-trivial performance
impact. We now cache the existence of files, such that these repeated filesystem
calls are avoided.
----------------------------------------------
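
The caching idea described in the news entry can be illustrated outside of TeX. The following is only a minimal Python sketch of the general technique, not the code from the pull request; the class name ExistenceCache and its counter are made up for illustration. Note that such a cache goes stale if a file is created or deleted after the first check.

```python
import os


class ExistenceCache:
    """Remember the result of os.path.exists() per path (illustrative only)."""

    def __init__(self):
        self._known = {}            # path -> bool, filled on first lookup
        self.filesystem_calls = 0   # how often the OS was really asked

    def exists(self, path):
        if path not in self._known:
            self.filesystem_calls += 1
            self._known[path] = os.path.exists(path)
        return self._known[path]


cache = ExistenceCache()
for _ in range(10):
    cache.exists(os.curdir)         # ten lookups of the same path
print(cache.filesystem_calls)       # only the first lookup touches the filesystem
```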
Comment 1 Dr. Werner Fink 2024-01-31 07:54:34 UTC
From https://github.com/latex3/latex2e/pull/1063

*-----------------*
READ ME FIRST: Please understand that in most cases we will not be able to merge a pull request because there are a lot of internal activities needed when updating the LaTeX2e sources. If you have a code suggestion please discuss it with the team first.
*-----------------*

Next, the dtx files require some more work to get a patch for the source code ... I'll investigate that further
Comment 2 Dr. Werner Fink 2024-01-31 08:06:42 UTC
I would really prefer a patch against tex/latex/base/latex.ltx ... otherwise we have to wait for TeXLive 2024
Comment 3 Dr. Werner Fink 2024-01-31 08:50:15 UTC
Created attachment 872340 [details]
boo1219376.patch

Please test this
Comment 4 Dr. Werner Fink 2024-01-31 09:23:57 UTC
Remember to run e.g.

  fmtutil --all

as otherwise the patched latex.ltx and latexrelease.sty are not included in the format files
Comment 5 Tobias Burnus 2024-01-31 12:11:39 UTC
Thanks for the patch!

BTW: It seems as if https://github.com/latex3/latex2e/pull/1082/files is needed as a follow-up (it is linked from the original pull request, as is the associated bug report).

* * *

Unfortunately, it did not help as much as hoped for: building the OpenMP specification still takes 16 min, which, give or take a few seconds, is also the time I measured without the patch.(*) Bummer!

I also tried it with my local patch variant, but that didn't help either.

[For comparison, with TeXLive 2022 (on Ubuntu 22), it takes 41.871s; I don't have openSUSE TeXLive < 2023 numbers.]


Thus, either it does not help for this test case, or it requires some other change I have missed, or it doesn't help at all?


I wonder whether the patch still makes sense, or whether we should just wait for the next TeXLive release, hoping that it will restore the previous performance. It looks as if TeXLive 2024 will arrive in two months.

* * *

(*) Also looking at the 'strace' output, the number of
  'access', 'newfstatat', 'newfstatat'
triples did not seem to be lower.  (I count 7 to 11 of those triples per file before it is finally opened ("openat"). All with identical arguments, all with return value 0 and no other code in between. – It seems as if a single 'access' and a single 'newfstatat' should have been enough.)
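
A count like the one above can be reproduced with a small script. This is only a sketch: it assumes strace's default text format, and the sample lines below are made up for illustration rather than taken from the real trace. The helper name count_calls is hypothetical.

```python
import re
from collections import Counter, defaultdict

# Matches e.g. access("/p", R_OK) or newfstatat(AT_FDCWD, "/p", ...) and
# captures the syscall name and the quoted path argument.
CALL = re.compile(r'\b(access|newfstatat|openat)\((?:[^",]*, )?"([^"]+)"')


def count_calls(lines):
    """Return {path: Counter({syscall: count})} for the syscalls of interest."""
    counts = defaultdict(Counter)
    for line in lines:
        match = CALL.search(line)
        if match:
            syscall, path = match.groups()
            counts[path][syscall] += 1
    return counts


# Made-up sample: two access/newfstatat/newfstatat triples, then the open.
sample = [
    'access("/tmp/a.tex", R_OK) = 0',
    'newfstatat(AT_FDCWD, "/tmp/a.tex", {st_mode=S_IFREG|0644, ...}, 0) = 0',
    'newfstatat(AT_FDCWD, "/tmp/a.tex", {st_mode=S_IFREG|0644, ...}, 0) = 0',
    'access("/tmp/a.tex", R_OK) = 0',
    'newfstatat(AT_FDCWD, "/tmp/a.tex", {st_mode=S_IFREG|0644, ...}, 0) = 0',
    'newfstatat(AT_FDCWD, "/tmp/a.tex", {st_mode=S_IFREG|0644, ...}, 0) = 0',
    'openat(AT_FDCWD, "/tmp/a.tex", O_RDONLY) = 3',
]

for path, calls in count_calls(sample).items():
    print(path, dict(calls))
```

Comparing these per-file counts between a patched and an unpatched run would show whether the number of repeated checks actually went down.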
Comment 6 Dr. Werner Fink 2024-01-31 12:21:14 UTC
Be aware that latex.ltx as well as latexrelease.sty *are* generated from ltfiles.dtx:

 grep ltfiles.dtx tex/latex/base/latex*
 tex/latex/base/latex.ltx:%% ltfiles.dtx  (with options: `2ekernel')
 tex/latex/base/latex.ltx:%%% From File: ltfiles.dtx
 tex/latex/base/latexrelease.sty:%% ltfiles.dtx  (with options: `latexrelease')
 tex/latex/base/latexrelease.sty:%%% From File: ltfiles.dtx
Comment 7 Tobias Burnus 2024-01-31 12:27:19 UTC
> Be aware that latex.ltx as well as latexrelease.sty *are* generated from
> ltfiles.dtx:

Yes, but that's why both your attachment and my experiments included it. I also checked that /var/lib/texmf/web2c/*/*.fmt are all updated via fmtutil --all after those changes.
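
Such a freshness check can be scripted. The sketch below is an assumption about how one might do it, not what was actually run here: it compares modification times only, which is a heuristic, and the helper name stale_formats is made up. The full path of latex.ltx depends on the installation and is left as a placeholder.

```python
import glob
import os


def stale_formats(source_path, fmt_pattern):
    """Return format files matching fmt_pattern that are older than source_path."""
    source_mtime = os.path.getmtime(source_path)
    return sorted(
        fmt for fmt in glob.glob(fmt_pattern)
        if os.path.getmtime(fmt) < source_mtime
    )


# Hypothetical usage; adjust the first argument to the local texmf tree:
# stale_formats("/path/to/tex/latex/base/latex.ltx",
#               "/var/lib/texmf/web2c/*/*.fmt")
```

An empty list means every matching .fmt file was written after the patched source was last modified.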
Comment 8 Dr. Werner Fink 2024-01-31 12:29:48 UTC
Then it looks like something is missing here ... maybe in the WEB sources and their change files and/or the web2c interface
Comment 9 Dr. Werner Fink 2024-02-06 14:25:16 UTC
Let's wait for TeXLive 2024