Bugzilla – Bug 1219376
texlive-latex: LaTeX extremely slow for many small files (fixed in upstream latex2e)
Last modified: 2024-02-06 14:25:16 UTC
LaTeX is extremely slow when many small files are processed. The problem seems to be that it keeps looking up files without caching the result.

I think the fix is the following (merged): https://github.com/latex3/latex2e/pull/1063

Please consider applying it.

From the news file included in the commit:
----------------------------------------------
\section{Code improvements}
\subsection{Performance in checking file existence}
The addition of hooks, etc., to file operations had a side-effect in that multiple checks were made that the file existed. In larger documents using lots of files, these filesystem operations caused non-trivial performance impact. We now cache the existence of files, such that these repeated filesystem calls are avoided.
----------------------------------------------
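To illustrate the idea described in the news entry, here is a minimal Python sketch of caching file-existence checks. It is not the LaTeX kernel code (which is written in TeX macros); the class name `ExistenceCache` and the counter `probe_calls` are hypothetical names introduced only for this illustration.

```python
import os


class ExistenceCache:
    """Remember the result of the first existence check for each path."""

    def __init__(self, probe=os.path.exists):
        self._probe = probe    # function that actually touches the filesystem
        self._known = {}       # path -> bool, filled on first lookup
        self.probe_calls = 0   # number of real filesystem lookups performed

    def exists(self, path):
        # Only the first query for a given path reaches the filesystem.
        if path not in self._known:
            self.probe_calls += 1
            self._known[path] = self._probe(path)
        return self._known[path]


cache = ExistenceCache()
for _ in range(10):
    cache.exists("chapter1.tex")   # hypothetical file name
print(cache.probe_calls)
```

Ten queries for the same path result in a single real lookup. The trade-off of any such cache is that a file created or deleted after the first check is not noticed unless the cache entry is invalidated.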
From https://github.com/latex3/latex2e/pull/1063

*-----------------*
READ ME FIRST: Please understand that in most cases we will not be able to merge a pull request because there are a lot of internal activities needed when updating the LaTeX2e sources. If you have a code suggestion please discuss it with the team first.
*-----------------*

In addition, the dtx files require some more work before a patch against the source code can be produced ... I'll investigate that further.
I really prefer a patch against tex/latex/base/latex.ltx ... otherwise we have to wait for TeXLive 2024
Created attachment 872340: boo1219376.patch

Please test this.
Remember to run e.g. fmtutil --all afterwards, as otherwise the patched latex.ltx and latexrelease.sty are not included in the format files.
Thanks for the patch! BTW: It seems as if https://github.com/latex3/latex2e/pull/1082/files is needed as a follow-up (it is linked at the original pull request, as is the associated bug report).

* * *

Unfortunately, it did not help as much as hoped for: building the OpenMP specification still takes 16 min, which, give or take a few seconds, is the time I also measured without the patch. (*) Bummer! I also tried it with my local patch variant, but it didn't help. [For comparison, with TeXLive 2022 (on Ubuntu 22), it takes 41.871 s; I don't have numbers for openSUSE TeXLive < 2023.]

Thus, either it does not help for this testcase, or it requires some other change that I have missed, or it doesn't help at all.

I wonder whether the patch still makes sense, or whether we should just wait for the next TeXLive release, hoping that it will restore the previous performance. It looks as if TeXLive 2024 will arrive in two months.

* * *

(*) Also, looking at the 'strace' output, the number of 'access', 'newfstatat', 'newfstatat' triples did not seem to be lower. (I count 7 to 11 of those triples per file before it is finally opened with 'openat'. All have identical arguments, all return 0, and there are no other calls in between. It seems as if a single 'access' and a single 'newfstatat' should have been enough.)
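The counting described in the footnote can be automated. The following Python sketch tallies how many 'access' and 'newfstatat' calls are made on a path before it is opened with 'openat'. The function name `probes_before_open` and the sample lines are made up for illustration (they are not taken from the actual strace log), and the regular expression assumes the usual strace line layout with the path as the first quoted argument.

```python
import re
from collections import Counter

# Matches e.g.  access("/path", R_OK) = 0
#          or   newfstatat(AT_FDCWD, "/path", {...}, 0) = 0
#          or   openat(AT_FDCWD, "/path", O_RDONLY) = 3
CALL = re.compile(r'\b(access|newfstatat|openat)\((?:AT_FDCWD, )?"([^"]*)"')


def probes_before_open(lines):
    """Return (path, number of access/newfstatat calls) for each openat."""
    pending = Counter()   # probes seen per path since its last openat
    result = []
    for line in lines:
        match = CALL.search(line)
        if not match:
            continue
        name, path = match.groups()
        if name == "openat":
            result.append((path, pending.pop(path, 0)))
        else:
            pending[path] += 1
    return result


# Hand-written sample lines (hypothetical, not from the bug report).
sample = [
    'access("/tmp/a.tex", R_OK) = 0',
    'newfstatat(AT_FDCWD, "/tmp/a.tex", {st_mode=S_IFREG|0644, st_size=10, ...}, 0) = 0',
    'newfstatat(AT_FDCWD, "/tmp/a.tex", {st_mode=S_IFREG|0644, st_size=10, ...}, 0) = 0',
    'access("/tmp/a.tex", R_OK) = 0',
    'newfstatat(AT_FDCWD, "/tmp/a.tex", {st_mode=S_IFREG|0644, st_size=10, ...}, 0) = 0',
    'newfstatat(AT_FDCWD, "/tmp/a.tex", {st_mode=S_IFREG|0644, st_size=10, ...}, 0) = 0',
    'openat(AT_FDCWD, "/tmp/a.tex", O_RDONLY) = 3',
]
print(probes_before_open(sample))
```

For a real log, the lines of the strace output file can be passed in place of `sample`; comparing the counts before and after applying the patch would show whether the caching takes effect.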
Be aware that latex.ltx as well as latexrelease.sty *are* generated from ltfiles.dtx:

grep ltfiles.dtx tex/latex/base/latex*
tex/latex/base/latex.ltx:%% ltfiles.dtx (with options: `2ekernel')
tex/latex/base/latex.ltx:%%% From File: ltfiles.dtx
tex/latex/base/latexrelease.sty:%% ltfiles.dtx (with options: `latexrelease')
tex/latex/base/latexrelease.sty:%%% From File: ltfiles.dtx
> Be aware that latex.ltx as well as latexrelease.sty *are* generated from
> ltfiles.dtx:

Yes, but that is the reason why both your attachment and my experiments included it. I also checked that /var/lib/texmf/web2c/*/*.fmt are all updated via fmtutil --all after those changes.
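A check of this kind can be scripted. The sketch below compares modification times and reports format files that are older than a given source file. The function name `stale_formats` is hypothetical, and the demo runs on temporary files with made-up timestamps so that it is self-contained.

```python
import os
import tempfile
from pathlib import Path


def stale_formats(source, fmt_files):
    """Return the format files that are older than the given source file."""
    src_mtime = Path(source).stat().st_mtime
    return [str(f) for f in fmt_files if Path(f).stat().st_mtime < src_mtime]


# Self-contained demo on temporary files (hypothetical names and timestamps).
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp, "latex.ltx")
    old = Path(tmp, "old.fmt")
    new = Path(tmp, "new.fmt")
    for p in (src, old, new):
        p.write_text("")
    os.utime(src, (2000, 2000))   # source "patched" at t=2000
    os.utime(old, (1000, 1000))   # format built before the patch
    os.utime(new, (3000, 3000))   # format rebuilt after the patch
    demo = stale_formats(src, [old, new])
print([Path(p).name for p in demo])
```

On a real system one could pass the installed latex.ltx (its location can be found with `kpsewhich latex.ltx`) and `Path("/var/lib/texmf/web2c").glob("*/*.fmt")`. An empty result only shows that the formats were rebuilt after the source changed, not that the patched code is actually in effect.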
Then it looks like something is still missing here ... maybe in the WEB sources and their change files and/or in the web2c interface.
Let's wait for TeXLive 2024.