Bug 1214365 - awk not installed by default
Summary: awk not installed by default
Status: RESOLVED WONTFIX
Alias: None
Product: openSUSE Tumbleweed
Classification: openSUSE
Component: Patterns
Version: Current
Hardware: Other Other
Priority: P5 - None  Severity: Normal
Target Milestone: ---
Assignee: Dominique Leuenberger
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-08-17 13:41 UTC by Eric Blake
Modified: 2024-04-29 16:23 UTC
CC List: 2 users

See Also:
Found By: ---
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---
dimstar: needinfo?


Description Eric Blake 2023-08-17 13:41:15 UTC
During CI builds of libnbd, we got an interesting failure when using a Docker image of Tumbleweed that was based on a fairly bare-bones install plus the normal build dependencies documented for libnbd:
https://gitlab.com/nbdkit/libnbd/-/jobs/4878710589

Line 25 in prepping the OS shows:
...
Checking cache for x86_64-opensuse-tumbleweed-prebuilt-env-protected...
Downloading cache.zip from https://storage.googleapis.com/gitlab-com-runners-cache/project/24336857/x86_64-opensuse-tumbleweed-prebuilt-env-protected 
Successfully extracted cache
...
Line 31 does $ cat /packages.txt, which shows neither busybox nor gawk in the set of installed packages
...
Line 633 during the ./configure run shows:
...
config.status: creating podwrapper.pl
./config.status: line 1496: awk: command not found
config.status: error: could not create podwrapper.pl

Most Autoconf/automake-generated configure scripts depend on SOME form of awk being present in the system (it generally does not have to be GNU awk; busybox fits the bill).  But the fact that the bare-bones installation has no awk at all violates assumptions made by the GNU Coding Standards that awk is generically available:
https://www.gnu.org/prep/standards/standards.html#Utilities-in-Makefiles

and as earlier releases of openSUSE satisfied that, it seems like an accident that at least some form of 'awk' is no longer installed by default for a bare-bones Tumbleweed install.

I was able to work around it by adding 'awk' to the set of packages installed in the CI system, as shown here:
https://gitlab.com/libvirt/libvirt-ci/-/merge_requests/424
but the libvirt-ci folks wondered if more than just libnbd is going to be affected (GNU-based configure scripts assuming that awk is omnipresent is just the tip of the iceberg), and if it is instead a bug in Tumbleweed that should be fixed.  Hence this report.
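
For anyone hitting the same failure, a preflight check along these lines might surface the missing tool before ./configure gets as far as config.status. This is only a sketch: the function name require_tool is hypothetical (it is not part of libnbd, Autoconf, or libvirt-ci), and it relies on nothing beyond the POSIX 'command -v' builtin.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this sketch): succeed only if
# the named tool can be found in PATH, otherwise complain on stderr.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "error: required tool '$1' not found in PATH" >&2
    return 1
}

# Possible use at the top of a CI build script, before ./configure:
#   require_tool awk || exit 1
```

Failing early this way reports the missing tool by name, rather than the less direct "could not create podwrapper.pl" seen above.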
Comment 1 Stefan Hundhammer 2023-08-17 14:03:20 UTC
For a moment I thought you were reporting a duplicate of bug #1214277, but that's not the case, although both have a common cause.

As a user, I agree with you in that I'd very much prefer to have common Linux / Unix tools installed.

But OTOH GNU awk is no longer a lightweight package, with its 3.3 MB installed size (although that includes 1.2 MB of doc + info and 1.1 MB of locale data; see 'qdirstat pkg:/gawk'), and not too many other packages explicitly require it:

% rpm -q --whatrequires gawk 
fonts-config-20200609+git0.42e2b1b-150000.4.10.1.noarch
tuned-2.10.0-150400.19.10.noarch
tcsh-6.20.00-4.15.1.x86_64
rpm-build-4.14.3-150300.55.1.x86_64

% rpm -q --whatrequires awk 
docbook_4-4.5-2.18.noarch
plymouth-scripts-0.9.5~git20210406.e554475-150400.3.8.1.noarch
desktop-file-utils-0.26-150400.1.7.x86_64
yast2-installation-4.6.7-lp155.1.1.noarch


So I can understand to some extent the rationale why it's no longer installed by default. Installing "busybox" and (!) creating its symlinks might be an alternative (https://bugzilla.suse.com/show_bug.cgi?id=1214277#c13).
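
To illustrate the "creating its symlinks" part: busybox chooses which applet to run from the name it was invoked under, so a symlink named awk pointing at the busybox binary is enough. A minimal sketch, assuming busybox is already installed; the helper name link_busybox_applet and its arguments are made up for this example and are not an openSUSE or busybox tool.

```shell
#!/bin/sh
# Hypothetical helper: expose a single busybox applet under its own name.
#   $1 = applet name (e.g. awk)
#   $2 = directory (in PATH) where the link should be created
#   $3 = path to the busybox binary (optional; defaults to the one in PATH)
link_busybox_applet() {
    applet=$1
    dest_dir=$2
    # Fall back to looking busybox up in PATH when no path was given.
    busybox_bin=${3:-$(command -v busybox)} || return 1
    [ -n "$busybox_bin" ] || return 1
    # Plain "ln -s" (no -f) so an existing awk is never overwritten.
    ln -s "$busybox_bin" "$dest_dir/$applet"
}

# Possible use (as root):
#   link_busybox_applet awk /usr/local/bin
```

The link is created without -f on purpose, so a system that already has a real awk keeps it.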

But one way or the other, this is not a bug in the installer; it concerns the default software patterns for our different products. Reassigning to component "patterns".
Comment 2 Stefan Hundhammer 2023-08-17 14:06:19 UTC
Since this is about users building their own software and using "configure" scripts, maybe requiring 'awk' in one of the development patterns like "Base Development" (the one with gcc, autoconf, automake etc.) would be a solution.
Comment 3 Eric Blake 2023-08-17 14:38:31 UTC
Having Base Development pull in some form of 'awk' seems reasonable, as that is the environment most likely to be used when running a configure script from a tarball.
Comment 4 Dominique Leuenberger 2023-08-17 15:53:17 UTC
sure - I can add gawk (or /usr/bin/awk, which can be gawk or busybox-awk) as a dep to patterns-devel-base-devel_basis

But as you build your own container: are you even installing this pattern already? I am pretty sure this basic pattern will prove to be 'too large' (as it targets interactive dev work, it also recommends subversion and git)

> zypper info --requires --recommends patterns-devel-base-devel_basis

Information for package patterns-devel-base-devel_basis:
--------------------------------------------------------
Repository     : Main Repository (OSS)
Name           : patterns-devel-base-devel_basis
Version        : 20170319-11.2
Requires       : [18]
    glibc-devel
    gcc
    make
    bison
    zlib-devel
    binutils
    ncurses-devel
    makeinfo
    libtool
    patch
    automake
    m4
    autoconf
    flex
    gettext-tools
    cpp
    gdbm-devel
    pattern() = basesystem
Recommends     : [22]
    libstdc++-devel
    gcc-c++
    git
    subversion
    gmp-devel
    pkg-config
    patch
    pam-devel
    openldap2-devel
    fdupes
    gperf
    bin86
    binutils-devel
    libaio-devel
    libosip2-devel
    db-devel
    e2fsprogs-devel
    gcc-info
    glibc-info
    sparse
    libapparmor-devel
    libgcj-devel

IMHO you're better off explicitly installing awk into your CI container (which seems to be derived from tumbleweed)
Comment 5 Eric Blake 2023-08-17 18:01:58 UTC
(In reply to Dominique Leuenberger from comment #4)
> sure - I can add gawk (or /usr/bin/awk, which can be gawk or busybox-awk)
> as a dep to patterns-devel-base-devel_basis
> 
> But as you build your own container: are you even installing this pattern
> already? I am pretty sure this basic pattern will prove to be 'too large'
> (as it targets interactive dev work, it also recommends subversion and git)
> 
> > zypper info --requires --recommends patterns-devel-base-devel_basis

Looking more at the CI setup, here's what I see:

https://gitlab.com/libvirt/libvirt-ci/-/blob/master/lcitool/facts/targets/opensuse-tumbleweed.yml?ref_type=heads
says to start with:
install:
  unattended_scheme: autoyast
  url: http://download.opensuse.org/tumbleweed/repo/oss
containers:
  base: registry.opensuse.org/opensuse/tumbleweed:latest

then it runs a project-specific script to pull in additional packages, such as:
https://gitlab.com/nbdkit/libnbd/-/blob/master/ci/buildenv/opensuse-tumbleweed.sh?ref_type=heads
which runs:
     zypper dist-upgrade -y
     zypper install -y \
            autoconf \
...

Right now, patterns-devel-base-devel_basis is not in the list, but I can easily tweak things to explicitly add either a pattern or awk to that list when building for Tumbleweed.  In other words, I've already worked around the build failure, but it seems odd that the failure came up in the first place, as every other distro (including openSUSE Leap 15) has at least some form of awk installed from the get-go.


> 
> Information for package patterns-devel-base-devel_basis:
> --------------------------------------------------------
> Repository     : Main Repository (OSS)
> Name           : patterns-devel-base-devel_basis
> Version        : 20170319-11.2
> Requires       : [18]
>     glibc-devel
>     gcc
>     make
>     bison
>     zlib-devel
>     binutils
>     ncurses-devel
>     makeinfo
>     libtool
>     patch
>     automake
>     m4
>     autoconf
>     flex
>     gettext-tools
>     cpp
>     gdbm-devel
>     pattern() = basesystem

The libnbd CI setup is using most of these packages, but not quite all of them (for example, not zlib-devel); it is indeed manually pulling in just the packages it needs (gcc, make, glibc-devel, ...) rather than installing a pattern.  And I get your point that a CI system runs better when it is as lightweight as possible, which is different from a typical user that really will install a desktop environment setup with the right patterns to pull in the usual defaults.  Still, it does not scale as well to make lots of CI projects start having to add explicit dependencies on awk when they have not been doing so in the past.

Fortunately, the Autoconf/Automake configure scripts are happy with busybox awk, which is indeed lighter than gawk.

> IMHO you're better off explicitly installing awk into your CI container
> (which seems to be derived from tumbleweed)

Yep, that's what I've done, regardless of the outcome of this bug.
Comment 6 Dominique Leuenberger 2023-08-18 08:10:42 UTC
the base container is just that - a base.

there also exists an opensuse/distrobox-packaging container for people working on SUSE packages (more containers can be provided)

I don't think the base container should gain more dependencies on its own. awk is not really any more special to the system than any other command

I did add awk to the devel_basis pattern though
  https://build.opensuse.org/request/show/1104642
this should end up in Tumbleweed in a couple of days.

(wontfix because the container won't be changed to have awk - even though some of the discussion led to a change in some pattern)
Comment 7 Michal Suchanek 2024-04-29 16:23:39 UTC
Shouldn't autoconf depend on awk since it assumes it's present?