Bugzilla – Bug 946067
Corruption in RPM DB after fresh 42.1M2 install
Last modified: 2016-03-31 17:27:33 UTC
Immediately after installing, I get the following persistent error when querying all installed RPMs: thebes:~ # rpm -qa | wc error: rpmdbNextIterator: skipping h# 1558 Header V3 RSA/SHA256 Signature, key ID 3dbdc284: BAD 1998 1998 63954 thebes:~ # The output looks normal; if anything is missing, it is not obvious to me. I have seen similar errors before, but never after a fresh install. For the save_y2logs tarball from this installation, see bug#945929.
Well, what is header 1558? Please do: rpm -qi --querybynumber 1558 --nodigest --nosignature
(The output basically means that the pubkey that was used to sign the package was not imported into the rpmdb)
Here's what I get (with description omitted): rogers@thebes> rpm -qi --querybynumber 1558 --nodigest --nosignature rpm -qi --querybynumber 1558 --nodigest --nosignature Name : libreoffice Version : 4.3.5.2 Release : 12.1 Architecture: x86_64 Install Date: Mon 14 Sep 2015 03:56:27 PM EDT Group : Productivity/Office/Suite Size : 222033984 License : Apache-2.0 and Artistic-1.0 and BSD-3-Clause and BSD-4-Clause and GPL-2.0+ and LPPL-1.3c and LGPL-2.1+ and LGPL-3.0 and MPL-1.1 and MIT and SUSE-Public-Domain and W3C Signature : RSA/SHA256, Thu 27 Aug 2015 12:22:01 PM EDT, Key ID b88b2fd43dbdc284 Source RPM : libreoffice-4.3.5.2-12.1.src.rpm Build Date : Thu 27 Aug 2015 11:49:58 AM EDT Build Host : cloud138 Relocations : (not relocatable) Vendor : obs://build.opensuse.org/openSUSE:42 URL : http://www.documentfoundation.org/ Summary : A Free Office Suite (Framework) Description : . . . Distribution: openSUSE:42 rogers@thebes>
Hmm, that's the standard key, so it *is* in the database. The problem must be somewhere else. Do you get more information when you do: rpm -q -v -v -v --querybynumber 1558 Thanks!
Could you please also attach the output of: /usr/lib/rpm/rpmdumpheader libreoffice-4.3.5.2-12.1 (This assumes you have the deltarpm package installed)
Here's the rpm output: thebes:~ # rpm -q -v -v -v --querybynumber 1558 D: package record number: 1558 D: loading keyring from pubkeys in /var/lib/rpm/pubkeys/*.key D: couldn't find any keys in /var/lib/rpm/pubkeys/*.key D: loading keyring from rpmdb D: opening db environment /var/lib/rpm cdb:private:0x201 D: opening db index /var/lib/rpm/Packages 0x400 mode=0x0 D: locked db index /var/lib/rpm/Packages D: opening db index /var/lib/rpm/Name nofsync:0x400 mode=0x0 D: read h# 2007 Header SHA1 digest: OK (121869ebdc8e8ddf647915a929aa17a985500257) D: added key gpg-pubkey-3dbdc284-53674dd4 to keyring D: Using legacy gpg-pubkey(s) from rpmdb D: read h# 1558 Header V3 RSA/SHA256 Signature, key ID 3dbdc284: OK libreoffice-4.3.5.2-12.1.x86_64 D: closed db index /var/lib/rpm/Name D: closed db index /var/lib/rpm/Packages D: closed db environment /var/lib/rpm thebes:~ # The rpmdumpheader output is almost 700K, so I will add it as an attachment.
Created attachment 648623 [details] rpmdumpheader output
Wait a sec, why didn't you get that 'Signature, key ID 3dbdc284: BAD' error when you did the --querybynumber? So you get an error with 'rpm -qa', but no error if you query just the single header??
Actually, it gets even more wierd: I'm now seeing two errors, fingering different headers: thebes:~ # rpm -qa --last | head error: rpmdbNextIterator: skipping h# 624 Header V3 RSA/SHA256 Signature, key ID 3dbdc284: BAD error: rpmdbNextIterator: skipping h# 638 Header V3 RSA/SHA256 Signature, key ID 3dbdc284: BAD baloo5-tools-5.9.2-5.1.x86_64 Wed Sep 16 10:08:45 2015 yast2-nfs-client-3.1.16-4.2.noarch Wed Sep 16 10:08:44 2015 yast2-metapackage-handler-3.1.4-5.1.noarch Wed Sep 16 10:07:57 2015 libKF5WindowSystem5-lang-5.12.0-3.2.noarch Wed Sep 16 10:07:57 2015 avahi-lang-0.6.31-11.1.noarch Wed Sep 16 10:07:56 2015 gpg-pubkey-3dbdc284-53674dd4 Wed Sep 16 10:07:07 2015 perl-Bio-Graphics-2.24-5.noarch Tue Sep 15 15:21:36 2015 perl-Data-Stag-0.11-2.noarch Tue Sep 15 15:21:19 2015 perl-DBIx-Roles-1.04-3.noarch Tue Sep 15 15:00:52 2015 perl-SQL-Abstract-1.22-1.x86_64 Tue Sep 15 14:52:13 2015 thebes:~ # This also shows that I haven't installed or upgraded any packages since I submitted the bug report (nor do I recall removing any). And if I do "rpm -q -v -v -v --querybynumber 624" (or 638), it keeps complaining about the same header until I stop it with C-c.
Weird. Does 'valgrind rpm -qa' show anything suspicious?
Nothing that looks suspicious to me (though I have no experience interpreting valgrind output). Here are the summaries: ==2648== HEAP SUMMARY: ==2648== in use at exit: 3,755 bytes in 4 blocks ==2648== total heap usage: 88,648 allocs, 88,644 frees, 119,589,832 bytes allocated ==2648== ==2648== LEAK SUMMARY: ==2648== definitely lost: 0 bytes in 0 blocks ==2648== indirectly lost: 0 bytes in 0 blocks ==2648== possibly lost: 0 bytes in 0 blocks ==2648== still reachable: 3,755 bytes in 4 blocks ==2648== suppressed: 0 bytes in 0 blocks ==2648== Rerun with --leak-check=full to see details of leaked memory ==2648== ==2648== For counts of detected and suppressed errors, rerun with: -v ==2648== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) Running with greater levels of detail doesn't show anything more that jumps out at me. Note that I am still getting the same warnings about skipping headers 624 and 638 after doing "zypper install valgrind".
I experienced a similar problem: oct:~ # zypper dup --from openSUSE-Tumbleweed-Oss --from openSUSE-Tumbleweed-Non-Oss ... ( 8/1039) Instaluje se: bundle-lang-kde-en-13.2-12.4 ..............................................................[chyba] Instalace bundle-lang-kde-en-13.2-12.4 selhala: Chyba: Subprocess failed. Error: RPM selhalo: Příkaz byl ukončen s kódem 1. (A) Přerušit, (R) opakovat, (I) ignorovat? [a/r/i] (a): r Instalace bundle-lang-kde-en-13.2-12.4 selhala: Chyba: Subprocess failed. Error: RPM selhalo: error: unpacking of archive failed on file /usr/share/doc-bundle/kde/HTML/en/amarok/Face-smile.png: cpio: rename error: bundle-lang-kde-en-13.2-12.4.noarch: install failed error: bundle-lang-kde-en-13.2-12.2.noarch: erase skipped (A) Přerušit, (R) opakovat, (I) ignorovat? [a/r/i] (a): i ( 9/1039) Instaluje se: geda-doc-1.8.2-5.4 ........................................................................[chyba] Instalace geda-doc-1.8.2-5.4 selhala: Chyba: Subprocess failed. Error: RPM selhalo: error: unpacking of archive failed on file /usr/share/doc/packages/geda-gaf/wiki/geda-bom_readme.ru.html: cpio: rename error: geda-doc-1.8.2-5.4.noarch: install failed Instalace geda-symbols-1.8.2-5.4 selhala: Chyba: Subprocess failed. Error: RPM selhalo: error: unpacking of archive failed on file /usr/share/gEDA/sym/4000/4000-1.sym: cpio: rename error: geda-symbols-1.8.2-5.4.noarch: install failed error: geda-symbols-1.8.2-5.3.noarch: erase skipped (A) Přerušit, (R) opakovat, (I) ignorovat? [a/r/i] (a): i ( 11/1039) Instaluje se: glibc-i18ndata-2.22-2.2 ...................................................................[chyba] Instalace glibc-i18ndata-2.22-2.2 selhala: Chyba: Subprocess failed. Error: RPM selhalo: error: unpacking of archive failed on file /usr/share/i18n/charmaps/ANSI_X3.110-1983.gz: cpio: rename error: glibc-i18ndata-2.22-2.2.noarch: install failed error: glibc-i18ndata-2.22-1.1.noarch: erase skipped (A) Přerušit, (R) opakovat, (I) ignorovat? [a/r/i] (a): Došlo k chybě v průběhu instalace nebo odstraňování nebo po instalaci či odstranění balíčků: Installation aborted by user ... oct:~ # LANG=C zypper dup --from openSUSE-Tumbleweed-Oss --from openSUSE-Tumbleweed-Non-Oss Target initialization failed: Failed to cache rpm database (1).
In my case, it has something to do with the file system, brtfs or snapper: oct:~ # cp -a /var/lib/rpm /var/lib/rpm.save cp: cannot create directory '/var/lib/rpm.save': No space left on device oct:~ # df /var/lib Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb1 41945088 37520524 4374356 90% /
Well, rpm can't do much against broken file systems...
David: What do you think about comment 12? Real disk space taken: 58% Disk space taken due to existing automatic snapshots: 90% However it reported 10% free space, it was not possible to add any more files. rpm failed on mv.
Do you have output of 'btrfs fi df' ? This looks like metadata is exhausted while there's still space in the data.
David: No, I don't. After snapper delete, everything works again. I'll do that if it happens again.
It happened again now. But today df shows 100%, so it is not exactly the same case as before. Disk full problem is apparently caused by snapper/zypp configuration keeping lots of obsolete snapshots. And I again seen obscure RPM error: oct:~ # zypper dup Warning: You are about to do a distribution upgrade with all enabled repositories. Make sure these repositories are compatible before you continue. See 'man zypper' for more information about this command. Download (curl) error for 'http://download.opensuse.org/tumbleweed/repo/oss/content': Error code: Write error Error message: Failed writing body (2636 != 2920) Abort, retry, ignore? [a/r/i/? shows all options] (a): ABORT request: Aborting requested by user oct:~ # df / Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb1 41945088 41854180 28 100% / oct:~ # btrfs fi df / Data, single: total=37.22GiB, used=37.22GiB System, single: total=32.00MiB, used=16.00KiB Metadata, single: total=2.75GiB, used=2.20GiB GlobalReserve, single: total=512.00MiB, used=0.00B oct:~ # for i in $(seq 100) ; do snapper delete $i ; done 2>/dev/null oct:~ # sleep 120 oct:~ # df / Filesystem 1K-blocks Used Available Use% Mounted on /dev/sdb1 41945088 26101036 14139828 65% / oct:~ # btrfs fi df / Data, single: total=37.22GiB, used=23.73GiB System, single: total=32.00MiB, used=16.00KiB Metadata, single: total=2.75GiB, used=880.70MiB GlobalReserve, single: total=304.00MiB, used=0.00B oct:~ #
And reproduced with bad free report: oct:~ # df / Souborový systém 1K bloků Užito Volné Uži% Připojeno do /dev/sdb1 41945088 40815764 1087420 98% / oct:~ # btrfs fi df / Data, single: total=37.72GiB, used=36.68GiB System, single: total=32.00MiB, used=16.00KiB Metadata, single: total=2.25GiB, used=1.74GiB GlobalReserve, single: total=512.00MiB, used=0.00B oct:~ # uname -r 4.5.0-1-default oct:~ # btrfs fi us / Overall: Device size: 40.00GiB Device allocated: 40.00GiB Device unallocated: 1.00MiB Device missing: 0.00B Used: 38.42GiB Free (estimated): 1.04GiB (min: 1.04GiB) Data ratio: 1.00 Metadata ratio: 1.00 Global reserve: 512.00MiB (used: 0.00B) Data,single: Size:37.72GiB, Used:36.68GiB /dev/sdb1 37.72GiB Metadata,single: Size:2.25GiB, Used:1.74GiB /dev/sdb1 2.25GiB System,single: Size:32.00MiB, Used:16.00KiB /dev/sdb1 32.00MiB Unallocated: /dev/sdb1 1.00MiB oct:~ # => 2 bugs are here 1) btrfs reports bad free 2) snapper is ill configured and repeatedly fills the disk. After removing all unneeded snapshots, I am back at 65%.
This time, the system left unbootable after the error, and I had to re-install. But rpm seems to have no chance to prevent it, as it gets incorrect information about free space.