Bug 118717 - remove duplicate files from ftp servers
Summary: remove duplicate files from ftp servers
Status: RESOLVED FIXED
Alias: None
Product: SUSE Linux 10.1
Classification: openSUSE
Component: Other
Version: unspecified
Hardware: All All
Importance: P5 - None : Enhancement
Target Milestone: ---
Assignee: Roman Drahtmueller
QA Contact: E-mail List
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2005-09-24 18:48 UTC by Forgotten User OS1JNCFbCX
Modified: 2008-11-12 09:50 UTC

See Also:
Found By: Beta-Customer
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
The mentioned script (3.36 KB, text/plain)
2005-09-24 19:59 UTC, Forgotten User OS1JNCFbCX
Description Forgotten User OS1JNCFbCX 2005-09-24 18:48:30 UTC
This is actually not a specific issue with SUSE Linux 10.1 but a generic issue
with contents on ftp servers ftp.suse.com and ftp.opensuse.org.

On both servers many files are stored multiple times in various directories.
The number of duplicate files will likely keep increasing in the near future,
since you created two separate update trees for the almost identical releases
10.0 and 10.0-OSS.

I have hacked together a small script that scans specified directories and
replaces duplicates of regular files with hard links. If you run this script on
a regular basis on the primary staging servers of suse.com and opensuse.org, you
could avoid a lot of unnecessary sync load and disk usage on the mirror network.
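
For readers without access to the attachment, the idea can be sketched roughly
as follows. This is not the attached script, only a minimal illustration in
Python under a few assumptions: duplicates are detected by size plus MD5, each
candidate pair is compared byte by byte before linking, and the names
md5_of and link_duplicates are hypothetical.

```python
import filecmp
import hashlib
import os


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def link_duplicates(roots):
    """Replace duplicate regular files under roots with hard links.

    Returns the number of bytes freed. Hypothetical helper, not the
    attached script.
    """
    first_seen = {}  # (size, md5) -> path of the first copy found
    saved = 0
    for root in roots:
        for dirpath, _dirs, files in os.walk(root):
            for name in sorted(files):
                path = os.path.join(dirpath, name)
                # Only regular files are candidates; skip symlinks.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                info = os.stat(path)
                key = (info.st_size, md5_of(path))
                original = first_seen.setdefault(key, path)
                if original == path:
                    continue
                orig_info = os.stat(original)
                if orig_info.st_ino == info.st_ino and orig_info.st_dev == info.st_dev:
                    continue  # already hard-linked to the first copy
                if orig_info.st_dev != info.st_dev:
                    continue  # hard links cannot cross filesystems
                # Guard against hash collisions with a full comparison.
                if not filecmp.cmp(original, path, shallow=False):
                    continue
                # Link under a temporary name, then rename over the
                # duplicate so the path never disappears.
                tmp = path + ".dedup-tmp"
                os.link(original, tmp)
                os.replace(tmp, path)
                saved += info.st_size
    return saved
```

A call such as link_duplicates(["/srv/ftp/pub/suse"]) (the path is only an
example) would return the number of bytes freed. Note that the linked copies
end up sharing one inode, so they also share ownership, permissions and
timestamps.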

I have tested the script on a mirror of ftp.suse.com/pub/suse and removed 3GB(!)
of duplicate files that way. For ftp.opensuse.org/pub/opensuse the saving was
about 700MB.

The first run of the script takes some time because MD5 sums must be calculated
for every file. Later runs are much faster because the tool stores the MD5 sums
it has already calculated in a cache file.
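
The caching could look something like the sketch below. The format of the
real cache file is not described here, so everything in this example is an
assumption: a JSON file keyed by path, with an entry treated as valid only
while the file's size and modification time are unchanged. The function names
are hypothetical.

```python
import hashlib
import json
import os


def load_cache(cache_path):
    """Load the cache dict, or start empty if it is missing or unreadable."""
    try:
        with open(cache_path, "r", encoding="utf-8") as handle:
            return json.load(handle)
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_cache(cache, cache_path):
    """Write the cache to a temporary file, then rename it into place."""
    tmp = cache_path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as handle:
        json.dump(cache, handle)
    os.replace(tmp, cache_path)


def cached_md5(path, cache):
    """Return the MD5 of path, reusing the cached value if still valid."""
    info = os.stat(path)
    entry = cache.get(path)
    if (
        entry is not None
        and entry["size"] == info.st_size
        and entry["mtime_ns"] == info.st_mtime_ns
    ):
        return entry["md5"]  # file unchanged since it was last hashed
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    cache[path] = {
        "size": info.st_size,
        "mtime_ns": info.st_mtime_ns,
        "md5": digest.hexdigest(),
    }
    return cache[path]["md5"]
```

A run would call load_cache once, use cached_md5 for every file, and finish
with save_cache. Relying on size and modification time means a file rewritten
with the same size and timestamp would not be rehashed, which is the usual
trade-off for this kind of cache.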
Comment 1 Forgotten User OS1JNCFbCX 2005-09-24 19:59:28 UTC
Created attachment 50785 [details]
The mentioned script
Comment 2 Roman Drahtmueller 2008-11-12 09:50:02 UTC
Fixed, for the most part.