Bug 181602 - download timeout not active ?
Summary: download timeout not active ?
Status: RESOLVED FIXED
Duplicates: 105996 181924 182167
Alias: None
Product: SUSE Linux 10.1
Classification: openSUSE
Component: libzypp
Version: Final
Hardware: Other Other
Importance: P5 - None : Critical
Target Milestone: ---
Assignee: Marius Tomaschewski
QA Contact: Mauro Parra Miranda
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-06-04 12:22 UTC by Graham Anderson
Modified: 2007-01-22 12:52 UTC
CC List: 9 users

See Also:
Found By: Other
Services Priority:
Business Priority:
Blocker: ---
Marketing QA Status: ---
IT Deployment: ---


Attachments
trace dumps, zmd-*.log (4.83 MB, application/x-bzip)
2006-06-04 12:25 UTC, Graham Anderson
New zmd logfiles (4.42 MB, application/x-tbz)
2006-06-13 09:49 UTC, Michael Calmer
Proposed zypp transfer timeout fix (5.37 KB, patch)
2006-06-14 09:08 UTC, Marius Tomaschewski

Description Graham Anderson 2006-06-04 12:22:15 UTC
I installed AJ's latest test builds (2006-06-02) of the update stack.

Service refresh/maintenance seemed fine at first, but this morning the update-status helper consumed 100% of available memory and 50% of available swap (approx. 800MB of free RAM out of 1GB total, approx. 500MB of swap out of 1GB available).

I managed to attach a trace to the update-status process a couple of times before I had to kill zmd/update-status to get a usable system back.

In the attached archive are two trace dumps, zmd-*.log, and an excerpt from top.

If it's of any relevance, /var/lib/zmd/zmd.db has ballooned in size to 586MB
Comment 1 Graham Anderson 2006-06-04 12:25:42 UTC
Created attachment 87054
trace dumps, zmd-*.log
Comment 2 Tambet Ingo 2006-06-05 16:11:47 UTC
I think I saw a similar bug report last week but I can't find it. The cause was that yast adds zypp services over and over again, using a slightly different URI each time, so the database had something like 20 copies of each zypp repository.
Comment 3 Graham Anderson 2006-06-05 16:29:48 UTC
These are my current registered services

# | Status | Type | Name                    | URI
--+--------+------+-------------------------+-----------------------------------
1 | Active | ZYPP | SUSE-Linux-10.1         | dir:///home/cenuij/suse/cd1
2 | Active | ZYPP | SUSE-Linux-10.1-Updates | http://suse.inode.at/pub/update...
3 | Active | ZYPP | SUSE-Linux-10.1-online  | http://ftp4.gwdg.de/pub/opensus...
4 | Active | YUM  | guru                    | http://ftp.skynet.be/pub/suser-...
5 | Active | YUM  | packman                 | http://packman.inode.at/suse/10.1/
6 | Active | YUM  | KDE3-Backports          | http://software.opensuse.org/do...

The only service added since I installed the latest test build is #6. All services were added with 'rug sa', except the update service, which was added with the suse_register script.

Would a copy of my zmd.db be of further help?
Comment 4 Klaus Kämpf 2006-06-06 09:54:47 UTC
*** Bug 181924 has been marked as a duplicate of this bug. ***
Comment 8 Klaus Kämpf 2006-06-06 10:22:04 UTC
rpm -q --changelog libzypp| head
* Thu Jun 01 2006 - kkaempf@suse.de
- compute status for scripts and messages so their freshens get
  properly honored (aj with postgresql-server)
- rev 3494

* Thu Jun 01 2006 - dmacvicar@suse.de
- revert not-used-yet rpmdb timestamp, as
  it broke rpmdb::init(). (#180040)
- rev 3490
Comment 9 Klaus Kämpf 2006-06-06 10:26:03 UTC
Hmm, running YaST shows no problems in setting up the sources. It must be something in update-status ?!
Comment 10 Klaus Kämpf 2006-06-06 10:32:21 UTC
It's the sqlite database, which is > 800MB in size.
Comment 11 Klaus Kämpf 2006-06-06 10:33:16 UTC
It's NOT the zypp cache, it's update-status recreating its world from the sqlite database.
Comment 12 Klaus Kämpf 2006-06-06 10:53:22 UTC
It looks like parse-metadata was called for the catalogs without removing previous entries from the database:
sqlite> select count(*) from resolvables where catalog='ftp://10.10.0.100/install/SLP/SUSE-10.1-DVD9-RC5/i386/DVD1?alias=SUSE-Linux-10.1-DVD9-x86-x86_64-10.1-0-20060511-104056';
173240
Comment 13 Klaus Kämpf 2006-06-06 11:44:39 UTC
I added a "delete from resolvables where catalog=..." to the helpers whenever resolvables are written to a catalog.
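
The delete-before-write approach described above can be sketched as follows. This is only a minimal illustration using Python's sqlite3 module, not the actual helper code: the two-column table layout, the function name replace_catalog, and the catalog URIs are assumptions made for the example.

```python
import sqlite3

def replace_catalog(conn, catalog, names):
    """Replace all rows of one catalog: delete the old entries, then insert."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute("DELETE FROM resolvables WHERE catalog = ?", (catalog,))
        conn.executemany(
            "INSERT INTO resolvables (catalog, name) VALUES (?, ?)",
            [(catalog, name) for name in names],
        )

conn = sqlite3.connect(":memory:")
# Hypothetical schema; the real zmd.db table has more columns.
conn.execute("CREATE TABLE resolvables (catalog TEXT, name TEXT)")

# Parsing the same catalog three times no longer multiplies its rows.
for _ in range(3):
    replace_catalog(conn, "dir:///example/cd1", ["kernel", "glibc"])
replace_catalog(conn, "http://example.org/updates", ["bash"])

count = conn.execute(
    "SELECT count(*) FROM resolvables WHERE catalog = ?",
    ("dir:///example/cd1",),
).fetchone()[0]
total = conn.execute("SELECT count(*) FROM resolvables").fetchone()[0]
```

Running the delete and the inserts in one transaction means a failed refresh leaves the previous rows in place rather than an empty catalog.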
Comment 15 Joe Shaw 2006-06-07 21:10:14 UTC
*** Bug 182167 has been marked as a duplicate of this bug. ***
Comment 16 Michael Calmer 2006-06-13 09:45:29 UTC
Klaus's fix does not really seem to work.

I now have an update-status process that has been running for 5 days. It does not consume much memory, but it does not finish.

I found this in zmd-messages:
08 Jun 2006 21:08:27 INFO  RedCarpetBackend+RCProgress terminate called after throwing an instance of 'zypp::Exception'
08 Jun 2006 21:08:27 INFO  RedCarpetBackend+RCProgress   what():  Can't check if source has changed or not. Aborting refresh.
08 Jun 2006 21:10:50 INFO  RedCarpetBackend     Updating status of patches...

The result is that zmd answers neither rug commands nor zen-updater commands.

I attach the zmd-*.logs
Comment 17 Michael Calmer 2006-06-13 09:49:51 UTC
Created attachment 88954
New zmd logfiles
Comment 22 Klaus Kämpf 2006-06-13 11:27:59 UTC
It's stuck in download:

opy):612 ./media.1/directory.yast
opy):645 URL: http://download.opensuse.org/distribution/SL-10.1/inst-source/media.1/directory.yast
opy):702 dest: /var/adm/mount/AP_0x00000006/media.1/directory.yast
opy):703 temp: /var/adm/mount/AP_0x00000006/media.1/directory.yast.new.zypp.ednbkN

-> Marius


And it downloads because your source cache is broken:
> ls -l /var/lib/zypp/cache/Source.hDjLLN
total 0

Comment 23 Klaus Kämpf 2006-06-13 11:34:10 UTC
yast sw_single also fails to start up properly because of the broken cache on the system.
Comment 26 Marius Tomaschewski 2006-06-13 14:30:34 UTC
Reassigning to maintainer of libcurl.
Comment 27 Klaus Kämpf 2006-06-13 14:35:47 UTC
IMHO, we should set CURLOPT_TIMEOUT to a rather large value (like 24h), but we should set it nevertheless as this bug report shows.

Short calculation: 24h = 86400 seconds; at a transfer speed of 1k/sec (9600 baud), that allows about 88MB to be downloaded in one day. Kernel (or OpenOffice) updates need a better connection.

Ideally, make it a sysconfig variable (/etc/sysconfig/onlineupdate): default to 24h if not set, and set no timeout if the variable is set to 0.

Reassign to Marius
Needinfo to libcurl maintainer
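
A rough sketch of the proposed rule (default to 24h if unset, no timeout if set to 0), assuming a shell-style KEY="value" file. The variable name DOWNLOAD_TIMEOUT is hypothetical, as is falling back to the default for malformed values; neither comes from this report.

```python
DEFAULT_TIMEOUT = 24 * 60 * 60  # 24h in seconds

def timeout_from_sysconfig(text, key="DOWNLOAD_TIMEOUT"):
    """Return the timeout in seconds, or None when the timeout is disabled."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        if name.strip() != key:
            continue
        value = value.strip().strip("\"'")
        if not value.isdigit():
            return DEFAULT_TIMEOUT  # empty or malformed: fall back (assumption)
        seconds = int(value)
        return None if seconds == 0 else seconds
    return DEFAULT_TIMEOUT  # variable not present at all

unset = timeout_from_sysconfig("# no timeout configured\n")
disabled = timeout_from_sysconfig('DOWNLOAD_TIMEOUT="0"\n')
custom = timeout_from_sysconfig('DOWNLOAD_TIMEOUT="600"\n')

# The short calculation above: 1 KiB/s sustained for the default 24h window.
bytes_per_day = DEFAULT_TIMEOUT * 1024  # 88,473,600 bytes, roughly 88MB
```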
Comment 29 Michal Marek 2006-06-13 15:02:46 UTC
So, do I understand it correctly that you need to time out when there is no
incoming data for a longer period of time, but you don't want to break slow
downloads of large packages?

I think a progress callback might help you. It should get called periodically,
and you could check whether the amount of downloaded data (second parameter of
the callback) changes. If it stays the same for a longer time, return non-zero
and the transfer will be aborted.

See http://curl.haxx.se/libcurl/c/curl_easy_setopt.html#CURLOPTPROGRESSFUNCTION

Hope that helps.
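
The stall check suggested above could look roughly like this. It is a language-neutral sketch in plain Python rather than libzypp code: the class name StallDetector is made up, and the argument order follows libcurl's progress callback (dltotal, dlnow, ultotal, ulnow) without the client pointer. The clock is injectable so the logic can be exercised without a real transfer.

```python
import time

class StallDetector:
    """Abort a transfer when the downloaded byte count stops changing."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout    # seconds without progress; 0 disables the check
        self.clock = clock
        self.last_bytes = None    # byte count seen at the last change
        self.last_change = None   # time of the last change

    def progress(self, dltotal, dlnow, ultotal=0, ulnow=0):
        """Return 0 to continue, non-zero to ask the library to abort."""
        now = self.clock()
        if self.last_bytes is None or dlnow != self.last_bytes:
            self.last_bytes = dlnow
            self.last_change = now
            return 0
        if self.timeout and now - self.last_change >= self.timeout:
            return 1
        return 0

# Simulated transfer: data arrives until t=60s, then the connection stalls.
fake_now = [0.0]
detector = StallDetector(timeout=180, clock=lambda: fake_now[0])
results = []
for t, received in [(0, 0), (60, 4096), (120, 4096), (239, 4096), (240, 4096)]:
    fake_now[0] = t
    results.append(detector.progress(0, received))
```

Because only the absence of progress is timed, a slow but steady download of a large package is never cut off, which is the distinction asked about above.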
Comment 30 Michal Marek 2006-06-13 15:45:38 UTC
BTW I see you already use the progress callback in libzypp, it just always
returns 0 (if I read the source correctly). What about implementing the check
described above in the MediaCurl::progressCallback (or the report object)?
Comment 31 Marius Tomaschewski 2006-06-14 08:28:17 UTC
Yes, I'm currently implementing it. It looks like that would work.
Comment 32 Marius Tomaschewski 2006-06-14 09:08:39 UTC
Created attachment 89293
Proposed zypp transfer timeout fix
Comment 33 Marius Tomaschewski 2006-06-14 09:13:24 UTC
Additionally, I plan to add a repository of default / global settings
to the MediaManager, so they'll be used if there is no url parameter.
Comment 34 Marius Tomaschewski 2006-06-14 10:56:32 UTC
Thanks Michal!

I've submitted the above patch to SVN head - it seems to work fine.
Michael, let me know if you find any problems during your tests.

The current transfer default is set to 180 seconds and can be changed
using the "timeout" url query parameter (max 3600 sec, 0 to disable).
We will discuss where to put the global / default setting for it (e.g.
some sysconfig file).

In addition to the timeout, the receiver of the DownloadProgressReport
can also abort the transfer by returning 1 in the progress method.
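
As an illustration of the "timeout" url query parameter described above (default 180 seconds, max 3600, 0 to disable), here is a small sketch. The function name transfer_timeout is made up, and two behaviours are assumptions rather than facts from the patch: values above the maximum are clamped, and unparsable or negative values fall back to the default.

```python
from urllib.parse import urlsplit, parse_qs

DEFAULT_TIMEOUT = 180   # seconds, default stated in this comment
MAX_TIMEOUT = 3600      # seconds, maximum stated in this comment

def transfer_timeout(url):
    """Return the transfer timeout in seconds for a source URL; 0 means disabled."""
    values = parse_qs(urlsplit(url).query).get("timeout")
    if not values:
        return DEFAULT_TIMEOUT
    try:
        seconds = int(values[-1])
    except ValueError:
        return DEFAULT_TIMEOUT  # assumption: ignore unparsable values
    if seconds < 0:
        return DEFAULT_TIMEOUT  # assumption: ignore negative values
    return min(seconds, MAX_TIMEOUT)  # assumption: clamp instead of rejecting

default = transfer_timeout("http://example.org/repo")
disabled = transfer_timeout("http://example.org/repo?timeout=0")
custom = transfer_timeout("http://example.org/repo?alias=x&timeout=600")
clamped = transfer_timeout("http://example.org/repo?timeout=99999")
```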
Comment 35 Marius Tomaschewski 2006-06-14 12:44:17 UTC
In STABLE now.
Comment 36 Klaus Kämpf 2006-06-23 14:33:10 UTC
*** Bug 187779 has been marked as a duplicate of this bug. ***
Comment 37 Michael Andres 2007-01-22 12:52:43 UTC
*** Bug 105996 has been marked as a duplicate of this bug. ***