Bugzilla – Bug 105796
beagle crawl runs inconveniently, uses excessive memory
Last modified: 2005-09-02 18:22:33 UTC
beagle (or its Mono-run daemon) slows my test machine (i386, 1.3 GHz, 256 MB) down so much that it's nearly unusable! Since beagle (and its daemon) is started by default on a full install, I end up with a nearly unusable desktop.
Are you using the stock kernel? If so, inotify should be enabled and the beagled overhead should be minimal.
Frank: What does 'top' report? Is beagle consuming lots of CPU, or is the slowness caused by excessive memory consumption that leads to swapping?
Also, can you see if a task named BuildIndex is running from cron?
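To answer the CPU-versus-memory question with numbers rather than impressions, the virtual and resident sizes that 'top' shows can also be read from /proc. The following is a minimal sketch, assuming a Linux /proc filesystem; the helper names parse_status_kb and read_status_kb are made up for illustration and are not part of beagle.

```python
import re


def parse_status_kb(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        match = re.match(r"(VmSize|VmRSS):\s+(\d+)\s+kB", line)
        if match:
            fields[match.group(1)] = int(match.group(2))
    return fields


def read_status_kb(pid):
    """Read and parse /proc/<pid>/status for a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_status_kb(handle.read())
```

A resident size (VmRSS) close to the machine's physical RAM, as on the 256 MB test box, would point to swapping rather than raw CPU load.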
I just fixed a really nasty inefficiency in the evolution mail backend, so if you use IMAP on evolution and have a ton of mail there, this may have been the issue.
I see beagle-crawl-system running from cron. It had eaten almost 800 MB of memory by the time I killed it.
Hi, we always use the stock kernel for testing, but the overhead was not minimal; it was extreme. Top showed both: beagled (or its Mono processes) took most of the CPU and nearly all of the system's memory. It's surely not an Evolution mail problem either. This happens on a fresh install after the first login (so Evolution was never used and there is no mail on the system).
I think beagle-crawl-system had a lot to do with this; it was starting within 15 minutes of the first system boot, when it was intended to be something that runs in the middle of the night as it is heavily IO bound. I've fixed it upstream so that it'll only run at 4:30 am... this will go into beta4.
105784 is turning into a dupe of this, though that wasn't apparent initially. Installing beagle-index has been proposed in that bug. I'm going to mark that one as a dupe of this. I'm also going to make this bug more widely visible. Joe: people are also reporting insane memory usage (AJ reports 990 MB at one point), which makes the process even more IO bound once it starts thrashing. Any idea what's up there? FWIW, everyone who has provided details seems to be describing the cron crawl (perhaps specifically in docs) rather than home directory indexing.
*** Bug 105784 has been marked as a duplicate of this bug. ***
Adjusting severity, component, and summary.
Adding some dropped Cc:'s.
Joe, is the help system crawler also using ionice?
Mark: I'm not sure what's up with the memory size. My guess is a bug in the filter. I'll take a look. JP: beagle-build-index isn't using the ionice stuff. I'll add that; it's a one-line addition.
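The thread does not say how beagle applies ionice internally. As a rough illustration only, the same effect can be had from the outside by prefixing a command with the util-linux ionice tool (class 3 is the idle I/O class) and nice. The helper name low_priority_argv is hypothetical, and beagle-build-index is shown without arguments because its real options are not given here.

```python
import shlex


def low_priority_argv(argv, io_class=3, niceness=19):
    """Prefix argv so it runs in the idle I/O class with low CPU priority.

    Assumes the util-linux `ionice` and coreutils `nice` tools are installed.
    """
    return ["ionice", "-c", str(io_class), "nice", "-n", str(niceness), *argv]


# Build (but do not run) the wrapped command line.
print(shlex.join(low_priority_argv(["beagle-build-index"])))
# prints: ionice -c 3 nice -n 19 beagle-build-index
```

The resulting list could be passed to subprocess.run; whether the idle class has any effect depends on the kernel's I/O scheduler.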
Apparently at least one of the memory-bloat bugs was from indexing a home directory rather than docs, which opens up the list of candidate buggy filters (and makes it harder to reproduce -- contents of home directories vary wildly). I tried this on a fresh beta3 install with a home directory including a directory full of .txt, .pdf, .html, etc. (mounted in ~/data), and mono-beagled is currently consuming 102m virt, 86m res, 13m shr, having seemingly finished indexing.
Filters won't cause beagled to grow; a filter bug would affect the index-helper instead. If beagled is growing, it's more likely a bug in a backend than a filter. Narrowing it down with --list-backends, --allow-backend, and --deny-backend would give us a good idea which it is. Also, running beagled with --debug-memory will cause it to log memory usage data.
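The narrowing-down procedure above amounts to one beagled run per backend with memory logging on. Here is a minimal sketch that only builds the command lines; it assumes --allow-backend restricts the daemon to the named backend, as the comment implies. The function name isolation_runs is hypothetical, and the backend names below are placeholders; the real list comes from beagled --list-backends.

```python
def isolation_runs(backends, daemon="beagled"):
    """Return one argv per backend: allow only that backend, log memory usage."""
    return [
        [daemon, "--allow-backend", name, "--debug-memory"]
        for name in backends
    ]


# Placeholder backend names; substitute the output of `beagled --list-backends`.
for argv in isolation_runs(["Files", "EvolutionMail"]):
    print(" ".join(argv))
```

Running each command in turn and comparing the logged memory figures should show which backend, if any, is responsible for the growth.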
The ionice part is checked in for beta 4, and the cron job has been adjusted to run at a better time.
Can anyone still replicate this bug and assist in narrowing down the backend causing the problem?
beagle-crawl-system seems to behave much better in beta4 (with and without beagle-build-index.rpm). Running beagled with $HOME=/usr also seems to be fine. I cannot comment on how beagle behaves with a real-world $HOME, though. BTW: with the smaller footprint now, why don't you use cron.daily? A home user almost never has their computer running at 4:30 am.
I think that Joe's changes have solved this problem, so I'm closing the bug as FIXED.