Bugzilla – Bug 315999
System.Threading.Timer 20x slower than MSFT's
Last modified: 2007-10-16 10:29:03 UTC
---- Reported by bmaurer@users.sf.net 2004-09-11 00:39:30 MST ---- This test takes 20x longer on Mono than on MSFT's runtime: using System; using System.Threading; class X { static void Main () { for (int i = 0; i < 1000; i ++) { Timer t = new Timer (new TimerCallback (Foo), i, 1000, 0); // do stuff t.Dispose (); } } static void Foo (object o) {} } The problem is that we create a thread for each timer, while MSFT appears to only create one thread for the lifetime of the program. ---- Additional Comments From raf@ophion.org 2006-06-25 14:40:47 MST ---- Created an attachment (id=166759) Single thread timer scheduler ---- Additional Comments From gonzalo@ximian.com 2006-07-30 20:12:31 MST ---- If the callback throws, the scheduler thread is killed. ---- Additional Comments From bmaurer@users.sf.net 2006-07-31 06:33:58 MST ---- Style wise, this patch needs some work: + static object sync_obj = new object(); + static object sync_obj2 = new object(); How are those used? Why is TimerScheduler a class? it seems like it could be static methods in Timer? Also, why a TimerJob class. This is a 1<->1 relation with a Timer. Variables and methods are named with mixed conventions. Make things consistent, at the very least. In general, no underscore_names for methods. Also, enum values shouldn't be C_CONSTANT_NAMED. + readonly int TIME_SLICE = 10 ; // 10 msec Should be const + log("could not properly signal timer-scheduler, waiting..."); + Thread.Sleep(5); THis is *really* *really* dubious. You need to do something like "calcuate the next time you will wake up, and wait on an object with that timeout. Pulse the object if a new timer is added". This will wake up the CPU far too often, and (for example) waste battery power. It is a blocker for this patch. + void log(string str) { + if (Environment.GetEnvironmentVariable("MONO_TIMER_DEBUG") != null) Should probably be cached. ---- Additional Comments From bmaurer@users.sf.net 2006-07-31 06:36:47 MST ---- Never mind this comment: THis is *really* *really* dubious. You need to do something like "calcuate the next time you will wake up, and wait on an object with that timeout. Pulse the object if a new timer is added". This will wake up the CPU far too often, and (for example) waste battery power. It is a blocker for this patch. The sleep is dubious, but it's not polling the cpu like i thought it was. It's 3 am here ;-) ---- Additional Comments From raf@ophion.org 2006-08-01 12:32:07 MST ---- Awesome feedback guys... my responses are prefixed by ">>" * If the callback throws, the scheduler thread is killed. >> perfect catch, I"ll add a catch all "catch Exception" in order to correct this. Style wise, this patch needs some work: + static object sync_obj = new object(); + static object sync_obj2 = new object(); How are those used? >> sync_obj is used to prevent race conditions on the "GetInstance" call of the singleton TimerScheduler class. sync_obj2 is used to make signaling the scheduler class sane so only one signal is sent at a time and a decision can be made on the signal so signals are "batched" or discarded .With that said, I'm not sure why I have 2 separate sync objects, I should be able to combine them without issues. Why is TimerScheduler a class? it seems like it could be static methods in Timer? Also, why a TimerJob class. This is a 1<->1 relation with a Timer. >> Timer scheduler is a singleton class because it has a 1-1 mapping with a timer-scheduler thread which is the thread in charge of firing all the timer events, so it is a 1 to n mapping between timer-scheduler threads and timers. Now with regards to why it needs to be a class of its own instead of static methods in Timer i guess it's merely a matter of choice... I believe I chose that route just to keep all the scheduler logic in one place, which made it easier for me to merge it with the Timer class. Of course, if you guys want, I can made the conversion. I believe TimerJob follows under the same logic, I just wanted to keep the scheduler specific meta data in one place. Variables and methods are named with mixed conventions. Make things consistent, at the very least. In general, no underscore_names for methods. Also, enum values shouldn't be C_CONSTANT_NAMED. >> I'll correct all that + readonly int TIME_SLICE = 10 ; // 10 msec Should be const >> same here + log("could not properly signal timer-scheduler, waiting..."); + Thread.Sleep(5); THis is *really* *really* dubious. You need to do something like "calcuate the next time you will wake up, and wait on an object with that timeout. Pulse the object if a new timer is added". This will wake up the CPU far too often, and (for example) waste battery power. It is a blocker for this patch. >> Yeah the logic is slightly counter intuitive but it actually performs really well under most conditions. The timer logic is actually very very loosely base on Quartz for java (I guess the main loop logic is the same). From my testing you start seeing issues once the number of number of timers is very large and the period of the events is small which puts pressure on the scheduler to run more often and doesn't give it a chance to sleep. Also, if the "period" of the Timer is shorter than 10 msec (the TIMER_SLICE base), we start having issues as well, and that is not as simple of a problem to solve as it may seem. I did some performance observations that you might find interesting here: http://ophion.org/index.php?gadget=Blog&action=SingleView&id=9 + void log(string str) { + if (Environment.GetEnvironmentVariable("MONO_TIMER_DEBUG") != null) Should probably be cached. >> agreed >> I'm going to post a new patch tonight reflecting those changes, I just need to test removing the "sync_obj" to make sure it doesn't create any problems. Also Miguel had suggested to change the patch to not use thread pool threads, so I might make that change as well but I'm not sure how that is going to affect performance. Once again thanks for the very insightful feedback! - raf ---- Additional Comments From bmaurer@users.sf.net 2006-08-01 15:10:00 MST ---- Please put the "how locks are used" comments in the file itself. I'd like to see the refactoring done to remove the two extra classes. I think the mapping just causes more code. ---- Additional Comments From raf@ophion.org 2006-08-10 11:51:58 MST ---- Created an attachment (id=166760) new patch ---- Additional Comments From raf@ophion.org 2006-08-10 11:52:43 MST ---- changelog for the attached patch: * fixed all cosmetic issues (let me know if I missied something) * got rid of inner classes (there's only 1 left: Runner) * Added logic to fire events using non-threadpool threads (that's the default bevahior but you can set an env variable to makes the scheduler use the thread pool again) * Added caching logic to the log method * Removed sync_obj2 Let me know how committable this one is! ---- Additional Comments From miguel@ximian.com 2006-08-12 13:24:21 MST ---- ccing gonzalo ---- Additional Comments From gonzalo@ximian.com 2006-08-12 14:06:45 MST ---- On Monday I'll check this out. ---- Additional Comments From gonzalo@ximian.com 2006-10-03 15:01:59 MST ---- When running the tests, mono hangs. One of the threads is here: "Timer-Scheduler" tid=0x0xb693bba0 this=0x0x33c9a0: at (wrapper managed-to-native) System.Object.__icall_wrapper_mono_object_new_specific (intptr) <0x00004> at (wrapper managed-to-native) System.Object.__icall_wrapper_mono_object_new_specific (intptr) <0xffffffff> at Enumerator.get_Current () [0x0003d] in /mono-HEAD/src/trunk/mcs/class/corlib/System.Collections/Hashtable.cs:956 at System.Threading.Timer.SchedulerThread () [0x0006e] in /mono-HEAD/src/trunk/mcs/class/corlib/System.Threading/Timer.cs:213 at (wrapper delegate-invoke) System.MulticastDelegate.invoke_void () <0xffffffff> at (wrapper runtime-invoke) System.Object.runtime_invoke_void (object,intptr,intptr,intptr) <0xffffffff> ---- Additional Comments From raf@ophion.org 2006-10-04 00:46:56 MST ---- Gonzalo, can you rerun the test with MONO_TIMER_DEBUG=1? I'm going to look into this shortly. - raf ---- Additional Comments From raf@ophion.org 2006-10-05 01:46:36 MST ---- Gonzallo, I'm looking into this further and on my 1 cpu machine the only Timer test that fails is MonoTests.System.Threading.TimerTest.TestDisposeOnCallback which was marked as [NotWorking] all the other tests are passing. Can you post the name of the failing test? Also, can you try running it with the timer debug flag on? ---- Additional Comments From raf@ophion.org 2006-11-05 14:01:46 MST ---- Created an attachment (id=166761) updated patch fixing Dispose deadlock issue. ---- Additional Comments From raf@ophion.org 2006-11-05 14:04:36 MST ---- I believe I found and fixed the deadlock issue you encountered, please give the attached patch a shot. Also, I've emailed the mono-dev mailing list to start a discussion about the patch. ---- Additional Comments From gonzalo@ximian.com 2006-11-28 20:04:50 MST ---- Adding this patch makes the JIT say: ** (process:23595): ERROR (recursed) **: file domain.c: line 177 (mono_jit_info_table_find): assertion failed: (ret == 0) aborting... when running make run-test in corlib ---- Additional Comments From raf@ophion.org 2006-12-11 19:12:16 MST ---- thanks gonzallo, I'll look into this shortly and I'll try to ping you in IRC after I have time to research the JIT error. Imported an attachment (id=166759) Imported an attachment (id=166760) Imported an attachment (id=166761) Unknown operating system unknown. Setting to default OS "Other".
Created attachment 178610 [details] rewritten patch The patch doesn't easily apply anymore, so I rewrote it, with a few simplifications and a few bug fixes (mostly locking on Jobs and avoiding the Abort calls). I'll apply in a day or two if there are no objections.
Fixed in svn. For reference, the new code creates 10K timers in 1.5 seconds on my box. The old code dies with a "can't create thread" exception after 6 seconds. There are more optimizations possible, but this is already a big improvement (in the stability of the code, too).