Hendry Luk — Sheep in Fence

September 22, 2011

My Daily WTF: Is Timer Broken?

Filed under: Software Development — Hendry Luk @ 1:36 am
Tags: ,

I’ve been working with Task Parallel Library quite extensively lately, and one common thing I need to do is to schedule my tasks with some kind of timing intervals, and System.Threading.Timer is the exact tool for this.

Problem starts when my task randomly just go to a mysteriously halt. The problem with doing asynchronous stuff is the premise of asynchronous itself: when any of your tasks is in a wait state, it will drop its current thread back to the thread pool, and therefore you won’t find anything in the debugger when your tasks do come to a mysterious halt. There’s no facility in the Visual Studio debugger that allows you to possibly find the last call-stacks of your halting tasks.

I spent almost the whole day sprinkling a lot of Trace information all over the code, and I finally managed to narrow down to my little Delay() method that uses System.Threading.Timer. For some reason, my Timer would occassionaly fail to fire back.


new Timer(_=> Trace.WriteLine("fired back"), null, timespan, TimeSpan.FromMilliseconds(-1));

Most of the times, the timer fires back successfully, except in some rare occasions where apparently it never comes back at all. After pulling my hair in frustration until I had little left, I decided to make a desperate stab and change the code to this:


new Timer(_=> Trace.WriteLine("fired back")).Change(timespan, TimeSpan.FromMilliseconds(-1));

I wasn’t expecting anything magical to come out of this, but it did! It fixed the bug. My timer now reliably fires back 100% of the times! I can’t explain why, but I’ll definitely file a bug report to Microsoft and see if they can come up with any logical explanation. I can’t provide a reproduction test for this though, as this bug only happens when I plug a lot of moving parts to the system. Whenever I try to reduce the complexities (to isolate the issue), the bug disappears. It almost seems like a dirty secret that the universe doesn’t want you to see.

Anyway, that has been an utterly frustrating 10 hours, and I’m not sure whether it’s a relief or anger that I’m feeling right now, but at least this pile of coffee mugs will finally clear out.

About these ads

2 Comments »

  1. Unfortunately you probably didn’t fix it. What you describe is symptomatic of a subtle race condition somewhere else in your code. Not saying there aren’t lurking wolves, but…

    Comment by J — May 30, 2012 @ 3:26 pm | Reply

    • J, unfortunately that does not seem to be the case. I wasnt quick to make that conclussion.
      I had produced a diagnostic report to prove with best possible certainty that the Timer got initialized (with a constant timespan) but (a rare number of them) did not deliver its contractual duty (firing the callback) under extreme stress tests in a controlled environment.
      Btw no Timer instance ever reused (as you see in the code, it always instantiates new instance everytime) which rules out race condition (in my code, that is).
      I do want to believe Timer is not broken, but all findings so far have only proven otherwise.

      Comment by Hendry Luk — May 30, 2012 @ 11:46 pm | Reply


RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

The Rubric Theme. Create a free website or blog at WordPress.com.

Follow

Get every new post delivered to your Inbox.

%d bloggers like this: