Parallel (mis)Adventures

August 18th, 2017 Posted by Embarcadero 4 comments

In one of our customer projects, we had an Android app that was receiving broadcasts on a background thread (I’ll call this the Receiver thread). On receipt of a message, the Receiver thread needed to do some processing on that message and then update the main thread (the UI thread).

The initial solution to this was pretty simple: call TThread.Synchronize from the Receiver thread, passing in an anonymous method that updates the UI thread.

Testing this on Windows worked fine; however, running it on Android revealed a problem.

Occasionally, if the Receiver thread took too long processing the message or, more likely, updating the UI thread, Android would decide that the thread had stopped responding and would kill it.

Rather than reproducing the code here from the actual app, which has lots of other unrelated things in it getting in the way, I’ve mimicked this situation by having an Anonymous thread (taking the place of our Receiver thread) loop from 1 to 100 and using Synchronize to update a Listbox in the UI thread with the current value (taking the place of our Messages). So the code looks like this:
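A minimal sketch of that setup, assuming a form class TForm1 with a TListBox named ListBox1 and a button handler that starts the thread (these names are assumptions, not taken from the app):

```delphi
// Units assumed in the uses clause: System.Classes, System.SysUtils.
procedure TForm1.Button1Click(Sender: TObject);
begin
  // The anonymous thread stands in for the Receiver thread.
  TThread.CreateAnonymousThread(
    procedure
    var
      I: Integer;
    begin
      for I := 1 to 100 do
        // Synchronize blocks this thread until the UI thread has run the method.
        TThread.Synchronize(nil,
          procedure
          begin
            ListBox1.Items.Add(I.ToString);
          end);
    end).Start;
end;
```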


So, in our mimicked scenario, if our anonymous method in our call to Synchronize took too long to run, Android would come along and kill our Receiver thread.

No problem. If you’ve spent much time with TThread you’ve probably noticed the Queue method. This allows you to do an Asynchronous Synchronize. Yes, I realise that sounds like it might be a contradiction in terms, but give me a better way to describe it and I’ll use it. If you look under the covers of TThread.Queue, it adds your anonymous method to a TList and then returns. At some point in the future the UI thread will grab it off the list and execute it, but you don’t have to wait around for that to happen. Why a TList and not a TQueue, given the method is called Queue and all? No idea, I was a bit surprised too, but it doesn’t matter for our purposes.

Awesome, that should speed up our Receiver thread, and it was an easy change as Queue and Synchronize have identical arguments. Just change Synchronize to Queue and go home.
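As a sketch of how small that change is, assuming a form class TForm1 with a TListBox named ListBox1 and an anonymous thread standing in for the Receiver thread (all assumed names), only the method name differs:

```delphi
// Units assumed in the uses clause: System.Classes, System.SysUtils.
procedure TForm1.Button1Click(Sender: TObject);
begin
  TThread.CreateAnonymousThread(
    procedure
    var
      I: Integer;
    begin
      for I := 1 to 100 do
        // Was TThread.Synchronize. Queue returns immediately and the UI
        // thread runs the anonymous method at some later point.
        TThread.Queue(nil,
          procedure
          begin
            ListBox1.Items.Add(I.ToString);
          end);
    end).Start;
end;
```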

No, don’t go home! Run your tests!

Run your tests, because this exposed another problem that was hidden by the synchronous nature of Synchronize. Can you spot it?

I’ll give you a clue: With anonymous methods, variable capture is by reference, not by value.

Got it now?

In the code above, in the anonymous method I pass to Synchronize, I’m capturing a reference to the loop variable I. That works fine because Synchronize blocks until my anonymous method runs, so when it gets added to the Listbox, I still has the same value as it did back in the Receiver thread.

However, change Synchronize to Queue and it no longer blocks. On the first pass of my loop, I will have a value of 1. When we call Queue, the anonymous method captures a reference to I and gets added to a List. At that point my loop goes around again and I is now 2. By the time the anonymous method we passed in on the first pass of my loop gets executed by the UI thread, who knows how many loops we’ve done and what value I actually has. In fact, you can see the result in the screenshot: the Listbox is full of 101s.

That’s right. Our entire loop was finished and the Receiver thread had moved on before even the first anonymous method had executed. Variable Capture by Reference had resulted in all the data from the earlier iterations being lost.

I’ve seen people bitten by this with anonymous methods before, even without threads. Your anonymous method is not reading the value of the variable at the time you defined it, it’s reading the value at the time you execute it. This is not a bug, it’s just how variable capture works.
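The same effect can be reproduced without any threads. A minimal console sketch (the program, routine and variable names are made up for illustration):

```delphi
program CaptureDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

procedure Demo;
var
  Value: Integer;
  Show: TProc;
begin
  Value := 1;
  // The anonymous method captures the variable Value, not the number 1.
  Show :=
    procedure
    begin
      Writeln(Value);
    end;
  // Changing the variable after the method is defined changes what it sees.
  Value := 2;
  Show(); // prints 2
end;

begin
  Demo;
end.
```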

So what to do? Well, I’m going to show you two solutions to this.

Solution 1 : If the variable changing is the problem, don’t let the variable change.

Instead of capturing I directly, in each loop iteration add the value of I into a collection and let the anonymous method capture the collection. The collection reference is the same from iteration to iteration, so our problem goes away. In our scenario we wanted our messages processed in the order they were received (i.e. First In, First Out) so we used a TQueue. It’s going to have items added to it in one thread (the Receiver thread) and items removed in another thread (the UI thread) so it needs to be a threadsafe queue. If I modify the code like so (FQueue is a field in my Receiver thread):
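A sketch of that change, under a few assumptions: the text only says the queue is threadsafe, so TThreadedQueue<Integer> from System.Generics.Collections is used here as one possible choice, and TReceiverThread and the global Form1 are hypothetical names. The thread object, and with it FQueue, is assumed to outlive the queued calls.

```delphi
// Units assumed: System.Classes, System.SysUtils, System.Generics.Collections.
// FQueue is assumed to be declared in TReceiverThread as
//   FQueue: TThreadedQueue<Integer>;
// and created with enough depth that PushItem never waits in this demo, e.g.
//   FQueue := TThreadedQueue<Integer>.Create(100);
procedure TReceiverThread.Execute;
var
  I: Integer;
begin
  for I := 1 to 100 do
  begin
    // Store the current value. The queue holds a copy, not a reference to I.
    FQueue.PushItem(I);
    TThread.Queue(nil,
      procedure
      var
        Value: Integer;
      begin
        // Runs later on the UI thread: take the oldest value off the queue.
        Value := FQueue.PopItem;
        Form1.ListBox1.Items.Add(IntToStr(Value));
      end);
  end;
end;
```

The anonymous method no longer mentions I at all. It only captures the reference to the thread object that owns FQueue, which stays the same across iterations.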

and when we run it we get something that looks a lot healthier:

In the actual Android app, the message processing also had the potential to be a little time consuming, so we wanted to parallelise both the processing and the adding to the Queue. We ended up spawning a TTask to do that, rather than another Anonymous thread, which had the added benefit of not overwhelming our poor phone with lots more threads than CPU cores. Instead, the PPL (Parallel Programming Library) will use a pool of worker threads to handle it, and tune the number of threads at runtime. We can also change TThread.Queue back to TThread.Synchronize, as it no longer matters if this blocks as it is being done in a TTask.

To summarise this approach, capturing a reference to the threadsafe queue is fine as it never changes. The content inside the queue changes, but given enqueue and dequeue are threadsafe that’s not a problem.

That works, but it might be overkill in some cases to instantiate a whole other container to hold the values. So let’s look at an alternative.

Solution 2: Execute the Anonymous Method immediately

This one might be slightly less obvious. In fact, I owe credit for this one to my colleague, Sergey. I’d seen this technique in a JavaScript context before, but never thought of applying it to this particular problem.

Like I said before, your anonymous method is not reading the value of the variable at the time you declared it, it’s reading the value at the time you execute it. So this solution relies on the idea of executing it immediately (and therefore converting it from a reference to a value), in the context of the Receiver thread. Once we have the value, we can then use it from a second anonymous method, the one passed to TTask.Run.

Here’s the code:
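A sketch of what such a listing might look like, reconstructed from the description that follows rather than taken from the app. It assumes the loop runs in the Receiver thread with I declared as a local Integer, a global Form1 with a ListBox1, the units System.Threading, System.Classes and System.SysUtils in the uses clause, and that the compiler accepts calling an anonymous function expression directly, as the text describes. It is laid out so that the line numbers cited in the list after it fall on the matching statements:

```delphi
for I := 1 to 100 do
begin
  // Hypothetical reconstruction: define the wrapper method inline and
  // call it straight away with the current value of I.
  TTask.Run(function(const Value: Integer): TProc
    begin
      Result := procedure
        begin
          TThread.Synchronize(nil,
            procedure
            begin
              // Value is the wrapper's own parameter, not the loop variable.
              Form1.ListBox1.Items.Add(Value.ToString);
            end);
        end;
    end(I));
end;
```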

There’s a fair bit going on here, so let’s unpack it:

  • The call to TTask.Run expects a TProc, which is an anonymous procedure with no parameters (line 5)
  • Rather than passing a TProc in directly, we’re instead defining another anonymous method (the wrapper method) that takes a const Integer parameter and returns a TProc (the task method). It is defined starting on line 5 and you can see on line 7 where it returns an anonymous procedure with no parameters (the task method).
  • Inside the task method, we use variable capture to grab a reference to the Value parameter of our wrapper method, rather than I directly (line 13).
  • All we’ve done so far is define the wrapper method. The trick is on line 16, where we immediately execute it, passing in I as the parameter. The wrapper method executes, converting our reference to I to a Value, which is then safe to be captured by the returned TProc as it is “cut off” from the loop variable I.

If it makes it easier to understand, this could be written out in stages like this:
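A sketch of the staged form, assuming a global Form1 with a ListBox1. TWrapperFunc and ProcessMessages are hypothetical names introduced only for this illustration, with ProcessMessages standing in for the Receiver thread's loop:

```delphi
// Units assumed: System.Classes, System.SysUtils, System.Threading.
type
  // Hypothetical name for the wrapper method's signature.
  TWrapperFunc = reference to function(const Value: Integer): TProc;

procedure ProcessMessages;
var
  I: Integer;
  Wrapper: TWrapperFunc;
  TaskMethod: TProc;
begin
  // Stage 1: define the wrapper method. Nothing runs yet.
  Wrapper :=
    function(const Value: Integer): TProc
    begin
      // The returned task method captures Value, which belongs to this call.
      Result :=
        procedure
        begin
          TThread.Synchronize(nil,
            procedure
            begin
              Form1.ListBox1.Items.Add(Value.ToString);
            end);
        end;
    end;

  for I := 1 to 100 do
  begin
    // Stage 2: call the wrapper now, while I holds this iteration's value.
    TaskMethod := Wrapper(I);
    // Stage 3: hand the resulting parameterless procedure to the task pool.
    TTask.Run(TaskMethod);
  end;
end;
```

Each call to Wrapper gets its own Value, so every task method ends up with a private copy of the number it was created for.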

Whichever version you prefer, the point is using the wrapper anonymous method to return the TProc allows us to safely reference the data without the next loop overwriting it.

Wrap-up

So, what are the lessons from this story?

There are probably a few:

  • Anonymous Methods capture variables by reference, not by value.
  • Parallelisation complicates things, but not beyond your understanding if you simplify the problem down. Besides, these days it’s becoming very difficult to avoid parallelisation completely, so you may as well get used to it. Write some examples, set some breakpoints and use the IDE’s Thread View to get an understanding of what’s going on. Also, if you haven’t noticed it before, the Breakpoint view has a Thread column, so when a breakpoint fires you can tell which thread you’re in. Very helpful for debugging where exactly an anonymous method is executing.
  • Run your tests before you declare victory and go home. You do have tests, right?

4 comments

Jennifer says:

If order of execution isn’t important, this should be ok, right?

for I := 1 to 100 do
  LTask := TTask.Run( procedure
    var J : Integer;
    begin
      J := I;
      TThread.Synchronize(nil, procedure
        begin Listbox1.Items.Add(J.ToString) end
      );
    end
  );

Malcolm says:

Unfortunately not. I tried that one, and it’s better, but not fixed.

If you run that, instead of seeing 101’s all through the listbox, you’ll see a few lower numbers, then the rest are 101’s 🙂

Like this:

23
47
86
101
101
101
101
101

Basically, you’re grabbing it earlier than in the Synchronize, but there is still a gap before your Task actually gets scheduled, and during that gap the loop counter keeps incrementing.

Jennifer says:

Would TParallel.For make a difference? (again, not keeping order)

Malcolm says:

Well, possibly, but remember the for loop is just me faking what’s actually happening, which is receiving broadcasts. No for loop involved.
