Monday, May 02, 2011

The Basics of Task Parallelism via C#

Preface
The trend towards going parallel means that .NET Framework developers should learn about the Task Parallel Library (TPL). In general terms, data parallelism uses the input data of an operation as the means of partitioning the work into smaller pieces. The data is divided among the available hardware processors to achieve parallelism, and some independent operation is then replicated and executed across these partitions. Typically the same operation is applied concurrently to every element of the dataset.
Task parallelism exploits the fact that a program is already decomposed into individual parts – statements, methods, and so on – that can be run in parallel. More to the point, task parallelism views a problem as a stream of instructions that can be broken into sequences, called tasks, that can execute simultaneously. For the computation to be efficient, the operations that make up a task should be largely independent of the operations taking place inside other tasks. The data-decomposition view focuses instead on the data required by the tasks and how it can be decomposed into distinct chunks. The computation associated with the data chunks will only be efficient if the chunks can be operated upon relatively independently. While these two views are obviously interdependent when deciding to go parallel, they are best learned separately. A useful reference on tasks and compute-bound asynchronous operations is Jeffrey Richter’s book, “CLR via C#, 3rd Edition.” It is a good read.
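To make the distinction concrete, here is a minimal sketch that puts the two views side by side. The class and method names (ParallelismSketch, SquareAll, MinAndMax) are made up for illustration; Parallel.For and Parallel.Invoke are the TPL helpers from System.Threading.Tasks. SquareAll applies one operation to every element of the data, while MinAndMax runs two different, independent operations at the same time.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelismSketch
{
    // Data parallelism: the same operation (squaring) is applied to every
    // element, and the runtime partitions the index range among threads.
    public static int[] SquareAll(int[] input)
    {
        int[] output = new int[input.Length];
        Parallel.For(0, input.Length, i =>
        {
            output[i] = input[i] * input[i]; // each index is written exactly once
        });
        return output;
    }

    // Task parallelism: two different operations that do not depend on
    // each other are handed to the scheduler and may run simultaneously.
    public static Tuple<int, int> MinAndMax(int[] input)
    {
        int min = 0, max = 0;
        Parallel.Invoke(
            () => { min = input.Min(); },
            () => { max = input.Max(); });
        return Tuple.Create(min, max);
    }

    public static void Main()
    {
        int[] data = { 3, 1, 2 };
        Console.WriteLine(string.Join(",", SquareAll(data))); // 9,1,4
        Tuple<int, int> mm = MinAndMax(data);
        Console.WriteLine(mm.Item1 + " " + mm.Item2);         // 1 3
    }
}
```

Both helpers block until all of the work has finished, so the results are safe to read as soon as the calls return.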

In this brief article we will focus on some of the characteristics of the System.Threading.Tasks.Task class. To run a simple task, create a new instance of the Task class, passing as a constructor argument a System.Action delegate that represents the workload you want performed. You can explicitly create the Action delegate so that it refers to a named method, use an anonymous method, or use a lambda expression. Once you have created an instance of Task, call the Start() method; your Task is then passed to the task scheduler, which is responsible for assigning threads to perform the work. Here is example code:

using System;
using System.Threading.Tasks;
public class Program {
    public static void Main() {
        // use an Action delegate and a named method
        Task task1 = new Task(new Action(printMessage));
        // use an anonymous delegate
        Task task2 = new Task(delegate { printMessage(); });
        // use a lambda expression with an expression body
        Task task3 = new Task(() => printMessage());
        // use a lambda expression with a statement body
        Task task4 = new Task(() => { printMessage(); });
        // hand each task to the task scheduler
        task1.Start();
        task2.Start();
        task3.Start();
        task4.Start();
        Console.WriteLine("Main method complete. Press <enter> to finish.");
        Console.ReadLine();
    }
    private static void printMessage() {
        Console.WriteLine("Hello, world!");
    }
}

To get the result from a task, create an instance of Task<TResult>, where TResult is the data type of the result that will be produced, and return a value of that type from the task body. To read the result, read the Result property of the task you created. For example, suppose we have a method called Sum. We construct a Task<TResult> object, passing the operation's return type as the generic TResult argument:

using System;
using System.Threading.Tasks;
public class Program {
    private static Int32 Sum(Int32 n) {
        Int32 sum = 0;
        for (; n > 0; n--)
            checked { sum += n; } // throws OverflowException if the sum exceeds Int32.MaxValue
        return sum;
    }
    public static void Main() {
        // The state argument (1000) reaches the delegate as an Object, hence the cast
        Task<Int32> t = new Task<Int32>(n => Sum((Int32)n), 1000);
        t.Start();
        t.Wait(); // optional here: blocks until the task completes
        // Get the result (the Result property internally calls Wait)
        Console.WriteLine("The sum is: " + t.Result); // An Int32 value
    }
}

Produces:
The sum is: 500500

If the compute-bound operation throws an unhandled exception, the exception is swallowed and stored in a collection, and the thread pool thread is allowed to return to the thread pool. When the Wait method or the Result property is later invoked, it throws a System.AggregateException object. You can use a CancellationTokenSource to cancel a Task. To do so, we must rewrite our Sum method so that it accepts a CancellationToken, after which we can write the code that creates a CancellationTokenSource object.
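As a rough sketch of both points, the code below rewrites Sum to accept a CancellationToken and wraps the task in a helper that catches the AggregateException. The names SumDemo and Describe are hypothetical, as is the choice to cancel before the task starts, which is done here only to keep the outcome deterministic. For the same reason the token is passed only to Sum and not to the Task constructor, so a canceled run surfaces as an OperationCanceledException inside the AggregateException.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SumDemo
{
    // Sum rewritten to observe a CancellationToken on every iteration.
    public static Int32 Sum(CancellationToken ct, Int32 n)
    {
        Int32 sum = 0;
        for (; n > 0; n--)
        {
            // Throws OperationCanceledException once cancellation is requested.
            ct.ThrowIfCancellationRequested();
            checked { sum += n; } // throws OverflowException if the sum is too large
        }
        return sum;
    }

    // Hypothetical helper: runs Sum on a task and reports how it ended.
    public static string Describe(Int32 n, bool cancelFirst)
    {
        CancellationTokenSource cts = new CancellationTokenSource();
        if (cancelFirst) cts.Cancel(); // request cancellation before the task runs

        Task<Int32> t = new Task<Int32>(() => Sum(cts.Token, n));
        t.Start();
        try
        {
            // Result waits for the task and rethrows any stored exception
            // wrapped in an AggregateException.
            return "The sum is: " + t.Result;
        }
        catch (AggregateException ae)
        {
            Exception inner = ae.InnerExceptions[0];
            if (inner is OperationCanceledException) return "Sum was canceled";
            if (inner is OverflowException) return "Sum was too large";
            throw; // anything else is unexpected in this sketch
        }
    }

    public static void Main()
    {
        Console.WriteLine(Describe(1000, false));   // The sum is: 500500
        Console.WriteLine(Describe(100000, false)); // Sum was too large
        Console.WriteLine(Describe(1000, true));    // Sum was canceled
    }
}
```

In a real program the call to Cancel would normally come from another thread while the task is still running, in which case the task may finish before the request is noticed.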

Read more: Codeproject