Sunday, August 22, 2010

.NET Framework 4: Data Parallelism

I have just started looking at the .NET Framework 4 (yes, I know I'm late) and everything new it brings, so I thought I'd write a short post about data parallelism, with a small example comparing a Parallel.For loop with a standard for loop. The code below is pretty simple: it adds up the integers from 0 to 2 million (exclusive), stores the result in a long variable, and displays the result and the elapsed time in milliseconds for each loop.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading;
using System.Threading.Tasks;
using System.Diagnostics;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch watch = new Stopwatch();

            int[] nums = Enumerable.Range(0, 2000000).ToArray();
            long total = 0;

            watch.Reset();
            watch.Start();
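            // Use nums.Length here: calling the LINQ Count() extension in the loop
            // condition adds a method call on every iteration and skews the timing.
            for (int i = 0; i < nums.Length; i++)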
            {
                total += nums[i];
            }
            watch.Stop();

            Console.WriteLine("The total is {0}", total);
            Console.WriteLine("'For' loop completed in " + watch.ElapsedMilliseconds.ToString() + " milliseconds.");
            Console.WriteLine();

            total = 0;
            watch.Reset();
            watch.Start();

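            // Thread-local overload: each task starts its own subtotal at 0,
            // adds its share of the elements without locking, and merges the
            // subtotal into the shared total once, when the task is done.
            Parallel.For<long>(0, nums.Length, () => 0, (j, loop, subtotal) =>
            {
                subtotal += nums[j];
                return subtotal;
            },
                (x) => Interlocked.Add(ref total, x)
            );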

            watch.Stop();
            
            Console.WriteLine("The total is {0}", total);
            Console.WriteLine("'Parallel.For' loop completed in " + watch.ElapsedMilliseconds.ToString() + " milliseconds.");
            Console.WriteLine("Press any key to exit");
            Console.ReadKey();
        }
    }
}
As you can see in the following screenshot, the Parallel.For loop was much faster than the standard for loop on my machine.
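A single Stopwatch measurement like the one above also includes one-off costs such as JIT compilation, so the numbers can move around a lot between runs. Here is a minimal sketch of a slightly fairer comparison, assuming a warm-up run followed by a few timed runs. The BestOf, SumSequential and SumParallel helpers are hypothetical names I made up for this illustration; they are not part of the framework.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class TimingSketch
{
    // Hypothetical helper: runs the work once as an untimed warm-up,
    // then returns the best elapsed time (in ms) over 'runs' timed runs.
    static double BestOf(int runs, Func<long> work)
    {
        work(); // warm-up run, not timed
        double best = double.MaxValue;
        for (int r = 0; r < runs; r++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            work();
            watch.Stop();
            best = Math.Min(best, watch.Elapsed.TotalMilliseconds);
        }
        return best;
    }

    // Same sequential sum as in the post.
    static long SumSequential(int[] nums)
    {
        long total = 0;
        for (int i = 0; i < nums.Length; i++)
        {
            total += nums[i];
        }
        return total;
    }

    // Same Parallel.For sum as in the post, with per-task subtotals.
    static long SumParallel(int[] nums)
    {
        long total = 0;
        Parallel.For<long>(0, nums.Length, () => 0, (j, loop, subtotal) =>
        {
            subtotal += nums[j];
            return subtotal;
        },
            (x) => Interlocked.Add(ref total, x)
        );
        return total;
    }

    static void Main()
    {
        int[] nums = Enumerable.Range(0, 2000000).ToArray();

        Console.WriteLine("'For' best of 5: {0:F3} ms", BestOf(5, () => SumSequential(nums)));
        Console.WriteLine("'Parallel.For' best of 5: {0:F3} ms", BestOf(5, () => SumParallel(nums)));
    }
}
```

The results still depend on the machine and the number of cores, so treat them as a rough indication rather than a benchmark.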


Using parallelism is usually faster, but for very simple processing it can actually be a little slower because of the overhead the parallelism layer adds (partitioning the work and scheduling it across threads). I also suggest having a look at the other classes in the System.Threading.Tasks namespace, such as TaskFactory, which is used to create asynchronous operations and can be a good alternative to the BackgroundWorker.
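To give an idea of what the TaskFactory looks like in practice, here is a minimal sketch that runs the same sum as a background task through Task.Factory, the default TaskFactory instance. The TaskFactorySketch class name is just something I picked for the example, and a console application is assumed.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

class TaskFactorySketch
{
    static void Main()
    {
        int[] nums = Enumerable.Range(0, 2000000).ToArray();

        // StartNew queues the delegate to run asynchronously
        // (on the thread pool with the default scheduler).
        Task<long> sumTask = Task.Factory.StartNew(() =>
        {
            long total = 0;
            for (int i = 0; i < nums.Length; i++)
            {
                total += nums[i];
            }
            return total;
        });

        // The calling thread is free to do other work in the meantime.
        Console.WriteLine("Summing in the background...");

        // Reading Result blocks until the task has finished.
        Console.WriteLine("The total is {0}", sumTask.Result);
    }
}
```

In a UI application you would probably not block on Result. One option is to chain a ContinueWith call using TaskScheduler.FromCurrentSynchronizationContext() so that the result is handled back on the UI thread, which is roughly the role RunWorkerCompleted plays with the BackgroundWorker.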