Thursday, April 18, 2013

Parallelizing a Long Running Service

In this article you will write some simple sample code that simulates a long-running service call.
You will use the PayrollServices.GetPayrollDeduction() method, which is provided with the begin solution for this article. This is the kind of long-running code that you would ultimately like to run in parallel.

  1. Download the begin solution from uploaded.net
  2. Open Microsoft Visual Studio 2010 from Start | All Programs | Microsoft Visual Studio 2010 | Microsoft Visual Studio 2010.
  3. Open the solution file ParallelExtLab.sln located under begin solution.
    Note: This solution contains a starting point for your work, and includes a helper class EmployeeList which holds the data you’ll be working with.
  4. In Visual Studio, open the Program class and navigate to its Main() method. First, you need a list of employees to work on, so add a static field to the class and initialize it in Main(). The snippet relies on the System and System.Diagnostics namespaces (for Console, DateTime, and Stopwatch):
    C#
    class Program
    {
        private static EmployeeList employeeData;
       
        static void Main(string[] args)
        {
           employeeData = new EmployeeList();

           Console.WriteLine("Payroll process started at {0}", DateTime.Now);
           var sw = Stopwatch.StartNew();

           // Methods to call

           Console.WriteLine("Payroll finished at {0} and took {1} seconds",
                                 DateTime.Now, sw.Elapsed.TotalSeconds);
           Console.WriteLine();
           Console.ReadLine();
        }
    }
  5. Now add the following method to the Program class. It uses a standard for loop to iterate through the list of employees provided by the pre-built code, and calls the long-running PayrollServices.GetPayrollDeduction() method for each one. The code should look like this:
    C#
    private static void Ex1Task1_ParallelizeLongRunningService()
    {
        Console.WriteLine("Non-parallelized for loop");

        for (int i = 0; i < employeeData.Count; i++)
        {
            Console.WriteLine("Starting process for employee id {0}",
                employeeData[i].EmployeeID);
            decimal span =
                PayrollServices.GetPayrollDeduction(employeeData[i]);
            // Separator added so the two string pieces do not run together.
            Console.WriteLine("Completed process for employee id {0}; " +
                "process took {1} seconds",
                employeeData[i].EmployeeID, span);
            Console.WriteLine();
        }
    }
  6. Call the method Ex1Task1_ParallelizeLongRunningService from Main().
    C#
    static void Main(string[] args)
    {
        ...
        // Methods to call
        Ex1Task1_ParallelizeLongRunningService();
        ...
    }
  7. Build and run the application.
  8. You should see that the employees are all processed in order of their IDs, similar to the following (the exact time to complete will vary):
    Output from non-parallel calls to a long running service
  9. To work with the parallelization features, add the following method to the Program class. This code uses the For() method of the static Parallel class, which lives in the System.Threading.Tasks namespace:
    C#
    private static void Ex1Task1_UseParallelForMethod()
    {
        Parallel.For(0, employeeData.Count, i =>
        {
            Console.WriteLine("Starting process for employee id {0}",
                               employeeData[i].EmployeeID);
            decimal span =
                PayrollServices.GetPayrollDeduction(employeeData[i]);
            Console.WriteLine("Completed process for employee id {0}",
                              employeeData[i].EmployeeID);
            Console.WriteLine();
        });
    }
  10. In Main(), replace the current method call with a call to the Ex1Task1_UseParallelForMethod() method.
    C#
    static void Main(string[] args)
    {
        ...
        // Methods to call
        Ex1Task1_UseParallelForMethod();
        ...
    }
    Note: In the above code snippet you might see something you’re unfamiliar with:
      Parallel.For(0, employeeData.Count, i => { /* …interesting code here… */ });
    The i => { … } argument is a lambda expression (here a statement lambda, since its body is a block), a language feature introduced in C# 3.0 and named after the lambda calculus. In short, a lambda is shorthand notation for an anonymous delegate, and it can capture variables from the enclosing scope to form a closure.
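    To make the shorthand concrete, here is a minimal sketch that runs the same parallel loop twice: once with the body written as a C# 2.0 anonymous delegate, and once as a lambda. The LambdaDemo class and its method names are illustrative only and are not part of the lab solution.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    static class LambdaDemo
    {
        // Sums 0..count-1 with the loop body written as an anonymous delegate.
        public static long SumWithDelegate(int count)
        {
            long total = 0;
            Parallel.For(0, count, delegate(int i)
            {
                // 'total' is captured from the enclosing method (a closure)
                // and updated atomically because iterations run concurrently.
                Interlocked.Add(ref total, i);
            });
            return total;
        }

        // The same loop with the body written as a lambda.
        public static long SumWithLambda(int count)
        {
            long total = 0;
            Parallel.For(0, count, i => Interlocked.Add(ref total, i));
            return total;
        }

        static void Main()
        {
            Console.WriteLine(SumWithDelegate(100)); // 4950
            Console.WriteLine(SumWithLambda(100));   // 4950
        }
    }
    ```

    Both forms compile to the same kind of delegate; the lambda simply lets the compiler infer the parameter type.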
  11. Build and run the application.
  12. You should observe that the employees are not necessarily processed in the order of their IDs. You’ll also notice that multiple calls to the GetPayrollDeduction() method are made before the first call returns. And finally, you should observe that by running the calls in parallel, the entire job completed much faster than when run in serial:
    Output from parallel calls to a long running service
    Note: Because the loop is run in parallel, each iteration is scheduled and run individually on whatever core is available. This means that the list is not necessarily processed in order, which can drastically affect your code. You should design your code in a way that each iteration of the loop is completely independent from the others. Any single iteration should not rely on another in order to complete correctly.
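    One way to keep iterations independent is to have each one read only its own input and write only its own slot in a result array. The sketch below assumes a hypothetical Compute() method standing in for the real per-item work; it is not the lab's PayrollServices code.

    ```csharp
    using System;
    using System.Threading.Tasks;

    static class IndependentIterations
    {
        // Hypothetical stand-in for a slow per-item computation.
        static decimal Compute(int id)
        {
            return id * 1.5m;
        }

        // Each iteration reads ids[i] and writes only results[i],
        // so no iteration depends on any other.
        public static decimal[] ComputeAll(int[] ids)
        {
            var results = new decimal[ids.Length];
            Parallel.For(0, ids.Length, i =>
            {
                results[i] = Compute(ids[i]);
            });
            return results;
        }

        static void Main()
        {
            decimal[] results = ComputeAll(new[] { 1, 2, 3, 4 });
            // Results are stored by index, so they print in input order
            // even though the iterations may have run in any order.
            foreach (decimal r in results)
                Console.WriteLine(r);
        }
    }
    ```

    Because the results are placed by index, the caller sees them in a stable order regardless of how the iterations were scheduled.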
    1. The Parallel Extensions library also provides a parallel version of the foreach structure. The following code demonstrates the non-parallel way to implement this structure. Add the following method to the Program class.
      C#
      private static void Ex1Task1_StandardForEach()
      {
          foreach (Employee employee in employeeData)
          {
              Console.WriteLine("Starting process for employee id {0}",
                  employee.EmployeeID);
              decimal span =
                  PayrollServices.GetPayrollDeduction(employee);
              Console.WriteLine("Completed process for employee id {0}",
                  employee.EmployeeID);
              Console.WriteLine();
          }
      }
    2. In the Main() method, replace the call to Ex1Task1_UseParallelForMethod() with a call to Ex1Task1_StandardForEach():
      C#
      static void Main(string[] args)
      {
          ...
          // Methods to call
          Ex1Task1_StandardForEach();
          ...
      }
    3. Build and run the application.
      Note: You should observe that the employees are once again processed in the order of their IDs. Also take note of the total amount of time required to complete this job (the exact time required will vary).
      Output from non-parallel for…each implementation
      1. To use the Parallel Extensions implementation of the foreach structure, change the code to call the ForEach() method of the static Parallel class:
        C#
        private static void Ex1Task1_ParallelForEach()
        {
            Parallel.ForEach(employeeData, ed =>
            {
                Console.WriteLine("Starting process for employee id {0}",
                    ed.EmployeeID);
                decimal span = PayrollServices.GetPayrollDeduction(ed);
                Console.WriteLine("Completed process for employee id {0}",
                    ed.EmployeeID);
                Console.WriteLine();
            });
        }
      2. In Main(), replace the current method call with a call to the Ex1Task1_ParallelForEach() method.
        C#
        static void Main(string[] args)
        {
            ...
            // Methods to call
            Ex1Task1_ParallelForEach();
            ...
        }
      3. Build and run the application.
      4. You will again observe that the employees are not necessarily processed in order of their IDs: because the loop runs in parallel, each iteration is scheduled individually on whatever core is available. And because the application can use all available cores, the job completes faster than when run serially.
        Output from parallel for…each implementation
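        To see the difference without the lab's helper classes, the following sketch times a serial foreach against Parallel.ForEach. SlowCall() is a hypothetical stand-in that simply sleeps for 200 ms; it is not PayrollServices.GetPayrollDeduction(). The actual speedup depends on the number of cores and on how many threads the thread pool provides, so no particular timings are claimed.

        ```csharp
        using System;
        using System.Diagnostics;
        using System.Linq;
        using System.Threading;
        using System.Threading.Tasks;

        static class TimingDemo
        {
            // Hypothetical stand-in for a slow service call.
            static void SlowCall(int id)
            {
                Thread.Sleep(200);
            }

            // Runs the calls one after another and returns the elapsed time.
            public static TimeSpan TimeSerial(int[] ids)
            {
                var sw = Stopwatch.StartNew();
                foreach (int id in ids)
                    SlowCall(id);
                return sw.Elapsed;
            }

            // Runs the calls through Parallel.ForEach and returns the elapsed time.
            public static TimeSpan TimeParallel(int[] ids)
            {
                var sw = Stopwatch.StartNew();
                Parallel.ForEach(ids, id => SlowCall(id));
                return sw.Elapsed;
            }

            static void Main()
            {
                int[] ids = Enumerable.Range(1, 8).ToArray();
                Console.WriteLine("Serial:   {0:F2} s",
                    TimeSerial(ids).TotalSeconds);
                Console.WriteLine("Parallel: {0:F2} s",
                    TimeParallel(ids).TotalSeconds);
            }
        }
        ```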
          Note: The Parallel Extensions Library also provides a useful Invoke() method, accessed via the static Parallel class, that allows parallel execution of anonymous methods or lambda expressions. To help illustrate how to use the Invoke method we’ll examine a common tree-walking algorithm and then see how it can be parallelized to reduce the total time needed to walk the entire tree.
          In our example we will walk an employee hierarchy and call the GetPayrollDeduction() method for each employee we encounter.
        1. In Main(), replace the current method call with a call to the Ex1Task1_WalkTree() method, then add that method to the Program class. It instantiates the employee hierarchy and calls the tree walker method.
          C#
          static void Main(string[] args)
          {
              ...
              // Methods to call
              Ex1Task1_WalkTree();
              ...
          }
          C#
          private static void Ex1Task1_WalkTree()
          {
              EmployeeHierarchy employeeHierarchy = new EmployeeHierarchy();
              WalkTree(employeeHierarchy);
          }
        2. Add the following method to the Program class:
          C#
          private static void WalkTree(Tree<Employee> node)
          {
              if (node == null)
                  return;

              if (node.Data != null)
              {
                  Employee emp = node.Data;
                  Console.WriteLine("Starting process for employee id {0}",
                      emp.EmployeeID);
                  decimal span = PayrollServices.GetPayrollDeduction(emp);
                  Console.WriteLine("Completed process for employee id {0}",
                      emp.EmployeeID);
                  Console.WriteLine();
              }

              WalkTree(node.Left);
              WalkTree(node.Right);
          }
        3. Build and run the application.
        4. You should observe the employees are processed in order of their IDs. Also note the total amount of time required to walk the tree (the exact time required will vary):
          Output from a non-parallel tree walker
            Note: The tree has been structured so that the data will be written out in ID order when the tree is walked using the non-parallel algorithm provided above.
          1. To walk the tree in parallel, remove the two calls to WalkTree() at the end of the WalkTree() method and replace them with a call to the Invoke() method of the static Parallel class:
            C#
            private static void WalkTree(Tree<Employee> node)
            {
                if (node == null)
                    return;

                if (node.Data != null)
                {
                    Employee emp = node.Data;
                    Console.WriteLine("Starting process for employee id {0}",
                        emp.EmployeeID);
                    decimal span = PayrollServices.GetPayrollDeduction(emp);
                    Console.WriteLine("Completed process for employee id {0}",
                        emp.EmployeeID);
                    Console.WriteLine();
                }

                Parallel.Invoke(delegate { WalkTree(node.Left); }, delegate { WalkTree(node.Right); });
            }
          2. Build and run the application.
          3. You should observe that the employees in the tree are no longer processed in the same order and that several nodes start processing before others have completed. Also note that it took less time to walk the entire tree.
            Output from a parallel tree walker
            Note: The Invoke() method schedules each call to WalkTree() individually, based on core availability. This means that the tree will not necessarily be walked in a predictable manner. Again, keep this in mind as you design your code.
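            The same pattern can be tried in isolation. The sketch below sums the values in a small binary tree, walking the two children of each node through Parallel.Invoke(). The Node class is a hypothetical minimal type written for this example; it is not the lab's Tree or EmployeeHierarchy class.

            ```csharp
            using System;
            using System.Threading.Tasks;

            // Hypothetical minimal binary tree node for illustration.
            class Node
            {
                public int Value;
                public Node Left;
                public Node Right;
            }

            static class InvokeDemo
            {
                // Sums a subtree, walking the two children in parallel.
                public static int Sum(Node node)
                {
                    if (node == null)
                        return 0;

                    int left = 0, right = 0;
                    // Invoke returns only after both delegates have finished,
                    // so left and right are safe to read afterwards.
                    Parallel.Invoke(
                        () => left = Sum(node.Left),
                        () => right = Sum(node.Right));
                    return node.Value + left + right;
                }

                static void Main()
                {
                    var root = new Node
                    {
                        Value = 1,
                        Left = new Node { Value = 2 },
                        Right = new Node
                        {
                            Value = 3,
                            Left = new Node { Value = 4 }
                        }
                    };
                    Console.WriteLine(Sum(root)); // 10
                }
            }
            ```

            Each branch writes to its own local variable, so the result is the same no matter which branch finishes first.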
