Using custom aggregation - C#.NET

Aggregation is where many data items are processed to produce a single result. Some aggregations are used so frequently that there are dedicated LINQ/PLINQ methods to support them—Sum(), Average(), Count(), and so on.

The ParallelEnumerable class provides versions of the Aggregate() extension method that are unique to PLINQ and support custom parallel aggregation. The version discussed here takes four functions or lambda expressions as arguments. Here’s what these functions do:

  • Define the initial value for the result: This supplies the starting value for the aggregation. PLINQ uses it to start each per-Task subtotal, so it should be a neutral value for the combining step (0 for a sum).
  • Process each data value: The function takes the subtotal for the current Task and the current data item as arguments and returns the updated subtotal. This is executed for each data value.
  • Process each per-Task subtotal: When a Task has finished processing its data values, the function takes the overall total and that Task's subtotal as arguments and returns the new total. This is executed for each Task assigned to perform the aggregation.
  • Process the final result: The function takes the overall total and returns the value handed back to the caller. This is executed once for the aggregation.

The best way of understanding Aggregate() is with an example. The example demonstrates a simple aggregation, which calculates half the sum of the squares of the first 10,000 integer values.

PLINQ Custom Aggregation
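
The original listing did not survive on this page, so what follows is a minimal reconstruction sketch rather than the author's exact code. It assumes that "the first 10,000 integer values" means 0 through 9,999, and the names CustomAggregationSketch, HalfSumOfSquares, and sourceData are ours. The square is computed by multiplication so that every intermediate value stays an exact whole number.

```csharp
using System;
using System.Linq;

public static class CustomAggregationSketch
{
    // Half the sum of the squares of 0 .. count - 1, computed with PLINQ.
    public static double HalfSumOfSquares(int count)
    {
        // Assumption: the source data is the int sequence 0 .. count - 1.
        int[] sourceData = Enumerable.Range(0, count).ToArray();

        return sourceData.AsParallel().Aggregate(
            // 1st function: starting value for each per-Task subtotal
            () => 0.0,
            // 2nd function: add the square of the item to the Task's subtotal
            (subtotal, item) => subtotal + (double)item * item,
            // 3rd function: fold a finished Task's subtotal into the overall total
            (total, subtotal) => total + subtotal,
            // 4th function: final processing, return half of the total
            total => total / 2);
    }

    public static void Main()
    {
        Console.WriteLine(HalfSumOfSquares(10000)); // prints 166641667500
    }
}
```

Because all the values involved are whole numbers well below the precision limit of a double, the result does not depend on how PLINQ happens to partition the data across Tasks.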

For the first function, we initialize the result, which is a double. We use lambda expressions in the listing, so to create a new double with a value of 0, we simply call this:
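
The lambda itself is missing from this page. A plausible form, shown in isolation as a sketch, is a parameterless lambda that returns a double zero. The names SeedSketch and Seed are hypothetical and exist only so that the fragment can run on its own.

```csharp
using System;

public static class SeedSketch
{
    // First argument: a parameterless lambda that returns the starting value.
    public static readonly Func<double> Seed = () => 0.0;

    public static void Main()
    {
        Console.WriteLine(Seed()); // prints 0
    }
}
```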

The second function allows us to process each data value and add it to the running total for the current Task. If we want to add the square of the current data value to the total, we call the following:
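
The original lambda is not shown here, so the following is a sketch of what such a function might look like when written as a standalone delegate. UpdateSketch and UpdateSubtotal are hypothetical names, and the square is computed by multiplication, which is our choice rather than something the text specifies.

```csharp
using System;

public static class UpdateSketch
{
    // Second argument: takes the Task's subtotal and the current item,
    // returns the subtotal with the item's square added.
    public static readonly Func<double, int, double> UpdateSubtotal =
        (subtotal, item) => subtotal + (double)item * item;

    public static void Main()
    {
        Console.WriteLine(UpdateSubtotal(10.0, 3)); // prints 19
    }
}
```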

The subtotal is a double (the type of the result), and the item is an int (the type of the source data array). The third function is called when each Task finishes processing data; it allows us to combine the result value generated by each Task with the overall result. We want to sum the per-Task results together, so we call the following function:
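
As a sketch of the combining step (the original lambda is missing from this page), the function below simply adds a finished Task's subtotal to the overall total. The names CombineSketch and CombineSubtotals are hypothetical.

```csharp
using System;

public static class CombineSketch
{
    // Third argument: takes the overall total and one Task's subtotal,
    // returns the new overall total.
    public static readonly Func<double, double, double> CombineSubtotals =
        (total, subtotal) => total + subtotal;

    public static void Main()
    {
        Console.WriteLine(CombineSubtotals(5.0, 2.0)); // prints 7
    }
}
```

Addition is a safe choice here because it gives the same answer whichever order the Tasks finish in.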

The arguments total and subtotal are both doubles. Finally, we have the chance to do any final processing of the overall result before it is returned as the aggregated result. In the case of the example, we want to return only half of the total, so we call this:
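
The final lambda is also missing from this page. A minimal sketch, assuming the result type stays a double, could look like this. The names ResultSketch and HalveTotal are hypothetical.

```csharp
using System;

public static class ResultSketch
{
    // Fourth argument: takes the overall total and returns the final result.
    public static readonly Func<double, double> HalveTotal = total => total / 2;

    public static void Main()
    {
        Console.WriteLine(HalveTotal(10.0)); // prints 5
    }
}
```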

If you read the example again, you will see that it shares some characteristics with the use of thread-local storage (TLS) in parallel loops; see the previous chapter for details. Although creating custom parallel aggregations can be confusing at first glance, the ability to do so can be a useful and powerful tool, much like PLINQ overall.
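
To make the comparison concrete, here is a sketch of the same calculation written as a parallel loop with thread-local state. It is not taken from the previous chapter; we assume the data is again 0 through 9,999, and TlsLoopSketch, HalfSumOfSquares, and totalLock are our own names. The localInit, body, and localFinally arguments of Parallel.For play roughly the roles of the first three Aggregate() functions, and the final division stands in for the fourth.

```csharp
using System;
using System.Threading.Tasks;

public static class TlsLoopSketch
{
    // Half the sum of the squares of 0 .. count - 1, using Parallel.For
    // with a per-Task local subtotal.
    public static double HalfSumOfSquares(int count)
    {
        double total = 0.0;
        object totalLock = new object();

        Parallel.For<double>(
            0, count,
            // localInit: starting subtotal for each Task
            () => 0.0,
            // body: add the square of the index to the Task's subtotal
            (i, loopState, subtotal) => subtotal + (double)i * i,
            // localFinally: fold the Task's subtotal into the shared total
            subtotal => { lock (totalLock) { total += subtotal; } });

        // Final processing, done by hand after the loop completes.
        return total / 2;
    }

    public static void Main()
    {
        Console.WriteLine(HalfSumOfSquares(10000)); // prints 166641667500
    }
}
```

The main difference is that the loop version has to protect the shared total itself, while Aggregate() handles the combining step for us.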


All rights reserved © 2018 Wisdom IT Services India Pvt. Ltd
