Monday, 26 May 2014

How to efficiently add BigDecimals

Anyone who deals with monetary values knows that double/float won't cut the mustard; if you deal with prices and FX rates, then BigDecimal is the only real option.

This comes with a lot of potential issues: BigDecimal methods do not handle null very well (i.e. not at all), and sometimes a bug crops up because BigDecimal is immutable and returns new instances.

So the ObjectLabKit Util package will help, but here is a question for you... what is an efficient way to sum a list of BigDecimal values coming from a class?

Assume that we have a list of 500 Test instances and that we need to sum Test.value, which could be null.
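Something along these lines, as a minimal sketch (the field and accessor names, and the way nulls are sprinkled in, are my assumptions; the real class is in the Gist):

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

class Test {
    private final BigDecimal value; // may be null!

    Test(final BigDecimal value) {
        this.value = value;
    }

    BigDecimal getValue() {
        return value;
    }

    // Build a list of n instances, leaving every tenth value null
    // so that every algo has to cope with nulls.
    static List<Test> buildList(final int n) {
        final List<Test> list = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            list.add(new Test(i % 10 == 0 ? null : BigDecimal.valueOf(i)));
        }
        return list;
    }
}
```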

We shall run the test 1,000 times.

Option 1: Use Total in a for loop
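A sketch of this option, assuming Total is a null-safe mutable accumulator in the spirit of the ObjectLabKit Util package; a minimal stand-in is included here so the snippet is self-contained:

```java
import java.math.BigDecimal;
import java.util.List;

class ForLoopSum {
    // Minimal stand-in for a null-safe Total accumulator:
    // add() simply ignores nulls.
    static final class Total {
        private BigDecimal total = BigDecimal.ZERO;

        void add(final BigDecimal value) {
            if (value != null) {
                total = total.add(value);
            }
        }

        BigDecimal getTotal() {
            return total;
        }
    }

    // Option 1: plain for loop, no streams involved.
    static BigDecimal sum(final List<BigDecimal> values) {
        final Total total = new Total();
        for (final BigDecimal v : values) {
            total.add(v);
        }
        return total.getTotal();
    }
}
```

A plain List&lt;BigDecimal&gt; is used here for brevity; with the Test class it would be total.add(t.getValue()) inside the loop.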

Option 2: Use Total with java8 forEach
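Same null-safe Total idea, but fed via the Java 8 forEach and a method reference (again with a minimal stand-in for Total so the snippet runs on its own):

```java
import java.math.BigDecimal;
import java.util.List;

class ForEachSum {
    // Minimal stand-in for a null-safe Total accumulator.
    static final class Total {
        private BigDecimal total = BigDecimal.ZERO;

        void add(final BigDecimal value) {
            if (value != null) {
                total = total.add(value);
            }
        }

        BigDecimal getTotal() {
            return total;
        }
    }

    // Option 2: forEach pushes every (possibly null) value
    // into the Total, which does the null handling.
    static BigDecimal sum(final List<BigDecimal> values) {
        final Total total = new Total();
        values.forEach(total::add);
        return total.getTotal();
    }
}
```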

Option 3: Use Total and java8 map()
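Here map() extracts the (possibly null) values from the Test instances and the Total still does the null-safe summing; a sketch with minimal stand-ins for both classes:

```java
import java.math.BigDecimal;
import java.util.List;

class MapTotalSum {
    static final class Test {
        private final BigDecimal value;
        Test(final BigDecimal value) { this.value = value; }
        BigDecimal getValue() { return value; }
    }

    // Minimal stand-in for a null-safe Total accumulator.
    static final class Total {
        private BigDecimal total = BigDecimal.ZERO;
        void add(final BigDecimal value) {
            if (value != null) {
                total = total.add(value);
            }
        }
        BigDecimal getTotal() { return total; }
    }

    // Option 3: map() does the extraction, Total does the summing.
    static BigDecimal sum(final List<Test> list) {
        final Total total = new Total();
        list.stream().map(Test::getValue).forEach(total::add);
        return total.getTotal();
    }
}
```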

Option 4: Use Java8 map and reduce
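The pure map/reduce version; the lambda must guard against null on the right-hand side (a minimal Test stand-in is included so the snippet is self-contained):

```java
import java.math.BigDecimal;
import java.util.List;

class MapReduceSum {
    static final class Test {
        private final BigDecimal value;
        Test(final BigDecimal value) { this.value = value; }
        BigDecimal getValue() { return value; }
    }

    // Option 4: map/reduce with BigDecimal.ZERO as the identity;
    // the null check on b is the bit people forget.
    static BigDecimal sum(final List<Test> list) {
        return list.stream()
                   .map(Test::getValue)
                   .reduce(BigDecimal.ZERO, (a, b) -> b != null ? a.add(b) : a);
    }
}
```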

Option 5: Use Java8 map, reduce and accumulator
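My reading of "map, reduce and accumulator" is the three-argument reduce(identity, accumulator, combiner); a sketch under that assumption:

```java
import java.math.BigDecimal;
import java.util.List;

class ReduceAccumulatorSum {
    static final class Test {
        private final BigDecimal value;
        Test(final BigDecimal value) { this.value = value; }
        BigDecimal getValue() { return value; }
    }

    // Option 5: three-argument reduce with an explicit accumulator and
    // combiner. The combiner only kicks in for parallel streams, where
    // the partial results it merges are never null.
    static BigDecimal sum(final List<Test> list) {
        return list.stream()
                   .map(Test::getValue)
                   .reduce(BigDecimal.ZERO,
                           (acc, v) -> v != null ? acc.add(v) : acc, // accumulator
                           BigDecimal::add);                         // combiner
    }
}
```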

Option 6: Use Java8 and home-made Collector
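One way to write such a home-made Collector (a sketch, not necessarily the one used in the benchmark): since BigDecimal is immutable, a single-element array serves as the mutable container:

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.stream.Collector;

class CollectorSum {
    static final class Test {
        private final BigDecimal value;
        Test(final BigDecimal value) { this.value = value; }
        BigDecimal getValue() { return value; }
    }

    // Option 6: home-made Collector with supplier, accumulator,
    // combiner and finisher spelled out.
    static BigDecimal sum(final List<Test> list) {
        return list.stream().collect(Collector.of(
                () -> new BigDecimal[] { BigDecimal.ZERO },     // supplier
                (acc, t) -> {                                   // accumulator
                    if (t.getValue() != null) {
                        acc[0] = acc[0].add(t.getValue());
                    }
                },
                (a, b) -> { a[0] = a[0].add(b[0]); return a; }, // combiner
                acc -> acc[0]));                                // finisher
    }
}
```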

Option 7: Use Java8 and ObjectLabKit Calculator
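I won't reproduce the ObjectLabKit Calculator API here (see the library for the real thing); as a purely hypothetical sketch of the idea, a reusable null-safe collector keeps each call site to a one-liner:

```java
import java.math.BigDecimal;
import java.util.List;
import java.util.stream.Collector;

class CalculatorSum {
    static final class Test {
        private final BigDecimal value;
        Test(final BigDecimal value) { this.value = value; }
        BigDecimal getValue() { return value; }
    }

    // Hypothetical stand-in for the library helper: the null handling
    // is written once, here, and reused everywhere.
    static Collector<BigDecimal, ?, BigDecimal> sumCollector() {
        return Collector.of(
                () -> new BigDecimal[] { BigDecimal.ZERO },
                (acc, v) -> { if (v != null) { acc[0] = acc[0].add(v); } },
                (a, b) -> { a[0] = a[0].add(b[0]); return a; },
                acc -> acc[0]);
    }

    // Option 7: the call site no longer needs to remember the null check.
    static BigDecimal sum(final List<Test> list) {
        return list.stream().map(Test::getValue).collect(sumCollector());
    }
}
```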

Option 8: Use Java8 and Parallel Stream
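The same null-guarded reduce as option 4, but over a parallel stream (with a minimal Test stand-in so the snippet runs on its own):

```java
import java.math.BigDecimal;
import java.util.List;

class ParallelSum {
    static final class Test {
        private final BigDecimal value;
        Test(final BigDecimal value) { this.value = value; }
        BigDecimal getValue() { return value; }
    }

    // Option 8: parallelStream splits the list across the fork/join
    // pool; that coordination has a cost, so it only pays off for
    // larger lists. The null check is safe in parallel because the
    // partial results being combined are never null.
    static BigDecimal sum(final List<Test> list) {
        return list.parallelStream()
                   .map(Test::getValue)
                   .reduce(BigDecimal.ZERO, (a, b) -> b != null ? a.add(b) : a);
    }
}
```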

So what are the results?

On my 2012 MacBook Pro, for a list of 500 Test instances:

Algo                                   | Average (ms) | Min (ms) | Max (ms)
Use Total in a for loop                | 0.1          | 0        | 4
Use Total with java8 forEach           | 0.1          | 0        | 40
Use Total and java8 map()              | 0.1          | 0        | 6
Use Java8 map and reduce               | 0            | 0        | 2
Use Java8 map, reduce and accumulator  | 0            | 0        | 2
Use Java8 and home-made Collector      | 0.1          | 0        | 6
Use Java8 and ObjectLabKit Calculator  | 0            | 0        | 2
Use Java8 and Parallel Stream          | 0.1          | 0        | 10

First of all, the total generated is the same for every algo, so it seems there is no bug.

The results are quite similar except for the Max value, implying a greater deviation in some of the runs. I've used JAMon for measuring min/max and average time.

Surprisingly, it seems that forEach has at least one execution at 40ms, which is way above the rest. Otherwise, using the ObjectLabKit Calculator seems a good compromise between having to write the reduce correctly (watch out if the BigDecimal on the right is null!) and using the raw map/reduce.

The Parallel Stream is not as efficient, as it takes some time to coordinate the tasks and split the list. Let's see whether that changes with more data.

On my 2012 MacBook Pro (quad-core), for a list of 50,000 Test instances, the parallelStream becomes the most efficient.

Algo                                   | Average (ms) | Min (ms) | Max (ms)
Use Total in a for loop                | 1            | 0        | 20
Use Total with java8 forEach           | 1.1          | 0        | 48
Use Total and java8 map()              | 2.1          | 1        | 40
Use Java8 map and reduce               | 1.2          | 1        | 9
Use Java8 map, reduce and accumulator  | 1.2          | 1        | 10
Use Java8 and home-made Collector      | 1.4          | 1        | 12
Use Java8 and ObjectLabKit Calculator  | 1.2          | 1        | 11
Use Java8 and Parallel Stream          | 0.6          | 0        | 17

So it looks like, when using a single thread, the raw use of map and reduce is the most efficient, but one has to remember how to write it:

  final BigDecimal reduce = list.stream()
                .map(Test::getValue)
                .reduce(BigDecimal.ZERO, (a, b) -> b != null ? a.add(b) : a);

Using the parallelStream (when suitable) reduces the average to 0.6ms, but the max is 17ms:
  final BigDecimal reduce = list.parallelStream()
                .map(Test::getValue)
                .reduce(BigDecimal.ZERO, (a, b) -> b != null ? a.add(b) : a);

Full code available as a GitHub Gist.
