Don’t let mathematical operations be stronger than you

Posted: July 13, 2011 in Optimization Pills
Tags: ,

Recently, I dived into a complicated code which aim was to compute some mathematical functions employed by more tangled subsystems. The code was very cool and I immediately thought that it was written by one (or more) scientific computing expert because of its sophistication and correctness. You’re wondering why I dived into that code because of the old saying “what works should never be changed“. Another time, efficiency cooked our goose!

I didn’t know if the code was very slow or terribly slow, anyhow, it could be improved, that was sure.

To let you taste some evidences of mathematical computations cost, these are the average clock cycles required by an IBM Power3 to compute common operations (thanks to CASPUR):

 Instruction 32 bit 64 bit Integer Product 3-4 3-9 Integer Division 21 34 Real Sum and Product 3-4 3-4 Real Division 14-21 18-25 Real Square Root 14-23 22-31

It’s clear we have to avoid expensive operations like divisions and try to reduce them to simpler ones, like sum and product. I’ll give you an example:

```// Matrix Normalization

for (int i=0; i<N; ++i)
for (int j=0; j<N; ++j)
c[i][j] = a[i][j] * b[i][j]/(norm)
```
This code performs N*N products and N*N divsions. Can we do better? Sure:
```
// Matrix Normalization

rev_norm = 1.0/(norm);

for (int i=0; i<N; ++i)
for (int j=0; j<N; ++j)
c[i][j] = a[i][j] * b[i][j] * rev_norm

```

This time we need 2*N*N products but only one division. Try on your architecture and measure the results – take into account potential approximation errors. Another time, efficiency is about trade-off.

This kind of optimization is called Strength Reduction, because the target is to reduce the cost (time and/or memory) of the computation.

Other useful considerations about integer divisions (same rules apply to modulo calculations):

• integer division by a power of 2 can be done with a shift operation, which is much faster;
• integer division by a constant is faster than division by a variable;
• integer division by a constant is faster if the dividend is unsigned.

Another notable example is the computation of powers:

```
pow(a,4) <=> a * a * a * a;

```

If you execute two simple loops like the following:

```for (int i=0; i<10000000; ++i)
toCompute = a * a * a * a;

for (int i=0; i<10000000; ++i)
toCompute = pow(a,4);
```

it should be quite evident the big difference in terms of time: I measured something like 0.02 secs for the former loop and 0.586 for the latter!

The code I told you about at the beginning of this post was full of pows and other similar “expensive” functions. It was easy to change them and to achieve better results.

I’m not going to show you wagons of tricks, you can both search the web and read a good book/manual, like Optimizing software in C++ by Agner Fog (that is free).

Final considerations: sometimes it can be convenient to rewrite by hand some mathematical functions. Something like sin() is coded in order to equally minimize the error on the whole domain of the real function. But if you are interested in a certain range (relatively little) or smaller precision then you can reduce time of computations by using:

• Series,
• Approximate Formulas,
• Polynomial and Rational Fitting,
• Trigonometric Formulas.
Bear in mind numerical properties:
• Accuracy,
• Error accumulation,
• Errors distribution on the function domain.
For example, sin(x) is defined on [-π,π]. But if the range of x is limited, it can be better to make use of a few-terms series. For x in [-0.1,0.1] we have:

 Time Error Max Gain Runtime Library 0.41’’ – – III order 0.19’’ 8.3e-7 54% V order 0.24’’ 1.5e-10 42% VII order 0.28’’ 5.0e-11 32%

The accused code had a lot of calls to standard functions despite of the variables lie in a very limited range. This led me to use one of the techniques I mentioned above.

I hope this brief survey can help you to consider alternatives, especially talking about efficiency. Don’t fed up if I repeat: Efficiency is about trade-offs. Indeed, the whole software is about trade-offs, but sometimes the efficiency is such  a underhand requirement that your client expects although he/she didn’t say anything about it.