Can you explain why the following occurs in Python? (The culprit is roundoff, but why does it seem like 1.1 is not affected but 1.2 is?) >>> (1.1 + 1) - 1 1.1 >>> (1.2 + 1) - 1 1.2000000000000002 >>> 1.2 + (1 - 1) 1.2 Try the same with 1.3 and 1.4
There was recently a very good article on scientific computing, defined loosely as the dark art, as it may have seemed to the uninitiated, of deriving solutions to equations, dynamical systems, or what-not that would have made your Mechanics professor scream in horror at the thought that they will need to somehow "solve" these systems. Of course, in the rich computer world today, almost any problem imaginable (exaggeration of course!) can already be solved by some existing tool. Hence more and more, the focus gets shifted from "how do I solve this differential equation" to "what do I ask google?"
My dad once told me of a glorious time "back in [his] days" when every respectable engineering institution would have a crash course on this dark art on top of Fortran. Of course, my dad is only 43, and that was only 19 years ago.
Even now, when computer science departments everywhere no longer believes in the necessity in forcing all of their graduates to have a basic grasp on numerical analysis, there is still some draw in the subject that either makes people deathly afraid of it or embrace it as their life ambition. I am personally deathly afraid of it. Even then, there are quite many cute gems in the field, and as such, I am still very much so attracted to the field.
Scientific computing is the all encompassing field involving the design and analysis of numerical methods. I intend to start a survey of some of the basic (but also most useful) tools such as methods that: solve linear and nonlinear systems of equations, interpolate data, compute integrals, and solve differential equations. We will often do this on problems for which there exists no "analytical" solution (in terms of the common transcendental functions that we're all used to).
##Safety First In an ideal world, there would be a direct correspondence between numerical algorithms their implementation. Everything would work out of the box and there would be no need to worry that, even if you've implemented the on-paper algorithm correctly, it would somehow behave "differently".
Of course, this isn't the case. We've all heard of the age old saying that computers are finitary, and therefore it cannot represent all real numbers, specifically, there's no way to represent irrationals, and in most of the cases, we do not fully represent all rationals either. Instead, we round numbers to a certain digit.
On the surface, this doesn't seem too unfortunate. You probably did the same in your Chemistry lab report, to even more horrendous precision than what your computer will likely do for you. If you aced your Chemistry lab, then this will likely seem like a perfectly good scheme. But if you, like me, were troubled by the "sig fig" rules, then you are probably a little wary.
In fact, more often than not, you will not be bothered by the lack of a full spectrum of real numbers to choose from. However, when these little nasty "roundoff" errors are the culprit, they are often resolved through hours upon hours of debugging and general sense of hopelessness. To inherit a roundoff bug from someone else is like contracting the spanish flu: either you know what you're doing and your system (with a little bit of agonizing) successfully resolve it, or you just won't have any idea what you're going to do (and die in the spanish flu metaphor). Think of this article as a vaccination against the roundoff bugs :)
Suppose little Gauss lived in the modern age. little Gauss' teacher wanted to surf the internet, so he assigned all of his students the following integral to evaluate: $$ \int_0^1 x^{100} e^{x-1} dx $$
Being the clever alter-ego of the boy who immediately saw
Why, this is a simple recursive function! Little Gauss was absolutely thrilled, he has at his disposal a programmable calculator capable of python (because he's Gauss, he can have whatever the fuck he wants), and he quickly coded up the recursion:
import math def s(k): if k == 0: return 1 - math.exp(-1) else: return 1 - k*s(k-1)
He quickly verified the first few cases by hand, and upon finding his code satisfactory, he computed the 100th element of the sequence as -1.1599285429663592e+141
and turned in the first few digits.
His teacher glanced at his solution, and knowing that there's no way little Gauss could have done his school work with such proficiency, immediately declared it wrong. Upon repeated appeal, the teach finally relented and looked up the solution in his solution manual and, bewildered... again told little Gauss that he was WAAAAY off. Unfortunately for our tragic hero, this would not be a modern day retelling of a clever little boy who outsmarted his teacher. No, this is a tragic story of a clever little boy who succumbed to a fatal case of the roundoff bugs.
In fact, had his teacher been a little bit more clever, he would have been able to immediately see why Gauss' solution wouldn't have worked. It's immediately obvious that between
Furthermore, if a function
So what went wrong? Well, whenever you're in trouble, just make a plot!
Everything seems to be going fine until around
It turns out that there was nothing wrong with little Gauss' method and the integral is perfectly well-behaved. The culprit lies in the fact that
Before we dig into the floating point encoding underlying most modern computing platforms, let's talk about errors. Suppose that we're computing the value of something and the true value of that computation is "stored" in a variable
Since there's this notion of inexactness, we can talk about the "error" of our computation. There are two kinds of errors:
####Absolute Error
This is just the difference between the true value of the computation
First, we can't compute the absolute or relative errors, because if we can, then we would have know the true value of the computation already! Errors are purely analytic objects that can help us determine how well-behaving our computations are. Second, in the analysis of floating point roundoff, we will typically exclusively use relative error. This may seem counter-intuitive, but it has a few nice properties that simplifies error analysis, as we will see.
One more thing to add, if we allow
Suppose that, through some series of computations, that we have the computed values of
Wow, what a mouthful. This defies intuition, as you would expect error to accumulate additively. Of course, this is true of the absolute errors: $$ \widehat{x+y} = \hat x + \hat y = x + y + e_x + e_y = (x+y) + e_{x+y} $$ but this no longer holds when you consider the relative error.
So why use relative error at all for analysis? It turns out that while convenient here, it becomes less tractable when reasoning about roundoff. Whenever you do an addition operation in floating point, you accumulate a small bit of absolute error from that operation itself! This absolute error unfortunately depends on the value of the computations itself much like relative error does, so sooner or later, you're going to need to start reasoning about relative error. Worry not, we will develop a systematic formula for reasoning about the propagation of relative error that will boil down to high school level algebra (and some calculus).
Now, we've done addition. What about subtraction? You should easily verify for yourself that
$$
\widehat{x-y} = (x-y)(1 + \delta_{x-y})
$$
where the relative error
Let's now derive the propagated relative error of multiplication: $$ \widehat{x \times y} = (x\times y)(1+\delta_{x\times y}) = \hat x \times \hat y = x(1+\delta_x)\times y(1+\delta_y) $$
again, solving for
It is traditional to assume that the relative errors of your computations are small (
The propagation schemes we've talked about so far are all fine and dandy, but if we already have
Well, let's just call
If you've had a little bit of calculus, then the section heading should probably give the answer away. We've already reasoned earlier that we don't care about second order error terms because they're so insignificant. But then, we can express
Summarized, suppose that we want to run the computation
Suppose that we've computed
We're looking to compute
$$
\widehat{\cos(x+y)} \approx \cos(\widehat{x+y})(1 + \delta_{\cos(x+y)})$$
We can simplify this to
From our table of error propagation above, we see that $$ \delta_{\cos(x+y)} \approx -\tan(x+y)(x+y)\delta_{x+y} $$ so we're just left with $$ \widehat{\cos(x+y)} \approx \cos(x+y)\left(1 - 2\tan(x+y)(x+y)\delta_{x+y} \right) $$
From our table of error propagation, we see that
$$
\delta_{x+y} = \frac{x\delta_x + y\times 0}{x + y} = \frac{x}{x+y} \delta_x
$$
which finally simplifies our computed value as
$$
\widehat{\cos(x+y)} \approx \cos(x+y)\left(1 - 2\tan(x+y)(x+y) \frac{x}{x+y} \delta_x \right)
$$
$$
= \cos(x+y)\left(1 - 2\tan(x+y)x \delta_x \right)
$$
$$
= \cos(x+y)(1 + 2\cot(1)\delta_x)
$$
so that the final relative error is
This concludes our brief overview of error propagation. We'll start on IEEE floating point encoding of rational numbers and how to avoid errors in computer arithmetic next time.
Lee, I love following your work. Wish I would have started programming at a younger age...