Computational Science and Engineering

High-productivity through interactivity

Theodore Omtzigt

What Every Computer Scientist Should Know About Floating Point

In a private discussion, compliance with IEEE-754 came up. Being a hardware guy, I love that stuff, so here is the paper that I typically point people to when they ask about the issues surrounding floating-point math. The second paper is a nice historical paper with good references.

One of the reasons that GPUs (and Cray vector machines) have more raw performance than an x86 is that they play with the FP round-off rules. When you don't need to be compliant, you can simplify the FPU significantly and thus fit more of these units on a given die. Also, because GPUs are specialized execution engines optimized for image manipulation, they favor single-precision floating point. Higher precision will always be an afterthought on GPUs, since it does not help them compete in their core business of 3D graphics. With AMD, Intel, and NVIDIA competing for the same 250M unit/year GPU business, none of these players can allocate double-precision FPUs on their graphics processors without getting beaten badly by the others and thus losing market share.
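To make the single-versus-double precision gap concrete, here is a minimal Python sketch. It assumes the usual situation where a Python float is an IEEE 754 double; the helper name `to_single` is made up for this illustration and simply round-trips a value through the 32-bit format using the standard `struct` module.

```python
import struct

def to_single(x: float) -> float:
    """Round a Python float (assumed IEEE 754 double) to the nearest
    single-precision value and return it widened back to a double."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

# 2**24 + 1 fits exactly in a double (53-bit significand) but not in a
# single (24-bit significand), so it has to be rounded.
big = 16777217.0
single_big = to_single(big)

# 0.1 is inexact in both formats, but the single-precision error is far larger.
tenth_error = abs(to_single(0.1) - 0.1)

print(single_big)
print(tenth_error)
```

The integer case loses its last bit outright, which is the kind of error that is invisible in pixel values and painful in long numerical accumulations.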

Replies to This Discussion

After a long battle, Cray eventually did clean up its floating-point.

Note that the SSE extensions to x86 also "play with" the FP rounding rules. Many compilers also have their "go really fast" flag turn off some important IEEE 754 features by default, like gradual underflow and certain exceptions. It can be annoying to figure out the right compiler flags to restore those features while still optimizing your code.
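For anyone who has not seen gradual underflow in action, here is a small Python sketch of what it buys you. It assumes the interpreter runs in the default IEEE 754 environment with subnormals enabled (no flush-to-zero), which is typical but not guaranteed on every platform. The property on display: if two numbers differ, their difference is not zero, even right at the bottom of the normal range.

```python
import math
import sys

x = sys.float_info.min   # smallest positive normal double, 2**-1022
y = x * 1.25             # exactly representable, so x != y

# The exact difference is 2**-1024, which lies below the normal range.
# With gradual underflow it is stored as a subnormal number; with
# flush-to-zero it would become 0.0 even though x != y.
diff = y - x

is_subnormal = 0.0 < diff < sys.float_info.min

print(diff)
print(is_subnormal)
```

Code that guards a division with a test like `if x != y` before computing `1 / (x - y)` relies on exactly this property, which is why turning it off silently can hurt.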

This sure brings back memories. I designed, debugged, and microcoded floating-point units for arithmetic and GPU applications years ago. The Weitek 1032/1033/1066 are probably before your time!
Just a few comments: as I remember, the x86/87 was internally a microcoded 80-bit ALU, which made it quite different from the highly parallel direct-logic implementations. It is an interesting subject. I'll see if I can dig up some papers.
http://www.cs.berkeley.edu/~wkahan/ieee754status/754story.html

I have suggested that Goldberg paper to others myself; it is quite good.

Pete Basel

© 2009 Created by Theodore Omtzigt on Ning.