Lynne Teaches Tech: What’s an integer overflow?

most people count using a decimal system. the lowest digit is 0, followed by 1, 2, and so on, through to 9. when you’re counting up and you reach nine, you need to add another digit. there’s no way to express ten with only one digit, so you use two digits to write 10.

computers use a binary number system instead. the lowest digit is 0, followed by 1, and that’s it! so when you count to one, you need to add another digit to get to two, which is written as 10 in binary.
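if you’d like to see this counting for yourself, here’s a small python sketch. it uses python’s built-in `format` to write numbers in binary. the name `binary_counts` is just made up for this example.

```python
# write the numbers zero to five in binary
binary_counts = [format(n, "b") for n in range(6)]

for n, bits in enumerate(binary_counts):
    print(n, bits)
# 0 0
# 1 1
# 2 10
# 3 11
# 4 100
# 5 101
```

notice that a new digit appears at two (10) and again at four (100), just like decimal grows a digit at 10 and 100.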

in decimal, from right to left, the digits mean ones, tens, hundreds, thousands, ten thousands… this means that a 4 in the third position from the right (followed by two zeroes) means four hundred. from right to left, binary digits mean ones, twos, fours, eights… a 1 in the fourth position from the right means eight, which is written as 1000. decimal uses powers of 10, binary uses powers of 2.
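the place-value idea can be sketched in a few lines of python. `place_values` is a hypothetical helper written for this post, not a standard function. it shows how much each digit is worth in a given base.

```python
def place_values(digits, base):
    """return what each digit is worth, in the same order the digits are written."""
    values = []
    # the last digit written is worth base**0, the one before it base**1, and so on
    for position, digit in enumerate(reversed(digits)):
        values.append(int(digit) * base ** position)
    return list(reversed(values))

print(place_values("400", 10))        # [400, 0, 0]
print(place_values("1000", 2))        # [8, 0, 0, 0]
print(sum(place_values("1011", 2)))   # 8 + 0 + 2 + 1 = 11
```

adding up the place values gives you the number itself, whichever base you started with.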

let’s say you can only write two digits on a piece of paper. you can easily write numbers like 12 and 8 and 74, but what about 100? there’s nothing you can do. but let’s assume you aren’t aware of that, and you’re a computer following a simple algorithm calculating 99+1. first, you increment the least significant digit, which is the rightmost one. this leaves you with 90, and you need to carry the one. so you increment the other nine, and carry the one, leaving you with 00. normally, you’d just write the one you’ve been carrying and end up with 100, which is the correct answer. however, you only have space for two digits, so you can’t continue. thus, you end up saying that 99 plus 1 equals 0.
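here’s a rough python sketch of that piece-of-paper algorithm. `add_one_on_paper` is a made-up name, and the digits are stored as a list in the order you’d write them. it’s only meant to illustrate the carry getting lost, not how a real computer is built.

```python
def add_one_on_paper(digits):
    """add 1 to a list of decimal digits without ever using more digits."""
    result = list(digits)
    carry = 1
    # start at the last (least significant) digit and work backwards
    for i in range(len(result) - 1, -1, -1):
        total = result[i] + carry
        result[i] = total % 10   # the digit we can write down
        carry = total // 10      # the one we have to carry, if any
    # any carry still left over has nowhere to go, so it is simply dropped
    return result

print(add_one_on_paper([7, 4]))  # [7, 5]
print(add_one_on_paper([9, 9]))  # [0, 0]
```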

an eight bit number has room for eight binary digits. this means that if the computer is at 11111111 (255 in decimal) and tries to add 1 again, it ends up with zero. this is called an integer overflow error – the one that the computer has been carrying has “overflowed” and spilled out, and is now lost. the number has wrapped around from 255 to 0, as if the numbers were on a loop of paper. underflow is the opposite of this problem – zero minus one is 11111111.
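python’s own integers never overflow (they just keep growing), so to try this out we have to fake an eight bit number by throwing away everything above the eighth bit. `wrap8` is a hypothetical helper for this sketch.

```python
BITS = 8
MASK = (1 << BITS) - 1  # 0b11111111, which is 255

def wrap8(n):
    """keep only the lowest eight bits, like an unsigned eight bit number would."""
    return n & MASK

print(wrap8(255 + 1))                # 0
print(wrap8(0 - 1))                  # 255
print(format(wrap8(0 - 1), "08b"))   # 11111111
```

languages with fixed-size integers do this wrapping (or raise an error) on their own. the masking here is only standing in for that behaviour.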

so if adding to the highest number possible should create zero, why does it sometimes give a negative number instead? this is due to signed integers. an unsigned integer looks like this:
7279
a signed integer looks like +628 or -216. a computer doesn’t have anywhere special to put that negative (or positive) sign, so it has to use one of the bits in the number. 1111 might mean -111, for example.

(n.b. the method of signing integers described below is “offset binary”. there are other methods of doing this as well, but we’ll focus on this one because it’s intuitive.)

if we want to represent negative numbers, we can’t start at zero, because we need to be able to go lower than that. in binary, there are sixteen different possible combinations of four digits/bits, from 0000 to 1111 – zero to fifteen. instead of treating 0000 as zero, we can move zero to the halfway point between 0000 and 1111. since there are sixteen positions between these two numbers, there’s no middle. (the middle between one and three is two, but there’s no whole number middle between one and four.) we’ll settle for choosing 1000 to be our zero, which means there are eight numbers below zero and seven numbers above it. if we treat zero as positive, we have eight negative and eight positive numbers to work with. our number range has now gone from 0 to 15, to -8 to 7. we can’t count as high, but we can count lower.
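as a minimal sketch, here’s four-bit offset binary in python. `encode` and `decode` are names invented for this example. the only real trick is adding or subtracting eight.

```python
BITS = 4
OFFSET = 1 << (BITS - 1)  # 8, so the pattern 1000 stands for zero

def encode(value):
    """turn a number from -8 to 7 into its four-bit offset binary pattern."""
    if not -OFFSET <= value < OFFSET:
        raise ValueError("value does not fit in four bits")
    return format(value + OFFSET, "04b")

def decode(pattern):
    """turn a four-bit offset binary pattern back into a number."""
    return int(pattern, 2) - OFFSET

print(encode(-8))       # 0000
print(encode(0))        # 1000
print(encode(7))        # 1111
print(decode("1111"))   # 7
```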

in such a system, 1111 would be 7 instead of 15, just as 1000 is 0 instead of 8. when you add one to 1111, it overflows to 0000, which means -8 in our system. this is why adding to a high, positive number can produce a low, negative number. integers that can wrap around from positive to negative like this are signed integers.
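to see the wraparound happen, here’s a small sketch that adds one to a four-bit offset binary number. `add_one_offset` is a hypothetical helper, and the four-bit limit is faked with a mask because python integers don’t overflow by themselves.

```python
OFFSET = 8      # the pattern 1000 stands for zero
MASK = 0b1111   # only four bits are kept

def add_one_offset(value):
    """add 1 to a number stored in four-bit offset binary, losing any carry."""
    pattern = value + OFFSET          # what is actually stored in the four bits
    pattern = (pattern + 1) & MASK    # a carry out of the fourth bit is lost
    return pattern - OFFSET           # read the bits back as a signed number

print(add_one_offset(3))   # 4
print(add_one_offset(7))   # -8
```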

overflow and underflow bugs are the root of many software issues, ranging from fascinating to dangerous. according to a widely repeated story, in the first game in sid meier’s civilization series, gandhi had an aggressiveness score of 1, the lowest possible. certain political actions reduced that score by 2, which caused it to underflow and become 255 instead – far beyond the intended maximum – which gave him a very strong tendency to use nuclear weaponry. (sid meier has since said that this bug never actually existed, but it’s still a neat illustration of underflow.) the story was so well-known and accidentally hilarious that the company decided to intentionally make gandhi have a strong affinity for nukes in almost all the following games. some arcade games relied on the level number to generate the level, and broke when the number went above what it was expecting. (the reason behind the pac-man “kill screen” is particularly interesting!) for a more serious and worrying example of integer overflow, see this article: https://en.wikipedia.org/wiki/Year_2038_problem (unlike Y2K, this one is an actual issue, and has already caused numerous problems)
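both examples come down to very small sums. the sketch below first does the gandhi arithmetic with an eight bit number, then shows what happens when a clock that counts seconds since 1970 runs out of room. two assumptions here: the clock is a signed 32 bit number, and it’s stored as two’s complement (a different signing method from the offset binary used in this post). `wrap_signed_32` is a made-up helper.

```python
from datetime import datetime, timedelta, timezone

# the gandhi story: 1 minus 2 in an unsigned eight bit number
aggression = (1 - 2) & 0xFF
print(aggression)  # 255

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def wrap_signed_32(n):
    """wrap n into the range of a signed 32 bit integer (two's complement)."""
    return (n + 2**31) % 2**32 - 2**31

last_second = 2**31 - 1  # the highest value the clock can hold
print(EPOCH + timedelta(seconds=last_second))
# 2038-01-19 03:14:07+00:00
print(EPOCH + timedelta(seconds=wrap_signed_32(last_second + 1)))
# 1901-12-13 20:45:52+00:00
```

one second after 03:14:07 on 19 january 2038, such a clock would jump back to december 1901.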

the first image is a chart explaining two methods of representing negative numbers with four bits (the one used in this post is on the left). the second is a real-world example of an “overflow”.

thanks so much for reading!
