Wednesday, February 10, 2010
The Ellis Island Bug
A couple of years ago, I developed a version of a well-known reasoning exercise. It's a simple exercise, and I implemented it as a really simple computer program. I described it to James Bach, and suggested that we put it in our Rapid Software Testing class.
James was skeptical. He didn't figure from my description that the exercise would be interesting enough. I put in a couple of little traps, and tried it a few times with colleagues and other unsuspecting victims, sometimes in person, sometimes over the Web. Then I tried the actual exercise on James, using the program. He helpfully stepped into one of the traps. Thus emboldened, I started using the exercise in classes. Eventually James found an occasion to start using it too. He watched students dealing with it, had some epiphanies, tried some experiments. At one point, he sat down with his brother Jon and they tested the program aggressively, and revealed a ton of new information about it—many of which I hadn't known myself. And I wrote the thing.
Experiential exercises are like peeling an onion; beneath everything we see on the surface, there's another layer that we can learn about. Today we made a discovery; we found a bug as we transpected on the exercise, and James put a name on it.
We call it an Ellis Island bug. Ellis Island bugs are data conversion bugs, in which a program silently converts an input value into a different value. They're named for the tendency of customs officials at Ellis Island, a little way back in history, to rename immigrants unilaterally with names that were relatively easy to spell. With an Ellis Island bug, you could reasonably expect an error on a certain input. Instead you get the program's best guess at what you "really meant".
There are lots of examples of this. We have an implementation of the famous triangle program, written many years ago in Delphi. The program takes three integers as input, with each number representing the length of a side of a triangle. Then the program reports on whether the triangle is scalene, isoceles, or equilateral. Here's the line that takes the input:
function checksides (a, b, c : shortint) : string
Here, no matter what numeric value you submit, the Delphi libraries will return that number as a signed integer between -128 and 127. This leads to all kinds of amusing results: a side of length greater than 127 will invisibly be converted to a negative number, causing the program to report "not a triangle" until the number is 256 or greater; and entries like 300, 300, 44 will be interpreted as an equilateral triangle.
Ah, you say, but no one uses Delphi any more. So how about C? We've been advised forever not to trust input formatting strings, and to parse them ourselves. How about Ruby?
Ruby's String object supplies a to_i method, which converts a string to its integer representation. Here's what the Pickaxe says about that:
to_i str.to_i( base=10 ) → int
Returns the result of interpreting leading characters in str as an integer base base (2 to 36). Given a base of zero, to_i looks for leading 0, 0b, 0o, 0d, or 0x and sets the base accordingly. Leading spaces are ignored, and leading plus or minus signs are honored. Extraneous characters past the end of a valid number are ignored. If there is not a valid number at the start of str, 0 is returned. The method never raises an exception.
We discovered a bunch of things today as we experimented with our program. The most significant thing was the last two sentences: an invalid number is silently converted to zero, and no exception is raised!
We found the problem because we thought we were seeing a different one. Our program parses a string for three numbers. Depending upon the test that we ran, it appeared as though multiple signs were being accepted (+--+++--), but that only the first sign was being honoured. Or that only certain terms in the string tolerated multiple signs. Or that you could use multiple signs once in a string—no, twice. What the hell? All our confusion vanished when we put in some debug statements and saw invalid numbers being converted to 0, a kind of guess as to what Ruby thought you meant.
This is by design in Ruby, so some would say it's not a bug. Yet it leaves Ruby programs spectacularly vulnerable to bugs wherein the programmer isn't aware of the behaviour of the language. I knew about to_i's ability to accept a parameter for a number base (someone showed it to me ages ago), but I didn't know about the conversion-to-zero error handling. I would have expected an exception, but it doesn't do that. It just acts like an old-fashioned customs agent: "S-C-H-U-M-A-C... What did you say? Schumacher? You mean Shoemaker, right? Let's just make that Shoemaker. Youse'll like that better here, trust me."
We also discovered that the method is incorrectly documented: to_i does raise an exception if you pass it an invalid number base—37, for example.
There are many more stories to tell about this program—in particular, how the programmer's knowledge is, at best, is a different set compared to what empirical testing can reveal. Many of the things we've discovered about this trivial program could not have been caught by code review; many of them aren't documented or are poorly documented both in the program and in the Ruby literature. We couldn't look them up, and in many cases we couldn't have anticipated them if they hadn't emerged from testing.
There are other examples of Ellis Island bugs. A correspondent, Brent Lavelle, reports that he's seen a bug in which 50,00 gets converted to 5000, even if the user is from France or Germany (in those countries, a comma rather than a period denotes the decimal, and they use spaces where we use commas).
Now: boundary tests may reveal some Ellis Island bugs. Other Ellis Island bugs defy boundary testing, because there's a catch: many such tests would require you to know what the boundary is and what is supposed to happen when it is crossed. From the outside, that's not at all clear. It's not even clear to the programmer, when libraries are doing the work. That's why it's insufficient to test at the boundaries that we know about already; that's why we must explore.
James was skeptical. He didn't figure from my description that the exercise would be interesting enough. I put in a couple of little traps, and tried it a few times with colleagues and other unsuspecting victims, sometimes in person, sometimes over the Web. Then I tried the actual exercise on James, using the program. He helpfully stepped into one of the traps. Thus emboldened, I started using the exercise in classes. Eventually James found an occasion to start using it too. He watched students dealing with it, had some epiphanies, tried some experiments. At one point, he sat down with his brother Jon and they tested the program aggressively, and revealed a ton of new information about it—many of which I hadn't known myself. And I wrote the thing.
Experiential exercises are like peeling an onion; beneath everything we see on the surface, there's another layer that we can learn about. Today we made a discovery; we found a bug as we transpected on the exercise, and James put a name on it.
We call it an Ellis Island bug. Ellis Island bugs are data conversion bugs, in which a program silently converts an input value into a different value. They're named for the tendency of customs officials at Ellis Island, a little way back in history, to rename immigrants unilaterally with names that were relatively easy to spell. With an Ellis Island bug, you could reasonably expect an error on a certain input. Instead you get the program's best guess at what you "really meant".
There are lots of examples of this. We have an implementation of the famous triangle program, written many years ago in Delphi. The program takes three integers as input, with each number representing the length of a side of a triangle. Then the program reports on whether the triangle is scalene, isoceles, or equilateral. Here's the line that takes the input:
function checksides (a, b, c : shortint) : string
Here, no matter what numeric value you submit, the Delphi libraries will return that number as a signed integer between -128 and 127. This leads to all kinds of amusing results: a side of length greater than 127 will invisibly be converted to a negative number, causing the program to report "not a triangle" until the number is 256 or greater; and entries like 300, 300, 44 will be interpreted as an equilateral triangle.
Ah, you say, but no one uses Delphi any more. So how about C? We've been advised forever not to trust input formatting strings, and to parse them ourselves. How about Ruby?
Ruby's String object supplies a to_i method, which converts a string to its integer representation. Here's what the Pickaxe says about that:
to_i str.to_i( base=10 ) → int
Returns the result of interpreting leading characters in str as an integer base base (2 to 36). Given a base of zero, to_i looks for leading 0, 0b, 0o, 0d, or 0x and sets the base accordingly. Leading spaces are ignored, and leading plus or minus signs are honored. Extraneous characters past the end of a valid number are ignored. If there is not a valid number at the start of str, 0 is returned. The method never raises an exception.
We discovered a bunch of things today as we experimented with our program. The most significant thing was the last two sentences: an invalid number is silently converted to zero, and no exception is raised!
We found the problem because we thought we were seeing a different one. Our program parses a string for three numbers. Depending upon the test that we ran, it appeared as though multiple signs were being accepted (+--+++--), but that only the first sign was being honoured. Or that only certain terms in the string tolerated multiple signs. Or that you could use multiple signs once in a string—no, twice. What the hell? All our confusion vanished when we put in some debug statements and saw invalid numbers being converted to 0, a kind of guess as to what Ruby thought you meant.
This is by design in Ruby, so some would say it's not a bug. Yet it leaves Ruby programs spectacularly vulnerable to bugs wherein the programmer isn't aware of the behaviour of the language. I knew about to_i's ability to accept a parameter for a number base (someone showed it to me ages ago), but I didn't know about the conversion-to-zero error handling. I would have expected an exception, but it doesn't do that. It just acts like an old-fashioned customs agent: "S-C-H-U-M-A-C... What did you say? Schumacher? You mean Shoemaker, right? Let's just make that Shoemaker. Youse'll like that better here, trust me."
We also discovered that the method is incorrectly documented: to_i does raise an exception if you pass it an invalid number base—37, for example.
There are many more stories to tell about this program—in particular, how the programmer's knowledge is, at best, is a different set compared to what empirical testing can reveal. Many of the things we've discovered about this trivial program could not have been caught by code review; many of them aren't documented or are poorly documented both in the program and in the Ruby literature. We couldn't look them up, and in many cases we couldn't have anticipated them if they hadn't emerged from testing.
There are other examples of Ellis Island bugs. A correspondent, Brent Lavelle, reports that he's seen a bug in which 50,00 gets converted to 5000, even if the user is from France or Germany (in those countries, a comma rather than a period denotes the decimal, and they use spaces where we use commas).
Now: boundary tests may reveal some Ellis Island bugs. Other Ellis Island bugs defy boundary testing, because there's a catch: many such tests would require you to know what the boundary is and what is supposed to happen when it is crossed. From the outside, that's not at all clear. It's not even clear to the programmer, when libraries are doing the work. That's why it's insufficient to test at the boundaries that we know about already; that's why we must explore.
