Hundred year mistakes

My manager and I got off on a tangent in our most recent one-on-one on the subject of the durability of design mistakes in programming languages. A particular favourite of mine is the worst of the operator precedence problems of C; the story of how it came about is an object lesson in how sometimes gradual evolution produces weird results. Since he was not conversant with all the details, I thought I might write it up and share the story today.

First off, what is the precedence of operators in C? For our purposes today we’ll consider just three operators: &&, & and ==, which I have listed in order of increasing precedence.

What is the problem? Consider:

int x = 0, y = 1, z = 0;
int r = (x & y) == z; // 1
int s = x & (y == z); // 0
int t = x & y == z;   // ?

Remember that before 1999, C had no Boolean type and that the result of a comparison is either zero for false, or one for true.

Is t supposed to equal r or s?

Many people are surprised to find out that t is equal to s! Because == is higher precedence than &, the comparison result is an input to the &, rather than the & result being an input to the comparison.

Put another way: reasonable people think that

x & y == z

should be parsed the same as

x + y == z

but it is not.
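
If you want to check this for yourself, here is a minimal sketch in C. The function names are made up for illustration, and many compilers will warn about the unparenthesized expression, which is rather the point.

```c
#include <assert.h>
#include <stdio.h>

/* (x & y) == z: AND first, then compare the result with z. */
int and_first(int x, int y, int z) { return (x & y) == z; }

/* x & (y == z): compare first, then AND x with the 0-or-1 result. */
int compare_first(int x, int y, int z) { return x & (y == z); }

/* No parentheses: == binds tighter than &, so this is compare_first. */
int no_parens(int x, int y, int z) { return x & y == z; }

/* For contrast, + binds tighter than ==, so this is (x + y) == z. */
int sum_no_parens(int x, int y, int z) { return x + y == z; }

/* Prints r, s and t for the values used in the text. */
void print_example(void) {
    int x = 0, y = 1, z = 0;
    /* The unparenthesized form agrees with s, not with r. */
    assert(no_parens(x, y, z) == compare_first(x, y, z));
    printf("r=%d s=%d t=%d\n",
           and_first(x, y, z), compare_first(x, y, z), no_parens(x, y, z));
}
```

Calling print_example() from main should print r=1 s=0 t=0.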

What is the origin of this egregious error that has tripped up countless C programmers? Let’s go way back in time to the very early days of C. In those days there was no && operator. Rather, if you wrote

if (x() == y & a() == b)
  consequence;

the compiler would generate code as though you had used the && operator; that is, this had the same semantics as

if (x() == y)
  if (a() == b)
    consequence;

so that a() is not called if the left hand side of the & is false. However, if you wrote

int z = q() & r();

then both sides of the & would be evaluated, and the results would be binary-anded together.

That is, the meaning of & was context sensitive; in the condition of an if or while it meant what we now call &&, the “lazy” form, and everywhere else it meant binary arithmetic, the “eager” form.
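
Modern C no longer has the context-sensitive behaviour, but the two meanings can still be told apart by counting side effects. The following is only a sketch in today's C, with hypothetical helper names; it spells the "lazy" form as && because that is the only way to get it now.

```c
#include <assert.h>

static int right_calls = 0;  /* how many times right_side() has run */

static int left_side(void)  { return 0; }                 /* always false */
static int right_side(void) { right_calls++; return 1; }  /* counts its calls */

/* Lazy form: what early C did for & in a condition, written && today.
   Returns how many times the right operand was evaluated. */
int right_calls_lazy(void) {
    int taken = 0;
    right_calls = 0;
    if (left_side() && right_side())
        taken = 1;
    assert(!taken);
    return right_calls;
}

/* Eager form: both operands are evaluated, then bitwise-anded.
   Returns how many times the right operand was evaluated. */
int right_calls_eager(void) {
    int z;
    right_calls = 0;
    z = left_side() & right_side();
    assert(z == 0);
    return right_calls;
}
```

Here right_calls_lazy() returns 0 because the right operand is skipped, while right_calls_eager() returns 1.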

However, in either context the & operator was lower precedence than the == operator. We want

if(x() == y & a() == b)

to be

if((x() == y) & (a() == b))

and certainly not

if((x() == (y & a())) == b)

This context-sensitive design was quite rightly criticized as confusing, and so Dennis Ritchie, the designer of C, added the && operator, so that there were now separate operators for bitwise-and and short-circuit-and.

The correct thing to do at this point from a pure language design perspective would have been to make the operator precedence ordering &&, ==, &. This would mean that both

if(x() == y && a() == b())

and

if(x() & a() == y)

would mean exactly what users expected.

However, Ritchie pointed out that doing so would cause a potential breaking change. Any existing program that had the fragment if(a == b & c == d) would remain correct if the precedence order was &&, &, ==, but would become an incorrect program if the operator precedence was changed without also updating it to use &&.
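
To make the breakage concrete, here is a small sketch that writes out both parses with explicit parentheses. The function names and the sample values are assumptions chosen for illustration; the "hypothetical" parse is what the fragment would have meant had & been given higher precedence than ==.

```c
#include <assert.h>

/* How C actually parses  a == b & c == d  (== binds tighter than &). */
int actual_parse(int a, int b, int c, int d) {
    return (a == b) & (c == d);
}

/* How the same text would parse if & bound tighter than ==.
   == is left-associative, so the result of the first comparison
   is then compared with d. */
int hypothetical_parse(int a, int b, int c, int d) {
    return (a == (b & c)) == d;
}

/* The unparenthesized legacy fragment, as compiled by today's C. */
int legacy_fragment(int a, int b, int c, int d) {
    return a == b & c == d;
}

/* One input where the two parses disagree. */
void check_example(void) {
    assert(legacy_fragment(1, 1, 2, 2) == actual_parse(1, 1, 2, 2));
    assert(actual_parse(1, 1, 2, 2) != hypothetical_parse(1, 1, 2, 2));
}
```

With a = 1, b = 1, c = 2, d = 2 the actual parse yields 1 and the hypothetical parse yields 0, so an un-updated program would silently change meaning.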

There were several hundred kilobytes of existing C source code in the world at the time. SEVERAL HUNDRED KB. What if you made this change to the compiler and failed to update one of the & to &&, and made an existing program wrong via a precedence error? That’s a potentially disastrous breaking change.

You might say “just search all the source code for that pattern” but this was two years before grep was invented! It was as primitive as can be.

So Ritchie made the precedence order &&, &, == and it has stayed that way ever since, effectively adding a little bomb to C that goes off every time someone treats & as though it parses like +, all in order to maintain backwards compatibility with a version of C that only a handful of people ever used.

But wait, it gets worse.

C++, Java, JavaScript, C#, PHP and who knows how many other languages largely copied the operator precedence rules of C, so they all have this bomb in them too. (Swift, Go, Ruby and Python get it right.) Fortunately it is mitigated somewhat in languages that impose type system constraints; in C# it is an error to treat an int as a bool, but still it is vexing to require parentheses where they ought not to be necessary were there justice in the world. (And the problem is also mitigated in more modern languages by providing richer abstractions that obviate the need for frequent bit-twiddling.)

The moral of the story is: The best time to make a breaking change that involves updating existing code is now, because the bad designs that result from maintaining backwards compat unnecessarily can have repercussions for decades, and the amount of code to update is only going to get larger. It was a mistake to not take the breaking change when there were only a few tens of thousands of lines of C code in the world to update. It’s fifty years since this mistake was made, and since it has become embedded in popular successor languages we’ll be dealing with its repercussions for fifty more at least, I’d wager.


UPDATE: The most common feedback I’ve gotten from this article is “you should always use parentheses when it is unclear”. Well, obviously, yes. But that rather misses the point, which is that there is no reason for the novice developer to suppose that the expression x & y == z is under-parenthesized when x + y == z works as expected. The design of a language should lead us to naturally write correct code without having to think “will I be punished for my arrogance in believing that code actually does what it looks like it ought to?” 

41 thoughts on “Hundred year mistakes”

  1. In my first job, I was writing C using a K&R (i.e., pre-ANSI) C compiler (http://www.aztecmuseum.ca/compilers.htm#cpm80). It was very primitive, though we did have “&&”. I still write C# code with superfluous parentheses all the time. Often, when I get code review comments about my overly parenthetical code, I’ll write out something like you have above (though with C# it’s harder to mess up) and ask whether my parentheses make my intentions clearer than the “clean” code does.

    • Thanks for that note; I made the mistake of looking up the precedence tables for those languages on the internet which is of course full of lies with ads arranged around it. I’ll correct the text.

  2. This reminds me of a (possibly apocryphal) anecdote I heard about the Makefile format. Relatively early on, the whole “tabs vs spaces” thing caught people out, and a breaking change was suggested. The response was “But there are *dozens* of Makefiles out there already! We can’t break them!” And so everyone else had to pay the cost forever…

  3. I vaguely remember that in some pre-release version of C# there was no break statement required in switch/case, and instead one was required to use goto for fall-through. I remember thinking, “hey, that’s a great idea”. Then, of course, it didn’t make it to release because it was deemed too “breaking” from other languages.
    Did you work on C# that early? Is this memory only my imaginings or can you confirm this?

    • Your vague memory is true right up to the point where you remember that this feature was cut! It made it into release and you can use it today. C# does not require “break” at the end of a switch section, and if you want to implement fall-through, you use “goto case whatever”.

      The rule in C# is that the “end point” of a switch section must not be “reachable”. This implies that every path through a given switch section must end in a statement classified as having an unreachable end point. Those statements are break, continue, goto, throw, return, and any loop where the loop condition is the constant true.

      Try it!

      switch(x)
      {
      case 0: return "zero";
      case 1: throw new Exception();
      case 2: while(true) {} //weird but legal!
      case 3: goto case 1;
      }

      Perfectly legal, no breaks.

      • Fair enough. But what I was remembering was that even when the end of the switch section was reachable the break wasn’t required. Most likely a false memory.

  4. Pingback: The Morning Brew - Chris Alcock » The Morning Brew #2942

  5. Interesting but I disagree that the change should have been made then. The change should have been made when compilers became sophisticated enough to be able to have flags for implementing both behaviors and reasonable diagnostics – Definitely for C89. It’s certainly not Ritchie’s fault that later languages with only a weak link to C copied its precedence rules.

    • I’m not interested in finding a person to blame; Ritchie made decisions that he felt were right at the time and regretted them later, and as you note, other language designers chose to copy that regrettable design of their own accord.

      That said, it is indisputable that this bad design has become entrenched and that it continues to be a “gotcha”. I am interested in learning how that happens with an eye to preventing these problems in the future.

      • An enviable goal but probably impossible. I also think that the problems of today and tomorrow are more to do with entrenched bad library decisions than bad language decisions. C++ shows the problems both ways in that it has suffered for years for not standardising essentials like networking because nobody could agree on the perfect abstraction and yet went ahead with iostreams that are now considered a handicap and the error category stuff that nobody knows or understands.

  6. I can’t tell you how many times I’ve seen somebody get the precedence wrong when mixing logical, bitwise, and arithmetic operations and yet discover that the code actually does the right thing. For example:

    // INCORRECT CODE
    if (x & MASK == MASK) do_something();

    But if MASK happens to be 1, then this works because MASK == MASK is 1 and the result of the bitwise and is then implicitly tested for != 0.

  7. This is why my C instructor in college said, “If you ever find yourself going to look up the operator precedence, please don’t. Use parens. They are free.” I live by this rule today.

    • I like the sentiment, but code is written once and read many times over its lifetime. For every bug fix and every enhancement lots of code needs to be read quickly to see whether it is relevant to the work in hand. Hence reaching for the precedence tables is often necessary.

      • I’m not following your train of thought here. Surely if code is going to be read (or maintained) frequently over its lifetime then the best way to write it is the way that does not require the reader to look something up in a table! Try to write code so that it is obviously correct to any reader and it will be more likely to be correct and maintainable.

        • I entirely agree, writing code that is obviously correct or that clearly shows (perhaps with brackets) what is wanted is best. But not everyone writes to that standard; also, I have seen coding standards that do not allow redundant brackets.

          My point is that a lot of programmers look after old code and some old code uses “interesting” expressions that can be difficult to understand.

  8. ISTM the original sin here is that in C the conceptual type bool was really just a magic value of physical type int. Most of the rest of the pain flows directly from that decision.

    Perhaps I’m naive here, but in modern languages which distinguish bool from int and especially in strongly typed ones, e.g. C#, this == over &^ precedence decision is much less of a logical fault. In one point of view, the purpose of the comparison operators is to convert two subexpressions of arbitrary type into a single value of type bool. e.g.

    var x = bool1 & float2 == float3 // What’s the order of evaluation?

    From a type perspective it makes no sense to evaluate the & first. It makes even less sense in this example where foo is an arbitrary type with no plausible implicit (or even explicit) conversion to bool:

    var x = bool1 & foo2 == foo3 // What’s the order of evaluation?

    There are many examples in language design where doing the right thing in complex cases forces our hand in simpler cases. Simpler cases where a first-glance appraisal would suggest a different solution. In this blog the many discussions about corner cases in C# dynamic, type inference, and all the rest attest to the importance of handling the complex case correctly and all other cases consistently, perhaps at the expense of first-glance “intuitiveness” in the simple case. e.g.

    var x = bool1 & int2 == int3 // Given the above, _now_ what’s the sensible order of evaluation?

    As to the larger point about fixing v1.0 (or v0.5) design miscues early I agree that sooner is better.

    But understand that we’re also looking at survivor bias here. C became successful. We’ve all seen software products that underwent lots of feature turbulence forcing backcompat problems in the early days. Many of those failed in the marketplace. I personally have built products as an early adopter on what turned out to be an unstable stack where keeping up with the breaking changes killed our profitability and our technical credibility. Backcompat, even early backcompat, is a two-edged sword for sure. As the great Raymond Chen so often shows us.

    • Though you make good points, I note that bool1 & bool2 == bool3 is legal in C#, so the problem still exists even if we have a type system that discriminates between int and bool.

  9. Funny, you seem to be writing that “=” and “==” have equivalence when they do not. “==” is a comparison command and implies “if x = y” which in any sane world has primacy over mathematical operations. However, you are right in that this is confusing notation due to the use of “==” in math proofs as a way to replace “≡” because getting to the symbol in ASCII was impractical. It must just be burned into my old brain that comparison operators only look at the immediate variables and not expressions.

    • I’m not following your point here; this article does not discuss the difference between assignment and comparison and I don’t understand the distinction you’re making between variables and expressions.

  10. This is why all languages have parentheses () to control order of operation. Anything more implicit than the elementary school order of operations should use parentheses to avoid ambiguity.

  11. “how does the novice…” like anything else, from experience or instruction. One of the first things you learn when you learn a second programming language is they differ in operator precedence. So you use parens. Mike Ober is right, though doesn’t go far enough. Wrap * and + . Certainly wrap * and / . Saves brain cells. Solves the problem.

    With regard to 100 year advice, two words: “Survivor bias”. Fix or patch? That’s a question you’ll be asking until you meet that great coder in the sky. You’ll get the answer wrong in proportion to the number of times you must make the choice. Don’t fret.

    Thanks for the analysis and history lesson!

  12. The other “interesting” design decision brought out in this article was the one to originally treat “&” as short-circuiting in conditional expressions but as fully evaluative in all other circumstances.

    These were still early days in the development of high level languages, so the designers’ lack of foresight (from our privileged POV) is understandable. As well we can clearly see two mental biases at work as this was done.

    One: Everyone using these tools will be a true expert. Said another way, the total body of knowledge required to use the language correctly will be reasonably small so all users can master all of it. So language inconsistencies and corner cases are fine. That may even have been true in the early 1970s; it certainly isn’t in 2020.

    Two: Coders in C will think like assembly language coders do (in 1970, now “did” in 2020). Short-circuit evaluating a conditional construct is totally how it was done by hand back in the day. On the rare occasions you needed full evaluation you wrote the extra half-page of code to make it so. So any given coder’s decision at any given point was all-but guaranteed to be conscious & deliberate. Higher level languages are like power tools. They enable us to cause bigger damage with smaller slips. Which is why the “pit of success” mentality of the C# (& BCL) design crew has been so valuable and powerful.

    I wonder what unconscious biases modern language & toolchain developers are baking into new products today that will have our grandkids scratching their heads and wondering WHAT could we have been thinking when we did THAT?!

    • This is a great insight. In the early 1990s Brian Kernighan gave a talk at my school and expressed your first point quite explicitly. He noted that he often wrote small programs that used *every* language feature of C, and that he could not imagine writing a program of any size that used every feature of C++.

      Your last para is very much on the minds of the C# design team. My favourite example is that VB added XML literals and C# did not. The design process of VB encourages immediate value to current line-of-business programmers. The design process of C# encourages a longer-term vision where the details of current technology trends are not baked into the language. We didn’t know whether XML would be the most popular human-readable serialization format for six months or six decades, and it turned out to be closer to the former. But we do believe that sorting/searching/grouping/joining data regardless of format will be valuable forever. That’s why LINQ expresses those ideas in a format-neutral, timeless manner, rather than simply being SQL embedded in C# as it was originally conceived.

  13. > Remember that C has no Boolean type; …

    In fact it does, since the 1999 edition of the ISO C standard. It’s called “_Bool” (or “bool” if you have “#include <stdbool.h>”). But the equality and relational operators still yield values of type int, with the value 0 for false and 1 for true.

  14. This is why I always use parentheses for expressions that mix AND and OR conditions, and get irritated when software removes “unnecessary” parentheses from large expressions (I’m looking at *you*, Jira).

    Not as terrible, but I’ve always thought that changing the logical not operator from tilde to the exclamation point was a mistake going from BCPL to B that’s carried forward into C, C++, C# and now everywhere. For such an important modifier (literally the opposite of what’s written), an exclamation point is a very easy to overlook character.

    if (!ImpossiblyLongMethodNameOrSomeBigParenthetical)

    vs

    if (~ImpossiblyLongMethodNameOrSomeBigParenthetical)

    Some languages do use other characters. Lua, for example, uses ~= for (not equal) or the word “not” in conditionals (“if not foo then”). I’d love to see tilde become more common.

    • Or even better, “not”. Again, repurposing arbitrary punctuation to serve as operators seems very natural to us because we’ve all been raised in the culture of languages influenced by C for fifty years. But it really is quite horrid. We have centuries of using + – x / for the basic operations, and sure, & is reasonable for “and”. But why should * mean both “multiply” and “dereference”? Why is ! or ~ naturally thought of as “not”? These conventions serve as gatekeepers that keep out the uninitiated from our priesthood. Programming is already hard enough, and we don’t have to go full COBOL to make it a little easier to read.

    • Indeed, there are many interesting possible mitigations. But the point is: ideally we would not have to spend valuable effort building compilers today to mitigate defects created fifty years ago in a different language; better to prevent the problem in the first place. (And I note that the error message is misleading, as it implies that the correct fix is to parenthesize the comparison; that’s maybe not the right fix!)

      • Sure. I got your point – was just being curious to try it in D. And yes, the message can be misleading – would probably be better phrased somewhat like “either … or ….” – but at least it’s (much) better than nothing, esp. when thinking of all the pain that it prevents. The nice part of this detail may well be that this is active reality, not just a possibility.

  15. Another reason I always use parentheses – NO AMBIGUITY.

    It also irks me when I see open source projects impose coding styles that deny use of explicit parentheses. It’s only a matter of time until someone reads or writes code the wrong way with this mindset, which only has the advantage of saving a few bytes of not-precious memory.

    • I agree with you that liberal use of parentheses is a good practice most programmers acquire.

      I think the ultimate solution might be to dispense with complex precedence rules – which tend to be subtly different between languages anyway – and standardise on left-to-right evaluation (i.e. reading order) in the absence of parentheses.

  16. Remarkable! My guess would also have been that (x & y == z) == ((x & y) == z). The worst thing about it is maybe not even that many other important programming languages copy this behavior, but that others don’t. Lesson learned: Always be careful when using & with ==.

  17. Very good writeup and interesting article! Thank you for the level of detail. I think I want to be part of those discussions.

    But, I am not sure this is a problem in Java. I checked like this

    $ /tmp $ cat Sample1.java
    class Sample1 {

        public static void main(String[] args) {
            int x = 0, y = 1, z = 0;
            int r = (x & y) == z; // 1
            int s = x & (y == z); // 0
            int t = x & y == z; // ?
        }
    }

    $ /tmp $ javac Sample1.java
    Sample1.java:5: error: incompatible types: boolean cannot be converted to int
    int r = (x & y) == z; // 1
    ^
    Sample1.java:6: error: bad operand types for binary operator ‘&’
    int s = x & (y == z); // 0
    ^
    first type: int
    second type: boolean
    Sample1.java:7: error: bad operand types for binary operator ‘&’
    int t = x & y == z; // ?
    ^
    first type: int
    second type: boolean
    3 errors

    $ /tmp $ javac -version
    javac 11.0.9

    $ update-alternatives --list javac
    /usr/lib/jvm/java-11-amazon-corretto/bin/javac

    • Thanks for the note; I think you might have missed my point though. The problem that Java fixes is to make Boolean a distinct type from int. If you replace int with boolean and 0 with false and 1 with true, does that work in Java?

  18. I had a previous comment, it must have been lost while trying to login.
    IMO the biggest error is continuing to store source code as plain text instead of a structured format that stores the concrete syntax tree together with the intended language version. The most pragmatic fix that can be applied immediately is to mandate that all source files start with a #pragma-like directive that states the language version.
