Thursday, August 9, 2012

What's really wrong with 6÷2(1+2)? (Part (14÷7(5-3)+2)÷3)

If you haven't already, start with Part (14÷7(5-3)+2)÷3, where I introduced the problem. Then, in Part (14÷7(5-3)+2)÷3, I began discussing what's actually going on, and introduced a few other mathematical ideas. Now, in Part (14÷7(5-3)+2)÷3, I will answer the question posed at the end of the last part. When given 12+33+72, which addition do you start with: 12+33, or 33+72?

WHAT?! Addition is associative?!
Well in an incredibly climactic turn of events that none of you were expecting, it turns out that it doesn't matter. When adding three numbers, you can add them in any order you want. But let's take an example which isn't so friendly. Pretend you have a function "avg" that takes the average of two numbers: avg(2, 4) = 3, avg(14, 17) = 15.5, and so on. And just like add(2, 3) can also be written as 2+3, let's say avg(2, 3) can be written as 2#3.

Now what's 1#5#21? Well, if you do 1#5 first, you get 3, and 3#21 = 12. But if you do 5#21 first, you get 13, and 1#13 = 7! So you do it in a different order, and you get a completely different answer. (Note: if you decided to break the "only two at a time" rule and find the average of all three at once, you'd get 9, another completely different answer.)
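If you'd rather check that arithmetic than take my word for it, here is a minimal sketch in Python (the choice of language and the variable names are just assumptions for illustration; only the name "avg" comes from the text above):

```python
def avg(a, b):
    """Average of exactly two numbers: the '#' operation from the text."""
    return (a + b) / 2

left_first = avg(avg(1, 5), 21)    # (1#5)#21
right_first = avg(1, avg(5, 21))   # 1#(5#21)
all_at_once = (1 + 5 + 21) / 3     # breaking the "only two at a time" rule

print(left_first, right_first, all_at_once)  # 12.0 7.0 9.0

# Addition, by contrast, gives the same result under either grouping.
print((12 + 33) + 72 == 12 + (33 + 72))  # True
```

Same three numbers, same operation, three different results for "#", while "+" doesn't care how you group it.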

Now, if instead you did it with functions, you couldn't write something unclear like 1#5#21; you have to choose which average is done first. Your options are avg(avg(1, 5), 21), and avg(1, avg(5, 21)). When you write it like this, there's no ambiguity, because it's really clear what has to be done first. And this reflects the actual mathematical ideas, which include knowing which functions depend on the results of other ones. 1#5#21, on the other hand, doesn't. This is what I meant when I said that the mathematical expressions we're used to don't actually reflect the essence of what they try to describe - function notation succeeds in that far better than regular notation does.

Now, it would technically be possible to rewrite all elementary textbooks in function form. But when you realize that something as simple as 2(5+3-4) would then be written as mult(2, sub(add(5, 3), 4)), it begins to get a little hard on the eyes trying to figure out what's inside of what. And so people use expressions like 2(5+3-4), because they're easier. Simply put, these expressions are a shorthand. They're a way of abbreviating mathematical ideas to make them easier to read and write.
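To see the function form actually do its job, here is a small Python sketch (the function names follow the ones used in this series; writing them in Python is my own assumption, not the only way to do it):

```python
def add(a, b):
    return a + b

def sub(a, b):
    return a - b

def mult(a, b):
    return a * b

# 2(5+3-4) written out in function form; the innermost call runs first.
result = mult(2, sub(add(5, 3), 4))

print(result)                      # 8
print(result == 2 * (5 + 3 - 4))   # True
```

The nesting tells you exactly what depends on what, at the cost of all those brackets and commas.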

I did a Google Image Search for "AH"
and this was one of the results. I don't
really know why. But mm, dumplings
And whenever you abbreviate something, you lose information. For example, a couple weeks ago I had a Facebook conversation and someone mentioned seeing something on "AH." Unfortunately, I didn't know what that acronym stood for at the time. So I looked the acronym up, and found it could have meant anything from Art History to Adaptive Hypermedia to American Health to Apocalyptic Harbingers to Artificial Horizon to... Later he clarified that he was referring to "Alternate History." But since I didn't know the context, the shortened version was incredibly ambiguous. Making a shorter, easier-to-read version ended up taking away certainty.

And that's exactly what happens with things like 6÷2(1+2). It's a shortened version of the actual mathematical idea behind it, and because of that, it's lost some of its information.

Again, sometimes losing information isn't so bad. Like in the case of 2+5+4. In the end, it doesn't matter which numbers you add first, because you'll get the same answer. But in a case like 2×5+4, how do you know which to do first? Is it add(mult(2, 5), 4) = 14, or mult(2, add(5, 4)) = 18?
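One way to picture the lost information is to store each idea as a nested structure and then "abbreviate" it. The sketch below is only an illustration under my own assumptions (nested tuples for the function form, and a hypothetical "shorthand" helper that writes the symbols with no brackets):

```python
import operator

# name -> (infix symbol, function that does the work)
OPS = {"add": ("+", operator.add), "mult": ("×", operator.mul)}

def evaluate(tree):
    """Evaluate a nested (name, left, right) tuple; plain numbers are leaves."""
    if not isinstance(tree, tuple):
        return tree
    name, left, right = tree
    return OPS[name][1](evaluate(left), evaluate(right))

def shorthand(tree):
    """Write the tree with infix symbols and no brackets at all."""
    if not isinstance(tree, tuple):
        return str(tree)
    name, left, right = tree
    return shorthand(left) + OPS[name][0] + shorthand(right)

tree_a = ("add", ("mult", 2, 5), 4)   # add(mult(2, 5), 4)
tree_b = ("mult", 2, ("add", 5, 4))   # mult(2, add(5, 4))

print(shorthand(tree_a), evaluate(tree_a))  # 2×5+4 14
print(shorthand(tree_b), evaluate(tree_b))  # 2×5+4 18
```

Two different ideas, one identical abbreviation. Going from the tree to the string is easy; going back is impossible without some extra rule.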

Two things were done to try to help the shorthand become slightly more accurate: brackets, and order of operations. Brackets are used to state clearly "everything in here needs to be considered together. You can't do anything to just part of this while leaving the rest behind." Technically, if you had enough brackets, the shorthand could be just as clear as the original function form. For example, add(1, add(add(2, 3), 4)) and (1+((2+3)+4)) mean exactly the same thing, order and all. So people could have just demanded that every operation symbol (+, -, ×, ÷) have its own pair of brackets, and there would be no trouble. But as in the case of (1+((2+3)+4)), this can look almost as messy as the function version, when it doesn't need to be; 1+2+3+4 works just as well, even if it's not clear what to do first, because it doesn't matter what you do first. It's a shorter abbreviation that still works just as well, so why worry about all the extra brackets? But if you drop that requirement, then you need to know what to do when two different operations aren't separated by brackets, when the order does matter - like in 2×5+4.

This is where the order of operations comes from. It's a convention for reading a shorthand notation, so we can all agree that it refers to the same idea. People decided that certain functions were "more important" than others, and gave them different priorities. There are actually some good reasons behind the order*, but it turns out that things would work just as well if the order of operations were completely flipped! You'd just need to learn how to read and write in the new system.

*One example of a good reason is the distributive property, which describes how multiplication and addition relate to each other: a×(b+c) = a×b+a×c. If addition had priority, this would have to be written a×b+c = (a×b)+(a×c) - you'd need an extra pair of brackets.
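To make the "flipped order would work just as well" claim concrete, here is a rough Python sketch of a tiny expression reader whose priorities are passed in as a table. Everything in it (the "evaluate" function, the STANDARD and FLIPPED tables, supporting only whole numbers, +, × and brackets) is an assumption made for illustration, not a description of how any real calculator works:

```python
import operator
import re

FUNCS = {"+": operator.add, "×": operator.mul}
STANDARD = {"+": 1, "×": 2}   # multiplication binds tighter (the usual rule)
FLIPPED = {"+": 2, "×": 1}    # hypothetical convention: addition binds tighter

def evaluate(text, priority):
    """Evaluate integers, +, × and brackets under the given priority table."""
    tokens = re.findall(r"\d+|[+×()]", text)
    pos = 0

    def atom():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = expression(0)
            pos += 1  # skip the closing ")"
            return value
        return int(token)

    def expression(min_priority):
        nonlocal pos
        value = atom()
        while (pos < len(tokens) and tokens[pos] in FUNCS
               and priority[tokens[pos]] >= min_priority):
            op = tokens[pos]
            pos += 1
            # the +1 makes operators of equal priority group left to right
            right = expression(priority[op] + 1)
            value = FUNCS[op](value, right)
        return value

    return expression(0)

print(evaluate("2×5+4", STANDARD))  # 14
print(evaluate("2×5+4", FLIPPED))   # 18

# The distributive property can be written in either system:
print(evaluate("2×(5+4)", STANDARD) == evaluate("2×5+2×4", STANDARD))    # True
print(evaluate("2×5+4", FLIPPED) == evaluate("(2×5)+(2×4)", FLIPPED))    # True
```

The same string gives 14 or 18 depending only on which table the reader was handed, and the distributive property survives in both systems; the flipped one just needs its brackets in different places.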

Excuse me waiter, but I believe
I ordered a sled.
This is why it bugs me that people get so uptight about the order of operations. A lot of them seem to think that they're arguing over unchangeable facts. But no, it's way less significant than that. Some people might think that there's a cultural element involved, but that there's a single correct standard accepted today (Even though words can change over time, "I through the ball over their" is just wrong). But no, it's less than that. Some people might think that there are two acceptable versions, but that either one should be seen as, in a way, "as good as possible" ("favourite" is British English, "favorite" is American English - and there's no better way to express the same meaning, so these words are as good as it gets). But I'd still say it's even less than that; I'd say people are arguing over a shorthand notation which doesn't completely describe what's going on. It's like they're arguing over whether "sld" means "sold," "sled," "solid," or "salad." If there's any argument, it's not a problem with English, and it's not a problem with whoever disagrees with your interpretation; it's a problem with whoever came up with the abbreviation. It's up to him to explain why he thought "sld" worked better than writing the whole thing out, or choosing another abbreviation that's more clear.

So what do I say to whoever first wrote 6÷2(1+2)? Explain why you chose that abbreviation. I can understand why you wouldn't write it out fully (as either mult(div(6, 2), add(1, 2)) or div(6, mult(2, add(1, 2)))), but if you're going to use an abbreviation, why didn't you use an abbreviation that causes less argument, like (6÷2)(1+2) or 6÷(2(1+2))? Or better yet, get rid of that division sign altogether and use fraction notation instead - it makes very clear what gets divided by what. You could even use an entirely different system of shorthand, like Polish Notation (where the choices would be × ÷ 6 2 + 1 2 or ÷ 6 × 2 + 1 2) or Reverse Polish Notation (6 2 ÷ 1 2 + × or 6 2 1 2 + × ÷), which in many ways are better than the system we usually use* because no brackets or order of operations are necessary to be perfectly clear about what's going on. So with so many great, unambiguous options, why oh why did you choose 6÷2(1+2)?
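For anyone who hasn't met these notations before, here is a minimal Python sketch of how they can be read with nothing but a stack. The function names "eval_rpn" and "eval_pn" are made up for this example, and the approach shown is just one common way of doing it:

```python
import operator

FUNCS = {"+": operator.add, "-": operator.sub,
         "×": operator.mul, "÷": operator.truediv}

def eval_rpn(text):
    """Reverse Polish Notation: operands first, then the operator."""
    stack = []
    for token in text.split():
        if token in FUNCS:
            right = stack.pop()
            left = stack.pop()
            stack.append(FUNCS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()

def eval_pn(text):
    """Polish Notation: read right to left, so the first pop is the left operand."""
    stack = []
    for token in reversed(text.split()):
        if token in FUNCS:
            left = stack.pop()
            right = stack.pop()
            stack.append(FUNCS[token](left, right))
        else:
            stack.append(float(token))
    return stack.pop()

print(eval_rpn("6 2 ÷ 1 2 + ×"))  # 9.0
print(eval_rpn("6 2 1 2 + × ÷"))  # 1.0
print(eval_pn("× ÷ 6 2 + 1 2"))   # 9.0
print(eval_pn("÷ 6 × 2 + 1 2"))   # 1.0
```

Notice that neither reader contains a priority table or any bracket handling. The position of each symbol already says what gets applied to what.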

*which, in case you're wondering, is called Infix Notation

Of course, the person who came up with that expression probably had a very good reason: to mess with people. And for that, I applaud him, because he has succeeded immensely.

Yes, it is important to come up with a standard convention that we can all agree on, but the reason for this is so that we can spend more time on the math and less time decoding stuff. So when faced with something ambiguous, especially when it doesn't have to be, it's entirely against the point to spend so much time arguing about how to interpret it. Until we see a mathematical expression that is inherently ambiguous - that is, there's no better way to write it - the proper response should be to teach people how to express their mathematical ideas in a non-ambiguous way. It's just like how a good English teacher shouldn't be teaching grammar in order to show students how to write really complex, convoluted sentences that are still "correct," but rather to show them how to write clearly.

Jingle Bells, Batman Spells
In Part (14÷7(5-3)+2)÷3, I asked why so many people who would claim to hate math would get so riled up over a discussion like what the answer to 6÷2(1+2) is. And I guess I answered my own question. The reason they can get so involved in the topic, even though they hate math, is because they're not doing math at all. You want to do math?* Then tell me what you're actually trying to say, and then we can use that to have an epic discussion about commutativity and associativity and inverses and divisibility and factorization and modular arithmetic and all sorts of other cool stuff. But don't expect me to argue about what you're trying to say.

*Most people: PLEASE NO

That about sums it up for now. If you have any additions, questions, things I left out, things you think I got dead-wrong, or correct answers to 6÷2(1+2), please let me know!

Tuesday, August 7, 2012

What's really wrong with 6÷2(1+2)? (Part (14÷7(5-3)+2)÷3)

In Part (14÷7(5-3)+2)÷3 (read it first if you haven't already), I introduced a problem: what does 6÷2(1+2) equal? The post ended with the rather strange, and seemingly exaggerated, assertion that "6÷2(1+2)" is not actually math at all. Here, in part (14÷7(5-3)+2)÷3, the mystery will be resolved!

This word is edible
Now, saying that "6÷2(1+2)" isn't actually math might come as a bit of a shock... until you realize that the word "pizza" isn't (usually) edible. That is, just as "pizza" is used to describe a food, even though the word itself isn't actually a food, "6÷2(1+2)" is just a way of communicating a mathematical idea - the symbols themselves aren't math. What we have here is a written language, a set of symbols that by themselves mean very little, but offer meaning based on how they're arranged, and on what we as a society have decided the arrangements should represent. In both English and this math language, you can make nonsense like "uesohfiwjaoiesjd" and "++4×-÷2=-12÷..×.4++3-", short pieces of information like "butter" or "23", longer expressions like "a stack of pancakes" or "94+2", or sentences that actually make a claim, like "Juice is dry" or "17-4=2." In English, you might have different conventions ("color" or "colour") or ambiguous situations ("I caught a butterfly with a net." "Wow, I've never seen a butterfly carrying a net before!") or statements that sound strange but are actually ok ("The horse raced around the barn fell"), so it's not surprising to see things like that happen in math, right? So is the problem just that we're using a written language, which will never be as precise as we want?

Nope. It is actually possible to produce a written language that's consistent and unambiguous; the problem is deeper than that. Here's the difference between "pizza" and "6÷2(1+2)" (besides the fact that one has tomato sauce, and the other is pizza). The word "pizza," in a sense, completely captures the nature of what it describes. Of course, I can add other description words if I want to narrow down my type of pizza, and I could give synonyms or definitions, or translate it into another language, if I wanted to say the same thing without actually using the word "pizza." But in the end, if you want to describe that flattened dough with toppings, you can't really get any closer than "pizza" (or the same word in some other language), because the word "pizza" was defined to mean exactly that. Because of its definition, it manages to communicate the essence of what it tries to describe. But I claim that the mathematical expressions we all know and love don't.

If this picture makes sense to you,
you may skip the next couple paragraphs.
Before I can get to that though, I need to talk about another mathematical idea. Though students often don't learn about this until late middle school or even high school, it's actually a concept far more basic than things like adding and subtracting: the function. Simply put, a function is any rule that gives an output based on an input. So there's
  • a function that takes any number as input, and gives back its triple as output; 
  • a function that takes any person as input, and gives the number of Facebook friends that person has as output; 
  • a function that takes a word as input and gives its definition as output; 
  • a function that takes a colour as input and gives its complementary colour as output; 
  • a function that takes a word or phrase as input and gives the first Google search result of that word or phrase as output... 
as you can see, functions are really basic, really fundamental. Any time you want to connect two pieces of information, you can think of it as a function. The usual way to write down a function is to give its name, then the input in brackets afterwards. So if we call the Facebook friend function "FBfriends," then FBfriends(Jonathan Love) = 1176. If we call the tripling function "f," then f(4) = 12. Functions can also take in more than one piece of information; so for example, you can describe

  • a function that takes two people and gives the number of mutual friends on Facebook;
  • a function that takes two cities and gives the distance between them;
  • a function that takes a person and a year, and gives the amount of time between that year and the year the person was born;
  • a function that takes three mountains and gives the height of the tallest one;
  • a function that takes twenty countries and gives the average population density of all of them...
As an example of one of these, if we call the function that measures distance between cities "Heretothere," then Heretothere(Moscow, New Delhi) = 4,348 km. Another thing you can do with functions is put the result of one function inside another. For example, I could take the distance from Moscow to New Delhi (Heretothere(Moscow, New Delhi) = 4,348 km) and triple it (f(4,348 km) = 13,044 km). This can also be written as f(Heretothere(Moscow, New Delhi)) = 13,044 km.
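If it helps to see one function sitting inside another, here is a small Python sketch. The lookup table is an assumption for the sake of the example and holds only the one distance quoted above; the names "heretothere" and "f" mirror the ones in the text:

```python
# Only the figure quoted in the text; a real table would need many more entries.
DISTANCES_KM = {frozenset({"Moscow", "New Delhi"}): 4348}

def heretothere(city_a, city_b):
    """Distance between two cities, in km (order of the cities doesn't matter)."""
    return DISTANCES_KM[frozenset({city_a, city_b})]

def f(x):
    """The tripling function."""
    return 3 * x

distance = heretothere("Moscow", "New Delhi")
print(distance)                                  # 4348
print(f(distance))                               # 13044
print(f(heretothere("Moscow", "New Delhi")))     # 13044
```

The last line is the nested version: the inner function has to finish before the outer one has anything to work with.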

It turns out that all the operations you remember from elementary school are actually functions. Given two numbers, you can add them (we'll call this function "add"), subtract them ("sub"), multiply ("mult"), or divide ("div")*. So for example, add(2, 3) = 5, div(24, 3) = 8, sub(19, 2) = 17. And there are others like powers and roots but I won't get into those for now.

*For various reasons, it actually makes more sense a lot of the time to use a different set of functions. Keep the adding and multiplying, but then have two functions that only take one input: an additive inverse function "ainv" and a multiplicative inverse function "minv," so that ainv(x) = -x, and minv(x) = 1/x. But that's not necessary to my point, so I'll stick with what most people are used to for now.
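As a rough illustration of that footnote, here is how "sub" and "div" could be rebuilt from "add", "mult" and the two inverse functions. This is only a sketch in Python; using exact fractions for "minv" is my own choice, made to avoid rounding:

```python
from fractions import Fraction

def add(a, b):
    return a + b

def mult(a, b):
    return a * b

def ainv(x):
    """Additive inverse: ainv(x) = -x."""
    return -x

def minv(x):
    """Multiplicative inverse: minv(x) = 1/x, kept exact with Fraction."""
    return 1 / Fraction(x)

def sub(a, b):
    return add(a, ainv(b))     # a - b is a + (-b)

def div(a, b):
    return mult(a, minv(b))    # a ÷ b is a × (1/b)

print(add(2, 3))    # 5
print(sub(19, 2))   # 17
print(div(24, 3))   # 8
```

With this set, subtraction and division stop being separate operations and become addition and multiplication in disguise.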

Now, so far, all we have is a different way of writing the same thing: add(2, 3) and 2+3 have the exact same meaning, as do sub(2, 5) and 2-5, or mult(6, 4) and 6×4. And it seems like all I've done is make things longer and more complicated. But let's see what happens if we go a bit further...

Sorry man, can't squeeze in another input.
How would you write 12+33+72 using functions? The natural reaction might be to just write it as add(12, 33, 72). But wait! The function "add" only takes two inputs, you can't just shove another one in there. If that sounds like just a stupid rule, how would you explain sub(10, 4, 3)? What gets subtracted from what? Or even more confusing, what would Heretothere(Moscow, New Delhi, Paris) be? How would you try to explain the "distance between three cities"? There are ways you could make it work - the shortest path connecting all three, or adding up each of the individual distances, for example - but then you're making a new function. The fact is, if you have a function that takes two inputs, it must take two inputs.
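Programming languages tend to be just as strict about this. As a small sketch (Python here is an assumption, and the exact wording of the error message depends on the version), a two-input "add" simply refuses a third input:

```python
def add(a, b):
    """Add exactly two numbers."""
    return a + b

try:
    add(12, 33, 72)            # three inputs to a two-input function
    accepted = True
except TypeError as error:
    accepted = False
    print("Rejected:", error)

print(accepted)  # False
```

To handle three numbers you either write a new function, or you call the two-input one twice.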

If you look again at 12+33+72, there are two + signs, so you're actually adding twice. Add two of the numbers, and then add the result to the third number. But you'll notice that there are two ways to do that - you can add 12 and 33 first to get 45, and add 72 to that, OR add 33 and 72 to get 105, and add 12 to that. Or, writing out the functions, you can have either add(add(12, 33), 72), or add(12, add(33, 72)), but you have to choose one or the other. So which is it? Find the answer in Part (14÷7(5-3)+2)÷3.

What's really wrong with 6÷2(1+2)? (Part (14÷7(5-3)+2)÷3)

So there's this one math question that has been circulating the internet for a looooong time: What is 6÷2(1+2)? This question (as well as related ones, like 48÷2(9+3)) always raises a huge amount of debate, with literally tens of thousands of people giving their idea - often rather heatedly - of what the answer should be: is it 1 or 9?

Casio adamantly claims that the answer is 9.
On the other hand, Casio believes that the answer is 1.
Who do you believe?
The debate ends up revolving around the order of operations - when given a bunch of mathematical operations to do, what is done first? Many schools teach mnemonics like "Please Excuse My Dear Aunt Sally" or "BEDMAS" as a rule to follow: Parentheses (or brackets), Exponents, Multiplication/Division, Addition/Subtraction. Those stages must be done in that order. Why? Oh, it's just the rule, everyone knows that. So let's give it a shot with 6÷2(1+2). Obviously brackets come first, so let's do 1+2 to get 3. Now we're at 6÷2×3 (a number written beside the brackets means you multiply it by whatever's inside). No exponents to worry about, but then we run into problems: what happens in the "Multiplication/Division" stage?

If you use "BEDMAS," "D" comes before "M," so you might think to do division first: 6÷2 is 3, so 6÷2×3 = 3×3 = 9. But if you use "Please Excuse..." (also known as "PEMDAS"), "M" comes before "D." So you should do 2×3 first to get 6, so 6÷2×3 = 6÷6 = 1. But WAIT, multiplication and division actually have the same priority, because one is just the inverse of the other - they're the same kind of operation. So in this case we just go left to right, and the division IS first, and so we get 9. Not so fast. While multiplication and division may have the same priority usually, here we have implied multiplication: the problem isn't 6÷2×(1+2), it's 6÷2(1+2). So the 2, right in front of the brackets, is directly tied to the brackets and must be evaluated first. You're right, it is implied multiplication: but the thing in front of the brackets isn't 2, it's 6÷2. Now hold on a second...
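The two camps can be written out so that nothing is left to interpretation. Python has no implied multiplication, so in the sketch below each reading has to be spelled out with explicit brackets (the variable names are mine, chosen only for this example):

```python
# Reading 1: "6÷2" is the thing in front of the brackets.
left_to_right = (6 / 2) * (1 + 2)

# Reading 2: "2(1+2)" is a single unit that 6 gets divided by.
implied_first = 6 / (2 * (1 + 2))

print(left_to_right)  # 9.0
print(implied_first)  # 1.0

# The related question 48÷2(9+3) splits the same way.
print((48 / 2) * (9 + 3))   # 288.0
print(48 / (2 * (9 + 3)))   # 2.0
```

Once the brackets are there, nobody argues about either line; the argument only exists while they are missing.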
Of course, Calvin was a type (c).

On the one hand, I find it hilarious that people get so riled up about this, since pretty much everyone's view of math is either (a) math is a useless subject they force you to learn in school, (b) math is good because it's applicable to science, technology, finance, business, etc., or (c) math has some intrinsic beauty that's worth studying for the same reason as music or literature (guess which camp I'm in?). But people in (c) should realize that the question is way deeper than just "what rule do you follow," people in (b) should be claiming that the person who wrote the formula should be fired for causing a disruption in the work flow, and people in (a)... why do you care at all?!?

But I also get very scared by the fact that this debate rages on so fiercely. Because in all my scourings of the internet, of the hundreds upon hundreds of comments I have read through, I have seen the "order of operations" invoked in almost all of them... but not once have I seen anyone ask why.

It's true, every now and then I come across a person who doesn't take a side between 1 and 9; they either say that there are multiple rules and different people have been taught differently, or (slightly more to the point) the expression is so sloppy, such bad form, that it's just unanswerable; that it's like asking "How manu apokeis is d Reiwhfds?" and expecting a correct response despite all the spelling errors; and that no one uses a division symbol any more because of issues like this unless they have to, preferring instead fraction notation with one number on top of the other. People like this restore a bit of my hope for humanity's future. And yet, they still shove the main problem under the rug: why is an order of operations even necessary?

The image to the left in particular is striking. A college senior (who seems to me to be the most intelligent in this conversation) claims that either option is valid. The person labelled "me" in the diagram, however, laughs at this, saying "that. is. not. how. math. works." And here's the thing: he raises a very good point. Math is supposed to be consistent; start at the same place and you should be able to end at the same place. You shouldn't be able to get two different answers depending on how you feel or on your teachers' opinions. So how on earth can we have so many different conventions for the order of operations?

Here's what's going on. "College senior" is correct: there are two answers. "Me" is also correct: math does not work that way. The conclusion? What's really wrong with 6÷2(1+2) is that it's not math.

Want to know what I mean by that? Check out Part (14÷7(5-3)+2)÷3.