Lately there's been a lot of buzz in mathy circles about PINs, ever since tech consultant Nick Berry was kind enough to point out, via some awesome and aesthetically pleasing analysis of a few million compromised numbers, just how terribly unimaginative we are at choosing the security codes that unlock, e.g., our checking accounts and alarm systems. (Spoiler Alert: If the digits 1-2-3-4 remind you of your most recent ATM visit, you're in good company.) Of course, not everything we lock up numerically is so dramatic. Most cell phones these days give you the option of requiring a PIN to get them to do anything besides emit a soft glow---which, let's face it, is probably a great idea for the vast majority of the population.
But if you, like me, ever found yourself wishing for a more stylish way to protect your shameful mobile behavior, without all that cumbersome finger-lifting, the Android platform offers a cool alternative to the traditional PIN code: you can secure your phone by tracing a path through a 3 x 3 grid of dots on the touchscreen. Not only is this a combinatorially interesting feature---our very own Matt Lane has done some expositing with respect to the number of possible patterns---it raises some of the same questions addressed by the PIN analysis. Setting aside, for a moment, that there are over three-quarters of a million potential paths, which ones are people more likely to actually choose? Are there patterns in the patterns? Is there an Android analog of 1-2-3-4?
*Editorial Disclaimer: I am not Nick Berry. I do not have immediate access to several databases worth of, well, data. As a result of this non-Berriness, I cannot conjure the whiz-bang graphical representations that are possible when N = 3.4 million. Here's what I do have: a passing acquaintance with Python, an even more tenuous and informal relationship with Mathematica, and a handful of friends and/or social network nodes willing to punch their Droid patterns into Survey Monkey. This concludes the methodology section.*
For the uninitiated, I'll quickly recap the rules for creating a legal Android pattern.
- The pattern must contain at least four dots in the grid.
- A dot cannot be repeated.
- The order in which you visit dots (the direction of the path) matters.
- You can't skip over a dot en route to another one unless the passed-over dot has already been used.
In terms of raw potential, this scheme blows the four-digit PIN (a la the iPhone) clean out of the water by a factor of 75, but that doesn't mean it's actually more secure in practice. After all, even with 10,000 possible four-digit codes, it only takes 20 guesses to crack the passwords of a whopping 27% of the population. In other words, choosing a PIN at random---which is to say uniformly---is not something people are particularly inclined to do. (There are interesting mathematical and psychological reasons why we're not, but for now it's enough to note that our choices are wildly skewed.) Would we observe the same brand of wonkiness in Droid patterns?
Well, let's take a peek. Here are some real-life examples from my most cooperative and least security-conscious pals (and me):
And here are some that I generated randomly with a more or less cooperative Python program:
What do you notice? Besides the fact that they're really small and it's hard to see the arrows? If you're like me, you get the impression that each set has a distinctive flavor.
For one thing, the random paths certainly seem to be longer. The average number of edges in the user-generated paths is just over three. And, remember, three is the absolute smallest number of edges required in a legal pattern of four dots (some graph theory for you: a path on n nodes has n-1 edges). In contrast, the random paths average a little more than 5 edges. In other words, the humans involved were more likely than the computer to do the bare minimum work necessary to open their phones. I suppose that's not particularly surprising---I mean, we have urgent txt msgs 2 snd---but it certainly doesn't bode well for security. After all, if we only choose paths with three edges, then our pattern locks offer no more options than a four-digit PIN. In fact, they offer fewer, since PINs don't have restrictions regarding repetition or which numbers can/cannot follow others. No bueno.
We humans seem to be making an effort to save time in another, less obvious way. Well, it's visually obvious, but maybe not mathematically. Only three of the human patterns contain intersecting edges (it's hard to tell, but the one that looks like a 'T' does, in fact, cross itself). On the other hand, 26 of the 35 computer patterns contain at least one intersection. Here's why that means we're saving time over the computer (some more graph theory): if you pick a set of nodes to visit, the shortest path through all those nodes cannot intersect itself. (If you're feeling nerdy today, check out the Traveling Salesman Problem.) By almost exclusively choosing patterns that don't contain intersections, the humans are making a pretty decent effort at connecting their nodes as efficiently as possible. Of course, just because your path doesn't cross itself, that doesn't guarantee you've found the shortest path...it just means you definitely haven't found the longest. But that's not too shabby from an efficiency standpoint. The Traveling Salesman Problem is actually really (NP, ha!) hard for computers, and you're probably subconsciously fairly good at it for a reasonably small number of points. Gold star.
At first blush, there's one more way it appears that efficient phone-unlocking is built into our DNA: we really, really seem to have an aversion to edges with slope ±2 and ±1/2. Not one single person snuck, say, from node 2 to node 9.
I personally blame it on fat fingers. It takes way more energy and concentration to make that tight little move than it does to hop safely from dot to dot without bypassing any. There's not much margin for error, and the red dots that show up when you make a mistake and nick the wrong point are borderline infuriating...especially on like the third try while you're trying to drive and drink coffee and get Carly Rae Jepsen off the freaking radio before your brain explodes (*Editorial Disclaimer #2: Seriously, please do not do any of those things while driving.*). Basically, it takes longer, which---it's becoming increasingly clear---we truly dislike in the cell phone security department.
So apparently convenience is the name of the game for choosing Android patterns, which is exactly what we see in PINs. 1-2-3-4 is both easy to remember and easy to punch into a keypad. Ditto 2-5-8-0 (straight down the middle column). So are repeated digits or couplets, like 1-2-1-2. Which is why those PINs show up disproportionately in Berry's research. And how does this quest for efficiency manifest itself in the Droid lock? Notice that, 750,000+ possibilities notwithstanding, the human patterns look strikingly similar, even in this woefully tiny sample. This pattern even shows up twice:
If people chose patterns with equal probability, the chances that two identical patterns (with seven edges, no less!) would show up in a sample of 35 is extremely tiny. To make matters worse, its mirror image shows up as well:
If we ignore the numbering of the points and just focus on the shape of the graphs, they are identical. Graph theorists have a word for this: they'd say the two paths are isomorphic. (My apologies to the rigor-hounds out there. I throw myself on the mercy of the court.) When we look at the human-generated patterns, we can lump many of them into just a few isomorphic groupings called equivalence classes.
I will wager you that the equivalence classes above: The Pac Man, The L, The 7, The Tetris Z, The Lower Case C, The Poorly Stapled Staple, and The Question Mark are disproportionately represented in the global Android pattern distribution. Anyone who can demonstrate this phenomenon will win him- or herself a beautiful Mathalicious t-shirt...and ten minutes alone with my Galaxy S3.