In June of 2013, it came to light that the United States government had been tapping into the records of the world's largest internet and telecommunications companies, on a startling scale, to gather information about people who might represent a threat to national security. One of these programs, codenamed PRISM, included direct access to the servers of companies like Google, Microsoft, Facebook, and Apple. The actual reach of the program is unknown, but, in theory at least, a platform like PRISM has the capability of monitoring the internet activities of anyone in the country. All 300 million of us.
Apparently, PowerPoint design is not a prerequisite for NSA employment.
Speculation aside, one thing we can be sure of is that PRISM isn't perfect. It will fail to find bad guys some of the time, and it will mistakenly identify regular, law-abiding citizens some of the time as well. To a lot of people, that's a scary prospect. How scary? Let's think about it.
Let's be charitable in our analysis and assume that a PRISM-like program, having been designed and implemented by some extremely smart people, is really good at what it does. We don't know how good, naturally, but we can choose some hypothetical numbers for now. For instance, assume that the algorithm for identifying a legitimately dangerous individual is very sensitive, and in fact correctly flags a person who is a threat to national security in 99.99% of cases. That's about as good as Purell is at killing germs...not bad. Let's also say the algorithm is very specific (meaning that it has a low rate of false positives), correctly clearing people who are not dangerous in 99.90% of cases. (Again, these figures are obviously made up, but I'll wager that PRISM errs on the side of catching as many bad guys as possible, so it seems at least reasonable for the sensitivity to be higher than the specificity.)
Also available in lemon scent.
With such a program in place, how concerned should the average, non-criminal citizen be? After all, if you're not doing anything wrong, you don't have anything to worry about, right? Hopefully, but maybe that's not obvious. Imagine that someone gets flagged by PRISM as a threat to national security. It seems that an important question to ask is, What's the probability that this person is actually guilty?
Let's begin by working out how many people are both flagged and dangerous. To do that, it will help to know the probability that some random person is dangerous (whatever that means to the PRISM overlords). Now I have no idea how many people in the country are actively engaged in some kind of terrorist activity, but I can't imagine it's more than 1 in 10,000 (frankly, even that seems crazy high). So that means 0.01% of the population, or about 30,000 people, is bad news. We've already estimated that PRISM flags 99.99% of bad-news humans, so 99.99% of 0.01% of the 300 million people in this country are both flagged and dangerous; that's still about 30,000 people.
But what about the segment of the population that is flagged and not dangerous? Well, 9,999 out of every 10,000 people are just going about their everyday, non-terroristic business, but the false positive rate on PRISM tells us that it will mistakenly flag an innocent person 0.1% of the time. So, 0.1% of 99.99% of the population gets falsely identified, which is about 300,000 people.
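If you'd like to check the arithmetic yourself, here is a minimal Python sketch of the two counts. Every number in it is one of the made-up figures from this post, not a real PRISM statistic, and the function name `flagged_counts` is just something invented for illustration.

```python
# Hypothetical figures from the discussion above; none are real PRISM statistics.
POPULATION = 300_000_000
BASE_RATE = 1 / 10_000   # assumed share of the population that is dangerous
SENSITIVITY = 0.9999     # chance a dangerous person gets flagged
SPECIFICITY = 0.9990     # chance an innocent person does NOT get flagged


def flagged_counts(population, base_rate, sensitivity, specificity):
    """Return (flagged and dangerous, flagged but innocent) head counts."""
    dangerous = population * base_rate
    innocent = population - dangerous
    true_positives = dangerous * sensitivity
    # The false positive rate is whatever specificity leaves over.
    false_positives = innocent * (1 - specificity)
    return true_positives, false_positives


true_pos, false_pos = flagged_counts(POPULATION, BASE_RATE, SENSITIVITY, SPECIFICITY)
print(f"Flagged and dangerous: {true_pos:,.0f}")
print(f"Flagged but innocent:  {false_pos:,.0f}")
```

The exact results land just under the round numbers used here, because 99.99% of 30,000 is a hair less than 30,000 and 0.1% of the innocent population is a hair less than 300,000.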
Think about these numbers for a minute. A PRISM with the accuracy we described, applied to the entire population of the U.S., would flag a total of 330,000 people, only 30,000 of whom should actually have been flagged. Not only would a huge number of people be subject to some pretty intense, high-level scrutiny and violations of fundamental privacy in the near future...most of them would be innocent. In fact, one quick division tells us that somebody flagged by PRISM in this scenario has only about a 9% chance of being guilty. The initially accurate- and impressive-seeming algorithm isn't looking so great anymore.
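That "one quick division" doesn't actually depend on the size of the population, only on the three rates. Here is a rough Python sketch of the same calculation using proportions instead of head counts. The function name `probability_guilty_given_flag` is hypothetical, and the inputs are the same invented figures as before.

```python
def probability_guilty_given_flag(base_rate, sensitivity, specificity):
    """Chance that a flagged person is actually dangerous.

    All three arguments are proportions between 0 and 1.
    """
    # Share of the whole population that is dangerous AND flagged.
    flagged_and_dangerous = base_rate * sensitivity
    # Share of the whole population that is innocent AND flagged anyway.
    flagged_and_innocent = (1 - base_rate) * (1 - specificity)
    # Of everyone flagged, what fraction is dangerous?
    return flagged_and_dangerous / (flagged_and_dangerous + flagged_and_innocent)


# Made-up figures from this post, not real PRISM statistics.
posterior = probability_guilty_given_flag(
    base_rate=1 / 10_000,
    sensitivity=0.9999,
    specificity=0.9990,
)
print(f"Chance a flagged person is guilty: {posterior:.1%}")
```

Try shrinking `base_rate`. If only 1 in 100,000 people were dangerous, the same algorithm would be right about a flagged person only around 1% of the time, which is why the rarity of bad guys matters as much as the accuracy of the test.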
Teachers, you may recognize this discussion as an application of Bayes' Theorem, but without the hugely ugly formula. Looking for a classroom conversation involving conditional probabilities? Check out our lesson.