DUPED? Part 2: The Data

There is far more to the NSA story than is being spun by pundits and pols.  So far, knee-jerk outrage is dominating the conversation, and the reputation of an agency that has provided security and safety to the nation for over 60 years is being damaged without warrant.

Part 1: Some Truths About The NSA Story


In the fall of 2001, my alma mater, the University of South Carolina, was having an uncharacteristically good football season.  The Gamecocks started out 5-0, 4-0 in SEC play, with a road win against conference rival Georgia and a dramatic come-from-behind win over Alabama.

I was trolling the college football message boards to scout the upcoming competition and generally see what everyone was saying about my team when I came across something very curious.  There, deep in a southern college football forum, was an exchange in Arabic.  There was little text because most of the exchanges were images; images of landscapes.  There were beach scenes, meadows, mountains and forests.  On a southern college football bulletin board.  Although I had no clue who the communicants were or where they were located, I did have an idea about the messages.

From Wikipedia:

Steganography includes the concealment of information within
computer files. In digital steganography, electronic communications
may include steganographic coding inside of a transport layer,
such as a document file, image file, program or protocol.

Simply put, messages are embedded in the code of digital photographs and other media.
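
For the curious, here is a toy sketch of one common approach, least-significant-bit embedding, written in Python. It is only an illustration and not a claim about what anyone on that message board actually used. The function names `embed` and `extract`, the two-byte length header and the stand-in "pixel" bytes are all my own assumptions; a real tool would work on decoded image data and would usually encrypt the message first.

```python
def embed(pixels: bytes, message: bytes) -> bytes:
    """Hide `message` in the lowest bit of successive pixel bytes."""
    # Assumption: a 2-byte length header tells the reader where the message ends.
    payload = len(message).to_bytes(2, "big") + message
    # Break the payload into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("image too small for this message")
    out = bytearray(pixels)
    for i, bit in enumerate(bits):
        out[i] = (out[i] & 0xFE) | bit  # change only the lowest bit
    return bytes(out)


def extract(pixels: bytes) -> bytes:
    """Recover a message written by `embed`."""
    def read_bytes(start_bit: int, count: int) -> bytes:
        result = bytearray()
        for b in range(count):
            value = 0
            for k in range(8):
                value = (value << 1) | (pixels[start_bit + b * 8 + k] & 1)
            result.append(value)
        return bytes(result)

    length = int.from_bytes(read_bytes(0, 2), "big")
    return read_bytes(16, length)


cover = bytes(range(256)) * 2          # stand-in for raw pixel data
stego = embed(cover, b"hello")
print(extract(stego))
# No byte moves by more than one brightness step, so the picture looks the same.
print(max(abs(a - b) for a, b in zip(cover, stego)))
```

The point of the sketch is that the altered picture is the same size as the original and differs from it by at most one step per byte, which is why a beach scene can carry a note without looking any different to the eye.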

This is important because it is necessary for people to understand the clever, if nefarious, ways information is being transmitted in the 21st Century.  Of course, there are scores of other ways, but this illustration is among the less complicated.

Before the internet and cell phones, bad guy communications were easy to find, though not necessarily easy to understand.  Governments – and particularly their military forces – communicated on specific radio frequencies.  Transmissions were encrypted or otherwise coded to avoid scrutiny by adversaries.  This was a cumbersome means of information sharing, but it was what the technologies of the day permitted.  Once global communications went digital, however, all that changed.

Today, bad guys – be they governments, terrorist organizations, organized crime or in-laws – use the same networks and systems as churches, schools, businesses, industries and private citizens.  It’s cheaper, faster, global and more efficient than using dedicated circuits or radio waves, and with sophisticated software encryption, the world-wide information grid is just as safe, if not more so.

There are some very cumbersome details involved in the collection of intelligence, none more so than in the gathering of that related to communications.  Although most will choose not to believe it, the Constitution is sacred to the people who run and work in the National Security Agency.  The Fourth Amendment is assiduously followed and preached at all levels of the agency.  That alone makes the collection of communications intelligence difficult.

Not only are American human beings considered citizens with Constitutional rights, but so are American-owned companies.  Unless and until it can be proven to the court that an American “citizen” is an agent of a foreign power, with reasonable belief that it is potentially harmful to the security of the United States, that citizen, be it human or corporation, is protected from scrutiny.

Here’s where things get very sticky.

A huge volume of global internet traffic actually travels through the United States, even if the communicants are not physically in the country.  Servers, cables, satellites and other telecommunications equipment and transmission media are owned by American companies.  Even though those resources may actually reside in another country, ownership gives them citizen status.


Global Internet Protocol Map

Further, most email and database servers are also American, again, either resident in the United States or owned and operated by American companies.  Once again, those servers are protected by the Fourth Amendment.

This means that Terrorist A in Afghanistan talking to Terrorist B in Indonesia might conduct their conversation over American-owned communications.

The example of the Arabic exchange with images could well be an example of foreign bad guys using American assets to communicate. 

To better understand the gigantic sea of digital data in which bad guys of all types can hide, it’s necessary to understand the size of the currents across which they communicate.  The following statistics are for 2012 and come from Pingdom, a Swedish company that monitors sites and servers on the Internet.

Internet users

  • 2.4 billion – Number of Internet users worldwide.
  • 274 million – Number of Internet users in North America.

Web pages, websites, and web hosting

  • 634 million – Number of websites (December).
  • 51 million – Number of websites added during the year.
  • 43% – Share of the top 1 million websites that are hosted in the U.S.

Social media

  • 1 billion – Number of monthly active users on Facebook, passed in October.
  • 200 million – Monthly active users on Twitter, passed in December.
  • 175 million – Average number of tweets sent every day throughout 2012.

Mobile

  • 1.3 billion – Number of smartphones in use worldwide.
  • 6.7 billion – Number of mobile subscriptions.
  • 5 billion – Number of mobile phone users.

Images

  • 300 million – Number of new photos added every day to Facebook.
  • 5 billion – The total number of photos uploaded to Instagram since its start, reached in September 2012.

Video

  • 14 million – Number of Vimeo users.
  • 200 petabytes – Amount of video played on Vimeo during 2012.
  • 150,648,303 – Number of unique visitors for video to Google Sites, the number one video property (September).
  • 4 billion – Number of hours of video we watched on YouTube per month.

The volume of telephone and internet traffic is mind-boggling, and it grows every year.  THIS is where the bad guys live.  They communicate, educate, recruit, surveil and steal on the same networks that everyone uses.  To further emphasize the point, look at the photo below and imagine that there is a nuclear bomb in one of the vehicles.  This isn’t a fanciful idea since we know such weapons can be made small enough to fit in a car.

Which vehicle has the bomb?  Do you profile – a strict no-no in today’s America?  Do you stop every single car and truck on the highway?  You could use some search criteria such as vehicle type and maybe even tag number if you have that information.  How do you know to look on this particular highway at this particular time?  Do you scan every vehicle with radiological “sniffers” as they go by?  What gives you the right to scan the vehicles of private citizens?

Now, suppose your information indicates that the bomb will be detonated in the next hour.

What do you do?

The volume of internet and cell phone traffic is exponentially larger than is depicted in the photograph, but the point is the same. 

Of an estimated 5,981,000,000 mobile phone numbers globally, 323,000,000 are in the United States (103% of population). If just one-third (108,000,000) of those phones make ONE 60-second call in a 24-hour period, that would equate to 108 million minutes or 1,800,000 hours or 75,000 days or 205 years of conversation PER DAY every day. Using only 1/3 of U.S. cell phones, between January 1st and December 31st, about 75 THOUSAND YEARS of telephone conversation are generated. And these are conservative numbers.

In reality, it’s estimated that there were over 900 billion cell phone calls made just in the United States last year and not all of these are real conversations. What about butt dialing, late night drunk calls, harassing (ring and hang up), dropped calls, voice mail retrieval and just checking in? (If each of those 900 billion calls averaged one minute: 1,712,329 years of conversation)
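
For anyone who wants to check the arithmetic in the last two paragraphs, here is a short Python sketch. The only inputs are the figures quoted above, one minute per call and a 365-day year; the function name `minutes_to_years` is simply my own label.

```python
MINUTES_PER_HOUR = 60
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365


def minutes_to_years(minutes: float) -> float:
    """Convert minutes of talk time into years of continuous conversation."""
    return minutes / MINUTES_PER_HOUR / HOURS_PER_DAY / DAYS_PER_YEAR


# One-third of U.S. phones, each making ONE 60-second call per day.
callers = 108_000_000
years_per_day = minutes_to_years(callers * 1)
years_per_year = years_per_day * DAYS_PER_YEAR

# 900 billion calls in a year, assumed to average one minute each.
all_calls_years = minutes_to_years(900_000_000_000)

print(f"{years_per_day:,.0f} years of talk per day")
print(f"{years_per_year:,.0f} years of talk per calendar year")
print(f"{all_calls_years:,.0f} years of talk from 900 billion calls")
```

Run it and you get roughly 205 years of talk per day, 75,000 years per calendar year, and a little over 1.7 million years for the 900 billion calls.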

Consider, too, what else people are doing on these phones in addition to making voice calls.  They’re texting (trillions!), conducting banking, searching the web, making point-of-service payments (I use mine at Starbucks), navigating via GPS, playing online games, watching movies and TV, checking sports scores, sending and receiving video, Tweeting and Facebooking.

What the court order mandated was that Verizon provide NSA with metadata of phone calls.  That is, the technical parameters of the communications – billing records.  What was specifically prohibited was the content of the calls. So, what’s the difference?

The metadata is, as Wikipedia puts it, data about data.  Here is how the court order reads:

IT IS HEREBY ORDERED that, the Custodian of Records shall produce to the National Security Agency (NSA) upon service of this Order, and continue production on an ongoing daily basis thereafter for the duration of this Order….an electronic copy of the following tangible things: all call detail records or “telephony metadata” created by Verizon for communications (i) between the United States and abroad; or (ii) wholly within the United States, including local telephone calls.

Telephony metadata includes comprehensive communications routing information, including but not limited to session identifying information (e.g., originating and terminating telephone number… and time and duration of call.

Telephony metadata does not include the substantive content of any communication, as defined by 18 U.S.C. § 2510(8), or the name, address, or financial information of a subscriber or customer.

The content, that which was not allowed and, apparently, not sought by NSA, is the actual conversation – what was being said.  No voices were recorded.  Look at the numbers above.  Why would they want the conversations of “every American?”

What NSA wants is the technical parameters of the global cell phone environment.  When the agency gets information about a bad guy’s communications, it needs to be able to search through the environment and find out what numbers are associated with him.  It’s like looking for friends of friends of someone on Facebook, or following a Follower of a Follower on Twitter without looking at the posts or tweets.
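
To make the friends-of-friends idea concrete, here is a minimal Python sketch. It assumes a made-up `CallRecord` holding only the fields the order describes (two numbers, a time and a duration) and walks outward two "hops" from a starting number. The record layout, the function name `contacts_within` and the sample data are my own inventions for illustration; this is not a description of any actual agency system.

```python
from collections import defaultdict, deque
from typing import NamedTuple


class CallRecord(NamedTuple):
    caller: str
    callee: str
    start: str     # timestamp of the call
    seconds: int   # duration only; no names, no audio


def contacts_within(records, seed, hops=2):
    """Return {number: hop_count} for numbers reachable from `seed`."""
    graph = defaultdict(set)
    for r in records:
        graph[r.caller].add(r.callee)   # a call links both numbers,
        graph[r.callee].add(r.caller)   # whoever dialed
    seen = {seed: 0}
    queue = deque([seed])
    while queue:
        number = queue.popleft()
        if seen[number] == hops:
            continue                    # do not look past the hop limit
        for neighbor in graph[number]:
            if neighbor not in seen:
                seen[neighbor] = seen[number] + 1
                queue.append(neighbor)
    del seen[seed]
    return seen


records = [
    CallRecord("A", "B", "2013-06-01T10:00", 60),
    CallRecord("B", "C", "2013-06-01T11:00", 45),
    CallRecord("C", "D", "2013-06-02T09:30", 30),
    CallRecord("E", "F", "2013-06-02T12:00", 90),
]
print(contacts_within(records, "A"))
```

Starting from "A", the search turns up "B" at one hop and "C" at two, and never touches "D", "E" or "F". Nothing in the records says what anyone talked about.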

There is one more point for U.S. citizens to consider.  If NSA is as nefarious as some are making it out to be, why did it even bother to go to a FISA court for permission?


And then there is email.  Best estimates are that over 294 billion emails are sent per day; that works out to about 3.4 million every second.

And here’s the kicker: 80-90% of those were spam and viruses.

There are nearly 2 billion legitimate email users and 3 billion email accounts worldwide, more than one in every five persons on the earth.  Of these accounts, 730 million are business email accounts.
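
Here is the same kind of back-of-the-envelope check for email, sketched in Python. It takes the 294 billion per day figure and the 80-90% spam share at face value; the variable names are mine.

```python
SECONDS_PER_DAY = 24 * 60 * 60

emails_per_day = 294_000_000_000
emails_per_second = emails_per_day / SECONDS_PER_DAY

# If 80-90% is spam and viruses, only 10-20% is real mail.
legit_low = emails_per_day * (100 - 90) // 100
legit_high = emails_per_day * (100 - 80) // 100

print(f"{emails_per_second:,.0f} emails per second")
print(f"{legit_low:,} to {legit_high:,} legitimate emails per day")
```

At 294 billion a day the rate comes to roughly 3.4 million emails per second, and even after throwing out the junk, somewhere between 29 and 59 billion real messages remain every day.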

As afraid of NSA as people seem to be, consider the following from Gigaom.com:

  • Apple: Apple is operating a multiple-petabyte Teradata system. Apple uses the data warehouse to get a better understanding of its customers across product groups. Now every piece of identifiable information — and those iTunes interactions generate a lot of data — goes into the system so the company knows who’s who and what they’re up to.
  • Wal-Mart: The retail giant deployed Teradata’s first-ever terabyte-scale database in 1992, and it has grown, uh, a bit since then. Its operational system was at 2.5 petabytes as of 2008, and is certainly leaps and bounds bigger by now — likely well into the double digits when you consider it operates separate ones for Wal-Mart and Sam’s Club as well as a backup system. The analytics efforts have essentially helped Wal-Mart become a massive consignment shop. It tells suppliers, “You have three feet of shelf space. Optimize it.” And then it gives them any data they could possibly need to determine what’s selling, how fast and even whether they should redesign their packaging to fit more on the shelves.
  • eBay: eBay has two systems in place, and they’re both big. Its primary data warehouse is 9.2 petabytes; its “singularity system” that stores web clicks and other “big” data is more than 40 petabytes. It has a single table that’s 1 trillion rows.

For comparison, the capacity of the NSA Utah Data Center is reported to be about 4,900 petabytes. Considering the global volume of data outlined above and the world-wide scope of its charter, NSA is pretty much on a par with eBay and Wal-Mart.

The point to all this arithmetic is that the mass of information through which the National Security Agency must sift in pursuit of real bad guys is so astronomical that collecting all email and listening to all phone calls is more than just illegal, it’s not possible, efficient, smart or worthwhile. Even a fraction of it poses a huge burden of search and discovery.


NSA combs through telephone billing records and internet traffic looking for bad guys. What happens when they find one?

The intelligence community (law enforcement, State Department, Defense Department, Customs, etc.) maintains and shares databases on foreign bad guys. Its members collect whatever information they can on them and cross-reference that data with their own specialized intelligence. When that information includes “signals” such as telephone or fax numbers, email addresses and so on, NSA goes to work trying to find related communications in the vast digital universe that covers the globe.

If “A,” a known bad guy in Pakistan, is identified and, with him, associated telephone numbers, NSA will search the spectrum for communications on those numbers, whether the calls originated with them or were made to them. Understandably, any number called by “A” or that calls him will become of interest. What then, if “A” gets a call from “B” in the United States? Should that number be dismissed because it presumably belongs to a U.S. citizen, or should it be considered a number of interest? If this is a puzzle to you, you are a fool.

NSA cannot and does not listen to the conversation between “A” and “B” but rather gives “B’s” number and his potential association to “A” to the FBI. By having access to telephone metadata, NSA can scan through it to find out if “B” contacted any foreign persons of interest. The FBI will check out – with a warrant – internal U.S. associations.
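
A rough Python sketch of that lookup, working on nothing but pairs of numbers, might look like this. Everything in it is hypothetical: the phone numbers are made up, the "+1-" prefix test is a crude stand-in for "is this a U.S. number", and the helper names `contacts_of` and `is_domestic` are my own. It shows the idea, not how any agency actually does it.

```python
# Hypothetical call metadata: (caller, callee) pairs only, no content.
records = [
    ("+92-000-0001", "+1-000-0002"),    # "A" calls "B"
    ("+1-000-0002", "+44-000-0003"),    # "B" calls a number abroad
    ("+1-000-0002", "+1-000-0004"),     # "B" calls a domestic number
    ("+33-000-0005", "+92-000-0001"),   # someone else calls "A"
]


def contacts_of(records, number):
    """Numbers that called `number` or were called by it."""
    found = set()
    for caller, callee in records:
        if caller == number:
            found.add(callee)
        elif callee == number:
            found.add(caller)
    return found


def is_domestic(number, prefix="+1-"):
    """Assumption: a '+1-' prefix marks a U.S. number."""
    return number.startswith(prefix)


a = "+92-000-0001"
b = "+1-000-0002"
persons_of_interest = {a, "+44-000-0003"}   # hypothetical shared list

# Step 1: every number in contact with "A", in either direction.
a_contacts = contacts_of(records, a)

# Step 2: the U.S. numbers among them are the ones handed off.
domestic_leads = {n for n in a_contacts if is_domestic(n)}

# Step 3: did "B" contact any foreign persons of interest?
b_foreign_hits = contacts_of(records, b) & persons_of_interest

print(domestic_leads)
print(b_foreign_hits)
```

In this made-up data, "B" is the only domestic number tied to "A", and the scan shows "B" in contact with two numbers on the list. "B's" domestic call to the fourth number is left alone; in the scenario above that is the FBI's business, with a warrant.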

Were NSA to record and store all the conversations and emails and search histories of every American, it would take many more Utah Data Centers and a gigantic super-computer farm. The fact is, the agency doesn’t do that. NSA has no interest in pumpkin pie recipes, breakup stories, gossip or a recitation of your trip to Cozumel. It is trying to find bad guys, so face it, NSA just isn’t that into you.


