The cover has been blown on an NSA program which collects data on “nearly everything a user does on the internet” even as the debate rages over the secretive US agency’s mass surveillance of innocent people.
The XKeyscore program covers emails, social media activity and browsing history and is accessible to NSA analysts with little or no prior authorisation, according to a leaked presentation published by The Guardian yesterday.
The slide deck, disclosed by NSA whistleblower Edward Snowden and published alongside an accompanying story, was released just hours before NSA director General Keith Alexander was due to deliver an eagerly anticipated keynote presentation at the Black Hat security conference in Las Vegas.
If you were shocked by the NSA’s Prism program, hold onto your Black Hat: The NSA also operates another system, called XKeyscore, which gives the US intelligence community (and probably most of the US’s Western allies) full access to your email, IMs, browsing history, and social media activity. To view almost everything that you do online, an NSA analyst simply has to enter your email or IP address into XKeyscore. No formal authorization or warrant is required; the analyst just has to type in a “justification” and press Enter. To provide such functionality, the NSA collects, in its own words, “nearly everything a typical user does on the internet.” Perhaps most importantly, though, it appears that HTTPS and SSL might not protect your communications from being snooped on by the NSA.
This information comes from Edward Snowden, the NSA whistleblower who leaked documents and slides to the Guardian newspaper. The NSA has confirmed that XKeyscore exists, but insists that “multiple technical, manual and supervisory checks and balances” prevent the system from being abused. Snowden says, however, that “it’s very rare to be questioned on our searches, and even when we are, it’s usually along the lines of: ‘let’s bulk up the justification’.” Whether you are more inclined to believe Snowden or the NSA is up to you.
XKeyscore itself consists of 700 servers (running Linux!) situated at 150 different sites around the world, which are constantly scanning and indexing intelligence accrued by NSA’s data gathering tools (which are separate from XKeyscore). As far as we can tell, the data gathering tools are themselves a massive network of servers that are located in data centers around the world. These servers intercept and analyze data that traverses the internet and other communications networks. The bulk of what a typical user does on the internet is transmitted via HTTP (hypertext transfer protocol), and it’s relatively trivial to scoop out the interesting data from a packet of HTTP data. When you send an IM on Facebook, XKeyscore will have no problem working out who the sender and recipient are, and the body of the message. Likewise, when you use a webmail client like Gmail or Hotmail, the sender, recipient, CC, BCC, subject, and body are all easily accessible via HTTP packet sniffing.
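To make the point about plaintext HTTP concrete, here is a minimal sketch in Python of how little work it takes to pull fields out of an unencrypted request. The host, path, and form field names are invented for illustration; real webmail and chat services use their own formats, and nothing here reflects how XKeyscore itself is implemented.

```python
from urllib.parse import parse_qs

# Hypothetical plaintext HTTP request, as it might appear on the wire.
# The host, path, and form field names are invented for illustration.
RAW_REQUEST = (
    b"POST /mail/send HTTP/1.1\r\n"
    b"Host: webmail.example\r\n"
    b"Content-Type: application/x-www-form-urlencoded\r\n"
    b"\r\n"
    b"from=alice%40example.org&to=bob%40example.net"
    b"&subject=Lunch&body=See+you+at+noon"
)


def parse_http_request(raw: bytes) -> dict:
    """Split an unencrypted HTTP request into request line, headers, and form fields."""
    # Headers and body are separated by a blank line (CRLF CRLF).
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    method, path, _version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # parse_qs returns a list per key; keep the first value of each field.
    form = {key: values[0] for key, values in parse_qs(body.decode("utf-8")).items()}
    return {"method": method, "path": path, "headers": headers, "form": form}


message = parse_http_request(RAW_REQUEST)
print(message["form"]["from"], "->", message["form"]["to"])
print(message["form"]["body"])
```

Anyone positioned on the network path can read these fields when a request travels unencrypted, which is why the move to HTTPS matters so much to this story.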
As for where the NSA gets this data from, there are three sources: F6 (aka the Special Collection Service), FORNSAT (foreign satellite collection), and SSO (the Special Source Operations division). F6 places eavesdropping equipment in foreign embassies, data centers, and other important communications hubs; FORNSAT intercepts data from foreign satellite links; and SSO deals with everything else, such as cable and microwave taps.
How XKeyscore extracts information from HTTP sessions
Combined, these three sources harvest an almost-incomprehensible amount of data. According to the leaked slides, some sites produce so much data (20+ terabytes) that they only have space to store it for 24 hours. (Most of these slides are from 2008, though, so they may have upgraded their storage capacity since then.) As of 2012, there were 41 billion records available for analysis by XKeyscore within any given 30-day window.
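For a sense of scale, the figures above can be turned into rates with some back-of-the-envelope arithmetic. The sketch below assumes decimal terabytes (10**12 bytes) and an even spread across the day, neither of which the slides specify. Note that the 20+ terabyte figure applies to individual sites, while the 41 billion records are a system-wide total.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted in the article; decimal terabytes are an assumption.
bytes_per_day = 20 * 10**12          # "20+ terabytes" per day at some sites
records_per_window = 41 * 10**9      # records available in a 30-day window
window_days = 30

megabytes_per_second = bytes_per_day / SECONDS_PER_DAY / 10**6
gigabits_per_second = bytes_per_day * 8 / SECONDS_PER_DAY / 10**9
records_per_day = records_per_window / window_days

print(f"{megabytes_per_second:.0f} MB/s sustained at a 20 TB/day site")
print(f"{gigabits_per_second:.2f} Gbit/s sustained at a 20 TB/day site")
print(f"{records_per_day / 10**9:.2f} billion records per day on average")
```

Under those assumptions, a 20-terabyte site ingests roughly 231 MB every second, and the system as a whole adds well over a billion records a day.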
To use the XKeyscore (XKS) system, an NSA analyst taps in a few search parameters, a “justification” (i.e. no formal warrant is required), and presses Enter. XKS can be indexed by email or IP address, name, telephone number, keyword, language, or even the type of web browser. If the search returns an email or IM hit, the analyst can instantly view the contents of that message. Presumably there are other tools/viewers for other kinds of data. Because there’s so much data available, the NSA slides recommend that analysts narrow down their search results using the metadata first.
The slides say that, as of 2008, 300 terrorists had been caught with intelligence from XKS. In 2008, the slides also said that “future” capabilities will include VoIP and EXIF parsing (EXIF being the metadata associated with images, which can contain geolocation data).
What about HTTPS?
To be entirely honest, it isn’t all that surprising that XKS exists. Given the way the internet and its protocols work, it’s relatively easy to eavesdrop on most internet-based communications, and eavesdropping is essentially what the NSA was created for. It is also highly likely that, as with Prism, other Western nations have access to XKS, or to their own XKS-like systems.
What is surprising is that the slides seem to suggest that VPNs and encrypted links may not be secure. “Show me all PGP usage in Iran” and “Show me all VPN startups in country X, and give me the data so I can decrypt and discover users” seem to be functions available to analysts using XKS. This isn’t a direct admission they’ve broken ciphers such as AES-256 and 3DES, but it would seem that they’ve found some exploitable weaknesses.
This leads us to another important question: Can the NSA eavesdrop on HTTPS traffic? In recent years, many web services have moved to HTTPS as standard (such as Gmail), and in theory the encryption should keep your data safe from prying eyes. As of 2012, though, despite the widespread adoption of HTTPS, XKS still seems to be working as intended. Has the NSA cracked HTTPS? Has the NSA somehow obtained the root SSL certificates from the likes of Symantec and Comodo, so that it can perform man-in-the-middle (MITM) attacks on any website that uses HTTPS?
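The man-in-the-middle scenario rests on how HTTPS clients decide whom to trust. As a minimal sketch, Python's standard ssl module shows the default client policy: a server certificate must chain to a root the client already trusts, and it must match the hostname. This illustrates the general trust model only and is not something the leaked slides describe.

```python
import ssl

# Default client-side TLS settings in Python's standard library.
context = ssl.create_default_context()

# The server must present a certificate that chains to a trusted root...
print("certificate required:", context.verify_mode == ssl.CERT_REQUIRED)
# ...and that certificate must be valid for the hostname being contacted.
print("hostname checked:", context.check_hostname)
```

Because validation ends at the trusted roots, anyone able to sign with a trusted root's key could issue certificates that a client configured like this would accept, which is why the question about root certificates is the critical one.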
If HTTPS, PGP, and VPNs have been compromised, and if the NSA really has its insidious tentacles hooked into fiber-optic cables, microwave links, and foreign satellite links, there is almost no way of using the internet or any other communications network without the American and other Western governments snooping on you.
EDITOR’S NOTE: Big data and the ability to correlate are not about crime, but about the influence of money. Nearly all government departments are corrupted by revolving-door jobs and money from corporations – the FDA being about the worst. Let’s say you create a huge public fight against an insurance company that has friends in government. Suddenly, emails where you revealed some odd sexual predilections surface. You lose your job because of it and are too busy surviving to continue the fight. BTW, although it’s supposed to be illegal, US military intelligence does sometimes go to bat for big companies. I found that out through experience.
The Guardian reports that the top secret National Security Agency program allows analysts to search through a database “containing emails, online chats and the browsing histories of millions of individuals”. In the leaked documents, the NSA describes XKeyscore as its “widest-reaching” internet intelligence system.
Targets data in transit
The release is arguably the most significant disclosure about the NSA’s web surveillance operations since the first revelations about the spy agency’s controversial PRISM web data mining program, which collects data from email, chat and VoIP. That program harvests information from users of services provided by Google, Facebook, Apple, Yahoo! and AOL, and was said to have been carried out with the indirect assistance of those companies.
While PRISM involves stored data, XKeyscore appears to involve mining through data in transit, either from the premises of a telco or through a fibre-optic tap. Leaked training materials explain how analysts fill in a simple online form before gaining access to data sorted by identifiers, such as target email addresses. Only a broad justification of the reason for a request, which is reportedly not subject to a review by any court or senior NSA personnel, is needed.
The Guardian reports that the leaked files provide substance to Snowden’s claims that, while working as an NSA contractor, he could “wiretap anyone, from you or your accountant, to a federal judge or even the president, if I had a personal email”.
He made those claims in a video interview in early June soon after he outed himself as the source of leaks about the NSA’s secret surveillance programmes.
Analysts can combine XKeyscore with data from other NSA systems to obtain “real-time” interception of a target’s internet activity, said the paper.
“XKeyscore provides the technological capability, if not the legal authority, to target even US persons for extensive electronic surveillance without a warrant,” said The Guardian’s Glenn Greenwald.
They don’t even need to know who you are to track you down
According to the slides, spooks can query the system by name, telephone number, IP address and keywords as well as email address. Searching by email address alone will not reveal a target’s full range of activities on the net, and queries have to be carefully selected to prevent analysts being swamped with an unmanageable dump of information to sort through.
Spooks are advised to use metadata also stored in XKeyscore in order to narrow down their queries. Queries can be mixed and matched in order to try to pin down a group of suspects without even knowing targeting information, such as email addresses.
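The idea of mixing and matching metadata queries can be pictured as combining simple filters. The Python sketch below is purely illustrative: the record fields, placeholder values, and helper names are all invented, and nothing is known from the slides about how XKeyscore represents or queries its data.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Record:
    """Hypothetical metadata record; every field name here is invented."""
    country: str
    language: str
    uses_encryption: bool
    email: str


Predicate = Callable[[Record], bool]


def all_of(*predicates: Predicate) -> Predicate:
    """Combine predicates so a record must satisfy every one of them."""
    return lambda record: all(p(record) for p in predicates)


def search(records: Iterable[Record], predicate: Predicate) -> List[Record]:
    """Return the records that match the combined predicate."""
    return [r for r in records if predicate(r)]


records = [
    Record("country-x", "lang-a", False, "a@example.org"),
    Record("country-x", "lang-b", True, "b@example.org"),
    Record("country-x", "lang-b", False, "c@example.org"),
    Record("country-y", "lang-b", True, "d@example.org"),
]

# Narrow by metadata only: no email address is needed to form the query.
query = all_of(
    lambda r: r.country == "country-x",
    lambda r: r.language == "lang-b",
    lambda r: r.uses_encryption,
)

hits = search(records, query)
print([r.email for r in hits])
```

The email address comes out as a result of the query instead of going in as an input, which is the sense in which a target can be found without any prior identifier.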
One example cited in the training document says that XKeyscore can be used to search for someone whose language is out of place in a region, or who is using encryption and “searching the web for suspicious stuff”. Another example states that XKeyscore is the only system that allows analysts to directly target traffic from “VPN startups in country X” to “give me the data so I can decrypt and discover the users”.
“No other system performs this on raw unselected bulk traffic,” the 2008 vintage training manual (marked “Top Secret” and apparently shared only with the NSA’s peers in the UK, Australia, Canada and New Zealand) explains.
XKeyscore also provides a means to index exploitable computers in a specified country, as well as a way of obtaining the email address of persons of interest using Google Earth.
One leaked document describes how the program “searches within bodies of emails, webpages and documents”, including the “To, From, CC, BCC lines” and the “‘Contact Us’ pages on websites”. XKeyscore also allows analysts to pull together logs of the IP addresses of visitors to specified websites.
An NSA tool called DNI Presenter is used to read the content of harvested emails. The same tool enables analysts to read the content of Facebook private messages.
Content remains on the system for only three to five days, while metadata is stored for 30 days. One leaked document states: “At some sites, the amount of data we receive per day (20+ terabytes) can only be stored for as little as 24 hours.”
However, NSA systems allow flagged data on XKeyscore to be moved onto other databases such as Pinwale, where material can be stored for up to five years.
Despite the short shelf life of data stored on XKeyscore, in one month last year the system collected at least 41 billion total records.
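The retention rules described above amount to a simple tiered policy, sketched here in Python. The function and constant names are invented, and taking the upper bound of “three to five days” and 365-day years are simplifying assumptions, not details from the leaked documents.

```python
from datetime import datetime, timedelta

# Retention windows quoted in the article. Using the upper bound of
# "three to five days" and 365-day years is a simplifying assumption.
CONTENT_RETENTION = timedelta(days=5)
METADATA_RETENTION = timedelta(days=30)
FLAGGED_RETENTION = timedelta(days=5 * 365)


def is_retained(kind: str, collected_at: datetime, now: datetime,
                flagged: bool = False) -> bool:
    """Return True if a record of the given kind would still be held at `now`."""
    if flagged:
        # Flagged material is moved to longer-term storage.
        limit = FLAGGED_RETENTION
    elif kind == "content":
        limit = CONTENT_RETENTION
    elif kind == "metadata":
        limit = METADATA_RETENTION
    else:
        raise ValueError(f"unknown record kind: {kind!r}")
    return now - collected_at <= limit


collected = datetime(2013, 7, 1)
week_later = collected + timedelta(days=7)

print(is_retained("content", collected, week_later))                # False
print(is_retained("metadata", collected, week_later))               # True
print(is_retained("content", collected, week_later, flagged=True))  # True
```

The practical effect is that flagging, not collection, decides what survives: a week-old message has normally aged out, while its metadata and anything an analyst marked are still available.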
NSA training manuals state that 300 terrorists have been captured using intelligence from XKeyscore before 2008, a claim that will doubtless be used to justify the program and criticise its exposure.
In a statement to The Guardian, the NSA said: “NSA’s activities are focused and specifically deployed against – and only against – legitimate foreign intelligence targets in response to requirements that our leaders need for information necessary to protect our nation and its interests.”