VMI77 wrote:
jimlongley wrote:
Even if they are, that is a mind-boggling amount of data that they already, according to his testimony, don't have time to go through. By the time they get the ability to sort through it with any level of efficiency we will all be beyond caring.
A bunch of years ago I was the engineer of a network control center. The center monitored the condition (not the content) of a large (195 nodes) T1 network for the State of NY. We were required, per contract, to record and monitor ALL alarms and events on the network, and investigate each and every one, as well as keeping a database of those events, a raw record of those events, and a separate database of the trouble tickets generated and solutions. The amount of data gathered rapidly became so unmanageable that it was considered a joke to threaten someone on the staff with having to go look for a specific incident in the raw record. On top of that we kept a backup copy at a "geographically diverse location."
As the head engineer, I was charged with the responsibility for ensuring that all of the data was stored and accessible, and I hired a database "expert" to program the access to the raw data as well as sorting and storing it in a usable manner. It quickly became obvious that the state of the art was not up to the task, and the programmer was even behind that. (At one point, shortly before he was moved to another job, we noticed that the trouble ticket database was taking huge amounts of time to load, and it turned out that what he had done to handle completed trouble tickets was set a "delete flag" on those tickets, so that when you searched for open tickets they got ignored, but every time you accessed the database ALL of it got loaded. One day my lead computer operator, who was not a programmer herself but had some programming ability, decided to see if she could improve the speed by "packing" the database, and the command she issued deleted all of the data with delete flags set. That was when it was decided that our programmer would be better off in a different job and that it was a good thing to have a separate copy of the database off site.) Forgot to mention, one salient thing on his resume was his experience with NSA.
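For anyone who hasn't run into the "delete flag" pattern, here is a minimal sketch of it in Python. This is not the original system, which isn't described in enough detail to reproduce; the `Ticket` and `TicketStore` names and the sample tickets are made up for illustration. It shows why flagged tickets still cost load time, and why "packing" is a one-way operation.

```python
from dataclasses import dataclass


@dataclass
class Ticket:
    ticket_id: int
    summary: str
    deleted: bool = False  # the "delete flag": hidden from searches, still stored


class TicketStore:
    """Hypothetical in-memory stand-in for the trouble ticket database."""

    def __init__(self, tickets):
        self.records = list(tickets)

    def close(self, ticket_id):
        # Completing a ticket only sets the flag; the record stays in the store.
        for ticket in self.records:
            if ticket.ticket_id == ticket_id:
                ticket.deleted = True

    def open_tickets(self):
        # Every record is still read and checked; flagged ones are just skipped.
        return [t for t in self.records if not t.deleted]

    def pack(self):
        # Physically removes flagged records. Nothing here can bring them back.
        kept = [t for t in self.records if not t.deleted]
        removed = len(self.records) - len(kept)
        self.records = kept
        return removed


# Illustrative data only.
store = TicketStore([
    Ticket(1, "alarm cleared"),
    Ticket(2, "still under investigation"),
    Ticket(3, "line repaired"),
])
store.close(1)
store.close(3)

stored_before_pack = len(store.records)   # completed tickets still take up space
visible = store.open_tickets()            # only the open ticket is returned
removed = store.pack()                    # completed tickets are gone for good
stored_after_pack = len(store.records)
```

After `pack()` the only way to get the completed tickets back is a separate copy, which is the point about the off-site backup.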
We got a new programmer who understood the troubleshooting process and trouble tickets and things got a little better.
All of this with the State of New York looking over our shoulders and nitpicking.
I eventually quit the job and went back to something more comfortable.
I don't think NSA has the ability, now or in the near future, to process that data, and the amount will continue to grow as they sit on it, so I don't much care what they are keeping of mine.
I think the key in your remarks is "a bunch of years ago." A bunch of years ago system modeling in my industry could take hours; now what took hours is done in a few seconds. What took about 30 minutes five years ago takes about 30 seconds now. And we just have PCs, not supercomputers. He's been out of the NSA for over 10 years and there have been major advances in computing ability during that time. And in the context he's talking about, search time is not a critical factor. He's not talking about real-time monitoring but looking back through stored data for either legitimate criminal or illegitimate and nefarious politically motivated investigations. In either case it doesn't really matter if it takes an hour, a day, a week, or a month to pull from the database.
Also, this article doesn't address it, but the original court testimony he referred to also included testimony from an AT&T technician about the interface alluded to in the article. They're not just collecting emails, they're collecting everything... URLs visited, streaming audio and video (not the URL, the actual stream), online chats, online phone conversations, purchase data... everything.
Actually, it was 20+ years ago and T1 was the "ne plus ultra" for networks, but the higher speeds have been offset by higher density of traffic and content. And I realize he's been out of NSA for more than 10 years, and I believe the same applies there.
Dragonfighter wrote:
For years, and I mean yeeeaars, phone traffic has been monitored to flag certain "keywords". All an agency has to do is set a combination of keywords they are looking for and the system will filter millions of emails in a matter of a few hours. So here is the scenario: You are looking for fundamentalist gun owners and you set your filter this particular day to "weapons, 2a, constitutional, (any number of makers' brand names), etc." So you get a few million emails either sent, forwarded or replied to that discuss the 2A in conjunction with weapons and bingo, a list of possible dissidents. The "agency" now simply reconfigures for each group of "inconvenient citizens". They have the data, all they need to do is sift it. And since they are allowed to gather the data without due process, what prevents some Chicago mobster turned attorney general from sorting it out and then setting about squelching anyone in their way? The only possible solution is to have that database eradicated and cause for data collection very narrowly tailored to actual threat profiles.
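To make the "set the keywords, then sift" idea concrete, here is a rough sketch of post-processing stored text against a configurable keyword list. It assumes the messages are already plain text held in memory, which glosses over the format and volume problems raised elsewhere in this thread. The function names, keywords, and sample messages are placeholders, not anything from a real system.

```python
import re


def build_filter(keywords):
    """Return a function that reports whether any keyword appears as a whole word."""
    alternatives = "|".join(re.escape(word) for word in keywords)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)

    def matches(text):
        return pattern.search(text) is not None

    return matches


def sift(messages, keywords):
    """Post-process stored messages: keep only those that hit the keyword list."""
    matches = build_filter(keywords)
    return [message for message in messages if matches(message)]


# Placeholder data for illustration.
stored_messages = [
    "Please send the invoice today",
    "Lunch at noon?",
    "REFUND requested for order",
]

first_pass = sift(stored_messages, ["invoice", "refund"])
second_pass = sift(stored_messages, ["lunch"])  # "reconfigured" for another list
```

Changing the list is trivial, but a hit only says a word appeared, not why. That is where false positives come from.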
Sorry, I don't believe that totally, there are just too many competing digital formats, not to mention analog, to do that in real time, which is what monitoring is. If it's being post processed, which using a database implies, then it is not monitoring.
I do believe that monitoring has taken place in some circumstances. I have even participated in both monitoring and failed attempts at monitoring. I actually had a lot of fun monitoring a lottery network back then. We were watching for errors in the lines, much easier to detect than specific words or word patterns, and my partner and I actually figured out not only the polling sequences and addresses of stations on the network, but could even tell the difference, at a glance, between a quick pick ticket for one game and a selected ticket with choices penciled in, and even what the choices were, as well as which station originated the ticket.
We were sitting there running a test overseen by the State of NY and the computer company, and as Rich and I started saying what kind of ticket was being sent from which machine, the computer company person went ballistic on us, insisting on an immediate meeting with us, the State, and a variety of other players including a contracted security team. The security team basically laughed at them after they explained what the problem was, telling them that hours and hours of repetition watching those lines under test would have resulted in an easy education for anyone smart enough to be able to operate the sophisticated equipment we were using to monitor the lines to begin with. End of meeting, but then a "secrecy oath" to be signed so that we would not "expose" the secrets we had discovered.
------
But if I moved up a couple of levels, or down as the case may be, in the protocol stacks, I not only would have to be able to sort by station address, but which of several T1 "channels" that station was transmitting on, and then which of several available T1s in a T3, and so on, adding levels of complexity until the current state of the art. And that's just in the digital domain without any encryption. If it's analog, then you have to get it all the way back to analog in order to be able to tell what was said.
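To put numbers on that layering: a T1 carries 24 channels of 64 kbit/s each, and a T3 carries 28 T1s. The sketch below works out the channel count and shows the bookkeeping needed to say which T1 and which channel a given conversation sits on. The flat channel index and the `locate` helper are simplifications for illustration. Real T3 multiplexing interleaves bits through an intermediate level rather than laying channels out in order.

```python
DS0_RATE_BPS = 64_000   # one voice-grade channel
DS0_PER_T1 = 24         # channels in one T1
T1_PER_T3 = 28          # T1s multiplexed into one T3

channels_per_t3 = DS0_PER_T1 * T1_PER_T3
payload_bps_per_t3 = channels_per_t3 * DS0_RATE_BPS  # excludes framing overhead


def locate(flat_channel):
    """Map a 0-based channel index within a T3 to (t1_index, channel_in_t1).

    Assumes channels are numbered T1 by T1, which is a simplification.
    """
    if not 0 <= flat_channel < channels_per_t3:
        raise ValueError("channel index outside this T3")
    return divmod(flat_channel, DS0_PER_T1)


example = locate(100)  # which T1, and which channel on it
```

That is 672 channels to keep track of on a single T3, before station addresses, encryption, or anything analog enters the picture.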
As a telephone engineer I participated in attempts to automatically monitor voice conversations, and the efforts were laughable. Look at voice recognition menus today, and how many errors are generated. An agency relying on real time monitoring of analog telephone traffic looking for specific keywords had better hope that the person they were monitoring spoke with no accent, at an even rate, and did not have a cold.
Do I believe "they" are trying to do it? Yes, most absolutely. But do I believe they are succeeding? Not anywhere nearly to the extent that movies and TV shows would have us believe. And as I have said, the amount of digital data out there is just monstrously huge, and post processing that in all of its various formats is not a logistical problem I would like to even attempt, and I have troubleshot IBM SNA networks using a protocol analyzer that did not decode SNA.