
I was on vacation when the NSA story hit the wire, so this question may already have been asked.

Has anyone asked if the NSA is using speech-to-text to store our phone calls into a big database? I mean, specifically asked an Administration official if they are using speech-to-text technology?

If they are doing this, they can deny that they are listening to our phone calls because they are actually reading them.

Pretend that you are a consultant working for the NSA. You know that you cannot hire enough people to listen to every phone call in the United States, so you pay a few million dollars to develop a special-purpose computer processor that can analyze the narrow-band, digitally encoded audio that is a modern phone call, and do it several orders of magnitude faster than a general-purpose processor like the one in your desktop.

Then you put 64 of these processors on one computer board. Then you put 64 of these boards in one computer cabinet. Then you rent a room from Verizon at their main hub and put a few of these cabinets in it. Maybe more than a few. Hell, you have a budget that would make King Midas blush!
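As a rough sanity check on the scale described above, here is a minimal sketch. The 64-by-64 layout comes from the diary; the speed-up factor of 1,000 is purely an assumption standing in for "several orders of magnitude," and the function name is made up for illustration.

```python
def concurrent_calls(processors_per_board: int, boards_per_cabinet: int,
                     realtime_speedup: float) -> float:
    """Calls one cabinet could transcribe at once.

    A processor that runs `realtime_speedup` times faster than real time
    can keep up with that many simultaneous calls.
    """
    return processors_per_board * boards_per_cabinet * realtime_speedup

processors = 64 * 64       # layout from the diary: 4096 processors per cabinet
ASSUMED_SPEEDUP = 1_000    # assumption, not a known figure

print(processors)
print(concurrent_calls(64, 64, ASSUMED_SPEEDUP))
```

Under that assumed speed-up, a single cabinet would keep pace with roughly four million simultaneous calls, which is why "a few of these cabinets" is the interesting part of the scenario.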

Once you have all the text in a database, you then get paid to develop database queries that identify certain conversations as "terrorism related." Then you send the results of your query to the FISA court to get authorization to fetch the full text of the phone call or maybe even the original audio.
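To make the "database query" step concrete, here is a toy sketch using Python's built-in sqlite3 module. The table name, column names, sample transcripts, and keyword are all invented for illustration; nothing here reflects how any real system works, and a simple substring match is about the crudest query one could write.

```python
import sqlite3

def flag_transcripts(conn, keywords):
    """Return ids of transcripts containing any keyword (case-insensitive)."""
    if not keywords:
        return []
    # One LIKE clause per keyword, joined with OR; values are bound as parameters.
    clauses = " OR ".join("lower(text) LIKE ?" for _ in keywords)
    params = [f"%{k.lower()}%" for k in keywords]
    rows = conn.execute(
        f"SELECT call_id FROM transcripts WHERE {clauses} ORDER BY call_id",
        params,
    )
    return [row[0] for row in rows]

# Hypothetical in-memory database with made-up transcripts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcripts (call_id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany("INSERT INTO transcripts VALUES (?, ?)", [
    (1, "Cajun, no sour cream please"),
    (2, "meet me at the Depot at noon"),
    (3, "the depot is closed on Sunday"),
])

flagged = flag_transcripts(conn, ["depot"])
print(flagged)  # [2, 3]
```

Even this toy shows the problem: an innocent call about a closed depot gets flagged right alongside the other one.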

You overload the FISA court with hundreds of requests per day. If you have any sense you will behave like a financial derivative designer and make some of the queries so complicated that the regulators can't understand them. After all, we aren't paying you a hundred grand a year for a simple keyword search, are we?

Soon enough, the FISA court is overloaded so they do the sensible thing and issue a warrant to access the original phone call if it gets a hit on any database queries on the "approved list."

Does this scenario seem plausible to anyone else?


  •  It's possible (6+ / 0-)

    Speech-to-text isn't an exact science; all of Google's cloud power still gets speech-to-text wrong on a somewhat regular basis on my Android, and my accent is eastern US.

    They could be doing any number of things, up to and including storing compressed recordings of all conversations. Or they could be doing exactly what they say they're doing and only looking at the metadata once they get a tip that originates outside of the metadata.

    The problem is, we don't know what they're doing. And frankly, given the partial evidence available and their quick defensive reaction, I don't think they deserve any trust when they're answering questions.

    Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves. - William Pitt

    by Phoenix Rising on Tue Jun 18, 2013 at 03:29:14 PM PDT

    •  I wouldn't be surprised if the NSA had (6+ / 0-)

      better speech to text than google.

      And really, we can pretty much assume that they are using it. It's just a question of how broad the use is. They could easily store all the phone calls made every day in text format. And that's not mentioning text messaging, which I'd bet is used more than calls.

      If debt were a moral issue then, lacking morals, corporations could never be in debt.

      by AoT on Tue Jun 18, 2013 at 03:34:59 PM PDT

      [ Parent ]

      •  Even if they only vacuumed up a third of the calls (2+ / 0-)
        Recommended by:
        patbahn, radical simplicity

        it would still be a useful intelligence tool.

        And they could expand to get 2/3 next year, etc.

      •  I actually would be surprised (4+ / 0-)
        Recommended by:
        Ender, elfling, Simplify, 207wickedgood

        I'm not saying it's impossible; just that it doesn't seem likely. Speech-to-text doesn't seem like it would be a core skill for the NSA; I think they'd be more likely to farm it out to a company in that space. And any commercial company that could do it that well would want to commercialize the technology.

        Speech-to-text is hard. Doing it with high accuracy for any arbitrary voice is really hard. (Just ask Siri.) Anyone who could do it that well would stand to make a ton of money from it. That alone would make it very difficult to keep the technology's very existence a secret.

        Let us all have the strength to see the humanity in our enemies, and the courage to let them see the humanity in ourselves.

        by Nowhere Man on Tue Jun 18, 2013 at 03:50:31 PM PDT

        [ Parent ]

        •  But it would be so very very useful (1+ / 0-)
          Recommended by:
          radical simplicity

          to a secret government agency.

        •  It isn't new technology. It is just faster. (1+ / 0-)
          Recommended by:
          radical simplicity

          Sort of like how spy satellites take better photos than consumer cameras.

        •  NSA has been known to dabble in software (2+ / 0-)
          Recommended by:
          Ender, radical simplicity

          Remember, this is the agency that gave us some of our strong encryption, and approved our latest official encryption standards. They're the organization that donated a lot of code to Linux for compartmentalized security access. They are the go-to agency within the government when you want to secure your system.

          They've been behind ECHELON and now PRISM. Is it so much to think that they have better speech-to-text than Google?

          Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves. - William Pitt

          by Phoenix Rising on Tue Jun 18, 2013 at 04:01:20 PM PDT

          [ Parent ]

          •  One does not simply dabble in speech recognition (1+ / 0-)
            Recommended by:
            Ender

            (There's a meme in there somewhere, isn't there?)

            The data mining algorithms in PRISM, if I'm not mistaken, were based on technologies produced by an outside vendor and sold to the NSA. I'd expect the same for speech recognition. Again, I'm not saying it would have to be done this way, but I think it's unlikely that they'd be doing it in-house.

            Let us all have the strength to see the humanity in our enemies, and the courage to let them see the humanity in ourselves.

            by Nowhere Man on Tue Jun 18, 2013 at 07:34:17 PM PDT

            [ Parent ]

        •  Wouldn't need high accuracy (4+ / 0-)

          Siri-level accuracy would be plenty for blanket surveillance.

          Freedom isn't free. Patriots pay taxes.

          by Dogs are fuzzy on Tue Jun 18, 2013 at 04:02:38 PM PDT

          [ Parent ]

          •  Bingo (0+ / 0-)

            Even if there is, say, a 5% miss rate in the conversion, that means 95% of what's said is correctly translated. The software that analyzes the resulting text is still going to have plenty to work with - and if a mis-translation ends up causing a person to be flagged for review, the low-level subject matter experts would be able to check the context to determine whether or not the person was flagged appropriately without ever needing to request a warrant to listen to the actual voice call.
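The arithmetic behind that point can be sketched as follows. It assumes each spoken word is missed independently with the same probability, which is a simplification (real recognition errors cluster around accents, noise, and unusual words), and the function name is made up for illustration.

```python
def detection_probability(miss_rate: float, mentions: int) -> float:
    """Chance that a keyword is transcribed correctly at least once.

    Assumes every mention is missed independently with probability
    `miss_rate`; that independence is an assumption, not a measured fact.
    """
    return 1.0 - miss_rate ** mentions

for mentions in (1, 2, 3):
    print(mentions, detection_probability(0.05, mentions))
```

With a 5% miss rate, a word said once is caught 95% of the time, and a word said twice is caught about 99.75% of the time under this model.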

        •  some advantage (1+ / 0-)
          Recommended by:
          Ender

          Build a big enough database and you can apply learning algorithms.

          The typical speech-to-text problem starts from zero, but if you have hundreds of conversations from the same person and you have "profiles" built up (North American, eastern accent, tenor voice), you can slowly build up accuracy.

        •  The NSA hires a shit ton of people (2+ / 0-)
          Recommended by:
          Ender, radical simplicity

          who are trained in all the necessary skills to make it happen. And no, it would be rather easy to keep it secret. They're constantly updating it and so they can have their partner release the old versions commercially. It is hard, but commercial speech to text is getting much better.

          If debt were a moral issue then, lacking morals, corporations could never be in debt.

          by AoT on Tue Jun 18, 2013 at 04:26:59 PM PDT

          [ Parent ]

          •  I suppose I should mention (2+ / 0-)
            Recommended by:
            AoT, Ender

            that I currently work in the field. No, I'm not a speech scientist or engineer. I work with them and use their stuff. But it gives me some background in the work that's involved, not to mention the current state-of-the-art. Which, honestly, isn't really all that good if you're critically depending on it. But yes, it is getting better.

            So the NSA is hiring speech scientists and engineers? It's possible. But do you have any proof, or is that an extrapolation on your part? I mean, yeah, they hire a lot of smart people -- but that doesn't mean that they hire cardiologists, either. (Then again...)

            Let us all have the strength to see the humanity in our enemies, and the courage to let them see the humanity in ourselves.

            by Nowhere Man on Tue Jun 18, 2013 at 07:39:08 PM PDT

            [ Parent ]

    •  Yes. You would have a lot of false translations (1+ / 0-)
      Recommended by:
      JML9999

      and there may be significant lag while audio is waiting to be converted.

  •  What They Say They're Recording Are Just the (5+ / 0-)

    contact data: what number you called, how long you were on, date & time. Not the content of the call itself.

    There are reasons to think they're not recording the complete content of all calls and emails, mainly the astronomical amount of data that is. But much isn't known.

    Given the huge variety of accents heard within the US I would strongly doubt they'd translate voices to text.

    We are called to speak for the weak, for the voiceless, for victims of our nation and for those it calls enemy.... --ML King "Beyond Vietnam"

    by Gooserock on Tue Jun 18, 2013 at 03:39:26 PM PDT

    •  Then Again what they say is the "Least untruthful" (2+ / 0-)
      Recommended by:
      Ender, Gooserock

      answer.

      On the one hand, they have the Local Usage Data program: who called whom, and when.

      Then again, the "Intelligence Community" has its fingers in the bowels of the long-distance network, as documented by Risen and Lichtblau. Which means they have their fingers in the bowels of the internet.....

      I want 1 less Tiny Coffin, Why Don't You? Support The President's Gun Violence Plan.

      by JML9999 on Tue Jun 18, 2013 at 03:48:27 PM PDT

      [ Parent ]

    •  Maybe they have a loose definition of metadata (0+ / 0-)

      We have a new loose definition of "immediate" after all.

      Any computer generated output based on the original phone call could be considered metadata.

    •  Speech compresses well. (6+ / 0-)

      The phone system isn't exactly high quality data. A single voice channel on a T1 or ISDN line, uncompressed, is only 64 kbps. And voices have a limited frequency range.
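A back-of-the-envelope sketch of what that 64 kbps figure means for storage. Only the 64 kbps rate comes from the comment; the call volume, average call length, and 8:1 compression ratio are assumptions picked to make the arithmetic easy, and the function names are made up for illustration.

```python
BITS_PER_SECOND = 64_000  # one uncompressed voice channel, per the comment

def bytes_for_call(seconds: float, bits_per_second: int = BITS_PER_SECOND) -> float:
    """Bytes needed to store one channel of a call of the given length."""
    return seconds * bits_per_second / 8

def daily_storage_bytes(calls_per_day: int, avg_seconds: float,
                        compression_ratio: float = 1.0) -> float:
    """Total bytes per day, optionally divided by an assumed compression ratio."""
    return calls_per_day * bytes_for_call(avg_seconds) / compression_ratio

one_minute = bytes_for_call(60)
print(one_minute)  # 480000.0 bytes, a bit under half a megabyte

# Assumed figures: one billion calls a day, three minutes each, 8:1 compression.
assumed_daily = daily_storage_bytes(1_000_000_000, 180, compression_ratio=8)
print(assumed_daily / 1e12, "terabytes per day")
```

Under those assumed numbers the total comes to about 180 terabytes a day per channel direction, which is large but hardly astronomical for an agency with that kind of budget.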

      Necessity is the plea for every infringement of human freedom. It is the argument of tyrants; it is the creed of slaves. - William Pitt

      by Phoenix Rising on Tue Jun 18, 2013 at 04:03:25 PM PDT

      [ Parent ]

      •  Yeah and Because It's Low Quality It's Also (3+ / 0-)
        Recommended by:
        PeterHug, Ender, Nowhere Man

        more prone to perception errors. That's been cited as a reason cell phones are dangerous in cars even hands free: they make the brain work overtime decoding the sounds, to the distraction of the driver.

        So I'd think if they wanted original data they'd certainly compress it, which can be done losslessly, but I wouldn't think they'd want to contaminate it with translation errors.

        We are called to speak for the weak, for the voiceless, for victims of our nation and for those it calls enemy.... --ML King "Beyond Vietnam"

        by Gooserock on Tue Jun 18, 2013 at 04:11:59 PM PDT

        [ Parent ]

      •  Heck, it's hard for ME to tell what people (3+ / 0-)
        Recommended by:
        Kristin in WA, Ender, Gooserock

        are saying via cell phone.

        Government and laws are the agreement we all make to secure everyone's freedom.

        by Simplify on Tue Jun 18, 2013 at 04:44:06 PM PDT

        [ Parent ]

  •  I'd be very surprised if the NSA weren't doing (1+ / 0-)
    Recommended by:
    Ender

    some sort of blanket speech-to-text conversion in order to do automated analysis and a search for keywords, at least on overseas calls.

    •  for years . . . (4+ / 0-)
      Recommended by:
      skrekk, Ender, 207wickedgood, Nowhere Man

      there was chatter in the trade press at least five years ago about very big buck RFPs for both speech recognition software and hardware . . . enough to take at least a cursory look at everything.  NSA itself has more than once disclosed keyword searching on all international calls, and they have a couple orders of magnitude more hardware than they'd need for just that.

      It's all part of how they winnow down the call volume to make storage practical . . . if you call High Tech Burrito and say "Cajun, no sour" they probably won't save the content of the call.  Unless they're already on you for something else . . .

      Fake Left, Drive Right . . . not my idea of a Democrat . . .

      by Deward Hastings on Tue Jun 18, 2013 at 04:37:16 PM PDT

      [ Parent ]

      •  I did research work in this area over 20 years ago (4+ / 0-)

        and developed a morphemic-based predictive parser that operated in real time and was reasonably speaker independent. Without going into details I'd be very surprised if the DoD were less interested in the field today than they were back then, especially given the advances in hardware and predictive parsing algorithms, envelope constraints, etc. It's just a lot easier to do today.

      •  Yes, this seems feasible (1+ / 0-)
        Recommended by:
        Ender

        Listening in for specific keywords or phrases, and flagging them as potential candidates for review, makes a lot more sense with the current technology (or what I know of it) than trying to do general speech-to-text on all calls. I'd find this a more credible scenario, though I have no idea if they're actually doing it across the entire phone network.

        Let us all have the strength to see the humanity in our enemies, and the courage to let them see the humanity in ourselves.

        by Nowhere Man on Tue Jun 18, 2013 at 08:00:41 PM PDT

        [ Parent ]

  •  I'd be modestly surprised... (2+ / 0-)
    Recommended by:
    Nowhere Man, Ender

    Partly because the technology isn't that good (and compression isn't that bad, esp. on a 56kb/s data stream), and largely because the conversations of most interest are likely to be in something other than English.

    I'm sure the NSA does keep a lot of conversations in text form, but in those cases, they're probably most often translated and transcribed by people.

    •  Or English in Heavily Accented Forms (1+ / 0-)
      Recommended by:
      Ender

      which is basically all dialects.

      We are called to speak for the weak, for the voiceless, for victims of our nation and for those it calls enemy.... --ML King "Beyond Vietnam"

      by Gooserock on Tue Jun 18, 2013 at 09:13:36 PM PDT

      [ Parent ]
