Intel’s updated Smart Sound Technology DSP found in 2014’s Broadwell promises to outsmart Siri in every way possible. The catch? It’s already partially here in some Bay Trail configurations.
Voice recognition today has a problem: it’s just not that good. Current voice recognition setups like Siri and Android’s Voice Actions show potential, but are incredibly taxing on system hardware.
The more accurate of the two, Siri, relies heavily on the system on a chip’s digital signal processor (DSP) to augment the CPU (Apple contracted Audience to build this chip), removing background noise and preparing the speech audio files to be sent to the cloud for processing. But even then, the fatal flaw of Siri (and of other voice recognition solutions) is that in order to work at least semi-accurately, the user needs to speak to it in a direct voice: louder than average, focused and not natural.
Adding to this lack of natural feeling is that voice recognition solutions today are “push-to-talk”. They have to be triggered. On the iPhone this means hitting the Home button to get Siri to ask “what can I help you with?” or hitting the microphone button in Android to get it to patiently listen for your request.
Intel is planning on changing that with its refreshed Smart Sound Technology (SST) DSP, which is currently in some Bay Trail configurations and will be wholly integrated with 2014’s Broadwell. It promises a more natural voice recognition system than current ecosystem offerings, and one that is less taxing on the CPU than Siri and Google’s Voice Actions, because the processing bypasses the CPU entirely.
Right now, as per leaked Intel explainer slides seen by VR-Zone, Windows 8 offers software support for offloading voice processing to the DSP. Intel’s slides don’t get too specific on the software side of things, but it has to be assumed that the work is being done in collaboration with multiple stakeholders.
Pushing audio processing to the DSP allows the device’s personal assistant app, codenamed “Genie”, to be always listening for instructions while in a battery-saving low-power mode. This means no more jabbing the home button for “push-to-talk” or clicking specific icons; users simply speak naturally: “Hello computer, please call Sam.” Intel says that on 2014’s Broadwell chips users will get a confirmation prompt after they say the “Hello computer” part of the phrase, while in 2015 they will be able to do it in “one shot”, as the company projects accuracy will have improved by then. Intel also says the system will be matched to a specific user’s voice, so another person in the room doesn’t inadvertently trigger the assistant software.
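To make the difference between the two-step flow and the “one shot” flow concrete, here is a minimal Python sketch. It is purely illustrative and not based on Intel’s slides or any SST API: the names `Utterance` and `process` are hypothetical, speech is assumed to arrive already transcribed, and voice matching is reduced to comparing a speaker label, whereas the real system would do this on audio inside the DSP.

```python
from dataclasses import dataclass

WAKE_PHRASE = "hello computer"  # wake phrase quoted in the article


@dataclass
class Utterance:
    speaker: str  # assumption: speaker identity is already resolved to a label
    text: str     # assumption: audio is already transcribed to text


def process(utterances, enrolled_speaker, one_shot):
    """Return the events a hypothetical assistant would emit.

    one_shot=False models the 2014 flow: wake phrase, prompt, then command.
    one_shot=True models the projected 2015 flow: wake phrase and command
    in a single utterance.
    """
    events = []
    awaiting_command = False
    for u in utterances:
        # Ignore anyone other than the enrolled user.
        if u.speaker != enrolled_speaker:
            continue
        text = u.text.strip()
        if awaiting_command:
            # The previous utterance woke the assistant; this is the command.
            events.append(("command", text))
            awaiting_command = False
            continue
        if not text.lower().startswith(WAKE_PHRASE):
            continue  # not addressed to the assistant
        rest = text[len(WAKE_PHRASE):].lstrip(" ,")
        if one_shot and rest:
            events.append(("command", rest))
        else:
            # Two-step flow: confirm, then wait for the command.
            events.append(("prompt", "listening"))
            awaiting_command = True
    return events


two_step = process(
    [
        Utterance("guest", "Hello computer"),   # ignored: not the enrolled user
        Utterance("owner", "Hello computer"),
        Utterance("owner", "please call Sam"),
    ],
    enrolled_speaker="owner",
    one_shot=False,
)
single = process(
    [Utterance("owner", "Hello computer, please call Sam")],
    enrolled_speaker="owner",
    one_shot=True,
)
print(two_step)
print(single)
```

In the two-step case the guest’s attempt is dropped and the owner gets a prompt before the command is accepted; in the one-shot case the same command is extracted from a single sentence.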
The chart below maps out the path between Genie (the app), the requisite APIs, the Intel SST driver, and the DSP:
Intel says it’s working with DTS and Waves (for Maxx Audio 3) to provide support for output sound CODECs. This has less to do with voice recognition, and more to do with enhancing audio output (“widening”) without increasing CPU draw.
Right now it isn’t clear if Intel will be offering this audio technology as a licensed package to developers, or if Intel will package and brand the app itself. Slides from Intel appear to show the company intends to license it. As Microsoft and Intel have long been partners, perhaps this will be a part of Microsoft’s upcoming “Cortana” personal assistant application.
While the technology sounds promising, and is potentially a Siri-killer from a technical perspective, Intel’s lack of mobile market penetration is holding it back. OEMs haven’t shown much enthusiasm for placing Intel chips in their products yet. As speech recognition becomes ever more popular amongst a certain user set, enhanced speech recognition could be a big selling point, but without pickup from OEMs this will be nothing more than a paper win, and it will not make touch screens obsolete.