New Device Could Someday Translate Thoughts into Speech


A system that pairs AI with a vocoder, a type of speech synthesizer, could monitor an individual’s brain activity to reconstruct the words they imagine speaking.

Neuroengineers at Columbia University’s Zuckerman Institute have created a system that translates thought into intelligible, recognizable speech. The work could lead to new ways for computers to communicate directly with the brain, and could help people whose speech is impaired by conditions such as stroke or amyotrophic lateral sclerosis (ALS).

The new system is described in a research paper published in the journal Scientific Reports. In a statement, senior author Nima Mesgarani said that our voices help connect us to friends, family, and the world around us, which is why losing one’s voice to injury or disease is so devastating. The new study points to a potential way to restore that ability, he added.

The neuroengineers have shown that, with the right technology, people’s thoughts can be decoded into speech that any listener can understand.

Studies have shown that distinct patterns of brain activity appear when someone speaks or imagines speaking, and also when a person listens, or imagines listening, to others.

Efforts to harness these effects to decode brain signals have proved challenging. Earlier work focused mainly on simple computer models that analyzed spectrograms, which are visual representations of sound frequencies.

In the new study, however, the research team used a vocoder, a computer algorithm that can synthesize speech after being trained on recordings of people talking. According to Mesgarani, it is the same technology that voice assistants such as Apple’s Siri and Amazon’s Echo use to give verbal responses to people’s questions.

To train the vocoder to interpret brain activity, the researchers asked epilepsy patients who were already undergoing brain surgery to listen to words spoken by different people while the team measured their patterns of brain activity.

In the next step, the same patients listened to speakers reciting the digits 0 to 9 while the team recorded their brain signals and fed the recordings through the vocoder. The sound the vocoder produced in response to those signals was then analyzed and cleaned up using neural networks.

The result was a robotic-sounding voice reciting a sequence of numbers. To test its accuracy, the team had listeners hear the recording and found that they could understand and repeat the sounds about 75% of the time, well beyond any earlier attempts.

The vocoder, together with powerful neural networks, reproduced the voices the patients had originally listened to more accurately, Mesgarani said.

Rohit Bhisey

