Training Deep Speech: How you can help

May 7, 2018
How we train DeepSpeech for Mycroft and how you can help

Last month we released DeepSpeech support for Mycroft.  Many have now tried it and been, well, underwhelmed by the performance.  Was this a colossal failure?

Welcome to the Evolution

Short answer:  Nope, this isn’t a failure at all.  DeepSpeech is behaving exactly as we expected.

The data that Mozilla has used to train DeepSpeech so far is largely from individuals reading text on their computers.  A bit also came from speakers at conferences.  In both cases, the person in the recording is very careful to speak plainly and directly into a microphone.  So DeepSpeech learned to understand speech, but only clear speech like the speech it was trained on.  No slurred words or yelling from across the room.

As with many interesting problems, the first pass isn’t perfect.  But without it there would never be a second, third or fourth pass.

Machine Learning Loops

The exciting thing about this step is that we have created a machine learning loop.  We took an imperfect dataset, trained on it to produce a model, and now we are using this model to create more data to place in a better dataset.

In the next few weeks, you will see a new tagger at Home for validating the voice data we have collected.  This will create an invaluable real-world voice dataset to include in the training: the Mycroft Open Voice Dataset.  Mozilla will be granted access to it, which completes this machine learning loop.

Wash.  Rinse.  Repeat.  Every time this cycle completes, things will be a little bit better.
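
For the curious, the cycle described above can be sketched in a few lines of Python.  This is only a toy illustration, not how DeepSpeech or the Mycroft tagger is actually implemented: every name here (Clip, train, transcribe, validate, run_loop) is hypothetical, and the "model" is a stand-in that does no real speech recognition.  It simply shows the shape of the loop: train on what you have, use the model on new recordings, keep what humans validate, and train again on the larger dataset.

```python
from dataclasses import dataclass


@dataclass
class Clip:
    """One recorded utterance. All fields are illustrative, not a real schema."""
    audio_id: str
    transcript: str = ""      # filled in by the model
    validated: bool = False   # set when a human tagger approves the clip


def train(dataset):
    """Stand-in for training: the 'model' only remembers how much data it saw."""
    return {"trained_on": len(dataset)}


def transcribe(model, clips):
    """Stand-in for inference: attach a placeholder transcript to each clip."""
    for clip in clips:
        clip.transcript = f"guess-for-{clip.audio_id}"
    return clips


def validate(clips, approved_ids):
    """Stand-in for community tagging: keep only clips a human approved."""
    kept = []
    for clip in clips:
        if clip.audio_id in approved_ids:
            clip.validated = True
            kept.append(clip)
    return kept


def run_loop(dataset, batches, approved_ids):
    """Run one train/transcribe/validate pass per batch of new recordings."""
    history = []
    for new_clips in batches:
        model = train(dataset)                     # train on the current data
        candidates = transcribe(model, new_clips)  # use the model on new audio
        dataset = dataset + validate(candidates, approved_ids)  # grow the dataset
        history.append(len(dataset))
    return dataset, history
```

Each pass through run_loop starts from a slightly larger validated dataset than the one before, which is the whole point of the loop.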

Let Your Voice be Heard!

The voices in the dataset are exclusively from those who have chosen to Opt-In.  At Mycroft we will never use your data without your explicit permission. So far over 2,500 of you have chosen to join us in this effort, and we can’t thank you enough for your trust and contributions.

But there is a secret… by contributing to this dataset you are literally training the technology to recognize your voice.  Eventually the tech will evolve to work for virtually everyone, but initially, it will be slightly biased to recognize the voice and pronunciation variants of those in the training data.

If you want to participate, go to Home and check Opt-In under your settings.  Joining is easy, and changing your mind later is just as easy.  Working together — Mycroft and Mozilla on the technology, you and the Mycroft community on the data — we are creating the foundation of an open AI for Everyone!