Artificial Intelligence: What is it?
The Mycroft team has gotten this question a lot lately. How do you define Artificial Intelligence (AI)? How does Mycroft fit into this definition? In today’s computing environment these are pertinent questions, so we thought we’d take a few minutes to explain how we define AI and how our Mycroft platform fits into this definition.
Here at Mycroft we set out to build an AI that is defined first and foremost by its utility. Though Watson won Jeopardy! more than four years ago, most of us have yet to benefit from IBM’s advanced data system. The ability to pull accurate answers out of Watson’s standalone data system hasn’t changed the world. Our goal here at Mycroft is to take some of the AI technologies already available through the Internet and make them accessible to people in their homes.
First, we must define artificial intelligence. In our case we define AI as a device or system that understands natural language and uses that understanding to interact with its environment. Our AI is a weak AI because it is an expert system and not a general intelligence. It will turn on your lights, play media on your TV and might even be able to hold a conversation with you someday, but it doesn’t have initiative, intuition or inventiveness.
Mycroft’s first and most defining AI feature is its ability to understand natural language. For a computer, the first step to understanding natural language is converting a series of sounds into text. Unfortunately, performing this task accurately is very computationally intensive. The larger the computer’s vocabulary, the more computing resources are required. To accurately convert speech to text (STT) on the Raspberry Pi we had to split the task in two. The first half of the task is performed locally (on the device). Mycroft uses a local process to analyze incoming audio, looking for a key phrase. By default this key phrase is “Mycroft”, but users can change it to anything they want, from “Samantha” to “Ada”.
Once Mycroft has identified its key phrase, the device opens a network connection and begins to stream audio to our cloud. It is important to note: no audio is sent to the cloud until Mycroft has identified its key phrase. This helps to ensure user privacy. Mycroft streams audio to the cloud until it detects silence or 15 seconds have elapsed (the 15-second timeout is adjustable). The audio is then processed in the cloud to determine both the text content and the meaning behind the content (the intent of the speaker). It is the interpretation of the meaning of the sentence or command that makes Mycroft intelligent. Simple text matching has been available since the late 1990s, but it is only recently that it has become possible to determine the meaning of the text based on the context, objects, phrasing and sentence structure.
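To make the privacy rule concrete, here is a minimal sketch of the gating logic described above. It is an illustration under stated assumptions, not Mycroft's actual code: the `Frame` structure, the `frames_to_stream` function and the idea that a local detector tags each frame with any phrase it heard are all hypothetical. Only the behavior comes from the description above: nothing leaves the device before the key phrase, and streaming stops on silence or after an adjustable 15-second timeout.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Frame:
    """One slice of microphone input (hypothetical structure)."""
    duration_s: float        # length of this slice in seconds
    is_silence: bool         # verdict of an assumed local silence detector
    heard_phrase: str = ""   # phrase spotted by an assumed local detector, if any


def frames_to_stream(frames: Iterable[Frame],
                     key_phrase: str = "Mycroft",
                     timeout_s: float = 15.0) -> List[Frame]:
    """Return the frames that would be sent to the cloud for one utterance."""
    sent: List[Frame] = []
    listening = False
    elapsed = 0.0
    for frame in frames:
        if not listening:
            # Before the key phrase is recognized locally, audio is dropped.
            if frame.heard_phrase.lower() == key_phrase.lower():
                listening = True
            continue
        # After the key phrase: stop on silence or once the timeout is reached.
        if frame.is_silence or elapsed >= timeout_s:
            break
        sent.append(frame)
        elapsed += frame.duration_s
    return sent
```

In this sketch the frame containing the key phrase itself is not forwarded; whether the real device includes it is not something the description above specifies.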
Once Mycroft has determined the intent of the user, it can then perform actions. These actions can be anything from turning on a light to searching a media repository. Right now these actions, or “skills”, are developed by Mycroft’s software team, but in the future the platform will have dozens, hundreds, perhaps thousands of skills developed by end users of the system.
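One way to picture the step from intent to action is a registry that maps each intent to the skill that handles it. The sketch below is only an illustration: the `skill` decorator, the `perform` function, the intent names and the `slots` dictionary are all hypothetical and do not describe Mycroft's real skill interface.

```python
from typing import Callable, Dict

# Hypothetical registry: intent name -> function that carries out the action.
SKILLS: Dict[str, Callable[[Dict[str, str]], str]] = {}


def skill(intent_name: str):
    """Decorator that registers a handler for one intent (assumed design)."""
    def register(handler: Callable[[Dict[str, str]], str]):
        SKILLS[intent_name] = handler
        return handler
    return register


@skill("lights.on")
def lights_on(slots: Dict[str, str]) -> str:
    # A real skill would talk to a lighting system; this one only reports.
    return f"Turning on the {slots.get('room', 'house')} lights"


@skill("media.search")
def media_search(slots: Dict[str, str]) -> str:
    return f"Searching media for {slots['title']}"


def perform(intent_name: str, slots: Dict[str, str]) -> str:
    """Look up the skill for an intent and run it, or decline politely."""
    handler = SKILLS.get(intent_name)
    if handler is None:
        return "Sorry, I don't know how to do that yet"
    return handler(slots)
```

With a registry like this, a new skill is just another decorated function, which is the kind of extension point that would let people outside the core team add abilities.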
As we move forward with the project we plan to add various features to Mycroft. For example, our team will be developing data structures that help to forecast a user’s intent based on the current scene or query. Roll in late in the evening and mention “I’m so tired, Mycroft. Shut off the lights, I’m going to bed” and the unit might automatically set an alarm for the following day, even though it wasn’t asked to directly. This type of deep learning and intuition is very hard to accomplish, so it will likely take both a large dataset of anonymized user data and a lot of experimentation to get right. The Mycroft platform will allow the software team to partner with the open source community to develop these features in a transparent way that protects the privacy of end users and allows the best developers in the world to collaborate on the source code.
As we continue to develop the platform, we are hoping to expand Mycroft’s skills and abilities to include other features you’d expect of a strong AI. It would be great if Mycroft could recognize emotional context and react accordingly, initiate and hold a conversation, or select media content for you based on your past preferences. All of these are features that we are looking to the open source community to help us develop. It is our goal to provide a robust, useful platform that is both technologically and financially stable. What the open source community does with that platform and the anonymous data it generates is up to the creators, makers and developers who choose to contribute.
If you like what you’ve heard about Mycroft and our take on artificial intelligence, please consider backing our project by heading to our Kickstarter campaign.