Mycroft is a global company. One of the key advantages of having a distributed world-wide workforce is our ability to engage with entrepreneurial and technology ecosystems in cities across the world. As our Australian contingent, I recently attended the Digital AI Summit in Melbourne. In our spirit of openness and sharing, I have provided my notes for our entire community to benefit from.
Tweets from the day are available at the hashtag #digitalaisummit
Session 1 – Blair Bryant, Global Digital Advisor, Microsoft
Blair outlined how important it is to get the CEO on board with digital transformation programs. He underscored that digital transformation needs to solve specific problems for the C-suite – don’t sell digital transformation itself, sell the problems that it solves. He set out a four-point plan:
- Inspire the CEO – the CEO needs to be inspired to adopt digital transformation
- Align the senior leadership team
- Build a digital transformation strategy
- Understand how the execution will proceed
Businesses should immediately start to gather their data and understand what data they collect, as data is the “fuel” for machine learning. They should then work out how hyper-personalization will influence their offerings, services, and products.
Key takeaway: Get your house in order, build a plan and execute it.
Session 2 – Professor Phil Cohen, Monash University Laboratory for Dialogue Research – Conversational Technology, Present, and Future
Prof Cohen has recently joined Monash University to head up the Laboratory for Dialogue Research. He previously headed up a startup called VoiceBox, which was recently acquired by Nuance.
This was the standout presentation of the day.
Prof Cohen provided an overview of the evolution of voice technologies. Although much recent progress has been made, due in large part to advances in machine learning and computational speed, dialogue and interaction are still stunted by several challenges, including:
- Paraphrasing – there are many ways of saying the same thing
- Ambiguity – one phrase can have multiple meanings depending on context
- Meaning and semantics – the intended meaning can differ from the literal words – “Do you have the time?” means “tell me the time”, but that is not how it is worded
- Pragmatics – this is the word Cohen used to describe “context” – the history of a dialogue and the meaning that is imbued in that history.
Cohen outlined the basic premise behind chatbots – they’re essentially “stimulus-response” engines, where an Intent is matched to a stimulus and a response is provided. However, they only provide the illusion of having a dialogue – they currently don’t handle context well, nor do they handle follow-up questions or interactions that fall outside the ‘dialogue tree’.
He provided an excellent walkthrough of chatbot failures – how they don’t handle diplomatic nuances well, e.g. “What’s your grandmother’s name” => “My grandmother is dead” – nor do they handle follow-up questions well, because their ability to determine context is severely limited.
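To make the “stimulus-response” idea concrete, here is a minimal sketch of such an engine. The intent names, patterns and canned responses are invented for illustration and are not taken from Prof Cohen's talk or from any particular chatbot framework.

```python
import re

# Hypothetical intent table: each intent pairs a trigger pattern with one canned response.
INTENTS = {
    "ask_time": (re.compile(r"\b(what time|the time)\b"), "It is 3 pm."),
    "ask_weather": (re.compile(r"\bweather\b"), "It is sunny in Melbourne."),
}
FALLBACK = "Sorry, I didn't understand that."


def respond(utterance: str) -> str:
    """Return the response of the first intent whose pattern matches the utterance."""
    text = utterance.lower()
    for pattern, response in INTENTS.values():
        if pattern.search(text):
            return response
    # No state is kept between calls, so anything off-pattern falls through.
    return FALLBACK


print(respond("Do you have the time?"))
print(respond("What's the weather like?"))
print(respond("And tomorrow?"))
```

The third call falls through to the fallback: because each utterance is matched in isolation, the engine has no way of knowing that “And tomorrow?” is a follow-up to the weather question.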
Spoken natural language retrieval
Natural language (NL) retrieval allows a system to respond to a query using an FAQ type system, or route a query to the right department using keyword matching. Again, the ability to hold a dialogue is very limited.
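A rough sketch of keyword-based routing is below. The department names and keyword sets are made up for the example, and a production system would likely use something more robust than raw word overlap.

```python
import re

# Hypothetical routing table: department -> keywords that suggest it.
ROUTES = {
    "billing": {"invoice", "refund", "payment", "charge"},
    "technical support": {"error", "crash", "password", "login"},
}


def route(query: str, default: str = "general enquiries") -> str:
    """Send the query to the department sharing the most keywords with it."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    best, best_score = default, 0
    for department, keywords in ROUTES.items():
        score = len(words & keywords)
        if score > best_score:
            best, best_score = department, score
    return best


print(route("I need a refund for a double charge"))
print(route("I can't login after the crash"))
print(route("Where is your office?"))
```

A query with no matching keywords simply lands in the default queue, which illustrates why this approach can route a request but cannot hold a dialogue about it.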
Voice assistants are able to handle broad topics, but there are limited ways to phrase a query – it’s not really “natural language”.
Semantic parsing is a technology currently in research labs that gives users better “expressivity” and gives systems a better “understanding” of natural language. It has more capability than the current generation of voice assistants, and can handle things like superlatives, e.g. “find the best pizza in downtown closest to the Space Needle, but not McDonald’s”.
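As an illustration only, the pizza query could be represented as a structured query and executed over data, as sketched below. The parse is written by hand here (a real semantic parser would produce it from the utterance), the restaurant data is invented, and treating “best” as the primary ranking with “closest” as the tie-breaker is an assumption about how the sentence should be read.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Restaurant:
    name: str
    serves: str
    area: str
    rating: float
    km_from_landmark: float  # distance to the landmark named in the query


@dataclass(frozen=True)
class Query:
    serves: str
    area: str
    exclude_names: Tuple[str, ...]


# Invented sample data; McDonald's is listed under pizza only so the exclusion has an effect.
RESTAURANTS = [
    Restaurant("Slice House", "pizza", "downtown", 4.7, 0.9),
    Restaurant("Needle Pizza", "pizza", "downtown", 4.7, 0.4),
    Restaurant("McDonald's", "pizza", "downtown", 4.9, 0.2),
    Restaurant("Uptown Pies", "pizza", "uptown", 4.9, 5.0),
]

# Hand-written stand-in for the parser output of the example utterance.
QUERY = Query(serves="pizza", area="downtown", exclude_names=("McDonald's",))


def execute(query: Query, restaurants: List[Restaurant]) -> Optional[Restaurant]:
    """Apply the filters and negation, then resolve the superlatives by ranking."""
    candidates = [
        r for r in restaurants
        if r.serves == query.serves
        and r.area == query.area
        and r.name not in query.exclude_names
    ]
    if not candidates:
        return None
    # Highest rating first; shortest distance breaks ties.
    return min(candidates, key=lambda r: (-r.rating, r.km_from_landmark))


print(execute(QUERY, RESTAURANTS).name)
```

The point of the sketch is that filters, a negation and two superlatives all need to survive into the structured form, which is more than a single matched Intent can carry.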
Side note: I had a quick look at open source semantic parsing libraries, and one called SLING from Google looked like the most commonly used – if anyone has thoughts, feel free to post in the Forum!
Prof Cohen gave a live demo of VoiceBox using web-based speech to text, which was then converted in real time to Intents, and it came back with a pretty slick answer.
https://www.voicebox.com/ – the live demo website isn’t available.
Comparison voice assistants
He then presented a chart (not available online) of what VoiceBox can do compared to Siri, Google Home, and Alexa, positioning VoiceBox in a positive light.
The next big frontier for voice assistants, as we’re starting to see with projects like Duplex from Google, is transactional dialogue – being able to have a natural-sounding conversation with a voice assistant that allows the user to complete a task, such as booking tickets for a movie or ordering a pizza. Most transactional dialogues are ‘slot-filling’ systems where the aim of the dialogue is to ensure the slots are filled so that an API can be used to complete the transaction. That is, the voice assistant will prompt for the missing ‘slot values’. There are not many systems like this on the market today.
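The slot-filling loop can be sketched roughly as follows. The slot names, prompts and the `book_tickets` function are hypothetical stand-ins, and the user's replies are passed in as a list so the example runs without live input.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# Hypothetical slots for a movie booking, each with the prompt used when it is missing.
REQUIRED_SLOTS = {
    "movie": "Which movie would you like to see?",
    "time": "What session time?",
    "tickets": "How many tickets?",
}


def next_missing_slot(slots: Dict[str, str]) -> Optional[str]:
    """Return the first required slot that has no value yet, or None if all are filled."""
    for name in REQUIRED_SLOTS:
        if not slots.get(name):
            return name
    return None


def book_tickets(slots: Dict[str, str]) -> str:
    """Stand-in for the real booking API call made once every slot is filled."""
    return f"Booked {slots['tickets']} ticket(s) for {slots['movie']} at {slots['time']}."


def run_dialogue(known_slots: Dict[str, str], user_answers: Iterable[str]) -> Tuple[List[str], str]:
    """Prompt for each missing slot in turn, then complete the transaction."""
    slots = dict(known_slots)
    answers = iter(user_answers)
    prompts = []
    while (name := next_missing_slot(slots)) is not None:
        prompts.append(REQUIRED_SLOTS[name])
        slots[name] = next(answers)  # assumes one usable answer per prompt
    return prompts, book_tickets(slots)


prompts, confirmation = run_dialogue({"movie": "Blade Runner"}, ["7 pm", "2"])
print(prompts)
print(confirmation)
```

Here the movie was already supplied in the opening utterance, so the assistant only prompts for the session time and the number of tickets before calling the booking function.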
Semantic parsing and collaboration
The next horizon in speech recognition research is semantic parsing coupled with collaboration. Under this model, the voice assistant will be able to:
- Analyse meaning
- Infer the intent of the user from the utterance – that is, what is the user really trying to do?
- Debug the plan – be able to explain how it arrived at a conclusion about meaning
- Offer the user solutions to the Intent that has been inferred
The Laboratory for Dialogue Research intends to collaborate with industry via a membership model.
Key takeaways: Investigate semantic parsing libraries as part of an emerging tech roadmap.
Session 3 – Whole of government and whole of community approach to AI – Cheryl George, Kathy Coultas, and Martine Letts
- Kathy Coultas – Director, Technology Innovation and Investment, Department of Economic Development, Victoria
- Cheryl George – Government and Stakeholder Relations, Data61 CSIRO
- Clive Dwyer and Martine Letts – Committee for Melbourne
The AI space is evolving rapidly, and there are opportunities that Australia is well positioned to take advantage of – especially using our domain knowledge. To really take advantage of the opportunities requires a broad cross-sector collaborative approach. Coordination will be critical. The government needs an “enabling environment” and “enabling infrastructure” to harness the opportunities, and this needs to be done quickly.
Kathy Coultas recognized the need for ALL citizens to engage with digital, and explained that this was part of the reason behind the Digital Innovation Festival.
Martine Letts noted that the cross-partisan Parliamentary AI group had been established at a state level to help politicians really grapple with the opportunities and challenges of AI. She would like to see this initiative expanded into a national working party. The only other country in the world that has a similar cross-partisan working party is the United Kingdom.
Kathy Coultas noted the issues surrounding data privacy, security, and ethics and how we need to tackle these as part of coming to terms with artificial intelligence. Part of this will be the necessity for legislative amendments to harness emerging technology effectively while protecting citizens from foreseeable harm. She was firm that political point-scoring won’t be effective; regulatory reform for emerging technologies requires a multi-partisan approach.
Martine Letts noted that Australia is well behind on investment in the AI space; China and the USA are leading both investment and technology development. AI is seen not as an integral part, and a pillar, of emerging technology strategy, but as a “bolt on” to existing measures. This view is anachronistic and will not serve organizations well. Kathy Coultas followed up by outlining that most technology investment is now poured into ABC – AI, blockchain and crypto. Australia is significantly behind in terms of technology investment, and that needs to seriously change in order to realize the vision of Melbourne becoming the technology capital of Australia. We have the capability, but it’s fragmented across multiple sectors.
Kathy Coultas also noted the investment needed in workforce capability and training to build the skills required to harness AI – right now there is a significant skills shortage in this space, and universities are only now moving to catch up and address this shortfall.
Cheryl George highlighted that CSIRO’s Data61 had been tasked with developing a national AI technologies roadmap to lay out the opportunities, threats and possible approaches. A draft of this is due by the end of 2018.
Key takeaways: Whole of government approaches to AI require significant coordination, whole of sector and whole of pipeline approaches.
Session 4 – Dr Nathan Faggian, Google – Machine Learning Infrastructure
Nathan attempted to provide an overview of machine learning in 30 minutes – much respect to him.
This was the second-best presentation of the day.
He opened by explaining that what used to be fantasy is now reality. There have been massive improvements in artificial intelligence and machine learning. Google uses a huge amount of machine learning and AI – and he quoted the statistic that over 20% of Gmail responses are now the “automated” pre-canned responses that are available.
He provided a demonstration of Google Duplex, and explained how it was using semantic parsing based on massive machine learning efforts to achieve a natural interaction style. In terms of Professor Cohen’s earlier presentation, Duplex is certainly at the “cutting edge” of where speech recognition technology is at the moment.
He went on to show how machine learning can be valuable in industrial contexts, citing both a case where illegal fishing was identified in Vanuatu from fishing boat movements, and the case of soy crop forecasting, which was accurate to within 2% with a 5 month lead time – an incredible level of accuracy.
He underscored the importance of data and the huge volumes of data that are required for effective machine learning.
He also outlined that machine learning does not exist in isolation in an organization, and that there is a large supporting infrastructure that sits around it. This was essentially a shadow-pitch for the Google Cloud Platform, but did make the important point that there are many parts to an effective machine learning capability within an organization. They all have to be considered as a whole in order to be able to effectively scale an organization’s machine learning efforts.
Side note: None of the existing governance frameworks like COBIT, ITIL or SFIA have recognized machine learning as a key organizational capability yet.
Key takeaways: Google uses a lot of AI; they’re the leaders in the field. Make sure you have the infrastructure to scale your ML efforts. It’s not just about ML and algorithms; the infrastructure is a key component of your machine learning capability. ML can solve some pretty hoary problems for business and industry. If you’re not considering it yet, you should be.
Session 5 – Health Technology Panel
This was a really interesting panel that looked at the application of machine learning to the health sector. The key points were:
- Getting access to patient records in a way that is consistent is very difficult because data formats differ between providers; data standards matter.
- Machine learning has a large role to play in effective pharmacology dosing. For instance, the industrial strength antibiotic vancomycin is nephrotoxic, so being able to get the right dose by personalizing it for the patient based on their unique characteristics means less kidney damage.
- IBM have faced the challenge of trying to convince clinicians that tools like Watson are there as complementary tools. They’re not trying to replace the clinician – the goal is to work in a complementary way.
Key takeaways: Data, data, data. Getting the right data in a machine-readable format is absolutely essential for machine learning.
Hailing from Geelong, Australia, Kathy is a techie from way back, with a background in web development, Linux, videoconferencing, digital signage and data visualization. She works in Developer Relations with Mycroft.AI and loves documentation. Yes, really 🙂