Author

About the Author
The ultimate resource in the voice space. Conversational interfaces, voice interfaces, smart speakers and smart assistants, voice strategy, audio branding.

Flash briefing 32 – Throwback 10 years of the App Store and voice

Last week the App Store turned 10 years old. It has generated 40 billion dollars for Apple. The day it opened it had only 500 apps. “Let me just say it: We want native third-party applications on the iPhone, and we plan to have an SDK in developers’ hands in February,” Steve Jobs wrote. Those first 500 creators had the unique opportunity to shape the design direction and interaction methods of the millions of apps created since. And there’s no question that the App Store, and mobile apps in general, have been a major influence on the world over the past decade.

Recently someone posted on Twitter that voice is not what the App Store was at the moment when it made the first developers of mobile applications millions ‘overnight’. In 2008, the 2 most famous apps were Facebook and Poker. A social network and a game. Let’s not forget that Facebook had been around since 2004 as a website, did a semi-mobile transition, and a year before was available as one of the iPhone’s first web apps. They made an already defined and successful product available on mobile. eBay, Yelp, the list continues. True mobile-first experiences, apps like Instagram and Snapchat, didn’t exist for the first couple of years of the App Store. And most of the games that were on top at the beginning might have made the monies, but they are not around anymore. Now you might tell me that no developer has become a millionaire with skills. And your argument is right; the historical timeline, however, is wrong. Voice is a completely new platform: you cannot port Facebook to voice. You have to create a completely new experience in voice to have a social network and to have games.

Design in mobile applications is organized, predictable and constrained by the space on the phone. We don’t have those constraints for voice applications. Conversations can be as open as the users want, they can use as many similar words as they want, and there’s no list to select a value from. It’s all in the user’s mind. We are effectively hacking how we communicate at a deeper level. Conversation has other expectations as well, where context is very important. If we are talking about my mother’s car, and we ask about the doors and then we ask about the paint, we both implicitly know we are still talking about the car.
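To make the car example concrete, here is a minimal sketch of how a voice app might carry context from one turn to the next. The `Conversation` class and its `ask` method are hypothetical names made up for illustration, not part of any smart assistant SDK, and a real system would track far more than a single topic.

```python
from typing import Optional


class Conversation:
    """Toy dialogue state that remembers the last topic mentioned (hypothetical)."""

    def __init__(self) -> None:
        self.topic: Optional[str] = None

    def ask(self, attribute: str, topic: Optional[str] = None) -> str:
        # An explicit topic replaces whatever we were talking about before.
        if topic is not None:
            self.topic = topic
        # With no topic yet, the app has to ask instead of guessing.
        if self.topic is None:
            return "What are we talking about?"
        # Otherwise the question is resolved against the remembered topic.
        return f"{attribute} of {self.topic}"


conversation = Conversation()
print(conversation.ask("doors", topic="my mother's car"))
print(conversation.ask("paint"))  # no topic given: still about the car
```

The second question never mentions the car, yet it is resolved against it, which is the implicit expectation users bring to a conversation.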

All these challenges will be overcome. I truly believe that once people get used to doing certain tasks by voice, it will be frustrating for them to do them any other way. We mention this frequently: voice-first is not only about voice, but that’s the subject for tomorrow’s episode! If you want to define how the next million skills are going to be designed, if you are a developer, a designer, a scriptwriter, an entrepreneur, bet on voice.

Thank you for listening. The transcript for this episode is available at our website voicefirstweekly.com/flashbriefing, and search for episode number 28. Have a productive, joyful day.

Resources:

https://web.archive.org/web/20071024180600/http://www.apple.com:80/webapps/whatarewebapps.html

Flash briefing 31 – Voice terminology

Hello everyone, happy Sunday, it’s really sunny and beautiful today, I hope you are having a lovely one wherever you are listening.

In this episode we give a (short) list of terms you need to understand in voice. Let’s begin with the most basic ones:

Smart speakers are devices that can be commanded by voice, with an integrated virtual assistant that responds to requests after a wake word. Some can also control home automation devices, like light switches, or control your TV. We have talked about Alexa, Cortana and Google Assistant here, but there are several other smart assistants: Alice, by the Russian search engine company Yandex; AliGenie, by Alibaba; Xiaowei, by Tencent. A skill is a capability of Alexa. Alexa provides a set of built-in skills, and developers can use the Alexa Skills Kit to give Alexa new skills. An action is the equivalent for Google devices.

Wake word: the word that makes an always-listening speaker wake up and start listening for user prompts.

There are a lot of other terms, like:

Natural language processing

Speech recognition or Automatic speech recognition

Speaker Recognition

Artificial Neural Networks

Natural Language Understanding, and the list goes on. It’s important to at least have heard of these, whether you are a voice interface designer, entrepreneur or developer. Most people building skills don’t need to know these by heart, but I certainly suggest getting a basic understanding of the terminology even if you are not building on the underlying technology. Can’t hurt!!

Natural language processing: technology that extracts the “meaning” of a user’s utterance or typed text. A meaning usually consists of an “Intent” and “Name-Value” pairs. The utterance, “I want to book a flight from Washington, DC to Boston,” has the Intent “Book-a-Flight” with the Name-Value pairs being “Departure City”=“Washington, DC” and “Arrival City”=“Boston, MA”. An NLP system takes the flat sequence of words, “I want to book a flight from Washington, DC to Boston,” and produces a “meaning structure” (usually a JSON object) that boils down the sequence of words to an Intent and Name-Value pairs. The JSON object delivered can then be inspected by what is often called “middleware software” that can now easily extract the information in the object and execute additional business logic (e.g., retrieve available flight information, or ask for additional missing information, e.g., “What date would you be flying out of Washington, DC?”).
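As a rough illustration of that flow, here is a minimal sketch in Python. It is not how a real NLP engine works: a single hand-written regular expression stands in for the language model, and the function names (`parse_utterance`, `next_prompt`), the `slots` key and the “Departure Date” slot are assumptions made up for this example. It only shows the shape of the “meaning structure” and how middleware might inspect it.

```python
import json
import re
from typing import Optional

# One hand-written pattern stands in for a trained NLP model (assumption).
FLIGHT_PATTERN = re.compile(
    r"book a flight from (?P<departure>.+?) to (?P<arrival>.+?)[.?!]?$",
    re.IGNORECASE,
)

# Name-Value pairs the hypothetical middleware needs before searching flights.
REQUIRED_SLOTS = ("Departure City", "Arrival City", "Departure Date")


def parse_utterance(utterance: str) -> dict:
    """Turn a flat sequence of words into an Intent plus Name-Value pairs."""
    match = FLIGHT_PATTERN.search(utterance.strip())
    if match is None:
        return {"intent": None, "slots": {}}
    return {
        "intent": "Book-a-Flight",
        "slots": {
            "Departure City": match.group("departure"),
            "Arrival City": match.group("arrival"),
        },
    }


def next_prompt(meaning: dict) -> Optional[str]:
    """Middleware step: ask for the first missing piece of information."""
    slots = meaning["slots"]
    missing = [name for name in REQUIRED_SLOTS if name not in slots]
    if meaning["intent"] != "Book-a-Flight" or not missing:
        return None
    if missing[0] == "Departure Date" and "Departure City" in slots:
        return f"What date would you be flying out of {slots['Departure City']}?"
    return f"What is your {missing[0].lower()}?"


meaning = parse_utterance("I want to book a flight from Washington, DC to Boston")
print(json.dumps(meaning, indent=2))  # the "meaning structure" as JSON
print(next_prompt(meaning))
```

Note that this toy parser copies the city exactly as spoken (“Boston”); normalizing it to “Boston, MA” would be a further step in a real system.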

Speech recognition: a machine’s ability to identify spoken words and translate them into a machine-readable format. It’s the base technology in smart assistants. It all starts with recognizing the user’s speech, and that’s why rolling out updates in different languages is not an easy task for the companies involved.

 

https://en.wikipedia.org/wiki/Smart_speaker

https://en.wikipedia.org/wiki/Speech_recognition

https://www.onevoicedata.com/speech-recognition-technology-2017/

https://www.witlingo.com/voice-first-glossary-of-terms/

https://medium.com/@joshdotai/16-voice-control-terms-you-need-to-know-4a79303db08a

Flash briefing 30 – Voice investments and your trash can voice commands

The voice world keeps getting the monies. Bespoken announced yesterday that it raised 2.4 million dollars in a seed funding round. Bespoken, a developer-focused tool we featured as a voice resource a couple of weeks ago in our newsletter, allows you to test and monitor your Alexa skills. They have extensive documentation, and if you are a developer you know what I’m talking about: that’s a rare thing to find in services. Overall we think it’s a great tool; check our newsletter for our review. Sensible Object, the company that develops Alexa skills for board games, starting with their first game When In Rome, recently announced a similar raise, and Storyline raised 770k. All of this happened in a span of a few weeks, proving that investors are betting on voice. It comes as no surprise to us; we have been saying it: voice is the future first interface for computers.

The Wall Street Journal published an article yesterday, All Ears: Always-On Listening Devices Could Soon Be Everywhere, describing how tiny microphones are moving us to a world where your trash can can respond to voice commands. The good news for the privacy-concerned is that these microphones are not connected to the cloud, and thus they are not sending information back. One such company, Vesper Technologies, Inc., which has received money from Baidu, Bose and Amazon’s Alexa Fund, claims unique capabilities in their microphone, like understanding your voice even in windy conditions, and drawing zero power when awaiting a “wake word,” since sound itself generates the power the microphone needs. Bottom line, voice is moving fast in investment, and smart assistants are not the only market for voice applications. This opens the doors to a world where everything with power or a battery can respond to voice commands. Let that sink in and we’ll talk tomorrow.

This is the link to the Wall Street Journal article we referenced:

https://www.wsj.com/articles/all-ears-always-on-listening-devices-could-soon-be-everywhere-1531411250

Flash briefing 29 – Voice in healthcare

In a recent episode we talked about upcoming events in the voice ecosystem. One of these conferences is the Voice of Healthcare Summit, taking place in August. Then someone asked earlier this week for real-life examples of voice applications in hospitals or medical institutions. As he didn’t specify a location, the first part of my reply was asking for a specific country. I then proceeded to explain all the legal challenges that Amazon Alexa and Google Assistant face, at least in the US. Both companies have teams dedicated to healthcare in their respective smart assistants and are fighting to become HIPAA compliant. I personally had seen or heard of more applications dedicated to wellbeing than to healthcare per se, for example Ask Marvee, dedicated to elder care. But then other users in the group replied with a list of healthcare voice services or apps, and I thought it would be a good idea to have an episode about it and give this list to you, my dearest listener. So here is a list of services, podcasts and voice applications for healthcare/wellbeing:

  1. Podcast: The Voice of Healthcare by voicefirst.fm https://www.voicefirst.fm/voiceofhealthcare
  2. Service: Neurolex Labs and Sonde Health: https://www.sondehealth.com
  3. The conversational AI platform Nuance has built an AI powered solution for healthcare: https://www.nuance.com/en-gb/about-us/newsroom/press-releases/nuance-unveils-AI-Powered-solution-for-healthcare.html
  4. Sensely has a virtual nurse avatar that patients speak (or type) to. Patients have conversations with the avatar, do daily health check-ins, take their blood pressure, or use symptom checkers. http://www.sensely.com/
  5. Suki: formerly called “Robin.” They are a digital assistant for doctor’s offices, targeting the physicians as their users (not the patients): https://www.suki.ai

Thank you for listening. Another weekly newsletter was sent yesterday to all of our subscribers, so if you are not on the list, subscribe today at voicefirstweekly.com. Have a nice day and we’ll talk tomorrow.

Flash briefing 28 – Voice is not about voice (only)

We have been waiting on the rise of bots for years now. I remember how in 2015 people were going crazy about chatbots and how they would be the “Next big thing”. Revolutions in technology are multi-stepped. This one has been quiet, but it’s growing. Now, what happens with voice and chatbots? Aggregated together, they let you provide a seamless experience in which a voice app and your chatbot communicate transparently to the end user. Following voice command prompts, inter-related services will communicate to provision requests, such as booking a preferred airline, hotel and transport services for a meeting taking place in two weeks’ time. A multimodal, context-based interface powered and started by voice is what we see when we think of a voice-first world. All of those mediums need to be aware of each other: the chatbot, the smart assistant, the skills, plus all the services we have available today to know when a plane is landing in Chicago or to make restaurant reservations, everything that is powering the web today: the API. The interconnectivity necessary between services to allow your smart speaker to connect with your phone, your bank, and the airline’s website is also central to the technology. You can start a conversation with a smart speaker that might continue on your phone if you need to input a password, for example. All these connections might also change without warning. The moral: in conversational interfaces, build for change. Speed matters, probably more than ever before in the history of business. Being able to react quickly to changes in service interfaces and to build APIs to connect to these services is a key advantage for any company starting in voice today. We used to call this the viking team. It’s the infantry: it moves fast, it tests the terrain, it provides the team with visibility. Find your viking team for voice, whether in house or by hiring an agency.
Because the truth is voice is not only about voice, but about hyperconnectivity and availability.

https://www.technative.io/how-voice-technology-is-transforming-the-way-we-transact-and-socialise/