Google announced yesterday on its Canadian blog the introduction of new Google Assistant experiences for families. The announcement clearly positions Google Assistant around family-related activities, especially with kids.
Today, we’re introducing new experiences, designed specifically for families with kids so they can learn, play and imagine together.
Among the new experiences, Google says families “will be able to join The Wiggles,” from the Treehouse TV show and go on a “custom Wiggles experience” commissioned by Google exclusively for Assistant.
Google is also introducing news stories appealing to kids through CBC Kids News. It can be invoked with “Hey Google, play CBC Kids News”. The feature is focused on daily local, national and international stories that are relevant to Canadian kids, with a focus on media literacy. Google Assistant recently launched storytime experiences in partnership with Disney. Along with these new services for Canadian kids, it shows the search giant shifting a lot of its focus to kids and family environments. Moreover, Reuters recently released a report in which one of the main conclusions is that users don’t love news on smart speakers. By getting kids to consume news on Google Assistant, the company is setting up news consumption on its smart speakers for the future. Smart.
Another feature the blog post announced is Boukili Audio, an interactive activity that tests listening and comprehension of stories on animals, nutrition, music, travel and a ton of other captivating subjects, all in French. After listening to stories, Boukili Audio puts your child’s skills to the test with a series of multiple choice questions to evaluate their French language comprehension and their memory, all while having fun. Boukili Audio has over 120 interactive books, over 70 of which are exclusive to the Assistant. Available in French only. To try it out, Canadian users can say: “Ok Google, Parler avec Boukili Audio”.
Just in time for Christmas, Google is announcing calls to Santa: “Hey Google, call Santa”. It’s only available in English. Can’t wait for this to be available in the US to see what Santa has to say.
With their parent’s permission, children under 13 can also have their own personalized Google Assistant experience when they log in with their own account, powered by Family Link (Not available in Quebec). Family Link helps parents manage their child’s Google Account while they explore. And with Voice Match, your family can train the Assistant to recognize up to six voices.
The announcement appeared on Google’s Canadian blog yesterday, Wednesday, November 28. It doesn’t specify when the family experiences are going to be available in other territories. Google’s focus on families and kids for the holidays is very smart. It shows the company is looking ahead and is not as concerned with “fixing” current users’ views. After all, Generation V are the real protagonists in voice technology.
Perfect for the Holidays 🎄
Yesterday, via the Amazon blog, we learned that the company has been working on a neural TTS that can model speaking styles with only a few hours of recordings. Among the problems in speech synthesis is the lack of tone and emotion. Finding correct intonation, stress, and duration from written text is probably the most challenging problem for years to come (per research at the Helsinki University of Technology).
The way you customize synthesized speech today is through SSML, a speech markup language. SSML allows configuring prosody, emphasis and the actual voice used. The problem is that people change their speaking style depending on the context and their emotional state. What Amazon is saying in this announcement is that their latest TTS can learn a newscaster style from just a few hours of training, which is significant because their previous model required tens of hours.
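To make the SSML part concrete, here is a minimal sketch in Python that builds a small SSML snippet using the three knobs mentioned above: voice, prosody and emphasis. The function name, the voice name "example-voice" and the rate and pitch values are my own placeholders, not anything from Amazon's announcement, and which tags and values a given TTS engine actually accepts varies by platform.

```python
import xml.etree.ElementTree as ET


def build_ssml(sentence, emphasized_word, voice_name="example-voice",
               rate="medium", pitch="+0%"):
    """Wrap a sentence in SSML voice/prosody tags and stress one word.

    voice_name, rate and pitch are illustrative placeholders; check your
    TTS engine's documentation for the values it supports.
    """
    speak = ET.Element("speak")
    voice = ET.SubElement(speak, "voice", {"name": voice_name})
    prosody = ET.SubElement(voice, "prosody", {"rate": rate, "pitch": pitch})

    # Split the sentence around the first occurrence of the word to stress.
    before, found, after = sentence.partition(emphasized_word)
    prosody.text = before
    if found:
        emphasis = ET.SubElement(prosody, "emphasis", {"level": "strong"})
        emphasis.text = emphasized_word
        emphasis.tail = after  # text that follows the emphasized word

    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml("Markets closed higher today.", "higher",
                  rate="fast", pitch="+5%")
print(ssml)
```

Notice that everything here is set by hand, sentence by sentence. That is exactly the limitation a learned speaking style is meant to remove.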
This advance paves the way for Alexa and other services to adopt different speaking styles in different contexts, improving customer experiences.
The same approach that works for the newscaster style might work for other styles. Amazon also said they created a news-domain voice.
Listeners rated neutral NTTS more highly than concatenative synthesis, reducing the score discrepancy between human and synthetic speech by 46%.
Let’s listen to the female voice with the newscaster style and judge for yourselves.
Very realistic news style, isn’t it?
It is very timely to bring up the results of the Reuters and Oxford report, The Future of Voice and the Implications for News (I expanded on this in our last newsletter, subscribe). According to the report, consumers love smart speakers, but they don’t love news from smart speakers. One of the main reasons the report gives is that synthesized voices are hard for many users to listen to.
The report also concluded that news updates are among the least valued features in smart speakers.
This new development in neural TTS by Amazon could mean more customization options for brands looking to get a unique persona and voice on smart devices. This is definitely a very welcome improvement in TTS.
I get more and more interested in synthesized speech every day as I realize it is going to be a fundamental part of the future. That future might not be that far off: last week Chinese news agency Xinhua announced the “world first” AI TV anchor at the World Internet Conference. The anchor is a virtual, AI-controlled avatar powered by synthetic speech.
The Revolution will be synthesized, my friends.
Thank you for listening, you have a great day. As always you can find me on Twitter as @voicefirstlabs or Instagram as @voicefirstweekly.
The Montgomery County courts in Dayton, Ohio, have added a new technology to boost their services.
The innovation, an automated virtual assistant and chatbot, will answer many of the most frequently asked questions received by the Montgomery County Clerk of Courts Office by email and phone.
The bot, named Athena after the Greek goddess of wisdom, law and justice, is designed to answer questions directed to the Montgomery County Municipal Courts and Montgomery County Common Pleas Court, as well as the county’s auto title branches. Athena, accessed at www.mcclerkofcourts.org, can also look up basic ticket and case information.
The bot was designed in-house using Microsoft Cognitive Services. Part of the significance of the innovation is that it can answer questions from their five divisions, so you can ask about your car, or passport or how to pay your traffic ticket.
The virtual assistant is also connected to the county clerk’s online public records information system. Athena can use that connection to tell users the status of a case and link them to related documents.
We have talked before about how voice and conversational interfaces can be life-changing for government and city services. The work VoiceXP did for the city of Ozark, Missouri comes to mind. Instead of browsing all the services’ websites to find the information you need, or waiting on a call for the status of your application, you can just chat with a bot to ask a question: how is my case going? Or what are the requirements for the DMV in California? It’s pretty life-changing. We humans will always appreciate technology that saves time or provides convenience.
Microsoft keeps pushing its cognitive services as a strong developer option for use cases like this, while at the same time pushing its partnership with Amazon and Alexa. It seems to me that they are trying to get enterprises to use their cognitive services far more than they are pushing Cortana. That’s definitely part of Microsoft’s new direction as an enterprise services company and away from being a consumer company.
Thank you for listening. My name is Mari, and this is the VoiceFirst Weekly flash briefing show. Before I wrap up this episode: a reminder of the coming Alexa Conference in Chattanooga, Tennessee, January 15-17. You can sign up at the voicefirst dot fm website. I’ll be there, as well as my cofounder Nersa. I’ll be speaking in the track on podcasting in the age of Alexa, and we’ll also have a booth where you can check us out. Don’t miss it!
The voice community is amazing to watch as it evolves. I have been very fortunate to meet lots of people who are making an impact on voice technology and conversational interfaces. This episode is special because it features one of those startups impacting voice tech every day, and it is the first time VoiceFirst Weekly show welcomes a guest. We are very thrilled about how it turned out.
In this episode, I talk to Brendan Roberts, CEO of Aider, The AI assistant for small businesses. Aider is launching now in Australia and New Zealand, with plans for the US in 2019. Aider will help you answer questions like: What’s my top selling product? What’s my revenue today? Who is meant to be working tomorrow and what’s the weather going to be like? All this from your phone bringing your business context into account.
I met Brendan at the Voice Summit back in July after arranging a meetup of folks at the conference from the Voice Entrepreneur Community on Facebook. I got to see Aider first hand and was really impressed by its capabilities. Aider integrates with several SaaS apps that small business users might be familiar with for sales, accounting and client management, providing insights and learning from the user’s actions. What I thought was really impressive for an app of this type was the ability to keep the conversation going across conversational channels, whether voice or messaging.
Without further ado, please enjoy my talk with Brendan.
A showcase for smart glasses opened yesterday, marking November 12 as the day (per the company) that the world’s first smart glasses store opened. North is the Canadian company that develops futuristic HCI products. The company has raised over 135 million from investors, including the Amazon Alexa Fund.
Focals are the smart glasses the company is presenting, exclusively available in their stores in Brooklyn, New York and Toronto since yesterday. The custom-made glasses feature a display that only you can see. I’m not exactly clear about the technology behind this, as it’s not explained on their webpage, but I’m very curious and excited to know more about it.
Focals includes visual summaries, smart text and emojis and voice to text. It also comes with a navigation feature with search, turn by turn and the ability to hail an Uber.
The display is controlled with a Loop, a small finger ring with a tiny, four-way joystick that’s included in the purchase, along with a case that doubles as a battery charger. The glasses sync with the user’s Android or Apple iOS device via Bluetooth.
Focals comes with Alexa built in. According to the showcase page, you can ask Alexa to play music, hear the news, see the weather, control your smart home and more. I’m guessing you can do anything that Alexa allows you to do.
The glasses also come with a function to pause it all when you don’t need them:
Technology that’s there when you need it, gone when you don’t – hidden by design.
Form plus function
The glasses come in stylish designs, à la Warby Parker, maintaining the idea of keeping the technology invisible until you need it. The store is also selling the experience in the shopping process: you have to be custom fitted for Focals. “It’s crucial to understand how the technology looks and feels,” Adam Ketcheson, Chief Marketing Officer of North, said to The Bridge. “It’s incredibly important for people to get a hands-on experience, especially at our price point. The entire retail model is so people can immersively understand what it is and get the right fit.”
Focals will be offered in a variety of styles at $999.
Smart glasses have been emerging and dying for a while now. Google Glass and Intel’s Vaunt shut down in 2015 and 2018, respectively.
What makes Focals different? The focus on design and style, rather than the geeky look of Google Glass, might be a compelling point. Focals are voice activated, but their first selling point is that the technology is there only when needed, and otherwise they look like regular glasses. They are not advertised as a geeky technology gadget, but more as a helpful companion.
As is often the case with technology advancements, the timing might turn out differently for North’s glasses.
I’m waiting for the next time I go to NYC to visit the store and try the Focals. Let me know what you think on Twitter @voicefirstlabs or Instagram at voicefirstweekly. I’m Mari, this is the VoiceFirst Weekly flash briefing, have a great day and I’ll talk to you all tomorrow. We have a special episode tomorrow with the first human guest on the show. Don’t miss it. See ya.
Survata’s September survey of 2,000 smart speaker owners in the US came with one surprising finding: Apple HomePod owners are more likely to be receptive to audio ads than anyone else.
According to the Survata data, as reported by Business Insider:
This shows there is still a large chunk of people who don’t want to hear ads on their smart speakers, suggesting it’ll be an unpopular move if anyone introduces sponsored content any time soon. It’s also unlikely Apple would venture into the sponsored content territory, given it has shied away from targeting ads at users.
Survata market research president Dyna Boen explained that anomaly:
While adoption of Apple HomePod has thus far lagged behind Amazon Echo and Google Home, and thus makes up a smaller percentage of the sample, we still are seeing that these users are saying sponsored content ‘very positively impacts their smart speaker experience’ at a statistically significant level.
As they say, sometimes it is better to wait to report on some news. I feel the Bixby episode on Wednesday could’ve waited until yesterday, when I was going to the Samsung Developer Conference and would get more context and details. If you didn’t listen to yesterday’s episode for some reason, here’s the summary: Samsung opened Bixby to developers, we were part of the developer beta program and VoiceFirst Weekly now has a capsule. The words “game changing” were said. Perhaps I didn’t completely understand my own words on Wednesday. When I was at SDC yesterday, I realized Bixby is definitely and completely going to change the voice game. You might ask, didn’t you say you developed capsules for Bixby already? Yes, we absolutely did. They are on the Bixby showcase page. The thing is, as I said, Samsung might be rushing it a little to enter the race. As such, some things I saw first hand at SDC were not covered in the documentation. Maybe they wanted to unveil them during SDC. The truth is I saw the camera putting some makeup on my face, then showing me a list of matching lipsticks or mascaras, and then showing a list of places where I could buy them, right there. I saw Bixby recognizing a bottle of wine and showing reviews and prices. I saw it read signs and translate them right there from the camera.
This was all part of Bixby Vision, a Samsung S8+ app (some features are S9+ only) powered by image recognition. I have read reviews saying that Bixby is sometimes not as accurate as other smart assistants in speech recognition, but all these features, combined with the ability to learn, are a powerful point in favor of Bixby. All of that is now open for developers to interact with.
Among the other announcements at SDC were the coming Marketplace for Bixby capsules in 2019 and the expansion to five new languages in the coming months. I think what I’m saying is that Samsung might have figured out multimodal commerce right there in front of everyone. Without AR or VR, just the camera. Certainly Bixby is here to change the game, plus we cannot ignore all the phones and appliances Samsung makes. They even have HARMAN, the market leader in connected car solutions.
Bixby is coming to everything.
Here is a video of my interaction with Bixby vision:
Samsung announced yesterday at the Developer Conference in San Francisco that the Bixby platform is now open to developers. The Bixby Developer Studio had been in private beta until now. Nersa, my cofounder at VoiceFirst Labs, and I were lucky enough to be in the beta program and in the contest for the creation of the first capsules (Bixby voice apps). I heard that name might change and I’m happy about it; capsule is definitely not a good name for a voice app.
We developed two capsules, with the intention to understand the platform, one for number facts and the other for getting episodes of this show. You heard correctly, VoiceFirst Weekly flash briefing is already available in Bixby, yay!
Experience creating for Bixby
The developer experience still feels a little raw. The platform, the documentation and such clearly need polishing, and I feel they are in a rush to open up the platform, with the clock ticking.
The capsules were created in a weekend or less, after watching some of the videos provided and then following the documentation. That means it’s relatively straightforward to start creating something for the platform. The documentation is geared towards developers, but we found it pretty useful.
The good, the bad and the ugly
The good part about the platform is the ability to remember an answer or similar answers by instruction. That’s a pretty sweet deal that, in short, means you don’t need to list all the utterances for an intent. It learns. I really liked that. The way you build the capsule itself is also different from how you develop voice apps for Alexa or Google Assistant. The IDE was decent enough; it felt smooth.
The bad is the maturity of the platform. It is definitely not at the level of the platforms from Amazon or Google.
The ugly: as far as we could see, and we tried, the platform is more focused on visual interfaces, and it does not play audio. As we were trying to get the audio of the show played directly by Bixby (my expectation was that it was going to be similar to the cards in Google Assistant), we quickly hit a wall. I’m sure they are going to correct that, but at this point it feels a little outdated already.
The Bixby platform and the developer studio might be a game changer in the smart assistants race. The Bixby team have a different, novel idea of how an assistant should behave, and I expect the competition to only be good for the voice ecosystem overall.
If this catches on, Samsung will have the “phone advantage”, and in their case it’s not only phones but all kinds of appliances: the possibility of instantly having their platform on all these devices, without having to convince users to buy a smart speaker. They did release the Galaxy Home a couple of months ago, though, and for sure the whole Bixby ecosystem will work there as well. All in all, exciting times ahead.
This is VoiceFirst weekly flash briefing. My name is Mari, as always you can find me on Twitter as @voicefirstlabs or Instagram as @voicefirstweekly. You have a great day and I’ll talk to you all tomorrow.
P.S We will be at the Developer Conference today during the announcement of the capsules contest winners. Expect live updates on Twitter.
Today I’ll talk about one of the indications that more companies are jumping on the voice bandwagon: the Guardian’s announcement yesterday of its Voice Lab.
According to the press release on the Guardian website, the goal is to bring the authentic voice of the Guardian to Google Assistant through experimentation and innovation.
The Guardian is looking to create immersive interactive stories leveraging emerging technologies.
In partnership with Google, our dedicated multidisciplinary team of journalists, developers and designers will create and test innovative voice-driven audio experiences for the Google Assistant.
There are no examples for now, as the article asks readers to come back to the page for updates. However, it’s a relevant announcement for media companies. This is not the first time that I have talked about media companies leveraging voice technologies and experimenting with audio and smart speakers: the BBC, NYT and HuffPost Canada all have voice applications on one of the leading platforms, and a lot of other media companies have flash briefings.
I’m definitely excited to see how this new combination of mediums changes journalism and storytelling.
GlobalWebIndex released a report last week, Voice Search: Trends to Know, a deep dive into the consumer uptake of voice assistant technology. I summed up the main points.
The report tackles 3 fundamentals:
Here are the key takeaways in each of them:
Growth prospects for voice
Consumers have a wide range of voice-powered search services at their disposal, from Siri to Cortana to Google Assistant to Baidu DuerOS. Voice-enabled smart speakers and voice assistants on mobile are the primary interfaces consumers use to engage with voice tech.
According to the report, the demographic breakdown of mobile voice users, meaning those who used voice search or voice command tools on their mobile in the past month, is:
Ages 16 – 34 a combined 66%.
The important trend in the chart is how mobile voice is driven by younger internet users. More than 2 in 3 Mobile Voice Users fall within the 16-34 age bracket, giving us a clear indication of the trajectory of growth in the mobile voice market.
From a market-by-market perspective, the GlobalWebIndex data shows that mobile voice search is being driven by Asian markets, with the strongest figures in Indonesia (38%), China (36%) and India (34%).
One of the biggest obstacles smart speakers have faced is in convincing consumers that they are an essential rather than a nice-to-have device.
This is in part being erased by third party applications for consumers, giving brands the power to engage or sell in a convenient way. Expedia is the most recent example that I featured here a few episodes back.
Similar to mobile voice, ownership and intent to purchase are concentrated among younger age groups, but there is still a significant number of older consumers who say that they plan to purchase one of these items in the future. Clearly, there is a widespread awareness of how these devices can bring value to everyday activities, which spans age and income groups. A key factor in increasing this awareness has been aggressive promotional and discount periods during holiday seasons, especially from the likes of Amazon, ensuring that these devices are available even to more modest budgets.
Another key takeaway from the report is the prolonged interest in smart speakers as they approach their fourth year on the market, providing a promising outlook for the longevity of consumer use cases for these devices.
The Consumer Privacy factor
The final section of the report addresses consumer privacy, based on users’ skepticism and the concern that they are being recorded all the time. It concludes that:
The balance between convenience, privacy and security for new technologies like voice search often rests upon brands being transparent with their customers.
GlobalWebIndex outlines social, transparency and affordability as the main implications for the future of voice tech. For the consumer research company, Amazon is clearly ahead of the competition, and that should serve as a warning for both new and traditional competitors.