Month: August 2018

VoiceFirst infinity stones

I saw this image of a voice gauntlet by Mark Tucker on Twitter. Are you fans of Marvel? I wouldn’t say I’m a fan of anything because I don’t lose it over teams or companies, but if I have to be a fan of something, it has to be Pixar and/or Marvel. And for those who have seen their latest and biggest success, Infinity War, where the whole cinematic universe clicks together like a big puzzle, you have heard about the gauntlet and the Infinity Stones. In this episode we take you to the wonderful world of the VoiceFirst Infinity Stones.
Let me remind you first that there are six stones in the Marvel universe:

Soul, reality, space, time, power and mind.

The VoiceFirst infinity stones

Now for VoiceFirst, that translates as: personalization is the soul stone, monetization is the reality stone, discoverability is the space stone, convenience is the power stone, retention is the time stone, and context and memory is the mind stone.
With monetization, discoverability, personalization, convenience, retention, and context and memory, almost all of the voice puzzle comes together as well.

There is another stone

But I think a piece is missing: the tongue stone, internationalization. Voice is the technology that can understand people, instead of people understanding and adapting to the technology. And to reach more users and move beyond the English-speaking countries, internationalization has to become an upfront strategy for voice.
Each stone has different challenges and a road to travel before we can say it’s solved, but they are the pieces that need to come together for great voice experiences. I hope it does not take us 18 movies to figure it out!
Go to to see the image and the tweet.
Send me your Marvel comparisons with voice tech if you have any, or send me any Marvel fandom stuff; I promise to watch and read.
Remember to subscribe, like, comment and share this episode. My name is Mari, and you can find me on Twitter as voicefirstlabs and on Instagram @voicefirstweekly. Thank you for listening, have a great day, and we’ll talk tomorrow!

It’s Thursday newsletter day, here’s a preview

It’s Thursday, newsletter day. Every Thursday at 9:50 AM Pacific we deliver our weekly installment on voice technology. Subscribe to get content we don’t talk about here, because its nature is longer or more thoughtful than the short format of these episodes allows. Here is Wavenet E for you with a preview:

Hi there, this is Google Voice Wavenet E. Weird name, but the host of this podcast cannot talk about weird names; I can’t even pronounce hers. Here is a summary of branding in voice covered in today’s newsletter:

More brands are recognizing the influence of voice. Today’s cases: Petco voice experiences, and why voice integration for brands is more important than voice search.

Voice commerce is considered the next retail disruptor.

Yes, I do consider the future of media to be synthetic media. So we’ll talk soon.


Google GA of Cloud Text-to-Speech WaveNet voices, with a special guest

Google announced the general availability of Cloud Text-to-Speech, new audio profiles that optimize sound for playback on different devices, enhancements to multichannel recognition, and more.
Starting yesterday, Cloud Text-to-Speech offers multilingual access to voices generated using WaveNet, a machine learning technique developed by Alphabet subsidiary DeepMind. And to show you how meta we run at VoiceFirst Weekly, this bit is generated from a WaveNet voice for US English. Wow!
Cloud Text-to-Speech offers 17 new WaveNet voices. In the episode notes you can find the link to all the available voices and languages. Wait, there is more. The company also announced audio profiles, which were previously available in beta.
Audio profiles let you optimize the speech produced by Cloud Text-to-Speech’s APIs for playback on different types of hardware. You can create a profile for wearable devices with smaller speakers, for example, or one specifically for car speakers and headphones. It’s particularly handy for devices that don’t support specific frequencies; Cloud Text-to-Speech can automatically shift out-of-range audio to within hearing range, enhancing its clarity.
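For the developers following along, here is a minimal sketch of how an audio profile might be attached to a synthesis request. It only builds the JSON body and sends nothing. The `effectsProfileId` field and the profile IDs are written as I recall them from the Cloud Text-to-Speech REST API, so treat them as assumptions and check the official documentation; `build_synthesis_request` is a hypothetical helper, not part of any Google library.

```python
import json

# Audio profile IDs as recalled from the Cloud Text-to-Speech docs.
# Assumption: verify this list against the current documentation.
KNOWN_PROFILES = {
    "wearable-class-device",
    "handset-class-device",
    "headphone-class-device",
    "small-bluetooth-speaker-class-device",
    "medium-bluetooth-speaker-class-device",
    "large-home-entertainment-class-device",
    "large-automotive-class-device",
    "telephony-class-application",
}


def build_synthesis_request(text, profile, voice_name="en-US-Wavenet-D"):
    """Build a JSON-ready body for a synthesis call using one audio profile.

    Hypothetical helper: it creates a plain dict and performs no network call.
    """
    if profile not in KNOWN_PROFILES:
        raise ValueError(f"unknown audio profile: {profile}")
    return {
        "input": {"text": text},
        "voice": {"languageCode": "en-US", "name": voice_name},
        "audioConfig": {
            "audioEncoding": "MP3",
            # The profile is passed as a list of profile IDs.
            "effectsProfileId": [profile],
        },
    }


if __name__ == "__main__":
    body = build_synthesis_request(
        "Welcome to VoiceFirst Weekly.", "wearable-class-device"
    )
    print(json.dumps(body, indent=2))
```

Swapping `"wearable-class-device"` for `"large-automotive-class-device"` would be the only change needed to target car speakers instead, which is the appeal of keeping the profile as a single parameter.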
The other features that are part of the announcement are multichannel recognition, language auto-detect, and word-level confidence.
Multichannel recognition offers an easy way to transcribe multiple channels of audio by automatically denoting the separate channels for each word.
Language auto-detect lets you send up to four language codes at once in queries to Cloud Speech-to-Text. The API will automatically determine which language was spoken and return a transcript, much like how the Google Assistant detects languages and responds in kind. (Users also get the choice of selecting a language manually.)
Word-level confidence offers developers fine-grained control over Google’s speech recognition engine. If you so choose, you can tie confidence scores to triggers within apps, like a prompt that encourages the user to repeat themselves if they mumble or speak too softly.
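The three features above can be sketched together in a few lines. This is a hedged illustration, not Google’s client library: it builds a configuration dict and inspects a response dict, with no network calls. The field names (`alternativeLanguageCodes`, `enableWordConfidence`, `audioChannelCount`, `enableSeparateRecognitionPerChannel`) are as I recall them from the Speech-to-Text REST API and should be checked against the docs. The helper names, the 0.6 threshold, and the reprompt wording are all hypothetical.

```python
MAX_LANGUAGES = 4  # "up to four language codes at once"


def build_recognition_config(languages, channels=1):
    """Build a recognition config dict (hypothetical helper, no API call)."""
    if not 1 <= len(languages) <= MAX_LANGUAGES:
        raise ValueError("send between one and four language codes")
    # The first code is the primary language, the rest are alternatives.
    primary, *alternatives = languages
    config = {"languageCode": primary, "enableWordConfidence": True}
    if alternatives:
        config["alternativeLanguageCodes"] = alternatives
    if channels > 1:
        config["audioChannelCount"] = channels
        config["enableSeparateRecognitionPerChannel"] = True
    return config


def words_to_repeat(response, threshold=0.6):
    """Return the words whose confidence is below the threshold.

    A word with no confidence value is treated as unclear. Only the first
    (top) alternative of each result is inspected.
    """
    unclear = []
    for result in response.get("results", []):
        alternatives = result.get("alternatives", [])
        if not alternatives:
            continue
        for info in alternatives[0].get("words", []):
            if info.get("confidence", 0.0) < threshold:
                unclear.append(info["word"])
    return unclear


def next_prompt(response, threshold=0.6):
    """Return a reprompt if any word was unclear, otherwise None."""
    if words_to_repeat(response, threshold):
        return "Sorry, I didn't catch that. Could you repeat it?"
    return None


if __name__ == "__main__":
    # Hand-written sample shaped like a recognition response (made-up data).
    sample = {
        "results": [
            {
                "alternatives": [
                    {
                        "words": [
                            {"word": "play", "confidence": 0.93},
                            {"word": "podcast", "confidence": 0.41},
                        ]
                    }
                ]
            }
        ]
    }
    print(build_recognition_config(["en-US", "es-ES"], channels=2))
    print(next_prompt(sample))
```

The threshold is the design choice to tune: set it too high and users get asked to repeat themselves constantly, too low and mumbled words slip through.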
In the monthly free tier, users will have up to 4 million characters for the standard voices and up to 1 million characters for the WaveNet voices.
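To get a feel for what those free-tier numbers mean in practice, here is a small arithmetic sketch. The two allowances come from the announcement above; the function names and the 3,000 characters per episode figure are hypothetical, chosen only for illustration, and no prices are assumed.

```python
# Monthly free-tier allowances, in characters, as stated above.
FREE_CHARACTERS = {"standard": 4_000_000, "wavenet": 1_000_000}


def billable_characters(voice_type, characters_used):
    """Characters that fall outside the monthly free tier (never negative)."""
    return max(0, characters_used - FREE_CHARACTERS[voice_type])


def episodes_within_free_tier(voice_type, characters_per_episode):
    """How many whole scripts of a given length fit in the free tier."""
    if characters_per_episode <= 0:
        raise ValueError("characters_per_episode must be positive")
    return FREE_CHARACTERS[voice_type] // characters_per_episode


if __name__ == "__main__":
    # Hypothetical script length for a short daily episode.
    print(episodes_within_free_tier("wavenet", 3_000))
    print(billable_characters("wavenet", 1_250_000))
```

With a hypothetical 3,000-character script, a short daily show would stay well inside the WaveNet allowance.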
End of synthesized voice.
Doesn’t that sound amazingly real? This is Mari, your usual host. I wanted to give you a real taste of the Google WaveNet voices. I tried several WaveNet voices, and as the letters went up the alphabet, I felt the voices improved and sounded more natural. This one is Wavenet D. Synthesized voices that feel natural like this one will have a huge impact.
Maybe I’ll let Wavenet D host more episodes! Did it sound natural to you? Let me know what you think.
Remember to like, comment and subscribe, and tell me what you would like to hear me talk about in future episodes. Our Twitter handle is voicefirstlabs and we are voicefirstweekly on Instagram. Shoot us a comment there to let me know what you think, or anything else really; I love to interact about this space. My name is Mari, and I’ll talk to you tomorrow.

Here are the resources for the new wavenet voices:

Supported voices

How to create audio

Here is the pricing

And these are the other voices:


The next wave of chatbots

Hello, there! Welcome to VoiceFirst Weekly.

Three or four years ago chatbots were set to be the Next Big Thing. It hasn’t been so, and according to a recent report by VentureBeat, the trend is moving toward rule-based chatbots, those that are built on a set of predefined rules (also referred to as “dumb” bots), as opposed to those that depend on machine learning. The VentureBeat article highlighted how chatbots have failed to provide the expected returns for a lot of companies. Other evidence is how chatbot platform providers like Amazon are basing their chatbot platforms on this rule-based model, which is easier to implement and easier to use.
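To make “a set of predefined rules” concrete, here is a minimal sketch of a rule-based bot. It is an illustration only, not how any particular platform implements it: the rules, the canned answers, and the `reply` function are all made up for this example.

```python
import re

# Each rule pairs a keyword pattern with a canned answer.
# Rules are checked in order and the first match wins.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE),
     "Hello! How can I help you today?"),
    (re.compile(r"\b(hours|open|close)\b", re.IGNORECASE),
     "We are open 9 AM to 5 PM, Monday to Friday."),
    (re.compile(r"\b(price|pricing|cost)\b", re.IGNORECASE),
     "You can find our pricing on the plans page."),
]

# Anything outside the rules gets the same fallback answer.
FALLBACK = "Sorry, I don't understand. Could you rephrase that?"


def reply(message):
    """Return the answer of the first matching rule, or the fallback."""
    for pattern, answer in RULES:
        if pattern.search(message):
            return answer
    return FALLBACK


if __name__ == "__main__":
    for message in ["Hello there", "How much does it cost?", "Tell me a joke"]:
        print(f"> {message}\n{reply(message)}")
```

The fallback line is where the “dumb” label comes from: the bot is predictable and easy to build, but anything outside its rules gets the same answer.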
Does that mean that chatbots are dead? Certainly not. Let’s say that was the chatbot 1.0 wave. And to all of the disappointment that chatbot progress so far might have brought: as Bill Gates puts it, “We always overestimate the change that will occur in the next two years and underestimate the change that will occur in the next ten.”

Chatbots in their current state have proved unable to handle specialized queries requiring knowledge outside their functional domain. The next wave of chatbots is going to enhance those capabilities to create a completely custom, differentiated experience by combining knowledge across relevant segments and providing better insights. According to PwC, this will give rise to a new level of conversational [banking] where results are delivered instantly through real-time conversation.
What will be the 3 main points driving the next wave for chatbots?
Number 1. Drive customer loyalty and brand awareness. When designed right, chatbots can add emotional power to the interaction with the user, enhancing captivation and loyalty. As sentiment analysis keeps improving, the next chatbots will be able to leverage the sentiment of the user to provide the best solutions for the context and the mood the user is in when interacting.
Number 2. Create a cognitive institution
A chatbot can be designed to respond to all kinds of requests and queries. It can become the insights database driving decision making with data-based analytics for your company.
And number 3. Integration with present and future technologies
Having reached a certain level of maturity, chatbots can now look at more integrations to leverage innovative technologies, leading to a new set of use cases for chatbots that are not being considered today. ManyChat’s CEO has predicted that chatbots will transition from the early adopter stage into the beginning of the early majority in 2018, and that more than 1 million bots will be created on Facebook Messenger.
In the future of jobs to be done and a master assistant that we seem to be moving toward, bots will not just interact with customers and human agents but, increasingly, with other bots in order to get tasks done.
Chatbots are leaving their infancy and will enter a more mature stage in the next 5 years.
The links for the resources mentioned in this episode will be available at
Thank you for listening. Remember to like, comment and subscribe, and tell me what you would like to hear me talk about in future episodes. Our Twitter handle is voicefirstlabs and we are voicefirstweekly on Instagram. Shoot us a comment there to let me know what you think, or anything else really; I love to interact about this space. My name is Mari, and I’ll talk to you tomorrow.

What you need to know to start your week in voice

Hello, happy Monday! Welcome to VoiceFirst Weekly! What you need to know to start your week in voice: First

Shortwave, Google’s new experiment on audio

According to a report by The Verge, Google is working on an experimental podcast app called Shortwave. The app was discovered through a trademark filing, which describes it as “allow[ing] users to search, access, and play digital audio files, and to share links to audio files.”

A Google representative said the focus of the app is on spoken-word content and that the project, which is being developed in the Area 120 incubator, will help users discover and consume spoken-word audio in new ways. It’s an early experiment, and they didn’t give more details.

This comes after Google released the Podcasts app, which we made an episode on probably a month ago. It’s unknown what the difference between Shortwave and Google Podcasts will be, but it’s clear that while AI is at the forefront of the company, they are also betting hard on audio and a voice-first future.

GOV.UK gets a voice

“As a government we need to approach voice services in a consistent way.” That’s a quote from an article published on the GOV.UK blog. GOV.UK is incorporating voice assistants into its digital strategy. Smart speaker ownership in the UK has risen to 8% of adults, up 3 points in 2018 alone. For the team behind the work, conversations with Amazon and Google made it clear that many users are asking questions where government is the best source.

For GOV.UK, working on voice is an opportunity to meet the rising expectations of users and make government more accessible.

GOV.UK is designing for scale, for answering government-related questions, for consistency and for multiple platforms. The team is aware of the current challenges, like the privacy and identification that many government services require. But they are also very aware of the advantages, so they are playing wait-and-see on the present challenges while providing users on voice platforms with what they can offer today.

This is not the first time a government service gains a voice: the city of Ozark, Missouri has an Alexa skill developed by the guys at VoiceXP. But I do think it is the first time at a national level. There is so much potential in government services for conversational interfaces that I’m sure we’ll see more use cases like GOV.UK emerging.

Thank you for listening, have a productive week and an awesome day. Remember to like, comment and subscribe, and tell me what you would like to hear me talk about in future episodes. Our Twitter handle is voicefirstlabs and we are voicefirstweekly on Instagram. Shoot us a comment there to let me know what you think, or anything else really; I love to interact about this space. My name is Mari, and I’ll talk to you tomorrow.