Episode Archives

The Revolution will be synthesized

Synthesized media is a category surging with the latest developments in synthetic voices and speech-to-text technology. It's not new, but it is re-emerging with the development of machine learning algorithms. Synthesized media refers to information types synthesized by computers (e.g. text, graphics and computer animation) under computer control.

Synthesized media today:

We are not that far away from a synthesized future if we look at current examples of synthesized celebrities:
Lilmiquela on Instagram takes photos with influencers and has more than 1 million followers, who don't seem to care that she doesn't really exist.
Japanese pop star Hatsune Miku appears as a hologram in concert venues. Thousands pay to see her "live".
Other examples of the advances of synthesized media include:
Lyrebird, a company that created a service that listens to your voice for five minutes and can then make it sound like you are saying anything. Almost a year ago, Lyrebird published a video on their social media of a synthesized Donald Trump voice. Other companies offer this service (the banking of your voice) as well, and you can check them out in our text-to-speech services episode.

In another example, researchers at the University of Washington used AI to synthesize a video of President Barack Obama speaking, based on footage from his weekly addresses.

Debunking of synthesized media, but wait

The World Federation of Science Journalists published a tweet earlier in May calling for the development of "robust processes for debunking of synthesized media". The article was based on the synthesized Obama video and warned against the dangers of deepfakes. But it also highlighted an opportunity for the media, and I quote:

The media itself is a simulacrum of reality, in which each selection, edit, highlight, or turn of phrase shapes the audience’s interpretation of events. What’s new here is that media-synthesis algorithms further fracture any expectation of authenticity for recorded media while enabling a whole new scale, pervasiveness, potential for personalization, and ease of use for everyone from comedians to spies. Faked videos could upset and alter people’s formation of accurate memories around events.

Will the future be synthesized?

What will happen when nothing seen online can be trusted? What will happen while we ride the wave of the authenticity awareness gap?
According to CJR, it's an opportunity for media organizations to ramp up training in media forensics techniques. However, much of this technology is still far from the cheap availability that would make it practical for reporters.

Media forensics is new to me, as it probably is for a lot of you as well. There are some sources to read in the episode notes at voicefirstweekly.com/flashbriefing/82 (yeah, this one). It's about following a rigorous process to ensure that what's published is authentic. However, as I was reading, I wondered: there is so much new media being created every day that media forensics by journalists might just not be enough. What if we decide to surrender to the fact that we will have synthesized media? What measures will we take? Will we become more tribal, with a trusted authority discerning information for the group?
And what if our future news anchors are synthesized videos combined with text-to-speech services? The Hatsune Miku of news.

Thank you for listening. Remember to subscribe, like, comment and share this episode. My name is Mari, and you can find me on Twitter as voicefirstlabs and on Instagram @voicefirstweekly. Thank you for listening and have a great day!

Every platform company should have evangelists. Start your week in voice with this

During the 2018 IFA conference in Berlin last week, Amazon announced that Alexa has 50,000 skills worldwide, works with 20,000 devices, and is used by 3,500 brands. Big numbers, but what do they tell us?

Precisely last week, but before the announcement, I did an episode on voice space fragmentation. The main highlight relevant to Amazon's latest announcement is that Amazon has the developer ecosystem, and no one comes close to the education and evangelism work the Alexa team is doing with developers. That is playing out in the number of skills worldwide, and it will continue to play out. Today, as companies compete for developer attention, every company should have developer evangelists, not just HR or sales. This is not a new concept; it's just taken to a different level. The biggest companies get this, but it's not enough to put some tutorials out there: it's the careful work of listening and responding to developers' concerns, educating, and relating to developers at a human level on social media. I understand not all companies have the resources to pull this off, but the voice space is going to become more fragmented and more companies will try to stand out. The authenticity of your interactions with the users of your platform is going to be (for Amazon already is) a cornerstone.

Developer attention is the currency today for platforms

The Amazon Alexa team is currently doing that better than any other smart assistant platform (I would argue better than anyone else), and it's drawing hundreds of thousands of developers from more than 100 countries, even countries where Alexa is not available yet. As a platform, the goal should be to attract people to build on top of it. Developer attention is the currency for platforms today, especially globally available platforms like smart assistants.

More than 3,500 brands are on Alexa

For all of the above, it's natural that brands are landing their voice strategy on Alexa first. Again, that will continue to be the case as long as more developers are on the platform. The tool ecosystem is also bigger for Alexa, and tools tend to become available on Alexa first. So if you are a brand looking to shift your digital strategy towards voice, Alexa should be your starting point for these reasons alone.

Google Assistant gets bilingual support

The other relevant news as you start this week is Google Assistant's bilingual support. As you know, I'm one hundred percent behind internationalization and translation as key elements for voice apps (I even named internationalization one of the infinity stones for voice), so naturally this is great news to hear. It also paints a picture of future development and sets user expectations at a new level: once users get used to the feature, it will be expected from every assistant. The promotional video also highlights its use for learning new languages or teaching kids new languages.

Apple announced their event for September 12

Last but not least, I'm also excited for the Apple event announced for September 12. I just want to see what they are going to come up with, especially with Siri, the Shortcuts app and any other conversational developments.

Before wrapping up, I want to thank you from the bottom of my heart for listening. The number of listens of the briefing is growing consistently and we are reaching people from every latitude of the planet. Thank you. Now go ahead and continue to make this happen by sharing this with someone who needs to know about voice platforms.

Videos demonstrating Google Assistant bilingual support:

  • By Google itself
  • And this is a video published by Tobias Goebel, VP at SparkCentral.

Alexa's Contact and Motion Sensor APIs use cases

Hello there, Happy Sunday! I hope you are having a nice, relaxing day.

Earlier this week the Alexa team announced the availability of the Contact and Motion Sensor APIs and their integration into Alexa.

Why do I think this is important? Because of the cases that are not being explicitly advertised for the feature: improved access to home utilities for physically impaired people, and for caregivers of kids, older adults and patients. The contact and motion sensor APIs also connect to Routines in Alexa, automating these tasks even further. The future of home automation is more connected every day, so I'll do an episode in the coming weeks about the main players working on home automation.
Here are the main use cases as featured in the Alexa blog post (a rough code sketch of how a contact sensor looks to the Smart Home Skill API follows this list):
Customers can view their connected sensors in the Alexa App, query their status by asking Alexa, and use sensors to activate Routines to control other connected smart home devices, say special phrases, play music, receive notifications, and much more. Customers can use their sensors to automate a wide variety of custom-built Routines, such as:

  • When motion is detected in the living room, Alexa can turn on the lights, and then turn off the lights after 30 minutes with no motion detected
  • A motion sensor within a Wi-Fi camera can turn on a light and send a notification to your phone
  • You can ask Alexa if a door or window is open before arming the home security system
  • A front door contact sensor can activate Alexa to announce that the front door is open
  • A pantry door contact sensor can turn the pantry lights on and off when the door opens and closes
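
To make the mechanics a bit more concrete, here is a minimal sketch of how a device maker might expose a contact sensor to Alexa through a smart home skill. It only handles the Discover directive; the endpoint id, names and Lambda setup are made-up placeholders, so treat it as an illustration of the Alexa.ContactSensor capability rather than a finished implementation.

# Minimal sketch of an AWS Lambda handler for a smart home skill that exposes a
# single contact sensor to Alexa. Names and ids below are made-up placeholders;
# only the Alexa.Discovery Discover directive is handled here.
import uuid

def lambda_handler(event, context):
    header = event["directive"]["header"]
    if header["namespace"] == "Alexa.Discovery" and header["name"] == "Discover":
        return {
            "event": {
                "header": {
                    "namespace": "Alexa.Discovery",
                    "name": "Discover.Response",
                    "payloadVersion": "3",
                    "messageId": str(uuid.uuid4()),
                },
                "payload": {
                    "endpoints": [
                        {
                            "endpointId": "front-door-contact-01",
                            "manufacturerName": "Example Sensors Inc.",
                            "friendlyName": "Front Door Sensor",
                            "description": "Contact sensor on the front door",
                            "displayCategories": ["CONTACT_SENSOR"],
                            "capabilities": [
                                {
                                    "type": "AlexaInterface",
                                    "interface": "Alexa.ContactSensor",
                                    "version": "3",
                                    "properties": {
                                        # detectionState (DETECTED / NOT_DETECTED) is the
                                        # value that Routines can react to.
                                        "supported": [{"name": "detectionState"}],
                                        "proactivelyReported": True,
                                        "retrievable": True,
                                    },
                                },
                                {"type": "AlexaInterface", "interface": "Alexa", "version": "3"},
                            ],
                        }
                    ]
                },
            }
        }
    # State reports, change reports, etc. are omitted from this sketch.
    return {}

The detectionState property reported by that capability is what Routines key off, which is what makes the "announce when the front door opens" use case above possible.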

That’s it for today.
Remember to subscribe, like, comment and share this episode. My name is Mari, and you can find me on Twitter as voicefirstlabs and on Instagram @voicefirstweekly. Thank you for listening and have a great day!

Automatic transcription for video and audio files stored in OneDrive

Among Microsoft's announcements this week are a number of new features coming to OneDrive for Business and SharePoint that will use AI and machine learning technologies to help manage and collaborate on content stored in those services.
Starting “later this year,” Microsoft will be adding automated transcription to video and audio files stored in OneDrive and SharePoint. This transcription will use the same technology that Microsoft uses in its Microsoft Stream business video service. OneDrive and SharePoint video and audio files will become fully searchable thanks to these transcription services.

Microsoft is providing developers with natural language processing tools and cognitive services

One of the things I've noticed is that even if Cortana is not making a lot of noise in the smart assistant space, aside from the integration with Alexa, Microsoft has been providing great developer tools for language understanding and processing.
Microsoft has an enterprise focus, and automatic transcription can be a big deal for the content needs of many organizations. If they later add translation to three more languages, it becomes a game changer for collaboration in the enterprise.
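
As a small taste of those developer tools, here is a minimal sketch of querying a LUIS (Language Understanding) app through its v2.0 REST endpoint. The region, app id and subscription key are placeholders you would replace with your own, and the sample utterance and response fields are just for illustration, not a full integration.

# Minimal sketch: sending an utterance to a LUIS (Language Understanding) app
# over the v2.0 REST endpoint. Region, app id and key are placeholders.
import requests

LUIS_REGION = "westus"                  # assumed region of the LUIS app
LUIS_APP_ID = "<your-luis-app-id>"      # placeholder
LUIS_KEY = "<your-subscription-key>"    # placeholder

def get_intent(utterance):
    """Return LUIS's parsed prediction (intents and entities) for an utterance."""
    url = "https://{}.api.cognitive.microsoft.com/luis/v2.0/apps/{}".format(
        LUIS_REGION, LUIS_APP_ID
    )
    response = requests.get(
        url,
        params={"subscription-key": LUIS_KEY, "q": utterance, "verbose": "true"},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    prediction = get_intent("transcribe the recording from this morning's meeting")
    # A v2 response typically carries a topScoringIntent plus a list of entities.
    print(prediction.get("topScoringIntent"))
    print(prediction.get("entities"))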

That’s it for today
Remember to subscribe, like, comment and share this episode. My name is Mari, and you can find me on Twitter as voicefirstlabs and on Instagram @voicefirstweekly. Thank you for listening and have a great day!

VoiceFirst infinity stones

I saw this image of a voice gauntlet by Mark Tucker on Twitter. Are you fans of Marvel? I wouldn't say I'm a fan of anything, because I don't lose it over teams or companies, but if I had to be a fan of something it would be Pixar and/or Marvel. For those who have seen their latest big success, Infinity War, where the whole cinematic universe clicks together like a big puzzle, you've heard about the gauntlet and the infinity stones. In this episode we take you to the wonderful world of the VoiceFirst infinity stones.
Let me remind you first that there are six stones in the Marvel universe:

Soul, reality, space, time, power and mind.

The VoiceFirst infinity stones

Now for VoiceFirst, that translates as: personalization is the soul stone, monetization is the reality stone, discoverability is the space stone, convenience is the power stone, retention is the time stone, and context and memory are the mind stone.
With monetization, discoverability, personalization, convenience, retention and keeping context and memory, almost all of the voice puzzle comes together as well.

There is another stone

But I think a piece is missing: the tongue stone, internationalization. Voice is the technology that understands people, instead of people having to understand and adapt to the technology. And to reach more users, to move beyond English-speaking countries, internationalization has to become an upfront strategy for voice.
Each stone has different challenges and a road to travel before we can say it's solved, but they are the pieces that need to come together for great voice experiences. I hope it doesn't take us 18 movies to figure it out!
Go to voicefirstweekly.com/flashbriefing/78 to see the image and the tweet.
Send me Marvel fandom comparisons with voice tech if you have any, or send me any Marvel fandom stuff; I promise to watch and read it.
Remember to subscribe, like, comment and share this episode. My name is Mari, and you can find me on Twitter as voicefirstlabs and on Instagram @voicefirstweekly. Thank you for listening, have a great day, and we'll talk tomorrow!