In our last episode, we talked about the arrival of the “voice-first” movement, covering the current landscape for voice technology and some of the immediate implications for marketers. In Part 2 of this series, we shift our focus to the future.
In this episode of The Marketing Remix, Ron Hadler, Sr. Director of Marketing Technology, and Ross Briggs, Manager of SEO, discuss the future of voice technology, the impact of voice-commerce, ways to future-proof your website – and finally, answer the question “will voice kill the website?”
Voice-Commerce: Has voice become a sales channel in its own right?
Ross: 2017 saw $1.8 billion worth of e-commerce sales through the voice channel, primarily driven by Alexa. That's the closest contact to that purchase ability. That number is going to grow over the next few years up to $40 billion worth of sales, and that's really just looking at direct purchases that take place via voice. So, I think the bigger implication is really, as consumers get more used to this technology and it becomes more integrated into their lives, they start to also do product research, and that has a big implication for that final purchase that might take place online via desktop or mobile, or maybe in person at a brick and mortar location.
Reid: Yeah. So I think that people naturally gravitate toward the Alexa app and say, "Hey, I'm just going to reorder something." That's pretty easy to be doing, but it's a little different behavior in some of the other devices. Last episode we talked a little bit about the differences between an Alexa device and a Google device or maybe even Siri. How do we see all the different aspects of voice tech evolving into the future? How's it different device to device: watches, phones and the Alexas of the world, I don't even know what you call that, I guess, home speakers of some sort? How is that evolving, you think, into the future?
Ron: Just to go back a little bit to commerce, I think that's one of the things that's sort of evolving, because even as good as Alexa is for e-commerce, I don't know who these people are who are buying things without seeing them. 'Cause even with a pair of socks, it's like, is that scratchy, is that soft, is that thick enough? All these things that you just can't do through voice, no matter how good it is. So I think that's really the evolution: pairing a screen with these speakers. And so it becomes no longer just the voice, it becomes voice as interface. So you are navigating the digital world via voice navigation. So that's really one of the ways that we're moving forward with voice: the interface is voice.
Ross: Sort of Star Trek, not Star Wars.
Reid: Oh, no. That's a whole different battle.
Ross: And we've got these devices around us right now, and some people have the Google Home in their house, and we've got voice interaction with our phones, but as we see this growth of the Internet of Things, there are connected devices all around us. And maybe it's your refrigerator in your kitchen that can then take a shopping list by you speaking to it. It can be your watch. As you mentioned, it can be interactions within your car as you're driving. Healthcare is a huge place where Internet of Things is taking off, one example being in home elder monitoring, and making sure that if they fall or something like that, that they can be taken care of. And so as more of these surround us, and people become more comfortable with voice, it's easier just to make that natural interaction, and not say, "I'm going to go to my computer and do a search." You just start speaking and the search is happening around us. And I think one more future-looking thing when you see people wearing the virtual reality headsets, they're not typically sitting on a computer doing it, right? They're interacting with the world around them, naturally, and that uses voice.
Reid: Yeah. Well, it's interesting, getting comfortable with it, because some of the awkwardness I've experienced is the people around me, if I'm going to have a conversation with the device. And that's another thing, I think people... We've started to humanize our devices a little bit more. I mean obviously, they put names, like a Siri or Alexa or something that's tried to make it a little more human. I think the voices themselves are getting more and more human and less computery. So perhaps that's some of it, is just a cultural norm and how that's going to evolve and change.
Ron: How many times have you said thank you after you've gotten an answer from the speaker? [laughter]
Reid: I was taught to be very polite, so almost always, I think I say thank you. [chuckle]
Ron: It's one of those interesting things. And so, in understanding that, they're actually evolving the voice devices, to understand politeness, and because what's really popular is these devices with kids.
Ron: And so, they're now being trained to be more polite in interacting with us, because the computer right now doesn't really care whether you're mean, nice, or angry, it's just going to give you the answer. And so being able to actually enforce good behavior is now happening through the interactions.
Reid: That'll be interesting, I wonder if it'll start to infer and understand tone. We haven't really talked about that and maybe that's where some of the stuff will be heading to. If the urgency with which someone is asking a question or some of the background environment or things we tell... Last time we talked a little bit about use cases. Something happening in the gym is different than... So from a geo-location standpoint, than something that happens in a kitchen. So these devices are obviously getting smarter and smarter, and voice being the interface to have that conversation.
Ross: One of the most exciting things about voice, I think, is that it conveys so much more. And I don't care how many emojis somebody puts on or something, it doesn't convey emotion. And so voice...
Reid: My emojis do.
Ross: Voice does convey emotions very, very well. Whether that's inflection, volume, and all of that comes through very, very closely. And so it's really a lot of those points where you can get sentiment personalization via voice. So understanding and delivering content whether somebody's coming and asking for content, whether they're angry, happy, sad, or excited about your content.
How can brands “future-proof” their website (and content) for a voice-first world?
Ron: See, and I got into that a little bit at the very end of our last podcast, and tried talking about content engineering. And really this is the concept of applying engineering principles to your content, and the reason you do that is two-fold. One, with well-engineered content, it's easy for a content publisher to get it out there. Which means friction-less publishing, and that's really important in this day of fast turn around things. But it's also about removing your styling, all of those different things and HTML that come in with content. A lot of content measures, they tend to kinda mix that, so well-engineered content removes that. And so, what this does is you're just going to get pure content. And the advantage of having pure content is it then is going to serve not only in a voice-first world, but whatever the next iteration is beyond voice, 'cause we were doing this before we had mobile, then we had a watch, then voice, and whatever's next. I'm just thinking, and it comes to me, it's still delivering the content, is really kind of what it is, and so having content by itself is really it.
Reid: Can you break apart a little bit, the word content? And when you say stripping it all the way, what is left when you're talking about content? It's not just words or is it just words?
Ron: Not just words, but it's really kind of... Just use the word content type, so a robust content type. So let's just think of a blog article as the content type. So a blog article has a title. It has the body, it has maybe a picture associated with it, it might even have a summary associated with it. So that's pure content. Now that body does not have HTML in it. Now it can have things like bullets or it can have things like paragraph breaks. That stuff actually translates; the robots that speak will understand those sort of things, but they won't understand HTML like header or div. Those things will really mess up your content. So making sure that you can deliver pure content is part of future-proofing, because then no matter what form the Internet takes in the future, your content is able to be delivered.
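[Editor's note: The blog-article content type Ron describes can be sketched in code. This is only a minimal illustration under stated assumptions, not how any particular CMS works: the names `BlogArticle`, `Paragraph`, `BulletList`, `render_html`, and `render_speech` are all hypothetical. The idea is that the body is stored as plain paragraphs and bullets with no HTML in it, and markup is added only at delivery time for channels that want it.]

```python
from dataclasses import dataclass
from html import escape
from typing import List, Optional, Union


@dataclass
class Paragraph:
    """A plain-text paragraph with no markup."""
    text: str


@dataclass
class BulletList:
    """A plain-text bulleted list with no markup."""
    items: List[str]


Block = Union[Paragraph, BulletList]


@dataclass
class BlogArticle:
    """Hypothetical content type: title, body, optional summary and picture."""
    title: str
    body: List[Block]
    summary: Optional[str] = None
    image_url: Optional[str] = None


def render_html(article: BlogArticle) -> str:
    """Add HTML only at delivery time, for a screen-based channel."""
    parts = [f"<h1>{escape(article.title)}</h1>"]
    for block in article.body:
        if isinstance(block, Paragraph):
            parts.append(f"<p>{escape(block.text)}</p>")
        else:
            items = "".join(f"<li>{escape(item)}</li>" for item in block.items)
            parts.append(f"<ul>{items}</ul>")
    return "\n".join(parts)


def render_speech(article: BlogArticle) -> str:
    """Produce markup-free text that a voice channel could read aloud."""
    lines = [article.title]
    for block in article.body:
        if isinstance(block, Paragraph):
            lines.append(block.text)
        else:
            lines.extend(block.items)
    return "\n".join(lines)


# Example data, invented for illustration.
article = BlogArticle(
    title="Will voice kill the website?",
    summary="Voice is an interface, not a replacement.",
    body=[
        Paragraph("Voice is becoming one more way to reach content."),
        BulletList(["Keep styling out of the body", "Publish once, deliver anywhere"]),
    ],
)

print(render_html(article))
print(render_speech(article))
```

[The same stored article feeds both renderers, which is the point of keeping the body free of HTML: a new channel needs a new renderer, not rewritten content.]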
Reid: So breaking it apart like that then changes maybe its discoverability, or how people search for it and get it back. Ross, maybe talk a little bit about future-proofing it from that perspective.
Ross: Yeah, I think one of the things that we have to be aware of is sort of as these new technologies pick-up and gain popularity, there's often a lot of hype around them, and sometimes the hype pre-dates the actual impact that comes later. But what Ron's talking about with quality content engineering, that's something that's beneficial for your website now. But also future-proofs you for voice interactions, later. And sort of one thing we talked about in the previous episode, was those featured snippets, and using structured data on your website to win more featured snippets. So there's a few new types of featured snippets that were recently approved by Google for usage. One is an FAQ, a way to mark up that content, for here was the question, and then here were the submitted answers. Maybe it's through a forum or could just be FAQ like customer service questions, things like that. There's also a How To schema markup that you can leverage, and then there's also a Q&A. So using these now can win you featured snippets in search, and can get you improved click-through rate on mobile and desktop as we've been talking about, but also future-proofs you for voice and so it's sort of a two-prong strategy that improves how you're doing now and future-proofs you as well.
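[Editor's note: As a rough illustration of the FAQ markup Ross mentions, the sketch below builds schema.org `FAQPage` structured data as JSON-LD and wraps it in a script tag. The question and answer text and the helper names `faq_jsonld` and `as_script_tag` are made up for the example. The schema.org vocabulary also defines `QAPage` and `HowTo` types for the other two cases he lists. Whether any given markup actually earns a featured snippet is up to the search engine.]

```python
import json
from typing import Dict, List, Tuple


def faq_jsonld(pairs: List[Tuple[str, str]]) -> Dict:
    """Build schema.org FAQPage structured data from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def as_script_tag(data: Dict) -> str:
    """Serialize as a JSON-LD script tag for embedding in a page."""
    # Escape "</" so answer text cannot close the script tag early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(data, indent=2).replace("</", "<\\/")
    return '<script type="application/ld+json">\n' + payload + "\n</script>"


# Hypothetical customer-service questions, invented for illustration.
faqs = [
    ("Do you ship internationally?", "Yes, we ship to most countries."),
    ("What is your return policy?", "Returns are accepted within 30 days."),
]

markup = faq_jsonld(faqs)
print(as_script_tag(markup))
```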
Reid: So that's interesting too, now as we're adapting this, the devices have changed. Ron alluded to the shift from websites all the way down to the form factor of a watch or something to that effect. And we started this whole conversation around commerce and how it moves toward transactions. So how do you guys see this moving more toward transactions via those different kinds of interfaces? I mean, is it we're doing a search and then there's a sequence of events that may happen that ends with this kind of voice response mechanism, where it's more like a conversation? Today it feels a lot more like I ask a question, I get a response and I'm done. It gives maybe some further content, but what's this dialogue start to look like?
Ron: That's the exciting part. And maybe I'm going to steal Ross' thunder here, but Google just this summer at their I/O had Duplex, which is now rolling out to six cities here shortly. And it is the ability for an assistant to start making calls and making appointments for you. And they demonstrated, and you can find this up on YouTube, where an AI robot dialed up a hair dresser and made an appointment. And the interaction was so real that the live human on the other end, taking the appointment information, did not realize it was a robot speaking, 'cause there were pauses, there were 'uhms', and so on. Now for some place like California, here where we're at, it is illegal or against the law to record conversations, and the only way that robots can actually respond to you is they record what you say, translate it, and then respond based upon their translation of that. And so in California, it'll be like, "Hi, I'm Jim. Ron Hadler's AI robot, and I'm here to make an appointment for his beard."
Reid: And then you have to get approval that it's okay to record that call and have that dialogue?
Ron: They will have that, that kind of confirmation.
Reid: And I assume that then that gets into then trying to make changes to policy or something as from a government standpoint to make sure that this stuff can continue to progress or people will just ask for forgiveness. Do this anyway, and they'll, Google will ask for forgiveness later as consumers have adopted it, and love having it that way, so.
Ross: And to jump in on the commerce piece of it, as we're talking about this evolution from just a simple voice search application to more of the idea of personal assistants, I think as the personal assistant learns more about you and sort of understands your shopping behavior and you're looking to plan a trip, okay, the personal assistant knows. Do you pick the cheapest hotel or do you pick the resort-style hotel? Are you looking generally for a first-class plane ticket or are you happy with Coach? And as that sort of learns more about you, and knows your preferences, and we start to layer in things like AI and machine learning to sort of save that information, people become more secure with the purchase that's happening on their behalf through these personal assistants. And that's actually one of the biggest hurdles to get over, is people saying, "Okay, well, I'm okay with interacting and doing skills and asking questions, but I'm not so sure about making a purchase, because that's when my wallet's involved. All of a sudden, I'm thinking, am I going to get a 30-pack of paper towels or am I going to end up with 300 paper towels?" That's probably overkill. But once people do make that initial first purchase and they see that it does work out, they're much more likely to make future purchases versus that first one.
Is voice technology going to “kill” the website as we know it?
Ron: No, but artificial intelligence might.
Reid: Oh, okay, I knew there was a catch.
Ron: Voice is going to be amplified, and Ross can speak to this, it's kind of like when mobile came on, and how that was affecting websites and web searches. But really with voice technology you're not going to see the death of the website, but if you are prepared for it, you will take advantage when AI comes along, because people will probably stop visiting websites, 'cause they will not have to. And your assistant will go and get curated content for you, and if your website is structured such that they can find the right content, they'll bring it back and they'll curate it, they'll take that content and mash it with other things, giving the person who owns that robot choices about what they want to do, they'll answer those questions. So it's curated content delivered by AI that will kill the website as we know it today.
Reid: Then they would receive, the way I'm hearing it then, receive it the way they want to receive it. So if they're trying to compare products together, they may receive content from three different websites or something, and see them in some form of comparison, that those three websites independently didn't do, they didn't produce some comparison tool or anything like that, but the tool itself, the interface, whatever that screen may be as we're starting to see obviously the screens, they'll start to produce it. And that way, that's where, I think, content engineering what you're talking about allows some of those things to happen. So you get to be one of the three considered, because you've structured content in that appropriate way.
Ron: Absolutely. Because yeah, you may want to only consume content via videos, and then your robot will put together all that content into a video using the content they found. Or if you only want to see an infographic, they'll put those things together. You will get curated content, that's kind of the future.
How will voice technology impact search intent?
Ross: One thing Ron was sort of alluding to is when mobile came about, people said, "Oh, is desktop search over? Is it dead? Is it going to go away?" And a lot of people assumed that it was. It's like, "Okay, well, now you have this thing in your pocket, you don't have to go to your laptop or your desktop computer to do the search any more." But what actually happened is that people started doing new types of searches via their mobile device. So they're asking about location or getting directions to somewhere that they can drive, things like that. And what happened is actually that desktop remained pretty stable, and then mobile grew on top of that. And it started to take up more and more of people's everyday lives, and added to that time spent online. And as we move towards voice, we're going to have more time spent online, but it's not going to be the same type of engaged screen experience that we're used to. And so there still very much will be desktop interactions, there will be mobile phone interactions, and voice is likely to grow on top of that with new intents, new situations where that's the better way to do things. Where you couldn't use a laptop device before, now you can interact via voice and stay connected.
Reid: Yeah. And so that's how people get to spend 36 hours a day on a device, something to that effect 'cause they've got many devices in front of them.
Ross: Have you been looking at my Screen Time report?
Reid: That's right, that's right. Well, it proactively was served to me by my virtual assistant.
Ron: Yeah, I'm waiting for the glasses to come out, 'cause I think that's really where that marriage is: your computer is talking to you in your ear, and your video, your monitor, is your glasses. So there again, it's yet another screen that we have to develop and think about.
Reid: Yeah. And it's just all enhancing. I think one of the things you were talking about earlier is that idea of thinking about an assistant, and some of that is going to change a lot of the ways this is done. Then layering AI and all this other stuff, it's all converging. I think back to when mobile was the thing people were talking about: "Oh, the mobile agency," or whoever would build the mobile site. And I think the way we were thinking about it back then was the web is mobile, so it's still just the web, and I think that's what we're saying here, is it's still content. So now it's still people want to do things, they're looking for things, they want to buy things, and this is a way to engage and interact, where AI is probably that one thing that's a little different than everything else, 'cause then it's getting proactive, it's changing a lot of stuff compared to the inputs or something like that.
Reid: This is such a fun conversation. I think there's a lot more to be had and I know that we're going to discuss more of that, 'cause we've got a forthcoming voice-first webinar. So if you want to register for that coming up, it's at reddoor.biz/learn, and while you're there, check out show notes from this episode. And as always, subscribe to this show, leave us a review on iTunes. But guys, thanks for joining us, Ross, Ron, really appreciate it, and I look forward to hearing more about this stuff in the halls of Red Door, as well as then on the webinar later. So, thanks guys.
Ron: I really appreciate it. Take care.
Ross: Great chatting with you.