After experiencing all of the varied communication systems – visual, audio, written, spoken, tactile, light-emitted, emotionally responsive – embedded into the Toyota Concept-i unveiled at CES this year, we came to the conclusion that we may have entered an automotive Age of Babel. As we spend more time in our cars, and as we become accustomed to always having fingertip access to the world's fount of knowledge (and crowd-sourced falafel recommendations), expectations for what our vehicles should be able to do have outpaced our capacity to fluidly and safely process them.
"I think in cars, driver augmentation is certainly more practical than self-driving in the short term." – Bret Greenstein, IBM Watson
We're not the only ones who feel this way. General Motors, Ford, and BMW agree that new modes of information organization are required. That's why they're partnering with IBM, among others, to work on solutions based on artificial intelligence (AI) systems that can interpret and respond to natural voice-based commands. IBM has deep roots in this field, having worked on its proprietary system, Watson, since early in this century. (Remember when it competed, and won, on Jeopardy!?)
Why is this the way to go? Because, hype and hyperbole aside, cars can't drive themselves, and they probably won't be able to in any meaningful way for some time. And natural speech processing is probably the only input and output mechanism that affords the opportunity to process the quantity of information, and offer the kind of assistance, that occupants currently desire – without looking away from the road.
Bret Greenstein, IBM's vice president for Watson Internet of Things, and 28-year veteran of the company, concurs. "I think in cars, driver augmentation is certainly more practical than self-driving in the short term," he says. "Though I think it's inevitable that there will be a moment when it is safer for a computer to drive than a human."
In the nebulous interim before that future arrives, how will a natural voice-activated driver augmentation system work? The basic idea is to teach (and keep teaching) a computer how to recognize meaning in the spoken word. That means more than the machine's capacity to comprehend and process a person's individual words, and what they represent when strung together. It also involves providing a means for the AI to recognize and respond to deeper context.
"Most people focus on the tech, and that misses the point," Greenstein says. "AI is really much more about understanding people's intent, goals, and objectives, which requires not just listening for commands for the car to do this and that."
"The goal is to make Watson comfortable, informative, and not invasive. But part of that is deciding what your purpose is." – Greenstein
Greenstein points to his lab's work on the deciphering of higher-level speech-embedded meaning – like that affiliated with personality, tone, style, and emotion – and tailoring an individualized AI interface that takes all of this into account. "We have a service called Watson Tone Analyzer that's based on choice of words," he says. "Some people are assertive or aggressive in their tone, others are more laid-back, and they want a user experience based on that."
Similarly, Watson has another service called Personality Insights, which can decode and respond to your word choice, the speed or volume of your voice, or your level of intensity.
"The goal is to make Watson comfortable, informative, and not invasive. But part of that is deciding what your purpose is," Greenstein says. "If you're a security alarm, you have to be urgent and quick. In other settings, you want something calmer."
And herein lies the greatest challenge of a system like this. Automotive designers have been working in the interface paradigm of buttons, screens, and touch for decades. The design of voice-based systems does not necessarily draw on the same skills. "There's an elegance to a properly designed voice interface, an efficiency of feedback," Greenstein says of the softer, more anthropological science of dialogue. "Not long drawn-out responses, but the ability to provide alerts and notifications without being intrusive, while being informative at the right time."
Expect to see Watson rolling into dealerships, or "mobility" services, very soon. "With Local Motors, they have certain commercial deals that they're working on, certain products, so we'll see it in Local Motors this year," Greenstein says. "We have a formal relationship with BMW for dealing with a cognitive assistant in a vehicle. And [GM CEO] Mary Barra announced that Watson will be in GM vehicles with OnStar in 2017, where it will act in addition to the OnStar operator."
Like many current infotainment systems, the cognitive assistant will operate via cellular data services. But the cars that feature it will include multiple antennas. Greenstein says that the upcoming BMW integration he mentioned will feature six or seven different cell radios, necessary since most of the data exists in the cloud.
"As coverage gets better, but not perfect, there are still gaps. But just like GPS systems, there's some data that's cached and useful. And there are other things where you must be live, so it's inconvenient to be out of range," Greenstein says. Simple tasks like using a voice command to change your radio station can be accomplished without cell access. But most of the systems' value comes from context and other data. Since advanced driver assistance features now rolling out require constant connectivity to function properly, Greenstein feels confident that any gaps will soon be filled.
For Greenstein, these challenges – and those of information overload – feel surmountable. That confidence rests in part on his decades of work at a tech-forward company like IBM, and on his resulting understanding that our ability to relate to technology is contextual and subject to change.
"We all have a different tolerance for this stuff," he says. "And we're all adjusting. Two years ago, or four years ago, you looked like an idiot talking to your house to try and turn on the lights. But now, it's fine."