Design

Speech is the next UX

“We’ve seen more progress in this technology in the last 30 months than we saw in the last 30 years. Ultimately vocal computing is replacing the traditional graphical user interface.” – Shawn DuBravac

Interface design enables humans to experience and interact with technology. A Voice User Interface (VUI) extends this to speech: users speak to a device, which in turn understands and acts upon their commands.

Voice user interface: the next-gen of UX

Augmenting human intelligence is harder than it looks. The difficulty of mimicking human cognition with software is showing Artificial Intelligence researchers that there’s more than one way to be “intelligent”. The rise of voice can be credited mainly to advances in AI and cloud computing. With machine learning and natural language processing, technology can now interpret human speech more accurately and in real time, while also learning individual users’ speech tendencies.

This hands-free mode of interaction is rapidly gaining traction. More convenient and human-like, VUI is becoming the next generation of human-computer interaction. From asking Siri to book a doctor’s appointment for next Monday to asking Alexa to play your favourite show on Amazon Prime, issuing voice commands has become increasingly natural for users.

At the Google I/O 2018 event, CEO Sundar Pichai demoed Google Duplex, an AI assistant calling a local business to make an appointment. The eerily lifelike phone call triggered a wave of intrigue and laughter in the 7,000-strong audience.


Designing a Voice User Interface

Accurate natural language processing has until now existed only in the realm of science fiction. Voice represents a new pinnacle of intuitive interfaces, democratizing the use of technology. The technology is still in its nascent stages and probably not the ultimate incarnation of the medium, yet it is currently a strong favourite.

For web and application designers, voice interaction is perhaps the biggest UX challenge since the dawn of the touchscreen age. Every voice recognition platform has a unique set of technological constraints, and it is essential to embrace them when architecting a voice interaction UX.

The basic voice UX flow

UX has always aimed to make interactions as close to the real world as possible, and voice has the potential to make that a reality. UX designers must ask the right questions to elicit the appropriate verbal responses from users. Gender, age, inflexion, tone, accent, cadence and pace are all elements designers can use to craft a particular customer experience with their brand.


A typical speech-recognition flow runs in stages: the device captures audio, a speech-to-text engine transcribes it, a natural-language-processing layer extracts the user’s intent, the application executes the matching action, and a text-to-speech engine speaks the response.
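As a rough illustration of that capture → transcribe → parse → respond pipeline, here is a minimal sketch in Python. Everything in it is hypothetical: real systems hand the transcription and intent-parsing steps to trained models or cloud services, while this toy version stubs them with keyword matching.

```python
# A minimal, illustrative VUI pipeline. The transcribe and parse steps
# are stand-ins for real speech-to-text and NLP services.

def transcribe(audio: str) -> str:
    # Stand-in for a speech-to-text engine; here we pretend the "audio"
    # is already a transcript and just normalize it.
    return audio.lower().strip()

def parse_intent(utterance: str) -> dict:
    # A toy NLP layer: keyword matching in place of a trained model.
    if "appointment" in utterance:
        return {"intent": "book_appointment"}
    if "play" in utterance:
        return {"intent": "play_media"}
    return {"intent": "unknown"}

def execute(intent: dict) -> str:
    # Map each recognized intent to a spoken response.
    responses = {
        "book_appointment": "Okay, booking your appointment for Monday.",
        "play_media": "Playing your favourite show.",
    }
    return responses.get(intent["intent"], "Sorry, I didn't catch that.")

def handle(audio: str) -> str:
    # capture -> transcribe -> parse -> execute -> respond
    return execute(parse_intent(transcribe(audio)))
```

The value of the sketch is the shape, not the stubs: each stage has a single responsibility, so any one of them can be swapped for a production service without touching the others.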

Rather than reading out every result, a more viable approach is to prioritize and summarize the information based on known user preferences before delivering an answer – in other words, to do what a person would naturally do in conversation.
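As a hypothetical sketch of that idea, the assistant could rank results by how well they match stored preferences and speak only the top few. The data shape, tags, and function name here are all illustrative assumptions, not any platform’s actual API.

```python
# Sketch: rank results by overlap with known user preferences,
# then summarise only the top few instead of reading out the full list.

def summarise(results, preferences, top_n=3):
    # Score each result by how many preferred tags it matches.
    scored = sorted(
        results,
        key=lambda r: len(set(r["tags"]) & set(preferences)),
        reverse=True,
    )
    names = [r["name"] for r in scored[:top_n]]
    return "The best matches are: " + ", ".join(names) + "."
```

For example, a user whose profile favours Thai food would hear the Thai restaurant first, rather than an alphabetical recital of every option.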

More complex queries fail even more often, and brands can rarely afford unpleasant interactions. With this in mind, error messages should be crafted to be less annoying while getting users back on track and presenting additional options.
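One common way to put that into practice is an escalating reprompt strategy: the first failure gets a short clarifying question, a repeat failure gets an example of what to say, and further failures offer explicit options instead of a dead end. The prompt wording and function name below are purely illustrative.

```python
# Illustrative error recovery: escalate from a short reprompt to an
# example, and finally to explicit options, rather than failing flat.

REPROMPTS = [
    "Sorry, which restaurant did you mean?",
    "You can say a cuisine, like 'Italian' or 'Thai'.",
    "I can list nearby options, or you can start over. Which would you like?",
]

def reprompt(failure_count: int) -> str:
    # Clamp the index so repeated failures keep offering
    # the most helpful message instead of crashing.
    index = min(failure_count, len(REPROMPTS) - 1)
    return REPROMPTS[index]
```

Each retry gives the user more help than the last, which is exactly the "back on track, with additional options" behaviour described above.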

Can we expect a ‘humane’ VUI?

In this age of expected instant gratification, it’s hard to imagine an average user patiently listening to their AI assistant as it narrates a laundry list of all continental restaurants one by one. We want our voice interactions to be as immediate as human alternatives.

VUIs are complex, multifaceted, and often hybrid forms of interaction. Voice may not have garnered the same fanfare as the touchscreen just yet, but for the time being, multimodal interfaces can pave the way toward an all-voice-controlled future.

Will VUIs eventually become our primary means of interaction?

Let us know your views in the comments.

Fun fact

Celebrities are likely to find a brand new income stream from licensing not just their voices, but entire personalities as AI assistants. Sounds ridiculous? It does, but you can already pay about $10 to make your TomTom GPS nav unit speak like Snoop Dogg. Go for it!

By Tuhina Chattopadhyay

