The following article was adapted from the thesis work of Jennifer Spatz. Jen holds a Master of Industrial Design from Rhode Island School of Design. For more on her ideas, process, and final deliverables, check out her portfolio.
Voice user interfaces excite us because they give us a glimpse into what the future might sound like. They require us to activate senses that often remain dormant in our day-to-day interactions with technology. Unlike the graphic user interfaces of our phone and computer screens, a VUI ever so politely commands the use of our speech and hearing in order to interact.
At present, most voice user interfaces use a conversational, often casual approach that mimics human interaction. Ask a question and “she” will respond by reading off information that the user will (hopefully) find useful. Sometimes Alexa is diplomatic, and sometimes Siri has an attitude. The mainstream VUI hopes to resemble a useful and efficient human who assists diligently in retrieving information and performing technology-based tasks such as playing a song or making a list.
It is important to note that although it seems easy and even novel to interact with a virtual human at your beck and call, it isn’t necessarily the most efficient approach. With human conversation, there is excess. Although the excess brings us closer to one another and helps us to connect, for a VUI, it is essentially theater. And at the end of the day, we are more likely to bond with a coffee table book or feel connected to a teddy bear than to a VUI that mimics human conversation, even with its technological capabilities.
Scoping the Problem
How, then, might we interact with a VUI with the level of familiarity and comfort we experience when clicking, tapping and scrolling through our screens? Graphic user interface design relies on entirely different criteria from voice user interface design, but the ways in which humans relate to technology remain conceptually similar across both interfaces. GUI design once mimicked real-world objects in much the same way that VUI design now mimics human-to-human interaction. Titles in iBooks, for example, were originally stored on what appeared to be a wooden bookshelf, just waiting to be dusted off and read in the archetypal overstuffed armchair.
However, as GUI design advanced, so did the pseudo-object representations. They were stripped of their excess and novelty and streamlined. The result is graphic user interfaces more confident in their own essence without attempting to obviously mimic real-world experience. There is elegance and authenticity found in technology for the sake of technology itself. And as users, we feel more at ease with a design that does not attempt to mirror our lives in a virtual reality. Rather, we appreciate its function and efficiency but have come to understand its use through the unmet needs of our real-world experience.
Observing strict conversational formalities when interacting with voice user interfaces creates unnecessary obstacles in accessing information.
How might voice user interfaces follow? The implications of advancement in VUI design are enormous for both the personal and professional lives of non-sighted individuals. Furthermore, voice user interfaces may change the way in which we interact with technology. The human senses associated with technology use have the potential to synthesize in such a way that one may optimize the other, a technological integration that remains surprisingly untapped.
The blind and visually impaired are often entirely excluded from mainstream graphic user experience. Screen readers, such as VoiceOver, do not replace the technologies used to consume digital content visually. They remain a supplemental tool at present, secondary to the visual content which they translate. Screen readers are effective in their ability to eliminate the excess and theater of a more novel and conversational VUI, but they aim to reproduce, not reimagine, the way in which we interact with sound itself.
Possibilities and Capabilities
One possibility for the future of VUI is a more gestural and non-conversational approach to sound design. By using gestural sounds to communicate effectively and non-graphically, sound design could move away from the frustrations and inefficiency of mimicked speech and toward something more elegant and less verbally skeuomorphic. Much in the same way that GUI design has advanced, VUI design has the potential to deliver content to us without the excess of speech. In turn, this could allow us to rely more on voice user interfaces and open up a world of possibility for both sighted and non-sighted individuals to work and go about their daily lives with more ease and efficiency.
Our brains often respond more strongly to monosyllabic, natural and gestural noises such as a swish or a series of gentle taps that help us navigate large amounts of information. These subtle, non-speech indicators can limit the frustrations of conversational speech that already saturates our daily lives. There even exists the possibility of navigating through the use of symbolic sounds that both lead and alert, rather than simply dictate directions. Gestural, non-speech sound design eliminates a lot of unnecessary confusion between the interface and user, resulting in a more efficient means of communication.
Ideas can be distilled into short groupings of words that carry meaning even though they don’t sound like conventional language.
We are used to triggering monosyllabic sounds by turning something on or off, or by sending an email. But what remains dormant is our ability to use technology through interaction with this type of sound. The possibilities are endless. Present-day VUI design is oriented around human-centric interaction that mimics the world we already know. If voice user interfaces parallel the advancement of GUI, perhaps the future will reveal a more elegant and less obvious discourse with sound design.
To orient itself toward the future, VUI design must consider that the ears and the voice are well equipped to perform what we perceive to be the work of the eyes. We must reimagine VUI design so that it embodies the full spectrum of human capabilities, for sighted and non-sighted individuals, rather than assuming a supplemental existence to the limitations of what we achieve only on the screen.