Talk to Your Digital Signs Using a Voice User Interface

Organizations are constantly upgrading their communications offerings, trying to anticipate what information their audience wants and how they want to access it. Interactive touchscreens were a huge step forward, and now there’s a new way to interact with digital signs that’s completely hands-free – the voice user interface.

Put simply, a voice user interface, or VUI, allows spoken human interaction with computers. It pairs a voice command device with speech recognition software to understand commands and trigger words, then performs the requested actions.

The first attempts at VUI were made at Bell Labs back in 1952, where researchers created a digit recognizer called Audrey that used pattern matching to recognize the numbers 0-9. Interactive voice response (IVR) arrived for telephony in the following decades, and companies like SpeechWorks and Nuance kicked off the voice menu revolution for businesses.

Google added voice-enabled search to its mobile app in 2008, Apple premiered Siri in 2011, Amazon started selling the Amazon Echo smart speaker, which integrates with its Alexa virtual assistant, in 2014, Microsoft came out with Cortana that same year, and Google unveiled Google Assistant in 2016. Today, VUI smart assistants are being coupled with AI technology to offer a wider range of tasks that can be done simply by talking to the device.

Smart speakers have become a massive new market, with over one-third of US consumers owning one or more units. And the market grows every year as more individuals use them and more organizations see the benefits they bring to the table. Today you can find VUIs in eCommerce, at office workstations, in cars and in hospitality venues. And as more people are working at least some of the time from home, adoption of smart speakers and virtual assistants is likely to increase even faster. They save effort and time, and people are getting better at using them.

Clearly, VUI is here to stay. In fact, it’s starting to become commonplace. It’s now possible to add VUI technologies to digital signs (even meeting room signs), transforming the way audiences interact with information in far-reaching ways. It improves the customer experience by mimicking the at-home experience, and provides an interactive option without the expense of touchscreen displays.

And of course, there’s no touching the screen. No fingerprints, no smudges and no need to clean it constantly. Especially these days (July 2020), when social distancing is on people’s minds and might even be an internal policy for your organization, hands-free interaction is very much in sync with the times. But even in what we might call “normal” times, germ transmission is something to keep in mind, especially at healthcare facilities and hospitals. No touching, no germs.

You may already have interactive touchscreens in your facilities. If so, you’ve seen how they completely change the user experience and allow the organization to offer a large amount of relevant info on limited screen real estate. If you don’t have interactive displays, you may still have a certain amount of interactivity by including things like QR codes and URLs that direct people to more details than a short digital signage message can display. VUI heralds the next step in this progression, by allowing full interactivity on any screen, touchscreen or static, without touching any device at all.

When adding VUI to digital signage, you currently have two systems to choose from. One is to use a cloud-connected VUI that relies on Natural Language Processing (NLP) to understand what’s being asked. NLP is a branch of AI research, and it “learns” different ways to ask for the same thing. The more it’s used, the more accurate it becomes at doing what it’s asked to do. So, if someone wants to see the local weather for Friday, they can ask in numerous ways, like “What’s the weather going to be on Friday?” or “Is it going to rain on Friday?” or “What’s Friday’s forecast?” or any number of other phrases that amount to the same request. This is what home assistants use to parse a virtually unlimited number and variety of requests.
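To make the idea concrete, here’s a toy sketch of many phrasings resolving to one intent. Real NLP systems run in the cloud and learn statistically; this simplified matcher (function and intent names are purely illustrative) just normalizes text and checks for keyword patterns:

```python
# Hypothetical sketch: several different phrasings all resolve to the
# same intent, which is what an NLP-backed VUI does at much larger scale.

def match_intent(utterance):
    """Map a free-form spoken request to a single intent name, or None."""
    text = utterance.lower()
    # Any Friday-plus-weather-word phrasing maps to one "weather_friday" intent.
    if "friday" in text and any(w in text for w in ("weather", "rain", "forecast")):
        return "weather_friday"
    return None

# All of these phrasings amount to the same request.
for phrase in (
    "What's the weather going to be on Friday?",
    "Is it going to rain on Friday?",
    "What's Friday's forecast?",
):
    assert match_intent(phrase) == "weather_friday"
```

A cloud NLP service replaces the hand-written keyword checks with a learned model, which is why it keeps improving the more it’s used.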

The other method of adding VUI to digital signage is to use a native Voice Recognizer widget in the content management software that works with the built-in speech recognition engine in the media player OS. It can identify specific pre-coded keywords and phrases that trigger specific messages on screen. The advantage this has over an NLP system is that it’s self-contained and doesn’t even need an internet connection. The constraint is that a Voice Recognizer only identifies keywords, or combinations of keywords, that have been entered into the widget. Although that’s a limitation, it’s usually the more practical solution if you want your digital signage limited to a set of content you’ve provided rather than open to browsing the web.
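A keyword-based recognizer like this is, at heart, a lookup table. The sketch below is not the actual widget’s API – the trigger phrases and message names are invented for illustration – but it shows the self-contained mapping from pre-coded keyword combinations to on-screen messages:

```python
# Illustrative sketch of a keyword-style Voice Recognizer: pre-coded
# keyword combinations trigger specific messages, with no cloud or NLP.

# Each entry: a tuple of keywords that must ALL appear, and the message to show.
TRIGGERS = {
    ("cafeteria", "menu"): "show_cafeteria_menu",
    ("visiting", "hours"): "show_visiting_hours",
    ("directions",): "show_wayfinding_map",
}

def recognize(transcript):
    """Return the message to display, or None if no trigger matches."""
    words = transcript.lower()
    for keywords, message in TRIGGERS.items():
        if all(k in words for k in keywords):
            return message
    return None  # nothing pre-coded for this request, so the sign does nothing
```

Note the fallback: a request that doesn’t match any entered keywords simply returns nothing, which is exactly why this approach keeps the sign limited to content you’ve provided.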

In both cases, you’ll need a microphone to pair with your displays. You can work with either your own IT department or a local integrator to choose the right one for the physical space and noise level. Most suitable mics use a simple USB plug, so installation is straightforward. You’ll also need a wake command, or trigger, to start using the VUI capability. Much like at home, you might say “Hey, Alexa” to start your smart speaker’s operation; “Hey, Alexa” is the wake command. Without one, the screens wouldn’t know when to display anything, or would end up displaying all kinds of messages as the system responded to every conversation near the display.
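The wake command works as a gate on the audio stream. This toy sketch (the wake phrase and function names are hypothetical, and a real system works on audio rather than ready-made transcripts) shows the idea: everything is ignored until the wake phrase is heard, and only the next utterance is treated as a command:

```python
# Toy sketch of wake-word gating over a stream of speech-to-text transcripts.

WAKE_PHRASE = "hey sign"  # illustrative wake command, like "Hey, Alexa" at home

def gate(transcripts):
    """Yield only the utterances that immediately follow the wake phrase."""
    awake = False
    for utterance in transcripts:
        text = utterance.lower().strip()
        if awake:
            yield text       # this utterance is a command for the display
            awake = False    # then go back to listening for the wake phrase
        elif WAKE_PHRASE in text:
            awake = True     # wake command heard; the next utterance is a command

# Hallway chatter is ignored; only the post-wake utterance gets through.
stream = ["random hallway chatter", "hey sign", "show the lobby directory"]
assert list(gate(stream)) == ["show the lobby directory"]
```

This is also why the sign isn’t “recording everything” in any meaningful sense – nothing before the wake phrase triggers any action.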

Does this mean that the sign is always listening? Yes, in a way. It’s always listening for that wake command, which instructs it to activate the voice interface. It is not, however, recording everything around it, nor can it understand human conversation. A VUI system is just another way to interact with the display.

If there’s no content configured to be displayed for a request, then the digital sign simply does nothing. It’s not a web browser, nor a smartphone filled with apps. Asking the digital sign for movie times for the latest Marvel blockbuster won’t work if that information, or a path to it, isn’t already present in the set of messages mapped to keywords and phrases. The organization still controls what information is available to be interacted with.

And it’s that interaction that’s revolutionary. In her song “Language Is a Virus”, avant-garde artist Laurie Anderson said, “Paradise is exactly like where you are right now, only much, much better.” This is not just a gratuitous cultural reference – it’s a statement of where we are and where we’re heading. With VUI technology, users have yet more ways to interact with information that’s valuable to them.

Consider ADA guidelines, which require displays to be mounted at a height that people both with and without wheelchairs can comfortably reach. If you can’t meet those guidelines, or you’re working with screens already installed at a fixed height, a voice user interface can give equal access to everyone, provided the microphone is calibrated correctly. And while most facilities in the US will probably use English, VUI can work in any language as long as the trigger words and content include those language choices. And the cloud database of languages continues to grow and improve with time, so Natural Language Processing certainly supports this option.

We’ll soon see better and better voice user interfaces and language processing, allowing a greater range of accents and vocal styles to seamlessly interact with digital signs. We may even, in the not-too-distant future, be able to add gesture recognition to displays (with a camera, of course), further increasing the ways people can interact with content. Just as our children find the idea of changing channels without a remote control nearly incomprehensible, our grandchildren may be asking us one day, disbelievingly, “Is it true that you used to have to touch screens to get them to work?!”