Smart speakers have arrived, and they are changing not only how we interact with technology but also how we get things done. As everyday users increasingly rely on them and businesses quickly ramp up to use them, voice apps are the new frontier for app design. Have you wondered what it would take to design an app that runs on them?

People use their smart speakers for just about everything: listening to music, catching up on the news, checking the weather, and setting daily reminders. By 2021, the use of smart speakers is expected to top that of tablets.

And it goes beyond just smart speakers. With most phones having built-in voice capabilities, users are starting to use voice applications there as well.

Part of the reason behind this growth in voice usage is that voice is a ‘natural’ interface. In other words, we as humans learn to talk very early in life. On top of that, we generally prefer using voice to interact with others, as opposed to some form of text message. The interface can be as intuitive as a conversation. Voice, however, does come with its challenges…

A little context

The first smartphone apps, before the iPhone, were clunky, awkward, and difficult to use. Designers had not yet figured out how to properly stuff all the capabilities of computers and the internet into such a tiny device. It was like the first time we had to stuff an entire 4-person tent into a bag the size of a fanny pack. It was possible, but we just couldn’t make sense of it at the time.

Then came the next generation of mobile apps. They took the features that made their fellow web apps so revolutionary and useful and restyled them completely for the smartphone screen. Programmers acknowledged that smartphones required their own sense of design.

However, web and mobile apps have a lot of things in common. They function in the exact same ways. They have buttons, text fields, and dropdown menus, to name just a few of the things we take for granted. Therefore, it didn’t take an overload of imagination to reformat web apps to fit smartphones.

But Voice is Different

Voice apps are a brand-new form of technological engagement. Rather than just tweaking existing apps to better fit touchscreens and smartphone operating systems, voice apps require an entirely different way of thinking about user interfaces and how the app interacts with us. There are some real challenges to this.

For one, voice is a wholly temporal thing. It is not tangible, and it changes with the whims of human beings. We are masters of whining, yelling, and mumbling. Text and push buttons lack nuance, but they also make it far easier to just command things to do as they are told and to expect an exact result. There’s no emotion attached to the action.

Users can quickly become bored or impatient with voice apps. Currently, voice apps have a wide set of functions but not a whole lot of depth to them. In other words, they offer a pretty shallow experience in exchange for being highly functional and fast. Mobile and web apps, however, are designed to be highly engaging, with a depth that keeps users interacting with them longer.

Users can’t really speed through a voice app. You say your command, wait for the appropriate result, and then move on to the next thing. At the moment, smart speakers and their apps require processing time as they take in the request. Furthermore, voice apps can only process one or two commands at a time, so users cannot swiftly run through a vocal interface.

Voice apps are also anything but private. Vocalizing commands and hearing the app respond is a public action. Moreover, whispering tends to be ineffective, since the app will barely understand the user and then respond at full volume. Someone looking to work incognito will have a stressful time entering their social security number or searching for their favorite adult site.

Lastly, voice is pretty ineffective with things like lists, relationships between items, or anything that involves processing visuals like graphs and charts. It goes back to voice apps being a shallow experience with brief engagement. Something as simple as a list becomes incredibly difficult to follow without a visual interface to organize it.

Voice is Similar

The fundamentals of app interaction remain the same. What we call events in mobile and web apps are known as intents in voice apps. Only the way those actions are triggered has changed. Clicking and tapping are intuitive ways to interact with an app’s features, and the same is more or less true of vocal intents in voice apps.
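To make the parallel concrete, here is a minimal sketch of an event and an intent triggering the same action. Everything in it is hypothetical and for illustration only: the handler, the event name, the intent name, and the exact-match phrase lookup are all assumptions. Real voice platforms resolve speech to intents with language understanding models rather than a lookup table.

```python
from typing import Callable, Dict

# Hypothetical action shared by both interfaces.
def check_weather() -> str:
    return "It is sunny today."

# Web/mobile style: a UI event (such as a button tap) maps to a handler.
EVENT_HANDLERS: Dict[str, Callable[[], str]] = {
    "weather_button_tapped": check_weather,
}

# Voice style: several spoken phrasings resolve to one named intent...
UTTERANCE_TO_INTENT: Dict[str, str] = {
    "what is the weather": "GetWeatherIntent",
    "how is the weather today": "GetWeatherIntent",
}

# ...and each intent maps to a handler, just like an event does.
INTENT_HANDLERS: Dict[str, Callable[[], str]] = {
    "GetWeatherIntent": check_weather,
}

def handle_event(event_name: str) -> str:
    """Run the handler registered for a UI event."""
    return EVENT_HANDLERS[event_name]()

def handle_utterance(utterance: str) -> str:
    """Resolve a spoken phrase to an intent, then run its handler."""
    # Simplification: lowercase and trim punctuation, then exact-match.
    normalized = utterance.lower().strip(" ?!.")
    intent = UTTERANCE_TO_INTENT.get(normalized)
    if intent is None:
        return "Sorry, I did not catch that."
    return INTENT_HANDLERS[intent]()
```

Tapping the button and asking "What is the weather?" both end up in `check_weather`. The only extra step on the voice side is mapping many possible phrasings onto a single intent.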

At the end of the day, all apps are simply designed to help a user complete an action of some sort. The only difference is in how those actions are instigated and the depth of engagement each app has. In this regard, voice apps are incredibly efficient. Vocal commands are effortless, and when it comes to simple tasks, voice apps are incredibly fast and intuitive.

Seen this way, the shallow engagement is actually a strength. Voice apps have to remain quick and efficient to please users. Holding their attention for extended periods is a waste of time and potential. Voice app designers are thus incentivized to make voice apps even better at accomplishing specific tasks as efficiently as possible.

Web, mobile, and voice apps still operate within the fundamentals of basic app design. They all need to work with intuitive nesting. In web and mobile apps, this takes the form of menus, submenus, tabs, and accordions. The organization of the app’s functions remains entirely the same with voice apps. What changes is how the user navigates through them.
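As a rough illustration of that nesting, the sketch below models an app’s functions as a small tree and walks it with spoken choices instead of taps. The menu contents, the prompts, and the function names are all made up for this example; it is one possible way to picture the idea, not how any particular voice platform works.

```python
from typing import Any, Dict, List

# Hypothetical nested menu: inner dicts are submenus, strings are results.
MENU: Dict[str, Any] = {
    "music": {"jazz": "Playing jazz.", "rock": "Playing rock."},
    "news": {"sports": "Here are the sports headlines."},
}

def prompt_for(node: Dict[str, Any]) -> str:
    """Speak the options at the current level, one level at a time."""
    return "You can say: " + ", ".join(node) + "."

def navigate(menu: Dict[str, Any], spoken_choices: List[str]) -> str:
    """Follow a sequence of spoken choices down the menu tree."""
    node: Any = menu
    for choice in spoken_choices:
        # A result has no sub-options, and unknown words are rejected.
        if not isinstance(node, dict) or choice not in node:
            return "Sorry, that is not an option."
        node = node[choice]
    # Stopping on a submenu prompts for the next choice;
    # stopping on a leaf returns the result itself.
    return prompt_for(node) if isinstance(node, dict) else node
```

A screen would show every level of this tree at once. The voice version exposes the same structure, but only one level per turn, so `navigate(MENU, ["music"])` answers with the options under music rather than with the whole menu.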

Additionally, they gather, process, and provide data in the same ways. For an app to properly function and evolve, it needs to collect data as it is used. This function remains unchanged regardless of how users interact with the app, be it voice commands or button-pushing.

To create a truly great app on any platform, be it smart speakers or smartphones, designers need to work closely with developers. Voice apps are an exciting new frontier because they will force creators to rethink how users and technology interact. Not only must intents and results be quick and easy, but assistants such as Alexa and Google Assistant need to be able to respond in intuitive and natural ways. Apps must now be capable of real conversation to reach their full potential.


While the differences between voice apps and traditional web and mobile apps are profound, don’t let this deter you from developing your own. You will certainly have to design it outside the framework of a web app. However, the fundamentals remain the same. If anything, the voice app simplifies the goals of a developer. The most important thing is for the voice app to accomplish its tasks quickly and easily; long-term engagement is no longer a primary motivator.

It is important to get started now on learning how to build a viable voice app. The technology is improving rapidly and changing as it does. Getting in early gives you the opportunity both to learn it and eventually to shape how it grows.

There are multiple ways to start. If you are not technically inclined or are new to app development, Alexa Skill Blueprints is an excellent place to begin. If you are technically experienced, then Violet, an open source application framework for building voice apps, will suit you just fine.

Voice apps are fast becoming the dominant way in which we interact with our smart devices. With smart speakers well on the way to overtaking tablets and companies working to make smartphones more voice-enabled, now is as good a time as ever to get in on the action. If you have any takes on this exciting new technology, we would love to hear your thoughts!
