CONVERSATIONAL INTERFACES – Speech Recognition Technology

Conversations are a part of our lives. But can computers converse too? Over the last few decades, computers and humans have begun to interact with each other through speech recognition and speech synthesis technology. These interactions take place through what are known as ‘Conversational Interfaces (CIs)’. In this article, we’ll take a look at how they evolved, their types, and the latest ones in use today.


Humans and computers first interacted with each other via the ‘terminal interface’, using the command line or DOS prompt. But because most users did not know the command syntax, it never became a mainstream way of interacting. Next came the ‘Graphical User Interface’, which used visual images, files and actions to make interaction easier. But GUIs still rely on abstractions that users have to learn. What has now emerged is the ‘Conversational Interface’. As the name suggests, humans interact with machines using natural language, which makes the machines seem more perceptive, proficient and accessible.

Conversational Interfaces (CIs)

CIs are simply user interfaces that mimic conversing with a human. Two types of interfaces exist: voice assistants and chatbots. Voice assistants such as Apple’s Siri, Microsoft’s Cortana, Google Now, and Amazon Echo (Alexa) have taken the world by storm. Amazon’s Echo can dim lights, play music, order a pizza, and much more. One can do a search or listen to a song by simply speaking. As these assistants converse with a human as naturally as possible, they seem more personal. Chatbots such as Facebook’s M, Microsoft’s Tay, Slack’s Slackbot, and Howdy for Slack are available for tasks such as booking reservations and taking orders.

These interfaces are available at all times and make it easy to curate and share information. Other activities such as shopping online or ordering a cab ride are also becoming easier, e.g. with the virtual travel agent Pana or the online shopping application Operator. CIs are especially advantageous as they work on laptops, smartphones, smartwatches etc. Further, they can be integrated with other platforms such as Snapchat, Twitter, Facebook etc. Other examples of speech recognition technology come from China: the search engine Baidu, with its assistant DuEr and its latest system Deep Speech 2, and the messaging app WeChat. Baidu handles queries about weather, pollution levels etc.

Further, chatbots can be classified in two ways: as artificial intelligence based or rule-based, and as text based or voice controlled. A third category, pseudo chatbots, includes Microsoft’s Clippy and Quartz’s text messaging application. These look like a chatbot, but cannot converse as well as one. Websites such as Adrian Zumbrunnen’s can also be referred to as CIs.
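To make the rule-based category concrete, here is a minimal sketch in Python. It is not how any of the products named above work. The rules, the replies and the function name `reply` are all hypothetical, chosen only to illustrate the idea: the bot scans the message for known keywords and returns a canned answer, falling back to a default when nothing matches.

```python
import re

# Hypothetical rules for illustration: each pairs a keyword pattern with a canned reply.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.IGNORECASE),
     "Hello! How can I help you today?"),
    (re.compile(r"\b(book|reserve|reservation)\b", re.IGNORECASE),
     "Sure. For what date and how many people?"),
    (re.compile(r"\b(order|pizza)\b", re.IGNORECASE),
     "What would you like to order?"),
]

# Reply used when no rule matches the message.
FALLBACK = "Sorry, I did not understand that. Could you rephrase?"


def reply(message: str) -> str:
    """Return the reply of the first rule whose pattern occurs in the message."""
    for pattern, response in RULES:
        if pattern.search(message):
            return response
    return FALLBACK


if __name__ == "__main__":
    for text in ["Hello there", "I want to book a table", "What is the weather?"]:
        print(f"user: {text}")
        print(f"bot:  {reply(text)}")
```

A bot like this can only answer what its rules anticipate, which is the main difference from an AI-based chatbot that tries to infer intent from language it has not seen before.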

CIs are also making a mark in businesses. Take real estate, for example. Users can search for houses on a website while a chatbot gathers the required information from them and provides relevant answers. CIs rely mostly on text (words) and content, and keep the conversation flowing with the user. An important aspect to keep in mind is that the flow of information should be clear. The user must be able to follow the conversation and confirm their understanding. The conversation must maintain continuity, be as natural as possible, and focus on personalization. Animation also helps chatbots increase user satisfaction.

The Explosion

CIs have been around for years, but are only now catching on in the digital space. They are a sweeping departure from GUIs as they use natural language rather than visual controls, which can give a better experience. The mode of interaction in CIs is essentially a conversation: written (Facebook M), voice based (Amazon Alexa), or hybrid (Siri/Cortana, as the response could be voice or text based). CIs also tend to offer choices from which the user can select a reply. Adventure games were an early form of CI, but as they were hard to learn, GUIs took over, and GUIs in turn are now giving way to the text interactions of CIs. Mobile connectivity, IoT devices, social networking platforms, messaging apps and cloud-based AI-powered applications are all driving the rise of CIs. But of course, businesses need to understand their vision and strategy, their presence in social media, their underlying processes and metrics, and the support required before they jump on the bandwagon to create CIs.

The current explosion in CIs around the world is due to competition and innovation. Businesses need to stay ahead of the competition, and CIs like chatbots make instant help and information gathering possible. CIs also try to understand what is being said and guess the needs of the consumer. But people don’t always speak in a straightforward manner, and that can cause a mismatch between the question asked and the answer given. Normalizer is one app that tackles short forms, slang etc. to a certain extent.
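The idea of normalizing short forms and slang before interpreting a message can be sketched as follows. This is not the Normalizer app’s actual method, which the article does not describe. It is a simple dictionary lookup, and the table `EXPANSIONS` and the function `normalize` are hypothetical names used only for illustration.

```python
import re

# Hypothetical lookup table; a real system would need a far larger, curated list.
EXPANSIONS = {
    "u": "you",
    "r": "are",
    "pls": "please",
    "thx": "thanks",
    "tmrw": "tomorrow",
    "gonna": "going to",
}


def normalize(message: str) -> str:
    """Lowercase the message, drop punctuation, and expand known short forms."""
    # Keep only runs of letters, digits and apostrophes as words.
    words = re.findall(r"[a-z0-9']+", message.lower())
    # Replace each word with its expansion if one is known, else keep it as is.
    return " ".join(EXPANSIONS.get(word, word) for word in words)


if __name__ == "__main__":
    print(normalize("Pls book a cab tmrw, thx"))
```

A lookup like this handles only the short forms it has been given, which matches the point above that such tools address the problem to a certain extent.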

CIs must understand natural language and respond equally naturally. Other popular chatbots include those of US-based Nordstrom (shopping), KLM, which shares flight information via Facebook Messenger, and Taco Bell, which allows ordering of tacos via Slack. The messaging application Telegram provides buttons for shortcuts or specific actions. Google is going a step further with CIs, looking at interaction at a personal level. As Sundar Pichai said, “We are evolving search to be much more assistive [and] want users to have a two-way ongoing dialogue with Google to help get things done in the real world. We think of this as building each user their own individual Google.” (MIT Technology Review).

The Benefits

CIs offer benefits such as immediacy (responses are quick and tailor-made), ubiquity (CIs are available at all times by talking or typing), authenticity (brands provide answers which are as personalized as can be), and buzz (brands can create a story while reaching out to a larger segment of the population). All these tend to increase loyalty and customer satisfaction for businesses.

CIs are expected to be of great help to the visually impaired around the world, and hence should be seen as a very capable medium of communication.

The natural language interface is maturing, and brands and businesses that make use of CIs can stay ahead in the race. With time, CIs will learn the user’s likes and dislikes, routines and schedules. They will eventually be an extension of ourselves.

Sharon Christine


Updated on: 24-Jan-2020

