Oct 07, 2021 By Team YoungWonks *
What is an Intelligent Virtual Assistant? When we talk about smartphones or even smart homes, the discussion typically turns to Amazon’s Alexa, Apple’s Siri, Microsoft’s Cortana or Google Assistant, all of which are examples of Intelligent Virtual Assistants (IVAs). In this blog post, we shall take a close look at what the term Intelligent Virtual Assistant means, how these assistants work and their varied applications today.
What is an Intelligent Virtual Assistant?
An Intelligent Virtual Assistant (IVA) is essentially a software agent that can carry out tasks or offer services for users in response to commands and/or questions. There are different types of intelligent virtual assistants today: for instance, chatbots are virtual assistants that are accessed mainly or exclusively through online chat. Then there are some that can interpret human speech and respond through synthesized, human-like voices; here people ask their assistants (think Alexa, Siri, Google Assistant) questions and use them to control their home automation devices (which have been synced with them) and also to perform basic tasks such as creating to-do lists, grocery lists and even calendar entries with verbal commands. Media playback can be controlled by voice as well.
So basically, an Intelligent Virtual Assistant - also known as an Intelligent Personal Assistant (IPA) - is an engineered entity that’s hosted in software and interacts with users in a more human way. It does so using its intelligence, that is, Artificial Intelligence (AI), which effectively powers the assistant’s interactive voice response.
The world of intelligent virtual assistants is rapidly expanding as leading technology companies roll out more and more devices equipped with them. As a result, the usage of intelligent virtual assistants - many of which can simply be installed on one’s smartphone (if they are not already pre-installed) - has shot up remarkably over the years. Moreover, the capabilities of virtual assistants are growing quickly too, as new products enter the market offering IVA features for email and voice user interfaces. So while Apple’s Siri and Google Assistant get their major user base from smartphones, Microsoft gets its user base from Windows PCs, smartphones and smart speakers. Meanwhile, Amazon offers its assistant mainly through its Echo line of smart speakers, and Conversica has over 100 million engagements via its email and SMS interface intelligent virtual assistants for business.
In this blog, we shall loosely refer to IVAs as virtual assistants.
A Brief History of Intelligent Virtual Assistants
The first voice-activated toy, Radio Rex, was launched as early as 1922; it was a wooden toy dog that would come out of its house when its name was called.
Then came the Automatic Digit Recognition machine, aka ‘Audrey’, which was created by Bell Labs in 1952. It took up a six-foot-high relay rack and consumed a lot of power; it identified the fundamental units of speech, phonemes, and accurately recognized digits spoken by designated talkers.
Also deserving a mention is the IBM Shoebox, a voice-activated calculator launched in 1961 that could carry out digital speech recognition (it recognized 16 spoken words, including the digits 0 to 9).
Another notable development was the chatbot ELIZA, an early natural language processing computer program created by MIT professor Joseph Weizenbaum in the 1960s. It used pattern matching and substitution to come up with scripted responses, giving the user the feel of a human conversation.
The 1970s saw the emergence of Harpy, which was the result of an ambitious project at Carnegie Mellon University and could identify about 1000 spoken words. The 1980s saw IBM follow up Shoebox with Tangora, a voice-recognizing typewriter.
In the ’90s, Dragon’s NaturallySpeaking software could identify natural human speech, spoken without pauses between words, and transcribe it into a document at a rate of 100 words per minute. This was back in 1997. It was followed by the 2001 release of the chatbot SmarterChild, which looked up topics, gave weather updates, played games, and could interact with users to an extent.
The first modern digital virtual assistant to be installed on a smartphone was Apple’s Siri, which was introduced as a feature of the iPhone 4S in October 2011. In its rudimentary form, it helped with tasks such as sending text messages, making phone calls, looking up the weather or setting an alarm. Since then it has grown to share restaurant recommendations, search the internet, and even provide driving directions. In November 2014, Amazon announced the launch of Alexa along with the Echo smart speaker. In 2017, Amazon released a service for creating conversational interfaces for any type of virtual assistant or interface.
How Virtual Assistants Work
Virtual assistants work using the following media:
Text: This includes online chat (in instant messaging applications or other apps), SMS text, e-mail or other text-based communication channels. An example would be Conversica’s intelligent virtual assistants for business.
Voice: Voice is a common medium today. Amazon Alexa on the Amazon Echo device, Siri on iPhones, and Google Assistant on Google-enabled/Android mobile devices are all apt examples.
Images: Here the communication is done by taking and/or uploading images. An example is Samsung Bixby on the Samsung Galaxy S8.
Multiple media: And then there are some whose services can be accessed using multiple methods, such as Google Assistant, which works via chat in the Google Allo and Google Messages apps and through voice on Google Home smart speakers.
Virtual assistants basically use Natural Language Processing (NLP) to match user text or voice input to executable commands. A subfield of linguistics, computer science and AI that deals with the interactions between computers and human language, NLP focuses on programming computers to process and analyze large amounts of natural language data.
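To make the idea of matching user input to executable commands concrete, here is a minimal sketch in Python. It is only an illustration under loud assumptions: commercial assistants rely on trained NLP models, whereas this example swaps in hand-written regular expressions, and the handler functions `set_alarm` and `play_music`, the `INTENTS` table and the `handle` function are all hypothetical names invented for this example.

```python
import re

# Hypothetical command handlers; a real assistant would call device or web APIs.
def set_alarm(time):
    return f"Alarm set for {time}"

def play_music(artist):
    return f"Playing music by {artist}"

# Each intent pairs a pattern with the handler it triggers (an assumed,
# rule-based stand-in for the statistical models real assistants use).
INTENTS = [
    (re.compile(r"set (?:an )?alarm for (?P<time>.+)", re.IGNORECASE), set_alarm),
    (re.compile(r"play (?:some )?music by (?P<artist>.+)", re.IGNORECASE), play_music),
]

def handle(utterance):
    """Return the result of the first matching command, or None if nothing matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, handler in INTENTS:
        match = pattern.search(text)
        if match:
            # Named groups in the pattern become the handler's arguments.
            return handler(**match.groupdict())
    return None

print(handle("Set an alarm for 7 am"))
print(handle("Please play some music by Queen!"))
```

An utterance that matches no pattern returns `None`, which is where a real assistant might fall back to a web search or ask the user to rephrase.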
Many virtual assistants continue to “learn” using AI techniques, including Machine Learning (ML) algorithms. Assistants like Google Assistant (which contains Google Lens) and Samsung Bixby can also carry out image processing so as to be able to identify objects in images, thus helping users get better results from them.
Activating such a virtual assistant is rather easy: one either types in text or says a pre-decided wake word or phrase out loud. This could be “Hey Siri”, “OK Google” or “Hey Google”, “Alexa”, or “Hey Cortana”.
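The wake-word step can be sketched in a few lines of Python. This is a simplified illustration, not how any particular assistant implements it: real devices typically detect the wake word in the audio signal itself, while this sketch assumes the speech has already been transcribed to text. The `WAKE_WORDS` tuple and the `extract_command` function are hypothetical names chosen for the example.

```python
# Wake phrases mentioned in this post, lowercased for comparison.
WAKE_WORDS = ("hey siri", "ok google", "hey google", "alexa")

def extract_command(transcript):
    """Return the text after a leading wake word, or None if there is none."""
    text = transcript.strip()
    lowered = text.lower()
    for wake in WAKE_WORDS:
        if lowered.startswith(wake):
            rest = text[len(wake):]
            # Require a word boundary so "Alexander" does not trigger "alexa".
            if rest and rest[0].isalnum():
                continue
            # Drop the punctuation and spaces that separate wake word and command.
            return rest.lstrip(" ,.!?").strip()
    return None

print(extract_command("Alexa, play some jazz"))
print(extract_command("Alexander called earlier"))
```

Anything that does not start with a wake word is ignored, which mirrors the idea that the assistant only acts once it has been addressed.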
Devices where Virtual Assistants are Found
As seen above, virtual assistants come integrated into many types of platforms and devices today. Let’s look at the major ones now.
- Smart speakers such as Amazon Echo, Google Home and Apple HomePod
- In instant messaging apps on both smartphones and via the Web
- Built into a mobile operating system (OS), as in the case of Apple’s Siri on iOS devices, or into a desktop OS, such as Cortana on Microsoft Windows
- Built into a smartphone independent of the OS, like Bixby on the Samsung Galaxy S8 and Note 8
- In appliances, cars and wearable technology
- Within instant messaging platforms, as assistants from specific organizations, say, WeChat Secretary on WeChat or Aeromexico’s Aerobot on Facebook Messenger
- Within mobile apps from specific companies and other organizations, such as Dom from Domino’s Pizza.
Applications/Services of Virtual Assistants
There are many applications of intelligent virtual assistants today. Let’s look at them below.
a. Sharing information about weather, news, facts from popular websites such as Wikipedia or IMDb
b. Carrying out mundane day-to-day tasks such as setting alarms and making to-do lists and grocery/shopping lists
c. Playing music from streaming services such as Spotify and Pandora
d. Playing content from radio stations and reading audiobooks
e. Playing videos, TV shows or movies on televisions, streaming from services such as Prime Video and Netflix
f. Enabling conversational commerce, that is, e-commerce powered by conversational AI through messaging channels such as voice assistants, live chat on e-commerce websites, and live chat or chatbots on messaging applications such as WhatsApp, WeChat and Facebook Messenger
g. Assisting public interactions with government
h. Complementing and/or replacing customer support/customer experience by humans
Privacy and Ethical Concerns surrounding Virtual Assistants
Virtual assistants - like many technological innovations today - come with their share of concerns, be it user privacy being compromised or even the ethical implications of how they essentially work.
To begin with, there’s the fact that end users of these virtual assistants basically end up providing data for free for the training and improvement of the said assistants, and more often than not this happens without the consumer’s knowledge. This, in itself, is ethically disturbing.
Then there’s the fact that AIs are trained with this data via neural networks, which in turn need a huge amount of labelled data. And this data has to be labelled by humans, which means there are scores of people doing repetitive tasks for a few cents, listening to virtual assistant speech data and writing down what was said. The emergence of this microwork has already drawn flak for the job insecurity it comes with, the extremely low pay, the absence of employee benefits and the total lack of regulation. Thus, there’s the argument that AIs are still, in a sense, human, in that they would be impossible without the microwork of millions of human workers.
It’s also important to realize the privacy concerns caused by the fact that voice commands are in fact available to the providers of virtual assistants in unencrypted form, and can thus be shared with third parties and be processed in an unauthorized or unexpected manner.
That’s not all, for along with the linguistic content of recorded speech, a user’s manner of expression and voice characteristics can also implicitly contain information about his or her biometric identity, personality traits, body shape, physical and mental health condition, sex, gender, moods and emotions, socioeconomic status and geographical origin. All of this makes the usage of virtual assistants a potent threat to consumer privacy.
*Contributors: Written by Vidya Prabhu; Lead image by: Abhishek Aggarwal