On October 4, 2011, Apple announced a brand-new feature at the iPhone 4S launch event: the intelligent voice assistant Siri, which lets users make calls, send messages, set alarms, and more using voice commands. Although the actual experience was not outstanding, Google Now (later succeeded by Google Assistant) and Microsoft’s Cortana emerged in the following years, making voice assistants seem like a standard feature of smartphones.
In reality, however, touch is the primary mode of interaction on smartphones, so the functionality and usage scenarios of voice assistants there are quite limited. Speech recognition accuracy and conversational ability also still need improvement, which leaves Siri and similar assistants with a poor user experience; many smartphone users call them “useless.” Nevertheless, as the artificial intelligence era arrived, Amazon’s Alexa and the Echo smart speaker at last gave intelligent voice assistants a new application scenario: the smart home.
Unlike Siri and other voice assistants, Alexa was never intended by Amazon to be tied to smartphones. When Amazon quietly released the Echo smart speaker in November 2014, its built-in Alexa voice assistant entered a different scenario from Siri and the others. The Echo has no screen, and voice is the only interaction method it supports; after setup, users can say “Alexa” to wake it up and use it to play music, check the weather, set alarms, order Uber rides, look up recipes, and more.
For the iPhone, Siri is merely an optional voice assistant tool, but for Echo, Alexa is its entire soul, while Echo is just a shell for Alexa. Additionally, since Echo has no screen, user interaction is entirely voice-based, which requires Alexa’s voice capabilities to be robust; the limitation of interaction methods means that Alexa must be intelligent enough to evolve from a relatively simple voice assistant into a voice robot that can understand and respond to users.
To this end, the functional characteristics of Amazon Alexa (and the Echo smart speaker) are mainly reflected in three aspects:
- Powerful Voice Technology. Voice is the only interaction method for Alexa, so the technical requirements for speech recognition, semantic analysis, and related capabilities are very strict. In an interview, Amazon CEO Jeff Bezos stated that Amazon had invested years of effort in Echo and Alexa, initially recruiting a large number of people from the established speech recognition company Nuance, and later acquiring two startups focused on voice technology, Yap and Evi. As a result, Alexa’s voice capabilities can compete with those of Siri and Cortana.
- Artificial Intelligence. Alexa is inherently a product of artificial intelligence, with its initial inspiration coming from the computer in “Star Trek,” which can converse like a human. To achieve this, Amazon integrated machine learning and other AI technologies into Alexa, technologies it had previously used for product recommendations and price predictions on the Amazon marketplace. As the number of Echo users increased, Amazon also collected a vast amount of voice data to improve and upgrade Alexa’s AI.
- Voice-Based Applications. Since Echo is the carrier of Alexa, what users genuinely care about is what the Echo smart speaker can actually do. Initially, Echo could perform basic tasks such as playing Prime Music, setting alarms, checking the weather, and answering questions. Later, Echo began to support third-party services such as Spotify music, Audible audiobooks, and NPR news, and also added control over home devices such as lights, air conditioners, and cameras.
In June 2015, Amazon opened up Alexa, allowing third-party developers to build their own voice applications through Alexa, which Amazon refers to as Skills; a month later, Amazon also opened up sales of the Echo smart speaker. By December 2016, Echo sales had exceeded 5 million units, and Alexa’s Skills had reached over 5,000.
At CES 2017, Amazon Alexa was highly prominent, mainly because several manufacturers, including Samsung, Lenovo, LG, and Dish, launched various types of smart home devices equipped with Alexa. This was thanks to Alexa’s open strategy towards third-party hardware manufacturers.
Back in June 2015, when Amazon launched the Alexa Skills Kit for third-party developers, it also introduced the Alexa Voice Service development kit for third-party hardware manufacturers. To encourage developers and manufacturers to participate, Amazon launched a $100 million funding initiative called the Alexa Fund; the result was that, in the initial period, some third-party developers created Skills for Alexa, but very few third-party hardware manufacturers were willing to integrate Alexa into their products.
Therefore, in the initial period, the only carrier for Alexa was Amazon’s own Echo smart speaker. Later, in March 2016, Amazon launched two more devices, Amazon Tap and Echo Dot. The former is more like a portable version of Echo, while the latter is somewhat similar to a customizable speaker that can also use Alexa.
However, as Echo sales increased and the number of Alexa Skills continued to grow, third-party hardware manufacturers began, starting in the second half of 2015, to realize Alexa’s significant potential in the smart home sector, and many appliance manufacturers began to collaborate with Amazon to integrate Alexa into their products. In fact, Alexa already had a significant presence at CES 2016. By CES 2017, Alexa was appearing at product launches from major manufacturers, across product types including refrigerators, vacuum cleaners, DVRs, gesture remote controls, light bulbs, and in-car systems, all of which integrated the Alexa voice assistant along with audio I/O modules for interacting with it. This means that Alexa is no longer confined to a single smart speaker; it has begun to serve as an embedded assistant in various smart home products.
It is also worth mentioning that during CES 2017, Mike George, Amazon’s Vice President responsible for Echo, Alexa, and the app store, announced that the number of Alexa Skills had reached 7,000, and that multiple hardware devices with Alexa built in would launch in the coming months. At this point, Alexa was getting closer to becoming a smart home operating system based on voice interaction, one that simply spans a wider variety of supported hardware.
The Initial Form of Smart Homes
The concept of the smart home has existed for a long time; over the past two years, however, the emergence of the Alexa artificial intelligence voice assistant seems to have made the form of smart homes clearer.
From an interaction perspective, voice is clearly more natural and less constraining in home scenarios than touching a screen; this is also why users have accepted Alexa. After setup, people can call on it by voice in any situation at home without having to focus their attention on a device screen, which is the interaction method that fits people’s home life. As Kenn Harper, Vice President of the voice technology company Nuance Communications (which supports Siri’s voice technology), believes:
“Voice is the future interaction interface of smart homes.”
In the view of Ifanr, the “intelligence” in smart homes should also point to artificial intelligence, rather than just being connected to the internet.
From this perspective, it can be said that Amazon has established an initial form for the future smart home with Alexa, and has thereby occupied an important entry point for artificial intelligence. More than a year after Amazon launched Echo, Google, which has long been deeply involved in artificial intelligence, also launched a smart speaker based on its own AI voice assistant, Google Assistant, putting it in direct competition with Amazon’s Echo. Google has also followed Amazon’s approach in its specific development direction. In early December 2016, Google announced that third-party developers could build Conversation Actions on Google Assistant, which are quite similar to Alexa’s Skills. A few days later, Google released over 30 such Conversation Actions.
Additionally, at CES 2017, Google collaborated with Nvidia to launch the Nvidia Shield gaming console with Google Assistant built in, and announced on its official website that Google Assistant would be available on more Android TV devices. Besides Amazon and Google, Apple is also tying HomeKit closely to Siri in an attempt to move towards voice interaction; however, some media believe that HomeKit still appears very basic compared to Amazon’s Alexa. There are also rumors that Microsoft is trying to enter the smart home field through the Cortana voice assistant on Windows 10, with a product called Home Hub.
How will the future play out?
Looking ahead to 2017, the number of third-party applications supported by Alexa and Google Assistant will continue to grow, and so will the variety of third-party devices that integrate them; of the two, Amazon’s Alexa is a significant step ahead of Google Assistant. However, with Alexa having paved the way and given Google’s influence in the developer community, the number of Google Assistant applications should grow relatively quickly. Led by Alexa and Google Assistant, voice-interactive artificial intelligence technology will continue to develop vigorously and become more closely integrated with the smart home industry. Apple has already made some progress in the smart home field with HomeKit, but it still needs to improve Siri’s technology and increase Siri’s openness, integrating HomeKit deeply with Siri to free HomeKit from its reliance on the iPhone. If Microsoft wants to enter the market, it may take a different approach, starting from home computers with Cortana built in to work its way into home scenarios, but how effective that will be remains uncertain.
Additionally, as domestic companies follow the product forms of Alexa and Echo, the smart home market in China will also adopt similar voice interaction interfaces and artificial intelligence technologies in 2017. However, given the barriers in voice technology and the uncertain willingness of third-party developers to participate, achieving significant results may still take some time. Another point worth noting is whether Alexa and Echo will enter the Chinese market. In the view of Ifanr, the likelihood is minimal, at least for the next few years. There are several reasons: processing Chinese speech is difficult because of the differences between Chinese and English; Amazon has not yet formed a service ecosystem in China like the one it has in the United States; and many of the third-party Skills supported by Alexa are localized in ways that are difficult to transplant.
As for Google Assistant and Google Home, the answer is even more obvious.