Written by | Wang Jinwang
Reported by Leiphone (leiphone-sz)
Leiphone reports that on March 8, Google brought Continued Conversation (multi-turn interaction) to its screen-equipped smart speakers, including Google Home Hub, Lenovo Smart Display, JBL Link View, and LG XBOOM AI ThinQ WK9. The feature addresses the problem of users having to repeat the wake word every time they speak to a voice assistant.
Google announced the feature at its I/O conference in May 2018, alongside the “call” skill for Google Assistant. At the conference, Google CEO Sundar Pichai said he wanted Google’s assistant to converse naturally with people. “Users can now talk to Google Assistant, and if you want to ask a question, you can keep asking without having to repeatedly say ‘Hey Google’ to wake it up.”
According to Leiphone, the feature had previously been rolled out to Google’s non-screen smart speakers, such as Google Home, Google Home Mini, and Google Home Max. At launch it was available only to smart speaker users in the United States. Google explained that after hearing the wake word or answering a user’s question, Google Assistant stays awake for 8 seconds.
Smart Speaker Skills Keep Growing as Sales Rise Steadily
Smart speakers have now been on the market for five years, and the market as a whole is beginning to grow steadily.
In terms of skills, smart speakers can now handle most everyday interactions with users, including checking the weather, telling stories, and playing music, and screen-equipped smart speakers can add video functions as well.
According to Voicebot statistics, in 2018, the number of skills available to U.S. users for Google’s smart voice assistant, Google Assistant, was 4,253; the number of skills available for Alexa was 56,750.
In China, according to information released by Alibaba’s Tmall Genie at its spring press conference in March 2018, the Tmall Genie system had 356 skills at the time, with 6,500 developers building new applications around it. According to data released by Baidu at its Xiaodu strategy press conference in February 2019, DuerOS had more than 1,000 voice skills and more than 27,000 developers.
It can be seen that smart speakers have gradually covered users’ daily life, leisure, and even some learning needs in terms of skills.
As a result, smart speakers are also selling well. According to a report by market research firm Strategy Analytics on the global smart speaker market in the fourth quarter of 2018, global smart speaker shipments totaled 38.5 million units, a year-on-year increase of 95%. The top five vendors remained Amazon, Google, Alibaba, Baidu, and Xiaomi, with shipments of 13.9 million, 11.5 million, 2.8 million, 2.2 million, and 1.8 million units, respectively.
Smart Speaker “Variants” Frequently Emerge
The favorable market for smart speakers has also brought about many “variants,” from the initial smart speakers to screen-equipped smart speakers, and now to those integrated with TV scenarios.
Standard smart speakers are seen as entry-level devices and remain a battleground for the major players. Domestic giants such as Alibaba and Baidu have made their price-subsidy strategies explicit, while Tencent and Huawei, though late to the field, have launched smart speakers of their own. Even foreign giants made slight adjustments to their pricing when launching smart speakers in 2018.
At the same time, variants of the smart speaker have become the norm. Screen-equipped smart speakers are one example: Amazon, Google, Alibaba, and Baidu have all entered this segment, and after nearly two years of market promotion and user experience, the initial skepticism toward these devices has given way to widespread acceptance.
After long-term hands-on use of these screen-equipped smart speakers, Leiphone found that they are positioned slightly differently from tablets, which focus primarily on “entertainment.” Screen-equipped smart speakers still center on voice and video functions and lean toward a “leisure” tone.
Furthermore, to emphasize voice and cultivate the habit of using it, their video applications differ slightly from those on tablets and computers: mouse and keyboard operations are eliminated in favor of voice control. For example, in the iQIYI interface on the Xiaodu Zaijia 1S, the sidebar navigation has been removed, and logging in to a VIP account requires authorization by scanning a QR code with a mobile phone.
Another variant of smart speakers combines with TV application scenarios, known as magic boxes. In May 2018, Alibaba’s Damo Academy AI Lab and Youku jointly launched the Tmall Genie Magic Box; in September 2018, Baidu, iQIYI, and Gehua Cable jointly launched Gehua Xiaoguo; in February 2019, Baidu released Xiaodu TV Partner.
Taking the recently released Xiaodu TV Partner as an example, the official description bills it as a “Hi-Fi home theater + high-performance 4K set-top box + high-end artificial intelligence speaker” in a single device. When the TV is on, it serves as a voice-controlled set-top box; when the TV is off, it can still function as a “smart speaker” for tasks such as checking the weather and playing music.
These “variants” of smart speakers, which also focus on voice functionality as a core capability, naturally require strong comprehension and smooth interaction abilities.
Multi-Turn Interaction Issues Need Urgent Attention
Driven by these products and by market promotion, smart voice systems are gradually becoming popular. Voice is regarded as the next generation of interaction, and today’s systems can handle basic interactions, but their fluency, especially in multi-turn dialogue, still needs improvement.
In fact, major manufacturers have long been researching and adapting this capability. According to Leiphone, Amazon moved before Google, adding a feature called “Follow-Up Mode” to its voice assistant Alexa in March 2018 to address this issue. With Follow-Up Mode, Alexa stays awake for 5 seconds after answering a question in case the user has another one; saying “thank you” or “stop” ends the conversation and returns Alexa to standby.
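The follow-up behavior described above can be illustrated with a small sketch. It is not Amazon’s or Google’s implementation; the class name, method names, and return values below are hypothetical, and it assumes, for simplicity, that the listening window starts the moment a request is handled. The 5-second window and the “thank you” / “stop” end phrases come from the description of Follow-Up Mode; setting the window to 8 seconds would mimic the figure Google gave.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple


@dataclass
class FollowUpSession:
    """Hypothetical model of a follow-up listening window (illustration only)."""
    wake_word: str = "alexa"
    window_seconds: float = 5.0  # 5 s per the Follow-Up Mode description
    end_phrases: Tuple[str, ...] = ("thank you", "stop")
    # Time until which the device keeps listening; None means standby.
    _listening_until: Optional[float] = field(default=None, init=False)

    def is_awake(self, now: float) -> bool:
        return self._listening_until is not None and now < self._listening_until

    def hear(self, utterance: str, now: float) -> str:
        """Return 'ignored', 'listening', 'answered', or 'standby'."""
        text = utterance.strip().lower()
        if text.startswith(self.wake_word):
            # The wake word always gets the device's attention.
            text = text[len(self.wake_word):].strip(" ,")
        elif not self.is_awake(now):
            # No wake word and the window has closed: stay in standby.
            self._listening_until = None
            return "ignored"
        if text in self.end_phrases:
            # The user ended the conversation explicitly.
            self._listening_until = None
            return "standby"
        # Assumption: the window (re)starts as soon as the request is handled.
        self._listening_until = now + self.window_seconds
        return "answered" if text else "listening"


session = FollowUpSession()
print(session.hear("Alexa, what's the weather?", now=1.0))  # wake word required
print(session.hear("And tomorrow?", now=4.0))               # inside the window
print(session.hear("Thank you", now=5.0))                   # ends the session
print(session.hear("And the day after?", now=6.0))          # back in standby
```

In this sketch every answered request reopens the window, which is why a string of follow-up questions never needs the wake word, while a pause longer than the window sends the device back to standby.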
By comparison, the multi-turn interaction capabilities of domestic smart speakers lag slightly behind. In Leiphone’s earlier tests of several brands, some speakers waited for a second round of interaction only after answering certain questions, and in most cases users still had to repeat the wake word. However, according to previous reports, Baidu will release a DuerOS upgrade later this year to enhance interaction and improve the experience on smart speakers that currently require the wake word “Xiaodu Xiaodu” again and again.
If smart speakers are to become smarter, the first problem to solve is the communication barrier. The pressing questions are how to improve semantic understanding so that speakers better comprehend what users ask, and how to wait for the next round of interaction at the right moments so that conversation feels more natural.
After all, using the wake word too often can be quite annoying…