Alibaba Damo Academy AI Laboratory General Manager Qian Xue. Source: Interviewee provided image
The education market needs time; AI technology is the biggest bottleneck restricting the development of the entire smart speaker industry.
Interview | China Entrepreneur reporter Zhang Hong Liang Xiao
Text | Zhang Hong Editor | Yin Yijie
For latecomers in the Chinese market, replicating the success of Amazon Echo is not easy.
In mid-August, data from market research firm Canalys showed that in Q2 2018 Google (5.4 million units) again out-shipped Amazon Echo (4.1 million units), the former “leader” in global smart speaker shipments, while Alibaba (3 million units) and Xiaomi (2 million units) ranked third and fourth globally.
Chart: China Entrepreneur
The increasing shipment volume highlights the rapid progress of the giants. Notably, the emerging Chinese market contributed nearly half of the smart speaker shipments. Data from the China Business Industry Research Institute shows that the smart speaker market in China will reach 330 million yuan in 2018. Analysts both domestically and internationally agree that the potential of the Chinese smart speaker market is limitless and may surpass that of the United States in the future.
Smart speakers are referred to as the “gateway to smart homes”. Currently, domestic internet giants are fiercely competing, with smartphone hardware manufacturers such as Huawei, Nokia, Lenovo, OPPO, and vivo all entering the market, not wanting to miss the opportunity.
However, on the capital level, this track has faced a cold reception from investors this year. A long-time investor in the smart manufacturing sector bluntly stated, “I no longer pay attention to this track,” and has shifted focus to the upstream and downstream of the supply chain.
Tang Yuezhong, chief scientist for JD.com’s Dingdong smart speaker, joined the early R&D team formed by JD and iFlytek in 2014. He put it succinctly: “The window period has passed; latecomers may play the role of ‘fillers’.” He believes there are still opportunities in technology and content services for smart speakers.
CounterPoint’s China Research Director Yan Zhanmeng told China Entrepreneur: “While latecomers may have some opportunities, whether they can grow large is uncertain.” He believes that the smart speaker field will become a market dominated by major players, and small players can only explore opportunities from vertical niche markets or focus on content layout.
Latecomers
Opportunities for latecomers are running low.
“In 2015, capital missed an opportunity. They lacked confidence in this market and technology, reacted conservatively, and did not enter early. Startups entering now have fewer opportunities. Internationally, Amazon is leading the way, and the domestic market has already entered an explosive period from last year to now,” Tang Yuezhong said.
JD.com’s Dingdong smart speaker chief scientist Tang Yuezhong. Photography: Wang Pan
Compared to last year’s capital frenzy, this year investors have shifted their focus to chips, voice technology, components, and technology solution providers in the upstream and downstream supply chain. In June this year, the voice interaction solution provider behind Tmall Genie and Xiaomi AI speaker, iFlytek, announced the completion of a 500 million yuan Series D financing, led by Yuanhe Holdings and Zhongmin Investment, with Foxconn participating.
Even so, the cooled track continues to attract different players. Peripheral players such as Himalaya, Kugou, and Cheetah, along with home appliance manufacturers like Haier and Midea, have successively entered the market. Smartphone hardware manufacturers Huawei, Nokia, OPPO, vivo, and Lenovo are also rapidly laying out plans. In August this year, Huawei’s Consumer Business CEO Yu Chengdong revealed in a semi-annual performance report that Huawei is developing smart speakers, claiming that products will be launched in October.
However, in Yan Zhanmeng’s view, some manufacturers are “entering passively” to ensure profit growth or recover from setbacks, while others are entering to increase user stickiness beyond mobile phones.
This is not a good time; the market is dominated by giants, and there is much skepticism about latecomers. Even early entrants in this track have not had a smooth development.
In 2014, the Hangzhou startup Rokid entered the smart speaker track, with most of its early team members coming from Alibaba’s M Laboratory. In terms of timing, Rokid was supposed to release its product ahead of Amazon, but its mass production came six months late. Rokid Vice President Xiang Wenjie, one of the early members involved in smart speaker product development, lamented in an interview this July that running a smart speaker company is “too difficult”.
Rokid Vice President Xiang Wenjie (third from right). Photography: Yu Zi
Li Zhifei, CEO of Outermost, describes his company as “early to the market, late to the party”. He told China Entrepreneur, “Initially, there were very few companies making speakers in the market; most manufacturers entered after last year.” In August 2017, the AI technology company Outermost officially released the smart speaker Tichome for home scenarios.
However, after joining a speaker war dominated by giants, Li Zhifei felt two major contradictions. The first is between immature AI and a fast market: AI technology is not yet mature, yet China is a market where “a war in a field can end in six months”. The second is between a small market and large players: the domestic speaker market was small to begin with, and the giants chose to enter just as it began to grow, catching other players off guard.
The Giants’ Game
“2018 is bound to be a fierce battle,” said Xiaoyu Home founder and CEO Song Chenfeng, referring to the price war among domestic smart speakers.
During the “Double Eleven” shopping festival in 2017, Alibaba’s AI laboratory slashed the original price of Tmall Genie X1 from 499 yuan to 99 yuan, triggering a buying frenzy, with Tmall Genie sales exceeding 2 million units. In July this year, Alibaba Damo Academy AI Laboratory General Manager Qian Xue stated in an interview that “the quantity of 2 million proves that users’ acceptance in the Chinese market is very high, and the explosive growth will likely exceed that of the US market.”
Tmall Genie. Source: Interviewee provided image
The popularity of Tmall Genie has led to a continued subsidy war. Companies like Xiaomi, Baidu, and JD have successively launched speakers priced under 100 yuan, all aiming to capture more market share.
“I really did not expect that in just one year, the market would completely turn into one dominated by giants at low prices,” Li Zhifei said, “The domestic route has mostly been to replicate Amazon Echo. What took three years abroad was completed in just one year domestically.”
Amazon Echo went on sale in June 2015, about half a year after its release, and sold more than 2.5 million units that year. Soon after, Google entered the market, and for a time the global smart speaker market was a “dual hegemony” of Amazon and Google.
Latecomers followed in the footsteps of Amazon Echo. The earliest entrants in the domestic market were JD and iFlytek. At the end of 2014, JD established a team and in early 2015 jointly founded a company with iFlytek to enter the smart speaker track.
Tang Yuezhong explained that in 2014, although the concept of “smart hardware” was popular, everyone, including Amazon, was still exploring what exactly to build. “After filtering, drones and watches were eliminated, and we ultimately decided to make smart speakers.” At that time, Tang Yuezhong, who was worried about “entering the wrong industry,” chose smart speakers based on iFlytek’s experience: “we had already developed a self-researched speaker without a wake-up function.”
Just four months after the launch of Amazon Echo, in March 2015, the Dingdong smart speaker launched by Linglong Technology, co-founded by JD and iFlytek, was released, but market response was tepid. In 2016, iFlytek’s annual report disclosed that Dingdong smart speaker’s annual shipment volume was nearly 100,000 units.
Startups tried to share the pie at the same time, but technical bottlenecks limited what they could achieve. In 2016, when Tang Yuezhong attended a meeting in Shenzhen, he learned that there were over 200 startups in Nanshan District making smart speakers. “They (the startups) underestimated the technical content of smart speakers; at that time, many technologies such as microphone array wake-up, natural language understanding, and synthesis were still immature, and everyone was still in the exploration stage.”
This remains a battle between big companies.
In May 2016, Google released the smart speaker Google Home; a month later, Apple quickly followed with the smart speaker HomePod. In July of the following year, Alibaba and Xiaomi successively entered this blue ocean, and in November, Baidu launched the Raven H smart speaker series; in April 2018, Tencent’s Tingtai smart speaker was launched; in June, Baidu released the Xiaodu smart speaker. By then, all three BAT companies had entered the market.
“At this stage, only the giants can play this game,” Tang Yuezhong said.
Different Gameplay
Li Zhifei describes surviving under pressure from large companies as “being forced to leave”.
Startups undoubtedly find it hard to compete with these companies for consumers. Li Zhifei told this publication that the consumer (to-C) side of the smart speaker business is hard to develop, so Outermost had to adjust its strategy, gradually shifting to business customers (to-B) while also targeting overseas markets that the giants cannot reach.
Outermost CEO Li Zhifei. Source: Interviewee provided image
In Xiang Wenjie’s view, with the market dominated by large companies, it is very difficult for any startup in the smart speaker sector to turn a profit or go public in the short term. “But refining a distinctive and competitive product and patiently waiting is what we should do now.”
Xiang Wenjie believes that the future development of smart speakers has two directions. One is an internet-oriented approach: lowering speaker prices and “giving” the products away to strengthen the company’s wider business chain. The other is to turn the product into a platform, spreading it widely and improving the user experience until it becomes indispensable to users.
At the 2018 World Robot Conference, a not-so-affordable speaker from Cheetah Mobile, the Little Leopard speaker, also made its way to the robot exhibition area. An on-site user told reporters that its sound quality and effects were indeed superior to the ordinary speakers he had used. When asked about the pricing, the staff responded that the reason for not following the low-price route was mainly due to the product’s cost covering sound effects, sound quality technology, and iterative self-research technology.
Cheetah Mobile’s director and senior vice president Zhou Pin stated that the positioning of the Little Leopard AI speaker is “explosive sound quality, a music listening artifact”, aiming to enter the market with excellent sound quality. Since last year, Cheetah Mobile’s AI company, Orion Star, has assisted Himalaya and Xiaomi in launching the Xiaoya speaker and Xiao Ai respectively as a voice OS technology provider.
In its second half, the smart speaker market will turn to vertical differentiation strategies, a consensus shared by giants, startups, and investors. In contrast to the giants’ low-price strategy, almost all smart speaker startups in China are pursuing a “high price, individuality” route.
Yan Zhanmeng told this publication that under the “invasion” of giant players, latecomers can hardly survive unless they focus on vertical niche markets and play the game differently.
It cannot be ignored that voice technology startups are also making efforts. In May of this year, four leading AI voice startups in China almost simultaneously began to bet on AI voice chips, with Cloud Wisdom, Outermost, Rokid, and iFlytek successively launching their own AI voice dedicated chips.
Zhou Pin believes that once the market passes its middle stage, the logic of smart speaker products will return to the essence of sound: companies will invest deeply in sound quality and music sources, using sound quality to attract users and meet their original need from a speaker.
Bottleneck Breakthrough
“The AI track is wide and long enough for everyone to gradually find their positioning. However, if the vision is not broad enough, it is impossible to create the future,” Qian Xue said. Alibaba, relying on its strong brand endorsement and industry resource integration capabilities, plays the role of industry promoter, working on educating the market.
When the first smart speakers appeared, some predicted their decline, arguing that Chinese people’s living habits and scenarios do not require voice interaction. That argument has been proven a fallacy.
However, among many industry insiders interviewed by this publication, “educating the market” remains a frequently mentioned keyword. Most believe that there is still significant room for improvement in the penetration of voice interaction concepts and scenario needs.
“Currently, the shipment volume of smart speakers in the domestic market is still less than that of smartphones,” Li Zhifei said, pointing to the low user penetration rate of smart speakers at this stage.
A survey from AVC indicates that 57% of Chinese consumers have heard of smart speakers, 23% have some understanding of them, but the frequency and duration of use after purchase are not ideal. Compared to nearly 30% penetration rate in American households, the penetration rate of smart speakers in China remains low.
Users of speaker products commonly report poor far-field recognition, high false wake rates, unstable continuous dialogue, and poor semantic understanding.
Tang Yuezhong stated that product education is a process, and global consumers lack experience with voice interaction products. However, Chinese users’ ability to accept new technologies, products, and experiences exceeds expectations. Research data shows that elderly users in some southern cities have very high usage stickiness, with usage duration and activity exceeding expectations.
Yan Zhanmeng predicts that the domestic smart speaker market will see a significant explosion in 2019 or 2020, and that the key to a real explosion lies in making smart speaker products “easy to use and fun”. To date, no disruptive product that is “both easy to use and fun” has emerged in the domestic market.
Currently, aside from sales channels, the competitiveness of a smart speaker product depends most on user experience. Voice interaction and service content are the core of competition. Li Zhifei describes the importance of voice interaction this way: voice interaction is the “soul”, and smart speakers are just the carrier.
However, educating the market takes time, and AI technology is the biggest bottleneck restricting the entire smart speaker industry. Xiang Wenjie believes that the entire industry is currently creating demand rather than using technology to meet demand.
Tang Yuezhong shares the same view. He believes that we are still in the weak AI stage: technologies such as voice recognition, speech synthesis, natural language understanding, noise reduction, and wake-up have only reached a usable level, not one that satisfies users. In the early days, when Tang Yuezhong and his team developed the Dingdong speaker, they invested the most time and effort in researching core technologies such as voice recognition, wake-up rate, and speech synthesis before they were ready to build them into finished hardware.
He uses the microphone array as an example; to ensure a perfect sound quality experience, a balance must be struck between the quality of the microphone and the noise handling of the speaker. “How to achieve coordination between sound quality and sound effects involves hardware, software, and algorithms, which is also a technical bottleneck that all speaker products, including Amazon Echo, need to overcome,” Tang Yuezhong said. This will also be a key direction for giants to focus on outside of ecosystem layout. “If giants want to maintain their advantages in this industry, breakthroughs in technology are indispensable, aside from building ecosystems.”
Currently, large companies rely on their strong ecosystems to launch smart speakers, each with their strengths. For example, Tmall Genie integrates information from Alipay, Taobao, and Cainiao logistics within its AI system. Xiaomi’s Xiao Ai connects with compatible smart home devices.
However, Qian Xue told this publication that the iteration of hardware and software requires a connecting point, and most companies have not fully touched this point yet. This is also an area where Tmall Genie needs to break through. The focus of Tmall Genie is gradually shifting from internet connectivity to AI algorithms.
“This will still take some time,” Qian Xue said.
Duty Editor: Yang Qian Proofreading: Gao Huanhuan