Voice recognition competition is fiercer than imagined! Amazon's rise and Microsoft
Release Time:2022-4-14 11:43:27
Speech recognition is a technology that recognizes spoken words and converts them into text. A subset of speech recognition is voice recognition, which identifies a person based on their voice. Amazon, Microsoft, Google and Apple provide these features on various devices through products such as Google Home, Amazon Echo and Siri.

With so many voice recognition products launching on the market, we decided to study the business implications of voice recognition. By studying the voice recognition technology of these companies, we try to answer the following questions for readers:

How does voice recognition drive the commercial value of these companies?

Why do they invest in voice recognition?

What will this technology look like in a few years?

We begin with some background on how and why the technology giants are developing voice recognition technology, followed by a rundown of the voice recognition technology from Amazon, Microsoft, Google, and Apple.

Potential reasons for developing voice recognition technology

Technology companies recognize the interest in voice recognition technology and are working hard to make speech recognition standard in most products. One goal of these companies may be to make voice assistants speak and answer more accurately with respect to context and content. Studies have shown that the use of virtual assistants with voice recognition functions was expected to keep growing, from 60.5 million users in the United States in 2017 to 62.4 million in 2018. By 2019, the figure was put at 66.6 million Americans using speech or voice recognition technology.

To deliver a strong voice recognition experience, the artificial intelligence behind it must cope better with challenges such as accents and background noise. Advances in natural language processing and neural network technology have greatly improved speech and voice technology, to the point that it is now said to be on par with humans. For example, in 2017 Microsoft reported that the word error rate of its speech technology had reached 5.1%, while Google reported that it had reduced its error rate to 4.9%.
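Word error rate is commonly computed as the number of word substitutions, deletions and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. The following is a minimal sketch of that calculation, not the method Microsoft or Google used for the figures above; the function name `word_error_rate` and the sample sentences are hypothetical, and real evaluations usually normalize case and punctuation first.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Hypothetical helper for illustration only: splits on whitespace
    and does no text normalization.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # matching i reference words against nothing costs i deletions
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    # Hypothetical transcripts: one wrong word out of six.
    reference = "turn on the living room lights"
    hypothesis = "turn on the living room light"
    print(f"WER: {word_error_rate(reference, hypothesis):.1%}")
```

On this definition, an error rate of about 5% corresponds to roughly one wrong, missing or extra word for every 20 words spoken.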

Research and Markets reports that by 2023, the value of the voice recognition market will reach $18 billion. As voice recognition technology becomes more widespread, the study estimates that it can be applied to everything from phones to refrigerators to cars. Some of this was on display at CES 2017 in Las Vegas, where new voice-enabled devices were launched or announced.

Although all of the applications have very similar functions and integration opportunities, we have grouped them according to the main focus of each application, as identified in our research.

Amazon Echo and Alexa

Until recently, Amazon's voice virtual assistant Alexa was available only on Amazon's own commercial products. However, Amazon Web Services has since made the voice assistant available to other companies. Amazon and Intel have launched the Alexa Voice Service Device Software Development Kit, which allows third-party companies to embed Alexa functionality into their devices. The cooperation is a result of Amazon's "Alexa Everywhere" strategy. The company said the strategy aims to make the technology behind Alexa widely available to manufacturers of all kinds of smart and wearable devices.

At CES 2018 in Las Vegas, Sony, TiVo and Hisense released smart home technology integrating Alexa, enabling customers to control their TVs by voice. Household appliance manufacturers such as Whirlpool, Delta, LG and Haier have also added Alexa's voice recognition skills to help people control all aspects of their homes, from TVs and microwaves to air conditioners and faucets. According to the Amazon Alexa website, Alexa can control more than 13,000 smart home devices from more than 2,500 brands.

Counting skills built by other companies, Alexa now has 30,000 skills. While Apple has Siri and Google has built its unnamed virtual assistant into smartphones and speakers, Amazon integrated Alexa into its Echo smart speaker. Amazon has not disclosed final sales figures, but Forrester predicted that 22 million Echo units would be sold by the end of 2017. Forrester said that reaching this sales figure would make Echo the largest voice assistant in the United States.

Amazon says it offers Alexa for Business as a virtual assistant that helps professionals manage schedules, track tasks and set reminders. When integrated into conference consoles and other devices, the application can control conference room settings through the speaker's voice. Alexa devices can also be used as audio conferencing equipment in smaller conference rooms, or as control equipment in larger conference rooms.

Logitech built Alexa into its Harmony remote to control home entertainment systems and smart home devices. When customers say simple commands (such as "Alexa, turn on the TV" or "Alexa, play a DVD"), the remote is activated. Alexa then sends the request to Harmony, which relays it to the home equipment via infrared, Bluetooth or IP.

According to Amazon, the prototype team consisted of a single senior software architect at Logitech, who spent two hours integrating Alexa into Harmony. Once the prototype was ready, Logitech's team prepared the skill for launch. Citing Logitech's data, Amazon reported that the process from prototype development to a production-level skill took less than two weeks. No other details or figures are provided in this case study.

At a more basic level, Amazon also offers Transcribe, an automatic speech recognition (ASR) service that enables developers to add speech-to-text functions to their applications. Once the voice function is integrated into an application, end users can submit audio files for analysis and receive a text file of the transcribed speech.

Google Home and Assistant

Google Assistant is Google's voice virtual assistant. Its skills include tasks such as sending and requesting payments through Google Pay or troubleshooting Pixel phones.

Assistant can be used on Android or iOS, smart watches, Pixelbook laptops, Android smart TVs and displays, and Android Auto. When they need to keep quiet in places such as libraries, users can also type commands into Assistant. Google Assistant offers children and families 50 voice-based games.

Google's smart speakers include Home. Google said that the speaker can be used with more than 5,000 smart home devices, such as coffee machines, lights and thermostats, from more than 150 brands, including Sony, Philips, LG and Toshiba. According to reports, in the first quarter of 2018 Google sold 3.2 million Home and Mini devices, surpassing Amazon's Echo devices (2.5 million units). Neither company released official data.

To make Assistant more popular, Google opened up a software development kit through Actions, which allows developers to build voice into their own products that support artificial intelligence. Google has also recently launched the Assistant Investments program, which invests in startups dedicated to advancing voice and assistance technology (whether hardware or software) and focused on the travel, games or hospitality industries.

Under the program, Google will provide support with technology, business development and product leads. The startups will also get early access to new Assistant features and programs, credits for Google products (including Google Cloud), and potential joint marketing opportunities.

Another Google voice recognition product is an AI-driven speech-to-text tool that developers can use to convert audio into text with deep learning neural network algorithms. The tool works in 120 languages, supports voice command and control, and can transcribe call center audio and process real-time streaming or pre-recorded audio.

Microsoft Cortana

In October 2017, Microsoft released its voice virtual assistant Cortana.

Cortana's home speaker and mobile device applications can provide users with reminders, keep notes and lists, and, according to Microsoft, help manage the calendar. Cortana can be downloaded from Apple's App Store and Google Play, and can run on personal computers, smart speakers and mobile phones.

On the Microsoft home speaker named Invoke, Cortana is programmed to help users control music by voice, queue up playlists, raise or lower the volume, and stop or start tracks. However, it does not support major music streaming services other than Spotify. Microsoft said that the smart speaker can also answer various questions, make and receive Skype calls, and check the latest news and weather.

Microsoft claims that on PCs, Cortana can manage users' emails across Office 365, Outlook and Gmail accounts. Microsoft said Cortana's customers or technical partners include Domino's, Spotify, Capital One, Philips and Fitbit.

The core of Microsoft's voice recognition technology is its speech-to-text interface, which transcribes audio streams into text. This is the same technology used to build Cortana, Office and other Microsoft products. Microsoft said the service can detect the end of speech and provides formatting options, including capitalization and punctuation, as well as language translation.

Apple's Siri

When Apple integrated Siri into the iPhone 4S in 2011, the virtual assistant was connected to many web services and provided voice-driven functions, such as ordering taxis via Taxi Magic, pulling concert details from StubHub, looking up movie reviews on Rotten Tomatoes, or sifting through restaurant data on Yelp.

Today, Siri's functions include translation, playing songs, booking rides, and transferring funds between bank accounts. According to Apple, because Siri has machine learning capabilities, it can be taught new commands.

Although Siri was released before Google Assistant and Amazon Alexa, its accuracy in responding to commands or questions remains a concern compared with other technologies on the market.

One reporter compared Siri with Google Assistant and Amazon's Alexa and found that Alexa responded more accurately. In our research, we also found longer video reviews showing that, of the three voice technologies, Siri lagged behind in answering questions accurately.

It is predicted that from 2016 to 2024, the voice recognition industry, worth $55 billion, will grow at a rate of 11%. The technology has also been applied on a smaller scale in the form of transcription applications, and companies in a range of other industries already make good use of it. In health care, for example, medical professionals use speech-to-text transcription applications (such as Dolbey) to create electronic medical records for patients.

In law enforcement and the legal sector, where recording documents accurately and quickly is crucial, companies such as Nuance provide transcription applications, and transcription is also used to record incident reports. In the media, reporters use transcription applications such as Recordly to record and transcribe information, which helps produce more accurate news reports. In education, Sonix helps researchers transcribe qualitative interviews.

Of the five leading technology companies that provide speech and voice recognition functions, Google, Amazon, Microsoft, and Apple offer similar capabilities, focused on scheduling, reminders, playlist management, contacting retailers, managing email, placing orders and online search.

These functions are all available on mobile devices and personal computers, and most are also offered on the companies' own-brand home speakers. Amazon's Alexa is on Echo, Apple's Siri is on HomePod, Google Assistant is on Google Home, and Microsoft's Cortana is on Invoke.

Although Apple was a pioneer in this area, it turns out that Siri is much "dumber" than Alexa and Google Assistant and has limited functions compared with the other products. A study consisting of nearly 5,000 questions shows that Google Assistant is the smartest of these four applications.

However, as far as skills are concerned, another report shows that Alexa has the most, at 25,785, followed by Google Assistant with 1,719 and Cortana with 235. Siri is not included in this report. The growth in skills is one reason these companies offer business versions of these applications. Software development kits (SDKs) have been made available to developers, enabling startups and small businesses to build custom skills for their customers.

Titanium AIX is a mini artificial intelligence computer that integrates two core functions: computer vision and intelligent voice interaction. It is equipped with a dedicated AI edge computing chip and a variety of sensors. Model Play is an AI model resource platform for developers worldwide. It comes with a diverse set of built-in AI models, is compatible with Titanium AIX, and supports Google Edge TPU edge AI computing chips to accelerate professional development.

In addition, Model Play provides a complete and easy-to-use transfer learning model training tool and a rich set of model examples, which can be combined with Titanium AIX for rapid development of various artificial intelligence applications. Its standalone transfer learning function is built on Google's open source neural network architectures and algorithms. Users do not need to write code; they can train an AI model by selecting pictures and defining the model and category names.