Voice recognition is a human-machine interface (HMI) that has been adopted in many products, including robots and smart speakers. Renesas' voice recognition solutions were developed to add more convenient functions to consumer and industrial equipment while keeping costs as low as possible. Voice recognition is becoming an important additional feature because spoken commands can help visually impaired and elderly users carry out tasks. The solutions run on the device itself and do not need an internet connection (edge voice recognition), which differentiates products and adds functionality.
The voice recognition solution is implemented with an A/D converter or I2S (Inter-IC Sound) interface together with middleware, and it achieves a high recognition rate in noisy environments by using noise suppression technology.
Noise Suppression Technologies
- Beamforming - Reduce noise arriving from directions other than the target
- Noise suppressor - Reduce steady noise
- Echo cancellation - Prevent or remove echo that is being created or already present
Solutions
RX231, RX651, RA6M1 Voice Recognition Solution
This solution enables edge voice recognition with a small board.
RX671, RX72N Voice Recognition Solution
Start development quickly with a purchasable Renesas evaluation board
RA4M2 ECM Voice Recognition Solution
Cost-effective edge voice recognition solution using ECM (Electret Condenser Microphone)
RA4W1 Voice Recognition with Bluetooth® Low Energy Solution
This solution enables edge voice recognition, voice playback, Bluetooth Low Energy, and environmental sensing using a single RA4W1 MCU.
RX671 Voice Recognition, Capacitive Touch and Cloud Demo
This solution enables edge voice recognition, capacitive touch, and LCD control using a single RX671 MCU. This solution can also use the Wi-Fi Pmod™ Expansion Board for remote control on the cloud.
RA6M3 HMI Solution
This solution enables edge voice recognition, voice playback, capacitive touch operation, and environmental sensing using a single RA6M3 MCU.
RX231, RX651, RA6M1 Voice Recognition Solution
This solution enables edge voice recognition with a small board.
Features
- Small voice recognition solution with MEMS microphone
- Control LED on/off and infrared communication* compatible devices via voice recognition
- Easily change voice recognition parameters by checking the voice waveform with the evaluation tool
*Supported by RX231 voice recognition solution
Category | Item | RX231 Voice Recognition Solution | RX651 Voice Recognition Solution | RA6M1 Voice Recognition Solution |
---|---|---|---|---|
Hardware | MCU | RX231 (R5F52318ADFL); ROM/RAM: 512KB/64KB; Package: 48-pin LQFP | RX651 (R5F5651EDDFM); ROM/RAM: 2MB/640KB; Package: 64-pin LFQFP | RA6M1 (R7FA6M1AD3CFM); ROM/RAM: 512KB/256KB; Package: 64-pin LQFP |
Hardware | Microphone | Digital MEMS Mic x2 | Analog MEMS Mic x2 | Analog MEMS Mic x2 |
Hardware | Other functions | Infrared communication, RGB LED, USB (Full Speed), push switch | RGB LED, USB (Full Speed), push switch | RGB LED, USB (Full Speed), push switch |
Hardware | Board size | 60mm x 40mm | 60mm x 40mm | 60mm x 40mm |
Software | OS | Not used | Not used | Not used |
Software | Middleware | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice |
Software | Middleware | - | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice |
Reference Designs
Solution | Hardware | Software (Source Code & Application Notes) and Voice Recognition Evaluation Tool |
---|---|---|
RX231 Voice Recognition Solution | RX231 Group Voice Recognition Demo Board Rev.1.01 (PDF: English, 日本語) | Contact a Renesas sales office for detailed information |
RX651 Voice Recognition Solution | RX651 Group Voice Recognition Demo Board (PDF: English, 日本語) | Contact a Renesas sales office for detailed information |
RA6M1 Voice Recognition Solution | RA6M1 Group Voice Recognition Demo Board (PDF: English, 日本語) | Contact a Renesas sales office for detailed information |
RX671, RX72N Voice Recognition Solution
Start development quickly with a purchasable Renesas evaluation board
Features
- Edge voice recognition solution with MEMS microphone
- Downloadable demo software
- Easily change voice recognition parameters by checking the voice waveform with the evaluation tool
Category | Item | RX671 Voice Recognition Solution | RX72N Voice Recognition Solution |
---|---|---|---|
Hardware | | Renesas Starter Kit+ for RX671 | RX72N Envision Kit |
Software | OS | Not used | Not used |
Software | Middleware | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice |
Software | Middleware | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice |
Download
Items | Notes |
---|---|
RX671 Group Voice Recognition Demonstration (AmiVoice Micro) Rev.1.00 - Sample Code (ZIP: English, 日本語) | Supported languages: Japanese, English. Contact a Renesas sales office for the sample source and evaluation tool. |
RX671 Group Voice Recognition Demonstration (Voice Trigger Middleware) | Coming soon |
RX72N Group Voice Recognition Demonstration (AmiVoice Micro) Rev.1.00 - Sample Code (ZIP: English, 日本語) | Supported languages: Japanese, English. Contact a Renesas sales office for the sample source and evaluation tool. |
RX72N Group Voice Recognition Demonstration (Voice Trigger Middleware) | Coming soon |
RA4M2 ECM Voice Recognition Solution
Cost-effective edge voice recognition solution using ECM (Electret Condenser Microphone)
Features
- Low BOM cost and small board voice recognition solution
- Use cost-efficient ECM for voice input
- The ECM can be selected for evaluation, and its amplifier gain is adjustable
Category | Item | RA4M2 ECM Voice Recognition Solution |
---|---|---|
Hardware | MCU | RA4M2 (R7FA4M2AD3CFL); ROM/RAM: 512KB/128KB; Package: 48-pin LQFP |
Hardware | Op-amp | READ2303G |
Hardware | Microphone | Electret Condenser Microphone x1 |
Hardware | Other functions | RGB LED, USB (Full Speed), push switch |
Hardware | Board size | 60mm x 40mm |
Software | OS | Not used |
Software | Middleware | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice |
Software | Middleware | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice |
Reference Designs
Items | Notes |
---|---|
RA4M2 Voice Recognition ECM Demo Board (PDF: English, 日本語) | Contact a Renesas sales office for how to obtain the demo board |
RA4M2 Group Voice Recognition Demo Board Sample Software | Contact a Renesas sales office for detailed information |
Download
Items | Notes |
---|---|
RA4M2 Group Voice Recognition Demonstration (AmiVoice Micro) Rev.1.00 (PDF: English, 日本語) | Supported languages: Japanese, English, Mandarin Chinese |
RA4M2 Group Voice Recognition Demonstration (Voice Trigger Middleware) Rev.1.00 (PDF: English, 日本語) | Supported languages: Japanese, American English, Mandarin Chinese |
RA4W1 Voice Recognition with Bluetooth® Low Energy Solution
This solution enables edge voice recognition, voice playback, Bluetooth Low Energy communication, and environmental sensing using a single RA4W1 MCU.
Features
- Voice recognition, voice playback, Bluetooth Low Energy control, and environmental sensor control with a single RA4W1 MCU
- Generates audio feedback according to the voice recognition result and sends the result to a smartphone via Bluetooth Low Energy
- Operate a demo board and confirm sensor information via Bluetooth Low Energy using a mobile device
Category | Item | RA4W1 Voice Recognition with Bluetooth Low Energy Solution |
---|---|---|
Hardware | | EK-RA4W1 |
Hardware | | HMI Expansion Board |
Software | OS | Not used |
Software | Middleware | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice; CRI Middleware/D-Amp Driver |
Software | Middleware | Toshiba Digital Solutions/Voice Trigger; Techno Mathematical/Zoom Voice; CRI Middleware/D-Amp Driver. * The voice playback file was created by Toshiba Digital Solutions/RECAIUS speech synthesis middleware Text-to-Speech |
Reference Designs
Items | Notes |
---|---|
RA4W1 Voice Recognition with Bluetooth Low Energy Solution Demo Board | Contact a Renesas sales office for detailed information |
RA4W1 Voice Recognition with Bluetooth Low Energy Solution Demo Sample Software |
Download
Deliverables | Notes |
---|---|
RA4W1 Group Voice Recognition Demonstration (AmiVoice Micro) (PDF: English, 日本語) | Supported languages: Japanese, English |
RA4W1 Group Voice Recognition Demonstration (Voice Trigger Middleware) (PDF: English, 日本語) | Supported languages: Japanese, American English, Mandarin Chinese |
RX671 Voice Recognition, Capacitive Touch and Cloud Demo
This solution enables edge voice recognition, capacitive touch, and LCD control using a single RX671 MCU. This solution can also use the Wi-Fi Pmod™ Expansion Board for remote control on the cloud.
Features
- Realize voice recognition, capacitive touch, and LCD control (LCD module) using a single RX671
- Change application settings via voice recognition and capacitive touch, and display results on an LCD
- Enable remote control by connecting to the cloud (AWS) via a Wi-Fi module
Category | Item | RX671 Voice Recognition, Capacitive Touch and Cloud Demo |
---|---|---|
Hardware | | Renesas Starter Kit+ for RX671 |
Hardware | | Wi-Fi Pmod™ Expansion Board |
Software | OS | Amazon FreeRTOS |
Software | Middleware | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice |
Software | Middleware | Toshiba Digital Solutions/RECAIUS™ Voice Trigger (Coming soon) |
RA6M3 HMI Solution
This solution enables edge voice recognition, voice playback, capacitive touch operation, and environmental sensing using a single RA6M3 MCU.
Features
- Realize voice recognition, voice playback, TFT LCD control, and environmental sensor control using a single RA6M3 MCU
- Use voice recognition to change the TFT LCD settings, and get voice feedback
- Easily change middleware (M/W) parameters while checking the voice waveform with the evaluation tool
Category | Item | RA6M3 HMI Solution |
---|---|---|
Hardware | | EK-RA6M3G |
Hardware | | HMI Expansion Board |
Software | OS | Amazon FreeRTOS |
Software | Middleware | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice; CRI Middleware/D-Amp Driver |
Software | Middleware | Toshiba Digital Solutions/Voice Trigger; Techno Mathematical/Zoom Voice; CRI Middleware/D-Amp Driver. * The voice playback file was created by Toshiba Digital Solutions/RECAIUS speech synthesis middleware Text-to-Speech |
Reference Designs
Solution | Hardware | Software (Source Code & Application Notes) and Voice Recognition Evaluation Tool |
---|---|---|
RA6M3 HMI Solution | RA6M3 Group RA6M3 HMI Expansion Board (PDF: English, 日本語) | Contact a Renesas sales office for detailed information |
Evaluation Tool
Features
Connecting the evaluation board to a PC enables the following functions:
- Visually check the sound input as a waveform
- Change the M/W parameters for voice recognition and noise reduction
- Display the recognized ID
- Save and play back sound data from before and after noise processing
Recommended Middleware
Advanced Media/AmiVoice Micro - Voice Recognition
Advanced Media/AmiVoice Micro enables voice recognition without an internet connection, at a lower clock frequency and with a smaller memory footprint than existing products.
Supported microcontrollers (MCUs)
Renesas Core:
- RXv2 (RX231/RX230, RX23W, RX65N, RX651, RX64M Group etc.)
- RXv3 (RX671, RX66N, RX72M, RX72N Group etc.)
Arm Core:
- Arm® Cortex®-M4 (RA6M1, RA6M2, RA6M3 Group, etc.)
- Arm Cortex-M33 (RA4M2, RA4M3, RA6M4, RA6M5 Group, etc.)
- Arm Cortex-A9 (RZ/A Series)
Model | Required Memory Size | Languages |
---|---|---|
Normal Model | ROM: Over 33KB, RAM: Over 23KB | Japanese, English, Chinese (Mandarin), Thai, Korean |
High Recognition Model | ROM: Over 482KB, RAM: Over 23KB | Japanese |
Required ROM/RAM Size by Number of Recognition Words
Number of Words | Normal Model ROM (KB) | Normal Model RAM (KB) | High Recognition Model ROM (KB) | High Recognition Model RAM (KB) |
---|---|---|---|---|
5 | 33 | 23 | 482 | 23 |
10 | 54 | 25 | 681 | 25 |
20 | 78 | 28 | 995 | 28 |
30 | 96 | 30 | 1,226 | 30 |
40 | 109 | 33 | 1,444 | 33 |
50 | 117 | 33 | 1,587 | 33 |
100 | 143 | 46 | 2,143 | 46 |
150 | 160 | 55 | 2,452 | 55 |
These figures vary with the language and the content of the recognition words.
The high recognition model improves the recognition rate by using more ROM for its calculations than the normal model.
Support for Voice Activity Detection (VAD)
The middleware includes a module that detects only the sections of the input audio that contain human speech. The detection sensitivity can be adjusted to suit the usage scene and task.
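To make the idea of an adjustable detection sensitivity concrete, here is a minimal sketch of a generic energy-based voice activity detector. It is not the AmiVoice Micro algorithm, which is not described on this page, and it is much cruder: it flags any loud frame, not only human speech. The function names, the 160-sample frame (10 ms at an assumed 16 kHz sample rate), the `threshold_db` sensitivity setting, and the hangover count are all hypothetical choices made for illustration.

```python
import math


def frame_energy_db(frame):
    """Mean-square energy of one frame in dB relative to full scale (1.0)."""
    mean_square = sum(s * s for s in frame) / len(frame)
    return 10 * math.log10(mean_square + 1e-12)  # small offset avoids log(0)


def detect_speech(samples, frame_len=160, threshold_db=-35.0, hangover_frames=3):
    """Return one boolean per frame: True where the frame counts as active.

    threshold_db plays the role of the sensitivity setting (hypothetical name):
    lowering it makes the detector more sensitive. hangover_frames keeps the
    detector active briefly after the energy drops, so word endings are kept.
    """
    flags = []
    hang = 0
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        if frame_energy_db(frame) > threshold_db:
            hang = hangover_frames
            flags.append(True)
        elif hang > 0:
            hang -= 1
            flags.append(True)
        else:
            flags.append(False)
    return flags


# Synthetic input: 3 quiet frames, 2 loud frames (a 440 Hz tone), 5 quiet frames.
quiet = [0.001] * 160
loud = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(160)]
samples = quiet * 3 + loud * 2 + quiet * 5

# The 2 loud frames plus 3 hangover frames are flagged as active.
print(detect_speech(samples))
```

Raising `threshold_db` makes the detector less likely to trigger on background noise, at the cost of missing quiet speech, which is the kind of trade-off a sensitivity setting exposes.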
Toshiba Digital Solutions/RECAIUS™ Voice Trigger - Voice Recognition
RECAIUS Voice Trigger provides voice control without an internet connection. Target phrases can be changed without speech data, so it can be used as a customized detector for your own wake words and voice commands.
Supported MCUs
Renesas Core:
- RXv2 (RX65N, RX651, RX64M Group, etc.)
- RXv3 (RX671, RX66N, RX72M, RX72N Group, etc.)
Arm Core:
- Arm® Cortex®-M4 (RA6M1, RA6M2, RA6M3 Group, etc.)
- Arm® Cortex®-M33 (RA4M2, RA4M3, RA6M4, RA6M5 Group, etc.)
- Arm® Cortex®-A9 (RZ/A Series)
Supported languages: Japanese, American English and Mandarin Chinese
To be commercialized (available for evaluation): Canadian French, American Spanish, British English, French, German, Spanish, Italian
Required Memory Size
Number of Words | ROM (KB) | RAM (KB) |
---|---|---|
5 | 145 | 45 |
10 | 160 | 50 |
20 | 190 | 65 |
These figures vary with the language and the content of the recognition words.
Techno Mathematical/Zoom Voice - Noise Suppressor Technology
Zoom Voice supports two noise suppression technologies: beamforming and a noise suppressor.
Beamforming
- Extracts the target sound arriving from the front while reducing background noise
- Uses two non-directional microphones
- The effect strength can be set from 1 (weak) to 7 (strong)
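As an illustration of how two microphones can favor sound from the front, the sketch below implements a textbook delay-and-sum beamformer. This is a generic technique swapped in for explanation; the page does not describe how Zoom Voice is implemented, and the 1 to 7 effect setting is not modeled. The function names, the 16 kHz sample rate, and the 4-sample inter-microphone delay of the interfering tone are assumptions chosen so that the cancellation is exact.

```python
import math


def delay_and_sum(mic_a, mic_b, steer_delay=0):
    """Average two channels after delaying mic_b by steer_delay samples.

    With steer_delay=0 the beam points to the front: a source straight ahead
    reaches both microphones at the same time, so it adds coherently, while
    sound from the side arrives with a time offset and is partly cancelled.
    """
    out = []
    for n in range(len(mic_a)):
        m = n - steer_delay
        b = mic_b[m] if 0 <= m < len(mic_b) else 0.0
        out.append(0.5 * (mic_a[n] + b))
    return out


def rms(signal):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))


FS = 16000  # assumed sample rate in Hz
N = 320

# Target: 500 Hz tone from the front, identical at both microphones.
target = [math.sin(2 * math.pi * 500 * n / FS) for n in range(N)]
# Interferer: 2 kHz tone from the side, reaching mic B 4 samples later.
# 4 samples is half a period at 2 kHz, so the two copies are in antiphase.
noise_a = [math.sin(2 * math.pi * 2000 * n / FS) for n in range(N)]
noise_b = [math.sin(2 * math.pi * 2000 * (n - 4) / FS) for n in range(N)]

mic_a = [t + x for t, x in zip(target, noise_a)]
mic_b = [t + x for t, x in zip(target, noise_b)]

output = delay_and_sum(mic_a, mic_b)
residual = [o - t for o, t in zip(output, target)]

print(f"noise level at one microphone: {rms(noise_a):.3f}")
print(f"noise level after beamforming: {rms(residual):.3e}")  # close to zero
```

The complete cancellation here is an artifact of the contrived single-frequency interferer. Real noise is broadband, so a practical beamformer reduces it by a finite amount that depends on frequency, microphone spacing, and arrival angle.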
Noise Suppressor
- Noise reduction of up to 30dB (about 1/30 in amplitude)
- The amount of noise reduction can be set according to frequency
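The "30dB (about 1/30)" figure can be checked with the standard decibel definitions. The short sketch below converts an attenuation in dB to a linear ratio under both conventions; the function names are hypothetical. It shows that 1/30 matches the amplitude convention (20·log10), whereas the same 30dB corresponds to 1/1000 in power.

```python
def db_to_amplitude_ratio(attenuation_db):
    """Linear amplitude ratio for an attenuation given in dB (20*log10 rule)."""
    return 10 ** (-attenuation_db / 20)


def db_to_power_ratio(attenuation_db):
    """Linear power ratio for an attenuation given in dB (10*log10 rule)."""
    return 10 ** (-attenuation_db / 10)


amplitude = db_to_amplitude_ratio(30.0)
power = db_to_power_ratio(30.0)

print(f"amplitude ratio: 1/{1 / amplitude:.1f}")  # 1/31.6
print(f"power ratio:     1/{1 / power:.0f}")      # 1/1000
```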
High-Speed Version Using DSP Instructions
The version that uses DSP instructions processes 30% faster than the normal version.
Supported MCUs:
DSP instruction version (Renesas Core):
- RXv2 (RX231/RX230, RX23W, RX65N, RX651, RX64M Group, etc.)
- RXv3 (RX671, RX66N, RX72M, RX72N Group, etc.)
Normal version (Arm Core):
- Arm Cortex-M4 (RA6M1, RA6M2, RA6M3 Group, etc.)
- Arm Cortex-M33 (RA4M2, RA4M3, RA6M4, RA6M5 Group, etc.)
- Arm Cortex-A9 (RZ/A Series)
Noise Suppressor Technology | Required Memory Size |
---|---|
Beamforming | ROM: 40KB, RAM: 10KB |
Noise Suppressor | ROM: 40KB, RAM: 10KB |
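The memory tables on this page can be combined into a rough middleware budget for a given MCU. The sketch below is one way to do that, under stated assumptions: the AmiVoice Micro (normal model) and Zoom Voice figures are taken as simply additive, each Zoom Voice feature is assumed to need its own 40KB/10KB, and application code, stack, and audio buffers are not counted. The function and dictionary names are hypothetical, and actual figures vary with language and recognition words as noted above.

```python
# All figures are in KB and are copied from the tables on this page.
# AmiVoice Micro normal model: number of words -> (ROM, RAM)
AMIVOICE_NORMAL = {
    5: (33, 23), 10: (54, 25), 20: (78, 28), 30: (96, 30),
    40: (109, 33), 50: (117, 33), 100: (143, 46), 150: (160, 55),
}
# Zoom Voice: feature -> (ROM, RAM); assumed to apply per feature.
ZOOM_VOICE = {"beamforming": (40, 10), "noise_suppressor": (40, 10)}


def middleware_footprint(num_words, zoom_features=()):
    """Sum ROM and RAM (KB) for AmiVoice Micro plus the chosen Zoom Voice features."""
    rom, ram = AMIVOICE_NORMAL[num_words]
    for feature in zoom_features:
        feature_rom, feature_ram = ZOOM_VOICE[feature]
        rom += feature_rom
        ram += feature_ram
    return rom, ram


def fits(footprint, mcu_rom_kb, mcu_ram_kb):
    """True if the middleware alone fits in the MCU's ROM and RAM."""
    rom, ram = footprint
    return rom <= mcu_rom_kb and ram <= mcu_ram_kb


both = ("beamforming", "noise_suppressor")

# RX231 as listed on this page: ROM 512KB, RAM 64KB.
print(middleware_footprint(20, both))                  # (158, 48)
print(fits(middleware_footprint(20, both), 512, 64))   # True
print(fits(middleware_footprint(150, both), 512, 64))  # False: RAM would be 75KB
```

In this estimate RAM, not ROM, is the first limit reached on the smaller parts, so the word count and the number of enabled noise suppression features are the settings to watch.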
Beamforming and Noise Suppressor Use Case
Zoom Voice achieves a high recognition rate even in noisy environments. The effect is especially large at S/N ratios of 5dB or less.
The recognition rate with Zoom Voice in noisy environments is shown below (AmiVoice Micro is used for voice recognition).
Note 1: The comparison was done using the sound of a vacuum cleaner and washing machine as the source of the noise.
Note 2: This data is based on research from Renesas.
Partners
Advanced Media, Inc.
Development and sales of voice recognition software products
Toshiba Digital Solutions
System integration, development, manufacture, and sales of ICT solutions utilizing IoT and AI technology
Contact: https://www.toshiba-sol.co.jp/en/contact/index.html
Techno Mathematical Co., Ltd.
Development and sales of image, acoustic and sound processing software and hardware products
Contact: https://www.tmath.co.jp/en/contact/
Lab on the Cloud
Renesas' Lab on the Cloud is an online environment where Renesas solutions, including popular evaluation boards, winning combinations, and software, are hosted in a remote lab that customers access and test online.
Voice Recognition Solutions
The demo system is a simple working solution that recognizes voice commands and initiates the corresponding operation. It uses a high-performance RA6M1 MCU based on the Arm Cortex-M4 core. It is highly efficient and is supported by an open and flexible ecosystem, the Flexible Software Package built on FreeRTOS, which reduces development time. The boards are trained with voice models to recognize commands, and they respond with a voice response and a voice match score.