Voice recognition is a human-machine interface (HMI) that has been adopted in many products, including robotics and smart speakers. The voice recognition solution was developed to meet the need for more convenient functions in consumer and industrial equipment while keeping costs as low as possible. Voice recognition is becoming an important additional feature because it lets visually impaired and elderly users carry out tasks with spoken commands. Renesas' voice recognition solutions run at the edge and do not need an internet connection, which provides differentiation and high functionality in current products.

The voice recognition solution is implemented with an A/D converter or an I2S (Inter-IC Sound) interface together with middleware, and it achieves a high recognition rate in noisy environments by using noise suppressor technology.

Image
Voice Recognition System Overview

 


Noise Suppression Technologies

  • Beamforming - Reduces noise coming from directions other than the target
  • Noise suppressor - Reduces steady noise
  • Echo cancellation - Prevents echo from being created or removes echo that is already present

Solutions

RX231, RX651, RA6M1 Voice Recognition Solution

This solution enables edge voice recognition with a small board.

RX671, RX72N Voice Recognition Solution

Start development quickly with a purchasable Renesas evaluation board

RA4M2 ECM Voice Recognition Solution

Cost-effective edge voice recognition solution using ECM (Electret Condenser Microphone)

RA4W1 Voice Recognition with Bluetooth® Low Energy Solution

This solution enables edge voice recognition, voice playback, Bluetooth Low Energy, and environmental sensing using a single RA4W1 MCU.

RX671 Voice Recognition, Capacitive Touch and Cloud Demo

This solution enables edge voice recognition, capacitive touch, and LCD control using a single RX671 MCU. This solution can also use the Wi-Fi Pmod™ Expansion Board for remote control on the cloud.

RA6M3 HMI Solution

This solution enables edge voice recognition, voice playback, capacitive touch operation, and environmental sensing using a single RA6M3 MCU.


RX231, RX651, RA6M1 Voice Recognition Solution

This solution enables edge voice recognition with a small board.

Features

  • Small voice recognition solution with MEMS microphone
  • Control LED on/off and infrared communication* compatible devices via voice recognition
  • Easily change voice recognition parameters by checking the voice waveform with the evaluation tool

*Supported by RX231 voice recognition solution

Image
RX231 Voice Recognition Solution Board

RX231 Voice Recognition Solution

Image
RX651 Voice Recognition Solution Board

RX651 Voice Recognition Solution

Image
RA6M1 Voice Recognition Solution Board

RA6M1 Voice Recognition Solution

Item | RX231 Voice Recognition Solution | RX651 Voice Recognition Solution | RA6M1 Voice Recognition Solution
Hardware: MCU | RX231 (R5F52318ADFL); ROM/RAM: 512KB/64KB; Package: 48-pin LQFP | RX651 (R5F5651EDDFM); ROM/RAM: 2MB/640KB; Package: 64-pin LFQFP | RA6M1 (R7FA6M1AD3CFM); ROM/RAM: 512KB/256KB; Package: 64-pin LQFP
Hardware: Microphone | Digital MEMS Mic x2 | Analog MEMS Mic x2 | Analog MEMS Mic x2
Hardware: Other functions | Infrared communication, RGB LED, USB (Full Speed), push switch | RGB LED, USB (Full Speed), push switch | RGB LED, USB (Full Speed), push switch
Hardware: Board size | 60mm x 40mm | 60mm x 40mm | 60mm x 40mm
Software: OS | Not used | Not used | Not used
Software: Middleware | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice | Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice
Software: Middleware | - | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice | Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice

Reference Designs

  • RX231 Voice Recognition Solution
    • Hardware: RX231 Group Voice Recognition Demo Board Rev.1.01 (PDF | English, 日本語)
    • Software (Source Code & Application Notes) and Voice Recognition Evaluation Tool: Contact a Renesas sales office for detailed information
  • RX651 Voice Recognition Solution
    • Hardware: RX651 Group Voice Recognition Demo Board (PDF | English, 日本語)
  • RA6M1 Voice Recognition Solution
    • Hardware: RA6M1 Group Voice Recognition Demo Board (PDF | English, 日本語)

RX671, RX72N Voice Recognition Solution

Start development quickly with a purchasable Renesas evaluation board

Features

  • Edge voice recognition solution with MEMS microphone
  • Downloadable demo software
  • Easily change voice recognition parameters by checking the voice waveform with the evaluation tool
Image
RX671 Voice Recognition Solution

RX671 Voice Recognition Solution

Image
RX72N Voice Recognition Solution

RX72N Voice Recognition Solution

Hardware

RX671 Voice Recognition Solution: Renesas Starter Kit+ for RX671
(Model number: RTK55671EHS10000BE)

  • MCU: RX671 (R5F5671EHDFB)
    • ROM/RAM: 2MB/384KB
    • Package: 144-pin LFQFP
  • Digital MEMS Mic x2

RX72N Voice Recognition Solution: RX72N Envision Kit
(Model number: RTK5RX72N0C00000BJ)

  • MCU: RX72N (R5F572NDHDFB)
    • ROM/RAM: 4MB+64KB/1MB
    • Package: 144-pin LFQFP
  • Digital MEMS Mic x2

Software (same for both solutions)

  • OS: Not used
  • Middleware
    • Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice
    • Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice

Download

  • RX671 Group Voice Recognition Demonstration (AmiVoice Micro) Rev.1.00 - Sample Code (ZIP | English, 日本語)
    • Supported languages: Japanese, English
    • Contact a Renesas sales office for sample source and evaluation tool
  • RX671 Group Voice Recognition Demonstration (Voice Trigger Middleware)
    • Coming soon
  • RX72N Group Voice Recognition Demonstration (AmiVoice Micro) Rev.1.00 - Sample Code (ZIP | English, 日本語)
    • Supported languages: Japanese, English
    • Contact a Renesas sales office for sample source and evaluation tool
  • RX72N Group Voice Recognition Demonstration (Voice Trigger Middleware)
    • Coming soon

RA4M2 ECM Voice Recognition Solution

Cost-effective edge voice recognition solution using ECM (Electret Condenser Microphone)

Features

  • Voice recognition solution with a low BOM cost and a small board
  • Uses a cost-efficient ECM for voice input
  • The ECM used for evaluation is selectable, and its amplifier gain is adjustable
Image
RA4M2 ECM Voice Recognition Solution
Hardware

  • MCU: RA4M2 (R7FA4M2AD3CFL)
    • ROM/RAM: 512KB/128KB
    • Package: 48-pin LQFP
  • Op-amp: READ2303G
  • Microphone: Electret Condenser Microphone x1
  • Other functions: RGB LED, USB (Full Speed), push switch
  • Board size: 60mm x 40mm

Software

  • OS: Not used
  • Middleware
    • Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice
    • Toshiba Digital Solutions/RECAIUS™ Voice Trigger; Techno Mathematical/Zoom Voice

Reference Designs

  • RA4M2 Voice Recognition ECM Demo Board (PDF | English, 日本語)
    • Contact a Renesas sales office for how to obtain the demo board
  • RA4M2 Group Voice Recognition Demo Board Sample Software
    • Contact a Renesas sales office for detailed information

Download

  • RA4M2 Group Voice Recognition Demonstration (AmiVoice Micro) Rev.1.00 (PDF | English, 日本語)
    • Supported languages: Japanese, English, Mandarin Chinese
  • RA4M2 Group Voice Recognition Demonstration (Voice Trigger Middleware) Rev.1.00 (PDF | English, 日本語)
    • Supported languages: Japanese, American English, Mandarin Chinese

RA4W1 Voice Recognition with Bluetooth® Low Energy Solution

This solution enables edge voice recognition, voice playback, Bluetooth Low Energy communication, and environmental sensing using a single RA4W1 MCU.

Features

  • Voice recognition, voice playback, Bluetooth Low Energy control, and environmental sensor control with a single RA4W1 MCU
  • Generates audio feedback according to the voice recognition result and sends the result to a smartphone via Bluetooth Low Energy
  • Operate a demo board and confirm sensor information via Bluetooth Low Energy using a mobile device
Image
RA4W1 Voice Recognition with Bluetooth Low Energy Solution Demo Board
Hardware

EK-RA4W1
  • MCU: RA4W1 (R7FA4W1AD2CNG)
    • ROM/RAM: 512KB/96KB
    • Package: 56-pin QFN
  • Bluetooth Low Energy circuit
  • USB Full Speed device
  • Arduino™ UNO connector
HMI Expansion Board
  • Analog MEMS Mic x2
  • External expansion microphone circuit (MEMS type (analog output) or Electret condenser type)
  • Speaker operation circuit and speaker
  • Humidity and temperature Sensor (Renesas/HS3001)
  • Gas sensor (Renesas/ZMOD4410)
  • Arduino Uno connection
Software

  • OS: Not used
  • Middleware
    • Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice; CRI Middleware/D-Amp Driver
    • Toshiba Digital Solutions/Voice Trigger; Techno Mathematical/Zoom Voice; CRI Middleware/D-Amp Driver

 * The voice playback file was created with Toshiba Digital Solutions/RECAIUS speech synthesis middleware Text-to-Speech

Reference Designs

  • RA4W1 Voice Recognition with Bluetooth Low Energy Solution Demo Board
    • Contact a Renesas sales office for detailed information
  • RA4W1 Voice Recognition with Bluetooth Low Energy Solution Demo Sample Software

Download

  • RA4W1 Group Voice Recognition Demonstration (AmiVoice Micro) (PDF | English, 日本語)
    • Supported languages: Japanese, English
  • RA4W1 Group Voice Recognition Demonstration (Voice Trigger Middleware) (PDF | English, 日本語)
    • Supported languages: Japanese, American English, Mandarin Chinese

RX671 Voice Recognition, Capacitive Touch and Cloud Demo

This solution enables edge voice recognition, capacitive touch, and LCD control using a single RX671 MCU. This solution can also use the Wi-Fi Pmod™ Expansion Board for remote control on the cloud.

Features

  • Realize voice recognition, capacitive touch, and LCD control (LCD module) using a single RX671
  • Change application settings via voice recognition and capacitive touch, and display results on an LCD
  • Enable remote control by connecting to the cloud (AWS) via a Wi-Fi module
Image
RX671 Voice Recognition, Capacitive Touch and Cloud Demo
Hardware

Renesas Starter Kit+ for RX671
  • MCU: RX671 (R5F5671EHDFB, supports the encryption function)
    • ROM/RAM: 2MB/384KB
    • Package: 144-pin LFQFP
  • Built-in audio circuit. SSIE (Serial Sound Interface) can be evaluated.
  • Touch feature (self-capacitive type) can be evaluated.
  • Encryption engines and key management function by Trusted Secure IP can be evaluated.
  • Built-in SD memory card slot. SDHI (SD Host Interface) can be evaluated.
  • 1 channel USB Function or 1 channel USB Host can be evaluated.
Wi-Fi Pmod™ Expansion Board
  • IEEE 802.11b/g/n compliant, 2.4GHz, HT20, MCS0-7, up to 13-ch
  • 1x1 single stream system
  • One UART and one HS-UART MCU host interface
  • Full suite of AT command support
Software

  • OS: Amazon FreeRTOS
  • Middleware
    • Advanced Media/AmiVoice Micro
    • Techno Mathematical/Zoom Voice
    • Toshiba Digital Solutions/RECAIUS™ Voice Trigger (Coming soon)

Download

  • RX671 Group Voice recognition / Touch and Cloud solution using Renesas Starter Kit+ for RX671 Rev.1.00 (PDF | English, 日本語)
  • RX671 Group Voice recognition / Touch and Cloud solution using Renesas Starter Kit+ for RX671 Rev.1.00 - Sample Code (ZIP)
    • Contact a Renesas sales office to request the source code.

RA6M3 HMI Solution

This solution enables edge voice recognition, voice playback, capacitive touch operation, and environmental sensing using a single RA6M3 MCU.

Features

  • Realize voice recognition, voice playback, TFT LCD control, and environmental sensor control using the 1-chip RA6M3
  • Use voice recognition to change the TFT LCD settings, and get voice feedback
  • Easily change middleware (M/W) parameters while checking the voice waveform with the evaluation tool
Image
RA6M3 HMI Solution
Hardware

EK-RA6M3G
  • MCU: RA6M3 (R7FA6M3AH3CFC)
    • Package: 176-pin LQFP
  • USB (Debug, Full Speed, High Speed)
  • Graphics expansion board
    • 4.3-inch TFT color LCD (capacitive touch overlay with controller)
    • 480 x 272 resolution
    • Backlight controller
HMI Expansion Board
  • Analog MEMS Mic x2
  • External expansion microphone circuit (MEMS type (analog output) or Electret condenser type)
  • Speaker operation circuit and speaker
  • Humidity and temperature Sensor (Renesas/HS3001)
  • Gas sensor (Renesas/ZMOD4410)
  • Arduino Uno connection
Software

  • OS: Amazon FreeRTOS
  • Middleware
    • Advanced Media/AmiVoice Micro; Techno Mathematical/Zoom Voice; CRI Middleware/D-Amp Driver
    • Toshiba Digital Solutions/Voice Trigger; Techno Mathematical/Zoom Voice; CRI Middleware/D-Amp Driver

 * The voice playback file was created with Toshiba Digital Solutions/RECAIUS speech synthesis middleware Text-to-Speech

Reference Designs

  • RA6M3 HMI solution
    • Hardware: RA6M3 Group RA6M3 HMI Expansion Board (PDF | English, 日本語)
    • Software (Source Code & Application Notes) and Voice Recognition Evaluation Tool: Contact a Renesas sales office for detailed information

Evaluation Tool

Features

Connecting the evaluation board to a PC enables the following functions:

  • Visually check the sound input as a waveform
  • Change the M/W parameters for voice recognition and noise reduction
  • Display recognized ID
  • Sound data before and after noise processing can be saved and played back
Image
Voice Recognition Evaluation Tool

Recommended Middleware


Advanced Media/AmiVoice Micro - Voice Recognition

Advanced Media/AmiVoice Micro enables voice recognition without an internet connection, and it runs at a lower clock frequency and in less memory than existing products.

Supported microcontrollers (MCUs)

Renesas Core:

  • RXv2 (RX231/RX230, RX23W, RX65N, RX651, RX64M Group, etc.)
  • RXv3 (RX671, RX66N, RX72M, RX72N Group, etc.)

Arm Core:

  • Arm® Cortex®-M4 (RA6M1, RA6M2, RA6M3 Group, etc.)
  • Arm Cortex-M33 (RA4M2, RA4M3, RA6M4, RA6M5 Group, etc.)
  • Arm Cortex-A9 (RZ/A Series)

Model | Required Memory Size | Languages
Normal Model | ROM: Over 33KB, RAM: Over 23KB | Japanese, English, Chinese (Mandarin), Thai, Korean
High Recognition Model | ROM: Over 482KB, RAM: Over 23KB | Japanese

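Required ROM/RAM Size by Number of Recognition Words

Number of Words | Normal Model ROM (KB) | Normal Model RAM (KB) | High Recognition Model ROM (KB) | High Recognition Model RAM (KB)
5 | 33 | 23 | 482 | 23
10 | 54 | 25 | 681 | 25
20 | 78 | 28 | 995 | 28
30 | 96 | 30 | 1,226 | 30
40 | 109 | 33 | 1,444 | 33
50 | 117 | 33 | 1,587 | 33
100 | 143 | 46 | 2,143 | 46
150 | 160 | 55 | 2,452 | 55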

These figures vary with the language and the content of the recognition words.

The high recognition model improves the recognition rate by using more ROM for calculation than the normal model.
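
As a rough illustration of how the table above might be used during part selection, the following Python sketch checks whether a model fits a given MCU. The figures are copied from the table. The helper names (`required_kb`, `fits`) and the choice to round a word count up to the next tabulated row are assumptions made for this example, and the check ignores application code and other middleware.

```python
# ROM/RAM requirements in KB, copied from the table above.
# Keys are word counts; values are (ROM, RAM).
NORMAL = {5: (33, 23), 10: (54, 25), 20: (78, 28), 30: (96, 30),
          40: (109, 33), 50: (117, 33), 100: (143, 46), 150: (160, 55)}
HIGH = {5: (482, 23), 10: (681, 25), 20: (995, 28), 30: (1226, 30),
        40: (1444, 33), 50: (1587, 33), 100: (2143, 46), 150: (2452, 55)}


def required_kb(table, words):
    """Return (ROM, RAM) for the smallest tabulated word count >= words.

    Rounding up to the next row is a conservative assumption, not a
    rule stated by the middleware vendor.
    """
    for count in sorted(table):
        if words <= count:
            return table[count]
    raise ValueError("word count exceeds the tabulated range")


def fits(table, words, mcu_rom_kb, mcu_ram_kb):
    """True if the model alone fits; application code is not counted."""
    rom, ram = required_kb(table, words)
    return rom <= mcu_rom_kb and ram <= mcu_ram_kb


# RX231 on the demo board: 512KB ROM, 64KB RAM.
print(fits(NORMAL, 20, 512, 64))  # True
print(fits(HIGH, 20, 512, 64))    # False
```

With 20 words, the normal model fits the RX231 figures, while the high recognition model needs 995KB of ROM and does not.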

Support for Voice Activity Detection (VAD)

VAD support includes a module that detects only the sections of human speech in the audio input, and the detection sensitivity can be adjusted to suit the usage scene and task.
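
To make the idea of an adjustable detection sensitivity concrete, here is a minimal energy-threshold sketch in Python. It is a generic textbook approach, not the AmiVoice Micro algorithm, and it reacts to any loud sound rather than to speech specifically. The function names, the frame length of 160 samples, and the threshold value are all assumptions for illustration.

```python
import math


def frame_rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def detect_speech(samples, frame_len=160, threshold=500.0):
    """Return one boolean per full frame: True if RMS exceeds threshold.

    frame_len and threshold are illustrative values; threshold plays
    the role of the adjustable detection sensitivity.
    """
    flags = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        flags.append(frame_rms(frame) > threshold)
    return flags


# Synthetic input: one quiet frame followed by one loud frame.
quiet = [10] * 160
loud = [2000 if i % 2 == 0 else -2000 for i in range(160)]
print(detect_speech(quiet + loud))  # [False, True]
```

Lowering `threshold` makes the detector more sensitive, at the cost of flagging more background noise as activity.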


Toshiba Digital Solutions/RECAIUS™ Voice Trigger - Voice Recognition

RECAIUS Voice Trigger provides voice control without an internet connection. Target phrases can be changed without speech data, so it can be used as a customized detector for your own wake words and/or voice commands.

Supported MCUs

Renesas Core:

  • RXv2 (RX65N, RX651, RX64M Group, etc.)
  • RXv3 (RX671, RX66N, RX72M, RX72N Group, etc.)

Arm Core:

  • Arm® Cortex®-M4 (RA6M1, RA6M2, RA6M3 Group, etc.)
  • Arm® Cortex®-M33 (RA4M2, RA4M3, RA6M4, RA6M5 Group, etc.)
  • Arm® Cortex®-A9 (RZ/A Series)

Supported languages: Japanese, American English and Mandarin Chinese
To be commercialized (available for evaluation): Canadian French, American Spanish, British English, French, German, Spanish, Italian

Required Memory Size

Number of Words | ROM (KB) | RAM (KB)
5 | 145 | 45
10 | 160 | 50
20 | 190 | 65

These figures vary with the language and the content of the recognition words.


Techno Mathematical/Zoom Voice - Noise Suppressor Technology

Zoom Voice supports two noise suppression technologies: beamforming and a noise suppressor.

Beamforming

  • Extracts the target sound arriving from the front while reducing background noise
  • Uses two omnidirectional microphones
  • The strength of the effect can be set from 1 (weak) to 7 (strong)
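
The principle behind two-microphone beamforming can be sketched with a basic delay-and-sum combiner in Python. This is a generic illustration and not the Zoom Voice implementation. The function name `delay_and_sum`, the signal frequencies, and the 4-sample inter-microphone delay are assumptions chosen so that the side noise cancels exactly, which real signals will not do.

```python
import math


def delay_and_sum(mic_a, mic_b, steer_delay=0):
    """Average two channels after delaying mic_b by steer_delay samples.

    steer_delay=0 steers toward the front, where sound reaches both
    microphones at the same time.
    """
    out = []
    for n in range(len(mic_a)):
        m = n - steer_delay
        b = mic_b[m] if 0 <= m < len(mic_b) else 0.0
        out.append(0.5 * (mic_a[n] + b))
    return out


N = 64
# Target from the front: identical at both microphones.
target = [math.sin(2 * math.pi * n / 16) for n in range(N)]
# Noise from the side: period of 8 samples, reaching mic_b 4 samples
# (half a period) later than mic_a.
noise = [math.sin(2 * math.pi * n / 8) for n in range(N + 4)]
mic_a = [target[n] + noise[n + 4] for n in range(N)]
mic_b = [target[n] + noise[n] for n in range(N)]

out = delay_and_sum(mic_a, mic_b)
err = max(abs(o - t) for o, t in zip(out, target))
print(err < 1e-9)  # True: the side noise cancels in this contrived case
```

The target adds coherently because it is in phase on both channels, while the off-axis noise arrives out of phase and is attenuated.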

Noise Suppressor

  • Up to 30dB of noise reduction (about 1/30 in amplitude)
  • The amount of noise reduction can be set according to frequency
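
The "about 1/30" figure follows from the standard amplitude convention for decibels, where an attenuation of A dB corresponds to an amplitude ratio of 10^(-A/20). A short Python check, with a helper name chosen for this example:

```python
def db_to_amplitude_ratio(db):
    """Amplitude ratio for an attenuation in dB (20*log10 convention)."""
    return 10 ** (-db / 20)


ratio = db_to_amplitude_ratio(30)
print(round(1 / ratio, 1))  # 31.6
```

A 30dB reduction therefore scales the noise amplitude to roughly 1/31.6 of its original value, which rounds to the quoted 1/30.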

High-Speed Version Using DSP Instructions
The version that uses DSP instructions has a 30% higher processing speed than the normal version.

Supported MCUs:

DSP instruction applied version: Renesas Core:

  • RXv2 (RX231/RX230, RX23W, RX65N, RX651, RX64M Group, etc.)
  • RXv3 (RX671, RX66N, RX72M, RX72N Group, etc.)

Normal version: Arm Core:

  • Arm Cortex-M4 (RA6M1, RA6M2, RA6M3 Group, etc.)
  • Arm Cortex-M33 (RA4M2, RA4M3, RA6M4, RA6M5 Group, etc.)
  • Arm Cortex-A9 (RZ/A Series)

Noise Suppressor Technology | Required Memory Size
Beamforming | ROM: 40KB, RAM: 10KB
Noise Suppressor | ROM: 40KB, RAM: 10KB
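
For a first-pass memory budget, the middleware figures quoted in this document can be summed as in the Python sketch below. Treating the requirements as simply additive is an assumption made here for illustration. The real footprint also depends on application code, drivers, and the language and word content, so the totals are a starting point only.

```python
# (ROM KB, RAM KB) figures quoted in this document.
ZOOM_BEAMFORMING = (40, 10)
ZOOM_NOISE_SUPPRESSOR = (40, 10)
AMIVOICE_NORMAL_10_WORDS = (54, 25)
VOICE_TRIGGER_10_WORDS = (160, 50)


def total_kb(*components):
    """Sum (ROM, RAM) pairs, assuming the requirements simply add up."""
    rom = sum(c[0] for c in components)
    ram = sum(c[1] for c in components)
    return rom, ram


print(total_kb(AMIVOICE_NORMAL_10_WORDS, ZOOM_BEAMFORMING,
               ZOOM_NOISE_SUPPRESSOR))  # (134, 45)
print(total_kb(VOICE_TRIGGER_10_WORDS, ZOOM_BEAMFORMING,
               ZOOM_NOISE_SUPPRESSOR))  # (240, 70)
```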

Beamforming and Noise Suppressor Use Case

Image
Beamforming and Noise Suppressor Use Case

Zoom Voice makes a high recognition rate achievable even in noisy environments. The effect is expected to be especially large at an S/N ratio of 5dB or less.
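
For reference, an S/N ratio in dB is conventionally computed as ten times the base-10 logarithm of the ratio of signal power to noise power. The Python sketch below applies that definition to synthetic sample lists. The function name and test signals are assumptions for this example, and it says nothing about how the measurements in this document were taken.

```python
import math


def snr_db(signal, noise):
    """S/N ratio in dB from mean signal power and mean noise power."""
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    return 10 * math.log10(p_signal / p_noise)


speech = [1.0, -1.0] * 50   # mean power 1.0
noise = [0.5, -0.5] * 50    # mean power 0.25
print(round(snr_db(speech, noise), 2))  # 6.02
```

In this example the speech has four times the power of the noise, which is about 6dB, so it sits just above the 5dB region mentioned above.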

The recognition rate by using Zoom Voice under noisy environments (AmiVoice Micro is used for voice recognition) is shown below.

Image
Zoom Voice Recognition Rate

Note 1: The comparison was done using the sound of a vacuum cleaner and washing machine as the source of the noise.
Note 2: This data is based on research from Renesas.


Partners

Image
Advanced Media, Inc. Logo

Advanced Media, Inc.

Development and sales of voice recognition software products

Contact: https://www.advanced-media.co.jp/contact/english/

Image
Toshiba Logo

Toshiba Digital Solutions

System integration, development, manufacture, and sales of ICT solutions utilizing IoT and AI technology

Contact: https://www.toshiba-sol.co.jp/en/contact/index.html

Image
Techno Mathematical Co., Ltd.

Techno Mathematical Co., Ltd.

Development and sales of image, acoustic and sound processing software and hardware products

Contact: https://www.tmath.co.jp/en/contact/


Image
Lab on the Cloud

Lab on the Cloud

Renesas' Lab on the Cloud is an online environment where Renesas solutions, including popular evaluation boards, winning combinations, and software, are hosted in a remote lab that customers access and test online.

Voice Recognition Solutions

The demo system is a simple working solution that recognizes voice commands and initiates the corresponding operation. It uses an RA6M1 MCU based on a high-performance Arm Cortex-M4 core. The MCU is highly efficient and supported by an open and flexible ecosystem, the Flexible Software Package built on FreeRTOS, which reduces development time. The boards are trained with voice models, and each recognition produces a voice response as well as a voice match score.


Videos

RX660 Voice Recognition Solution

This video explains the newly designed RX660 voice recognition solution, including board information, relevant tools, and evaluation results in everyday environments.