Artificial intelligence (AI) algorithms are steadily moving into IoT edge devices, bringing advantages in real-time performance, power efficiency, and security. One popular implementation of AI at the edge is a smart speaker with voice control.
According to a survey from Business Insider Intelligence, as many as half of US respondents reported living in a home with a voice-enabled AI device. According to eMarketer, the most popular uses for smart speakers are listening to audio, making inquiries, and shopping (though not necessarily purchasing).
The introduction of battery-powered smart speakers to the market is sure to uncover more use cases for smart speakers and help drive further market growth. Being untethered from a power cord allows users to move their smart speakers around with them from room to room, or outside to the patio or pool – basically anywhere there is an Internet connection.
AI modes of operation
AI relies on two distinct modes of operation. The first is the learning or training phase, where machine learning algorithms are used to create working models. The second is the inference phase, where the system interprets new data based on that training. The trained knowledge generally exists on the device in the form of inference tables. For edge devices, the user experience is greatly influenced by the performance of the inference phase.
Smart speakers make use of inference tables for functions such as wake-up or word recognition. The smart speaker listens for a specific spoken word and uses the stored inference tables to determine whether the word it just heard was its predetermined wake-up word.
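As a rough illustration of the inference step, the sketch below scores a feature vector against a small stored table of weights and compares the result with a threshold. The weights, bias, threshold, and four-element feature vector are all hypothetical values chosen for illustration. A production wake-word detector would typically use a trained neural network and real audio features.

```python
import math

# Hypothetical "inference table": values that an offline training phase
# would produce and that the device would keep in non-volatile memory.
WAKE_WORD_WEIGHTS = [0.8, -0.4, 1.2, 0.5]
WAKE_WORD_BIAS = -1.0
DETECTION_THRESHOLD = 0.5  # assumed decision threshold


def wake_word_score(features):
    """Return a logistic score in (0, 1) for one frame of audio features."""
    if len(features) != len(WAKE_WORD_WEIGHTS):
        raise ValueError("feature vector length does not match the table")
    weighted_sum = sum(w * x for w, x in zip(WAKE_WORD_WEIGHTS, features))
    return 1.0 / (1.0 + math.exp(-(weighted_sum + WAKE_WORD_BIAS)))


def is_wake_word(features, threshold=DETECTION_THRESHOLD):
    """Decide whether the scored features look like the wake-up word."""
    return wake_word_score(features) >= threshold
```

The point of the sketch is that inference only reads the stored table and never modifies it, which is why read performance of the memory holding the table matters for responsiveness.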
Flash memory considerations
Because these systems rely heavily on code and data storage, system performance and cost are directly dependent on the performance and cost of the memory. When designing a battery-powered edge device that executes AI algorithms, architects must revisit their system’s memory architecture.
In the article, “Select the Right Flash Memory for Your Battery-Powered AI Speaker with Voice Control,” Adesto’s Bard Pedersen examines different choices of memory architectures for implementing the inference phase of a battery-powered smart speaker.
The article addresses two key considerations for a user-friendly experience:
- Users want long battery life, so the system must have ultra-low energy consumption while in an idle state
- Users expect an immediate response from the device once they speak their command word, so the response time must be quick
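To see why idle consumption dominates battery life for a device that spends most of its time listening, consider the back-of-the-envelope estimate below. It is a simplified sketch: the function names are hypothetical, the model ignores battery self-discharge and regulator losses, and the figures in the usage note are illustrative, not measurements from the article.

```python
def average_current_ma(idle_ma, active_ma, active_fraction):
    """Time-weighted average current for a device that is mostly idle."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return idle_ma * (1.0 - active_fraction) + active_ma * active_fraction


def battery_life_hours(capacity_mah, average_ma):
    """Idealized battery life: capacity divided by average current draw."""
    if average_ma <= 0:
        raise ValueError("average_ma must be positive")
    return capacity_mah / average_ma
```

With assumed values of 1 mA idle, 100 mA active, and the device active 1% of the time, the average draw is about 1.99 mA. Roughly half of that comes from the idle state, so halving the idle current would extend battery life noticeably even though the active current is 100 times larger.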
The article examines:
- Considerations for choosing a memory architecture for AI applications
- Trade-offs between different options, including external DRAM, external quad SPI flash, and internal embedded flash
- An approach using a serial NOR flash device with octal SPI for code, data and execute-in-place operation
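One way to get a feel for the quad versus octal trade-off is to compare raw bus transfer times. The sketch below computes an idealized lower bound that assumes one bit per lane per clock edge and ignores command, address, and dummy-cycle overhead. The function name and the example clock and image size are assumptions for illustration, not figures from the article.

```python
def transfer_time_ms(size_bytes, clock_hz, lanes, ddr=False):
    """Idealized time to read size_bytes over an SPI bus with `lanes` data lines."""
    if lanes not in (1, 2, 4, 8):
        raise ValueError("lanes must be 1, 2, 4, or 8")
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    bits_per_clock = lanes * (2 if ddr else 1)  # DDR moves data on both edges
    bytes_per_second = clock_hz * bits_per_clock / 8
    return size_bytes / bytes_per_second * 1000.0
```

For an assumed 1,000,000-byte image on a 100 MHz single-data-rate bus, this gives about 20 ms over four lanes and about 10 ms over eight. Real devices add protocol overhead, so measured times will be longer.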