
Zephyr on HiFi4: DSP Development, Simplified

January 26, 2026

This blog was recently published on LinkedIn by Iuliana Prodan.

Iuliana is a Software Engineer at NXP, working primarily on Zephyr, as well as Sound Open Firmware and Linux Audio Subsystems. She is passionate about exploring new technologies, driving innovation, and improving everyday engineering practices, with a strong focus on collaboration and knowledge sharing.

Running the Zephyr real-time operating system (RTOS) on ARM® Cortex®-A or Cortex-M cores is a well-established practice, supported by extensive documentation and examples. However, several processors in the NXP i.MX and i.MX RT families include an additional compute engine: one or more Cadence Tensilica digital signal processor (DSP) cores, designed for high-performance audio, voice and neural network processing.

This article focuses on the Cadence Xtensa HiFi4 DSP, the most widely used DSP across NXP’s product lineup. The concepts and methods described here also apply to the other Cadence DSPs that NXP selects for their balance of power efficiency and performance.

The HiFi4 DSP offloads compute-intensive workloads from the main ARM cores, improving overall system performance and energy efficiency. With Zephyr RTOS support, the HiFi4 DSP becomes an accessible, open and highly flexible platform for embedded developers targeting heterogeneous NXP applications.

The Role of the HiFi4 DSP in NXP Architectures

The NXP i.MX 8M Plus is a representative example of a heterogeneous architecture, integrating:

  • Four ARM Cortex-A53 application cores (up to 1.8 GHz)
  • One ARM Cortex-M7 real-time core (up to 800 MHz)
  • One Cadence HiFi4 DSP (up to 800 MHz)

Similarly, several devices in the i.MX RT crossover microcontroller unit (MCU) family also include a HiFi4 DSP core, combining microcontroller simplicity with DSP acceleration for advanced real-time and audio processing.

These heterogeneous designs enable workload partitioning according to performance, latency, and power requirements. Linux typically operates on the Cortex-A cores, while Zephyr RTOS runs on the Cortex-M core. Zephyr can also be deployed on the HiFi4 DSP for signal or data processing tasks.

Figure: Heterogeneous NXP architectures partition workloads across ARM and DSP cores for optimal performance.

The DSP is optimized for:

  • Audio and voice codecs
  • AI and neural network pre- and post-processing
  • Fast Fourier Transform (FFT), filtering and echo cancellation
  • Low-latency communication with ARM cores through Open Asymmetric Multi-Processing (OpenAMP) inter-processor communication (IPC)

By offloading such functions to the DSP, systems can achieve higher responsiveness, reduced CPU load and lower energy consumption.

Zephyr RTOS on HiFi4 DSP

The Zephyr Project is an open source, scalable RTOS optimized for embedded and heterogeneous environments. It supports multiple hardware architectures while providing a consistent, modular framework for device drivers, IPC and synchronization.

NXP has contributed extensions to Zephyr RTOS to enable HiFi4 DSP support across both i.MX and i.MX RT product families. These enhancements make it easier for developers, and the wider community, to take full advantage of DSP acceleration in mixed-core systems.

Supported platforms include:

Additionally, some i.MX RT targets include other DSPs, such as the HiFi1 or Fusion F1.

The same Zephyr build environment can be used for all of these targets, allowing a unified development workflow across ARM and DSP cores.

Firmware loading and runtime management are handled by the Linux remoteproc driver (on i.MX platforms) or multicore management frameworks (on i.MX RT platforms), while OpenAMP provides robust intercore messaging.

Figure: OpenAMP enables fast, reliable intercore communication between ARM and DSP cores in NXP systems.

From Basic Execution to Intercore Collaboration

The Zephyr project offers a variety of samples, ranging from basic system bring-up to advanced processing and intercore communication. The following sections walk through several of them to show how to use Zephyr on the HiFi4 DSP.

Hello World Example

The standard Zephyr hello_world sample demonstrates successful boot and execution of Zephyr on the HiFi4 DSP. Once the firmware is built and loaded, the DSP console output confirms successful startup:

Figure: Hello World console output from Zephyr on an i.MX platform.

This sample establishes a foundation for more advanced applications involving inter-processor communication and workload offloading.

Number Crunching and DSP Acceleration

The number_crunching example highlights the computational advantages of the HiFi4 DSP. This sample performs vector operations, FFT and filtering using either the CMSIS-DSP backend (the DSP library of the Common Microcontroller Software Interface Standard) or the highly optimized Cadence NatureDSP library.

Execution cycle counts demonstrate the significant efficiency gains achieved by the NatureDSP backend, particularly for FFT and infinite impulse response (IIR) filter routines. These performance advantages make the HiFi4 DSP ideal for tasks such as audio post-processing, beamforming and real-time data filtering.
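To make the IIR filtering workload concrete, here is a minimal sketch of a second-order (biquad) IIR section in portable C. It is not the NatureDSP or CMSIS-DSP API, and it is not taken from the number_crunching sample; the names `struct biquad` and `biquad_process` are hypothetical and chosen only for illustration. An optimized DSP library would typically vectorize this kind of loop, which is where cycle-count gains would come from.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration type: one second-order IIR section.
 * Coefficients are normalized so that a0 == 1. */
struct biquad {
    float b0, b1, b2; /* feed-forward coefficients */
    float a1, a2;     /* feedback coefficients */
    float x1, x2;     /* previous two input samples */
    float y1, y2;     /* previous two output samples */
};

/* Direct form I:
 *   y[n] = b0*x[n] + b1*x[n-1] + b2*x[n-2] - a1*y[n-1] - a2*y[n-2]
 * Processes n samples from in[] into out[], updating the filter state. */
void biquad_process(struct biquad *f, const float *in, float *out, size_t n)
{
    assert(f != NULL && in != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        float x = in[i];
        float y = f->b0 * x + f->b1 * f->x1 + f->b2 * f->x2
                - f->a1 * f->y1 - f->a2 * f->y2;

        /* Shift the delay lines by one sample. */
        f->x2 = f->x1;
        f->x1 = x;
        f->y2 = f->y1;
        f->y1 = y;

        out[i] = y;
    }
}
```

With `b0 = 0.5` and `a1 = -0.5` (all other fields zero), the section behaves as a simple one-pole low-pass filter, so a unit impulse decays by half on every sample.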

OpenAMP Inter-Processor Communication

Many applications benefit from collaboration between the ARM and DSP cores. The openamp_rsc_table sample demonstrates how Zephyr running on the HiFi4 DSP communicates with Linux running on an ARM core, using OpenAMP and Remote Processor Messaging (RPMsg). This enables reliable and low-latency message passing between heterogeneous cores.

For example, imagine a mixed-OS multicore system where a Cortex-A core runs Linux while the HiFi4 DSP runs Zephyr. Linux can handle user-space interfaces and high-level control, while the DSP executes computational tasks under Zephyr RTOS, exchanging data in real time through shared memory.
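The idea of exchanging data through shared memory can be sketched with a single-producer, single-consumer ring buffer. This is only a conceptual illustration: it is not the OpenAMP or RPMsg API, and the names `struct shm_ring`, `ring_init`, `ring_send` and `ring_recv` are hypothetical. A real ARM-to-DSP link also needs a notification mechanism between the cores, plus cache maintenance or uncached mappings for the shared region, none of which is shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SLOTS 8u  /* number of message slots; must be a power of two */
#define MSG_BYTES  32u /* maximum payload size per message */

/* Hypothetical layout of a region that both cores could map. */
struct shm_ring {
    _Atomic uint32_t head; /* advanced only by the producer */
    _Atomic uint32_t tail; /* advanced only by the consumer */
    uint8_t len[RING_SLOTS];
    uint8_t data[RING_SLOTS][MSG_BYTES];
};

void ring_init(struct shm_ring *r)
{
    assert(r != NULL);
    atomic_init(&r->head, 0u);
    atomic_init(&r->tail, 0u);
}

/* Producer side: returns false if the ring is full or the message is too big. */
bool ring_send(struct shm_ring *r, const void *msg, size_t len)
{
    assert(r != NULL && msg != NULL);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);

    if (len > MSG_BYTES || head - tail == RING_SLOTS) {
        return false;
    }
    uint32_t slot = head & (RING_SLOTS - 1u);
    memcpy(r->data[slot], msg, len);
    r->len[slot] = (uint8_t)len;
    /* Publish the slot only after its payload has been written. */
    atomic_store_explicit(&r->head, head + 1u, memory_order_release);
    return true;
}

/* Consumer side: returns the payload length, -1 if the ring is empty,
 * or -2 if the caller's buffer is too small (message stays queued). */
int ring_recv(struct shm_ring *r, void *buf, size_t cap)
{
    assert(r != NULL && buf != NULL);
    uint32_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    uint32_t head = atomic_load_explicit(&r->head, memory_order_acquire);

    if (head == tail) {
        return -1;
    }
    uint32_t slot = tail & (RING_SLOTS - 1u);
    size_t len = r->len[slot];
    if (len > cap) {
        return -2;
    }
    memcpy(buf, r->data[slot], len);
    /* Release the slot back to the producer. */
    atomic_store_explicit(&r->tail, tail + 1u, memory_order_release);
    return (int)len;
}
```

In this sketch each index has exactly one writer, so the release/acquire pairs on `head` and `tail` are enough to order the payload copies, assuming both sides share a coherent view of the memory.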

Audio Offload with Sound Open Firmware (SOF)

For advanced audio applications, SOF builds on Zephyr RTOS to provide a complete open source audio processing framework on the HiFi4 DSP.

SOF enables professional-grade, low-latency audio pipelines, fully integrated with Advanced Linux Sound Architecture (ALSA) on Cortex-A platforms. It supports:

  • Multi-channel audio routing
  • Voice pre- and post-processing
  • Audio effect chains and dynamic reconfiguration

This framework demonstrates how Zephyr enables scalable, production-ready DSP solutions for the i.MX product line.

Advantages of Running Zephyr on the DSP

Running Zephyr RTOS on the HiFi4 DSP provides multiple benefits:

  • Unified development flow: Common APIs, tools and build systems across ARM and DSP targets running the RTOS
  • Performance optimization: Offload high-intensity compute or signal processing workloads from ARM cores
  • Open and extensible: Leverages the open source Zephyr and SOF ecosystems to minimize long-term technical debt
  • Scalable system design: Enables seamless cooperation between Linux, Zephyr on ARM and Zephyr on DSP

Bringing it All Together

The Cadence HiFi4 DSP integrated in NXP i.MX and i.MX RT processors is a high-performance, low-power compute engine well suited for signal processing, audio and AI acceleration. Through Zephyr RTOS support, this DSP becomes an integral part of a unified, heterogeneous processing environment.

From basic Zephyr examples such as hello_world, to performance-oriented number-crunching routines and complex intercore communication with OpenAMP, Zephyr on the HiFi4 DSP delivers a scalable foundation for innovation. Together with SOF, this capability extends to production-ready audio pipelines and advanced embedded workloads, offering flexibility and openness across the NXP ecosystem.

Zephyr RTOS enables the HiFi4 DSP to operate as a powerful coprocessor, unlocking new performance and efficiency opportunities for the next generation of embedded designs.