ARM-software/ACI-GetStarted


Get Started with Arm Custom Instructions (ACI)

Arm Custom Instructions (ACI) extend Arm processors with application-specific instructions to optimize the performance of algorithms. ACI is currently implemented on Cortex-M33, Cortex-M52, Cortex-M55, and Cortex-M85 processors using the Custom Datapath Extension (CDE). CDE extends the processor with a custom compute pipeline for accelerators, which avoids the overhead of the coprocessor interface.

Note

The instruction set of the Cortex-M processor series is already comprehensive and delivers very good out-of-the-box performance, with features like Helium for highly efficient DSP and ML processing. However, in some cases custom-defined instructions are beneficial, for example when data inputs require bit manipulations that take several clock cycles. If such an operation executes frequently, a single-cycle custom instruction improves performance and energy efficiency.
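As an illustration of a bit manipulation that takes several steps in software, the sketch below interleaves the bits of two 16-bit values into one 32-bit word (a Morton code). This operation is an assumption chosen for illustration and is not taken from the examples in this repository; the function names `spread16` and `interleave16` are hypothetical.

```c
#include <stdint.h>

/* Spread the low 16 bits of v so that bit i moves to bit 2*i.
 * Each line is a shift, an OR, and a mask: four such stages in total. */
uint32_t spread16(uint32_t v)
{
    v &= 0x0000FFFFu;
    v = (v | (v << 8)) & 0x00FF00FFu;
    v = (v | (v << 4)) & 0x0F0F0F0Fu;
    v = (v | (v << 2)) & 0x33333333u;
    v = (v | (v << 1)) & 0x55555555u;
    return v;
}

/* Interleave x (even bit positions) and y (odd bit positions). */
uint32_t interleave16(uint16_t x, uint16_t y)
{
    return spread16(x) | (spread16(y) << 1);
}
```

In plain C this costs roughly two dozen ALU operations per call. A routine of this shape, when it sits in a hot loop, is the kind of candidate that could be worth evaluating as a single custom instruction.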

About this Repository

Imagine that you plan to accelerate firmware with a set of custom instructions. Before proceeding to hardware design, you want to answer questions such as "How can we accelerate our algorithm?" There is usually more than one solution, and each solution corresponds to a different set of custom instructions, which raises the follow-up question "Which one is the best?"

This repository helps you answer these questions by evaluating your code in software simulation before the time-consuming hardware design begins. It contains examples that you can adapt to your application requirements. These examples explain how ACI-accelerated algorithms are developed by:

  • Defining a set of custom instructions utilizing ACI.

  • Adapting C/C++ source code to use ACI with intrinsic functions.

  • Extending Arm simulation models with custom instructions and estimating the performance gains.

  • Verifying the ACI set before starting hardware design.
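The verification step above can be sketched as a comparison between a plain C reference and a behavioural model of the candidate instruction over a set of test vectors. This is a minimal host-side sketch under stated assumptions, not the test method used by this repository: all names (`unary_op_fn`, `popcount_reference`, `popcount_candidate`, `count_mismatches`, `verify_popcount_candidate`) and the test vectors are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Signature shared by the reference code and the candidate model. */
typedef uint32_t (*unary_op_fn)(uint32_t);

/* Plain C reference: count set bits one at a time. */
uint32_t popcount_reference(uint32_t x)
{
    uint32_t n = 0;
    while (x != 0u) {
        n += x & 1u;
        x >>= 1;
    }
    return n;
}

/* Hypothetical stand-in for the behaviour of a custom instruction,
 * written as a branch-free parallel bit count. */
uint32_t popcount_candidate(uint32_t x)
{
    x = x - ((x >> 1) & 0x55555555u);
    x = (x & 0x33333333u) + ((x >> 2) & 0x33333333u);
    x = (x + (x >> 4)) & 0x0F0F0F0Fu;
    return (x * 0x01010101u) >> 24;
}

/* Return the number of test vectors on which the two functions disagree. */
size_t count_mismatches(unary_op_fn ref, unary_op_fn cand,
                        const uint32_t *vectors, size_t count)
{
    size_t mismatches = 0;
    for (size_t i = 0; i < count; ++i) {
        if (ref(vectors[i]) != cand(vectors[i])) {
            ++mismatches;
        }
    }
    return mismatches;
}

/* Run the comparison on a few corner cases and one mixed pattern. */
size_t verify_popcount_candidate(void)
{
    static const uint32_t vectors[] = {
        0x00000000u, 0x00000001u, 0x80000000u,
        0xFFFFFFFFu, 0x55555555u, 0xDEADBEEFu
    };
    return count_mismatches(popcount_reference, popcount_candidate,
                            vectors, sizeof vectors / sizeof vectors[0]);
}
```

A return value of zero from `verify_popcount_candidate()` means the model agrees with the reference on every vector. In a real flow the candidate would presumably be the intrinsic executing on the extended simulation model rather than a C stand-in.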

This repository does not include content related to hardware design. The details of the hardware interface for ACI are available in the Integration and Implementation Manual of the Cortex-M processor products. This document is available to licensees of the related Arm IP or under a non-disclosure agreement (NDA). If you wish to access this document, please contact the Arm technical support team.

Introduction Webinar

Register here for the introduction webinar on April 8, 2025.

ACI Introduction Webinar

Technology Overview

Arm Custom Instructions (ACI, also known as Custom Datapath Extension in the architecture specification) is an optional feature that lets chip designers add custom data processing operations to their silicon products. This can provide higher performance and energy efficiency in certain specialized data processing tasks. Technical details are covered in the Introduction to the Arm Custom Instructions / Custom Datapath Extension.

In addition the following resource pages are helpful:

All C/C++ compilers that implement the Arm C Language Extensions (ACLE) support CDE intrinsic functions to execute ACI.
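ACLE defines the feature macros `__ARM_FEATURE_CDE` and `__ARM_FEATURE_CDE_COPROC` (a bitmap of the coprocessor numbers enabled for CDE) together with the header `<arm_cde.h>`. A minimal sketch of a compile-time check built on them is shown below; the names `ACI_COPROC_MASK` and `aci_coproc_enabled` are hypothetical and not part of ACLE or this repository.

```c
#include <stdint.h>

#if defined(__ARM_FEATURE_CDE)
#include <arm_cde.h>                 /* CDE intrinsics such as __arm_cx1 */
#define ACI_COPROC_MASK ((uint32_t)__ARM_FEATURE_CDE_COPROC)
#else
#define ACI_COPROC_MASK 0u           /* built without CDE support */
#endif

/* Return 1 if coprocessor number 'coproc' (0..7) was enabled for CDE
 * when this file was compiled, otherwise 0. */
int aci_coproc_enabled(unsigned coproc)
{
    return (coproc < 8u) && (((ACI_COPROC_MASK >> coproc) & 1u) != 0u);
}
```

With GCC or Clang, CDE is typically enabled per coprocessor through an architecture extension such as `+cdecp0` on the `-march` option; check your compiler documentation for the exact spelling. On a build without CDE the function returns 0 for every coprocessor number.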

ACI can access the general-purpose registers (R0-R15) or the vector register file, which contains 32-bit float registers (S0-S31), 64-bit double registers (D0-D15), or 128-bit vector registers (Q0-Q7), as shown in the diagram below.

[Diagram: Vector Register File]

| ACI Categories | Register Access | Notes |
|---|---|---|
| 32-bit and 64-bit integer | R0-R15 | float8/16/32 values can be passed using a C union. |
| 32-bit single-precision float | S0-S31 | Available if FPU extension is implemented. |
| 64-bit double-precision float | D0-D15 | Available if FPU extension with double precision float is implemented. |
| 128-bit vector | Q0-Q7 | Available if MVE (Helium) is implemented. |
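The note about passing float values through the integer registers with a C union can be sketched as follows for the 32-bit case. The type and function names (`aci_f32_bits`, `aci_pack_f32`, `aci_unpack_f32`) are hypothetical, and the sketch assumes that `float` is an IEEE 754 binary32 value, as it is on Cortex-M toolchains.

```c
#include <stdint.h>

/* View the same 32 bits either as a float or as an unsigned integer. */
typedef union {
    float    f;
    uint32_t u;
} aci_f32_bits;

/* Return the raw bit pattern of a float, ready to be passed as an
 * integer operand to a GPR-based custom instruction. */
uint32_t aci_pack_f32(float value)
{
    aci_f32_bits bits;
    bits.f = value;
    return bits.u;
}

/* Reinterpret a 32-bit result from a custom instruction as a float. */
float aci_unpack_f32(uint32_t raw)
{
    aci_f32_bits bits;
    bits.u = raw;
    return bits.f;
}
```

Reading a union member other than the one last written is well defined in C; in C++ the same effect is usually achieved with `memcpy`. The union only reinterprets the bits, so no conversion instruction is generated.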

Example Projects

Introducing custom instructions is frequently an iterative process, as algorithms might need adaptations to the underlying compute architecture. Exploring such algorithms on simulation models is an effective method to evaluate custom instructions on realistic compute workloads. This repository contains example projects that use this method of exploring ACI, including the validation of the custom instruction extension. As these examples have a permissive open-source license, they can be used as a starting point for optimizing your own algorithms with ACI technology.

  • GPR implements a 32-bit integer population count custom instruction. The population count instruction is useful for many algorithms, for example to calculate the Hamming weight.

  • MVE implements 128-bit vector instructions to accelerate algorithms for image and pixel manipulation. The custom instructions may be used in the Arm-2D image processing library, and the example demonstrates the performance gain.
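To give an idea of how a population count instruction like the one in the GPR example might be called from C, the sketch below wraps the ACLE intrinsic `__arm_cx2` and falls back to plain C when CDE is not enabled. The coprocessor number and immediate opcode (`ACI_POPCOUNT_COPROC`, `ACI_POPCOUNT_IMM`) are assumptions for illustration only; the actual encoding used by the GPR example may differ. The function names are hypothetical as well.

```c
#include <stdint.h>

#define ACI_POPCOUNT_COPROC 0   /* assumption: coprocessor number of the ACI */
#define ACI_POPCOUNT_IMM    0   /* assumption: immediate selecting popcount  */

#if defined(__ARM_FEATURE_CDE) && \
    ((__ARM_FEATURE_CDE_COPROC >> ACI_POPCOUNT_COPROC) & 1)
#include <arm_cde.h>

/* CDE build: one CX2 instruction, one source register, one result. */
uint32_t popcount32(uint32_t x)
{
    return __arm_cx2(ACI_POPCOUNT_COPROC, x, ACI_POPCOUNT_IMM);
}
#else

/* Fallback: clear the lowest set bit until no bits remain. */
uint32_t popcount32(uint32_t x)
{
    uint32_t n = 0;
    while (x != 0u) {
        x &= x - 1u;
        ++n;
    }
    return n;
}
#endif

/* Number of bit positions in which a and b differ; the Hamming weight
 * of a single word is simply popcount32(word). */
uint32_t hamming_distance32(uint32_t a, uint32_t b)
{
    return popcount32(a ^ b);
}
```

Keeping the intrinsic behind one small function means the rest of the application is unchanged whether it runs on a device with the custom instruction, on a simulation model, or on a host build.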

Required Tools

All popular C/C++ compilers for Arm Cortex-M processors implement the Arm C Language Extensions (ACLE), which provide CDE intrinsic functions to execute ACI. Code that uses ACI is therefore portable between C/C++ compilers. Debuggers do not require extensions, as ACI uses processor registers that are already visible in debug views.

Custom instructions do not require changes to existing software or middleware. For example, any RTOS kernel with Cortex-M processor support will also work with devices that extend the processor with a set of ACI.

The example projects in this repository use the following tools:

  • Keil MDK: µVision or Keil Studio IDE for creating application software.
  • CMSIS-Toolbox for command-line build.
  • AVH-FVP simulation models for Cortex-M processors (uses Arm Fast Models).
  • GCC Compiler and Make to build plugins for AVH-FVP simulation models on Linux or Windows hosts.

GitHub Actions

The repository uses GitHub Actions to generate the plugins and verify examples and tests.

| Action | Description |
|---|---|
| build-plugins-linux.yml | Generate the AVH-FVP plugin extensions for the ACI examples. Download plugin artifact for Linux. |
| build-plugins-windows.yml | Generate the AVH-FVP plugin extensions for the ACI examples. Download plugin artifact for Windows. |
| GPR-test.yml | Validation of AVH-FVP plugin for GPR ACI extension. |
| GRP-example.yml | Build and execution test for GPR example project. |
| MVE-test.yml | Validation of AVH-FVP plugin for MVE ACI extension. |
| MVE-example.yml | Build and execution test for MVE example project. |

Related

Arm Custom Instructions - Technology

Cortex-M Processor - Technical Information

Software Development Tools

Hardware IP

License

The example projects in this repository are licensed under License.

Issues

Please feel free to raise an issue on GitHub to report misbehavior (i.e. bugs) or start discussions about enhancements. This is your best way to interact directly with the maintenance team and the community.
