Custom Matrix Multiplier IP (3x3, 6-bit)

A 3x3 unsigned matrix multiplier implemented in two ways:

Pure VHDL for Digilent NEXYS A7-100T.
Zynq-7000 (ZedBoard) HW/SW co-design with an AXI4-Lite wrapper and bare-metal driver.

Repository Layout

Custom_Matrix_Multiplier/Pure_VHDL/ — Vivado project for the pure VHDL build targeting NEXYS A7-100T.
Custom_Matrix_Multiplier/ip_repo/matrix_multiplier_1_0/ — Custom AXI4-Lite 3x3 multiplier IP.
Custom_Matrix_Multiplier/Block_Design/ — Vivado block design with Zynq PS, AXI interconnect, and matrix_multiplier_1_0, producing zynq_design_1_wrapper.xsa.
matrix_multiplier/ — Vitis workspace for the driver that writes matrices, triggers the accelerator, and prints results with timing (src/mm.c).
zynq_design_1_wrapper.xsa — Hardware definition used by the Vitis custom platform component.

Data Format and Register Map (AXI IP)

Each matrix row is packed into one 32-bit register using 8-bit fields (values should stay within 6-bit/0–63 per spec):

slv_reg0-slv_reg2: A rows (A1, A2, A3)
slv_reg3-slv_reg5: B rows (B1, B2, B3)
slv_reg6-slv_reg14: Result rows (P_a1..P_c3)

Outputs are zero-extended to 32 bits with the adder carry in the MSB of the stored sum (matrix_mult.vhd uses INPUT_WIDTH=6, extends products to SUMW, then concatenates carry + sum to 32 bits).

Boards and Tool Versions

Vivado/Vitis 2025.1 (projects and IP cache built with this toolchain).
Pure VHDL target: Digilent NEXYS A7-100T (Artix-7) with seven-segment output in hexadecimal.
Co-design target: ZedBoard (Zynq-7000 PS + custom AXI4-Lite IP).

Build and Run: Hardware/Software Co-Design (Zynq)

In Vitis 2025.1, create a custom platform component with zynq_design_1_wrapper.xsa.
Import the app from matrix_multiplier/ and select the platform component.
Build the matrix_multiplier driver.
Program the ZedBoard with the Vivado bitstream, then run the Vitis application on PS7.
The app writes sample matrices, measures latency with XTime_GetTime, and prints input/output matrices in hex and decimal (matrix_multiplier/src/mm.c).
To test other matrices, edit the pack() calls for A and B in matrix_multiplier/src/mm.c, rebuild, and rerun (keep per-element values <= 63).

Build and Run: Pure VHDL (NEXYS A7-100T)

Open Custom_Matrix_Multiplier/Pure_VHDL/Pure_VHDL.xpr in Vivado 2025.1.
Customize matrix inputs under Custom_Matrix_Multiplier\Pure_VHDL\Pure_VHDL.srcs\sources_1\new\input_rom.vhd.
Synthesize, implement, and generate the bitstream for NEXYS A7-100T.
The design drives seven-segment displays with the 3x3 product in hex.
Program the NEXYS A7-100T and verify output against your reference matrices.

Design Notes

RTL core (matrix_mult.vhd) slices each 32-bit AXI row into three 6-bit elements, multiplies all 9 combinations, and accumulates with two-stage ripple-carry adders per element.
AXI wrapper (matrix_multiplier_slave_lite_v1_0_S00_AXI.vhd) exposes 15 registers: 6 write-only inputs, 9 read-only results. Results update combinationally as soon as inputs change.
C driver uses XPAR_MATRIX_MULTIPLIER_0_BASEADDR from xparameters.h and accesses the registers directly; timing is PS-side only because the AXI peripheral is combinational.

Performance Comparison

Pure VHDL (fully combinational): worst-case critical path delay is 4.126 ns (Vivado post-implementation), so the 3x3 multiply finishes within one 10 ns period at 100 MHz. Implied Fmax ≈ 1 / 4.126 ns = 242 MHz.
HW/SW co-design (AXI4-Lite + ARM): end-to-end latency from the ARM PS through AXI4-Lite averages 52 ns. Overhead is dominated by AXI transfers and software call latency; results are averaged over 1000 runs.
Takeaway: raw combinational IP is far faster than needed for 100 MHz; the co-design trades 48 ns of interface/software overhead for flexibility.

Results

Block design (Zynq PS + AXI + IP):
RTL elaboration (VHDL core structure):
Post-implementation timing summary and waveform:
AXI4-Lite measured timing and console output (HW/SW co-design):

Name		Name	Last commit message	Last commit date
Latest commit History 8 Commits
Custom_Matrix_Multiplier		Custom_Matrix_Multiplier
Mult		Mult
images		images
matrix_multiplier		matrix_multiplier
.gitignore		.gitignore
README.md		README.md
zynq_design_1_wrapper.xsa		zynq_design_1_wrapper.xsa

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Custom Matrix Multiplier IP (3x3, 6-bit)

Table of Contents

Repository Layout

Data Format and Register Map (AXI IP)

Boards and Tool Versions

Build and Run: Hardware/Software Co-Design (Zynq)

Build and Run: Pure VHDL (NEXYS A7-100T)

Design Notes

Performance Comparison

Results

About

Uh oh!

Releases

Packages

Languages

Danial-Changez/Matrix-Multiplier-AXI4-IP

Folders and files

Latest commit

History

Repository files navigation

Custom Matrix Multiplier IP (3x3, 6-bit)

Table of Contents

Repository Layout

Data Format and Register Map (AXI IP)

Boards and Tool Versions

Build and Run: Hardware/Software Co-Design (Zynq)

Build and Run: Pure VHDL (NEXYS A7-100T)

Design Notes

Performance Comparison

Results

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages