Add Automatic On-Chip Instrumentation example #87

Merged
merged 4 commits into from
Nov 4, 2024
1 change: 1 addition & 0 deletions README.md
@@ -24,6 +24,7 @@ Example | Description
[axi_target](./axi_target)|Example of an AXI4-Target top-level interface.
[Canny_RISCV](./Canny_RISCV)|Integrating a SmartHLS module created using the IP Flow into the RISC-V subsystem.
[ECC_demo](./ECC_demo)|Example of Error Correction Code feature.
[auto_instrument](./auto_instrument)|Example of the Automatic On-Chip Instrumentation feature.

## Simple Examples
Example | Description
3 changes: 3 additions & 0 deletions auto_instrument/Makefile
@@ -0,0 +1,3 @@
SRCS=main.cpp
LOCAL_CONFIG = -legup-config=config.tcl
HLS_INSTRUMENT_ENABLE=1
577 changes: 577 additions & 0 deletions auto_instrument/README.md

Large diffs are not rendered by default.

Binary file added auto_instrument/assets/empty_signals_delay_0.png
Binary file added auto_instrument/assets/empty_signals_delay_1.png
Binary file added auto_instrument/assets/example_design.drawio.png
Binary file added auto_instrument/assets/identify_gui.png
Binary file added auto_instrument/assets/identify_gui_connect.png
Binary file added auto_instrument/assets/identify_gui_setup.png
Binary file added auto_instrument/assets/local_remote.drawio.png
Binary file added auto_instrument/assets/usedw_delay_12.png
Binary file added auto_instrument/assets/write_data.png
6 changes: 6 additions & 0 deletions auto_instrument/config.tcl
@@ -0,0 +1,6 @@
source $env(SHLS_ROOT_DIR)/examples/legup.tcl
set_project PolarFireSoC MPFS250T Icicle_SoC

# Set other parameters and constraints here
# Refer to the user guide for more information: https://microchiptech.github.io/fpga-hls-docs/constraintsmanual.html
set_parameter CLOCK_PERIOD 10
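
For a rough sense of scale when choosing the delay arguments that `main.cpp` takes, the sketch below converts a stall length to time and estimates how many words pile up in a FIFO during a stall. It is not part of this PR. It assumes that `CLOCK_PERIOD` is given in nanoseconds and that the hardware actually runs at that period, that the upstream stage writes one word per cycle, and that the stalled stage reads nothing. Pipeline latency is ignored, and all helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: CLOCK_PERIOD in config.tcl is expressed in nanoseconds.
constexpr uint64_t kClockPeriodNs = 10;
// Matches FIFO1_DEPTH..FIFO4_DEPTH in main.cpp.
constexpr uint64_t kFifoDepth = 256;

// Clock frequency in MHz implied by the period (1000 ns per microsecond).
constexpr uint64_t clockMHz() { return 1000 / kClockPeriodNs; }

// Duration, in nanoseconds, of a stall lasting `delayCycles` clock cycles.
constexpr uint64_t stallNs(uint64_t delayCycles) {
    return delayCycles * kClockPeriodNs;
}

// Idealized estimate of the words waiting in a FIFO at the end of a stall:
// one word arrives per cycle until the FIFO is full, then the writer blocks.
constexpr uint64_t idealOccupancy(uint64_t delayCycles) {
    return delayCycles < kFifoDepth ? delayCycles : kFifoDepth;
}
```

Under these assumptions a stall shorter than 256 cycles leaves the FIFO partly filled, while any longer stall saturates it and back-pressures the upstream stage.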
142 changes: 142 additions & 0 deletions auto_instrument/main.cpp
@@ -0,0 +1,142 @@
#include <iostream>
#include <csignal>
#include <hls/streaming.hpp>
#include <hls/hls_alloc.h>

// FIFO depths:
#define FIFO1_DEPTH 256
#define FIFO2_DEPTH 256
#define FIFO3_DEPTH 256
#define FIFO4_DEPTH 256

//------------------------------------------------------------------------------
// Write data to `fifo` at a rate of 1 element per clock-cycle.
// Arguments:
// go: controls the loop execution (1 = run, 0 = stop)
// fifo: a reference to an hls::FIFO of int type where elements are written.
void producer(volatile unsigned char& go, hls::FIFO<int>& fifo) {
    short int counter = 0b0;
    #pragma HLS loop pipeline
    while (go) {
        // Write twice per iteration to hide the latency of loading `go`.
        // Although the loop has II=2, it still writes one word per cycle
        // on average, as if it had II=1.
        fifo.write(0xFEED0000 | counter++);
        fifo.write(0xFEED0000 | counter++);
    }
    // Special flag that indicates the end of the program. This flag will propagate
    // through the pipeline.
    fifo.write(0x0FF);
}

//------------------------------------------------------------------------------
// Wait for the first element to appear in `inputFifo`, then wait `delay`
// clock-cycles before forwarding elements from `inputFifo` to `outputFifo`
// at a rate of 1 element per clock-cycle.
void fifoToFifo(hls::FIFO<int>& inputFifo, hls::FIFO<int>& outputFifo, unsigned long long int delay) {
    #pragma HLS function replicate

    // Wait until the first element appears in `inputFifo`.
    #pragma HLS loop pipeline
    while (inputFifo.empty());

    // Induce a delay of `delay` clock-cycles.
    #pragma HLS loop pipeline
    for (unsigned long long int i = 0; i < delay; i++) {
        // The `printf` keeps the loop from being optimized away. It only
        // executes in software; in hardware it is ignored.
        printf("Stall...\n");
    }

    // Forward data from `inputFifo` to `outputFifo` until the end-of-program flag is reached.
    int inputElement;
    #pragma HLS loop pipeline
    while ((inputElement = inputFifo.read()) != 0x0FF) {
        outputFifo.write(inputElement);
    }
    outputFifo.write(0x0FF);
}

//------------------------------------------------------------------------------
// Wait for the first element to appear in `fifo`, then wait `delay` clock-cycles
// before reading the FIFO and discarding its contents.
void consumer(hls::FIFO<int>& fifo, unsigned long long int delay) {
    // Wait until the first element appears in `fifo`.
    #pragma HLS loop pipeline
    while (fifo.empty());

    // Induce a delay of `delay` clock-cycles.
    #pragma HLS loop pipeline
    for (unsigned long long int i = 0; i < delay; i++) {
        // The `printf` keeps the loop from being optimized away. It only
        // executes in software; in hardware it is ignored.
        printf("Stall...\n");
    }

    #pragma HLS loop pipeline
    while (fifo.read() != 0x0FF); // Until the end-of-program flag is reached
}


//------------------------------------------------------------------------------
// Design pipeline - top-level HLS module
void hlsModule(volatile unsigned char& go,
               unsigned long long int delay1,
               unsigned long long int delay2,
               unsigned long long int delay3,
               unsigned long long int delay4) {
    #pragma HLS function dataflow top
    #pragma HLS interface default type(axi_target)

    hls::FIFO<int> fifo1(FIFO1_DEPTH);
    hls::FIFO<int> fifo2(FIFO2_DEPTH);
    hls::FIFO<int> fifo3(FIFO3_DEPTH);
    hls::FIFO<int> fifo4(FIFO4_DEPTH);

    producer(go, fifo1);
    fifoToFifo(fifo1, fifo2, delay1);
    fifoToFifo(fifo2, fifo3, delay2);
    fifoToFifo(fifo3, fifo4, delay3);
    consumer(fifo4, delay4);
}


//------------------------------------------------------------------------------
// When compiling for RISC-V CPU (i.e. not generating hardware)
#ifndef __SYNTHESIS__
#include "hls_output/accelerator_drivers/auto_instrument_accelerator_driver.h"

// The virtual base address for the HLS module in the RISC-V memory.
// This is initialized by the hlsModule_setup() function
void* virtualAddress;

// Signal handler for SIGINT. Writes 0 to the `go` argument, which effectively
// "stops" the HLS module.
void reset(int signal) {
    printf("\nCaught SIGINT (Ctrl + C). Stopping the HLS module.\n");
    unsigned char go = 0;
    hlsModule_memcpy_write_go(&go, 1, virtualAddress);
}

int main(int argc, char** argv) {
    if (argc != 5) {
        printf("usage: %s delay1 delay2 delay3 delay4\n", argv[0]);
        exit(-1);
    }

    // The driver functions below do the following:
    // * Set up the virtual address for the on-chip memory
    // * Write 1 to `go` and write all delay values (this will launch the accelerator)
    virtualAddress = hlsModule_setup();
    if (virtualAddress == NULL) {
        printf("%s: Error: Could not set up virtual address.\n", argv[0]);
        exit(-1);
    }
    unsigned char go = 1;
    signal(SIGINT, reset);
    printf("Starting the pipeline. Send SIGINT (Ctrl + C) anytime to stop the hardware accelerator.\n");
    hlsModule_write_input_and_start(&go, atoi(argv[1]), atoi(argv[2]), atoi(argv[3]), atoi(argv[4]), virtualAddress);
    hlsModule_join_and_read_output(virtualAddress);
    hlsModule_teardown();
}
#endif // __SYNTHESIS__
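
The end-of-program flag protocol used by `producer`, `fifoToFifo`, and `consumer` can be checked on a desktop without SmartHLS. The sketch below is a software-only model and is not part of this PR. It replaces `hls::FIFO<int>` with an unbounded `std::queue<uint32_t>`, runs the stages one after another instead of concurrently, and ignores the `delay` arguments. The names `dataWord`, `model*`, and `runModel` are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>

// Same sentinel value that main.cpp writes to mark the end of the program.
constexpr uint32_t kEndFlag = 0x0FF;

// Data pattern used by the producer: a fixed 0xFEED tag in the upper half
// and a 16-bit counter in the lower half, so it can never equal kEndFlag.
constexpr uint32_t dataWord(uint16_t counter) {
    return 0xFEED0000u | counter;
}

// Producer model: writes `count` data words, then the end-of-program flag.
void modelProducer(std::queue<uint32_t>& out, int count) {
    uint16_t counter = 0;
    for (int i = 0; i < count; i++) {
        out.push(dataWord(counter++));
    }
    out.push(kEndFlag);
}

// Forwarding stage model: copies words until it has forwarded the flag.
void modelFifoToFifo(std::queue<uint32_t>& in, std::queue<uint32_t>& out) {
    while (!in.empty()) {
        uint32_t word = in.front();
        in.pop();
        out.push(word);
        if (word == kEndFlag) break;
    }
}

// Consumer model: drops words until the flag arrives.
// Returns the number of data words dropped, or -1 if no flag was seen.
int modelConsumer(std::queue<uint32_t>& in) {
    int dropped = 0;
    while (!in.empty()) {
        uint32_t word = in.front();
        in.pop();
        if (word == kEndFlag) return dropped;
        dropped++;
    }
    return -1;
}

// Producer -> three forwarding stages -> consumer, run sequentially.
int runModel(int count) {
    std::queue<uint32_t> fifo1, fifo2, fifo3, fifo4;
    modelProducer(fifo1, count);
    modelFifoToFifo(fifo1, fifo2);
    modelFifoToFifo(fifo2, fifo3);
    modelFifoToFifo(fifo3, fifo4);
    return modelConsumer(fifo4);
}
```

For instance, `runModel(10)` returns 10: the flag travels through every stage but is not counted as data. Sequential execution only works here because the queues are unbounded; the hardware FIFOs hold 256 words, so the real stages must run concurrently.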