This repository provides modules for the FlooNoC, a Network-on-Chip (NoC) which is part of the PULP (Parallel Ultra-Low Power) Platform. The repository includes Network Interface IPs (named chimneys), Routers and further NoC components to build a complete NoC. FlooNoC mainly supports AXI4+ATOPs, but can be easily extended to other On-Chip protocols. Arbitrary topologies are supported with several routing algorithms. FlooNoC is designed to be scalable and modular, and can be easily extended with new components.
Our NoC design is grounded in the following key principles:
- Full AXI4 Support: Our design fully supports AXI4+ATOPs from AXI5 as outlined here, particularly multiple outstanding burst transactions. It utilizes low-complexity routers and a decoupled link-level protocol to ensure scalability, thereby enabling tolerance to high-latency off-chip accesses.
- Decoupled Links and Networks: We use a link-level protocol that is decoupled from the network-level protocol. This allows us to move the complexity of the network-level protocol into the network interfaces while deploying low-complexity routers in the network, which enables better scalability.
- Wide Physical Channels: We incorporate wide physical channels in order to meet the high-bandwidth requirements at network endpoints without being constrained by the operating frequency. This is in contrast to the traditional narrow link approach. Further, the NoC avoids any kind of serialization and sends entire messages in a single flit including header and tail information.
- Separation of traffic: Our design acknowledges the diversity in traffic patterns, as it decouples links and networks to handle wide, high-bandwidth, burst-based traffic and narrow, latency-sensitive traffic with separate physical channels.
- Modularity: Our design principles also emphasize modularity. We have developed a set of IPs that can be instantiated together to build a NoC. This approach not only promotes reusability but also facilitates flexibility in designing custom NoCs to cater to a variety of specific system requirements.
The names of the IPs are inspired by the Harry Potter universe, where the Floo Network is a magical transportation system. The Network interfaces are named after the fireplaces and chimneys used to access the Floo Network.
"In use for centuries, the Floo Network, while somewhat uncomfortable, has many advantages. Firstly, unlike broomsticks, the Network can be used without fear of breaking the International Statute of Secrecy. Secondly, unlike Apparition, there is little to no danger of serious injury. Thirdly, it can be used to transport children, the elderly and the infirm."
Unless specified otherwise in the respective file headers, all code checked into this repository is made available under a permissive license. All hardware sources and tool scripts are licensed under the Solderpad Hardware License 0.51 (see LICENSE).
If you use FlooNoC in your research, please cite the following paper:
FlooNoC: A Multi-Tbps Wide NoC for Heterogeneous AXI4 Traffic
@misc{fischer2023floonoc,
title={FlooNoC: A Multi-Tbps Wide NoC for Heterogeneous AXI4 Traffic},
author={Tim Fischer and Michael Rogenmoser and Matheus Cavalcante and Frank K. Gürkaynak and Luca Benini},
year={2023},
eprint={2305.08562},
archivePrefix={arXiv},
primaryClass={cs.AR}
}
FlooNoC uses bender to manage its dependencies and to automatically generate compilation scripts. Furthermore, Python >= 3.8 is required, along with the packages listed in requirements.txt.
Currently, we do not provide any open-source simulation setup. Internally, the FlooNoC was tested using QuestaSim, which can be launched with the following command:
# Compile the sources
make compile-sim
# Run the simulation
make run-sim-batch VSIM_TB_DUT=tb_floo_dut
or in the GUI, with prepared waveforms:
# Compile the sources
make compile-sim
# Run the simulation
make run-sim VSIM_TB_DUT=tb_floo_dut
Replace tb_floo_dut with the name of the testbench you want to simulate.
This repository includes the following NoC IPs:
- Routers: A collection of different NoC router designs with varying features such as virtual channels, input/output buffering, and adaptive routing algorithms.
- Network Interfaces (NIs): A set of NoC network interfaces for connecting IPs to the NoC.
- Topologies: A collection of NoC topologies, such as mesh, to enable the creation of various on-chip interconnects.
- Common IPs: A set of IPs used by the NoC IPs, such as FIFOs, cuts and arbiters.
- Verification IPs (VIPs): A set of VIPs to verify the correct functionality of the NoC IPs.
- Testbenches: A set of testbenches to evaluate the performance of the NoC IPs, including throughput and latency.
Name | Description | Doc |
---|---|---|
floo_router | A simple router with configurable number of ports, physical and virtual channels, and input/output buffers | |
floo_narrow_wide_router | Wrapper of a multi-link router for narrow and wide links | |
Name | Description | Doc |
---|---|---|
floo_axi_chimney | A bidirectional network interface for connecting AXI4 Buses to the NoC | |
floo_narrow_wide_chimney | A bidirectional network interface for connecting narrow & wide AXI Buses to the multi-link NoC | |
Name | Description | Doc |
---|---|---|
floo_mesh | A mesh topology with configurable number of rows and columns | |
floo_mesh_ruche | A mesh topology with ruche channels and a configurable number of rows and columns | |
Name | Description | Doc |
---|---|---|
floo_fifo | A FIFO buffer with configurable depth | |
floo_cut | Elastic buffers for cutting timing paths | |
floo_cdc | A Clock-Domain-Crossing (CDC) module implemented with a gray-counter based FIFO. | |
floo_wormhole_arbiter | A wormhole arbiter | |
floo_vc_arbiter | A virtual channel arbiter | |
floo_rob | A table-based Reorder Buffer | |
floo_simple_rob | A simplistic low-complexity Reorder Buffer | |
floo_rob_wrapper | A wrapper of all available types of RoBs, including a RoB-less version | |
Name | Description | Doc |
---|---|---|
axi_bw_monitor | An AXI4 Bus Monitor for measuring the throughput and latency of the AXI4 Bus | |
axi_reorder_compare | An AXI4 Bus Monitor for verifying the order of AXI transactions with the same ID | |
floo_axi_rand_slave | An AXI4 Bus Multi-Slave generating random AXI responses with configurable response time | |
floo_axi_test_node | An AXI4 Bus Master-Slave Node for generating random AXI transactions | |
floo_dma_test_node | An endpoint node with a DMA master port and a Simulation Memory Slave port | |
floo_hbm_model | A very simple model of the HBM memory controller with configurable delay | |
The data structs for the flits and the links are auto-generated and can be configured in util/*cfg.hjson. The size of each link is determined automatically so that the largest message going over the link fits into a single flit, which avoids any serialization.
The AXI channel(s) need to be configured in util/*cfg.hjson. The following example shows the configuration for a single AXI channel with 64-bit data width, 32-bit address width, 3-bit ID width, and 1-bit user width (beware that the ID width can differ between input and output channels).
axi_channels: [
{name: 'axi', direction: 'input', params: {dw: 64, aw: 32, iw: 3, uw: 1 }}
]
Multiple physical links can be declared, and the mapping of the AXI channels to the physical links can be configured in util/*cfg.hjson. The following example shows the configuration for two physical channels, one for requests and one for responses. The mapping of the AXI channels to a physical link is done by specifying the AXI channels in the map field.
channel_mapping: {
req: {axi: ['aw', 'w', 'ar']}
rsp: {axi: ['b', 'r']}
}
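As a rough illustration of how the link size follows from this configuration, the sketch below estimates the payload width of each physical link as the widest AXI channel mapped onto it. It assumes the standard AXI4 field widths plus a 6-bit ATOP field on the AW channel; the function names are hypothetical, and the sketch ignores the routing header bits and any packing details of the actual generator in this repository.

```python
from typing import Dict, List


def axi_payload_bits(dw: int, aw: int, iw: int, uw: int) -> Dict[str, int]:
    """Payload width in bits of each AXI4+ATOP channel (valid/ready excluded)."""
    # Fixed-width AXI4 address-channel fields:
    # len=8, size=3, burst=2, lock=1, cache=4, prot=3, qos=4, region=4
    ax_fixed = 8 + 3 + 2 + 1 + 4 + 3 + 4 + 4
    return {
        "aw": iw + aw + ax_fixed + 6 + uw,  # +6 for the atop field
        "w": dw + dw // 8 + 1 + uw,         # data, strobe, last, user
        "b": iw + 2 + uw,                   # id, resp, user
        "ar": iw + aw + ax_fixed + uw,
        "r": iw + dw + 2 + 1 + uw,          # id, data, resp, last, user
    }


def link_payload_bits(params: Dict[str, int],
                      mapping: Dict[str, List[str]]) -> Dict[str, int]:
    """Each link must fit the widest channel mapped onto it in one flit."""
    widths = axi_payload_bits(**params)
    return {link: max(widths[ch] for ch in chans)
            for link, chans in mapping.items()}


# Values taken from the configuration examples above.
params = {"dw": 64, "aw": 32, "iw": 3, "uw": 1}
mapping = {"req": ["aw", "w", "ar"], "rsp": ["b", "r"]}
print(link_payload_bits(params, mapping))  # {'req': 74, 'rsp': 71}
```

With these parameters the request link is sized by the W channel and the response link by the R channel; the generated flits would additionally carry the routing information.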
FlooNoC does not send any header or tail flits, which avoids serialization overhead. Instead, the additional routing information is sent in parallel with the payload and needs to be specified in the routing field. Examples for the different routing algorithms can be found in util/*cfg.hjson. The following example shows the configuration for an XY routing algorithm with 3-bit X and Y coordinates, 36-bit address offset, and 8-bit RoB index.
routing: {
route_algo: XYRouting
num_x_bits: 3
num_y_bits: 3
addr_offset_bits: 36
rob_idx_bits: 8
}
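For readers unfamiliar with XY routing, the following is a minimal behavioral sketch of dimension-ordered routing: a flit first travels along X until the destination column is reached, then along Y. The port names, the orientation (north meaning increasing Y) and the function names are assumptions made for illustration and are not taken from the router implementation in this repository.

```python
from typing import List, Tuple

Coord = Tuple[int, int]

# Hypothetical port names and the coordinate step each one implies.
STEPS = {"east": (1, 0), "west": (-1, 0), "north": (0, 1), "south": (0, -1)}


def xy_route(cur: Coord, dst: Coord) -> str:
    """Pick the output port at router `cur` for a flit destined to `dst`."""
    if dst[0] > cur[0]:
        return "east"
    if dst[0] < cur[0]:
        return "west"
    # X already matches, so resolve the Y dimension.
    if dst[1] > cur[1]:
        return "north"
    if dst[1] < cur[1]:
        return "south"
    return "eject"  # arrived: hand the flit to the local endpoint


def xy_path(src: Coord, dst: Coord,
            num_x_bits: int = 3, num_y_bits: int = 3) -> List[str]:
    """List the output ports taken hop by hop from `src` to `dst`."""
    for x, y in (src, dst):
        if not (0 <= x < 2 ** num_x_bits and 0 <= y < 2 ** num_y_bits):
            raise ValueError(f"coordinate ({x}, {y}) does not fit the ID bits")
    ports, cur = [], src
    while True:
        port = xy_route(cur, dst)
        ports.append(port)
        if port == "eject":
            return ports
        dx, dy = STEPS[port]
        cur = (cur[0] + dx, cur[1] + dy)


print(xy_path((0, 0), (2, 1)))  # ['east', 'east', 'north', 'eject']
```

With 3-bit X and Y coordinates, as in the configuration above, this addresses up to 8 x 8 routers.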
Finally, the package source files can be generated with:
make sources