Add Tiling Support to All CCT Kernels and Fix CCT Operators on Siracusa Platform for L2 #35
Victor-Jung
left a comment
Hi Run, great PR addressing lots of issues and building strong ground for every fp execution on PULPOpen! A few comments to address but no critical ones.
Signed-off-by: Victor Jung <33875047+Victor-Jung@users.noreply.github.com>
Victor-Jung
left a comment
I agree with the interface that you used (CodeGenVerbosity) but I don't like that the pass is in the PULPTiling pass. A small change and this will roll.
Victor-Jung
left a comment
LGTM, thanks for addressing my comments!
…sa Platform for L2 (pulp-platform#35) * Update CCT onnx without broadcast and Upload CCT two version(16,32) * Update CCT on PULP with Tiling --------- Co-authored-by: Victor Jung <33875047+Victor-Jung@users.noreply.github.com> Update CCT on PULP with Tiling Add PULPProfileUntiled Pass
…sa Platform for L2 (pulp-platform#35) * Update CCT onnx without broadcast and Upload CCT two version(16,32) * Update CCT on PULP with Tiling --------- Co-authored-by: Victor Jung <33875047+Victor-Jung@users.noreply.github.com>
* Add Tiling Support to All CCT Kernels and Fix CCT Operators on Siracusa Platform for L2 (#35) * Update CCT onnx without broadcast and Upload CCT two version(16,32) * Fixed CCT on L3 Bugs --------- Co-authored-by: Victor Jung <33875047+Victor-Jung@users.noreply.github.com>
This release contains major architectural changes, new platform support, enhanced simulation workflows, floating-point kernel support, training infrastructure for CCT models, memory allocation strategies, and documentation improvements.

After merging this into `main`, the release process will proceed with:
- Pushing a Git tag for the release after merging this PR
- Creating a GitHub release with the prepared tag.

Note: Since the release tag references the Docker container tagged with the release tag (`ghcr.io/pulp-platform/deeploy:v0.2.0`), the CI will initially fail. The Deeploy Docker image must be built after the release PR is merged and the CI restarted.

### List of Pull Requests
- Prepare v0.2.0 release [#102](#102)
- Add Luka as Code Owner [#101](#101)
- Fix CI, Docker Files, and Documentation Workflow [#100](#100)
- Chimera Platform Integration [#96](#96)
- Add Tutorial and Refactor README [#97](#97)
- Reduce Mean Float Template [#92](#92)
- Reshape Memory Freeing and Generic Float GEMM Fixes [#91](#91)
- Prepare for Release and Separate Dependencies [#90](#90)
- Fix input offsets calculation [#89](#89)
- Move PULP SDK to main branch/fork [#88](#88)
- Finite Lifetime for IO Tensors [#51](#51)
- Improved Memory Visualization and Multi-Layer Tiling Profiling [#56](#56)
- Fix Linting in CI and Reformat C Files [#86](#86)
- Fix Broken CMake Flow For pulp-sdk [#87](#87)
- Refactor Changelog For Release [#85](#85)
- ARM Docker Container and Minor Bug Fix [#84](#84)
- Added Kernel for Generic Float DW Conv2D [#63](#63)
- Autoselect Self-Hosted Runners if the Action is on Upstream [#81](#81)
- TEST_RECENT linking on MacOS [#78](#78)
- Add RV32IMF Picolibc support for Siracusa platform [#66](#66)
- Improve Documentation and VSCode Support [#76](#76)
- Debug Print Topology Pass and Code Transformation [#75](#75)
- Find all subdirectories of Deeploy when installing with pip install [#70](#70)
- Add milestone issue template [#71](#71)
- Bunch of fixes and changes [#58](#58)
- Add SoftHier platform [#65](#65)
- rv32imf_xpulpv2 ISA support for Siracusa platform [#64](#64)
- One LLVM To Compile Them All [#60](#60)
- One GVSoC to Simulate Them All [#59](#59)
- Add Support for CCT Last Layer Training with Embedding Dim 8-128 [#55](#55)
- Add CCT Classifier Training Support [#53](#53)
- L3 Bugs: DMA Struct Datatype and Maxpool Margin Error [#45](#45)
- DeepQuant Quantized Linear Support [#54](#54)
- Implemented Dequant Layer for Generic and Siracusa [#52](#52)
- Infinite Lifetime Buffers Considered in Tiling & Memory Allocation (+ Visualization) [#44](#44)
- Implemented Quant Layer for Generic and Siracusa [#49](#49)
- Increase maximal Mchan DMA transfer sizes from 64KiB to 128KiB [#47](#47)
- Add MiniMalloc and Decouple Memory Allocation and Tiling [#40](#40)
- Float CCT Bugs on L3 [#37](#37)
- Memory Allocation Strategies and Visualization [#36](#36)
- Add CODEOWNERS [#42](#42)
- Add Tiling Support to All CCT Kernels and Fix CCT Operators on Siracusa Platform for L2 [#35](#35)
- Add Fp gemm and Softmax for Snitch platform [#31](#31)
- Add Float Kernels for CCT [#29](#29)
- documentation deployment [#34](#34)
- main.c Float Cast Bugs [#28](#28)
- Add Float GEMM on PULP with Tiling [#26](#26)
- Add Float Support & Float GEMM for Generic [#25](#25)
- GVSOC support for the Snitch Cluster platform [#23](#23)
- Snitch Cluster Tiling Support [#22](#22)
- Snitch support integration [#14](#14)
- Update bibtex citation [#20](#20)
- the PR template location, bump min python to 3.10, change install command [#17](#17)
- Add pre-commit for python formatting [#15](#15)
- FP integration (v2) [#12](#12)
- shell for sequential tests of Generic, Cortex, and Mempool platforms [#11](#11)
- Add issue templates [#10](#10)
- Minor CI and Readme Improvements [#8](#8)
- Fix GHCR Link for Docker Build [#7](#7)
- neureka's ccache id [#6](#6)
- GitHub-based CI/CD Flow [#4](#4)
- Generic Softmax Kernel [#2](#2)
- Port GitLab CI [#1](#1)
Description
This update improves CCT's kernel tiling support and resolves multiple operator issues on the Siracusa platform. The new kernel templates for convolution and max-pooling enhance padding integration while adopting an HWC layout. Additionally, key constraints for tiling have been introduced, fixing several execution issues in GEMM, MatMul, and float-based computations. The layers have also been refined to handle bias broadcasting correctly, ensuring accurate output shape inference.
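To illustrate what "output shape inference" under bias broadcasting involves, here is a minimal sketch of the trailing-aligned broadcasting rule used by NumPy and ONNX. It is not Deeploy's layer code; the function name `broadcast_shape` is hypothetical and exists only for this example.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast result shape of two shapes (hypothetical helper).

    Follows the NumPy/ONNX multidirectional rule: shapes are aligned from
    the trailing dimension, and a dimension of size 1 (or a missing leading
    dimension) stretches to match the other operand.
    """
    out = []
    # Walk both shapes from the last dimension; pad the shorter one with 1s.
    for da, db in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if da == db or db == 1:
            out.append(da)
        elif da == 1:
            out.append(db)
        else:
            raise ValueError(f"shapes {a} and {b} are not broadcastable")
    return tuple(reversed(out))


# A bias of shape (32,) added to an activation of shape (1, 16, 32)
# keeps the activation's shape.
result = broadcast_shape((1, 16, 32), (32,))
```

An Add layer that takes the output shape from only one operand gets the wrong answer whenever that operand is the smaller one, which is why the inferred shape has to come from both inputs.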
Added
Float Bindings, Tilers for Pulp Target
Float Convolution, MaxPool Parser, Template, Kernel
Tiling Constraints
`conv`, `gather`, and `layernorm`, and existing constraints for other kernels.
Fixed
CycleMeasure Pass for Siracusa Untiled Profiling
GEMM Tiling Constraints Issue
`transA` and `transB` not supported.
MatMul Multi-Dimensional Input Issue
Add Layer for Broadcasted Bias
`float32` with `f` caused `inf` errors.
Changed
`add` and `gemm` to avoid unnecessary broadcasting.
PR Merge Checklist
`devel` commit and pointing to `devel`.
`CHANGELOG.md` file has been updated.
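
For readers unfamiliar with the `transA`/`transB` attributes mentioned under Fixed: in the ONNX Gemm operator they select whether each input matrix is transposed before the product `Y = alpha * A' * B' + beta * C`. The sketch below is a plain-Python reference of that semantics only. It is not the Deeploy kernel or its tiling constraint, and the function name `gemm_reference` and its parameter names are hypothetical.

```python
def gemm_reference(A, B, C=None, alpha=1.0, beta=1.0,
                   trans_a=False, trans_b=False):
    """Reference for ONNX-style Gemm on nested lists (hypothetical helper)."""
    # Apply the optional transposes first, as the operator definition does.
    if trans_a:
        A = [list(row) for row in zip(*A)]
    if trans_b:
        B = [list(row) for row in zip(*B)]

    m, k, n = len(A), len(A[0]), len(B[0])
    if len(B) != k:
        raise ValueError("inner dimensions do not match after transposition")

    # Y = alpha * A' * B'
    Y = [[alpha * sum(A[i][p] * B[p][j] for p in range(k))
          for j in range(n)]
         for i in range(m)]

    # Optional bias term; C is assumed to already have shape (m, n),
    # i.e. no broadcasting is performed here.
    if C is not None:
        for i in range(m):
            for j in range(n):
                Y[i][j] += beta * C[i][j]
    return Y


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
plain = gemm_reference(A, B)
with_trans_b = gemm_reference(A, B, trans_b=True)
```

Because a transpose swaps which axis of the stored matrix is the reduction dimension, a tiling scheme that assumes one fixed layout has to account for both flags.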