Merged
25 commits
5c28a69
Add minimalloc dept to dockerfile and provide makefile install
Victor-Jung Feb 18, 2025
a52c64d
Update Dockerfile
Victor-Jung Feb 20, 2025
b2e6702
Add minimalloc install makefile
Victor-Jung Feb 20, 2025
71687eb
WIP MiniMalloc and decouple tiling to memory allocation
Victor-Jung Feb 20, 2025
7ec156c
Update gitignore
Victor-Jung Feb 20, 2025
9a5078a
Update CI
Victor-Jung Feb 20, 2025
0544a00
Take into account the multibuffer coefficient in MiniMalloc
Victor-Jung Feb 20, 2025
8f8e75f
Linting
Victor-Jung Feb 20, 2025
398ce3e
Add memAllocStrategy and searchStrategy interfaces to the CI and remo…
Victor-Jung Feb 21, 2025
423c76a
Interface L2 size and fix testMVP for shouldFail cases
Victor-Jung Feb 21, 2025
5d27d28
Refactor Minimalloc call and catch memory allocation failures
Victor-Jung Feb 21, 2025
1b5a0a4
Add CI test for memory allocation with should-fails
Victor-Jung Feb 21, 2025
fc24ed6
Linting
Victor-Jung Feb 21, 2025
fc072e3
Remove useless TODOs and cleanup CI
Victor-Jung Feb 21, 2025
27a471a
Align docker link
Victor-Jung Feb 25, 2025
e3fbbf2
Update Makefile echo-bash
Victor-Jung Feb 27, 2025
6564ac3
Add comments
Victor-Jung Feb 27, 2025
b6177aa
Assert we are not using DFT and have uniform memory level allocation
Victor-Jung Feb 27, 2025
a5ba5cf
Linting
Victor-Jung Feb 27, 2025
b497281
Align CI
Victor-Jung Feb 27, 2025
3e81751
Add comment
Victor-Jung Mar 10, 2025
b0fad66
Add message after assert
Victor-Jung Mar 10, 2025
a3eb7a3
Fix formatting
Victor-Jung Mar 10, 2025
5f15b3b
Update CHANGELOG
Victor-Jung Mar 12, 2025
f0b1a93
Update container link
Victor-Jung Mar 12, 2025
19 changes: 19 additions & 0 deletions .github/workflows/CI.yml
Original file line number Diff line number Diff line change
Expand Up @@ -662,6 +662,25 @@ jobs:


### Deeploy Extension and Internal Tests ###
deeploy-memory-allocation:
runs-on: ubuntu-22.04
container:
image: ghcr.io/pulp-platform/deeploy:main
steps:
- name: Checkout Repo
uses: actions/checkout@v4
with:
submodules: recursive
- name: Build Deeploy
run: pip install -e .
- name: Run Test
run: |
cd DeeployTest
python testMVP.py -t Tests/CCT/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=40000 --memAllocStrategy=MiniMalloc
python testMVP.py -t Tests/CCT/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=30000 --memAllocStrategy=MiniMalloc --shouldFail
python testMVP.py -t Tests/CCT/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=80000 --memAllocStrategy=TetrisRandom
python testMVP.py -t Tests/CCT/CCT_1_16_16_8 -p Siracusa --defaultMemLevel=L2 --l1=64000 --l2=40000 --memAllocStrategy=TetrisRandom --shouldFail

deeploy-state-serialization:
runs-on: ubuntu-22.04
container:
Expand Down
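The `deeploy-memory-allocation` job above pairs each allocation strategy with one L2 size that is expected to work and a smaller one run with `--shouldFail`. A minimal sketch of that pass/fail logic follows. It is not the code of `testMVP.py`: `AllocationError`, `run_case`, `toy_allocate`, and the 35000-byte requirement are all hypothetical names and numbers chosen for illustration.

```python
from typing import Callable


class AllocationError(RuntimeError):
    """Hypothetical error raised when the buffers do not fit in memory."""


def run_case(allocate: Callable[[int], None], capacity: int, should_fail: bool) -> bool:
    """Return True when the case behaves as expected.

    A should-fail case is considered passed only if allocation raises;
    a regular case is considered passed only if allocation succeeds.
    """
    try:
        allocate(capacity)
    except AllocationError:
        return should_fail
    return not should_fail


def toy_allocate(required: int) -> Callable[[int], None]:
    """Build a stand-in allocator that needs `required` bytes (made-up value)."""

    def _allocate(capacity: int) -> None:
        if required > capacity:
            raise AllocationError(f"need {required} bytes, have {capacity}")

    return _allocate


if __name__ == "__main__":
    allocator = toy_allocate(35000)  # illustrative footprint, not a measured one
    print(run_case(allocator, 40000, should_fail=False))
    print(run_case(allocator, 30000, should_fail=True))
```

Under this reading, a should-fail run that unexpectedly succeeds is reported as a test failure, which is the behavior the paired CI commands rely on.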
10 changes: 9 additions & 1 deletion .github/workflows/TestRunnerTiledSiracusa.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,14 @@ on:
required: false
default: false
type: boolean
memory-allocation-strategy:
required: false
default: "MiniMalloc"
type: string
search-strategy:
required: false
default: "random-max"
type: string

jobs:

Expand Down Expand Up @@ -57,6 +65,6 @@ jobs:
mkdir -p /app/.ccache
export CCACHE_DIR=/app/.ccache
source /app/install/pulp-sdk/configs/siracusa.sh
python testRunner_tiled_siracusa.py -t Tests/${{ inputs.test-name }} --cores=${{ inputs.num-cores }} --l1 ${{ matrix.L1 }} --defaultMemLevel=${{ inputs.default-memory-level }} ${{ inputs.double-buffer && '--doublebuffer' || '' }}
python testRunner_tiled_siracusa.py -t Tests/${{ inputs.test-name }} --cores=${{ inputs.num-cores }} --l1 ${{ matrix.L1 }} --defaultMemLevel=${{ inputs.default-memory-level }} ${{ inputs.double-buffer && '--doublebuffer' || '' }} --memAllocStrategy=${{ inputs.memory-allocation-strategy }} --searchStrategy=${{ inputs.search-strategy }}
shell: bash

10 changes: 9 additions & 1 deletion .github/workflows/TestRunnerTiledSiracusaSequential.yml
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,14 @@ on:
required: false
default: false
type: boolean
memory-allocation-strategy:
required: false
default: "MiniMalloc"
type: string
search-strategy:
required: false
default: "random-max"
type: string

jobs:

Expand Down Expand Up @@ -53,7 +61,7 @@ jobs:
L1_values=$(echo "$test" | jq -r '.L1[]')
for L1_value in $L1_values; do
echo "Running test: $testName with L1: $L1_value"
python testRunner_tiled_siracusa.py -t Tests/$testName --cores=${{ inputs.num-cores }} --l1 $L1_value --defaultMemLevel=${{ inputs.default-memory-level }} ${{ inputs.double-buffer && '--doublebuffer' || '' }}
python testRunner_tiled_siracusa.py -t Tests/$testName --cores=${{ inputs.num-cores }} --l1 $L1_value --defaultMemLevel=${{ inputs.default-memory-level }} ${{ inputs.double-buffer && '--doublebuffer' || '' }} --memAllocStrategy=${{ inputs.memory-allocation-strategy }} --searchStrategy=${{ inputs.search-strategy }}
done
done
shell: bash
Expand Down
11 changes: 10 additions & 1 deletion .github/workflows/TestRunnerTiledSiracusaWithNeureka.yml
Original file line number Diff line number Diff line change
Expand Up @@ -22,10 +22,19 @@ on:
required: false
default: false
type: boolean
memory-allocation-strategy:
required: false
default: "MiniMalloc"
type: string
search-strategy:
required: false
default: "random-max"
type: string
neureka-wmem:
required: false
default: false
type: boolean


jobs:

Expand Down Expand Up @@ -61,6 +70,6 @@ jobs:
mkdir -p /app/.ccache
export CCACHE_DIR=/app/.ccache
source /app/install/pulp-sdk/configs/siracusa.sh
python testRunner_tiled_siracusa_w_neureka.py -t Tests/${{ inputs.test-name }} --cores=${{ inputs.num-cores }} --l1 ${{ matrix.L1 }} --defaultMemLevel=${{ inputs.default-memory-level }} ${{ inputs.double-buffer && '--doublebuffer' || '' }} ${{ inputs.neureka-wmem && '--neureka-wmem' || '' }}
python testRunner_tiled_siracusa_w_neureka.py -t Tests/${{ inputs.test-name }} --cores=${{ inputs.num-cores }} --l1 ${{ matrix.L1 }} --defaultMemLevel=${{ inputs.default-memory-level }} ${{ inputs.double-buffer && '--doublebuffer' || '' }} ${{ inputs.neureka-wmem && '--neureka-wmem' || '' }} --memAllocStrategy=${{ inputs.memory-allocation-strategy }} --searchStrategy=${{ inputs.search-strategy }}
shell: bash

Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,14 @@ on:
required: false
default: false
type: boolean
memory-allocation-strategy:
required: false
default: "MiniMalloc"
type: string
search-strategy:
required: false
default: "random-max"
type: string
neureka-wmem:
required: false
default: false
Expand Down Expand Up @@ -57,7 +65,7 @@ jobs:
L1_values=$(echo "$test" | jq -r '.L1[]')
for L1_value in $L1_values; do
echo "Running test: $testName with L1: $L1_value"
python testRunner_tiled_siracusa_w_neureka.py -t Tests/$testName --cores=${{ inputs.num-cores }} --l1 $L1_value --defaultMemLevel=${{ inputs.default-memory-level }} ${{ inputs.double-buffer && '--doublebuffer' || '' }} ${{ inputs.neureka-wmem && '--neureka-wmem' || '' }}
python testRunner_tiled_siracusa_w_neureka.py -t Tests/$testName --cores=${{ inputs.num-cores }} --l1 $L1_value --defaultMemLevel=${{ inputs.default-memory-level }} ${{ inputs.double-buffer && '--doublebuffer' || '' }} ${{ inputs.neureka-wmem && '--neureka-wmem' || '' }} --memAllocStrategy=${{ inputs.memory-allocation-strategy }} --searchStrategy=${{ inputs.search-strategy }}
done
done

Expand Down
10 changes: 9 additions & 1 deletion .github/workflows/TestRunnerTiledSnitchSequential.yml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,14 @@ on:
required: false
default: "L2"
type: string
memory-allocation-strategy:
required: false
default: "MiniMalloc"
type: string
search-strategy:
required: false
default: "random-max"
type: string
simulators:
required: true
type: string
Expand Down Expand Up @@ -54,7 +62,7 @@ jobs:
L1_values=$(echo "$test" | jq -r '.L1[]')
for L1_value in $L1_values; do
echo "Running test: $testName with L1: $L1_value using $simulator"
python testRunner_tiled_snitch.py -t Tests/$testName --cores=${{ inputs.num-cores }} --simulator=$simulator --l1 $L1_value --defaultMemLevel=${{ inputs.default-memory-level }} --toolchain_install_dir /app/install/riscv-llvm/
python testRunner_tiled_snitch.py -t Tests/$testName --cores=${{ inputs.num-cores }} --simulator=$simulator --l1 $L1_value --defaultMemLevel=${{ inputs.default-memory-level }} --toolchain_install_dir /app/install/riscv-llvm/ --memAllocStrategy=${{ inputs.memory-allocation-strategy }} --searchStrategy=${{ inputs.search-strategy }}
done
done
fi
Expand Down
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,7 @@ dist
*.vscode
.DS_Store
*.html
*.csv
.ipynb_checkpoints/
*#
install/
Expand Down
12 changes: 12 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -175,3 +175,15 @@ Change main.c to use OUTPUTTYPE instead of float

### Fixed
- Updated printinput nodetemplate for float handling.

## Add MiniMalloc and Decouple Memory Allocation and Tiling

### Added
- Installation and compilation flow for MiniMalloc through Makefile.
- Adapt the Docker container to install MiniMalloc and declare the necessary environment variables.
- Add the `constraintTileBuffersWithOverlappingLifetime` method to the memory scheduler to add the necessary memory constraint when we decouple memory allocation and tiling.
- Add the `minimalloc` method to the `Tiler` class. MiniMalloc comes as a precompiled cpp library using CSV for I/O. Hence, this method converts Deeploy's memory map to MiniMalloc's CSV representation, calls a subprocess to run MiniMalloc, reads the output CSV, and translates it back to Deeploy's memory map.
- Add MiniMalloc to the memory allocation strategies and add a new argument to the test runner to control the L2 size.

### Fixed
- Fix `testMVP.py` so that should-fail tests behave correctly.
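The CSV round trip described for the `minimalloc` method can be sketched as follows. This is an illustration under stated assumptions, not the `Tiler` implementation: the `Buffer` type and the three helper functions are hypothetical, and the column names (`id`, `lower`, `upper`, `size`, `offset`) and the `--capacity`/`--input`/`--output` flags reflect my understanding of MiniMalloc's README and should be checked against the installed version.

```python
import csv
import io
from typing import Dict, List, NamedTuple


class Buffer(NamedTuple):
    """Hypothetical view of one entry of a memory map."""
    name: str
    lower: int  # start of the buffer's lifetime
    upper: int  # end of the buffer's lifetime
    size: int   # size in bytes


def to_minimalloc_csv(buffers: List[Buffer]) -> str:
    """Serialize buffers into the CSV layout assumed for MiniMalloc's input."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "lower", "upper", "size"])
    for buf in buffers:
        writer.writerow([buf.name, buf.lower, buf.upper, buf.size])
    return out.getvalue()


def parse_minimalloc_csv(text: str) -> Dict[str, int]:
    """Read the assumed output CSV and return a buffer-name-to-offset map."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["id"]: int(row["offset"]) for row in reader}


def minimalloc_command(binary: str, capacity: int, input_path: str, output_path: str) -> List[str]:
    """Build the argument list for the subprocess call (flags are assumptions)."""
    return [binary, f"--capacity={capacity}", f"--input={input_path}", f"--output={output_path}"]
```

In a real flow, the string from `to_minimalloc_csv` would be written to a file, the list from `minimalloc_command` passed to `subprocess.run(..., check=True)`, and the resulting file read back through `parse_minimalloc_csv`. A non-zero exit status is one plausible signal for an allocation failure, but that is also an assumption here.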
10 changes: 9 additions & 1 deletion Container/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,6 @@ COPY Makefile ./
RUN apt-get upgrade
RUN apt-get update
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y git-lfs \
cmake \
build-essential \
ccache \
ninja-build \
Expand All @@ -44,6 +43,14 @@ RUN DEBIAN_FRONTEND=noninteractive apt-get install -y git-lfs \
gcc-multilib \
wget

# Install cmake 3.25.1 (required by minimalloc)
RUN wget https://github.com/Kitware/CMake/releases/download/v3.25.1/cmake-3.25.1-linux-x86_64.sh && \
chmod +x cmake-3.25.1-linux-x86_64.sh && \
./cmake-3.25.1-linux-x86_64.sh --prefix=/usr --skip-license

# Compile minimalloc
RUN make minimalloc

# Install Python
RUN wget https://www.python.org/ftp/python/${PYTHON_VERSION}/Python-${PYTHON_VERSION}.tgz
RUN tar xzf Python-${PYTHON_VERSION}.tgz
Expand Down Expand Up @@ -117,6 +124,7 @@ ENV PULP_SDK_HOME=/app/install/pulp-sdk
ENV LLVM_INSTALL_DIR=/app/install/llvm
ENV SNITCH_HOME=/app/install/snitch_cluster
ENV GVSOC_INSTALL_DIR=/app/install/gvsoc
ENV MINIMALLOC_INSTALL_DIR=/app/install/minimalloc
ENV MEMPOOL_HOME=/app/install/mempool
ENV PATH=/app/install/qemu/bin:/app/install/banshee:$PATH
ENV PATH="/root/.cargo/bin:${PATH}"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,6 @@ def apply(self,
executionBlock: ExecutionBlock,
name: str,
verbose: CodeGenVerbosity = _NoVerbosity) -> Tuple[NetworkContext, ExecutionBlock]:
# TODO: JUNGVI: These have to be core only barriers
executionBlock.addLeft(_synchTemplate, {})
executionBlock.addRight(_synchTemplate, {})
return ctxt, executionBlock
44 changes: 43 additions & 1 deletion Deeploy/TilingExtension/MemoryScheduler.py
Original file line number Diff line number Diff line change
Expand Up @@ -115,7 +115,7 @@ def overlap(lifetimeA: Tuple[int, int], lifetimeB: Tuple[int, int]) -> bool:

return overlap

def __init__(self, stringSuffix: str, tileScheduler: bool, seed: int = 19960801):
def __init__(self, stringSuffix: str, tileScheduler: bool, seed: int = 1996080121):
self._stringSuffix = stringSuffix
self.stringSuffix = ""
self.tileScheduler = tileScheduler
Expand Down Expand Up @@ -484,6 +484,9 @@ def _scheduleMemoryConstraints(self,
permutationList = self.heuristicPermutation(adjacencyMatrix, costVector)
permAdj, permCost, permutationMatrix = self._stablePermutation(adjacencyMatrix, costVector,
permutationList)
elif memoryAllocStrategy == "MiniMalloc":
# JUNGVI: With MiniMalloc, memory allocation is decoupled from tiling, so no permutation constraints are added here.
continue
else:
raise ValueError("Unrecognized memory allocation strategy!")

Expand All @@ -507,6 +510,45 @@ def scheduleMemoryConstraints(self,
return self._scheduleMemoryConstraints(tilerModel, ctxt, allMemoryConstraints, memoryHierarchy,
memoryAllocStrategy, memoryLevel)

@staticmethod
def constraintTileBuffersWithOverlappingLifetime(tilerModel: TilerModel, ctxt: NetworkContext,
patternMemoryConstraint: PatternMemoryConstraints,
memoryHierarchy: MemoryHierarchy):
"""JUNGVI: This method adds the constraints needed for tiling to be performed before the static memory allocation of the tile buffers.
To perform static memory allocation after tiling (i.e. to decouple tiling and memory allocation), we make two assumptions:
1. All tile buffers of a node have overlapping lifetimes, so their memory footprint is simply the sum of their sizes and the specific memory allocation does not need to be known. This holds as long as several nodes are not tiled together.
2. The tensors of the graph are not allocated in the same memory level as the tiles (for instance, all tensors are placed in L2 and the tiles only live in L1).
"""

for nodeConstraint in patternMemoryConstraint.nodeConstraints:
tileMemoryConstraint = {}

for tensorMemoryConstraints in nodeConstraint.tensorMemoryConstraints.values():
for memoryConstraint in tensorMemoryConstraints.memoryConstraints.values():
if isinstance(memoryConstraint.size, IntVar):

_buffer = ctxt.lookup(tensorMemoryConstraints.tensorName)

if not isinstance(_buffer, TransientBuffer):
_typeWidthFactor = int(_buffer._type.referencedType.typeWidth / 8)
else:
_typeWidthFactor = 1

tileMemoryConstraint[tensorMemoryConstraints.tensorName] = {
"sizeVar": memoryConstraint.size,
"typeWidthFactor": _typeWidthFactor,
"memoryLevel": memoryConstraint.memoryLevel,
"multiBufferCoeff": memoryConstraint.multiBufferCoefficient,
}

for memoryLevel in memoryHierarchy.memoryLevels.values():
sumExpr = 0
for infoDict in tileMemoryConstraint.values():
if memoryLevel.name == infoDict['memoryLevel']:
sumExpr += infoDict['sizeVar'] * infoDict['typeWidthFactor'] * infoDict['multiBufferCoeff']
if sumExpr != 0:
tilerModel.addConstraint(sumExpr <= memoryLevel.size)
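The constraint built by `constraintTileBuffersWithOverlappingLifetime` can be illustrated with plain integers instead of solver variables. The sketch below is a simplified stand-in, not the Deeploy code: `TileBuffer`, `footprint_per_level`, and `fits` are hypothetical names, and the sizes in the usage example are made up.

```python
from typing import Dict, List, NamedTuple


class TileBuffer(NamedTuple):
    """Hypothetical, integer-valued counterpart of one tile memory constraint."""
    num_elements: int        # plays the role of "sizeVar"
    type_width_bits: int     # element width in bits
    memory_level: str        # e.g. "L1"
    multi_buffer_coeff: int  # e.g. 2 for double buffering


def footprint_per_level(buffers: List[TileBuffer]) -> Dict[str, int]:
    """Sum the byte sizes of all tile buffers per memory level.

    Because all tile buffers of a node are assumed to be live at the same
    time, the footprint is the plain sum of their sizes.
    """
    totals: Dict[str, int] = {}
    for buf in buffers:
        size_bytes = buf.num_elements * (buf.type_width_bits // 8) * buf.multi_buffer_coeff
        totals[buf.memory_level] = totals.get(buf.memory_level, 0) + size_bytes
    return totals


def fits(buffers: List[TileBuffer], capacities: Dict[str, int]) -> bool:
    """Check the per-level constraint: summed footprint <= level capacity."""
    return all(total <= capacities[level] for level, total in footprint_per_level(buffers).items())


if __name__ == "__main__":
    node_buffers = [
        TileBuffer(num_elements=1024, type_width_bits=8, memory_level="L1", multi_buffer_coeff=2),
        TileBuffer(num_elements=512, type_width_bits=32, memory_level="L1", multi_buffer_coeff=1),
    ]
    print(footprint_per_level(node_buffers))
    print(fits(node_buffers, {"L1": 64000}))
```

In the method itself the sizes are solver variables, so the same sum becomes a constraint that the tiling solver must satisfy rather than a check performed after the fact.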

def getSymbolicCostName(self, patternIdx: int, memoryLevel: str) -> str:
stringSuffix = self._stringSuffix + f"_{memoryLevel}"

Expand Down