The framework of Buffalo is developed upon DGL(pytorch backend)
We use Ubuntu 20.04, CUDA 12.1 (the package version you need to install are denoted in install_requirements.sh).
The requirements: pytorch >= 2.1, DGL >= 2.2
bash install_requirements.sh
.
As we do optimization in DGL to speedup the block generation of Buffalo. Hence, you need install the modified DGL first. The steps are shown below. Or you can run the evalutaion in the node we provied which installed modified DGL and all requirements.
ssh -i passwd.key cc@192.5.87.75
the passwd.key
file is located in the folder pytorch/
. If you have trouble accessing the node, you can contact syang127@ucmerced.edu
directly.
cd
git clone --recurse-submodules https://github.com/dmlc/dgl.git
cd dgl/
mkdir build
cd build
cmake -DUSE_CUDA=ON ..
make -j120
cd ../python
sudo python3 setup.py install
python setup.py build_ext --inplace
alias python='python3'
source ~/.bashrc
Then use 10 files in https://github.com/PASAUCMerced/Range_sampler.git to replace the 10 files in the original dgl
`Dgl/include/dgl/random.h`
`def RangeInt(lower, upper)`
`Dgl/include/dgl/aten/csr.h`
CSRRowWiseSampling( range= true)
`Dgl/include/dgl/sampling/neighbor.h`
SampleNeighbors_YSY
`Dgl/src/array/array_op.h`
CSRRowWiseSamplingUniformYSY
`Dgl/src/array/cuda/rowwise_sampling.cu`
CSRRowWiseSamplingUniformYSY
`Dgl/python/dgl/sampling/neighbor.py`
def sample_neighbors_range()
DGLGraph.sample_neighbors_range = utils.alias_func(sample_neighbors_range)
`Dgl/src/graph/sampling/neighbor/neighbor.cc`
DGL_REGISTER_GLOBAL("sampling.neighbor._CAPI_DGLSampleNeighbors_Range")
`Dgl/src/array/cpu/rowwise_sampling.cc`
GetSamplingYSYRangePickFn()
`Dgl/src/array/array.cc`
CSRRowWiseSamplingYSY
'Dgl/src/random/cpu/choice.cc'
YSYChoice
Then :
cd dgl/
sudo rm -r build/
mkdir build
cd build
cmake -DUSE_CUDA=ON ..
make -j120
cd ../python
sudo python3 setup.py install
Buffalo introduces a system addressing the bucket explosion and enabling load balancing between graph partitions for GNN training.