Now that distributed inference is supported, thanks to the work of @evanmiller in #2099, it would be fun to use it for something cool. One idea is to connect a bunch of Raspberry Pis over a local network and run inference across them using MPI:
# sample cluster of 8 devices (replace with actual IP addresses of the devices)
$ cat ./hostfile
192.168.0.1:1
192.168.0.2:1
192.168.0.3:1
192.168.0.4:1
192.168.0.5:1
192.168.0.6:1
192.168.0.7:1
192.168.0.8:1
# build with MPI support
$ make CC=mpicc CXX=mpicxx LLAMA_MPI=1 -j
# run distributed inference over 8 nodes
$ mpirun -hostfile ./hostfile -n 8 ./main -m /mnt/models/65B/ggml-model-q4_0.bin -p "I believe the meaning of life is" -n 64
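For a larger cluster it may be easier to generate the hostfile than to type it out. The following is only a minimal sketch: `hostfile_lines` and `write_hostfile` are hypothetical helper names, not part of the project, and the sketch assumes the same `address:1` line format shown in the sample hostfile above.

```python
from pathlib import Path


def hostfile_lines(hosts, slots=1):
    """Return one 'host:slots' line per device (format assumed from the sample above)."""
    return [f"{host}:{slots}" for host in hosts]


def write_hostfile(path, hosts, slots=1):
    """Write the hostfile to `path`, one device per line, with a trailing newline."""
    Path(path).write_text("\n".join(hostfile_lines(hosts, slots)) + "\n")
```

For example, `write_hostfile("./hostfile", [f"192.168.0.{i}" for i in range(1, 9)])` would reproduce the 8-device sample, with the placeholder addresses replaced by the real ones.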
Here we assume that the 65B model data is located on a network share mounted under /mnt and that mmap works over a network share. I'm not sure that is the case; if not, the experiment would be more difficult to perform.
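One way to check the mmap assumption on a given device before attempting a full run is to try mapping the model file read-only and reading a few bytes from it. This is a rough sketch, not a definitive test: `can_mmap` is a hypothetical helper, it relies only on Python's standard `mmap` module, and it says nothing about how well a network-backed mapping performs, only whether the mapping and a small read succeed.

```python
import mmap


def can_mmap(path, probe_bytes=4096):
    """Try to memory-map `path` read-only and read its first bytes.

    Returns (True, "") on success, or (False, reason) if opening,
    mapping, or reading fails.
    """
    try:
        with open(path, "rb") as f:
            # Length 0 asks for a mapping of the whole file.
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                _ = mm[:probe_bytes]  # touch the start of the mapping
        return True, ""
    except (OSError, ValueError) as exc:
        return False, str(exc)
```

Calling `can_mmap("/mnt/models/65B/ggml-model-q4_0.bin")` on each node would report whether the file on the share can be mapped there, along with the error message if it cannot.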
Looking for people with access to the necessary hardware to perform this experiment.