R 3.4 + OpenMPI 3.0.0 + Rmpi on macOS – a little bit of a mess ;)
As usual, there are no easy solutions when it comes to R and macOS ;)
First of all, I suggest getting a clean, isolated copy of OpenMPI so you can be sure your installation has no issues with mixed libs. To do so, simply compile OpenMPI 3.0.0 from source:
# Get OpenMPI sources
mkdir -p ~/opt/src
cd ~/opt/src
curl "https://www.open-mpi.org/software/ompi/v3.0/downloads/openmpi-3.0.0.tar.gz" \
  -o openmpi-3.0.0.tar.gz
tar zxf openmpi-3.0.0.tar.gz

# Create location for OpenMPI
mkdir -p ~/opt/openmpi/openmpi-3.0.0

# Configure, build, and install
cd openmpi-3.0.0
./configure --prefix=$HOME/opt/openmpi/openmpi-3.0.0
make
make install
It’s time to verify that OpenMPI works as expected. Put the content below into hello.c, then compile and run it.
/* Put this text inside hello.c file */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
    int rank;
    int world;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &world);
    printf("Hello: rank %d, world: %d\n", rank, world);
    MPI_Finalize();
}
To compile and run it, do the following:
export PATH=$HOME/opt/openmpi/openmpi-3.0.0/bin:${PATH}
mpicc -o hello ./hello.c
mpirun -np 2 ./hello
If you get output like the one below – it’s OK. If not – “Houston, we have a problem”.
Hello: rank 0, world: 2
Hello: rank 1, world: 2
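If something goes wrong here, a first thing to check is that the freshly built toolchain, and not a system copy, is first on PATH. A minimal sketch, assuming the installation prefix used above:

```shell
# Make sure the freshly built OpenMPI bin dir is first on PATH
# (the prefix below is the one chosen during installation):
export PATH="$HOME/opt/openmpi/openmpi-3.0.0/bin:$PATH"
echo "$PATH" | cut -d: -f1
```

After that, `command -v mpicc` and `mpirun --version` should both point into that directory as well.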
Now, it’s time to install Rmpi – unfortunately, on macOS, you need to compile it from source. Download the source package and build it:
mkdir -p ~/opt/src/Rmpi
cd ~/opt/src/Rmpi
curl "https://cran.r-project.org/src/contrib/Rmpi_0.6-6.tar.gz" -o Rmpi_0.6-6.tar.gz

R CMD INSTALL Rmpi_0.6-6.tar.gz \
  --configure-args="--with-Rmpi-include=$HOME/opt/openmpi/openmpi-3.0.0/include \
    --with-Rmpi-libpath=$HOME/opt/openmpi/openmpi-3.0.0/lib \
    --with-Rmpi-type=OPENMPI"
As soon as it is ready, you can check whether everything works fine. First, try to run it outside R, just to make sure everything was compiled and works as expected:
mkdir -p ~/tmp/Rmpi_test
cp -r /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rmpi ~/tmp/Rmpi_test
cd ~/tmp/Rmpi_test/Rmpi

mpirun -np 2 ./Rslaves.sh \
  `pwd`/slavedaemon.R \
  tmp \
  needlog \
  /Library/Frameworks/R.framework/Versions/3.4/Resources/

# If it works, that's fine. Nothing will happen, in fact; it will simply run.
# Now, you may be tempted to run more instances (you will probably get an error):

mpirun -np 4 ./Rslaves.sh \
  `pwd`/slavedaemon.R \
  tmp \
  needlog \
  /Library/Frameworks/R.framework/Versions/3.4/Resources/

--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
  ./Rslaves.sh

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------

# You can increase the number of slots by putting
#   localhost slots=25
# inside ~/default_host and running mpirun the following way:

mpirun --hostfile ~/default_host -np 4 \
  ./Rslaves.sh \
  `pwd`/slavedaemon.R \
  tmp \
  needlog \
  /Library/Frameworks/R.framework/Versions/3.4/Resources/
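The hostfile itself is just a plain text file, one host per line, where slots caps how many processes may be started on that host. A minimal sketch of creating it (~/default_host is simply the name this post uses; any path works):

```shell
# Create a hostfile granting up to 25 process slots on the local machine:
cat > ~/default_host <<'EOF'
localhost slots=25
EOF
cat ~/default_host
```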
Now, we can try to run everything inside R:
R
...
> library(Rmpi)
> mpi.spawn.Rslaves()
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
  /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rmpi/Rslaves.sh

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
Error in mpi.comm.spawn(slave = system.file("Rslaves.sh", package = "Rmpi"),  :
  MPI_ERR_SPAWN: could not spawn processes
>
Oops. The issue here is that Rmpi spawns its slaves through the MPI API; it doesn’t call mpirun, so we can’t pass a hostfile directly. However, there is hope: the default hostfile is one of the ORTE MCA parameters (see the Open MPI documentation on MCA parameters for more info).
That means we can set its location in ~/.openmpi/mca-params.conf. Just do the following:
mkdir -p ~/.openmpi/
echo "orte_default_hostfile=$HOME/default_host" >> ~/.openmpi/mca-params.conf
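One caveat: `>>` appends, so rerunning that line piles up duplicate entries. A slightly more defensive sketch, using the same file and parameter name as above:

```shell
# Add the ORTE default-hostfile parameter only if it is not already set:
mkdir -p ~/.openmpi
grep -q '^orte_default_hostfile=' ~/.openmpi/mca-params.conf 2>/dev/null || \
  echo "orte_default_hostfile=$HOME/default_host" >> ~/.openmpi/mca-params.conf
cat ~/.openmpi/mca-params.conf
```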
Now, we can try to run R once more:
R
...
> library(Rmpi)
> mpi.spawn.Rslaves()
        4 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 5 is running on: pi
slave1 (rank 1, comm 1) of size 5 is running on: pi
slave2 (rank 2, comm 1) of size 5 is running on: pi
slave3 (rank 3, comm 1) of size 5 is running on: pi
slave4 (rank 4, comm 1) of size 5 is running on: pi
This time, it worked ;) Have fun with R!
Thank you for sharing!
I did everything up to

echo "orte_default_hostfile=$HOME/default_host" >> ~/.openmpi/mca-params.conf

and it’s OK. But in my R and RStudio, Rmpi does not work.
> library(Rmpi)
> mpi.spawn.Rslaves()
Error in mpi.comm.spawn(slave = system.file("Rslaves.sh", package = "Rmpi"),  :
  MPI_ERR_SPAWN: could not spawn processes
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
  /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rmpi/Rslaves.sh

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
>
and I tested the code:

mpirun --hostfile ~/default_host -np 4 \
  ./Rslaves.sh \
  `pwd`/slavedaemon.R \
  tmp \
  needlog \
  /Library/Frameworks/R.framework/Versions/3.4/Resources/

It works.
Do you have any idea what could be wrong?
I’d suggest putting

mpirun -version

at the top of this file: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rmpi/Rslaves.sh

This way, you will make sure that you are using the correct version of MPI. It’s hard to guess what the source of the problem is here.
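One way to splice that line in is sketched below on a scratch copy (the dummy script stands in for the real Rslaves.sh; the same awk edit applies to the path above):

```shell
# Work on a scratch copy so the real Rslaves.sh stays intact:
TMP=$(mktemp -d)
printf '#!/bin/sh\necho slave started\n' > "$TMP/Rslaves.sh"

# Insert "mpirun -version" right after the shebang line:
awk 'NR==1 {print; print "mpirun -version"; next} {print}' \
  "$TMP/Rslaves.sh" > "$TMP/Rslaves.sh.new"
mv "$TMP/Rslaves.sh.new" "$TMP/Rslaves.sh"

cat "$TMP/Rslaves.sh"
```

When the spawn is attempted again, the MPI version banner ends up in the slaves’ output, which quickly shows whether a stray system MPI is being picked up.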
Thank you for your reply!
I found one mistake in my default_host file.
I had put slots=2 in the file, and when I changed it to 4, it worked.
But when I put another PC’s IP in the host file, RStudio crashed. I think there are some problems in RStudio.
This is a completely different story :)
If you want to use multiple machines as resources for MPI, you need to configure your environment properly. Try to run a simple hello world on the distributed resources first and make sure it works. Take a look here: http://mpitutorial.com/tutorials/running-an-mpi-cluster-within-a-lan/
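For reference, a multi-machine hostfile is just more lines of the same form. A sketch (the IP addresses below are placeholders; each machine needs passwordless SSH access and the same OpenMPI version installed under the same path):

```
localhost slots=4
192.168.1.20 slots=4
192.168.1.21 slots=4
```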
Thank you, Michal!
Finally, I used OpenBLAS to replace the reference BLAS shipped with R, to improve speed on my Mac.
It seems good.
#RBLAS
> x system.time(tmp
#OpenBLAS
> x system.time(tmp
Cool! I will leave your comment here for other people, so they can benefit from your tests!