R 3.4 + OpenMPI 3.0.0 + Rmpi on macOS – a little bit of a mess ;)

As usual, there are no easy solutions when it comes to R and mac ;)

First of all, I suggest getting a clean, isolated copy of OpenMPI so you can be sure that your installation has no issues with mixed libraries. To do so, simply compile OpenMPI 3.0.0 from source:

# Get OpenMPI sources
mkdir -p ~/opt/src
cd ~/opt/src
curl "https://www.open-mpi.org/software/ompi/v3.0/downloads/openmpi-3.0.0.tar.gz" \
  -o openmpi-3.0.0.tar.gz
tar zxf openmpi-3.0.0.tar.gz

# Create location for OpenMPI
mkdir -p ~/opt/openmpi/openmpi-3.0.0

# Build and install from inside the unpacked source directory
cd openmpi-3.0.0
./configure --prefix=$HOME/opt/openmpi/openmpi-3.0.0
make
make install

It’s time to verify that OpenMPI works as expected. Put the content below into hello.c; we will compile and run it in a moment.

/* Put this text inside hello.c file */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char** argv) {
  int rank;   /* index of this process */
  int world;  /* total number of processes */

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  MPI_Comm_size(MPI_COMM_WORLD, &world);
  printf("Hello: rank %d, world: %d\n", rank, world);
  MPI_Finalize();
  return 0;
}

To compile and run it, do the following:

export PATH=$HOME/opt/openmpi/openmpi-3.0.0/bin:${PATH}
mpicc -o hello ./hello.c
mpirun -np 2 ./hello

If you get the output below, everything is OK. If not: “Houston, we have a problem”.

Hello: rank 0, world: 2
Hello: rank 1, world: 2
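
Since the point of the isolated build is to avoid mixed libraries, it may be worth confirming that mpicc and mpirun really resolve to that copy and not to some other MPI earlier on the PATH. A minimal sketch, assuming a POSIX shell; `tool_under_prefix` is a hypothetical helper name made up for this example, not something shipped with OpenMPI:

```shell
# Sketch: confirm that a tool found on PATH lives under the expected prefix.
# Usage: tool_under_prefix <tool> <prefix>
tool_under_prefix() {
  # Resolve the tool the same way the shell would when running it.
  tool_path=$(command -v "$1") || {
    echo "$1: not found on PATH" >&2
    return 1
  }
  case "$tool_path" in
    "$2"/*) echo "$1: $tool_path (OK)" ;;
    *)      echo "$1: $tool_path (not under $2)" >&2; return 1 ;;
  esac
}
```

With the PATH exported as above, `tool_under_prefix mpirun "$HOME/opt/openmpi/openmpi-3.0.0"` and the same call for `mpicc` should both report OK.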

Now it’s time to install Rmpi. Unfortunately, on macOS you need to compile it from source. Download the source package and build it:

mkdir -p ~/opt/src/Rmpi
cd ~/opt/src/Rmpi
curl "https://cran.r-project.org/src/contrib/Rmpi_0.6-6.tar.gz" -o Rmpi_0.6-6.tar.gz
R CMD INSTALL Rmpi_0.6-6.tar.gz \
  --configure-args="--with-Rmpi-include=$HOME/opt/openmpi/openmpi-3.0.0/include\
  --with-Rmpi-libpath=$HOME/opt/openmpi/openmpi-3.0.0/lib\
  --with-Rmpi-type=OPENMPI"

As soon as it is ready, you can check whether everything works. First try to run it outside R, just to make sure everything was compiled and works as expected:

mkdir -p ~/tmp/Rmpi_test
cp -r /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rmpi ~/tmp/Rmpi_test
cd ~/tmp/Rmpi_test/Rmpi
mpirun -np 2 ./Rslaves.sh \
  `pwd`/slavedaemon.R \
  tmp needlog \
  /Library/Frameworks/R.framework/Versions/3.4/Resources/
# If it works, that's fine. Nothing visible will happen; it will simply run.
# Now you may be tempted to run more instances (you will probably get an error)
mpirun -np 4 ./Rslaves.sh \
  `pwd`/slavedaemon.R \
  tmp needlog \
  /Library/Frameworks/R.framework/Versions/3.4/Resources/
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
  ./Rslaves.sh

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------

# You can increase the number of slots by putting
#   localhost slots=25
# inside ~/default_host and running mpirun the following way
mpirun --hostfile ~/default_host -np 4 \
  ./Rslaves.sh \
  `pwd`/slavedaemon.R \
  tmp \
  needlog \
  /Library/Frameworks/R.framework/Versions/3.4/Resources/
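
The hostfile mentioned in the comments above is just a plain text file. Here is a minimal sketch of creating it from the shell, assuming a single local machine; `write_hostfile` is a hypothetical helper name invented for this example, not part of OpenMPI:

```shell
# Sketch: write a one-line Open MPI hostfile for the local machine.
# Usage: write_hostfile <path> <slots>
write_hostfile() {
  hostfile=$1
  slots=$2
  # Reject anything that is not a plain positive integer.
  case "$slots" in
    ''|*[!0-9]*) echo "slots must be a positive integer" >&2; return 1 ;;
  esac
  [ "$slots" -gt 0 ] || { echo "slots must be greater than 0" >&2; return 1; }
  # Overwrites the file; the result is e.g. "localhost slots=25".
  printf 'localhost slots=%s\n' "$slots" > "$hostfile"
}
```

For example, `write_hostfile "$HOME/default_host" 25` writes the single line `localhost slots=25`.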

Now we can try to run everything inside R:

R
...
...
> library(Rmpi)
> mpi.spawn.Rslaves()
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
  /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rmpi/Rslaves.sh

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
Error in mpi.comm.spawn(slave = system.file("Rslaves.sh", package = "Rmpi"),  :
  MPI_ERR_SPAWN: could not spawn processes
>

Oops. The issue here is that Rmpi spawns the slaves through the MPI API and doesn’t call mpirun, so we can’t pass the hostfile on the command line. However, there is hope: the default hostfile is one of the ORTE parameters (the Open MPI documentation on MCA parameters has more info).

This means we can put the location of the hostfile in ~/.openmpi/mca-params.conf. Just do the following:

mkdir -p ~/.openmpi/
echo "orte_default_hostfile=$HOME/default_host" >> ~/.openmpi/mca-params.conf
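
One thing to keep in mind: `>>` appends, so running the echo above more than once leaves several `orte_default_hostfile` lines in the file. Below is a sketch of a variant that can be re-run safely, assuming the usual `key = value` layout of mca-params.conf; `set_mca_param` is a hypothetical helper written for this post, not an OpenMPI tool:

```shell
# Sketch: set a key in an MCA params file without creating duplicates.
# Usage: set_mca_param <conf-file> <key> <value>
set_mca_param() {
  conf=$1
  key=$2
  value=$3
  mkdir -p "$(dirname "$conf")"
  touch "$conf"
  tmp=$(mktemp) || return 1
  # Copy every line except earlier assignments of this key
  # (the key is compared literally, ignoring surrounding spaces).
  awk -v key="$key" '
    { line = $0; sub(/^[ \t]*/, "", line); split(line, parts, /[ \t]*=/) }
    parts[1] != key { print }
  ' "$conf" > "$tmp"
  # Append the new assignment and write the result back in place.
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  cat "$tmp" > "$conf"
  rm -f "$tmp"
}
```

Calling `set_mca_param ~/.openmpi/mca-params.conf orte_default_hostfile "$HOME/default_host"` has the same effect as the echo, but calling it again replaces the line instead of adding a second one.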

Now, we can try to run R once more:

R
...
...
> library(Rmpi)
> mpi.spawn.Rslaves()
	4 slaves are spawned successfully. 0 failed.
master (rank 0, comm 1) of size 5 is running on: pi
slave1 (rank 1, comm 1) of size 5 is running on: pi
slave2 (rank 2, comm 1) of size 5 is running on: pi
slave3 (rank 3, comm 1) of size 5 is running on: pi
slave4 (rank 4, comm 1) of size 5 is running on: pi
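
If the spawn still complains about missing slots at this point, one thing worth checking is how many slots the hostfile actually declares. A rough sketch, assuming a hostfile where slots are given explicitly as `slots=N`; host lines without `slots=` are simply not counted here, which is a simplification of this example and not necessarily how OpenMPI treats them. `count_slots` and `has_enough_slots` are hypothetical helper names:

```shell
# Sketch: sum the explicit slots=N values declared in a hostfile.
# Usage: count_slots <hostfile>
count_slots() {
  awk '
    { sub(/#.*/, "") }    # drop comments; fields are re-split afterwards
    {
      for (i = 1; i <= NF; i++)
        if ($i ~ /^slots=[0-9]+$/) total += substr($i, 7)
    }
    END { print total + 0 }
  ' "$1"
}

# Succeeds when the hostfile declares at least <needed> slots.
# Usage: has_enough_slots <hostfile> <needed>
has_enough_slots() {
  [ "$(count_slots "$1")" -ge "$2" ]
}
```

For example, `has_enough_slots "$HOME/default_host" 4 || echo "too few slots"` gives a quick warning before starting R.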

This time, it worked ;) Have fun with R!

Comments (6)

anonymous, October 20th, 2017 at 5:23 am

Thank you for sharing!
I followed these steps up to

echo "orte_default_hostfile=$HOME/default_host" >> ~/.openmpi/mca-params.conf

and that went fine. But in my R or RStudio, Rmpi does not work.
> library(Rmpi)
> mpi.spawn.Rslaves()
Error in mpi.comm.spawn(slave = system.file("Rslaves.sh", package = "Rmpi"), :
MPI_ERR_SPAWN: could not spawn processes
--------------------------------------------------------------------------
There are not enough slots available in the system to satisfy the 4 slots
that were requested by the application:
/Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rmpi/Rslaves.sh

Either request fewer slots for your application, or make more slots available
for use.
--------------------------------------------------------------------------
>

and I tested this command:

mpirun --hostfile ~/default_host -np 4 \
./Rslaves.sh \
`pwd`/slavedaemon.R \
tmp \
needlog \
/Library/Frameworks/R.framework/Versions/3.4/Resources/

It works.

Do you have any idea?

anonymous, October 20th, 2017 at 7:53 am

I’d suggest putting this line at the top of /Library/Frameworks/R.framework/Versions/3.4/Resources/library/Rmpi/Rslaves.sh:

mpirun -version

This way, you can make sure that you are using the correct version of MPI. It’s hard to guess what the source of the problem is here.

anonymous, October 23rd, 2017 at 1:28 pm

Thank you for your reply!
I found a mistake in my default_host file.
I had put slots=2 in the file, and when I changed it to 4, it worked.
But when I put another PC’s IP in the host file, RStudio broke. I think there is some problem in RStudio.

anonymous, October 23rd, 2017 at 9:44 pm

This is a completely different story :)

If you want to use multiple machines as resources for MPI, you need to properly configure your environment. Try to run a simple hello world on the distributed resources and make sure it works. Take a look here: http://mpitutorial.com/tutorials/running-an-mpi-cluster-within-a-lan/

anonymous, November 15th, 2017 at 5:33 pm

Thank you, Michal!
Finally, I replaced RBLAS with OpenBLAS to improve speed on my Mac.
It seems to work well.

#RBLAS
> x system.time(tmp
#OpenBLAS
> x system.time(tmp

anonymous, November 16th, 2017 at 12:33 pm

Cool! I will leave your comment here for other people, so they can benefit from your tests!