
MPI, Message Passing Interface

MPI is a library-based specification for parallel and distributed computing, comparable in scope to OpenMP and primarily targeted at the Fortran and C languages. MPI implements parallelism explicitly in distributed-memory environments using processes and library calls, whereas POSIX threads and OpenMP implement shared-memory parallel programming with threads. MPI supports many parallel architectures, not just distributed systems. Compilation tools for MPI (such as mpicc) are typically wrappers that set the required flags and environment on top of existing compilers.

Basic functions

#include <mpi.h>

int main(int argc, char *argv[])
{
    /* No MPI call before this line */
    MPI_Init(&argc, &argv);

    /* MPI calls go here */

    MPI_Finalize();
    /* No MPI call after this line */

    return 0;
}

MPI_Init(int *argc_p, char ***argv_p) initializes the MPI environment and must be called by every process in a parallel program before any other MPI function. MPI_Init sets up the communication infrastructure that allows the processes to communicate and coordinate their work; argc_p and argv_p are pointers to the parameters of the main function (they may also be NULL). When the program itself is multithreaded, MPI_Init_thread is used instead, so that the required level of thread support can be requested. Dually, int MPI_Finalize(void) deallocates the message buffers and cleans up all MPI resources; no MPI call is allowed after it.
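A minimal sketch of the multithreaded case (the thread-support constants are defined by the MPI standard; the error handling shown here is just one possible choice, and assumes <stdio.h> is included):

int provided;
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
if (provided < MPI_THREAD_MULTIPLE) {
    /* The implementation cannot give full multithreading support */
    fprintf(stderr, "Insufficient thread support\n");
    MPI_Abort(MPI_COMM_WORLD, 1);
}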

MPI_COMM_WORLD is the default communicator: a communicator is a collection of processes that can communicate with each other, and a program may create additional communicators. The MPI_Comm_size(MPI_Comm comm, int *size) function returns the number of processes in the communicator, while the MPI_Comm_rank(MPI_Comm comm, int *rank) function returns the rank of the calling process in the communicator. These functions are used to identify and distinguish processes within the communicator. The return value of these functions is an error code; the relevant data is written through the pointer provided as an argument.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
  // Initialize the MPI environment
  MPI_Init(NULL, NULL);

  // Get the number of processes
  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);

  // Get the rank of the process
  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

  // Get the name of the processor
  char processor_name[MPI_MAX_PROCESSOR_NAME];
  int name_len;
  MPI_Get_processor_name(processor_name, &name_len);

  // Print off a hello world message
  printf("Hello world from processor %s, rank %d out of %d processors\n",
         processor_name, world_rank, world_size);

  // Finalize the MPI environment.
  MPI_Finalize();

  return 0;
}

MPI_Send and MPI_Recv

The MPI_Send routine is used to write a message to a communicator.

int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest, int tag, MPI_Comm comm)

The MPI_Recv routine is used to read a message from a communicator.

int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source, int tag, MPI_Comm comm, MPI_Status *status)

Both are blocking routines: MPI_Recv does not return until a matching message has been received, and MPI_Send does not return until the send buffer can safely be reused (i.e. the message has been copied out or delivered). A minimal example follows the list below.

  • buf is the array ready to receive data, or the data ready to be sent.
  • count states how many elements of the given datatype are sent, or the maximum number of elements the receive buffer can hold.
  • The status parameter is used to store information about the received message (it can be MPI_STATUS_IGNORE if not needed).
  • The source and dest arguments specify the rank of the process in the communicator comm from which the message is received or to which it is sent. This value can be a specific rank, such as 0 or 1, or, for MPI_Recv, the special value MPI_ANY_SOURCE, which indicates that the message can be received from any process in the communicator comm. This is useful when the rank of the sending process is not known in advance.
  • tag is used to distinguish messages travelling between the same pair of processes (a nonnegative int).
  • Note that sending and receiving messages introduces synchronization between processes.
  • Where there is synchronization, deadlocks may arise.
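A minimal sketch of the blocking semantics (assuming the program is launched with at least two processes, e.g. mpirun -np 2), in which process 0 sends a single integer to process 1:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[]) {
  int rank, number;
  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);

  if (rank == 0) {
    number = 42;
    // Send one int to process 1 with tag 0
    MPI_Send(&number, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
  } else if (rank == 1) {
    // Block until the message from process 0 arrives
    MPI_Recv(&number, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("Process 1 received %d from process 0\n", number);
  }

  MPI_Finalize();
  return 0;
}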

MPI_Datatype

We use this type to tell MPI what kind of data the message carries.

Possible datatypes:

  • MPI_INT
  • MPI_CHAR
  • MPI_BYTE
  • MPI_FLOAT
  • MPI_DOUBLE
  • MPI_SHORT
  • MPI_LONG
  • MPI_LONG_DOUBLE
  • MPI_UNSIGNED
  • MPI_UNSIGNED_CHAR
  • MPI_UNSIGNED_SHORT
  • MPI_UNSIGNED_LONG

Typically all processes can write to the standard output, while only process 0 can read from the standard input.

Communication functions

Data Distribution

Broadcast

int MPI_Bcast(void *buffer, int count, MPI_Datatype datatype, int root, MPI_Comm comm)

Broadcasts count elements from the buffer of the root process to the buffer of every other process in the communicator.
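A brief sketch (assuming rank has already been obtained with MPI_Comm_rank):

int data = 0;
if (rank == 0)
    data = 100;                   // only the root holds the value initially
MPI_Bcast(&data, 1, MPI_INT, 0, MPI_COMM_WORLD);
// every process in MPI_COMM_WORLD now has data == 100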

Gather

int MPI_Gather(const void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)

Gathers equal-sized chunks of data from all processes in a communicator. The result, stored in the recvbuf of the root process, is a single vector of size = communicator size * recvcount.
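A brief sketch (assuming rank and world_size have been obtained with MPI_Comm_rank and MPI_Comm_size, and that <stdlib.h> is included for malloc):

int local = rank;                               // each process contributes one int
int *all = NULL;
if (rank == 0)
    all = malloc(world_size * sizeof(int));     // only the root needs the receive buffer
MPI_Gather(&local, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);
// on the root, all[i] now holds the rank of process i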

Scatter

int MPI_Scatter(const void *sendbuf, int sendcount, MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)

Scatters data evenly across all processes in the communicator. sendcount is the number of elements sent to each process, i.e. the number of elements stored in each recvbuf.
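A brief sketch distributing two elements to each process (same assumptions as above about rank, world_size and malloc):

int chunk[2];                                   // each process receives 2 elements
int *send = NULL;
if (rank == 0) {
    send = malloc(world_size * 2 * sizeof(int));
    for (int i = 0; i < world_size * 2; i++)
        send[i] = i;                            // data to distribute, meaningful only on the root
}
MPI_Scatter(send, 2, MPI_INT, chunk, 2, MPI_INT, 0, MPI_COMM_WORLD);
// process k now holds elements 2k and 2k+1 in chunk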

Scatterv

int MPI_Scatterv(const void *sendbuf, const int sendcounts[], const int displs[], MPI_Datatype sendtype, void *recvbuf, int recvcount, MPI_Datatype recvtype, int root, MPI_Comm comm)

Scatters data across all processes according to a user-defined distribution. The process with rank k receives sendcounts[k] elements, starting from position displs[k] of sendbuf.
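A brief sketch of an uneven distribution, in which process k receives k+1 elements (same assumptions as above about rank, world_size and malloc; the receive buffer size is just an arbitrary bound for small runs):

int *sendcounts = NULL, *displs = NULL, *sendbuf = NULL;
if (rank == 0) {
    sendcounts = malloc(world_size * sizeof(int));
    displs = malloc(world_size * sizeof(int));
    int total = 0;
    for (int k = 0; k < world_size; k++) {
        sendcounts[k] = k + 1;                  // process k gets k+1 elements
        displs[k] = total;                      // starting offset inside sendbuf
        total += k + 1;
    }
    sendbuf = calloc(total, sizeof(int));       // data to distribute
}
int recvbuf[64];
MPI_Scatterv(sendbuf, sendcounts, displs, MPI_INT,
             recvbuf, rank + 1, MPI_INT, 0, MPI_COMM_WORLD);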

Data collection

Reduce

int MPI_Reduce(const void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, int root, MPI_Comm comm)

Reduces the values in the sendbuf of all processes using the specified operation and stores the result in the recvbuf of the root process. Note that count specifies how many elements of the send buffer are reduced (the reduction is applied element-wise). A sketch using MPI_SUM follows the table of operations below.

MPI_Op Operation
MPI_MAX Maximum
MPI_MIN Minimum
MPI_SUM Sum
MPI_PROD Product
MPI_LAND Logical AND (Between booleans)
MPI_BAND Bit-wise AND
MPI_LOR Logical OR
MPI_BOR Bit-wise OR
MPI_LXOR Logical XOR
MPI_BXOR Bit-wise XOR
MPI_MAXLOC Maximum value and location
MPI_MINLOC Minimum value and location
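A brief sketch of a sum reduction (assuming rank has already been obtained and <stdio.h> is included):

int local = rank;                               // each process contributes its rank
int global_sum = 0;
// sum the local values of all processes; only process 0 receives the result
MPI_Reduce(&local, &global_sum, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
if (rank == 0)
    printf("Sum of ranks: %d\n", global_sum);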

Allreduce

int MPI_Allreduce(const void *sendbuf, void *recvbuf, int count, MPI_Datatype datatype, MPI_Op op, MPI_Comm comm)

MPI_Allreduce is a collective operation that combines the values of all the processes in a communicator and broadcasts the result to all of them, whereas MPI_Reduce returns the result only to the root process. In other words, it behaves like MPI_Reduce except that the result of the reduction is stored in the recvbuf of every process in the communicator.
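Continuing the previous sketch, the same sum can be made available to every process:

int local = rank, global_sum;
// every process obtains the sum of all ranks in its own global_sum
MPI_Allreduce(&local, &global_sum, 1, MPI_INT, MPI_SUM, MPI_COMM_WORLD);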

Miscellaneous functions

Barrier

MPI_Barrier provides explicit synchronization for all processes in a communicator: no process returns from the call until every process in the communicator has entered it.

// Synchronize all processes with a barrier
MPI_Barrier(MPI_COMM_WORLD);

Wtime

double MPI_Wtime(void) returns the elapsed wall-clock time in seconds since an arbitrary point in the past; its resolution is implementation dependent. It can be used to measure the computing time of a portion of the program.
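A brief sketch of timing a region of code (assuming <stdio.h> is included):

double start = MPI_Wtime();
// ... portion of the program to be measured ...
double end = MPI_Wtime();
printf("Elapsed time: %f seconds\n", end - start);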

Comm_split

MPI_Comm_split partitions the group associated with comm into disjoint subgroups, one for each value of color; the key argument controls the rank ordering within each new communicator.

MPI_Comm new_comm;
int color = 0;
int key = 0;

// rank and n are assumed to have been obtained earlier
// (e.g. rank via MPI_Comm_rank, n via MPI_Comm_size)
if (rank < n/2)
    color = 0;
else
    color = 1;

MPI_Comm_split(MPI_COMM_WORLD, color, key, &new_comm);

/*
Processes with rank less than n/2 end up in one new communicator (color 0) and the remaining processes in another (color 1); within each new communicator the processes are re-ranked according to key.
*/