PROJECT 2
Submitted
By
Priyank Tiwari
Project Description:
Problem
In 3D, N particles are initially arranged in a 10 x 10 x 10 cube, i.e., any triplet of random numbers in [0, 10] can be the coordinates of a particle. These particles interact under the so-called Lennard-Jones potential
Vij = 1/rij^12 - 2/rij^6
where Vij is the pair-wise potential between particles i and j separated by distance rij. Write a parallel program to minimize the total (Lennard-Jones) potential energy of the particle system by adjusting particle positions. Please do the following three parts:
(1) Compute the minimal energy for N=16 particles by any method of your choice on 1 node. Invest your best practical effort in this part of the calculation (e.g., run your code for a couple of hours on one node).
(2) Now use P=2, 4, 8 nodes to minimize the system energy, individually. Terminate the minimizations when you reach a minimal energy similar to the one obtained with P=1 (as in (1) above). Results within 5 percent relative error are considered similar.
(3)Plot the speedup curve.
ALGORITHM
- Assign an initial temperature (10,000)
- Assign initial positions to the particles
- Calculate the potential at these positions and store it as OldCost
- While temperature > lower threshold (0.1):
- Randomize the coordinates of the particles locally on each processor
- Calculate the potential at the new positions and store it as NewCost
- Calculate the difference DeltaCost = NewCost - OldCost
- If DeltaCost < 0, copy NewCost into OldCost (indicating a good move)
- Else, check the acceptance condition in the simulated-annealing module
- Keep cooling the temperature for as long as the while-condition above allows
Computation of the cost function (before and after randomizing) is done as follows:
/* The forces between the n = N/p local particles on each processor are computed locally */
- Calculate n = N/p
- Set process = 0
- Copy the coordinates from each processor P to processor (P+1) mod p in the ring topology
(sends and receives are interleaved to avoid deadlock: even ranks send first, odd ranks receive first)
- Increment process
- If process < p - 1, go back to the copy step
- Calculate the local potential on each processor after receiving the rotated coordinates
- Each processor sends its calculated local potential to processor 0
/* The above loop must run for a time proportional to how long we want the code to run, in order to obtain optimum particle positions */
The Pseudo Code is given as below:
PSEUDO-CODE
TEMPERATURE = INIT_TEMP;
X = INIT_POSITION
OldCost = COST(X)
While (TEMPERATURE > 0.1)
{
While (INNER_LOOP_CRITERION == TRUE)
{
New_position = GENERATE(X);
NewCost = COST(New_position);
DeltaCost = NewCost - OldCost;
If (DeltaCost < 0)
{
X = New_position ;
OldCost = NewCost;
}
else
{
r = RANDOM(0,1);
y = exp( -DeltaCost / TEMPERATURE);
if(r < y)
{
X = New_position ;
OldCost = NewCost;
}
else
REJECT(New_position);
}
}
TEMPERATURE = ALPHA * TEMPERATURE;// ( 0 < ALPHA < 1 )
}
The computation of the cost function is done as follows:
/* The forces between the n = N/p local particles on each processor are computed. This computation can proceed fully in parallel. */
The local particles are copied to the rotating particles on each processor:
for (i = 0; i < particlesPerProcess; i++)
for (k = 0; k < 3; k++)
Rotating[i][k] = r[i][k];
The following steps are repeated p - 1 times: the rotating particles are shifted to the right-hand neighbor processor in the ring topology. The movement of the p sets of traveling particles can proceed in parallel.
for (process = 0; process < numberOfProcesses - 1; process++)
{
// Copy traveling particles to send buffer
for (i = 0; i < particlesPerProcess; i++)
for (k = 0; k < 3; k++)
sendBuffer[i][k] = Rotating [i][k];
// interleave send/receive to prevent deadlock
if (my_rank % 2 == 0)
{
// Even processes send/receive
// Send traveling particles to next process
MPI_Send(sendBuffer[0], 3 * particlesPerProcess, MPI_DOUBLE,
nextProcess, 0, MPI_COMM_WORLD);
// Receive traveling particles from previous process
MPI_Recv(Rotating [0], 3 * particlesPerProcess, MPI_DOUBLE,
previousProcess, 0, MPI_COMM_WORLD, &status);
}
else
{
// Odd processes receive/send
MPI_Recv(Rotating [0], 3 * particlesPerProcess, MPI_DOUBLE,
previousProcess, 0, MPI_COMM_WORLD, &status);
MPI_Send(sendBuffer[0], 3 * particlesPerProcess, MPI_DOUBLE,
nextProcess, 0, MPI_COMM_WORLD);
}
}
The forces between the n local particles and the n traveling particles on each processor are computed. This computation can proceed fully in parallel.
- The GENERATE(X) function: changes the positions of the particles by moving them and assigning new coordinates randomly.
- The Simulated Annealing algorithm parameters:
Initial temperature = 10000
Freezing temperature = 0.1
Cooling schedule parameter "alpha" = 0.8
Inner loop count = 100
PROGRAM CODE :-
#include<stdio.h>
#include<math.h>
#include<stdlib.h>
#include<mpi.h>
double compute_potentials();// To compute the potential on each processor
#define Num_particles 16// Specify number of particles over here.
#define alpha 0.8// Specify value of cooling schedule parameter alpha.
// Global Declarations
int my_rank,p;
int part_per_proc, proc, min,max, Rotate, nextprocess, previousprocess, target_processor, target_particle;
int z,i,j,k,x,change,move_partnum, INNER_LOOP,offset, Recv_processor,change_particle;
double potential, communicationtime, temp, r;
double Rotating[Num_particles][3],sendbuffer[Num_particles][3];
double particle[Num_particles][3],temp_particle[Num_particles][3], newcod_particle[Num_particles][3], temp_potential[Num_particles];
double Temperature,oldcost,newcost,delta_cost,m,n,y,lowestcost=0.0;
double starttime=0.0,endtime=0.0,totaltime=0.0,timeelapsed=0.0,time_taken=0.0;
MPI_Status status;
int main(int argc, char *argv[])
{
MPI_Init(&argc, &argv);
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
MPI_Comm_size(MPI_COMM_WORLD, &p);
Temperature = 10000;// Initial temperature value for SA.
printf("\n I am %d", my_rank);
part_per_proc = Num_particles/p;// Particles per processor.
printf("\n Particles per Process = %d", part_per_proc);
if(my_rank==0)
{
nextprocess = my_rank+1;
previousprocess = p-1;
printf("\n Nextprocess = %d and Previous Process =%d", nextprocess, previousprocess);
}
else
{
if(my_rank==(p-1))
{
nextprocess = 0;
previousprocess = my_rank-1;
printf("\n Nextprocess = %d and Previous Process =%d", nextprocess, previousprocess);
}
else
{
nextprocess = my_rank+1;
previousprocess = my_rank-1;
printf("\n Nextprocess = %d and Previous Process =%d", nextprocess, previousprocess);
}
}
if(my_rank==0)
{
offset = 0;
//CALCULATING THE INITIAL POSITIONS OF PARTICLES
potential = 0.0;
//srand();
for(i=0;i<Num_particles;i++)
{
for(x=0;x<3;x++)
particle[i][x]=rand()%11;
}
// Processor 0 sending the co-ordinates of n = N/p particles to every other processor
for(proc=1;proc<p;proc++)
{
offset = offset+part_per_proc;
printf("\nSending %d particles to processor %d offset=%d ",part_per_proc,proc,offset);
MPI_Send(&offset, 1, MPI_INT, proc, 1, MPI_COMM_WORLD);
MPI_Send(&particle[offset][0], 3*part_per_proc, MPI_DOUBLE, proc, 1, MPI_COMM_WORLD);
}
for(i=0;i<part_per_proc;i++)
{
//printf("\n******************");
for(k=0;k<3;k++)
{
temp_particle[i][k] = particle[i][k];
//printf(" %f\t",temp_particle[i][k]);
}
}
}
else
{
MPI_Recv(&offset, 1, MPI_INT, 0, 1, MPI_COMM_WORLD, &status);
MPI_Recv(&particle, 3*part_per_proc, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, &status);
for(i=0;i<part_per_proc;i++)
for(k=0;k<3;k++)
temp_particle[i][k] = particle[i][k];
}
//COMPUTE THE POTENTIAL FOR ALL THE PARTICLES
oldcost = compute_potentials();
// START THE TIMER
starttime = MPI_Wtime(); //starting timer
// OUTER LOOP FOR SA
while(Temperature>0.1)// FREEZING TEMPERATURE = 0.1
{
INNER_LOOP = 100;// Number of iterations for INNER_LOOP = 100
while(INNER_LOOP!=0)
{
for(z=0;z<Num_particles;z++)
{
Recv_processor = z/part_per_proc;
change_particle = z % part_per_proc;
if(my_rank==0)
{
srand(rand());
for(k=0;k<3;k++)
{
temp_particle[z][k] = rand() % 11;// Move one particle at a time randomly
// printf(" %f\t",temp_particle[z][k]);
}
if(p>1 && Recv_processor!=0)
MPI_Send(&temp_particle[z][0],3, MPI_DOUBLE, Recv_processor, 1, MPI_COMM_WORLD);
}
if(p>1 && Recv_processor!=0 && my_rank==Recv_processor)
{
MPI_Recv(&temp_particle[change_particle][0],3, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD, &status);
}
// COMPUTE THE POTENTIAL AS THE NEW COST FUNCTION.
temp_potential[z] = compute_potentials();
if(my_rank==0)
{
for(k=0;k<3;k++)
{
newcod_particle[z][k] = temp_particle[z][k];
temp_particle[z][k] = particle[z][k];
}
}
if(p>1 && Recv_processor!=0 && my_rank==Recv_processor)
{
for(k=0;k<3;k++)
{
newcod_particle[change_particle][k] = temp_particle[change_particle][k];
temp_particle[change_particle][k] = particle[change_particle][k];
}
}
// COMPARISON OF THE COST FUNCTION AND ACCEPTANCE OF THE NEW STATE
if(my_rank==0)
{
change = 0;
move_partnum = 0;
newcost = temp_potential[0];
for(i=0;i<Num_particles;i++)
{
if(temp_potential[i] < newcost)
{
newcost = temp_potential[i];
move_partnum = i;
}
}
//CALCULATE THE DIFFERENCE BETWEEN THE COST FUNCTIONS
delta_cost = newcost - oldcost;
if(delta_cost < 0)
{
oldcost = newcost;
change = 1;
}
else
{
m = (float)(rand())/RAND_MAX;
// printf("\nvalue of m is %f",m);
n = delta_cost/Temperature;
y = 1/exp(n);// ACCEPTANCE PROBABILITY FOR UPHILL MOVES
if(m<y)// y = exp(-delta_cost / Temperature)
{
oldcost = newcost;// MAKE THE MOVE PERMANENT
change = 1;
}
else change = 0;
}
if(change == 1)
{
if(oldcost < lowestcost)
lowestcost = oldcost;// TRACK THE LOWEST ENERGY FOUND SO FAR
for(k=0;k<3;k++)
particle[move_partnum][k] = newcod_particle[move_partnum][k];
//printf("\n The move_partnum = %d",move_partnum);
}
target_processor = move_partnum/part_per_proc;
target_particle = move_partnum % part_per_proc;
}
// EVERY RANK TAKES PART IN THE BROADCASTS
MPI_Bcast(&change,1, MPI_INT,0, MPI_COMM_WORLD);
MPI_Bcast(&target_processor,1, MPI_INT,0, MPI_COMM_WORLD);
MPI_Bcast(&target_particle,1, MPI_INT,0, MPI_COMM_WORLD);
// APPLY THE MOVE ON THE OWNING RANK ONLY IF IT WAS ACCEPTED
if(change==1 && my_rank==target_processor)
{
for(k=0;k<3;k++)
particle[target_particle][k] = newcod_particle[target_particle][k];
}
INNER_LOOP--;
}// END of INNER_LOOP
Temperature = alpha*Temperature;// COOLING SCHEDULE
}// END OF OUTER LOOP
if(my_rank==0)
printf("\n Lowest Cost = %f ", lowestcost);
endtime = MPI_Wtime();//stopping the time
time_taken=endtime-starttime;//computing time taken
// TIMING CALCULATIONS
for(j=1;j<p;j++)// RANK 0 KEEPS ITS OWN TIMING; AVOID A BLOCKING SEND TO SELF
{
if(my_rank==j)
MPI_Send(&time_taken,1,MPI_DOUBLE,0,j*3,MPI_COMM_WORLD);
}
if(my_rank==0)
{
totaltime=time_taken;
for(j=1;j<p;j++)
{
MPI_Recv(&timeelapsed,1,MPI_DOUBLE,j,j*3,MPI_COMM_WORLD,&status);
totaltime+=timeelapsed;
}
printf("\n\n Time taken for execution of the program is %f seconds\n",totaltime/p);
}
MPI_Finalize();
}//End of Main
/*************************************************************************************************************/
// COMPUTING THE POTENTIALS ON EACH PROCESSOR IN PARALLEL
double compute_potentials()
{
potential = 0.0;
// CALCULATING THE LOCAL POTENTIALS
for(i=0;i<part_per_proc;i++)
{
for(j=0;j<part_per_proc;j++)// EACH PAIR IS COUNTED TWICE (i,j AND j,i), CONSISTENT WITH THE CROSS TERMS BELOW
{
r = pow(temp_particle[i][0] - temp_particle[j][0],2)+ pow(temp_particle[i][1] - temp_particle[j][1],2) + pow(temp_particle[i][2] - temp_particle[j][2],2);
if(r!=0)
potential+= (1.0/pow(r,6)) - (2.0/pow(r,3));
}
}
if(p>1)
{
//COPY THE POSITIONS OF THE LOCAL PARTICLES TO ROTATING PARTICLES
for(i=0;i<part_per_proc;i++)
for(k=0;k<3;k++)
Rotating[i][k] = temp_particle[i][k];
//The following steps are repeated p - 1 times:
for(Rotate=0;Rotate<p-1;Rotate++)
{
// Copy traveling particles to send buffer
for(i=0;i<part_per_proc;i++)
for(k=0;k<3;k++)
sendbuffer[i][k] = Rotating[i][k];
// Interleave send/receive to prevent deadlock
if(my_rank%2 == 0)
{
// Even processes send/receive
// Send traveling particles to next process
MPI_Send(&sendbuffer[0], 3*part_per_proc, MPI_DOUBLE, nextprocess,0, MPI_COMM_WORLD);
MPI_Recv(&Rotating[0], 3*part_per_proc, MPI_DOUBLE, previousprocess, 0, MPI_COMM_WORLD, &status);
}
else
{
MPI_Recv(&Rotating[0], 3*part_per_proc, MPI_DOUBLE, previousprocess, 0, MPI_COMM_WORLD, &status);
MPI_Send(&sendbuffer[0], 3*part_per_proc, MPI_DOUBLE, nextprocess, 0, MPI_COMM_WORLD);
}
//CALCULATING THE CONTRIBUTIONS TO LOCAL FORCES FROM ROTATING PARTICLES
for(i=0;i<part_per_proc;i++)
{
for(j=0;j<part_per_proc;j++)
{
r = pow(temp_particle[i][0] - Rotating[j][0],2)+ pow(temp_particle[i][1] - Rotating[j][1],2) + pow(temp_particle[i][2] - Rotating[j][2],2);
if(r!=0)
potential+= (1.0/pow(r,6)) - (2.0/pow(r,3));
}
}
}
// SENDING THE POTENTIAL VALUES FOR EACH PROCESSOR TO NODE 0
if(my_rank!=0)
MPI_Send(&potential, 1, MPI_DOUBLE, 0, my_rank*10, MPI_COMM_WORLD);
if(my_rank==0)
{
for(j=1;j<p;j++)
{
MPI_Recv(&temp, 1, MPI_DOUBLE, j, j*10, MPI_COMM_WORLD, &status);
potential+=temp;
}
}
}
return potential;
}
ANALYSIS :
The results have been plotted for 16 and 32 particles (on 1, 2, 4 and 8 processors).
For 16 particles we see a speedup only on 2 processors; beyond that, the execution time increases with the number of processors. This may be attributed to the large overhead of parallel processing, i.e., the communication time exceeds the computation time.
For 32 particles we see a speedup on 2, 4 and 8 processors as well, because there the computation time exceeds the communication time.
16 particles are too few to observe much speedup, so we also took readings for 32 particles, where a decent speedup can be observed.
RESULTS:
Timing Results:
Speedup curve:
MONTE-CARLO on next page…
- We have also tried the Monte Carlo method. The code is attached along with this mail.
With the Monte Carlo method no speedup is obtained for the 16-particle case, but a very feeble speedup is obtained in the case of 32 particles.
PROGRAM ::
#include<math.h>
#include<mpi.h>
#include<stdlib.h>
#include<stdio.h>
//definition of maximum no of particles and the number of processors
#define MAX_PARTICLE 16 //change this value to change the maximum number of particles
#define NUM_PROCS 8
void change_coord(int pos);
void reset_potential();
void calc_potential();
void calc_energy();
double calc_kt();
struct particle
{
int x,y,z;
double potential;
};
//defining a number of particles
struct particle par[16];
double r;
double trial,current,energy,minimum_eng;
double kt,total1,total2,avg1;
double eng[MAX_PARTICLE],diff[MAX_PARTICLE];
double each_ans[NUM_PROCS], temp_result[NUM_PROCS]; //variables for holding intermediate results
int p,my_rank,min,max,each;
//MPI code
MPI_Status status;
//Main function
int main(int argc, char *argv[])
{
//rank of processor
int i,j,k,particle_no,count; //variables required for calculations
double c,a; //variables for the acceptance test
double starttime=0, endtime=0; //variables used for calculating execution time
double time_taken=0,timeelapsed=0,totaltime=0;
MPI_Init(&argc, &argv); //Initiate MPI
MPI_Comm_rank(MPI_COMM_WORLD, &my_rank); //Set COMM and my_rank
MPI_Comm_size(MPI_COMM_WORLD, &p); //find COMM's size
//assigning number of particles to each processor
each = MAX_PARTICLE/p;
/* Assigning the charges and co-ordinates to the particles */
starttime = MPI_Wtime(); //starting timer
if(my_rank==0)
{
for(i=0; i<MAX_PARTICLE; i++)
{
par[i].x=((int)((rand()%10)+1));
par[i].y=((int)((rand()%10)+1));
par[i].z=((int)((rand()%10)+1));
par[i].potential = 0;
for(j=1;j<p;j++) //avoid a blocking send from rank 0 to itself
{
MPI_Send(&par[i].x,1,MPI_INT,j,j*10,MPI_COMM_WORLD);
MPI_Send(&par[i].y,1,MPI_INT,j,j*10,MPI_COMM_WORLD);
MPI_Send(&par[i].z,1,MPI_INT,j,j*10,MPI_COMM_WORLD);
MPI_Send(&par[i].potential,1,MPI_DOUBLE,j,j*10,MPI_COMM_WORLD);
}
}
}
else
{ for(j=1;j<p;j++)
if(my_rank==j)
for(i=0;i<MAX_PARTICLE;i++)
{
MPI_Recv(&par[i].x,1,MPI_INT,0,j*10,MPI_COMM_WORLD,&status);
MPI_Recv(&par[i].y,1,MPI_INT,0,j*10,MPI_COMM_WORLD,&status);
MPI_Recv(&par[i].z,1,MPI_INT,0,j*10,MPI_COMM_WORLD,&status);
MPI_Recv(&par[i].potential,1,MPI_DOUBLE,0,j*10,MPI_COMM_WORLD,&status);
}
}
calc_potential();
energy=0;
if(my_rank==0)
{
for(count=0;count<MAX_PARTICLE;count++)
{
energy+=par[count].potential;
}
printf("\nEnergy of the system before particles are moved is %lf",energy);
}
current=energy;
kt=calc_kt();
if(my_rank==0) printf("\n\nWait patiently while the program is being executed");
if(p>1) minimum_eng=-4.074829;
else minimum_eng=current;
while(trial>minimum_eng)
for(count=0;count<10;count++)
{
//reset_potential();
if(my_rank==0)
{
particle_no=(rand()%MAX_PARTICLE)+1;
change_coord(particle_no);
reset_potential();
for(i=0;i<NUM_PROCS;i++)
{
each_ans[i]=0;
temp_result[i]=0;
}
}
calc_potential();
if(my_rank==0)calc_energy();
if(my_rank==0)
{
if(p>1)
{
if(trial>minimum_eng)
{
if(trial<current) current=trial;
else if(trial>=current)
{
c=(double)rand()/RAND_MAX; //uniform random number in [0,1]
a=(-(trial-current))/10;
if(c<exp(a))
current=trial;
else current=current;
}
}
}
else
{
if(trial<current) current=trial;
else if(trial>=current)
{
c=(double)rand()/RAND_MAX; //uniform random number in [0,1]
a=(-(trial-current))/10;
if(c<exp(a))
current=trial;
else current=current;
}
}
}
}
endtime = MPI_Wtime();//stopping the timer
time_taken=endtime-starttime;//computing time taken
if(my_rank==0)
printf("\n\nminimum energy of the system is %lf",current);
//Calculating the time taken by each processor and then sending it to processor with rank 0
for(j=1;j<p;j++) //rank 0 keeps its own timing; avoid a blocking send-to-self
{
if(my_rank==j)
MPI_Send(&time_taken,1,MPI_DOUBLE,0,j*3,MPI_COMM_WORLD);
}
if(my_rank==0)
{
totaltime=time_taken;
for(j=1;j<p;j++)
{
MPI_Recv(&timeelapsed,1,MPI_DOUBLE,j,j*3,MPI_COMM_WORLD,&status);
totaltime+=timeelapsed;
}
printf("\n\n Time taken for execution of the program is %f seconds\n",totaltime/p);
}
MPI_Finalize();
}
//END OF MAIN FUNCTION
double calc_kt()
{
int count,particle_no,j;
double res;
for(count=0;count<3;count++)
{
particle_no=(rand()%3)+1;
change_coord(particle_no);
reset_potential();
calc_potential();
for(j=0;j<MAX_PARTICLE;j++)
eng[count]+=par[j].potential;
}
for(count=0;count<3;count++)
total1+=eng[count];
avg1=total1/3;
for(count=0;count<3;count++)
{diff[count]=eng[count]-avg1;
if(diff[count]<0) diff[count]=-(diff[count]);
total2+=diff[count];
}
res=total2/3;
return(res);
}
void change_coord(int pos)
{
par[pos].x=((int)((rand()%10)+1));
par[pos].y=((int)((rand()%10)+1));
par[pos].z=((int)((rand()%10)+1));
par[pos].potential = 0;
}
void reset_potential()
{
int i;
for(i=0;i<MAX_PARTICLE;i++)
{
par[i].potential=0;
}
}
void calc_potential()
{
int i,j=0,k,count;
for(i=0; i<MAX_PARTICLE; i++)
{
for(j=0;j<p;j++)
{
each_ans[j]=0;
if(my_rank==j)
{
min = (my_rank*each); //calculating which particles are being given to processor with rank j
max = (each*(my_rank+1));
for(k=min; k<max; k++)
{
r = sqrt(pow(par[i].x - par[k].x,2)+ pow(par[i].y - par[k].y,2)+ pow(par[i].z - par[k].z,2)); //compute distance between 2 particles
if(r!=0)
each_ans[j]+= (1/pow(r,12))-(2/pow(r,6)); //calculation of potential by each processor
else each_ans[j]+= 0;
if(my_rank!=0)
MPI_Send(&each_ans[j],1,MPI_DOUBLE,0,j*10,MPI_COMM_WORLD); //Send answer to processor with rank 0
}
}
if(my_rank==0)
{
par[i].potential = each_ans[0];
for(j=1;j<p;j++)
{
MPI_Recv(&temp_result[j],1,MPI_DOUBLE,j,j*10,MPI_COMM_WORLD,&status); //Receive data from other processors
par[i].potential+=temp_result[j]; //Total potential on particle i
}
//printf("\npotential of particle %d=%lf",i,par[i].potential);
}
}
}
void calc_energy()
{
int j;
trial=0;
for(j=0;j<MAX_PARTICLE;j++)
{
trial+=par[j].potential;
}
}