# 并行科学计算代写｜COMP3577 Parallel Scientific Computing I

## 这是一篇来自英国的关于n体模拟并行科学计算I的计算机代写

**Step 1: A full ***N***-body simulator **

The first task is to extend the class NBodySimulation into an *N*-body simulator. This code will be the starting point for the optimisations and parallelisation in later sections.

**Step 1.1: Multiple free objects **

In the template code in the NBodySimulation class, one object (the first one) is free, while the other *N **− *1 are fixed in space. Alter the code so that all objects move freely through space;they should be subject to gravitational attraction, but collisions should be ignored. Ensure that the final output of the simulation (minimal object distances, maximum velocities, final number of objects and objects’ positions) remains valid. This can be tested by setting up some simple simulations and checking that their final states are as expected.

What is the largest stable time-step? Modify the code so that the time-step taken is not fixed,and ensure that a stable time-step is always taken. Marks will be given for the correctness of the solution.

The Python script create_initial_conditions.py allows you to generate arbitrarily complex,random initial configurations of objects. When marking, we will use this script to generate test scenarios, with masses *m*1, . . . , *m**N *normalised so that *m**i **∈ *]0*, *1].

**Questions **

- What is the complexity of your implementation in terms of
*N*, the number of bodies at the beginning of the simulation? Why?

- What methods could be used to reduce the complexity of the implementation? Note: you do not need to implement these improvements.

- What is the convergence behaviour of the implemented time-stepping scheme? Create a plot which clearly shows the convergence order. Which order is it? Is it the order you expected? Why?

**Step 1.1: Collisions **

The aim of this step is to modify the code of the NBodySimulation class to make the objects merge upon collision.

Two objects *b*1 and *b*2 collide if after a time step they end up in positions *x*1 and *x*2 such that

*|**x*1 *− **x*2*| ≤ **C*(*m*1 + *m*2)*, C *= 10*−*2 */N, *

where *m*1 and *m*2 are the masses of *b*1 and *b*2, respectively, and *N *is the number of bodies at the beginning of the simulation. It is not necessary to check the trajectory of the objects, you can simply use their position at the end of each step.

The merger of the two objects has mass

*m *= *m*1 + *m*2*. *

Velocity and position should be the mass-weighted mean of the velocities and positions of the colliding objects: the new object should have position *x *and velocity *v *defined by

*x *=*m*1*x*1 + *m*2*x*2/*m*1 + *m*2*, v *=*m*1*v*1 + *m*2*v*2/*m*1 + *m*2*, *

where *v*1 and *v*2 are the velocities of *b*1 and *b*2, respectively.

**Questions **

- Check the convergence order again. Is it still the same?

**Step 2: Molecular Forces **

Now assume that the force acting on the bodies is no longer gravity, but some short range force molecular force

*F**ij *(*r**ij *) ={10 0*.*1*r**ij * 13*− * 0*.*1*r**ij * 9 *r**ij **, *for *r**ij **≤ **r**c**, *0*, *for *r**ij **> r**c**,*

with a cut-off radius *r**c*. In step-2.cpp, you will implement an *N*-body simulation, based on NBodySimulation, which uses the force (1) instead of gravity. A convenient way to do this is by adding methods to the NBodySimulationMolecularForces class, which extends the NBodySimulation class used in the previous step.

You should carefully consider which methods of NBodySimulation require overriding. Even though this is not required, you may want to split the methods in that class into smaller methods to improve code reusability. You may drop the code handling the collision of object: once the particles come close enough we now expect them to repel (the new force (1) should handle this on its own for a sufficiently small time step). Make sure to choose a reasonable value for the cut-off radius *r**c*.

The existing data structure will lead to inefficient code. You should replace it with one that will allow you to find all bodies within the cut-off radius more efficiently.

**Questions **

- How did you choose the cut-off radius? Why?
- Describe the data structure you have chosen to implement the method.
- Would you change your choice if the particles were moving very quickly? Or if they were only vibrating?

**Step 3: Vectorisation **

In step-3.cpp, you will improve the performance of the code in the NBodySimulation class by exploiting vectorisation. To do this, add methods to the NBodySimulationVectorised class,which extends NBodySimulation.

Think carefully about the data structures the code is using, and assess the efficiency of your solution. Where appropriate, add OpenMP single-thread vectorisation pragmas. You might want to use performance analysis tools to find the appropriate places.

Marks will mainly be awarded for performance, but you have to ensure that you don’t break the code semantics.

**Step 4: Instruction-level parallelism **

In step-4.cpp, you should extend the code in the NBodySimulationVectorised class by adding another level of parallelism. Parallelise your code with OpenMP multi-thread parallelisation pragmas. You can do this by adding methods to the NBodySimulationParallelised in step-4.cpp.

This class inherits from the NBodySimulationVectorised class you used in the previous steps.

You are allowed to change this in step-4.cpp and make the NBodySimulationParallelised class extend NBodySimulation instead. You may wish to do so, for example, in the following cases.

- If you struggled with step 3 and your vectorised code does not work or is slower than the sequential code in NBodySimulation.

- When none of the methods you added to NBodySimulationVectorised can be reused in NBodySimulationParallelised. If that is the case, you may prefer to copy the code of the NBodySimulationVectorised class over to step-4.cpp, then rename the copied class NBodySimulationParallelised, and finally make any required modifications.

Ensure that you do not break the global statistics *v**max *and *dx**min *which are plotted after each step. This step assesses your understanding of the OpenMP parallelisation paradigm.

Marks will be awarded for performance, but you have to ensure that you don’t break the code semantics.