Skip to content

ShiqiHe000/DG_wave_c

Repository files navigation

Dynamic Load Balancing for a hp-adaptive Discontinuous Galerkin Wave Equation Solver via Space-Filling Curve and Advanced Data Structure

Documentation Status

Introduction

We combine a high-order method -- the discontinuous Galerkin spectral element method (DG-SEM), with parallel adaptive mesh refinement and coarsening (AMR) techniques and apply it to a two-dimensional wave equation solver.

Advanced data structures and dynamic load balancing are applied to the solver to achieve efficient data management and high-level parallelism.

Setup

What You Need

  • C++
  • CMake (at least version 3.9)
  • GCC 7.5.0 (GNU Compiler Collection)
  • OpenMPI 4.0.2

Compile and Execute

mkdir build && cd build
cmake ..
make 
mpirun -np 4 main

You also can use parallel make, e.g., make -j 16. 16 threads will be used to compile the code.

mpirun -np 4 main executes the program with 4 processors, you could change the processor number as you want.

Documentation

A detailed documentation an be found at here.

Source Code Documentation

Source code explanation: Source code documentation

Approximation of Wave Equation

The basic model of wave propagation is the wave equation:

The variable p represents the acoustic pressure and c is the sound speed.

AMR Refinement Types: hp-adaptivity

Two types of refinements are implemented in this work: h-refinement and p-refinement.

h-refinement

Subdivide an element into children elements.

p-refinement

Raise polynomial orders inside the targeted element.

AMR Data Structure: Hash Table

In stead of using the conventional tree data structure, hash table data structure is employeed.

Dynamic Load Balancing

The adaptivity of the program introduces load imbalance among the processors, causing the program to slow down. In order to achieve high-level parallelism, dynamic load balancing strategy is apllied to balance the workload.

The workload redistribution problem is an optimization problem with two main goals: the workload should be distributed evenly among the processors with small memory overhead and the interfacing boundaries between partitions should be as small as possible. The optimization problem has been proven to be NP-hard.

Traditional Way: Graph-based Repartitioning Algorithm

pros:

  • Well-studied.
  • Multiple libraries.

cons:

  • Reached scalability limits (requires "global" graph knowledge).
  • High memory consumption.

My Proposal: Space-Filling Curves (SFCs) Based Repartitioning Algorithm

pros:

  • Simplifies a multi-dimensional partitioning problem into a one-dimensional one.
  • SFCs have good locality and can be generated fast.
  • Low memory usage.
  • Favours distributed systems.

cons:

  • The partitioning boundaries are decided by the trajectry of the space-filling curve.

Hilber curve is chosen for this work. The figure below shows the first three levels of Hilbert curve.

Program Proformace

Speedup

The test is done on Cedar with processor number ranging from 32 to 2048. We gained a maximum speedup around 8.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages