Measuring XGC with TAU on Summit and Spock
Here are some instructions for using TAU to measure XGC on Summit and Spock at OLCF.
A version of the CAMTIMERS library has been integrated with PerfStubs support. That library is now installed in /gpfs/alpine/world-shared/phy122/lib/install/summit/camtimers-perfstubs/nvhpc21.7 on OLCF resources. For more information on PerfStubs, see https://github.com/khuck/perfstubs/blob/master/perfstubs_api/README.md
The source for this installation is in https://github.com/khuck/camtimers.
TAU has tool support for the PerfStubs interface.
TAU is installed in /gpfs/alpine/world-shared/phy122/lib/install/summit/tau2/nvhpc21.7 on Summit. The build installed in /gpfs/alpine/world-shared/phy122/lib/install/summit/tau2/gcc9.3 is for use with GENE.
TAU was installed from https://github.com/UO-OACISS/tau2/, using these commands:
git clone https://github.com/UO-OACISS/tau2.git
cd tau2
module load nvhpc/21.7 spectrum-mpi cuda/11.4 papi/ binutils/2.36.1
./configure -mpi \
-c++=mpicxx \
-cc=mpicc \
-fortran=mpif90 \
-iowrapper \
-otf=download \
-ompt \
-cuda=/sw/summit/nvhpc_sdk/rhel8/Linux_ppc64le/21.7/cuda/11.4 \
-papi=/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/papi- \
-bfd=/sw/summit/spack-envs/base/opt/linux-rhel8-ppc64le/gcc-8.3.1/binutils-2.36.1-abgveowfozcbngvoli6duel7zsfguvui \
-prefix=/gpfs/alpine/world-shared/phy122/lib/install/summit/tau2/nvhpc21.7
make -j16 install
The TAU installation for use with GENE was built with these commands:
module load otf2/2.3
./configure -mpi \
-c++=mpicxx \
-cc=mpicc \
-fortran=mpif90 \
-iowrapper \
-otf=${OLCF_OTF2_ROOT} \
-cuda=${OLCF_CUDA_ROOT} \
-papi=${OLCF_PAPI_ROOT} \
-prefix=/gpfs/alpine/world-shared/phy122/lib/install/summit/tau2/gcc9.3
make -j16 install
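To see which configurations a given TAU installation provides, one option is to list the stub makefiles under its library directory. The sketch below assumes the usual TAU layout, in which each configuration installs a Makefile.tau-<configuration> file in <prefix>/<arch>/lib (with ibm64linux as the architecture directory on Summit). The function name list_tau_configs is hypothetical and not part of TAU.

```shell
#!/bin/sh
# Print the configuration names found in a TAU library directory by
# stripping the "Makefile.tau-" prefix from the stub makefile names.
# ASSUMPTION: the install follows the usual <prefix>/<arch>/lib layout.
list_tau_configs() {
    libdir="$1"
    for f in "$libdir"/Makefile.tau-*; do
        # Skip the unexpanded pattern when no makefile matches
        [ -e "$f" ] || continue
        basename "$f" | sed 's/^Makefile\.tau-//'
    done
}

# Example (prefix taken from the install commands above):
# list_tau_configs /gpfs/alpine/world-shared/phy122/lib/install/summit/tau2/nvhpc21.7/ibm64linux/lib
```

The names printed this way should help when choosing the tags passed to the -T option of tau_exec.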
TAU is used at runtime with the tau_exec wrapper script. The script preloads the TAU libraries and sets the appropriate environment variables. A sample job script for running the XGC test program is:
#BSUB -W 0:02
#BSUB -nnodes 1
#BSUB -N [email protected]
#BSUB -B [email protected]
# Load modules and set paths, same as the build environment
source /ccs/home/khuck/ECP-WDM/src/sourceme-summit.sh
# VERY IMPORTANT! Darshan and TAU both try to wrap MPI_Init/MPI_Finalize, but only one library can...
module unload darshan-runtime
cd /gpfs/alpine/world-shared/projectid/userid/summit/XGC1Example
# create restart file directory
mkdir -p restart_dir
export xgc_bin_path=/ccs/home/userid/ECP-WDM/src/XGC-Devel/build_full_summit/bin/xgc-es-cpp-gpu
export tau_path=/gpfs/alpine/world-shared/phy122/lib/install/summit/tau2/nvhpc21.7/ibm64linux/bin
cmd="tau_exec -T nvhpc21.7_omp -ompt"
jsrun -n 4 -r 4 -a 1 -g 1 -c 7 -b rs $cmd $xgc_bin_path --test
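Because a loaded Darshan module causes problems that are hard to spot, it may help to check for it explicitly before launching. The following is a minimal sketch, assuming the module system exports the LOADEDMODULES environment variable as a colon-separated list (both Environment Modules and Lmod normally do). The function name darshan_unloaded is hypothetical.

```shell
#!/bin/sh
# Return 0 if no Darshan module appears in LOADEDMODULES, 1 otherwise.
# ASSUMPTION: LOADEDMODULES is a colon-separated list of loaded modules.
darshan_unloaded() {
    case ":${LOADEDMODULES:-}:" in
        *:darshan*) return 1 ;;
        *) return 0 ;;
    esac
}

# Possible use in a job script, before the jsrun line:
# darshan_unloaded || { echo "unload darshan-runtime first" >&2; exit 1; }
```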
APEX also has tool support for the PerfStubs interface. APEX is related to TAU, but is a slightly different tool. For more information on using APEX, see https://github.com/UO-OACISS/apex.
APEX is installed in /gpfs/alpine/world-shared/phy122/lib/install/summit/apex/nvhpc21.7 on Summit.
APEX was installed with these commands:
module load nvhpc/21.7 spectrum-mpi cuda/11.4 papi/ binutils/2.36.1
module load otf2/2.3
module load gperftools/2.8.1
git clone https://github.com/UO-OACISS/apex.git
cd apex
rm -rf ${builddir} ${instdir}/include ${instdir}/lib
mkdir ${builddir}
cd ${builddir}
set -x
cmake \
-DCMAKE_C_COMPILER=`which nvc` \
-DCMAKE_CXX_COMPILER=`which nvc++` \
-DCMAKE_INSTALL_PREFIX=/gpfs/alpine/world-shared/phy122/lib/install/summit/apex/nvhpc21.7
make -j8
make -j install
Here's a sample job script for running with APEX:
#BSUB -W 0:02
#BSUB -nnodes 1
#BSUB -N [email protected]
#BSUB -B [email protected]
# Load modules and set paths, same as the build environment
source /ccs/home/khuck/ECP-WDM/src/sourceme-summit.sh
# VERY IMPORTANT! Darshan and APEX both try to wrap MPI_Finalize, but only one library can...
module unload darshan-runtime
cd /gpfs/alpine/world-shared/projectid/userid/summit/XGC1Example
# create restart file directory
mkdir -p restart_dir
export xgc_bin_path=/ccs/home/userid/ECP-WDM/src/XGC-Devel/build_full_summit/bin/xgc-es-cpp-gpu
# Options: see the output of `apex_exec --apex:help` for more info
apex_cmd="apex_exec --apex:quiet --apex:ompt --apex:kokkos --apex:cuda --apex:gtrace"
jsrun -n 4 -r 4 -a 1 -g 1 -c 10 -b rs $apex_cmd $xgc_bin_path --test
[Image: the Google Trace Events trace generated by the above command, visualized in Google Chrome]
[Image: the same trace, visualized in Perfetto]
- There is one problematic timer in the current XGC code base: the “F_SOURCE_FIRST_PART” timer, which is started in a routine in XGC_core/main_loop_f90_routines.F90 and stopped in add_particle_and_grid_dist_funcs_wrap. It overlaps with other timers (“UPDATE_ANALYTIC_F0”, for example) and should either be removed or, if it is intended to measure a phase, promoted up a function call or two. With this timer commented out, TAU handles the timers fine.
- Although nvc/nvc++/nvfortran claim to support OpenMP 5.0 OMPT callbacks, TAU and APEX are not receiving any. More investigation is needed, but it appears that the NVIDIA/PGI implementation provides the OMPT header but no actual OMPT support in the runtime. Otherwise, all the camtimers, Kokkos, and CUDA (host and device) events do appear in the measurements.
- Darshan and TAU don't play well together (see above). Make sure the Darshan runtime module is unloaded when running the simulation with TAU.
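To illustrate what "overlapping" timers means in the first item, here is a small sketch that checks whether a sequence of timer start/stop events is properly nested, meaning every stop matches the most recently started timer. It is an illustration only and not part of TAU, APEX, or XGC. The function name check_nested and the event format are hypothetical, and the event order in the commented example is assumed rather than taken from an actual XGC run.

```shell
#!/bin/sh
# Read "start NAME" / "stop NAME" events from stdin, one per line, and
# verify that each stop matches the innermost running timer.
# ASSUMPTION: timer names contain no spaces.
check_nested() {
    stack=""
    while read -r action name; do
        case "$action" in
            start)
                # Push the timer onto the front of the stack
                stack="$name $stack" ;;
            stop)
                # The innermost timer is the first word of the stack
                top="${stack%% *}"
                if [ "$top" != "$name" ]; then
                    echo "overlap: stop $name while ${top:-nothing} is innermost"
                    return 1
                fi
                # Pop the innermost timer
                stack="${stack#* }" ;;
        esac
    done
    return 0
}

# Hypothetical event order that would be reported as an overlap:
# printf '%s\n' "start F_SOURCE_FIRST_PART" "start UPDATE_ANALYTIC_F0" \
#     "stop F_SOURCE_FIRST_PART" | check_nested
```

A tool that expects strictly nested timers can only make sense of sequences that pass such a check, which is why removing the timer or moving it to an enclosing call level avoids the problem.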