[email protected] + OpenMP: fixes including for -O1
- Add explicit mapping clauses to avoid a crash in test-solver when compiled with optimisations enabled.
- Use an explicit if (nt->compute_gpu) block instead of relying on if clauses in OpenACC and OpenMP directives; nvc++ 22.3 does not seem to respect these clauses at least in some cases.
- To avoid code duplication, introduce a solve_interleaved2_loop_body function that combines the triangularisation and back substitution steps.
- Because the compiler cannot deal with many layers of function calls in device code, manually inline function calls into solve_interleaved2_loop_body.
Showing 1 changed file with 118 additions and 107 deletions.