GPU computation has seen extensive growth in recent years owing to advances in both hardware and the software stack, leading to increased use of GPUs as accelerators across a broad spectrum of applications. This work deals with the use of general-purpose GPUs for performing CFD computations.
The paper discusses strategies and findings on porting a large multi-functional CFD code to the GPU architecture. Within this framework, the most compute-intensive segment of the software, a BiCGSTAB linear solver using additive Schwarz block preconditioners with point Jacobi iterative smoothing, is optimized for the GPU platform using various techniques in CUDA Fortran. Representative turbulent channel and pipe flows are investigated for validation and benchmarking purposes. Both single- and double-precision calculations are highlighted.
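To make the smoother concrete, the sketch below illustrates point Jacobi iterative smoothing, the kind of kernel used within each Schwarz block of the preconditioner described above. This is a minimal NumPy illustration, not the CUDA Fortran implementation from the paper; the function name, arguments, and sweep count are illustrative.

```python
import numpy as np

def jacobi_smooth(A, b, x0, sweeps=3):
    """Apply a few point Jacobi sweeps: x <- D^{-1} (b - R x),
    where D is the diagonal of A and R = A - D.
    Illustrative stand-in for the per-block smoother; not the paper's code."""
    D = np.diag(A).copy()       # diagonal entries of A
    R = A - np.diag(D)          # off-diagonal remainder
    x = x0.copy()
    for _ in range(sweeps):
        x = (b - R @ x) / D     # one Jacobi sweep
    return x
```

Each sweep is embarrassingly parallel over grid points, which is what makes point Jacobi attractive as a GPU smoother compared with inherently sequential alternatives such as Gauss-Seidel.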
It is found that precision has a negligible effect on the accuracy of the predicted turbulent statistics. However, single-precision calculations led to instabilities in the initial convergence of the pressure equation when the convergence criterion was set too low; this was remedied by limiting the number of iterations during the initial stages of the calculation. For a modest single-block grid of 64×64×64, the turbulent channel flow computations showed a speedup of about 8-fold in double precision and more than 13-fold in single precision on the NVIDIA Tesla GPU. For the pipe flow, consisting of 1.78 million grid cells distributed over 36 blocks, the gains were more modest at 4.5- and 6.5-fold for double and single precision, respectively.
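The iteration-limiting remedy mentioned above can be sketched as a simple cap on the inner pressure-solver iterations during the startup transient. This is a hypothetical illustration of the idea; the threshold step count and iteration limits are assumed values, not the paper's settings.

```python
def max_pressure_iters(step, startup_steps=50, startup_cap=20, normal_cap=500):
    """Return the iteration budget for the pressure solve at a given time step.
    During the first `startup_steps` time steps the inner iterations are capped,
    so single-precision round-off cannot stall convergence against a tight
    residual criterion. All parameter values here are illustrative."""
    return startup_cap if step < startup_steps else normal_cap
```

Once the flow field settles, the pressure equation converges normally and the full iteration budget can be restored without instability.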