IGA2025

Trace Theory Based Methodology for Constructing Optimal Concurrent Algorithms for Integrating Three-Dimensional B-spline Functions into Machines with Shared Memory such as GPU

  • Szyszka, Anna (AGH University of Krakow)
  • Studziński, Patryk (AGH University of Krakow)
  • Sumara, Szymon (AGH University of Krakow)
  • Woźniak, Maciej (AGH University of Krakow)

We examined efficient methods for integrating B-spline basis functions in IGA-FEM using trace theory \cite{1,3}. We analyzed the cost of assembling the mass matrix and explored parallelization scenarios for two standard integration methods: classical integration and sum factorization. Our goal was to make effective use of modern hybrid-memory clusters by addressing the shared-memory component of concurrency. We evaluated computational performance on a GPU and proposed a practical implementation strategy with a near-optimal scheduling methodology. We compared the performance of classical integration and sum factorization in different scenarios \cite{2} by examining experimental computation times. We focused on the scenario with polynomial order basis functions, as it was anticipated to be the most favorable scenario for sum factorization. For a mesh size of , we evaluated three scenarios: 1) single-core CPU, 2) shared-memory CPU, and 3) GPU. Classical integration on a single core took 11752.54 seconds, while the 12-core OpenMP implementation reduced the time to 1125.72 seconds. The GPU implementation was estimated to take only 5.87 seconds. For sum factorization, single-core execution took 394.28 seconds, the 4-core OpenMP implementation took 112.65 seconds, and the GPU implementation was estimated to take 10.78 seconds.
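
The reported timings can be put on a common footing by computing speedups relative to the single-core run. The sketch below does only that arithmetic. The times are the ones quoted in the abstract; the dictionary layout and the helper names `speedup` and `openmp_efficiency` are illustrative assumptions, and the derived ratios are not figures stated by the authors. Note that the GPU times are estimates, and that the two OpenMP runs used different core counts (12 and 4), so their speedups are not directly comparable.

```python
# Times in seconds, copied from the abstract. GPU values are estimates.
times = {
    "classical": {"single_core": 11752.54, "openmp": 1125.72, "gpu_estimate": 5.87},
    "sum_factorization": {"single_core": 394.28, "openmp": 112.65, "gpu_estimate": 10.78},
}

# Core counts of the OpenMP runs as reported in the abstract.
openmp_cores = {"classical": 12, "sum_factorization": 4}


def speedup(method: str, scenario: str) -> float:
    """Single-core time divided by the time of the given scenario."""
    return times[method]["single_core"] / times[method][scenario]


def openmp_efficiency(method: str) -> float:
    """OpenMP speedup divided by the number of cores used (1.0 is ideal)."""
    return speedup(method, "openmp") / openmp_cores[method]


for method in times:
    print(
        f"{method}: "
        f"OpenMP speedup {speedup(method, 'openmp'):.2f}x "
        f"on {openmp_cores[method]} cores "
        f"(efficiency {openmp_efficiency(method):.2f}), "
        f"estimated GPU speedup {speedup(method, 'gpu_estimate'):.1f}x"
    )
```

Under these numbers, both OpenMP runs reach a parallel efficiency of roughly 0.87, while the estimated GPU speedup is far larger for classical integration than for sum factorization. As a result, the estimated GPU time for classical integration (5.87 seconds) is lower than that for sum factorization (10.78 seconds), even though sum factorization is much faster on a single core.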