Parallelization by dylan-copeland · Pull Request #60 · llnl/scaleupROM

dylan-copeland · 2025-08-26T00:57:37Z

Parallelization of ROM and FOM linear system assembly and solution, using HypreParMatrix.

This is the first step in parallelizing the library, focused on ComponentTopologyHandler and the Stokes and Poisson examples. Some things still remain to be parallelized and optimized:

SubMeshTopologyHandler is not supported in parallel.
MFEMROMHandler::Solve still has global vectors for reduced_rhs and reduced_sol.
Some sparse matrices have their entries copied inefficiently; see PoissonSolver::CreateHypreParMatrix() and StokesSolver::CreateHypreParMatrix().
InterfaceForm::AssembleInterfaceMatrices should use a sparse array of SparseMatrix pointers, rather than a dense Array2D.
MFEMROMHandler::CreateHypreParMatrix should use a sparse array of blocks.
For problems where a component type is used many times, we can optimize by constructing one Mesh and FiniteElementSpace per component type. Currently, these are constructed for each instance of the component.
TopologyHandler::LoadBalance() is written only for simple Cartesian cases and has room for improvement in general.
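
The two items above about dense block arrays could be handled by keying blocks on (row, column), so that only couplings that actually exist are stored. The following is a minimal sketch under stated assumptions, not the library's implementation: `Block` and `SparseBlockArray` are hypothetical stand-ins, and in scaleupROM the stored objects would presumably be sparse matrix blocks.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a matrix block (in practice, e.g. a sparse matrix).
struct Block
{
   std::vector<double> values;
};

// Stores only the blocks that exist, keyed by (block row, block column).
// A dense numRows x numCols array of pointers would instead hold mostly nulls
// when each subdomain couples to only a few neighbors.
class SparseBlockArray
{
public:
   SparseBlockArray(int numRows, int numCols) : nrows(numRows), ncols(numCols) {}

   // Takes ownership of the block at (i, j), replacing any previous one.
   void Set(int i, int j, std::unique_ptr<Block> block)
   {
      assert(i >= 0 && i < nrows && j >= 0 && j < ncols);
      blocks[std::make_pair(i, j)] = std::move(block);
   }

   // Returns the block at (i, j), or nullptr if no block is stored there.
   const Block *Get(int i, int j) const
   {
      auto it = blocks.find(std::make_pair(i, j));
      return (it == blocks.end()) ? nullptr : it->second.get();
   }

   std::size_t NumStored() const { return blocks.size(); }

private:
   int nrows, ncols;
   std::map<std::pair<int, int>, std::unique_ptr<Block>> blocks;
};
```

Memory then scales with the number of stored blocks rather than with the square of the number of subdomains, at the cost of a logarithmic lookup per access.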

chldkdtn · 2025-09-02T05:24:41Z

I guess the failing test is because libROM PR #316 has not been merged yet, right?

dreamer2368

@dylan-copeland, I've gone through your PR. While I appreciate your efforts on this parallelization work, I'm a bit cautious about merging it into main. Most importantly, this intermediate stage does not preserve compatibility with the other solvers that are not yet parallelized.

I suggest keeping this branch open until the parallelization is done for the entire framework. Another minor suggestion is to add a short definition or description for each variable newly introduced for the parallel workflow. Please also see the other comments.

dreamer2368 · 2026-06-09T21:44:46Z

   // Receive topology info
   numSub = topol_data.numSub;
+   numSubLoc = topol_handler->GetNumLocalSubdomains();
+   numSubStored = meshes.Size();


I don't see the difference between numSubLoc and numSubStored. Based on this initialization, they should be identical. Can you use only numSubLoc, or explain the difference in the PR description or in the class definition?

dreamer2368 · 2026-06-09T21:57:23Z

   HYPRE_BigInt sys_glob_size;
   HYPRE_BigInt sys_row_starts[2];
   HypreParMatrix *globalMat_hypre = NULL;
+   HypreParMatrix *systemOp_hypre = NULL;


This doesn't seem to get deleted at the destruction of PoissonSolver.
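
Two common ways to make sure such a pointer is released are sketched below, using stand-in types only. `Matrix`, `RawOwner`, and `SmartOwner` are hypothetical names for illustration and are not part of scaleupROM or MFEM; the instance counter exists only to make the leak check observable.

```cpp
#include <memory>

// Hypothetical stand-in for a heap-allocated operator; counts live instances.
struct Matrix
{
   static int live;
   Matrix() { ++live; }
   ~Matrix() { --live; }
};
int Matrix::live = 0;

// Option 1: raw owning pointer, released explicitly in the destructor.
class RawOwner
{
public:
   RawOwner() = default;
   RawOwner(const RawOwner &) = delete;             // prevent double deletion
   RawOwner &operator=(const RawOwner &) = delete;
   ~RawOwner() { delete op; }   // deleting a null pointer is a no-op

   void Assemble()
   {
      delete op;          // avoid leaking a previous operator on re-assembly
      op = new Matrix;
   }

private:
   Matrix *op = nullptr;
};

// Option 2: std::unique_ptr releases the operator automatically.
class SmartOwner
{
public:
   void Assemble() { op.reset(new Matrix); }

private:
   std::unique_ptr<Matrix> op;
};
```

The first option matches the raw-pointer style visible in the diff; the second removes the need to remember the delete at all.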

dreamer2368 · 2026-06-09T22:02:32Z

            break;
         }
         Array<int> &bdr_marker = *bfnfi_marker[k];
+	 /*


Why is this commented out?

dreamer2368 · 2026-06-09T22:02:57Z

            break;
         }
         Array<int> &bdr_marker = *bfnfi_marker[k];
+	 /*


Same as above.

dreamer2368 · 2026-06-09T22:03:08Z

      else
      {
         Array<int> &bdr_marker = *bfnfi_marker[k];
+	 /*


Same as above.

dreamer2368 · 2026-06-09T22:04:28Z

      int midx = -1;
      int vidx = (separate_variable) ? b % num_var : 0;
-      for (int m = 0; m < numSub; m++)
+      for (int m = 0; m < numSubLoc; m++)


topol_handler->GetMeshType(m) seems to take a global index, not a local index. Is this still valid?

dreamer2368 · 2026-06-09T22:11:40Z

-         int local_dim = CAROM::split_dimension(dim_ref_basis[k], MPI_COMM_WORLD);
-         basis_reader = new CAROM::BasisReader(basis_name + basis_tags[k].print(), CAROM::Database::formats::HDF5_MPIO, local_dim);
+         int local_dim = dim_ref_basis[k];
+         basis_reader = new CAROM::BasisReader(basis_name + basis_tags[k].print() + ".000000", CAROM::Database::formats::HDF5, local_dim, MPI_COMM_NULL);


I'm not objecting to this convention change. However, until the parallel setting is fully implemented in all solvers, the code needs to remain compatible with both cases.

dreamer2368 · 2026-06-09T22:14:53Z

+   for (int mm = 0; mm < numSubLoc; mm++)
+     {
+      const int m = mm + ossub;
+      const int m_global = topol_handler->GlobalSubdomainIndex(mm);


What is the difference between m and m_global? If they are identical, can we use only m_global?
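
Whether the two indices coincide depends on how subdomains are distributed. The sketch below assumes that `ossub` is the global index of the first local subdomain, which the diff does not show. `LocalSubdomains`, `ContiguousPartition`, and `OffsetMatchesGlobal` are hypothetical names for illustration. With a contiguous block partition, "offset plus local index" reproduces the global index; with any other distribution it does not.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of the subdomains owned by one rank.
struct LocalSubdomains
{
   std::vector<int> globalIndex;   // globalIndex[mm] for local index mm

   int NumLocal() const { return static_cast<int>(globalIndex.size()); }
   int Global(int mm) const
   {
      assert(mm >= 0 && mm < NumLocal());
      return globalIndex[mm];
   }
};

// Contiguous block partition: the given rank owns [start, start + count).
LocalSubdomains ContiguousPartition(int numSub, int numRanks, int rank)
{
   assert(numRanks > 0 && rank >= 0 && rank < numRanks);
   const int base = numSub / numRanks;
   const int extra = numSub % numRanks;     // first `extra` ranks get one more
   const int count = base + (rank < extra ? 1 : 0);
   const int start = rank * base + (rank < extra ? rank : extra);

   LocalSubdomains loc;
   for (int mm = 0; mm < count; mm++) { loc.globalIndex.push_back(start + mm); }
   return loc;
}

// True if "offset + local index" reproduces every global index.
bool OffsetMatchesGlobal(const LocalSubdomains &loc)
{
   for (int mm = 0; mm < loc.NumLocal(); mm++)
   {
      if (loc.Global(0) + mm != loc.Global(mm)) { return false; }
   }
   return true;
}
```

If only contiguous partitions are ever produced, one of the two variables is redundant; if not, the offset-based index is wrong and only the mapped global index should be used.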

dreamer2368 · 2026-06-09T22:17:06Z

+	localBlocks[0] = bsum;
+    }
+
+  MFEM_VERIFY(sum == gsize && bsum == num_rom_blocks, "");


Can we print a clear error message for this?
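
One possible shape for such a message is sketched below. The variable names come from the diff, but their meanings are assumed from the names alone, and `DescribePartitionMismatch` is a hypothetical helper. The resulting text could be supplied as the message argument of the existing check in place of the empty string.

```cpp
#include <sstream>
#include <string>

// Returns an empty string if both checks pass, otherwise a description of
// which totals disagree. Parameter meanings are assumed from their names.
std::string DescribePartitionMismatch(long long sum, long long gsize,
                                      int bsum, int num_rom_blocks)
{
   std::ostringstream msg;
   if (sum != gsize)
   {
      msg << "Sum of local sizes (" << sum
          << ") does not match global size (" << gsize << "). ";
   }
   if (bsum != num_rom_blocks)
   {
      msg << "Sum of local block counts (" << bsum
          << ") does not match number of ROM blocks (" << num_rom_blocks << ").";
   }
   return msg.str();
}
```

Reporting both the expected and the actual value makes it possible to tell from a log which of the two conditions failed and by how much.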

dreamer2368 · 2026-06-09T22:18:48Z

   {
-      if (!sample_generator->IsMyJob(s)) continue;
+      if (!sample_generator->IsMyJob(s)) {
+	s++;


dreamer2368 · 2026-06-09T22:45:29Z

+       systemOp->SetBlock(0,1, Bt);
+       systemOp->SetBlock(1,0, B);
+     }
+   else


So the parallel path doesn't seem to set M, B, Bt, and systemOp, which are used in the !direct_solve case in StokesSolver::Solve. Can we at least raise an error if the parallel iterative case is set up?

dreamer2368 · 2026-06-09T22:47:38Z

      Vector pres_view(sol_byvar, vblock_offsets[1], vblock_offsets[2] - vblock_offsets[1]);

      // TODO(kevin): parallelization.
      double tmp = pres_view.Sum() / pres_view.Size();


I think this part needs a proper average calculation in the parallel path.
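
The point can be illustrated without MPI: when ranks hold different numbers of entries, the global mean has to be formed from the global sum and the global count, not from per-rank means. This is a minimal sketch in which ranks are simulated by a vector of partial results; `Partial`, `LocalPartial`, and `GlobalMean` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Per-rank contribution to a global mean: local sum and local entry count.
struct Partial
{
   double sum;
   long long count;
};

Partial LocalPartial(const std::vector<double> &localValues)
{
   Partial p{0.0, static_cast<long long>(localValues.size())};
   for (double v : localValues) { p.sum += v; }
   return p;
}

// Combines the partials of all ranks. With MPI this loop would be replaced by
// sum reductions (MPI_Allreduce with MPI_SUM) over `sum` and `count`.
double GlobalMean(const std::vector<Partial> &partials)
{
   double sum = 0.0;
   long long count = 0;
   for (const Partial &p : partials)
   {
      sum += p.sum;
      count += p.count;
   }
   assert(count > 0);
   return sum / count;
}
```

For example, with local values {1, 2, 3} on one rank and {10} on another, the global mean is 16 / 4 = 4, whereas averaging the two local means would give 6.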

dreamer2368 · 2026-06-09T22:49:47Z

+  if (set_oper) mumps->SetOperator(*systemOp_hypre);
+}
+
+void StokesSolver::SetupMUMPSSolverParallel()


Do we not need delete commands similar to those in the serial version?

dreamer2368 · 2026-06-09T22:53:02Z


   pmMat = new BlockMatrix(p_offsets);
-   for (int m = 0; m < numSub; m++)
+   for (int m = 0; m < numSubStored; m++)


This may not have been run in the parallel case, since it is only used for the iterative solver. The pressure mass matrix needs to be set up properly in the parallel case. If the PR does not support the iterative solver, then raise an error if this function is called in the parallel case.
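
A fail-fast guard for the unsupported combination might look like the following sketch. `CheckSolverSupported` and its parameters are hypothetical and use a standard exception for self-containment; inside the library, the project's own error-reporting mechanism would be the natural choice.

```cpp
#include <stdexcept>

// Hypothetical guard: rejects the combination that is not yet supported.
// `numRanks` is the communicator size; `directSolve` mirrors a direct_solve flag.
void CheckSolverSupported(int numRanks, bool directSolve)
{
   if (numRanks > 1 && !directSolve)
   {
      throw std::runtime_error(
         "Iterative solver is not supported in parallel; use a direct solver.");
   }
}
```

Calling such a check at setup time turns a silently incomplete code path into an explicit error before any solve is attempted.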

dylan-copeland added 5 commits February 13, 2025 15:20

Generalized the system matrix as a HypreParMatrix, which can be solve…

6fd79a0

…d in parallel.

Merge branch 'main' of github.com:LLNL/scaleupROM into parallel

65152e4

Parallelized ROM and FOM global systems with HypreParMatrix.

f1ea8f5

Parallelized Poisson solve.

8f18020

Refactoring.

a0c2a73

dylan-copeland added the enhancement New feature or request label Aug 26, 2025

dylan-copeland added 6 commits August 29, 2025 09:47

New parallel Poisson example.

339833c

Minor fixes.

cc58d14

Adding mesh files.

dbaec76

Fixed memory bug. Removed redundant mesh files.

ae26fbb

Fixed more memory bugs.

72380dd

Merge branch 'main' of github.com:LLNL/scaleupROM into parallel

af3fcc5

chldkdtn requested a review from dreamer2368 September 2, 2025 05:23

dylan-copeland mentioned this pull request Sep 3, 2025

Parallel reading of serial basis files llnl/libROM#316

Merged

dylan-copeland added 8 commits September 27, 2025 20:57

Docker

57d7633

Bug fix.

7cd5c9d

Restoring SubMeshTopologyHandler.

a53cf3c

Fixed test_topol.

33f6875

Fixed test_workflow.

9d328bf

Input file option for hypre matrix assembly.

b312640

Update Stokes example.

e459ee8

Missing change.

c9c162c

dreamer2368 requested changes Jun 9, 2026

dreamer2368 reviewed Jun 9, 2026
