Skip to content

Parallelization#60

Open
dylan-copeland wants to merge 19 commits into
mainfrom
parallel
Open

Parallelization#60
dylan-copeland wants to merge 19 commits into
mainfrom
parallel

Conversation

@dylan-copeland

@dylan-copeland dylan-copeland commented Aug 26, 2025

Copy link
Copy Markdown
Collaborator

Parallelization of ROM and FOM linear system assembly and solution, using HypreParMatrix.

This is the first step in parallelizing the library, focused on ComponentTopologyHandler and the Stokes and Poisson examples. Some things still remain to be parallelized and optimized:

  • SubMeshTopologyHandler is not supported in parallel.
  • MFEMROMHandler::Solve still has global vectors for reduced_rhs and reduced_sol.
  • Some sparse matrices have their entries copied inefficiently, see PoissonSolver::CreateHypreParMatrix() and StokesSolver::CreateHypreParMatrix().
  • InterfaceForm::AssembleInterfaceMatrices should use a sparse array of SparseMatrix pointers, rather than a dense Array2D.
  • MFEMROMHandler::CreateHypreParMatrix should use a sparse array of blocks.
  • For problems where a component type is used many times, we can optimize by constructing one Mesh and FiniteElementSpace per component type. Currently, these are constructed for each instance of the component.
  • TopologyHandler::LoadBalance() is written only for simple Cartesian cases and has room for improvement in general

@dylan-copeland dylan-copeland added the enhancement New feature or request label Aug 26, 2025
@chldkdtn chldkdtn requested a review from dreamer2368 September 2, 2025 05:23
@chldkdtn

chldkdtn commented Sep 2, 2025

Copy link
Copy Markdown
Collaborator

I guess the failing test is because libROM PR #316 has not been merged yet, right?

@dreamer2368 dreamer2368 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dylan-copeland , I've gone through your PR. While I appreciate your efforts on this parallelization work, I'm a bit cautious about merging it into main. Most importantly, this intermediate stage doesn't keep the compatibility for other solvers that are not yet parallelized.

I suggest to keep this branch until the parallelization is done for the entire framework. Another minor suggestion is to have a short definition/description on newly introduced variables for parallel workflow. Also please see the other comments.

Comment thread src/multiblock_solver.cpp
// Receive topology info
numSub = topol_data.numSub;
numSubLoc = topol_handler->GetNumLocalSubdomains();
numSubStored = meshes.Size();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see the difference between numSubLoc and numSubStored. Based on this initialization, they should be identical. Can you only use numSubLoc, or explain the difference in PR or in the class definition?

HYPRE_BigInt sys_glob_size;
HYPRE_BigInt sys_row_starts[2];
HypreParMatrix *globalMat_hypre = NULL;
HypreParMatrix *systemOp_hypre = NULL;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem to get deleted at the destruction of PoissonSolver.

Comment thread src/rom_nonlinearform.cpp
break;
}
Array<int> &bdr_marker = *bfnfi_marker[k];
/*

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this commented out?

Comment thread src/rom_nonlinearform.cpp
break;
}
Array<int> &bdr_marker = *bfnfi_marker[k];
/*

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

Comment thread src/rom_nonlinearform.cpp
else
{
Array<int> &bdr_marker = *bfnfi_marker[k];
/*

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

Comment thread src/rom_handler.cpp
int midx = -1;
int vidx = (separate_variable) ? b % num_var : 0;
for (int m = 0; m < numSub; m++)
for (int m = 0; m < numSubLoc; m++)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

topol_handler->GetMeshType(m) seems to take global index, not local index. Is this still valid?

Comment thread src/rom_handler.cpp
int local_dim = CAROM::split_dimension(dim_ref_basis[k], MPI_COMM_WORLD);
basis_reader = new CAROM::BasisReader(basis_name + basis_tags[k].print(), CAROM::Database::formats::HDF5_MPIO, local_dim);
int local_dim = dim_ref_basis[k];
basis_reader = new CAROM::BasisReader(basis_name + basis_tags[k].print() + ".000000", CAROM::Database::formats::HDF5, local_dim, MPI_COMM_NULL);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not objecting this convention change. However, until we fully implement parallel setting on all solvers, the code needs to be compatible in both cases.

Comment thread src/multiblock_solver.cpp
for (int mm = 0; mm < numSubLoc; mm++)
{
const int m = mm + ossub;
const int m_global = topol_handler->GlobalSubdomainIndex(mm);

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is the difference between m and m_global? If they are identical, can we choose m_global only?

Comment thread src/rom_handler.cpp
localBlocks[0] = bsum;
}

MFEM_VERIFY(sum == gsize && bsum == num_rom_blocks, "");

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we print out a clear error message for this?

Comment thread src/main_workflow.cpp
{
if (!sample_generator->IsMyJob(s)) continue;
if (!sample_generator->IsMyJob(s)) {
s++;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indents

Comment thread src/stokes_solver.cpp
systemOp->SetBlock(0,1, Bt);
systemOp->SetBlock(1,0, B);
}
else

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So parallel path doesn't seem to set M, B, Bt, and systemOp, which are used in !direct_solve case in StokesSolver::Solve. Can we at lease raise an error if parallel iterative case is set up?

Comment thread src/stokes_solver.cpp
Vector pres_view(sol_byvar, vblock_offsets[1], vblock_offsets[2] - vblock_offsets[1]);

// TODO(kevin): parallelization.
double tmp = pres_view.Sum() / pres_view.Size();

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this part needs a proper average calculation in parallel path.

Comment thread src/stokes_solver.cpp
if (set_oper) mumps->SetOperator(*systemOp_hypre);
}

void StokesSolver::SetupMUMPSSolverParallel()

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we not need delete commands similar to the serial version?

Comment thread src/stokes_solver.cpp

pmMat = new BlockMatrix(p_offsets);
for (int m = 0; m < numSub; m++)
for (int m = 0; m < numSubStored; m++)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This may not have run in parallel case since it is only used for iterative solver. the pressure mass matrix needs to be properly set up in parallel case. If the PR does not support iterative solver, then raise an error if this function is called at parallel case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants