Skip to content

[Core] Correcting a BIG performance bug of model_part_io#14433

Open
RiccardoRossi wants to merge 3 commits into
masterfrom
ReadingRegressionFix
Open

[Core] Correcting a BIG performance bug of model_part_io#14433
RiccardoRossi wants to merge 3 commits into
masterfrom
ReadingRegressionFix

Conversation

@RiccardoRossi
Copy link
Copy Markdown
Member

@RiccardoRossi RiccardoRossi commented May 6, 2026


name: ✨ Feature
about: Correcting a BIG performance bug of model_part_io


📝 Description

At some point a BIG performance bug was introduced in the ReadGeometriesBlock function of ModelPartIO.

Key changes

Essentially an insert was triggered for every new geomety which was triggering quadratic times in the reading phase when geometries were to be read in non-cosecutive order (has happens when multiple parts are written)

Validation

CI passes.

🆕 Changelog

At some point a BIG performance bug was introduced in the ReadGeometriesBlock function of ModelPartIO

essentially an insert was triggered for every new geomety which was triggering quadratic times in the reading phase when geometries were to be read in non-cosecutive order (has happens when multiple parts are written)
@RiccardoRossi RiccardoRossi requested a review from a team as a code owner May 6, 2026 12:56
@RiccardoRossi RiccardoRossi added FastPR This Pr is simple and / or has been already tested and the revision should be fast Bugfix labels May 6, 2026
roigcarlo
roigcarlo previously approved these changes May 6, 2026
Comment thread kratos/sources/model_part_io.cpp Outdated
@loumalouomega loumalouomega changed the title correcting a BIG performance bug of model_part_io [Core] Correcting a BIG performance bug of model_part_io May 6, 2026
@RiccardoRossi
Copy link
Copy Markdown
Member Author

the code is

    /// Inserts a list of pointers to geometries
    template<class TIteratorType >
    void AddGeometries(TIteratorType geometries_begin,  TIteratorType geometries_end)
    {
        EntityRangeChecker<GeometryContainerType>()(this, geometries_begin, geometries_end);

        InsertEntityRange([](ModelPart* pModelPart) {
            return &(pModelPart->GetMesh().Geometries());
        }, geometries_begin, geometries_end);
    }

in the InsertEntityRange it does create a container, but it does an insert there...

@matekelemen
Copy link
Copy Markdown
Contributor

but it does an insert there...

geometries are stored in a PointerVectorSet. it has an insert function. that must make sure that items are inserted in a sorted manner. if that's not the case, we have a bigger issue than performance

@RiccardoRossi
Copy link
Copy Markdown
Member Author

I do.not question thqt at the end they are sorted.

The ppint is that inserting sorted values is (way) faster

@roigcarlo
Copy link
Copy Markdown
Member

roigcarlo commented May 14, 2026

Mate is right, it follows this sequence:
1 - You insert in a temporal std::vectro
2 - That goes to the modelpart AddGeometries range.
3 - Since the interal container (PVS) and your type (std::vector) are different it is considered case 1, so it inserts the whole range into a PVS.
4 - Inside the PVS we end up executing:

    template <class InputIterator>
    void insert(InputIterator first, InputIterator last)
    {
        // We copy always the input range to a temp not to have the input range mutated.
        std::vector<TPointerType> temp;
        temp.reserve(std::distance(first, last));
        for (auto it = first; it != last; ++it) {
            temp.push_back(GetPointer(it));
        }
        std::sort(temp.begin(), temp.end(), CompareKey());
        auto new_last = std::unique(temp.begin(), temp.end(), EqualKeyTo());
        SortedInsert(temp.begin(), new_last);

        // KRATOS_ERROR << "HELLO THERE";
    }

Which ensure that we end up with a sorted PVS at the end.

The problem is not that every new geometry triggers a sort, the problem is that every new geometry added triggers a call to std::lower_bound which adds a O(logN) cost, and also which in the average case triggers a std::vector insert that causes a data shift that dominates the cost and effectively making it behave as a O(N²).

Still not sure why the new version of the code is making the constitutive law application detect duplicates.

Edit: @matekelemen is right about the sort, @RiccardoRossi is right about the cost.

@matekelemen
Copy link
Copy Markdown
Contributor

matekelemen commented May 14, 2026

Still not sure why the new version of the code is making the constitutive law application detect duplicates.

Because the TC decided to go with #12564 that is, to no one's surprise, difficult to maintain. Anyway, you can't just use range insertion by itself because you need to pre-check the array for permissible duplicates manually.

@matekelemen
Copy link
Copy Markdown
Contributor

The problem is not that every new geometry triggers a sort, the problem is that every new geometry added triggers a call to std::lower_bound which adds a O(logN) cost, and also which in the average case triggers a std::vector insert that causes a data shift that dominates the cost and effectively making it behave as a O(N²).

I understand why range insertion is more efficient. I just pointed out that doing sorting twice is unnecessary. If you still want to do it, I can't stop you. Please mark PR-s TC-only if you don't want others to review them.

@roigcarlo
Copy link
Copy Markdown
Member

I understand why range insertion is more efficient. I just pointed out that doing sorting twice is unnecessary. If you still want to do it, I can't stop you. Please mark PR-s TC-only if you don't want others to review them.

Nono, we want review. I am changing the code so there is no sort and the acceptance criteria is consistent across insertion methods. I just put the conclusion here so I don't forget what to do.

@RiccardoRossi
Copy link
Copy Markdown
Member Author

The problem is not that every new geometry triggers a sort, the problem is that every new geometry added triggers a call to std::lower_bound which adds a O(logN) cost, and also which in the average case triggers a std::vector insert that causes a data shift that dominates the cost and effectively making it behave as a O(N²).

I understand why range insertion is more efficient. I just pointed out that doing sorting twice is unnecessary. If you still want to do it, I can't stop you. Please mark PR-s TC-only if you don't want others to review them.

? I asked @roigcarlo to take a look since i am a bit overwhelmed as of now...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Bugfix FastPR This Pr is simple and / or has been already tested and the revision should be fast Kratos Core Performance

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants