Skip to content

AMGraph should use unordered collections #102

@cakarsubasi

Description

@cakarsubasi

Right now, AMGraph uses std::map to store adjacency information. This is mostly unnecessary since abstract node identifiers don't have any logical lexicographical order and changing it to std::unordered_map will improve performance and memory usage. NodeSet type should also be changed to unordered_set.

The exact performance impact of this is difficult to say but right now it appears to be around 10% for the RsR which is too good to pass up. Other parts of GEL might benefit more especially if the average valency is very high.

Some quick and dirty benchmark runs:

n = 34834

# Ordered sets:

### Lazily generated neighbors
Spent 2156 milliseconds on Algorithm
Spent 554 milliseconds on Split components
Spent 293 milliseconds on Estimate and smooth normals
Spent 730 milliseconds on Correct normal orientation
Spent 3735 milliseconds on Whole process

### Eagerly generated neighbors
Spent 2232 milliseconds on Algorithm
Spent 543 milliseconds on Split components
Spent 280 milliseconds on Estimate and smooth normals
Spent 742 milliseconds on Correct normal orientation
Spent 3798 milliseconds on Whole process


# With unordered versions:

### Lazy:
Spent 2042 milliseconds on Algorithm
Spent 371 milliseconds on Split components
Spent 261 milliseconds on Estimate and smooth normals
Spent 653 milliseconds on Correct normal orientation
Spent 3330 milliseconds on Whole process

### Eager:
Spent 2104 milliseconds on Algorithm
Spent 393 milliseconds on Split components
Spent 291 milliseconds on Estimate and smooth normals
Spent 623 milliseconds on Correct normal orientation
Spent 3412 milliseconds on Whole process

The problem of course is that the graph algorithms in GEL have a very large API surface.

Some parts assuming sorted data structures:

  • AMGraph::shared_neighbors
  • Geometry::test_intersection
  • Geometry::NodeSet (typedef)
  • There are probably others but it's a bit hard to tell at a glance if a piece of code is assuming something is sorted.

It is of course possible that there are no other places since neighbors sorted by ID are not very useful unless you are using the C++ set operations. In which case, this is probably doable in a few hours.

There is also an issue that there are two definitions of Geometry::connected_components that are disjoint solely because of the difference in types. Making AMGraph::NodeSet unordered means we have a duplicate definition so one of them has to be removed. Although they do appear to do exactly the same thing so this shouldn't be a problem.


I would like to also add a way to lazily acquire the neighbors of a graph node since this is called quite a few times across RsR (in 9 places by my count) and doing so would reduce allocations. This one is fairly simple to add:

        [[nodiscard]] std::ranges::borrowed_range
        auto neighbors_lazy(const NodeID n) const
        {
            return edge_map[n] | std::views::keys;
        }

Util::Range should also be removed and replaced with iota_view.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions