fix(mnesia): load ram_copies table from 'better' copy when start #104

Merged
qzhuyan merged 2 commits into emqx:emqx-OTP-24.3.4 from qzhuyan:backport/william/otp24/mnesia-ram-copies-safeload-during-start
May 12, 2026

Conversation

@qzhuyan qzhuyan commented May 11, 2026

Without this fix, a ram_copies table is 'safe loaded' locally during mnesia start when the remote nodes have not connected yet. This makes the table accessible to the application because 'where_to_read' is set to the local node(), so reads are served from the local copy:

mnesia:dirty_read(emqx_route_filters,1).
[]

Also, when a remote node connects afterwards, the ram_copies table does not load the 'better' copy, which leaves the data inconsistent within the cluster.

This commit fixes that: during mnesia start, a ram_copies table does NOT do a local safe load when there is a better copy on a remote node. 'where_to_read' stays at 'nowhere' and table access is not served:

mnesia:dirty_read(emqx_route_filters,1).
** exception exit: {aborted,{no_exists,[emqx_route_filters,1]}}

After the remote node is connected, the local node does a net_load_table from the remote.
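
With this change, a read issued before the remote copy has been loaded aborts with no_exists instead of returning stale or empty data. A minimal sketch of how an application might wait for the table instead of handling the abort, assuming it prefers to block with a timeout. The module and function names are hypothetical and not part of this PR; `mnesia:wait_for_tables/2` is the standard mnesia call.

```erlang
-module(route_table_wait).
-export([ensure_readable/2]).

%% Hypothetical helper, not part of this PR.
%% Blocks until Tab is loaded on this node or TimeoutMs elapses.
-spec ensure_readable(atom(), timeout()) -> ok | {error, term()}.
ensure_readable(Tab, TimeoutMs) ->
    case mnesia:wait_for_tables([Tab], TimeoutMs) of
        ok ->
            %% The table is loaded here, so reads can be served.
            ok;
        {timeout, NotLoaded} ->
            %% Still no usable copy, e.g. the remote holder is not connected yet.
            {error, {not_loaded, NotLoaded}};
        {error, Reason} ->
            %% For example, mnesia is not running on this node.
            {error, Reason}
    end.
```

A caller would run something like `route_table_wait:ensure_readable(emqx_route_filters, 5000)` before the first `mnesia:dirty_read/2`, and decide how to react to `{error, _}`.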

Reuse the adopt_orphans function to resolve the conflict when nodes deadlock on deciding which copy is better; this is the same behaviour as for disc_copies tables.

note1: BetterCopies0 = mnesia_lib:remote_copy_holders(Cs) -- Downs

note2: disc_copies tables do not have this issue.

note3: if there is no better copy (the other nodes went down before the current one), it is correct to load from the local copy.
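
The rule in note1 and note3 can be sketched as a small pure function. This is an illustrative model only, not the mnesia_controller code changed by this PR: the module name, function name and return values are assumptions, and `Downs` is assumed to be the list of nodes that went down before the local node.

```erlang
-module(ram_copy_load_sketch).
-export([decide/2]).

%% Illustrative model only; names and return values are hypothetical.
%% RemoteHolders: remote nodes holding a copy of the table
%%                (cf. mnesia_lib:remote_copy_holders(Cs) in note1).
%% Downs:         nodes assumed to have gone down before the local node.
-spec decide([node()], [node()]) -> load_local | {wait_for, [node()]}.
decide(RemoteHolders, Downs) ->
    case RemoteHolders -- Downs of
        [] ->
            %% No better copy exists (note3): loading the local copy is correct.
            load_local;
        BetterCopies ->
            %% A remote node may hold a better copy: skip the local safe load
            %% and wait to load the table from one of these nodes.
            {wait_for, BetterCopies}
    end.
```

For example, `decide(['n2@host'], ['n2@host'])` yields `load_local`, while `decide(['n2@host', 'n3@host'], ['n3@host'])` yields `{wait_for, ['n2@host']}`.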

qzhuyan added 2 commits May 11, 2026 15:01
@qzhuyan qzhuyan marked this pull request as ready for review May 12, 2026 04:47
@qzhuyan qzhuyan merged commit 319f3f6 into emqx:emqx-OTP-24.3.4 May 12, 2026
2 checks passed