Free listening port earlier in shutdown process#2625
Conversation
tylerkaraszewski
left a comment
There was a problem hiding this comment.
I think the diagnosis of this issue is wrong, or at least incomplete.
| // listening socket), and the database teardown that follows can take a long time when the memory-mapped | ||
| // database is large. Until the FD is closed, the kernel keeps the listening port bound while it unmaps that | ||
| // memory, which prevents a restarted node from rebinding the port. | ||
| _port = nullptr; |
There was a problem hiding this comment.
This is broken, we shouldn't free this until cluater communication is done.
There was a problem hiding this comment.
Cluster communication should be being done over established connections though? This is only closing the listening port for new connections?
There was a problem hiding this comment.
Existing connections are not guaranteed to stay open.
| // it unmaps that handle's memory-mapped pages, which can take a long time when the database is large. _port is a | ||
| // member and would otherwise only be destroyed after this destructor body returns, leaving the kernel holding the | ||
| // listening port bound for the duration of that unmap. | ||
| _port = nullptr; |
There was a problem hiding this comment.
This doesn't do anything. Well, it does what it says, which is to move the port freeing to before deleting pluginDB, but deleting pluginDB is not the slow part of this that you're worried about.
The show part of closing the DB is closing the last existing filehandle to it. This is owned by BedrockServer::_dbPool.
This currently gets deleted at the end of BedrockServer::sync(), but it's right after we delete the syncNode (by calling atomic_store for it and passing nullptr), which should release the peer port.
If the peer port is not released before the final DB deletion starts, it is probably because some other thread is holding a reference to _syncNode, but even that shouldn't be an issue, as that thread can release it while the deletion happens.
There was a problem hiding this comment.
Yeah I'll be honest, I'm out of my depth. What I saw yesterday when bouncing the cluster was that port 4445 was the one that was held during unmmap() - but only on the hosts that had their command port closed when shutdown was initiated. I'm not sure if that really helps knowing though.
There was a problem hiding this comment.
Do you have the time and date and host for when this happened so I can look at the logs?
|
Does this seem likely to be the log line for when db2.lax was shut down and this issue started on that node? |
|
I can't be positive since it was in the middle of a long saltfab run, but it seems highly likely. |
Close the SQLiteNode peer listening port (
-nodeHost) early during shutdown so the kernel releases it before the database memory is unmapped, instead of holding it bound for the duration of theunmap().When a node shuts down with a large memory-mapped database, the kernel keeps the peer listening port bound while it unmaps that memory, which can take a long time and prevents a new process starting.
The port was held because
~SQLiteNodedeletespluginDB(a separate handle whosesqlite3_close()unmaps its pages) in the destructor body, while_portis a member only destroyed after the body returns. This closes_portbeforedelete pluginDBin the destructor, and also frees it inbeginShutdown()so it's released early and deterministically on the graceful path, withupdate()guarded so it won't reopen the port while shutting down.Fixed Issues
Fixes https://expensify.slack.com/archives/C094TGUTZ/p1780360333816159?thread_ts=1780356276.321669&cid=C094TGUTZ
Tests