Socket mode: Unhandled event 'server explicit disconnect' in state 'connecting' #2225
Comments
Hi @okovpashko, we're sorry for the disruption during the upgrade and thank you for taking the time to report this.

It seems there is an issue on the underlying

We're curious about the versions of the underlying

One thing I am wondering is if this issue might be a potential issue even with bolt-js 3.19 because there is no specific code on the bolt-js side to reconnect for Socket Mode disconnections. I'm not sure about the direct trigger that caused the new error pattern for your app yet, but upgrading

Aside from that, our team has a long-term plan to rewrite the Socket Mode client for Node.js users to eliminate these kinds of errors, and the underlying module 2.x is already published, although bolt-js does not support it yet. I don't have detailed information about the future release plans at this moment, but @filmaj will be able to share more details on this.
@seratch thank you for the rapid response. Here's my
I took a look at the Git history in our project and noticed that the

Anyway, your reply is very helpful, as I now understand that it's not an issue on our side and that you are aware of it and working on a fix.
Hello, this is unfortunately a design flaw with socket-mode 1.x. It is a duplicate of slackapi/node-slack-sdk#1787. You can find the details for this problem in that issue thread. I don't think this is a problem that was introduced between socket-mode 1.3.3 and 1.3.6, but you can try locking to the earlier version to see if the problem goes away on your end.

Several customers started reporting issues related to this functionality in May of this year, which prompted a long investigation. Ultimately, socket-mode v2 was released that completely changes the design of the underlying

If you have debug logs leading up to the problem, please post them in slackapi/node-slack-sdk#1787. I am collecting such behaviours and trying to add them to the integration tests in socket-mode, to write more realistic test scenarios and better prevent regressions. Understanding the state of the state machine in socket-mode and the sequence of events sent by Slack's backend would be helpful.
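To illustrate the class of failure named in the issue title, here is a minimal sketch of a table-driven state machine that throws when an event arrives in a state with no matching transition. This is not socket-mode's actual implementation: the `transitions` table, the state and event names other than those in the title, and `createMachine` are all hypothetical and exist only for illustration.

```javascript
// Hypothetical transition table; names are illustrative and loosely follow
// the error message in this issue's title. Not taken from socket-mode itself.
const transitions = {
  connecting: { 'websocket open': 'connected', failure: 'reconnecting' },
  connected: {
    'server explicit disconnect': 'reconnecting',
    'websocket close': 'reconnecting',
  },
  reconnecting: { retry: 'connecting' },
};

// Builds a tiny state machine over a transition table.
function createMachine(table, initialState) {
  let state = initialState;
  return {
    get state() {
      return state;
    },
    // Applies an event; throws if the current state has no transition for it.
    handle(event) {
      const next = table[state]?.[event];
      if (next === undefined) {
        throw new Error(`Unhandled event '${event}' in state '${state}'`);
      }
      state = next;
      return state;
    },
  };
}
```

In this sketch, a `'server explicit disconnect'` event is handled while `connected` but throws while `connecting`, because the table has no entry for that pair. Debug logs showing the state and the incoming event sequence are what make it possible to spot such a missing pair.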
@filmaj thank you for the explanation. I posted the debug logs to the issue you mentioned.
I've also just noticed the disconnection reason
Ah yes, definitely leftover / hanging network connections not cleanly closed would contribute to the backend sending

Some context: each Slack app ID is allowed a maximum of 10 concurrent websocket (socket-mode) connections. Some customers sometimes create too many app instances (scale their app beyond 10 instances) which can lead to this problem. Possibly, as you mention, not explicitly stopping the server or cleanly closing the network connection could, at least temporarily, cause the error as well (until the network connection becomes severed or the Slack backend realizes the connection is no longer established).

When Slack detects it has more than 10 concurrent websocket connections open with a single app, it will send a

Ultimately, I suggest looking into the reason that Slack is sending

Thank you for sharing logs in the other issue, I will inspect them and see if that can help me re-create a test scenario; once a test scenario is written, then it is much easier to experiment with different solutions. ❤️
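One way to reduce the chance of leftover connections is to stop the app explicitly when the process receives a termination signal. The following is a minimal sketch, assuming `app` is a Bolt `App` (or any object with an async `stop()` method) and that stopping it closes the Socket Mode connection. `installGracefulShutdown` is a hypothetical helper, not part of `@slack/bolt`, and the `proc` parameter exists only so the helper can be exercised without sending real signals.

```javascript
// Sketch only: `installGracefulShutdown` is a hypothetical helper, not a Bolt API.
// `app`  : any object exposing an async stop() method (assumed: a Bolt App).
// `proc` : defaults to the Node.js global `process`; injectable for testing.
function installGracefulShutdown(app, signals = ['SIGINT', 'SIGTERM'], proc = process) {
  let stopping = null;

  const shutdown = (signal) => {
    // Reuse the first stop() promise so repeated signals do not stop the app twice.
    if (!stopping) {
      stopping = Promise.resolve()
        .then(() => app.stop())
        .then(
          () => ({ signal, ok: true }),
          (error) => ({ signal, ok: false, error }),
        );
    }
    return stopping;
  };

  for (const signal of signals) {
    // `once` means a second identical signal falls back to Node's default behaviour.
    proc.once(signal, () => {
      shutdown(signal).then((result) => {
        // Exit after the app has stopped: 0 on success, 1 if stop() rejected.
        proc.exit(result.ok ? 0 : 1);
      });
    });
  }

  return shutdown;
}
```

A typical call would be `installGracefulShutdown(app)` right after `await app.start()`. This only helps with connections left behind by unclean shutdowns; it does not explain or limit connections opened by other running instances of the same app.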
FYI a release candidate with this fix is available in
The fix here is available in the new major bolt v4 version, FYI.
@filmaj thank you for the update. We will migrate to the new version, but I'm afraid that it will fix only the unhandled errors. We still have no clue of the reason for the

The latest occurrence was last night and it looked like a regular reconnection request from the Slack API:

We have only one instance of the application with only one active connection. The time frame of the logs above is

As I understand it, the Slack API drops connections every few hours, causing all clients to reconnect. Is it possible that there were 9 other active connections since Saturday? Could you please suggest where to dig to debug this issue?

UPD: I can see that we indeed have 10 active connections

Could you please suggest if there is a way to drop all unused connections?
@slack/bolt version
3.21.1
Your App and Receiver Configuration
e.g. const myApp = new App({ ... what options are you using? })
Node.js runtime version
v20.17.0
Steps to reproduce:
N/A
Expected result:
No unexpected disconnection errors
Actual result:
Regular disconnection errors:
Here's the stack trace from Sentry
and the latest error history
Requirements
After upgrading Bolt.js from 3.19.0 to 3.21.1, we started receiving disconnection errors multiple times daily in various environments (prod and testing) of projects that use Socket Mode.
At least one of those projects didn't have any other logic and/or infrastructure changes.
It looks like an error was introduced in one of the latest updates. Could you please confirm the issue or suggest where to dig if it's on our side?