Clients that repeatedly try and fail to connect to a Deadline database with encryption enabled can cause the database logs to grow quickly. This article explains what to do when a computer with the Deadline client installed (any or all of the Monitor, Worker, or Launcher) gets into this state. Normally, each client application uses 10 connections when it connects to the MongoDB database. The problem is that a client that fails to connect retries as fast as it can, and each attempt consumes another set of 10 connections.
Confirm this is happening by opening your MongoDB log:
Windows: C:\DeadlineDatabase[VERSION]\mongo\data\logs
Mac OS X: /Applications/Thinkbox/DeadlineDatabase[VERSION]/mongo/data/logs
Linux: /opt/Thinkbox/DeadlineDatabase[VERSION]/mongo/data/logs
Where [VERSION] is the MAJOR version number of Deadline.
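As a convenience, the three locations above can be built from the major version number. This is only a sketch: the function name mongo_log_dir is made up for illustration, and it assumes the database was installed to the default locations listed above.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath


def mongo_log_dir(major_version, system=None):
    """Return the default MongoDB log directory listed in this article.

    major_version: the MAJOR Deadline version, e.g. 10.
    system: "Windows", "Darwin" (Mac OS X) or "Linux"; defaults to the
            value reported by platform.system() on the current machine.
    """
    system = system or platform.system()
    folder = f"DeadlineDatabase{major_version}"
    if system == "Windows":
        return PureWindowsPath("C:\\") / folder / "mongo" / "data" / "logs"
    if system == "Darwin":
        return PurePosixPath("/Applications/Thinkbox") / folder / "mongo/data/logs"
    if system == "Linux":
        return PurePosixPath("/opt/Thinkbox") / folder / "mongo/data/logs"
    raise ValueError(f"No default log location known for {system!r}")
```

For example, mongo_log_dir(10, "Linux") gives /opt/Thinkbox/DeadlineDatabase10/mongo/data/logs. A database installed to a custom location will not match these paths.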
Messages like the following indicate the problem this guide covers:
E NETWORK [conn17] no SSL certificate provided by peer; connection rejected
I NETWORK [conn55795] SocketException handling request, closing client connection: 9001 socket exception [SEND_ERROR] server [10.1.2.3:54321]
{"t":{"$date":"2023-06-28T06:28:32.987+02:00"},"s":"I", "c":"NETWORK", "id":12345, "ctx":"conn544","msg":"Error receiving request from client. Ending connection from remote","attr":{"error":{"code":141,"codeName":"SSLHandshakeFailed","errmsg":"no SSL certificate provided by peer; connection rejected"},"remote":"10.1.2.3:57046","connectionId":544}}
If the logs show a different error, please reach out to the support team using the details here.
This failure mode typically occurs when the database certificate becomes unreadable for any reason. If the database certificate is hosted on a network share, make sure Deadline10Client.pfx is accessible to and readable by the client computers.
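One quick way to test readability is to try opening the certificate from the affected client. The sketch below is a minimal check, not part of Deadline: the function name certificate_is_readable is hypothetical, and you would pass in whatever path your clients are configured to use. Run it as the same user that runs the Worker, since permissions can differ between accounts.

```python
def certificate_is_readable(path):
    """Return True if the file at `path` can be opened and is not empty.

    This only checks file access. It does not validate the certificate
    contents or its password.
    """
    try:
        with open(path, "rb") as handle:
            # Reading one byte proves the file is reachable and readable.
            return len(handle.read(1)) == 1
    except OSError:
        # Covers a missing file, an unreachable share and denied permission.
        return False
```

A result of False for the path to Deadline10Client.pfx suggests the share is unreachable or the user has lost permission to read the file.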
If a client stays in this state for more than 2 minutes, that single computer will consume all available TCP connections on the database computer, because it reattempts the connection faster than the previous sets of connections expire. In that case, you need to find the computer or computers running the Deadline client that are failing to connect to the database and resolve their connection issue. Once no computers are failing to connect, the issue will stop.
You can find the problem computer either by pulling the IP address from the database logs or by checking the connections open on the database computer. In the example error messages above, the IP address is 10.1.2.3. Note the address shown in your own messages, connect to that computer, and resolve whatever is keeping the client from connecting. Check the application logs if it is unclear why the client is failing to connect. Typical causes include a missing database certificate, or the user running the Worker losing permission to read that certificate.
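If the log is large, pulling the addresses out by hand is tedious. The sketch below counts client IP addresses on failure lines, assuming the log lines look like the examples earlier in this article. The names FAILURE_MARKERS, remote_ip and count_remote_ips are made up for illustration. Older plain-text "no SSL certificate" lines carry no address, so only the SocketException lines and the JSON lines contribute to the count.

```python
import json
import re
from collections import Counter

# Text fragments taken from the example messages in this article.
FAILURE_MARKERS = ("no SSL certificate provided by peer", "SocketException")

# Plain-text lines carry the peer as "[10.1.2.3:54321]".
LEGACY_PEER = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3}):\d+\]")


def remote_ip(line):
    """Return the client IP found on one failure line, or None."""
    line = line.strip()
    if not any(marker in line for marker in FAILURE_MARKERS):
        return None
    if line.startswith("{"):
        # JSON-formatted line: the peer is stored in attr.remote as "ip:port".
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            return None
        attr = entry.get("attr")
        remote = attr.get("remote") if isinstance(attr, dict) else None
        return remote.rsplit(":", 1)[0] if isinstance(remote, str) else None
    match = LEGACY_PEER.search(line)
    return match.group(1) if match else None


def count_remote_ips(lines):
    """Count failure lines per client IP across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        ip = remote_ip(line)
        if ip:
            counts[ip] += 1
    return counts
```

Passing an open log file to count_remote_ips and calling most_common(5) on the result lists the five addresses with the most failures, which are the first computers to check.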
To check the connections on a Windows-hosted database, open a command prompt with Administrator permissions (right-click it and choose “Run as Administrator”) and run:
netstat -p tcp -b > connections.txt
Open the resulting connections.txt and look for an IP address in the ‘Foreign Address’ column that appears more than 100 times with [mongod.exe] on the following line.
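Counting entries by eye in a long connections.txt is error-prone, so the tally can be scripted. This sketch assumes the usual netstat -b layout, where each TCP row is followed by a line naming the owning executable in square brackets. The function name mongod_foreign_hosts is hypothetical. Because the command above does not use -n, the ‘Foreign Address’ column may show host names instead of IP addresses, so the code keeps whatever appears before the port.

```python
from collections import Counter


def mongod_foreign_hosts(lines, process="[mongod.exe]"):
    """Count foreign addresses of TCP rows owned by `process`.

    `lines` is an iterable of text lines from connections.txt.
    """
    counts = Counter()
    pending = None  # foreign host of the most recent TCP row
    for raw in lines:
        fields = raw.split()
        if len(fields) >= 4 and fields[0] == "TCP":
            # Columns: Proto, Local Address, Foreign Address, State
            pending = fields[2].rsplit(":", 1)[0]
        elif fields and fields[0].startswith("["):
            # Owner line such as "[mongod.exe]" closes the pending row.
            if pending is not None and fields[0].lower() == process:
                counts[pending] += 1
            pending = None
    return counts
```

With the result in counts, the expression [host for host, n in counts.items() if n > 100] gives the computers that exceed the threshold of 100 mentioned above. If the redirected file was written in a different encoding (for example by PowerShell), pass the matching encoding to open() when reading it.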
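To check the connections on a Linux-hosted database, run the following as root so that the owning process is shown for every connection:
netstat -tnp | grep mongo
This lists TCP connections (-t) with numeric addresses (-n) and the owning process (-p). Look for a foreign address that appears many times.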
Once the client’s connection issue has been corrected, you’ll need to either restart the database computer or wait for the connection threads to die and be cleaned up by the operating system. Again, check the application logs on the client with the most connections if it is unclear why the client is failing to connect.