We have seen incorrect core counts displayed in the CPU Affinity panel of the Deadline Monitor, which can limit the number of threads that the Worker can use. For example, on a machine with two Xeon E5-2696 processors (44 logical cores each), the panel displays only 44 CPUs instead of 88.
Resolution
Temporary Fix
As a workaround, launch N Deadline Workers (where N is the number of processors on the machine). In Task Manager, under the Details tab, right-click each Deadline Worker instance and select Set Affinity. Tie each Deadline Worker instance to one processor, making sure the assignments do not overlap.
For a large number of instances, a custom script using the START command can launch Deadline Workers that are already tied to specific processors.
START ["title"] [/D path] [/I] [/MIN] [/MAX] [/SEPARATE | /SHARED]
[/LOW | /NORMAL | /HIGH | /REALTIME | /ABOVENORMAL | /BELOWNORMAL]
[/NODE <NUMA node>] [/AFFINITY <hex affinity mask>] [/WAIT] [/B]
[command/program] [parameters]
…
NODE Specifies the preferred Non-Uniform Memory Architecture (NUMA)
node as a decimal integer.
AFFINITY Specifies the processor affinity mask as a hexadecimal number.
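The hexadecimal mask passed to /AFFINITY is a bit field in which bit i selects logical CPU i. The following is a minimal Python sketch of that conversion in both directions. The helper names cpus_to_mask and mask_to_cpus are made up for illustration and are not part of Deadline or Windows.

```python
def cpus_to_mask(cpus):
    """Return the hex affinity mask (no 0x prefix) selecting the given logical CPU indices."""
    mask = 0
    for cpu in cpus:
        if cpu < 0:
            raise ValueError("CPU indices must be non-negative")
        mask |= 1 << cpu  # bit i of the mask stands for logical CPU i
    return format(mask, "X")


def mask_to_cpus(hex_mask):
    """Return the sorted list of logical CPU indices selected by a hex affinity mask."""
    mask = int(hex_mask, 16)
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


if __name__ == "__main__":
    print(cpus_to_mask(range(0, 4)))  # CPUs 0-3
    print(cpus_to_mask(range(4, 8)))  # CPUs 4-7
    print(mask_to_cpus("F0"))
```

Keep in mind that a Windows affinity mask covers at most 64 logical processors (one processor group), so a single mask cannot span every CPU on a larger machine.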
Note that this bypasses the Launcher, so the Worker cannot run as a service through Deadline's systems. Windows also does not allow a script to run as a true service, but a script can be run when the machine boots. Any remote commands that involve the Launcher interacting with the Worker may have unexpected results.
For Windows, we have identified a more efficient workaround for this limitation:
Launch additional Workers on the same machine, each with its own CPU affinity configured, so that they show the correct number of CPUs. Start the Workers from a scheduled startup script so that the CPU affinity is configured automatically whenever the machine reboots.
To do that, create a .bat file that uses the START command to launch the Worker application with an affinity mask:
START ["title"] [/D path] [/I] [/MIN] [/MAX] [/SEPARATE | /SHARED]
[/LOW | /NORMAL | /HIGH | /REALTIME | /ABOVENORMAL | /BELOWNORMAL]
[/NODE <NUMA node>] [/AFFINITY <hex affinity mask>] [/WAIT] [/B]
[command/program]
AFFINITY Specifies the processor affinity mask as a hexadecimal number.
The attached "affinity.bat" file can be used to launch Workers with affinity. The example launches 2 Workers on one machine with 8 cores: the CPU affinity of the first Worker is the first 4 CPUs, and that of the second Worker is the next 4. You can set the CPU affinity to match your workflow.
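As a rough illustration of the 8-core, 2-Worker split described above, the Python sketch below computes one non-overlapping mask per Worker and prints START lines that could be pasted into a .bat file. The executable path, the instance names, and the use of the Worker's -name argument are assumptions made for this example. Check them against your own installation before use.

```python
# Assumed default install path; adjust to match your Deadline installation.
WORKER_EXE = r"C:\Program Files\Thinkbox\Deadline10\bin\deadlineworker.exe"


def split_affinity_masks(total_cpus, worker_count):
    """Split total_cpus into equal, non-overlapping blocks and return one hex mask per Worker."""
    if worker_count <= 0 or total_cpus <= 0 or total_cpus % worker_count != 0:
        raise ValueError("total_cpus must be a positive multiple of worker_count")
    per_worker = total_cpus // worker_count
    block = (1 << per_worker) - 1  # per_worker low-order bits set
    return [format(block << (i * per_worker), "X") for i in range(worker_count)]


def start_lines(total_cpus, worker_count, exe=WORKER_EXE):
    """Build one START command line per Worker (instance names are hypothetical)."""
    lines = []
    masks = split_affinity_masks(total_cpus, worker_count)
    for index, mask in enumerate(masks, start=1):
        lines.append(
            f'START "Worker {index}" /AFFINITY 0x{mask} "{exe}" -name "worker-{index}"'
        )
    return lines


if __name__ == "__main__":
    for line in start_lines(8, 2):
        print(line)
```

For 8 CPUs and 2 Workers, the masks are 0xF (CPUs 0-3) and 0xF0 (CPUs 4-7). The equal-split rule is a simplification; uneven splits would need explicit CPU lists instead.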
For an example of creating a startup script that runs at auto logon, see this forum post:
https://forums.thinkboxsoftware.com/t/launch-slave-at-log-off/12766/7
For AWS instances, this issue is fixed in Deadline 10.1. If your AWS Deadline farm or AMI is on a version older than 10.1.x.x, upgrading to the latest Deadline 10.1 release fixes the issue. The upgrade documentation is available here: (https://docs.thinkboxsoftware.com/products/deadline/10.1/1_User%20Manual/manual/upgrading.html)