Version: Deadline 9.0 and later
Task timeouts are another tool that Deadline gives you to manage your render jobs. For example, if the application being used for a render job is known to hang or stall, a task timeout can be applied to the job to ensure it doesn't render indefinitely. There are other cases where task timeouts can be useful, and we'll cover those below. In addition, Deadline lets you control the timeout behavior, giving you the ability to set up alerts and to control whether a timeout is treated as an actual error or not.
SETTING TASK TIMEOUTS
So how do you set task timeouts? One method is to set them from the Deadline Monitor after the job has been submitted. Right-click on a job in the Jobs Panel and select "Modify Job Properties". Then select the "Timeouts" page.
There are a few options for timeouts:
- Minimum Task Render Time: This lets you set a minimum time that tasks for your job must run for. For example, if you set this to 5 minutes, and a task takes less than 5 minutes to render, a timeout will occur. This can be useful if you are using a rendering application that sometimes exits quickly without reporting an error, and doesn't save out any frames as a result.
- Maximum Task Render Time: This is used for typical task timeouts, and limits the amount of time a task can render for. For example, if this is set to 5 minutes, and a task takes longer, a timeout will occur. As mentioned in the Introduction above, this can be useful if you are using a rendering application that is known to hang or stall.
- Maximum Start Job Time: You may notice that when a task is first picked up by Deadline, it is in a Waiting To Start state. This timeout controls how long a task can be in this state before a timeout occurs. Some application plugins start up the rendering application and load the scene file during this time, so this timeout can be useful if the rendering application sometimes becomes unresponsive while starting up or loading the scene.
- Maximum Load Plugin Time: This lets you set a timeout for how long Deadline takes to load the application plugin. This shouldn't be required for any of the application plugins that ship with Deadline, but can be useful if you are using a customized plugin that occasionally hangs while the plugin is loading.
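To make the difference between the minimum and maximum render time concrete, here is a minimal sketch of the two checks. It is not Deadline code: the function name, its parameters, and the use of None to mean "not set" are all assumptions made for illustration.

```python
from typing import Optional


def tripped_timeout(elapsed_minutes: float,
                    finished: bool,
                    min_minutes: Optional[float] = None,
                    max_minutes: Optional[float] = None) -> Optional[str]:
    """Return which render-time limit a task has tripped, if any.

    Hypothetical helper: None means the corresponding limit is not set.
    """
    # The maximum applies while the task is still rendering, and afterwards.
    if max_minutes is not None and elapsed_minutes > max_minutes:
        return "maximum"
    # The minimum only makes sense once the task has stopped rendering.
    if finished and min_minutes is not None and elapsed_minutes < min_minutes:
        return "minimum"
    return None


# A task that exits after 3 minutes when at least 5 are expected.
print(tripped_timeout(3, finished=True, min_minutes=5))   # minimum
# A task still rendering after 6 minutes with a 5 minute limit.
print(tripped_timeout(6, finished=False, max_minutes=5))  # maximum
```

The minimum is only checked for finished tasks in this sketch, since a task that is still rendering can't yet be called "too quick".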
RESPONDING TO TASK TIMEOUTS
The "On Task Timeout" trigger tells Deadline what to do if any of these timeouts are reached. By default, it reports an error for the task and requeues it so that another render node can pick it up. This can be changed based on your preferences.
The available responses are:
- Notify: This will not affect the task that is rendering. Instead, a notification is simply sent to the job's user when the timeout is reached. Note that a notification can also be sent alongside the other responses below.
- Requeue: This option will requeue the task, but it will not report an error. This can be useful if you know the occasional task timeout will occur, but you don't want those timeouts to count against the job's error count.
- Error: As mentioned above, this is the default behavior. The task will be requeued and an error will be reported.
- Fail: This option will mark the task as failed instead of requeuing it, and an error will also be reported. This can be useful if you don't want a task to be retried after a timeout occurs.
- Complete: This option will mark the task as complete, and no error will be reported. This is useful if you want to treat a timeout as a success. For example, let's say you have a single-task simulation job that will actively write out data indefinitely, and you have another job dependent on it. If you want to calculate 2 hours of data, you could set the task timeout to 2 hours, and then have the timeout mark the task complete. Once the task is marked complete, the dependent job can then use the data that the simulation job saved out during its allotted time.
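The five responses differ in two ways: what happens to the task, and whether an error is reported. The sketch below just restates the list above as a lookup table. The type and variable names are made up for illustration and are not part of Deadline.

```python
from typing import List, NamedTuple


class TimeoutResponse(NamedTuple):
    task_outcome: str    # what happens to the task that timed out
    reports_error: bool  # whether the timeout is reported as an error


# Summary of the "On Task Timeout" responses described above.
RESPONSES = {
    "Notify":   TimeoutResponse("keeps rendering", False),
    "Requeue":  TimeoutResponse("requeued", False),
    "Error":    TimeoutResponse("requeued", True),
    "Fail":     TimeoutResponse("failed", True),
    "Complete": TimeoutResponse("completed", False),
}


def responses_reporting_errors() -> List[str]:
    """Names of the responses that report an error for the task."""
    return sorted(name for name, r in RESPONSES.items() if r.reports_error)


print(responses_reporting_errors())  # ['Error', 'Fail']
```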
How do you get notifications from Deadline? Check out this handy blog post on Deadline Notifications.
OTHER AVAILABLE OPTIONS
The remaining timeout options Deadline offers are:
- Enable timeouts for Pre/Post Job Scripts: If the job has a pre or post job script, this timeout value will be applied to that script. You can learn more about Job Scripts in the Deadline documentation.
- Use Frame Timeouts: Applies the timeout to the individual frames in the task instead of the task itself. This is useful if the tasks consist of multiple frames, and some tasks have more frames than others.
- Enable Auto Task Timeout: If enabled, the timeout value will be calculated based on other tasks that have completed for the job. This is covered in more detail in the next section.
AUTO TASK TIMEOUTS
The timeout options discussed above all require you to estimate how long each task will take to render. However, it's possible to enable automatic task timeouts, which will dynamically calculate timeout values for a job based on the tasks that have already completed for it. To enable this feature from the Deadline Monitor, go to "Tools" -> "Configure Repository Options" and look at the "Job Settings" page. There is a section called "Automatic Job Timeout".
The available options are:
- Minimum number of completed tasks required before calculating a timeout: If this option is enabled, Deadline will only apply automatic timeouts to a job after the specified number of tasks have completed.
- Minimum percentage of completed tasks required before calculating a timeout: Similar to the above option, but instead of using a fixed number, Deadline will only apply automatic timeouts to a job after the specified percentage of tasks have completed.
These two options can be used separately or together. They give Deadline a baseline on how long each task should be expected to take, and can be used to ensure Deadline has a big enough sample size. Once the minimum sample size has been reached, Deadline will calculate the average render time of all completed tasks for the job, and multiply that value by the "Timeout Multiplier" setting to get the timeout value.
For example, let's use a sample size of 3 tasks, and a "Timeout Multiplier" of 2. Now let's say a job's first 3 tasks take an average of 10 minutes. This will be multiplied by 2 for an auto timeout of 20 minutes. If any remaining task for the job takes longer than 20 minutes, a timeout will occur (provided the average of all completed tasks is still 10 minutes).
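The same arithmetic can be written as a short sketch. This is only an illustration of the calculation described above, not Deadline's implementation. The function and parameter names are hypothetical, and it assumes that when both minimums are enabled, both must be met before a timeout is calculated.

```python
from typing import Optional, Sequence


def auto_timeout_minutes(completed_minutes: Sequence[float],
                         total_tasks: int,
                         multiplier: float,
                         min_completed: Optional[int] = None,
                         min_percent: Optional[float] = None) -> Optional[float]:
    """Average render time of completed tasks times the multiplier.

    Returns None while the sample is still too small. None for either
    minimum means that option is disabled (an assumption of this sketch).
    """
    done = len(completed_minutes)
    if done == 0 or total_tasks <= 0:
        return None
    if min_completed is not None and done < min_completed:
        return None
    if min_percent is not None and 100.0 * done / total_tasks < min_percent:
        return None
    average = sum(completed_minutes) / done
    return multiplier * average


# Three completed tasks averaging 10 minutes, with a multiplier of 2.
print(auto_timeout_minutes([8, 10, 12], total_tasks=100,
                           multiplier=2, min_completed=3))  # 20.0
```

Because the average is recalculated from all completed tasks, the timeout value drifts as more tasks finish, which is why the example above adds the proviso about the average staying at 10 minutes.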
Finally, the "Enforce an automatic job timeout for all jobs" option can be used to apply automatic timeouts to all jobs in the queue. If this option is disabled, then automatic timeouts will only be applied to jobs that have Auto Task Timeout enabled. Note that you would only want to apply automatic timeouts to all jobs if you know that the task render times for each job will be relatively uniform. If you have jobs with varying task times, this could result in false positives.
For example, let's say you have a job with 100 tasks, and each task represents one frame. During the animation sequence, a large object moves past the camera, resulting in much longer render times for the middle frames, and much shorter render times for the beginning and end frames. If automatic timeouts were enabled for this job, timeouts would likely occur during the middle frames, since the sample would be based on the early, quicker frames. One way to avoid this is to render the middle frames first, which can be done by putting them at the start of the job's frame list.
This way, the initial sample is based on the longest tasks for the job. We also have a blog entry on Frame List Formatting if you're interested.
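As a rough sketch of what a "middle first" frame list could look like, the helper below builds one for a given frame range. The 1-100 range and the 40-60 "heavy" section are hypothetical numbers chosen for illustration. It also assumes that Deadline accepts comma-separated ranges and renders them in the order listed, so check the Frame List Formatting documentation for your version.

```python
def _span(start: int, end: int) -> str:
    """Format a single frame or an inclusive range."""
    return str(start) if start == end else f"{start}-{end}"


def middle_first_frame_list(first: int, last: int,
                            heavy_start: int, heavy_end: int) -> str:
    """Build a frame list that puts the heavy middle section first."""
    if not (first <= heavy_start <= heavy_end <= last):
        raise ValueError("heavy section must lie inside the frame range")
    parts = [_span(heavy_start, heavy_end)]
    if heavy_start > first:
        parts.append(_span(first, heavy_start - 1))
    if heavy_end < last:
        parts.append(_span(heavy_end + 1, last))
    return ",".join(parts)


# Hypothetical job: frames 1-100, with the slow frames at 40-60.
print(middle_first_frame_list(1, 100, 40, 60))  # 40-60,1-39,61-100
```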
Before we wrap up, we should mention that all task timeout settings for a job can be set during submission, and can be specified in the Job Info file. This is handy if you are writing a custom submitter for an application, and you want to apply timeouts to all jobs that are submitted. Check out the Manual Job Submission documentation for more information.
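If you are writing a custom submitter, the timeout settings end up as key=value lines in the Job Info file. Here is a minimal sketch of generating such a file. The timeout key names (TaskTimeoutMinutes, MinRenderTimeMinutes, OnTaskTimeout, EnableAutoTimeout) are assumptions that should be verified against the Manual Job Submission documentation for your Deadline version, and the plugin name and values are placeholders.

```python
from typing import Dict


def render_job_info(settings: Dict[str, object]) -> str:
    """Render settings as key=value lines, one per line."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


# Placeholder values; the timeout key names are assumptions to verify
# against the Manual Job Submission documentation.
EXAMPLE_SETTINGS = {
    "Plugin": "MyPlugin",          # hypothetical plugin name
    "Name": "Timeout example",
    "Frames": "1-100",
    "TaskTimeoutMinutes": 30,      # assumed key: maximum task render time
    "MinRenderTimeMinutes": 1,     # assumed key: minimum task render time
    "OnTaskTimeout": "Error",      # assumed key: response to a timeout
    "EnableAutoTimeout": False,    # assumed key: auto task timeout
}

print(render_job_info(EXAMPLE_SETTINGS))
```

The function only builds the file contents, so a submitter can write them wherever it normally writes its Job Info file.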
As you can see, task timeouts can be useful in many situations, and hopefully they can help you ensure that precious time is not wasted.