Cockpit indicates unresolved incidents of a process instance or a sub process instance as failed jobs. To find out which instance of a process failed, drill down to the unresolved incident by using the process status dots. Click a red status dot of the affected instance in the Process Definition View to get an overview of all incidents. The Incidents tab in the Detailed Information Panel lists the failed activities with additional information. You can also drill down to the failing instance of a sub process.
Retry a Failed Job
On the process instance view, you can use the button on the right side to resolve a failed job.
A modal dialog opens where you can:
- Choose whether the previous due date should be kept or set to an absolute date/time of your choice.
- Select the failed jobs to be retried.
After you click Retry, the engine re-triggers the selected jobs and increments their retry values in the database so that the Job Executor can acquire and execute them again.
Alternatively, you can change the retries of jobs asynchronously via the Batch Operation “Set retries of Jobs belonging to process instances”.
Please note that this feature is only included in the enterprise edition of the Camunda Platform; it is not available in the community edition.
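To illustrate what the retry dialog and the batch operation described above amount to, the following minimal sketch builds the corresponding requests against the engine's REST API without sending them. It rests on several assumptions: the base URL `http://localhost:8080/engine-rest`, the ids `aJobId` and `anInstanceId`, and the helper names are hypothetical, and the `dueDate` field should be checked against the REST API reference of your engine version before use.

```python
import json
import urllib.request

# Assumption: base URL of the engine's REST API; adjust to your deployment.
BASE_URL = "http://localhost:8080/engine-rest"


def build_job_retry_request(job_id, retries=1, due_date=None):
    """Build (but do not send) a request that sets the retries of one job."""
    payload = {"retries": retries}
    if due_date is not None:
        # Assumption: mirrors the dialog option of setting an absolute due date.
        payload["dueDate"] = due_date
    return urllib.request.Request(
        url=f"{BASE_URL}/job/{job_id}/retries",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def build_batch_retry_request(process_instance_ids, retries=1):
    """Build (but do not send) a request that creates a batch which sets
    the retries of all jobs belonging to the given process instances."""
    payload = {
        "retries": retries,
        "processInstances": list(process_instance_ids),
    }
    return urllib.request.Request(
        url=f"{BASE_URL}/process-instance/job-retries",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical ids, for illustration only.
    single = build_job_retry_request("aJobId")
    batch = build_batch_retry_request(["anInstanceId"])
    for request in (single, batch):
        print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Passing one of the returned objects to `urllib.request.urlopen` would actually send it. The sketch stops short of that, so you can inspect the method, URL, and body first.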
You can also perform a synchronous bulk retry of failed jobs. This feature is available as a button in the Job Definitions tab of the process definition view. Clicking it increments the number of retries for all the defined jobs of the process definition.
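As a rough equivalent of the bulk retry for one job definition, the sketch below builds a REST request that sets the retries of all jobs of that definition. The base URL, the id `aJobDefinitionId`, and the helper name are assumptions made for illustration, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumption: base URL of the engine's REST API; adjust to your deployment.
BASE_URL = "http://localhost:8080/engine-rest"


def build_job_definition_retry_request(job_definition_id, retries=1):
    """Build (but do not send) a request that sets the retries of all
    jobs belonging to one job definition."""
    payload = {"retries": retries}
    return urllib.request.Request(
        url=f"{BASE_URL}/job-definition/{job_definition_id}/retries",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Hypothetical job definition id, for illustration only.
    request = build_job_definition_retry_request("aJobDefinitionId")
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

A process definition usually has several job definitions, so covering the whole definition as the button does would mean building one such request per job definition.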