Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
Description
Occasionally Thor fails to (re)start on a cluster.
The Thor master waits 15 minutes for all slaves to connect before giving up and exiting.
This has been traced to cases where at least one host in the cluster fails to start its Thor slaves: the rsync command run by init_thorslave at startup hangs or fails, so the slaves.tmp file is never created.
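
One way to guard against this failure mode is to bound the copy with a time limit and check that the expected file exists afterwards. The sketch below is illustrative only and is not the fix applied for this issue; the helper name fetch_with_timeout, the time limit, and the example paths are assumptions, and it relies on the GNU coreutils timeout command being available.

```shell
#!/bin/bash
# Hypothetical helper (not part of init_thorslave): run a copy command
# under a time limit, then verify that the target file was produced.
# Usage: fetch_with_timeout <seconds> <target-file> <command> [args...]
fetch_with_timeout() {
    local limit="$1" target="$2"
    shift 2

    # timeout(1) kills the command if it runs longer than $limit seconds
    # and returns non-zero, so a hang is reported the same way as a failure.
    if ! timeout "$limit" "$@"; then
        echo "copy command failed or timed out: $*" >&2
        return 1
    fi

    # A zero exit status alone does not prove the file arrived intact,
    # so require that it exists and is non-empty.
    if [ ! -s "$target" ]; then
        echo "expected file missing or empty: $target" >&2
        return 1
    fi
    return 0
}
```

A caller could then fail fast on the affected host instead of leaving the master to wait out its full connection timeout, for example `fetch_with_timeout 60 slaves.tmp rsync -e ssh master:/some/path/slaves slaves.tmp || exit 1`, where the host name and paths are placeholders.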