HPCC-17149

Jobs can fail with "System error: 9999; thgraphmaster.cpp(270)" if child graph initialization is particularly slow.

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 6.2.2, 6.2.6
    • Fix Version/s: 6.2.10
    • Component/s: Thor
    • Labels:
      None
    • Environment:
      Alpha dev

      Description

      There is a 30s timeout on a reply from a child graph requesting initialization.
      If this is exceeded, Thor fails with this unhelpful error.
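
      As an illustration of how such a timeout could fail more helpfully, here is a minimal C++ sketch. It is not Thor's actual code: `InitReplyTracker` and `describeTimeout` are hypothetical names, and the sketch assumes the waiting side knows how many peers it expects a reply from and takes the timeout as a parameter rather than a fixed 30s.

```cpp
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not an HPCC class): tracks which peers have replied
// to an initialization request, so a timeout can say who is still missing.
class InitReplyTracker {
public:
    explicit InitReplyTracker(std::size_t workers)
        : replied_(workers, false), pending_(workers) {}

    // Called when a reply from the given worker arrives.
    void markReplied(std::size_t worker) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (worker < replied_.size() && !replied_[worker]) {
            replied_[worker] = true;
            --pending_;
        }
        cv_.notify_all();
    }

    // Waits until every worker has replied or the timeout expires.
    // Returns the indices of workers that have not replied (empty on success).
    std::vector<std::size_t> waitAll(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait_for(lock, timeout, [this] { return pending_ == 0; });
        std::vector<std::size_t> missing;
        for (std::size_t i = 0; i < replied_.size(); ++i)
            if (!replied_[i])
                missing.push_back(i);
        return missing;
    }

private:
    std::vector<bool> replied_;
    std::size_t pending_;
    std::mutex mutex_;
    std::condition_variable cv_;
};

// Builds an error message that names the workers that did not reply.
std::string describeTimeout(const std::vector<std::size_t> &missing,
                            std::size_t total,
                            std::chrono::milliseconds timeout) {
    std::ostringstream msg;
    msg << "Child graph initialization timed out after " << timeout.count()
        << "ms: " << missing.size() << " of " << total
        << " workers did not reply (";
    for (std::size_t i = 0; i < missing.size(); ++i) {
        if (i)
            msg << ", ";
        msg << missing[i];
    }
    msg << ")";
    return msg.str();
}
```

      A caller would invoke `waitAll` with the configured timeout and, if the returned list is non-empty, raise an error built by `describeTimeout` instead of a bare internal error.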

      Normally 30s is more than sufficient. In this case, however, the child query contained 8 index reads, all reading indexes from a foreign environment.
      Index reads open their index parts during initialization, so this child graph initialization had a lot of work to do and took more than 30s.

      This could be improved by not opening all the parts immediately (or at least not at this early initialization stage), and/or by increasing the fixed, rather short timeout.
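
      The first option, deferring the open, can be sketched as follows. This is a minimal illustration under stated assumptions, not the fix that was applied: `LazyIndexPart` and `OpenPart` are hypothetical names, and the `Opener` callback stands in for whatever actually opens a (possibly remote) index part.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for an opened index part; not an HPCC type.
struct OpenPart {
    std::string path;
};

// Defers the (possibly slow) open until the part is first read, so that
// constructing many parts during initialization stays cheap.
// Note: this sketch is not thread-safe.
class LazyIndexPart {
public:
    using Opener = std::function<std::unique_ptr<OpenPart>(const std::string &)>;

    LazyIndexPart(std::string path, Opener opener)
        : path_(std::move(path)), opener_(std::move(opener)) {}

    // True once the part has actually been opened.
    bool isOpen() const { return part_ != nullptr; }

    // Opens the part on first use and reuses it afterwards.
    OpenPart &get() {
        if (!part_)
            part_ = opener_(path_);
        return *part_;
    }

private:
    std::string path_;
    Opener opener_;
    std::unique_ptr<OpenPart> part_;
};
```

      With this shape, initialization only records the part paths; the cost of opening each part is paid when the activity first reads from it.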

      Original report:

      Alpha dev jobs on the 6.2.x platforms are encountering intermittent errors of the following nature:

      9999: System error: 9999: Internal Error at /mnt/disk1/jenkins/workspace/LN-Candidate-with-Plugins-6.2.6-1/LN/centos-6.0-x86_64/HPCC-Platform/thorlcr/graph/thgraphmaster.cpp(270)

      Earlier, Jake Smith indicated:

      Is this reproducible?

      The error could provide more info, but what appears to be happening is that the slaves are running a child query which is taking > 30 seconds to initialize with metadata they request from the master.
      Unfortunately, the error doesn't make it clear if 1 or all hit this issue.

      If it's reproducible, I might have a stab at spotting what it's doing while in flight.

      To my knowledge, it is not reliably reproducible, but it continues to occur. If we can help by trying to trigger it while Jake Smith watches, we'd be happy to do so.

      Attached is a ZAP report from one of the occurrences.

      Tony Kirk, Srividya Varadarajan

              People

              • Assignee: Jake Smith (jakesmith)
              • Reporter: Kevin Logemann (kev77log)
              • Votes: 0
              • Watchers: 3
