HPCC-12301

thorslave_lcr's alternate with big mem, small mem

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Minor
    • Resolution: Invalid
    • Affects Version/s: 5.0
    • Fix Version/s: None
    • Component/s: Config Utils, Thor
    • Labels:
    • Environment:
      CentOS 6

      Description

      On Thor, with 2 slaves per node and
      globalMemorySize = 80GB

      The ECL is roughly as follows (each phase uses the results of the previous one):
      DISTRIBUTE(...); // graph1
      SORT(...) : INDEPENDENT;
      DISTRIBUTE(...); // graph2
      SORT(...) : INDEPENDENT;
      DISTRIBUTE(...); // graph3
      SORT(...) : INDEPENDENT;
      DISTRIBUTE(...); // graph4
      SORT(...) : INDEPENDENT;
      DISTRIBUTE(...); // graph5
      SORT(...) : INDEPENDENT;
      OUTPUT(...); // graph6

      During graph1, each of the 2 thorslave_lcr processes starts out with 80GB of memory (per the VIRT column in top; in my tests, they never use the full amount).

      When graph2 starts, 2 new thorslave_lcr processes start up, but the graph1 thorslave_lcr processes are still running, still at 80GB each.
      However, the graph2 thorslave_lcr processes each get only 25GB (apparently taken from whatever is still available).
      They stay at 25GB for the duration of graph2.

      When graph3 starts, it gets 80GB.
      When graph4 starts, it gets 25GB.
      When graph5 starts, it gets 80GB.
      When graph6 starts, it gets 25GB.
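
      The alternating sizes above were read by hand from top. A minimal sketch of how the same observation could be logged instead, assuming the output of `ps -eo pid,vsz,comm` (VSZ in KiB) is available as text. The function name, the PIDs, and the sample values are made up for illustration, and "GB" is treated as GiB, which is an assumption:

```python
def parse_ps_vsz(ps_output, name="thorslave_lcr"):
    """Return {pid: virtual_size_in_GiB} for processes whose command is `name`.

    Expects text in the format printed by `ps -eo pid,vsz,comm`,
    where the VSZ column is in KiB.
    """
    sizes = {}
    for line in ps_output.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, vsz_kib, comm = parts
        # Skips the header row and any malformed line.
        if not (pid.isdigit() and vsz_kib.isdigit()):
            continue
        if comm.strip() == name:
            sizes[int(pid)] = int(vsz_kib) / (1024 * 1024)
    return sizes


# Hypothetical sample resembling the start of graph2: two old slaves
# still at 80 GiB and two new slaves at 25 GiB. PIDs are invented.
sample = """  PID      VSZ COMMAND
 4101 83886080 thorslave_lcr
 4102 83886080 thorslave_lcr
 5230 26214400 thorslave_lcr
 5231 26214400 thorslave_lcr
  977   115200 sshd
"""

if __name__ == "__main__":
    for pid, gib in sorted(parse_ps_vsz(sample).items()):
        print(pid, f"{gib:.1f} GiB")
```

      On a live node, the text could come from `subprocess.run(["ps", "-eo", "pid,vsz,comm"], capture_output=True, text=True).stdout`, sampled at each graph boundary.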

      I have not looked at other resource patterns, such as CPU or I/O, but I suspect those would not be affected as long as the memory actually needed does not hit the limit.

      If a job required all of the available RAM, this could cause significant delays in the alternate (25GB) graphs, depending on how globalMemorySize is set relative to physical RAM, the number of slaves per node, and so on.
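
      To put a number on the overlap, here is a rough tally that uses only the figures reported above (2 slaves per node, 80GB for the old slaves, 25GB for the new ones). It is plain arithmetic on the observed VIRT values, not a model of how Thor sizes its slaves, and the function name is hypothetical. Physical RAM is not stated in this report, so no comparison against it is attempted:

```python
def overlap_virtual_gb(slaves_per_node, per_slave_sizes_gb):
    """Total virtual size on one node while several slave generations coexist.

    per_slave_sizes_gb holds one entry per generation, e.g. [80, 25] for
    the old graph1 slaves plus the new graph2 slaves.
    """
    return slaves_per_node * sum(per_slave_sizes_gb)


if __name__ == "__main__":
    # Reported case: 2 old slaves at 80GB plus 2 new slaves at 25GB.
    print(overlap_virtual_gb(2, [80, 25]))
```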

            People

            • Assignee:
              Unassigned
              Reporter:
              jwilt James Wiltshire