HPCC-22736

KeyedJoin can run out of memory (roxiemem) if dealing with a lot of large pending groups

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 7.0.x
    • Fix Version/s: 7.6.0
    • Component/s: Thor
    • Labels:
      None

      Description

      Thor's new keyed join (KJ) implementation handles a lot of pending join groups in parallel, where a group is one LHS row plus its set of RHS matches. If there are large sets of matches per LHS row, a lot of parts, and order is preserved, the cumulative memory in use can be large.

      This has only been seen once, as far as I know. A successful workaround is to reduce the number of pending groups waiting to be processed and the number of done groups waiting to be transformed, e.g.:

      #option('keyLookupMaxQueued', 1000); // the default is 10000
      #option('keyLookupMaxDone', 1000); // the default is 10000
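
To get a feel for why lowering these two limits helps, here is a rough back-of-envelope sketch in Python. It is not taken from the Thor code: it assumes a worst case in which every queued and every done group is fully populated with one LHS row plus all of its RHS matches, and the row sizes and match count are hypothetical figures chosen only for illustration. The function names are likewise made up for this sketch.

```python
# Rough worst-case estimate of memory held by pending keyed-join groups.
# ASSUMPTION: each group holds one LHS row plus all of its matched RHS rows,
# and all queued and done groups are fully populated at the same time.
# This is an illustrative model, not how roxiemem actually accounts memory.

def estimate_group_bytes(matches_per_lhs_row, rhs_row_bytes, lhs_row_bytes):
    """Bytes held by a single group (one LHS row plus its RHS matches)."""
    return lhs_row_bytes + matches_per_lhs_row * rhs_row_bytes


def estimate_peak_bytes(max_queued, max_done, matches_per_lhs_row,
                        rhs_row_bytes, lhs_row_bytes):
    """Upper bound on bytes held when both limits are reached."""
    group_bytes = estimate_group_bytes(matches_per_lhs_row,
                                       rhs_row_bytes, lhs_row_bytes)
    return (max_queued + max_done) * group_bytes


# Hypothetical workload: 5000 matches per LHS row, 200-byte RHS rows,
# 100-byte LHS rows.
WORKLOAD = dict(matches_per_lhs_row=5000, rhs_row_bytes=200, lhs_row_bytes=100)

# Defaults quoted in this issue (10000 / 10000) versus the workaround values.
default_peak = estimate_peak_bytes(10000, 10000, **WORKLOAD)
reduced_peak = estimate_peak_bytes(1000, 1000, **WORKLOAD)

print(f"defaults:   {default_peak / 1e9:.1f} GB")
print(f"workaround: {reduced_peak / 1e9:.1f} GB")
```

Under these assumptions the bound scales linearly with the two limits, so dropping both from 10000 to 1000 cuts the worst case by a factor of ten. Real usage depends on how many groups are actually populated at once.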
      

      A longer-term fix is to dynamically scale or limit the amount of memory used and/or potentially spill some completed groups.

            People

            • Assignee:
              jakesmith Jake Smith
              Reporter:
              jakesmith Jake Smith