  1. HPCC
  2. HPCC-12264 Rewrite the Thor SORT activity
  3. HPCC-12270

Improve partitioning for the sliding window join


    Details

    • Type: Sub-task
    • Status: Unresourced
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Thor

      Description

      If there is a sliding window join (e.g., name[1..3],name[4..*])
      and a given partition element (e.g., name[1..3]) has a large number of duplicates,
      then that partition element should be split until the number of matches is below the threshold. This should help distribute the records evenly across all the nodes and substantially reduce the skew.
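
A minimal sketch of the splitting idea described above, in Python rather than Thor's own code. It is not the Thor implementation: the function name `split_partitions`, the parameters `prefix_len` and `max_matches`, and the choice to split by extending the prefix one character at a time are all assumptions made for illustration. Groups that consist entirely of identical full keys cannot be split further and are left as they are.

```python
from collections import defaultdict

def split_partitions(names, prefix_len=3, max_matches=2):
    """Group names by prefix; split any oversized group by lengthening its prefix.

    Hypothetical illustration only: prefix_len mirrors name[1..3], and
    max_matches stands in for the threshold mentioned in the description.
    """
    result = {}
    pending = [(prefix_len, list(names))]
    while pending:
        length, group = pending.pop()
        buckets = defaultdict(list)
        for name in group:
            buckets[name[:length]].append(name)
        for prefix, members in buckets.items():
            small_enough = len(members) <= max_matches
            # No member has characters beyond `length`, so a longer prefix cannot split it.
            exhausted = all(len(m) <= length for m in members)
            if small_enough or exhausted:
                result.setdefault(prefix, []).extend(members)
            else:
                # Too many duplicates for this partition element: split on one more character.
                pending.append((length + 1, members))
    return result

names = ["smith"] * 3 + ["smart", "smile", "smirk", "jones", "jonas"]
partitions = split_partitions(names, prefix_len=3, max_matches=2)
```

In this example the `smi` element holds five records, so it is split into `smil`, `smir`, and `smit`; `smit` is still over the limit and becomes `smith`, which contains only identical keys and is not split again. How the resulting elements would be assigned to nodes is outside the scope of this sketch.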

            People

            • Assignee:
              Unassigned
              Reporter:
              ghalliday Gavin Halliday
