HPCC / HPCC-23137

Balanced splitter doesn't adequately consider the memory footprint of rows where child datasets are involved

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.6.14, 7.4.38, 7.2.54
    • Component/s: Thor
    • Labels:
      None

      Description

      A balanced splitter uses a small, configurable memory buffer to hold rows.
      It blocks when this limit is reached, until the other arms have caught up.
      To decide when it has read far enough ahead, it uses a rowSize method. Unfortunately, this was based on the size returned by the generated output meta, not on the row's actual potential memory footprint.
      This is particularly significant when child datasets are involved, where the actual memory footprint is variable and quite likely much bigger than the nominal size of the parent record.
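
The gap between the two measures can be illustrated with a minimal C++ sketch. The names used here (`ChildRow`, `ParentRow`, `nominalSize`, `actualFootprint`) are hypothetical and are not the Thor row or meta interfaces; the sketch only assumes a parent record holding a handle to a variable-length child dataset.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical child record with a fixed layout.
struct ChildRow {
    std::uint64_t id;
    std::uint64_t value;
};

// Hypothetical parent record that owns a variable-length child dataset.
struct ParentRow {
    std::uint64_t key;
    std::vector<ChildRow> children;
};

// Nominal size: what a fixed record description would report.
// It ignores everything the row points to.
std::size_t nominalSize(const ParentRow &) {
    return sizeof(ParentRow);
}

// Approximate actual footprint: the parent record plus the storage
// used by the child rows it currently holds.
std::size_t actualFootprint(const ParentRow &row) {
    return sizeof(ParentRow) + row.children.size() * sizeof(ChildRow);
}
```

For a parent holding 1000 child rows, `nominalSize` stays constant while `actualFootprint` grows with the child count, which is the discrepancy described above.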

      Elsewhere, in similar situations, a size serializer is used to get the actual size.
      The balanced splitter should do the same.
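
A rough, single-threaded model of the read-ahead accounting shows why the choice of size function matters. Everything in this sketch is an assumption made for illustration: `Row`, `ReadAheadBuffer`, `nominalSizer` and `footprintSizer` are hypothetical names, and `tryPush` returning false stands in for the point at which a real splitter would block.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical row; the vector stands in for a child dataset.
struct Row {
    int key;
    std::vector<int> children;
};

using RowSizer = std::function<std::size_t(const Row &)>;

// Size as a fixed record description would see it.
inline std::size_t nominalSizer(const Row &) { return sizeof(Row); }

// Size including the storage used by the child dataset.
inline std::size_t footprintSizer(const Row &row) {
    return sizeof(Row) + row.children.size() * sizeof(int);
}

// Model of a read-ahead buffer with a byte budget.
class ReadAheadBuffer {
public:
    ReadAheadBuffer(std::size_t limitBytes, RowSizer sizeFn)
        : limit(limitBytes), sizer(std::move(sizeFn)) {
        assert(limit > 0);
    }

    // Returns false once the budget is used up (where a real splitter would block).
    bool tryPush(Row row) {
        if (accounted >= limit)
            return false;
        accounted += sizer(row);
        rows.push_back(std::move(row));
        return true;
    }

    // Called when the slowest arm has consumed the oldest buffered row.
    void popOldest() {
        assert(!rows.empty());
        accounted -= sizer(rows.front());
        rows.pop_front();
    }

    std::size_t rowCount() const { return rows.size(); }
    std::size_t accountedBytes() const { return accounted; }

private:
    std::size_t limit;
    RowSizer sizer;
    std::deque<Row> rows;
    std::size_t accounted = 0;
};
```

With a 4096-byte limit and rows that each carry 1000 child integers, a buffer built with `nominalSizer` keeps accepting rows long after the memory actually held has passed the limit, whereas one built with `footprintSizer` stops after only a couple of rows.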

            People

            • Assignee:
              jakesmith Jake Smith
              Reporter:
              jakesmith Jake Smith
