HPCC-23137

Balanced splitter doesn't adequately consider memory footprint of row usage where child datasets are involved


    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 7.6.14, 7.4.38, 7.2.54
    • Component/s: Thor
    • Labels:
      None

      Description

      A balanced splitter uses a [configurable] small memory buffer to hold rows.
      It blocks when this limit is reached, until the other arms have caught up.
      To know when it has read ahead enough, it uses a rowSize method. Unfortunately, this was based on the size returned by the generated output meta, not on the row's actual potential memory footprint.
      This is particularly significant when child datasets are involved, where the actual memory footprint is variable and quite likely much bigger than the nominal size of the parent record.

      Elsewhere, in similar situations, a size serializer is used to get the actual size.
      The balanced splitter should do the same.
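
      The gap between the nominal record size and the real footprint can be illustrated with a minimal, standalone C++ sketch. It is not taken from the Thor source: ParentRow, ChildRow, nominalSize, footprint and ReadAheadBudget are all hypothetical names invented for this illustration, and footprint only estimates what a size serializer might report.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical row layout (not from the HPCC code base): a fixed-size
// parent record that owns a variable-length child dataset.
struct ChildRow {
    std::uint64_t id;
    std::string payload;              // variable-length field
};

struct ParentRow {
    std::uint64_t key;
    std::vector<ChildRow> children;   // child dataset, held out of line
};

// Nominal size: what fixed-layout metadata would report for the parent
// record alone. It is the same no matter how many children exist.
std::size_t nominalSize(const ParentRow &) {
    return sizeof(ParentRow);
}

// Footprint estimate: the parent record plus every child row and its
// payload bytes. This is an approximation for illustration only.
std::size_t footprint(const ParentRow &row) {
    std::size_t total = sizeof(ParentRow);
    for (const ChildRow &child : row.children)
        total += sizeof(ChildRow) + child.payload.size();
    return total;
}

// Hypothetical read-ahead budget: tracks the bytes buffered ahead of the
// slowest arm and tells the reader when it should block.
class ReadAheadBudget {
public:
    explicit ReadAheadBudget(std::size_t limitBytes) : limit(limitBytes) {}

    // Returns true if the row fits. A row is always accepted into an empty
    // buffer so that a single oversized row cannot stall the reader forever.
    bool tryAdd(std::size_t rowBytes) {
        if (used > 0 && used + rowBytes > limit)
            return false;             // caller should block until release()
        used += rowBytes;
        return true;
    }

    // Called once all arms have consumed a row of the given size.
    void release(std::size_t rowBytes) {
        used -= (rowBytes < used) ? rowBytes : used;
    }

    std::size_t bytesUsed() const { return used; }

private:
    std::size_t limit;
    std::size_t used = 0;
};
```

      With a parent row holding 1000 children of 100 payload bytes each, a budget fed nominalSize would keep accepting rows long after the real memory use has passed the limit, whereas a budget fed footprint blocks at the second row for a 64 KB limit.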

            People

            Assignee:
            jakesmith Jake Smith
            Reporter:
            jakesmith Jake Smith