HPCC-17143

SMART join losing records in Boca Dataland

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 6.2.12
    • Component/s: Thor
    • Labels:
      None
    • Environment:
      Boca Dataland

      Description

      We received a report from one of our SALT users that his internal linking module “lost” some records (i.e. the output dataset had fewer records than the input dataset). That should never happen, provided the field declared to be a unique record identifier (RIDFIELD) is indeed unique. I’ve confirmed that proviso to be true for his data.
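
      A uniqueness check of this kind can be sketched as below. This is only an illustration, not the check that was actually run: the record layout, the company_name field, and the inline data are hypothetical stand-ins, and only the names ih and rcid come from the join in this report.

      ```ecl
      // Toy stand-in for the input file; the real layout of ih is not shown in this report.
      InRec := RECORD
        UNSIGNED6 rcid;          // the field declared as RIDFIELD
        STRING20  company_name;  // hypothetical payload field
      END;

      ih := DATASET([{1, 'ACME'}, {2, 'ACME INC'}, {3, 'WIDGETS LLC'}], InRec);

      // Count how many records share each rcid value.
      rcidCounts := TABLE(ih, {rcid, UNSIGNED8 cnt := COUNT(GROUP)}, rcid, MERGE);

      // Any row that survives this filter is an rcid value used by more than one record.
      dupRcids := rcidCounts(cnt > 1);

      OUTPUT(COUNT(dupRcids), NAMED('DuplicateRcidValues'));
      OUTPUT(CHOOSEN(dupRcids, 100), NAMED('DuplicateRcidSample'));
      ```

      If rcid is unique, DuplicateRcidValues is zero and the sample output is empty.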

      I traced the loss to the following join…

      pi1 := JOIN(ih, Patched_Infile_thin, LEFT.rcid=RIGHT.rcid, TRANSFORM(RECORDOF(ih),SELF:=RIGHT,SELF:=LEFT), KEEP(1), SMART); 
      

      The loss is visible in the graph for workunit W20170219-133941 (http://dataland_esp:8010/?Wuid=W20170219-133941&GraphName=graph17&SubGraphId=1383&SafeMode=false&Widget=GraphTreeWidget). 7,108,236,038 records go into the join on both the left and right sides, and 7,108,235,381 come out, a loss of 657 records.

      Based on the history of the code I asked the user to change this join from SMART to HASH and rerun. This succeeded; the same number of records came out of the join as went in. This can be seen in http://dataland_esp:8010/?Wuid=W20170223-210259&GraphName=graph17&SubGraphId=1386&SafeMode=false&Widget=GraphTreeWidget
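
      The workaround can be sketched as follows. The JOIN is the one quoted in this report with SMART replaced by HASH; everything else is an assumption made so the example stands alone: the two record layouts, the lgid3 and company_name fields, and the inline data are hypothetical, and the ASSERT guard is an addition that is not in the original code.

      ```ecl
      // Hypothetical layouts; only rcid appears in this report.
      FullRec := RECORD
        UNSIGNED6 rcid;
        UNSIGNED6 lgid3;         // hypothetical field patched by the thin file
        STRING20  company_name;  // hypothetical payload field
      END;

      ThinRec := RECORD
        UNSIGNED6 rcid;
        UNSIGNED6 lgid3;
      END;

      // Toy stand-ins for the real datasets.
      ih := DATASET([{1, 10, 'ACME'}, {2, 20, 'ACME INC'}, {3, 30, 'WIDGETS LLC'}], FullRec);
      Patched_Infile_thin := DATASET([{1, 10}, {2, 10}, {3, 30}], ThinRec);

      // Same join as in the report, with SMART replaced by HASH.
      pi1 := JOIN(ih, Patched_Infile_thin,
                  LEFT.rcid = RIGHT.rcid,
                  TRANSFORM(RECORDOF(ih), SELF := RIGHT, SELF := LEFT),
                  KEEP(1), HASH);

      // Added guard (not part of the original code): fail the workunit
      // if the join changes the record count.
      ASSERT(COUNT(pi1) = COUNT(ih), 'JOIN changed the record count', FAIL);

      OUTPUT(COUNT(ih),  NAMED('RecordsIn'));
      OUTPUT(COUNT(pi1), NAMED('RecordsOut'));
      ```

      With rcid unique on both sides and KEEP(1), each left record should match exactly one right record, so the two counts are expected to agree.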

      The ECL can be seen in context on line 189 of BIPV2_LGID3_V362.matches.

      This appears to be similar in some respects to HPCC-11944.

      Within the SALT team we are tracking this as https://github.com/hpcc-systems/SALT/issues/2167

            People

            • Assignee:
              jakesmith Jake Smith
            • Reporter:
              leonarta Todd Leonard