HPCC / HPCC-15832

Regex unicode can core on illegal regex

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Not specified
    • Resolution: Fixed
    • Affects Version/s: 6.0.0
    • Fix Version/s: 6.0.4
    • Component/s: eclccserver
    • Labels:
      None

      Description

      Background: The ECL Language Reference indicates that the regex functions support Unicode and points to the ICU documentation for information about regular expression metacharacters and operators (http://userguide.icu-project.org/strings/regexp). That documentation, in turn, references another page describing the properties that are supported via metacharacters (http://userguide.icu-project.org/strings/properties).

      Here is a snippet of code testing that support. The intention is to create a Unicode-friendly 'data pattern' result (uppercase characters become 'A', lowercase characters become 'a', digits become '9').

      MapUpperCharUni(UNICODE s) := REGEXREPLACE(u'\\p{Uppercase}', s, u'A');
      MapLowerCharUni(UNICODE s) := REGEXREPLACE(u'\\p{Lowercase}', s, u'a');
      MapDigitUni(UNICODE s) := REGEXREPLACE(u'\\p{Numeric_Type}', s, u'9');
      MapAllUni(UNICODE s) := (STRING)MapDigitUni(MapLowerCharUni(MapUpperCharUni(TRIM(s, LEFT, RIGHT))));
      
      MapAllUni(u'Über 1000');
      

      This fails when run, but no error appears in the workunit. The log shows that the failure actually occurred during the compile phase, where eclcc was terminated by signal 11:

      0000000E 2016-06-29 15:00:54.898  3632  5346 "eclcc: Creating PIPE program process : 'eclcc -shared - --timings -fmaxCompileThreads=2 -oW20160629-150054 -platform=hthor --component=myeclccserver@10.211.55.15 -fcreated_by=ws_workunits -fcreated_for=dcamper -fapplyInstantEclTransformations=1 -fapplyInstantEclTransformationsLimit=100' - hasinput=1, hasoutput=0 stderrbufsize=0"
      0000000F 2016-06-29 15:00:55.266  3632  7752 "ERROR: Program was terminated by signal 11"
      00000010 2016-06-29 15:00:55.266  3632  5346 "eclcc: Pipe: process 7753 complete 254"
      

      So there may be two separate problems here: 1) no error message appears in the workunit when the compiler crashes, and 2) the compiler fails to handle the property metacharacters that are passed to the underlying ICU call.
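
      As a possible workaround, here is a minimal sketch of the same snippet with only the third pattern changed. It rests on an assumption drawn from the ICU properties page rather than on anything confirmed in this ticket: Uppercase and Lowercase are binary properties and can be named on their own, while Numeric_Type is an enumerated property and probably needs an explicit value, which would make the bare \p{Numeric_Type} the illegal regex. The value Decimal is chosen here for illustration; whether it avoids the crash on 6.0.0 has not been verified.

      ```ecl
      // Unchanged from the original snippet: binary properties named on their own.
      MapUpperCharUni(UNICODE s) := REGEXREPLACE(u'\\p{Uppercase}', s, u'A');
      MapLowerCharUni(UNICODE s) := REGEXREPLACE(u'\\p{Lowercase}', s, u'a');

      // Assumption: Numeric_Type is given an explicit value (Decimal) instead of
      // being used bare, since it is an enumerated rather than a binary property.
      MapDigitUni(UNICODE s) := REGEXREPLACE(u'\\p{Numeric_Type=Decimal}', s, u'9');

      MapAllUni(UNICODE s) := (STRING)MapDigitUni(MapLowerCharUni(MapUpperCharUni(TRIM(s, LEFT, RIGHT))));

      MapAllUni(u'Über 1000');
      ```

      Even if this form compiles, the original pattern should produce a reported regex error in the workunit rather than a compiler crash, so both items above still stand.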

              People

              • Assignee: anybody (available for anyone)
              • Reporter: dcamper (Dan S. Camper)
              • Votes: 0
              • Watchers: 3