HPCC-13988

CSV Parsing QUOTE char and ESCAPE char cannot be the same

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 5.2.4
    • Fix Version/s: 5.4.0
    • Component/s: DFS
    • Labels:
      None
    • Environment:
      Elsevier Disambiguation AWS development environment; the issue also occurs in older platform versions.

      Description

      The following code fails because the platform does not allow the CSV QUOTE character and the CSV ESCAPE character to be the same. Executing it produces this run-time error:

      'System error: -1: Duplicate entry """ added to string matcher'

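      // Record layout for the raw page-info file.
      page_info_raw_layout := RECORD
        string   _skip1;
        unsigned page_id;
        string   _skip2;
        string   description;
        string   url;
      END;

      // QUOTE and ESCAPE are both set to the double-quote character,
      // which triggers the error above.
      a := DATASET('~web_log_demo::raw::page_info.txt', page_info_raw_layout,
                   CSV(QUOTE('"'), ESCAPE('"')));

      a;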

      This is particularly troublesome because files that escape a double quote by doubling it (so QUOTE and ESCAPE must both be '"') are common. For instance:

      1) Microsoft Excel .csv files.
      2) Java output escaped with org.apache.commons.lang.StringEscapeUtils.escapeCsv(), which offers no option to change the escape character.
      3) Output from software that writes RFC 4180 compliant CSV, where no other escaping option is specified.
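
      To show the convention these three cases share, here is a minimal sketch in Python (not ECL, and not part of the original report). It parses one record in which a quote inside a quoted field is escaped by doubling it. The sample record is invented to match the five-field layout above; it is an assumption, not data from the actual file.

```python
import csv
import io

# Hypothetical record: the 4th field contains quotes escaped by doubling them,
# i.e. the quote character also serves as the escape character.
raw = 'x,42,y,"The ""home"" page",http://example.com/home\r\n'

# doublequote=True (the csv module default) reads "" inside a quoted
# field as a single literal double quote.
reader = csv.reader(io.StringIO(raw, newline=''), quotechar='"', doublequote=True)
rows = list(reader)

print(rows[0][3])
```

      A parser that accepts the same character for quoting and escaping recovers the description field with its inner quotes intact, which is the behavior the ECL CSV options above are asking for.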

            People

            • Assignee: Attila Vamos (attilavamos)
            • Reporter: Brian Bounds (brianb644)