  1. HPCC
  2. HPCC-20946

Multiple outputs from PIPE



    • Type: Bug
    • Status: Awaiting Information
    • Priority: Not specified
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: eclcc, Roxie, Thor
    • Labels:


      Currently the ECL PIPE command has an input and a single output.


      It would be extremely useful if we could define (and use) multiple outputs from a single ECL pipe invocation.


      There are two primary use cases:


      "Statistics gather": this is the case where you are really doing a 'normal pipe', reading in a terabyte of information and writing it out again slightly mutated, but you also want to gather statistics on how the process went (perhaps for QA purposes). Here the second output is far smaller (probably megabytes).

      You can, of course, just append the small data as extra columns, then read/aggregate and strip them. But that trebles the cost of the process (read 3 times, write twice, versus read/write once).
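To make the "statistics gather" shape concrete, here is a minimal sketch of the external program's side of such a pipe, assuming it is a Python filter. The function name `mutate_and_count`, the upper-casing "mutation" and the counter names are all hypothetical placeholders, and how ECL would bind the second stream is exactly what this issue leaves open.

```python
import io
import json


def mutate_and_count(records, main_out, stats_out):
    """Single pass: write each mutated record to main_out, then one
    small statistics record to stats_out."""
    stats = {"records": 0, "chars_in": 0, "empty": 0}
    for line in records:
        stats["records"] += 1
        stats["chars_in"] += len(line)
        text = line.rstrip("\n")
        if not text:
            stats["empty"] += 1
        # Placeholder mutation (an assumption): upper-case the record.
        main_out.write(text.upper() + "\n")
    # The second output is tiny compared with the main stream.
    stats_out.write(json.dumps(stats) + "\n")
    return stats


if __name__ == "__main__":
    main_out, stats_out = io.StringIO(), io.StringIO()
    mutate_and_count(["alpha\n", "\n", "beta\n"], main_out, stats_out)
    print(main_out.getvalue(), end="")
    print(stats_out.getvalue(), end="")
```

The large stream is read and written once, and the statistics never have to travel as extra columns on every record.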


      "Vertical split": here you are reading a stream and splitting it into pieces based upon vertical (or horizontal and vertical) decisioning. A concrete case in hand is reading in XML and splitting it out to a dozen flat datasets (but that is not the only case). In the example it is read once, write once rather than read 10 times, write once.
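The "vertical split" case could look roughly like the sketch below, again written as a Python filter on the program side of the pipe. The XML layout, the tag names (`person`, `order`) and the helper `split_xml` are assumptions made for illustration only; the point is that the XML is parsed once and each record type goes to its own flat output.

```python
import csv
import io
import xml.etree.ElementTree as ET


def split_xml(source, outputs):
    """Read the XML stream once and route each record to the output
    registered for its tag.

    `outputs` maps a record tag to (file-like object, list of field names).
    """
    writers = {
        tag: (csv.writer(out, lineterminator="\n"), fields)
        for tag, (out, fields) in outputs.items()
    }
    counts = {tag: 0 for tag in outputs}
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag in writers:
            writer, fields = writers[elem.tag]
            writer.writerow([elem.findtext(f, default="") for f in fields])
            counts[elem.tag] += 1
            elem.clear()  # drop the finished record's children
    return counts


if __name__ == "__main__":
    xml = (b"<root><person><id>1</id><name>Ann</name></person>"
           b"<order><id>7</id><person_id>1</person_id></order>"
           b"<person><id>2</id><name>Bob</name></person></root>")
    people, orders = io.StringIO(), io.StringIO()
    split_xml(io.BytesIO(xml), {
        "person": (people, ["id", "name"]),
        "order": (orders, ["id", "person_id"]),
    })
    print(people.getvalue(), end="")
    print(orders.getvalue(), end="")
```

With a single output, the same result needs one full read of the XML per target dataset, or a wide combined record that is split again afterwards.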


      I am not so fussed about the syntax.




            • Assignee:
              dabayliss David Bayliss


              • Created: