  1. HPCC
  2. HPCC-23562

Execute multiple workflow items in parallel

    Details

    • Compatibility:
      Minor
    • Roadmap:
      Future
    • Roadmap Summary:
      Reduce the latency for executing complex queries.

      Description

      It looks like a Jira was never created for this issue, so creating one now (cc David Bayliss)

      Currently the workflow engine executes a single item at a time, and waits for that workflow item to complete before continuing.  Some jobs would benefit significantly if separate persists or independent actions were executed in parallel.

      The workflow information already contains all the dependencies, and information about items that need to be executed sequentially.  The following would be required:

      i) Restructure the workflow engine to create a graph of tasks that can be used to track which tasks have been executed, and which tasks should be executed next.

      ii) Ensure that there are no multithreading issues in the workflow engine (e.g. the way persist information is calculated).

      iii) Check eclagent for any multithreading issues.
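
      The task graph in step i) can be sketched as follows. This is an illustrative Python sketch only, not the HPCC workflow engine code: the function name run_workflow, the dictionary layout and the use of a thread pool are all assumptions made for the example. Each item keeps a count of unfinished dependencies, and an item is started as soon as that count reaches zero.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_workflow(items, dependencies, max_workers=4):
    """Run each workflow item once, as soon as its dependencies have finished.

    Hypothetical interface (not taken from the HPCC code base):
      items        -- dict: item id -> zero-argument callable doing the work
      dependencies -- dict: item id -> set of item ids that must finish first
    Every id used in `dependencies` is assumed to be a key of `items`.
    Returns the item ids in the order in which they completed.
    """
    # Number of unfinished dependencies for each item.
    pending = {wfid: len(dependencies.get(wfid, ())) for wfid in items}

    # Reverse edges: which items are waiting for a given item.
    dependents = {wfid: [] for wfid in items}
    for wfid, deps in dependencies.items():
        for dep in deps:
            dependents[dep].append(wfid)

    completed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Items without dependencies can start immediately.
        running = {pool.submit(items[w]): w for w in items if pending[w] == 0}
        while running:
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                wfid = running.pop(future)
                future.result()  # re-raise any exception from the item
                completed.append(wfid)
                # Finishing an item may make its dependents ready.
                for child in dependents[wfid]:
                    pending[child] -= 1
                    if pending[child] == 0:
                        running[pool.submit(items[child])] = child

    if len(completed) != len(items):
        raise ValueError("cyclic dependencies: some items could never start")
    return completed
```

      With dependencies {3: {1, 2}}, items 1 and 2 may run concurrently and item 3 starts only after both have completed. All bookkeeping happens on the calling thread, so the graph state itself needs no locking; the work done inside the items would still need the thread-safety checks described in ii) and iii).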

      It may be sensible to initially only support this for roxie/thor, and then revisit hthor.

      This may become even more significant in cloud environments, where extra thors could be spun up on demand, allowing multiple graphs to be processed in parallel.

       

        Sub-Tasks

        #. Summary | Type | Status | Assignee
        1. Ensure eclcc labels items with parallel mode when the PARALLEL action is used | Sub-task | Accepted | Gavin Halliday
        2. Add workflow examples to the ECL regression suite | Sub-task | Resolved | Nathan Halliday
        3. Allow very simple scalar workflow example to run in parallel | Sub-task | Accepted | Nathan Halliday
        4. Allow very simple scalar workflow example to run in parallel - hthor | Sub-task | Resolved | Nathan Halliday
        5. Allow workflow example with graphs to run in parallel - roxie | Sub-task | Accepted | Nathan Halliday
        6. Allow workflow example with graphs to run in parallel - thor | Sub-task | Accepted | Nathan Halliday
        7. Ensure eclcc labels items with ORDERED mode | Sub-task | Accepted | Gavin Halliday
        8. Test new workflow with recovery | Sub-task | Accepted | Nathan Halliday
        9. Implement Success and Failure to run in parallel | Sub-task | Resolved | Nathan Halliday
        10. search for all calls to workflow from eclagent | Sub-task | Accepted | Nathan Halliday
        11. Remove Calls to getWorkflowId() from eclagent and roxie | Sub-task | Resolved | Nathan Halliday
        12. Add the ability to track the maximum number of threads used. | Sub-task | Accepted | Gavin Halliday
        13. Investigate optimizing evaluation order/scheduling | Sub-task | Unresourced | Unassigned
        14. Clarify the ordering requirement of Ordered in the ECL Language reference | Sub-task | Resolved | Unassigned
        15. Add more workflow examples to the ECL regression suite | Sub-task | Resolved | Nathan Halliday
        16. Implement Persist Items to run in Parallel | Sub-task | Merge Pending | Nathan Halliday
        17. Remove unlockWorkUnit() from EclAgent | Sub-task | Resolved | Nathan Halliday
        18. Allow thor cluster to be changed in parallel workflow | Sub-task | Accepted | Available for anyone
        19. Identify when a query is changing cluster for certain items. Run query on sequential engine. | Sub-task | Accepted | Gavin Halliday
        20. Start multiple Thor jobs at the same time | Sub-task | Resolved | Jake Smith
        21. Improve workunit documentation | Sub-task | Resolved | Nathan Halliday
        22. Enable parallel workflow by default | Sub-task | Accepted | Available for anyone
        23. Check for potential deadlock issues with parallel persist execution. | Sub-task | Accepted | Available for anyone
        24. Check failcode/failmessage available in failure parallel contingencies | Sub-task | Accepted | Available for anyone
        25. Investigate parallel workflow items aborting other threads | Sub-task | Accepted | Available for anyone
        26. Implement Once items to run on parallel engine | Sub-task | Resolved | Nathan Halliday

            People

            • Assignee:
              Nathan Halliday
              Reporter:
              Gavin Halliday
            • Votes:
              0
              Watchers:
              6
