Details
Sub-task
Status: Resolved
Major
Resolution: Fixed
Description
Currently a 1:1 correspondence is assumed between an input and a row stream. E.g., Thor's input interface is derived from IRowStream.
Logically that should be split. I'm not 100% sure of the correct structure. I think we want the following:
- Activities can have multiple logical outputs and logical inputs.
- Each of those inputs and outputs could be processed in parallel with different row streams.
- Different inputs (and in the future outputs) could each have different numbers of streams. For instance, the RHS of a lookup join may have a single stream, but the LHS may have multiple streams.
(Note: "input" isn't the most accurate name. I am going to use "row channel" in the following to see if it is clearer.)
Here is a straw man to shoot down and improve:
interface IActivity
    unsigned numOutputs();
    unsigned numInputs();
    unsigned numInputStrands(unsigned whichInput);
    IRowChannel * queryOutput(unsigned n);
    void setInput(unsigned whichInput, IRowChannel * input);
    void setInputStrand(unsigned whichInput, unsigned whichStrand, IRowStreamEx * stream);

interface IRowChannel
    unsigned numStrands();
    IRowStream * queryStrand(unsigned n);
    IOutputMetaData * queryMetaData();

interface IRowStreamEx
    const void * nextRow();
    void stop(bool abort);
    ...
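To make the straw man easier to poke at, here is a minimal C++ sketch of the channel/strand split, with a lookup-join-like activity whose LHS has three strands and whose RHS has one. Everything in it is an assumption for illustration: IRowStream and IRowChannel are cut-down stand-ins rather than the real platform declarations, stop() takes no abort flag, queryMetaData() is omitted, and VectorStream, SimpleChannel, ToyActivity and countRows are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Cut-down stand-in for the row stream interface (assumption, not the real declaration).
struct IRowStream
{
    virtual const void * nextRow() = 0;   // nullptr means end of stream in this sketch
    virtual void stop() = 0;
    virtual ~IRowStream() = default;
};

// A row channel is one logical input/output; it exposes one or more strands.
struct IRowChannel
{
    virtual unsigned numStrands() = 0;
    virtual IRowStream * queryStrand(unsigned n) = 0;
    virtual ~IRowChannel() = default;
};

// Hypothetical strand that serves rows from an in-memory vector of ints.
class VectorStream : public IRowStream
{
public:
    explicit VectorStream(std::vector<int> _rows) : rows(std::move(_rows)) {}
    const void * nextRow() override { return (pos < rows.size()) ? &rows[pos++] : nullptr; }
    void stop() override { pos = rows.size(); }
private:
    std::vector<int> rows;
    std::size_t pos = 0;
};

// Hypothetical channel that owns a fixed set of strands.
class SimpleChannel : public IRowChannel
{
public:
    void addStrand(std::vector<int> rows) { strands.push_back(std::make_unique<VectorStream>(std::move(rows))); }
    unsigned numStrands() override { return (unsigned)strands.size(); }
    IRowStream * queryStrand(unsigned n) override { assert(n < strands.size()); return strands[n].get(); }
private:
    std::vector<std::unique_ptr<VectorStream>> strands;
};

// Hypothetical activity: each logical input is a channel with its own strand count.
class ToyActivity
{
public:
    explicit ToyActivity(unsigned inputs) : channels(inputs, nullptr) {}
    unsigned numInputs() const { return (unsigned)channels.size(); }
    void setInput(unsigned whichInput, IRowChannel * input)
    {
        assert(whichInput < channels.size());
        channels[whichInput] = input;
    }
    unsigned numInputStrands(unsigned whichInput) const
    {
        assert(whichInput < channels.size());
        return channels[whichInput] ? channels[whichInput]->numStrands() : 0;
    }
private:
    std::vector<IRowChannel *> channels;
};

// Drain every strand of a channel in turn and return the number of rows seen.
unsigned countRows(IRowChannel & channel)
{
    unsigned total = 0;
    for (unsigned s = 0; s < channel.numStrands(); s++)
    {
        IRowStream * strand = channel.queryStrand(s);
        while (strand->nextRow())
            total++;
    }
    return total;
}
```

The point of the sketch is only that the strand count belongs to the channel, so two inputs of the same activity can report different values from numInputStrands().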
At this point the row channel interfaces will no longer be derived from the row stream interface. I'm not sure if it is worth the effort of having a shared row channel in Thor and Roxie.
I think that matches our current activity structures relatively well. When connecting a particular output/input between activities, Roxie should be able to automatically create the row stream junctions to split or merge strands (1->M, M->1, etc.). (It will need more information and logic...)
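As a rough illustration of what an M->1 junction might look like, here is a single-threaded C++ sketch that presents several upstream strands as one stream by polling them round-robin. It is an assumption-laden toy, not the Roxie design: IRowStream is a cut-down stand-in, MergeJunction, VectorStream and drain are hypothetical names, and a real junction would need threading, ordering and stop/abort semantics that are ignored here.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Cut-down stand-in for the row stream interface (assumption, not the real declaration).
struct IRowStream
{
    virtual const void * nextRow() = 0;   // nullptr means end of stream in this sketch
    virtual void stop() = 0;
    virtual ~IRowStream() = default;
};

// Hypothetical strand that serves rows from an in-memory vector of ints.
class VectorStream : public IRowStream
{
public:
    explicit VectorStream(std::vector<int> _rows) : rows(std::move(_rows)) {}
    const void * nextRow() override { return (pos < rows.size()) ? &rows[pos++] : nullptr; }
    void stop() override { pos = rows.size(); }
private:
    std::vector<int> rows;
    std::size_t pos = 0;
};

// Hypothetical M->1 junction: downstream sees a single stream, upstream has M strands.
class MergeJunction : public IRowStream
{
public:
    explicit MergeJunction(std::vector<IRowStream *> _inputs) : inputs(std::move(_inputs)) {}

    const void * nextRow() override
    {
        while (!inputs.empty())
        {
            if (next >= inputs.size())
                next = 0;                                   // wrap around to the first strand
            const void * row = inputs[next]->nextRow();
            if (row)
            {
                next++;                                     // move on so strands take turns
                return row;
            }
            // This strand is exhausted: drop it and retry at the same position.
            inputs.erase(inputs.begin() + static_cast<std::ptrdiff_t>(next));
        }
        return nullptr;                                     // all strands exhausted
    }

    void stop() override
    {
        for (IRowStream * strand : inputs)
            strand->stop();
        inputs.clear();
    }

private:
    std::vector<IRowStream *> inputs;   // not owned by the junction
    std::size_t next = 0;
};

// Helper for the sketch: read a stream of int rows to the end.
std::vector<int> drain(IRowStream & stream)
{
    std::vector<int> out;
    while (const void * row = stream.nextRow())
        out.push_back(*static_cast<const int *>(row));
    return out;
}
```

A 1->M junction would be the mirror image (one upstream stream shared by M downstream strands) and would need locking around the shared nextRow() call, which is part of the extra information and logic mentioned above.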
Note: Eventually the row probes may also need multiple-strand variants.