Details
- Type: Sub-task
- Status: Resolved
- Priority: Major
- Resolution: Fixed
Description
I propose adding the following syntax extensions.
ds' := UNORDERED(ds) - implies that the order of ds is not required
The following attributes are available on all activities:
- PARALLEL[(n)] - execute this activity in parallel. The exact meaning may depend on the activity. n is the optional number of strands.
- ORDERED(bool) - whether the order of any output is relied on.
- UNORDERED - the order of output rows for all outputs cannot be relied on. (Synonym for ORDERED(FALSE))
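The difference between ORDERED and UNORDERED on a parallel activity can be sketched outside ECL. The Python below is only an illustration under stated assumptions: `run_activity`, `strands` and `ordered` are hypothetical names, and threads stand in for strands. It is not how the engine implements these attributes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_activity(rows, transform, strands=4, ordered=True):
    """Apply transform to every row using up to `strands` worker threads.

    Hypothetical stand-in for an activity with PARALLEL(strands) and
    ORDERED(ordered); not taken from the platform's code.
    """
    with ThreadPoolExecutor(max_workers=strands) as pool:
        if ordered:
            # map() yields results in input order, whichever worker finishes first.
            return list(pool.map(transform, rows))
        # With no ordering requirement, rows can be emitted as they complete.
        futures = [pool.submit(transform, row) for row in rows]
        return [f.result() for f in as_completed(futures)]

rows = list(range(10))
ordered_out = run_activity(rows, lambda r: r * 2, ordered=True)
unordered_out = run_activity(rows, lambda r: r * 2, ordered=False)
```

Both calls produce the same set of rows; only the ordered call guarantees they come back in input order, which is the freedom UNORDERED hands to the activity.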
The following are relevant on a subset of activities:
- STABLE(bool) - indicates whether the order of the input rows is significant.
- UNSTABLE - equivalent to STABLE(false)
- ALGORITHM(x) - allows the algorithm to be set independently of the STABLE flag
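As a rough illustration of what STABLE means for a sort, the Python sketch below relies on the fact that Python's `sorted()` is stable. The data and the helper names (`sort_by_key`, `groups`) are made up for the example. It shows that a stable sort keeps rows with equal keys in input order, and that once stability is not required the input can be reordered freely without changing which rows land under each key.

```python
import random

rows = [("b", 1), ("a", 2), ("b", 3), ("a", 4)]

def sort_by_key(rows):
    # sorted() is stable: rows with equal keys keep their input order.
    return sorted(rows, key=lambda row: row[0])

stable_out = sort_by_key(rows)

# If stability is not required, the input order is irrelevant:
# reorder the input arbitrarily and sort again.
shuffled = rows[:]
random.Random(0).shuffle(shuffled)
unstable_ok_out = sort_by_key(shuffled)

def groups(sorted_rows):
    """Collect the values seen under each key, ignoring their order."""
    result = {}
    for key, value in sorted_rows:
        result.setdefault(key, set()).add(value)
    return result
```

Here `stable_out` keeps `("a", 2)` ahead of `("a", 4)` because that was the input order, while `unstable_ok_out` only promises that the keys are sorted and that each key holds the same rows.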
Note: It is more common to know that the order of the input dataset is not required at the point of use (and fits in better with reusing definitions) - which is why UNORDERED(ds) is useful.
Derived order (e.g., from HPCC-10144) is annotated by adding an unordered attribute to an activity.
Other related changes:
- SORT, UNSTABLE implies that the input dataset is unordered.
- (+)(a,b[,options]) as a functional form of the append operator.
- filter(ds, c1, c2, options) as a functional form of a filter.
Great care will be needed to ensure that options are not lost from activities when they are optimized - especially IF and filter.
Issue Links
- relates to
  - HPCC-14810 Concatenation of child datasets more than 10 times slower with + operator than with & operator (Resolved)
  - HPCC-10144 Add a global sort and distribution optimizer (Accepted)