In the first workunit, the data was distributed before the persist was written. The persisted attribute was then read without a distributed hint. My assumption was that, because the read happened in the same build instance, the distribution would already be known.
Graph 11, sg979, IDs 982/983: the distribution is missing. The funnel in 986 assumes the data is distributed even though it is only partially distributed. It is pulling its meta from 1014.
Graph 11 sg1014; output flatds; loss of metadistribute
After the lightweight join, the action is still local but the distribution is unknown.
Workaround: W20190326-161600, sg977
This shows a hash distribute after the read of the persist. In this case I removed the distribute before the persist. Note that the split right after the IF in subgraph 977 shows the metadistribute value properly. In sg1014, ID 1077, the lightweight join loses the distribution knowledge. The flow then goes into the local sort and local group. I'm confident the lightweight join did not move any data from one node to another, so the local operations will still work, but I'm mildly concerned that the compiler may no longer know the data is distributed.
Further, W20190326-182415 shows that the persist read does not honor a distribute. I distributed the dataset before the persist and then gave it a distributed hint on reading, yet the data is not actually distributed properly. This seems like a showstopper, since creating a persist in your build will lose all distribution.
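
For reference, the three read patterns described above can be sketched roughly as follows. This is an illustrative outline only, not the code from the workunits: the record layout, the inline data, the attribute names, and the persist name '~temp::persist::distribution_check' are all hypothetical, and the hash expression is assumed to be a simple HASH32 on one field.

```ecl
// Hypothetical layout and data, for illustration only.
Rec := RECORD
  UNSIGNED4 id;
  STRING20  name;
END;

raw := DATASET([{1, 'a'}, {2, 'b'}, {3, 'c'}], Rec);

// Distribute before the persist is written (first and third workunits).
// In the workaround workunit this distribute was removed.
dist      := DISTRIBUTE(raw, HASH32(id));
persisted := dist : PERSIST('~temp::persist::distribution_check'); // hypothetical name

// Case 1: read the persist with no hint and rely on LOCAL operations.
noHint := SORT(persisted, id, LOCAL);

// Case 2: assert the distribution with a DISTRIBUTED hint on read.
// Reported above as not resulting in properly distributed data.
hinted     := DISTRIBUTED(persisted, HASH32(id));
withHint   := SORT(hinted, id, LOCAL);

// Case 3 (workaround): explicitly redistribute after reading the persist.
redist     := DISTRIBUTE(persisted, HASH32(id));
workaround := SORT(redist, id, LOCAL);

OUTPUT(noHint,     NAMED('NoHint'));
OUTPUT(withHint,   NAMED('WithHint'));
OUTPUT(workaround, NAMED('Workaround'));
```

Comparing the graphs generated for the three outputs should make it easier to see where the distribution metadata is dropped.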