Keyed Join detects which slaves hold key parts locally and, by default, sends each request to be fulfilled by the slave that holds the relevant part.
If the index is 1-way (or, where clusters overlap, much smaller than the cluster size), then sending all requests to be fulfilled remotely like this is inefficient.
It is probably better to read the parts directly (as if all parts were local), especially if the key is small.
That way the jhtree key node will service a lot of the requests, with the upshot that the speed is greatly increased.
For a large key of small width it may still make sense to service requests remotely, but I suspect it is better to turn off remote keyed lookups where the key's width is significantly smaller than the cluster's.
Normally the only case where that is true is a 1-way key, but it can also happen if clusters of different sizes overlap the cluster the query is running on, and the key has been built on one of those smaller overlapping clusters.
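To make the imbalance concrete, here is a rough back-of-the-envelope sketch in Python. It is not how Thor decides anything: it assumes requests spread evenly across key parts, that each part lives on a distinct slave, and that network and disk costs are ignored. The function names and the `ratio_threshold` default are hypothetical, chosen only for illustration.

```python
def max_load_remote(key_width: int, total_requests: int) -> int:
    """Busiest slave's share when every request is sent to a part-holding slave.

    Assumes requests are spread evenly over the key parts and each part
    sits on a different slave, so only `key_width` slaves do any work.
    """
    return -(-total_requests // key_width)  # ceiling division


def max_load_direct(cluster_width: int, total_requests: int) -> int:
    """Busiest slave's share when each slave reads the key parts itself.

    Assumes the incoming requests are spread evenly over all slaves.
    """
    return -(-total_requests // cluster_width)  # ceiling division


def prefer_direct_reads(key_width: int, cluster_width: int,
                        ratio_threshold: float = 0.5) -> bool:
    """Hypothetical rule of thumb: read directly when the key is much
    narrower than the cluster. The 0.5 threshold is an arbitrary placeholder."""
    return key_width < cluster_width * ratio_threshold


if __name__ == "__main__":
    # A 1-way key on a 100-way cluster: one slave would service everything.
    print(max_load_remote(1, 1000), max_load_direct(100, 1000))
    print(prefer_direct_reads(1, 100))
```

Under these assumptions a 1-way key funnels every request through a single slave, whereas direct reads leave each slave handling only its own share.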
The workaround for now, if this is an issue, is to use either:
or to add:
to the affected keyed JOINs.