Thor's new KJ implementation handles a lot of pending join groups (1 LHS+RHS set)'s in parallel, if there are large sets of matches per LHS row and a lot of parts and order is preserved, the cumulative total memory in use can be large.
This has only been seen once afaik. A success workaround is to reduce the number of pending groups waiting to be processed, and reduce the number of done groups to be transformed, with e.g.:
A longer term fix is to dynamically scale / limit the amount used and/or potentially spill some completed groups.