A job validates account numbers against a reference file using a Join stage whose inputs are
hash partitioned by account number. Runtime monitoring reveals that some partitions process
many more rows than others. Assuming adequate hardware resources, which action can be used to
improve the performance of the job?

A. Replace the Join with a Merge stage.
B. Change the number of nodes in the configuration file.
C. Add a Sort stage in front of the Join stage. Sort by account number.
D. Use Round Robin partitioning on the stream and Entire partitioning on the reference.
Explanation:
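The symptom in the question (some partitions processing many more rows than others) is data skew: hash partitioning sends every row with the same key to the same partition, so a few very frequent account numbers overload one node. The sketch below is a plain Python simulation, not DataStage code. The row counts, the account number format, the four-partition setup, and the use of CRC32 as the hash are all assumptions made for illustration. It compares per-partition row counts under hash partitioning and round robin partitioning for a skewed stream.

```python
import zlib
from collections import Counter


def hash_partition_counts(keys, num_partitions):
    """Count rows per partition when rows are routed by a hash of the key."""
    counts = Counter()
    for key in keys:
        # CRC32 stands in for the engine's hash function (an assumption).
        counts[zlib.crc32(key.encode("utf-8")) % num_partitions] += 1
    return [counts[p] for p in range(num_partitions)]


def round_robin_counts(keys, num_partitions):
    """Count rows per partition when rows are dealt out in turn, ignoring the key."""
    counts = [0] * num_partitions
    for position, _ in enumerate(keys):
        counts[position % num_partitions] += 1
    return counts


# Hypothetical skewed stream: one account number accounts for 9,000 of 10,000 rows.
stream_keys = ["ACC0001"] * 9000 + [f"ACC{i:04d}" for i in range(2, 1002)]
NUM_PARTITIONS = 4

hash_counts = hash_partition_counts(stream_keys, NUM_PARTITIONS)
rr_counts = round_robin_counts(stream_keys, NUM_PARTITIONS)

print("hash partitioned :", hash_counts)
print("round robin      :", rr_counts)
```

Under hash partitioning, all 9,000 rows for the dominant account land in a single partition, while round robin spreads the stream evenly regardless of key. Round robin on the stream only produces correct matches if every partition can see every reference row, which is what Entire partitioning on the reference link provides.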