Free Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Actual Exam Questions - Question 1 Discussion

Question No. 1
You have:
DataFrame A: 128 GB of transactions
DataFrame B: 1 GB user lookup table
Which strategy is correct for broadcasting?
Sohail X.
2026-02-09

B. The key is avoiding a shuffle of the large DataFrame A, so broadcasting the smaller DataFrame B is the right move. Options that broadcast A don't make sense given its size.
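To make the "no shuffle on A" point concrete, here is a minimal pure-Python sketch of the idea behind a broadcast hash join. It is not Spark code and not how Spark implements it internally. The function name `broadcast_hash_join`, the `user_id` key, and the sample rows are all made up for illustration.

```python
from collections import defaultdict


def broadcast_hash_join(large_partitions, small_rows, key):
    """Inner-join each partition of the large side against a full copy of the small side."""
    # Build one hash table from the small side. In a real cluster, a copy of
    # this table is what would be shipped to every executor.
    lookup = defaultdict(list)
    for row in small_rows:
        lookup[row[key]].append(row)

    joined_partitions = []
    for partition in large_partitions:
        # Rows of the large side are joined where they already live:
        # nothing is moved between partitions.
        joined = [
            {**big, **small}
            for big in partition
            for small in lookup.get(big[key], [])
        ]
        joined_partitions.append(joined)
    return joined_partitions


# Hypothetical sample data: two partitions of "transactions", one small "users" table.
transactions = [
    [{"user_id": 1, "amount": 10}, {"user_id": 2, "amount": 25}],
    [{"user_id": 1, "amount": 7}, {"user_id": 3, "amount": 40}],
]
users = [{"user_id": 1, "name": "ana"}, {"user_id": 2, "name": "bo"}]

result = broadcast_hash_join(transactions, users, "user_id")
for index, partition in enumerate(result):
    print(index, partition)
```

The output keeps the same two partitions as the input, which is the whole point: only the small side was copied around. In PySpark the corresponding hint is `broadcast()` from `pyspark.sql.functions`, used along the lines of `transactions.join(broadcast(users), "user_id")`, where the DataFrame and column names are again assumptions.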

Arjun Y.
2026-02-03

A vs B? Both push for broadcasting B since it’s smaller, but B is clearer about stopping the big DataFrame A from shuffling. That’s the key win here, so B’s explanation feels more on point.

Arjun Y.
2026-01-31

A/B? Both say broadcast B since it's smaller, but A is vague about "eliminating shuffling itself", which doesn't specify which DataFrame is spared the shuffle. B explicitly says broadcasting B avoids shuffling the big DataFrame A, which is the heavy operation to avoid. Also, C and D can't be right since broadcasting the huge 128 GB A is impractical. So between A and B, B's explanation about cutting down the shuffle on A is more precise and seems correct if broadcasting 1 GB is allowed in the environment.
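A rough back-of-the-envelope comparison of the three strategies, as a sketch only. The 128 GB and 1 GB sizes come from the question. The executor count is a made-up number, and the cost model (a shuffle join moves at most both sides once, a broadcast copies the small side to every executor) is a deliberate simplification. The 10 MB figure is Spark's documented default for `spark.sql.autoBroadcastJoinThreshold`.

```python
GB = 1024 ** 3
MB = 1024 ** 2

SIZE_A = 128 * GB   # transactions (from the question)
SIZE_B = 1 * GB     # user lookup table (from the question)
NUM_EXECUTORS = 20  # hypothetical cluster size, an assumption

# Documented default of spark.sql.autoBroadcastJoinThreshold.
AUTO_BROADCAST_DEFAULT = 10 * MB


def shuffle_join_bytes(size_a, size_b):
    # Simplified upper bound: both sides are repartitioned by the join key.
    return size_a + size_b


def broadcast_join_bytes(broadcast_size, num_executors):
    # Simplified: one full copy of the broadcast side per executor.
    return broadcast_size * num_executors


shuffle_cost = shuffle_join_bytes(SIZE_A, SIZE_B)
broadcast_b_cost = broadcast_join_bytes(SIZE_B, NUM_EXECUTORS)
broadcast_a_cost = broadcast_join_bytes(SIZE_A, NUM_EXECUTORS)

# B is far above the default threshold, so Spark would not pick a broadcast
# join automatically under default settings.
needs_explicit_hint = SIZE_B > AUTO_BROADCAST_DEFAULT

print(f"shuffle both sides: {shuffle_cost // GB} GB moved")
print(f"broadcast B:        {broadcast_b_cost // GB} GB moved")
print(f"broadcast A:        {broadcast_a_cost // GB} GB moved")
print(f"explicit hint or raised threshold needed for B: {needs_explicit_hint}")
```

Under these assumptions, broadcasting B moves far less data than shuffling A, and broadcasting A is worse than either. The last line is what the "if broadcasting 1 GB is allowed" caveat comes down to: at 1 GB, B exceeds the default threshold, so the broadcast would have to be requested explicitly or the threshold raised, and each executor needs enough memory to hold its copy.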

Arjun Y.
2026-01-29

A vs B? Both say broadcast B since it’s smaller, but A’s wording about eliminating shuffling itself is a bit vague. B clearly says it stops shuffling the big DataFrame A, which is the main win here.

Arjun Y.
2026-01-28

Actually, C and D can be ruled out since broadcasting the large DataFrame A doesn’t make sense. Between A and B, broadcasting B avoids shuffling the huge A, which is the costly part. So B’s reasoning stands stronger here.

Hassan V.
2026-01-24

A vs B, both say broadcast B, but B’s reasoning about shuffling A feels clearer.

Imran S.
2026-01-21

Broadcasting DataFrame B is best since it's smaller and avoids shuffling DataFrame A, so I'd pick option A.

Imran S.
2026-01-13

Option A makes sense: broadcast the smaller DataFrame B to avoid shuffling the big one.
