Question 3: Free Databricks Machine Learning Associate Actual Exam Questions

Question No. 3

A data scientist wants to parallelize the training of trees in a gradient boosted tree to speed up the
training process. A colleague suggests that parallelizing a boosted tree algorithm can be difficult.
Which of the following describes why?


Osama E.

2026-02-15

D makes the most sense, since gradient boosting builds trees one after another, with each tree fit to the errors of the trees before it. Options A and C don’t really fit: gradient boosting isn’t fundamentally a linear algebra problem, and using all cores for the gradient calculation isn’t what blocks parallelism. B also seems off, because you can batch the data or train on subsets, so you don’t need all of the data at once in every iteration. The main hold-up is the iterative dependency between trees.
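To make the dependency concrete, here is a minimal sketch of gradient boosting, assuming squared-error loss and one-split trees (stumps) on a single feature. The names `fit_stump` and `boost` and the toy data are hypothetical, made up for illustration; this is not how Spark MLlib or any particular library implements it.

```python
# Minimal sketch: gradient boosting with squared-error loss and depth-1 trees.
# fit_stump and boost are hypothetical helper names, not a library API.

def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals; return a predict function."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave points on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # No valid split (all x identical): fall back to a constant prediction.
        mean = sum(residuals) / len(residuals)
        return lambda x: mean
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_trees=20, learning_rate=0.5):
    """Train n_trees stumps one after another; return the base value, trees, and losses."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    trees, losses = [], []
    for _ in range(n_trees):
        # Tree t is fit to the residuals left by trees 0..t-1, so it cannot
        # start until every earlier tree has finished and updated preds.
        residuals = [y - p for y, p in zip(ys, preds)]
        tree = fit_stump(xs, residuals)
        trees.append(tree)
        preds = [p + learning_rate * tree(x) for p, x in zip(preds, xs)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return base, trees, losses


# Toy data (assumed for illustration): y = x^2 on eight points.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [x * x for x in xs]
base, trees, losses = boost(xs, ys)
```

The residuals for each tree are computed from `preds`, which only exists once all earlier trees are trained, so the outer loop over trees has to run in order. The work inside a single tree, such as evaluating the candidate thresholds in `fit_stump`, has no such dependency.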

Arjun I.

2026-01-28

Parallelizing is tough because each tree depends on the previous one, so D.

Arjun I.

2026-01-24

Maybe D makes the most sense since each tree relies on the previous ones’ outputs, so you can't just run all trees at once. The iterative dependency messes with parallelization.

Osama C.

2026-01-17

B/D? B seems off since data can be split across cores, but D fits because each tree depends on previous trees' results, so you can't fully parallelize the sequence itself.

Osama C.

2026-01-16

D imo, because gradient boosting builds trees sequentially and each one depends on the last. A is a bit off since boosting isn’t about linear algebra per se. D captures that dependency chain well.
