Free Databricks Machine Learning Associate Actual Exam Questions - Question 3 Discussion
training process. A colleague suggests that parallelizing a boosted tree algorithm can be difficult.
Which of the following describes why?
D makes the most sense, since gradient boosting builds trees one after another, each fit to the errors of the previous trees. Options A and C don’t really fit: gradient boosting isn’t fundamentally about linear algebra, and computing gradients doesn’t tie up all cores in a way that blocks parallelism. B also seems off because you can batch the data or use subsets, so you don’t need all of the data at once in every iteration. The main hold-up is the iterative dependency between trees.
Parallelizing is tough because each tree depends on the previous one, so D.
Maybe D makes the most sense since each tree relies on the previous ones’ outputs, so you can't just run all trees at once. The iterative dependency messes with parallelization.
B/D? B seems off since data can be split across cores, but D fits because each tree depends on previous trees' results, so you can't fully parallelize the sequence itself.
D imo, because gradient boosting builds trees sequentially and each one depends on the last. A is a bit off since boosting isn’t about linear algebra per se. D captures that dependency chain well.
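For anyone who wants to see the dependency the comments above describe, here is a rough sketch of gradient boosting for squared error using one-split stumps in plain Python. It is not how Spark ML or any other library implements it; `fit_stump`, `boost`, the toy data, and the `n_rounds` and `learning_rate` values are all made up for illustration. The point is only that the residuals for round t cannot be computed until round t-1 has finished.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    best = None
    # Dropping the largest value keeps both sides of every split non-empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Toy gradient boosting for squared error (illustrative values only)."""
    preds = [sum(ys) / len(ys)] * len(xs)  # start from the mean of the targets
    losses = []
    for _ in range(n_rounds):
        # These residuals need the combined predictions of ALL earlier trees,
        # so this loop cannot be run as n_rounds independent jobs.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        preds = [p + learning_rate * stump(x) for x, p in zip(xs, preds)]
        losses.append(sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys))
    return preds, losses


# Hypothetical toy data, just to exercise the loop.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 1.2, 0.9, 3.1, 2.9, 3.0, 5.2, 5.0]
preds, losses = boost(xs, ys)
```

The work inside one round (the threshold search in `fit_stump`) is the part that could in principle be spread across cores or data partitions, which fits the comment that B is off. The outer loop over rounds is the part that stays sequential, which is why D is the answer.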