Free Google Professional-Machine-Learning-Engineer Actual Exam Questions - Question 11 Discussion

Question No. 11

You are training an object detection machine learning model on a dataset that consists of three
million X-ray images, each roughly 2 GB in size. You are using Vertex AI Training to run a custom
training application on a Compute Engine instance with 32 cores, 128 GB of RAM, and 1 NVIDIA P100
GPU. You notice that model training is taking a very long time. You want to decrease training time
without sacrificing model performance. What should you do?

Andre K.

2026-02-19

D imo, distributing the training across multiple GPUs or nodes should cut time significantly without risking compatibility issues like with TPUs. Plus, more memory alone (A) won’t speed up GPU-bound tasks much.

Irfan R.

2026-02-15

B/D? I’m thinking D makes more sense here since the setup only has one GPU, so scaling out with tf.distribute.Strategy would speed things up. Also, the question doesn’t confirm TPU support or that the code can run on TPU, so swapping to a v3-32 TPU might cause compatibility issues or need code changes. Increasing memory alone (A) won’t help much without more GPUs or better parallelism. Early stopping (C) might reduce training time but risks hurting model performance, which the question says we don’t want. So D should be the best bet if distributed training is implemented.
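
To make the "without sacrificing model performance" part concrete: the usual idea behind tf.distribute.Strategy is synchronous data parallelism, where each replica computes gradients on its own shard of the batch and the results are averaged. Below is a toy sketch of that idea in plain Python. It is not TensorFlow code; the one-parameter model, the made-up data, and the helper names (`grad`, `split`, `data_parallel_grad`) are all assumptions for illustration only.

```python
# Toy illustration of synchronous data parallelism (plain Python, not TensorFlow).
# Hypothetical model: y = w * x, with mean squared error as the loss.

def grad(w, batch):
    """Gradient of the mean squared error w.r.t. w over one batch of (x, y) pairs."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)

def split(batch, num_replicas):
    """Split a batch into equal-sized shards, one per replica.
    Assumes len(batch) is divisible by num_replicas."""
    size = len(batch) // num_replicas
    return [batch[i * size:(i + 1) * size] for i in range(num_replicas)]

def data_parallel_grad(w, batch, num_replicas):
    """Each replica computes a gradient on its shard; the shard gradients
    are then averaged, mimicking an all-reduce step."""
    shards = split(batch, num_replicas)
    return sum(grad(w, shard) for shard in shards) / num_replicas

# Eight made-up examples generated from y = 3 * x.
batch = [(float(x), 3.0 * x) for x in range(1, 9)]

single = grad(0.5, batch)                                   # one device, full batch
parallel = data_parallel_grad(0.5, batch, num_replicas=4)   # four replicas, two examples each
print(single, parallel)
```

With equal-sized shards the averaged gradient matches the single-device gradient, so the update step is the same while each replica only processes a fraction of the batch. In a real job the speedup also depends on input pipeline throughput and communication overhead, which this sketch ignores.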

Mason K.

2026-02-14

Not A, increasing memory alone won’t speed up training much without more GPUs or distribution.

Carlos N.

2026-02-10

D, distributing across multiple GPUs or machines can cut training time significantly.

Rizwan U.

2026-02-10

D, distributing the load across multiple machines should help speed up training a lot.

Ali S.

2026-01-23

B tbh seems like a solid pick too since switching to a TPU could speed up training massively, especially with such a heavy workload. The GPU might just not be powerful enough for this scale.

Irfan C.

2026-01-15

Option D seems promising, using distributed training might speed things up.