Free Google Professional-Machine-Learning-Engineer Actual Exam Questions - Question 8 Discussion

Question No. 8
You were asked to investigate failures of a production line component based on sensor readings.
After receiving the dataset, you discover that less than 1% of the readings are positive examples
representing failure incidents. You have tried to train several classification models, but none of them
converge. How should you resolve the class imbalance problem?
Ravi I.
2026-02-21

D, imo. Completely balancing the classes by removing negatives can make training much simpler and faster, especially when positives are so rare. Sure, you lose some data, but you avoid having to carefully tune weights or generate synthetic samples. Sometimes straightforward equal sampling helps models converge because they are no longer overwhelmed by negatives. Plus, with less data, each training run is quicker, so you can experiment more. The key is to keep enough negatives for the model to learn meaningful patterns while making the dataset manageable and balanced.
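
For a rough sense of how much data each undersampling target discards, here is a back-of-the-envelope sketch. It assumes exactly 1% positives (the question only says "less than 1%") and a made-up dataset size of 100,000 readings. `negatives_kept` is a hypothetical helper written for this illustration, not part of any library.

```python
def negatives_kept(n_total, pos_fraction, target_pos_fraction):
    """Return (negatives kept, fraction of negatives dropped) when
    undersampling negatives until positives reach target_pos_fraction."""
    n_pos = round(n_total * pos_fraction)
    n_neg = n_total - n_pos
    # Negatives needed so that n_pos / (n_pos + keep) == target_pos_fraction.
    keep = round(n_pos * (1 - target_pos_fraction) / target_pos_fraction)
    keep = min(keep, n_neg)  # cannot keep more negatives than exist
    return keep, 1 - keep / n_neg


# Assumed numbers: 100,000 readings, 1% of them failures.
for target in (0.5, 0.1):
    keep, dropped = negatives_kept(100_000, 0.01, target)
    print(f"target {target:.0%} positives: keep {keep} negatives, "
          f"drop {dropped:.1%} of them")
```

Under these assumptions, a 50/50 split keeps 1,000 of the 99,000 negatives (about 99% dropped), while a 10% positive target keeps 9,000 (about 91% dropped). Both throw away most of the negatives, so the difference between the options is more about how the remaining examples are weighted than about avoiding data loss entirely.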

Mason B.
2026-02-16

C. Downsampling negatives and upweighting positives strikes a good balance without losing too much information, unlike D, which drops too many negatives, or A, which could add noisy synthetic data. It makes the model focus on real failure signals.

Mason B.
2026-02-16

It’s A because generating more positive examples to reach 10% can help the model learn better patterns without dropping data, unlike D which might throw away too many negatives and hurt performance.

Rizwan D.
2026-02-15

C is the best pick since it balances the classes without completely discarding lots of negative samples like D. Using class weights in the loss function for upweighting positives makes sense here.

David F.
2026-02-14

It’s C. Downsampling negatives while upweighting positives helps balance the dataset without losing too much information, unlike D which risks losing valuable negative data or A which might produce unreliable synthetic positives.

Ahmed G.
2026-02-12

Yeah, I agree that removing negatives completely (D) feels risky. I’d pick C because downsampling while upweighting gives a more balanced training set and still respects the original data distribution.

Shoaib K.
2026-02-10

Option A makes sense too because generating more positive examples can help the model see enough failures without losing negative data, unlike outright removing negatives like in D.

Brian O.
2026-01-29

I don’t think D is the best move since cutting out negatives loses valuable info. I’d go with C—downsampling negatives but upweighting positives in training helps balance without throwing away too much data. It keeps a good mix and lets the model focus more on failure cases, so the learning should be more stable.

Mohammad F.
2026-01-28

It’s A because generating more positive examples to get closer to 10% helps the model see enough failure cases without throwing away any negatives. D feels too extreme since removing negatives wastes tons of data, which isn’t ideal for rare event detection. Also, B is off-topic: it’s about architecture, not imbalance. C could work if upweighting means adjusting loss weights, but the option is less clear about that, and the weighted loss might still reflect the original data distribution. Overall, A offers a straightforward way to address imbalance by increasing positive samples without losing valuable negatives.

David D.
2026-01-23

It’s D, because removing negatives completely balances classes, making model training easier despite losing data.

David D.
2026-01-21

It’s A. Generating more positive examples to reach around 10% helps the model see enough failure cases without discarding real data, unlike D which loses a lot of negatives and might hurt performance.

Arjun I.
2026-01-18

A. Generating more positive examples to reach around 10% seems like a more balanced fix than just cutting negatives (D) or downsampling with upweighting (C). It’s probably about creating synthetic failure data, which helps the model learn better without losing valuable real negative samples. B is off-topic since it’s about changing the model architecture, not handling imbalance. So boosting positives with generation seems the cleanest way to get convergence here.
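
If "generating" positives does mean synthetic data, one naive way to do it is to interpolate between pairs of real positive readings. The sketch below is only an illustration of that idea: it picks random pairs rather than nearest neighbours, so it is not SMOTE, and `interpolate_positives` is a hypothetical helper. It also assumes the sensor features are numeric and that a point between two failure readings is itself a plausible failure, which may not hold for real equipment.

```python
import numpy as np


def interpolate_positives(X_pos, n_new, seed=0):
    """Create n_new synthetic rows, each a random convex combination
    of two randomly chosen real positive rows."""
    X_pos = np.asarray(X_pos, dtype=float)
    if len(X_pos) < 2:
        raise ValueError("need at least two positive examples")
    rng = np.random.default_rng(seed)
    i = rng.integers(0, len(X_pos), size=n_new)  # first endpoint of each pair
    j = rng.integers(0, len(X_pos), size=n_new)  # second endpoint of each pair
    t = rng.random((n_new, 1))                   # interpolation weight in [0, 1)
    return X_pos[i] + t * (X_pos[j] - X_pos[i])


# Toy usage with made-up readings: 5 real failures, 3 features each.
real_failures = np.random.default_rng(1).normal(size=(5, 3))
synthetic = interpolate_positives(real_failures, n_new=20)
print(synthetic.shape)
```

Every synthetic row lies between two real ones, so this adds no information the real positives did not already contain. That is the concern other commenters raise about A producing noisy or unreliable positives.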

Chris J.
2026-01-17

C imo, because downsampling with upweighting helps keep the dataset manageable without losing too much info, unlike D which just drops a ton of negatives. It’s a practical balance between size and class distribution.
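
Here is a minimal sketch of what downsampling with upweighting could look like, using only NumPy. It assumes binary labels (1 = failure, 0 = normal) and follows my understanding of how Google's ML Crash Course describes the technique: the weight goes on the downsampled (negative) class and equals the downsampling factor, so the weighted loss still reflects the original class ratio. `downsample_and_upweight` is a hypothetical helper name, and the 10% target is taken from the answer options as quoted in this thread.

```python
import numpy as np


def downsample_and_upweight(X, y, target_pos_fraction=0.10, seed=0):
    """Keep all positives, randomly keep enough negatives to hit the target
    positive fraction, and return per-example weights for the loss."""
    X, y = np.asarray(X), np.asarray(y)
    pos_idx = np.flatnonzero(y == 1)
    neg_idx = np.flatnonzero(y == 0)
    if len(pos_idx) == 0:
        raise ValueError("no positive examples to balance against")

    rng = np.random.default_rng(seed)
    # Negatives needed so positives make up target_pos_fraction of the sample.
    n_keep = round(len(pos_idx) * (1 - target_pos_fraction) / target_pos_fraction)
    n_keep = min(n_keep, len(neg_idx))
    kept_neg = rng.choice(neg_idx, size=n_keep, replace=False)

    # Each kept negative stands in for `factor` original negatives.
    factor = len(neg_idx) / n_keep
    idx = np.concatenate([pos_idx, kept_neg])
    weights = np.concatenate([np.ones(len(pos_idx)), np.full(n_keep, factor)])

    perm = rng.permutation(len(idx))  # shuffle so classes are interleaved
    return X[idx][perm], y[idx][perm], weights[perm]
```

The returned weights would then be passed to the training routine as per-example weights, for instance through the `sample_weight` argument that Keras `model.fit` and many scikit-learn estimators accept. The model trains on a smaller, more balanced sample, while the weights keep its predicted probabilities calibrated to the real failure rate.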

Andrew E.
2026-01-17

A. Creating more positive examples to reach about 10% sounds like a clearer way to address the imbalance without throwing away data. If it means generating synthetic positives, that could help the model see more failure patterns and avoid being biased towards negatives. Options C and D both involve losing or reweighting data, which might not be ideal if the dataset isn’t huge. Also, B feels off since switching to a convolutional network doesn’t directly fix class imbalance—it’s more about model choice, not data balance. So boosting positive samples seems like the most straightforward fix here.

Andrew E.
2026-01-16

Option C seems safer than D, because downsampling plus upweighting balances classes without losing too much data.

Mason S.
2026-01-15

It’s kinda confusing here, but D seems risky since you’d lose a lot of data. I’m not sure if just downsampling or upweighting (C) is enough either. Anyone else think A might be better to keep the dataset more realistic?
