Free AWS AIF-C01 Certified AI Practitioner Actual Exam Questions - Question 5 Discussion
A company uses an ML model to evaluate security camera footage for potential thefts. The company has discovered that the model disproportionately flags people who are members of a specific ethnic group.
Which type of bias is affecting the model output?
B. If the model flags one ethnic group more often, it could mean the training data didn’t represent that group fairly. When certain groups are underrepresented or overlooked in the sample, the model struggles to generalize and ends up biased. This fits sampling bias because the problem comes from how the data was gathered, not from how features were measured or labeled. Without enough diversity in the training set, the model’s predictions will naturally skew against less represented groups.
C, since observer bias involves human labeling errors that could skew the model’s learning.
This really sounds like sampling bias (B) to me. If the training data didn’t include enough footage of that specific ethnic group, the model wouldn’t learn their typical behavior well, leading to more false flags. Measurement bias (A) would be about faulty sensors or cameras, but here it seems more about who’s in the data than how the data is captured.
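To make the sampling-bias argument concrete, here is a minimal Python sketch of the kind of check you could run, assuming you have labeled records with a group attribute. The records, the group names (`group_a`, `group_b`), and the helper functions are all hypothetical, made up purely for illustration. For simplicity the sketch uses one small dataset for both checks; in practice you would look at representation in the training set and at false flags on a separate evaluation set.

```python
from collections import Counter

# Hypothetical labeled records: (group, model_flagged, actual_theft).
# These values are invented for illustration, not real data.
records = [
    ("group_a", False, False),
    ("group_a", False, False),
    ("group_a", True, True),
    ("group_a", False, False),
    ("group_a", True, False),
    ("group_a", False, False),
    ("group_b", True, False),
    ("group_b", True, False),
]

def group_shares(rows):
    """Fraction of all records that belong to each group."""
    counts = Counter(group for group, _, _ in rows)
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}

def false_positive_rates(rows):
    """Per group: share of non-theft records that the model still flagged."""
    flagged = Counter()
    innocent = Counter()
    for group, was_flagged, was_theft in rows:
        if not was_theft:
            innocent[group] += 1
            if was_flagged:
                flagged[group] += 1
    return {group: flagged[group] / n for group, n in innocent.items()}

shares = group_shares(records)
fpr = false_positive_rates(records)

for group in sorted(shares):
    print(f"{group}: share={shares[group]:.2f}, false positive rate={fpr[group]:.2f}")
```

In this toy data `group_b` makes up only a quarter of the records and gets a much higher false positive rate. That pattern (low representation plus more false flags) is what you would expect to see if sampling bias is the cause, though it doesn't prove it on its own.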
A. Measurement bias fits since the way the camera captures or processes images might distort features of that ethnic group, leading to incorrect flags. It’s not just about data representation but how the inputs are measured.
D imo, since confirmation bias happens when the model's predictions reinforce existing stereotypes or assumptions, which could explain why it flags a specific ethnic group more often.
Could it be B since the training data might underrepresent other groups?
Not C. Observer bias usually involves a person interpreting data differently, but here the issue is with how the model flags people, so it’s more about data or measurement than human judgment.
It’s A since the way the camera records might distort certain features.
This looks like sampling bias messing up the results, so B.