Free Amazon MLS-C01 Actual Exam Questions - Question 7 Discussion
A data science team is working with a tabular dataset that the team stores in Amazon S3. The team
wants to experiment with different feature transformations such as categorical feature encoding.
Then the team wants to visualize the resulting distribution of the dataset. After the team finds an
appropriate set of feature transformations, the team wants to automate the workflow for feature
transformations.
Which solution will meet these requirements with the MOST operational efficiency?
A – Data Wrangler covers all steps smoothly without extra moving parts.
A – integrated workflow in SageMaker is simpler and more efficient overall.
A/B? I like that A keeps everything tightly integrated with SageMaker, which feels smoother for both experimenting and automating the workflow. B is tempting since notebooks give flexibility and QuickSight is great for visuals, but using Lambda for automation might be less seamless than SageMaker Pipelines. Since the question emphasizes operational efficiency, A’s native pipeline export probably reduces overhead more than piecing together Lambda and Step Functions. Plus, Data Wrangler in A is built specifically for this kind of feature work, so it seems like the most straightforward fit.
Probably D. Using Data Wrangler for transformations is solid and QuickSight works well for visualization. Splitting transformations into separate Lambdas gives more modular control, and Step Functions can efficiently automate the whole workflow. It might be a bit more complex than A, but feels more flexible and operationally efficient when you want to scale or adjust parts without redoing the entire pipeline. With A, you’re kind of locked into SageMaker’s pipeline structure, which is good but less flexible if your needs evolve.
Maybe D could work, since Data Wrangler handles the feature work and QuickSight is good for visuals. Splitting transformations into Lambda functions orchestrated by Step Functions might give more control over automation than A.
It’s A because Data Wrangler combines transformation, visualization, and pipeline export all in one place, which is smoother than juggling separate tools and functions like in D.
A. Just wondering: does the question specify whether the team wants a fully managed service for automation, or whether using multiple Lambda functions and Step Functions is acceptable? That might affect which option fits best.
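
For anyone unsure what the question means by "categorical feature encoding" followed by a look at the resulting distribution, here is a minimal plain-Python sketch. It is not how Data Wrangler works internally and uses no AWS APIs; the sample column and the helper names `one_hot_encode` and `distribution` are made up for illustration.

```python
from collections import Counter


def one_hot_encode(values):
    """Return (categories, rows); each row is a 0/1 list aligned with the sorted categories."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column that matches this value
        rows.append(row)
    return categories, rows


def distribution(values):
    """Count how often each category appears (a stand-in for a histogram)."""
    return Counter(values)


# Hypothetical sample column; not taken from the question.
colors = ["red", "green", "red", "blue", "red"]
categories, encoded = one_hot_encode(colors)
counts = distribution(colors)

# Crude text histogram of the category distribution.
for cat in categories:
    print(f"{cat:<6} {'#' * counts[cat]}")
```

Doing this by hand for every candidate transformation, and then wiring the chosen steps into an automated job, is the manual effort the options in the question are being compared on.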