Free Databricks-Certified-Associate-Developer-for-Apache-Spark-3.5 Actual Exam Questions - Question 13 Discussion
A developer needs to produce a Python dictionary using data stored in a small Parquet table, which looks like this:
The resulting Python dictionary must contain a mapping of region -> region id containing the smallest 3 region_id values. Which code fragment meets the requirements? A)
B)
C)
D)
The resulting Python dictionary must contain a mapping of region -> region_id for the smallest 3 region_id values. Which code fragment meets the requirements?
B tbh, since it selects 'region_id' first then 'region', so when converted to dict it'll map region_id to region, which isn't what we want. A is better for region to region_id mapping after sorting ascending.
Option B also sorts by region_id ascending and selects columns in the right order for key-value pairs, so it fits the mapping. Just like with A, dict() might need tuples instead of Rows though.
A. It sorts ascending by region_id and selects region first, so dict() should create the right mapping from region to region_id for the smallest three values.
Maybe B, since it selects region_id first, so dict() will map region_id to region, but that’s the reverse of what we want. So probably A is better as it has region first and sorts ascending.
A vs D? D sorts descending, so it’s out for smallest 3.
I think A looks right since it sorts by region_id ascending and takes 3, then maps region to region_id. B and C switch the order in select, so the dict keys would be region_id, not region. D sorts descending, so it gets the largest IDs, not smallest.