Free Top Amazon/AWS DEA-C01 Actual Exam Questions
Dumps Box (DumpsBox) offers up-to-date practice exam questions for DEA-C01 certification exam which are developed and validated by Amazon – AWS subject domain experts certified in Top Amazon/AWS DEA-C01 . These practice questions are update regularly as we keep an eye on any recent changes in DEA-C01 syllabus, and when there is update our team quickly adjusts the questions. This commitment to providing the best quality exam prep material to certification aspirants is what makes DumpsBox.com the best certification exam prep website. On top of that, our strong, yet strictly moderated, community based feedback keeps the content clean and current. Each question has helpful community discussion that provides it extra perspective and introduces helpful resources for better exam preparation. This also saves students from other outdated practice questions or illicit exam dumps that can have adverse affects on career. Browse through our Top Amazon/AWS DEA-C01 exam questions and pass your exam on first try.
connections that encrypt data in transit to communicate securely with AWS infrastructure that is
managed by a customer.
A data engineer needs to implement a solution to simplify the generation, distribution, and rotation
of digital certificates. The solution must automatically renew and deploy SSL/TLS certificates.
Which solution will meet these requirements with the LEAST operational overhead?
It’s B for sure. ACM handles certs automatically and works with Elastic Load Balancers in front of EC2, so you avoid manual cert management or complex scripting.
Makes sense to skip A since self-managing certs means more manual work and risk. B seems best for auto-renewal without extra scripting hassle. Going with B here.
buckets and Amazon RDS databases. The data catalog must include storage format metadata for the
data in the catalog.
Which solution will meet these requirements with the LEAST effort?
Good point about RDS engine support being crucial. Still, option B makes the most sense since Glue crawlers can recognize formats automatically, reducing manual steps compared to A or D.
B Glue crawlers are designed to detect data formats automatically, so they handle this with minimal manual work. This beats A since it avoids relying on data stewards.
application on an Amazon Elastic Kubernetes Services (Amazon EKS) cluster.
The company wants to set up a robust monitoring system for the application. The company needs to
analyze the logs from the EKS cluster and the application. The company needs to correlate the
cluster's logs with the application's traces to identify points of failure in the whole application
request flow.
Which combination of steps will meet these requirements with the LEAST development effort?
(Select TWO.)
D, because OpenSearch is built for log and trace correlation with minimal extra setup.
Option A is solid because FluentBit and OpenTelemetry are designed to work seamlessly for logs and traces collection in Kubernetes environments, minimizing custom development. For correlating those logs and traces, option D makes the most sense—Amazon OpenSearch can index both data types and supports search queries that help identify failure points across the request flow. The other options either complicate trace collection or add unnecessary processing layers. Using OpenSearch directly for analysis also avoids spinning up extra ETL processes like with Glue, so A and D together meet the requi
customer record data for 7 years after each record is created. The root user also must not have the
ability to delete or modify the data.
A data engineer wants to use S3 Object Lock to secure the data.
Which solution will meet these requirements?
It’s B because compliance mode blocks root user from deleting or modifying objects during retention.
B makes sense since compliance mode locks objects from root changes, but I'm unsure if default retention applies automatically. Does the default retention really cover all objects without setting it individually?
The company has enabled logging and monitoring for all AWS Glue jobs. One of the AWS Glue jobs
begins to fail. A data engineer investigates the error and wants to examine metrics for all individual
stages within the job. How can the data engineer access the stage metrics?
Maybe A here. The Spark UI is designed to show detailed metrics for each stage of a Glue job since Glue runs on Apache Spark. CloudWatch (B) usually shows overall job metrics but not the fine-grained stage details. CloudTrail (C) is more about API calls, so it wouldn’t have stage metrics. Run Insights (D) sounds useful but it depends on the Glue version, which isn’t specified, so it feels less certain. Given that, checking the Spark UI directly seems like the most reliable way to get stage-level info.
A Spark UI gives a direct look at each stage’s performance, so it’s handy if you want detailed metrics without extra features. It’s built for that kind of granular Spark info.
Athena table named cities_world. The cities_world table contains cities that are located around the
world. The data engineer must create a new table named cities_us to contain only the cities from
cities_world that are located in the US.
Which SQL statement should the data engineer use to meet this requirement?

It’s B here because it uses CREATE TABLE AS with the right WHERE filter for US cities and explicitly defines the table location, which Athena needs to know to create the new table properly. Options C and D miss setting the location, so they might fail or default wrongly. Option A doesn’t filter at all. So B covers filtering and table creation details more completely.
C/D? Option C creates the new table with SELECT but doesn’t filter for US cities, while D filters correctly. So D seems more aligned with the requirement to include only US cities.
pipeline needs to send alerts in real time when a step fails or succeeds. The data processing pipeline
uses a combination of Amazon S3 buckets, AWS Lambda functions, and AWS Step Functions state
machines.
A data engineer needs to create a solution to monitor the entire pipeline.
Which solution will meet these requirements?
Maybe D is right since EventBridge directly listens to Step Functions events, making it cleaner than relying on S3 or CloudTrail, which aren’t designed for real-time pipeline alerts.
Maybe D is the best choice here because EventBridge can catch state machine execution updates instantly, which fits the real-time alert need. Options A and B depend on S3 events, which might add delays and extra steps, so they don’t feel as direct. CloudTrail in C is more about logging API calls rather than tracking actual step success or failure statuses, so it could miss the nuances inside the pipeline. EventBridge’s integration with SNS makes it straightforward to send alerts whenever a step fails or succeeds, covering the pipeline more comprehensively.
a. The company uses versioning in some buckets. The company runs several jobs to read and load
data into the buckets.
To help cost-optimize its storage, the company wants to gather information about incomplete
multipart uploads and outdated versions that are present in the S3 buckets.
Which solution will meet these requirements with the LEAST operational effort?
Not A, using AWS CLI would mean manual scripts and more ongoing work. Storage Lens (C) offers a ready dashboard with insights on incomplete uploads and old versions, cutting down on operational effort.
It’s C. Storage Lens gives a broad overview with trends and metrics, including incomplete multipart uploads and outdated versions, all in one dashboard. It requires minimal setup and no manual report handling. Inventory reports (B) are detailed but more manual to process regularly. For quick, ongoing insights with less operational fuss, Storage Lens fits better since it’s designed for cost optimization visibility across buckets without extra scripting or data crunching.
build a data lake on AWS. The company wants to load data warehouse tables into Amazon S3 and
synchronize the tables with incremental data that arrives from the data warehouse every day.
Each table has a column that contains monotonically increasing values. The size of each table is less
than 50 GB. The data warehouse tables are refreshed every night between 1 AM and 2 AM. A
business intelligence team queries the tables between 10 AM and 8 PM every day.
Which solution will meet these requirements in the MOST operationally efficient way?
C fits best here since tables are small and a full refresh nightly is simplest.
Maybe C works best here since the tables are small and refreshed nightly. Just overwriting the entire dataset with DMS full load avoids the complexity of incremental logic and fits the BI team's access timing well.
company to create a new topic in an existing Amazon Managed Streaming for Apache Kafka (Amazon
MSK) cluster.
A few days after the company added the new topic, Amazon CloudWatch raised an alarm on the
RootDiskUsed metric for the MSK cluster.
How should the company address the CloudWatch alarm?
It’s not C because just upgrading instance size without increasing storage won’t fix root disk space issues. Also, restarting the cluster could cause downtime which isn’t needed here. Expanding storage is more direct.
D imo, specifying a target volume for a topic doesn’t affect the broker’s root disk usage directly. The issue is more about overall storage capacity on the brokers, so expanding storage like in A seems more relevant here.
thousand data points each second. The company runs an application to process the usage data in real
time. The company aggregates and stores the data in an Amazon Aurora DB instance.
Sudden drops in network usage usually indicate a network outage. The company must be able to
identify sudden drops in network usage so the company can take immediate remedial actions.
Which solution will meet this requirement with the LEAST latency?
It’s B because streaming data through Kinesis with Apache Flink lets you detect drops instantly without waiting for batch queries. That’s way faster than polling every minute like A or C.
C imo, querying DynamoDB with DAX should be faster than Aurora queries every minute.
analysis.
A data engineering team needs to use Amazon QuickSight to perform the analysis and build
dashboards. A data engineer needs to extract the data from the SaaS applications and make the data
available for QuickSight queries.
Which solution will meet these requirements in the MOST operationally efficient way?
Maybe D is the simplest if the SaaS apps already support exports, but it’s manual and won’t scale well for frequent updates or multiple apps. Not very efficient long term.
It’s C. AppFlow automates data extraction and transfer to S3 without custom coding, so it’s way less hands-on than building Lambda functions or managing Athena connectors. Much better for operational efficiency.
AWS Glue job. The solution must integrate with AWS services.
Which solution will meet these requirements with the LEAST management overhead?
Maybe A is better because Step Functions natively orchestrate both Lambda and Glue jobs without extra infrastructure, unlike Airflow options which add more management overhead. Glue workflows likely don’t call Lambda directly yet.
D imo, running Airflow on EKS is definitely more complex and requires managing the cluster, so it adds way more overhead than a native AWS service like Step Functions.
the data into a central data warehouse to perform analytics. Users need fast response times for
analytics queries.
The company uses Amazon QuickSight in direct query mode to visualize the data. Users normally run
queries during a few hours each day with unpredictable spikes.
Which solution will meet these requirements with the LEAST operational overhead?
What about B? Athena scales automatically and fits unpredictable spikes with no infra management.
A/D? Aurora is more OLTP, not analytics-optimized, so A fits better here.
The company has deployed an ML model on a real-time endpoint in Amazon SageMaker.
The company wants to make real-time inventory recommendations. The company also wants to
make predictions about future inventory needs.
Which solutions will meet these requirements? (Select TWO.)
I think B is a solid pick for real-time recommendations since it directly invokes the SageMaker endpoint. For the future inventory predictions, A seems like the best choice because Redshift ML can build models right inside Redshift, so it handles ongoing predictions without needing to export data elsewhere. C is more about offline training, which isn’t as useful for real-time or seamless future predictions. So I’d go with A and B here.
D imo because SageMaker Autopilot is focused on building models, not dashboards, so that’s out. E is definitely off since Redshift isn’t meant for file storage or archiving reports. That leaves A and B as the only viable options—A for using Redshift ML to handle predictions within the cluster, and B for invoking the SageMaker real-time endpoint directly. Makes sense to combine them to cover both real-time and future inventory needs without unnecessary steps.