Note! Following DAS-C01 Exam is Retired now. Please select the alternative replacement for your Exam Certification.

DAS-C01 Exam Dumps - AWS Certified Data Analytics - Specialty

Go to page:

<< First
Prev
1
2
3
4
5
6
7
8
Next
Last >>

Question # 25

A company has developed an Apache Hive script to batch process data stared in Amazon S3. The script needs to run once every day and store the output in Amazon S3. The company tested the script, and it completes within 30 minutes on a small local three-node cluster.

Which solution is the MOST cost-effective for scheduling and executing the script?

Create an AWS Lambda function to spin up an Amazon EMR cluster with a Hive execution step. Set KeepJobFlowAliveWhenNoSteps to false and disable the termination protection flag. Use Amazon CloudWatch Events to schedule the Lambda function to run daily.

Use the AWS Management Console to spin up an Amazon EMR cluster with Python Hue. Hive, and Apache Oozie. Set the termination protection flag to true and use Spot Instances for the core nodes of the cluster. Configure an Oozie workflow in the cluster to invoke the Hive script daily.

Create an AWS Glue job with the Hive script to perform the batch operation. Configure the job to run once a day using a time-based schedule.

Use AWS Lambda layers and load the Hive runtime to AWS Lambda and copy the Hive script. Schedule the Lambda function to run daily by creating a workflow using AWS Step Functions.

Full Access

Question # 26

A company uses the Amazon Kinesis SDK to write data to Kinesis Data Streams. Compliance requirements state that the data must be encrypted at rest using a key that can be rotated. The company wants to meet this encryption requirement with minimal coding effort.

How can these requirements be met?

Create a customer master key (CMK) in AWS KMS. Assign the CMK an alias. Use the AWS Encryption SDK, providing it with the key alias to encrypt and decrypt the data.

Create a customer master key (CMK) in AWS KMS. Assign the CMK an alias. Enable server-side encryption on the Kinesis data stream using the CMK alias as the KMS master key.

Create a customer master key (CMK) in AWS KMS. Create an AWS Lambda function to encrypt and decrypt the data. Set the KMS key ID in the functionâ€™s environment variables.

Enable server-side encryption on the Kinesis data stream using the default KMS key for Kinesis Data

Streams.

Full Access

Question # 27

A company hosts its analytics solution on premises. The analytics solution includes a server that collects log files. The analytics solution uses an Apache Hadoop cluster to analyze the log files hourly and to produce output files. All the files are archived to another server for a specified duration.

The company is expanding globally and plans to move the analytics solution to multiple AWS Regions in the AWS Cloud. The company must adhere to the data archival and retention requirements of each country where the data is stored.

Which solution will meet these requirements?

Create an Amazon S3 bucket in one Region to collect the log files. Use S3 event notifications to invoke an AWS Glue job for log analysis. Store the output files in the target S3 bucket. Use S3 Lifecycle rules on the target S3 bucket to set an expiration period that meets the retention requirements of the country that contains the Region.

Create a Hadoop Distributed File System (HDFS) file system on an Amazon EMR cluster in one Region to collect the log files. Set up a bootstrap action on the EMR cluster to run an Apache Spark job. Store the output files in a target Amazon S3 bucket. Schedule a job on one of the EMR nodes to delete files that no longer need to be retained.

Create an Amazon S3 bucket in each Region to collect log files. Create an Amazon EMR cluster. Submit steps on the EMR clusterfor analysis. Store the output files in a target S3 bucket in each Region. Use S3 Lifecycle rules on each target S3 bucket to set an expiration period that meets the retention requirements of the country that contains the Region.

Create an Amazon Kinesis Data Firehose delivery stream in each Region to collect log data. Specify an Amazon S3 bucket in each Region as the destination. Use S3 Storage Lens for data analysis. Use S3 Lifecycle rules on each destination S3 bucket to set an expiration period that meets the retention requirements of the country that contains the Region.

Full Access

Question # 28

A marketing company collects clickstream data The company sends the data to Amazon Kinesis Data Firehose and stores the data in Amazon S3 The company wants to build a series of dashboards that will be used by hundreds of users across different departments The company will use Amazon QuickSight to develop these dashboards The company has limited resources and wants a solution that could scale and provide daily updates about clickstream activity

Which combination of options will provide the MOST cost-effective solution? (Select TWO )

Use Amazon Redshift to store and query the clickstream data

Use QuickSight with a direct SQL query

Use Amazon Athena to query the clickstream data in Amazon S3

Use S3 analytics to query the clickstream data

Use the QuickSight SPICE engine with a daily refresh

Full Access

Question # 29

A company has an application that ingests streaming data. The company needs to analyze this stream over a 5-minute timeframe to evaluate the stream for anomalies with Random Cut Forest (RCF) and summarize the current count of status codes. The source and summarized data should be persisted for future use.

Which approach would enable the desired outcome while keeping data persistence costs low?

Ingest the data stream with Amazon Kinesis Data Streams. Have an AWS Lambda consumer evaluate the stream, collect the number status codes, and evaluate the data against a previously trained RCF model. Persist the source and results as a time series to Amazon DynamoDB.

Ingest the data stream with Amazon Kinesis Data Streams. Have a Kinesis Data Analytics application evaluate the stream over a 5-minute window using the RCF function and summarize the count of status codes. Persist the source and results to Amazon S3 through output delivery to Kinesis Data Firehouse.

Ingest the data stream with Amazon Kinesis Data Firehose with a delivery frequency of 1 minute or 1 MB in Amazon S3. Ensure Amazon S3 triggers an event to invoke an AWS Lambda consumer that evaluates the batch data, collects the number status codes, and evaluates the data against a previously trained RCF model. Persist the source and results as a time series to Amazon DynamoDB.

Ingest the data stream with Amazon Kinesis Data Firehose with a delivery frequency of 5 minutes or 1 MB into Amazon S3. Have a Kinesis Data Analytics application evaluate the stream over a 1-minute window using the RCF function and summarize the count of status codes. Persist the results to Amazon S3 through a Kinesis Data Analytics output to an AWS Lambda integration.

Full Access

Question # 30

A company wants to improve the data load time of a sales data dashboard. Data has been collected as .csv files and stored within an Amazon S3 bucket that is partitioned by date. The data is then loaded to an Amazon Redshiftdata warehouse for frequent analysis. The data volume is up to 500 GB per day.

Which solution will improve the data loading performance?

Compress .csv files and use an INSERT statement to ingest data into Amazon Redshift.

Split large .csv files, then use a COPY command to load data into Amazon Redshift.

Use Amazon Kinesis Data Firehose to ingest data into Amazon Redshift.

Load the .csv files in an unsorted key order and vacuum the table in Amazon Redshift.

Full Access

Question # 31

A company plans to store quarterly financial statements in a dedicated Amazon S3 bucket. The financial statements must not be modified or deleted after they are saved to the S3 bucket.

Which solution will meet these requirements?

Create the S3 bucket with S3 Object Lock in governance mode.

Create the S3 bucket with MFA delete enabled.

Create the S3 bucket with S3 Object Lock in compliance mode.

Create S3 buckets in two AWS Regions. Use S3 Cross-Region Replication (CRR) between the buckets.

Full Access

Question # 32

A bank is using Amazon Managed Streaming for Apache Kafka (Amazon MSK) to populate real-time data into a data lake The data lake is built on Amazon S3, and data must be accessible from the data lake within 24 hours Different microservices produce messages to different topics in the cluster The cluster is created with 8 TB of Amazon Elastic Block Store (Amazon EBS) storage and a retention period of 7 days

The customer transaction volume has tripled recently and disk monitoring has provided an alert that the cluster is almost out of storage capacity

What should a data analytics specialist do to prevent the cluster from running out of disk space1?

Use the Amazon MSK console to triple the broker storage and restart the cluster

Create an Amazon CloudWatch alarm that monitors the KafkaDataLogsDiskUsed metric Automatically flush the oldest messages when the value of this metric exceeds 85%

Create a custom Amazon MSK configuration Set the log retention hours parameter to 48 Update the cluster with the new configuration file

Triple the number of consumers to ensure that data is consumed as soon as it is added to a topic.

Full Access

Go to page: