
Professional-Data-Engineer Exam Dumps - Google Professional Data Engineer Exam

Question # 33

You are designing storage for two relational tables that are part of a 10-TB database on Google Cloud. You want to support transactions that scale horizontally. You also want to optimize data for range queries on nonkey columns. What should you do?

A.

Use Cloud SQL for storage. Add secondary indexes to support query patterns.

B.

Use Cloud SQL for storage. Use Cloud Dataflow to transform data to support query patterns.

C.

Use Cloud Spanner for storage. Add secondary indexes to support query patterns.

D.

Use Cloud Spanner for storage. Use Cloud Dataflow to transform data to support query patterns.
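
A minimal sketch of the Cloud Spanner route named in options C and D: a secondary index on a non-key column is what lets Spanner serve range scans on that column efficiently. The instance, database, table, and column names below are hypothetical.

```python
# Sketch: add a secondary index on a non-key column in Cloud Spanner.
# Instance, database, table, and column names are hypothetical.
from google.cloud import spanner

client = spanner.Client()
instance = client.instance("orders-instance")
database = instance.database("orders-db")

# The index lets range predicates such as
# WHERE OrderDate BETWEEN @start AND @end avoid full table scans.
operation = database.update_ddl(
    ["CREATE INDEX OrdersByOrderDate ON Orders(OrderDate)"]
)
operation.result(timeout=300)  # wait for the schema change to complete
```

Queries can then target the index explicitly with a hint of the form SELECT * FROM Orders@{FORCE_INDEX=OrdersByOrderDate} WHERE OrderDate BETWEEN @start AND @end.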

Question # 34

Your infrastructure includes a set of YouTube channels. You have been tasked with creating a process for sending the YouTube channel data to Google Cloud for analysis. You want to design a solution that allows your worldwide marketing teams to perform ANSI SQL and other types of analysis on up-to-date log data from the YouTube channels. How should you set up the log data transfer into Google Cloud?

A.

Use Storage Transfer Service to transfer the offsite backup files to a Cloud Storage Multi-Regional storage bucket as a final destination.

B.

Use Storage Transfer Service to transfer the offsite backup files to a Cloud Storage Regional bucket as a final destination.

C.

Use BigQuery Data Transfer Service to transfer the offsite backup files to a Cloud Storage Multi-Regional storage bucket as a final destination.

D.

Use BigQuery Data Transfer Service to transfer the offsite backup files to a Cloud Storage Regional storage bucket as a final destination.
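
If the BigQuery Data Transfer Service route in options C and D is taken, the transfer is set up as a transfer configuration. A minimal sketch with the Python client follows; the project, dataset, and schedule are hypothetical, and the data source ID and params shown for the YouTube Channel connector are assumptions to verify against the connector documentation.

```python
# Sketch: create a scheduled BigQuery Data Transfer Service config.
# All names are hypothetical; the data_source_id and params for the
# YouTube Channel connector are assumptions -- check the connector docs.
from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()
parent = client.common_project_path("my-project")

transfer_config = bigquery_datatransfer.TransferConfig(
    destination_dataset_id="youtube_analytics",  # dataset the marketing teams query with ANSI SQL
    display_name="YouTube channel reports",
    data_source_id="youtube_channel",            # assumed connector ID
    params={"table_suffix": "_marketing"},       # assumed parameter
    schedule="every 24 hours",
)

created = client.create_transfer_config(parent=parent, transfer_config=transfer_config)
print("Created transfer:", created.name)
```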

Question # 35

An external customer provides you with a daily dump of data from their database. The data flows into Google Cloud Storage (GCS) as comma-separated values (CSV) files. You want to analyze this data in Google BigQuery, but the data could have rows that are formatted incorrectly or corrupted. How should you build this pipeline?

A.

Use federated data sources, and check data in the SQL query.

B.

Enable BigQuery monitoring in Google Stackdriver and create an alert.

C.

Import the data into BigQuery using the gcloud CLI and set max_bad_records to 0.

D.

Run a Google Cloud Dataflow batch pipeline to import the data into BigQuery, and push errors to another dead-letter table for analysis.
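
Option D describes the dead-letter pattern. A minimal Apache Beam (Python SDK) sketch of it, with hypothetical bucket, table, and schema names: well-formed rows land in the main BigQuery table, and rows that fail parsing are appended to a separate errors table for analysis.

```python
# Sketch: Dataflow batch pipeline with a dead-letter output for bad CSV rows.
# Bucket, project, table, and schema names are hypothetical.
import csv
import apache_beam as beam
from apache_beam.pvalue import TaggedOutput

class ParseCsvLine(beam.DoFn):
    def process(self, line):
        try:
            name, value = next(csv.reader([line]))
            yield {"name": name, "value": int(value)}
        except Exception as err:
            # Corrupted or mis-formatted rows go to the dead-letter output.
            yield TaggedOutput("dead_letter", {"raw_line": line, "error": str(err)})

with beam.Pipeline() as p:
    parsed = (
        p
        | "ReadCsv" >> beam.io.ReadFromText("gs://my-bucket/daily_dump/*.csv")
        | "Parse" >> beam.ParDo(ParseCsvLine()).with_outputs("dead_letter", main="rows")
    )

    parsed.rows | "WriteGood" >> beam.io.WriteToBigQuery(
        "my-project:analytics.daily_data",
        schema="name:STRING,value:INTEGER",
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
    )
    parsed.dead_letter | "WriteBad" >> beam.io.WriteToBigQuery(
        "my-project:analytics.daily_data_errors",
        schema="raw_line:STRING,error:STRING",
        write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
    )
```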

Question # 36

You have uploaded 5 years of log data to Cloud Storage. A user reported that some data points in the log data are outside of their expected ranges, which indicates errors. You need to address this issue and be able to run the process again in the future while keeping the original data for compliance reasons. What should you do?

A.

Import the data from Cloud Storage into BigQuery. Create a new BigQuery table, and skip the rows with errors.

B.

Create a Compute Engine instance and create a new copy of the data in Cloud Storage. Skip the rows with errors.

C.

Create a Cloud Dataflow workflow that reads the data from Cloud Storage, checks for values outside the expected range, sets the value to an appropriate default, and writes the updated records to a new dataset in Cloud Storage.

D.

Create a Cloud Dataflow workflow that reads the data from Cloud Storage, checks for values outside the expected range, sets the value to an appropriate default, and writes the updated records to the same dataset in Cloud Storage.
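
Options C and D differ only in where the corrected records are written. A minimal Beam sketch of the repair pass, writing to a new Cloud Storage location so the original files stay untouched for compliance; the paths, record layout, and valid range are hypothetical.

```python
# Sketch: read log lines from Cloud Storage, replace out-of-range values with a
# default, and write the corrected records to a new location. Paths, field
# layout, and the valid range are hypothetical.
import apache_beam as beam

MIN_VALID, MAX_VALID, DEFAULT = 0.0, 100.0, 0.0

def fix_record(line):
    ts, raw_value = line.split(",", 1)
    value = float(raw_value)
    if not MIN_VALID <= value <= MAX_VALID:
        value = DEFAULT  # out-of-range data point replaced with a safe default
    return f"{ts},{value}"

with beam.Pipeline() as p:
    (
        p
        | "Read" >> beam.io.ReadFromText("gs://my-bucket/logs/*.csv")
        | "FixOutOfRange" >> beam.Map(fix_record)
        | "WriteNewDataset" >> beam.io.WriteToText("gs://my-bucket/logs_cleaned/part")
    )
```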

Question # 37

You migrated your on-premises Apache Hadoop Distributed File System (HDFS) data lake to Cloud Storage. The data scientist team needs to process the data by using Apache Spark and SQL. Security policies need to be enforced at the column level. You need a cost-effective solution that can scale into a data mesh. What should you do?

A.

1. Load the data to BigQuery tables.
2. Create a taxonomy of policy tags in Data Catalog.
3. Add policy tags to columns.
4. Process with the Spark-BigQuery connector or BigQuery SQL.

B.

1. Deploy a long-living Dataproc cluster with Apache Hive and Ranger enabled.
2. Configure Ranger for column level security.
3. Process with Dataproc Spark or Hive SQL.

C.

1. Apply an Identity and Access Management (IAM) policy at the file level in Cloud Storage.
2. Define a BigQuery external table for SQL processing.
3. Use Dataproc Spark to process the Cloud Storage files.

D.

1. Define a BigLake table.
2. Create a taxonomy of policy tags in Data Catalog.
3. Add policy tags to columns.
4. Process with the Spark-BigQuery connector or BigQuery SQL.
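
A sketch of the BigLake approach in option D, assuming the Data Catalog taxonomy and policy tag already exist: the DDL defines a BigLake table over the Cloud Storage files through a BigQuery connection, and the schema update attaches a policy tag to a sensitive column. The project, connection, column, and policy-tag resource names are hypothetical.

```python
# Sketch: BigLake table over Cloud Storage plus a column-level policy tag.
# Project, connection, taxonomy, and column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# 1. BigLake table over Parquet files in Cloud Storage, via a BigQuery connection.
client.query(
    """
    CREATE EXTERNAL TABLE `my-project.lake.events`
    WITH CONNECTION `my-project.us.biglake-conn`
    OPTIONS (format = 'PARQUET', uris = ['gs://my-datalake/events/*'])
    """
).result()

# 2. Attach a policy tag (created beforehand in a Data Catalog taxonomy) to a column.
table = client.get_table("my-project.lake.events")
new_schema = []
for field in table.schema:
    if field.name == "email":
        field = bigquery.SchemaField(
            field.name,
            field.field_type,
            mode=field.mode,
            policy_tags=bigquery.PolicyTagList(
                names=["projects/my-project/locations/us/taxonomies/123/policyTags/456"]
            ),
        )
    new_schema.append(field)
table.schema = new_schema
client.update_table(table, ["schema"])
```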

Question # 38

You issue a new batch job to Dataflow. The job starts successfully, processes a few elements, and then suddenly fails and shuts down. You navigate to the Dataflow monitoring interface where you find errors related to a particular DoFn in your pipeline. What is the most likely cause of the errors?

A.

Exceptions in worker code

B.

Job validation

C.

Graph or pipeline construction

D.

Insufficient permissions
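
For background on the answer choices: in a batch Dataflow job, an exception that escapes a DoFn's process() method fails its bundle, and once the retry limit is reached the whole job shuts down, which matches the behavior described in the question. A hedged sketch of guarding per-element work so such failures surface as metrics and logs instead; the transform logic is a stand-in.

```python
# Sketch: guard per-element work in a DoFn so exceptions are counted and logged
# rather than crashing the job. The doubling logic is a hypothetical stand-in.
import logging
import apache_beam as beam
from apache_beam.metrics import Metrics

class SafeTransform(beam.DoFn):
    def __init__(self):
        self.failures = Metrics.counter("pipeline", "element_failures")

    def process(self, element):
        try:
            yield int(element) * 2      # stand-in for the real per-element logic
        except Exception:
            self.failures.inc()         # visible in the Dataflow monitoring UI
            logging.exception("Failed element: %r", element)
```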

Question # 39

You have important legal hold documents in a Cloud Storage bucket. You need to ensure that these documents are not deleted or modified. What should you do?

A.

Set a retention policy. Lock the retention policy.

B.

Set a retention policy. Set the default storage class to Archive for long-term digital preservation.

C.

Enable the Object Versioning feature. Add a lifecycle rule.

D.

Enable the Object Versioning feature. Create a copy in a bucket in a different region.
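
Option A corresponds to the bucket-level retention policy feature. A minimal sketch with the Cloud Storage Python client; the bucket name and retention period are hypothetical, and locking the policy is irreversible.

```python
# Sketch: set and lock a retention policy so objects cannot be deleted or
# overwritten until the retention period expires. Bucket name and period
# are hypothetical.
from google.cloud import storage

client = storage.Client()
bucket = client.get_bucket("legal-hold-docs")

bucket.retention_period = 10 * 365 * 24 * 60 * 60  # ten years, in seconds
bucket.patch()

bucket.lock_retention_policy()  # permanent: the policy can no longer be removed or shortened
```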

Question # 40

You are designing storage for 20 TB of text files as part of deploying a data pipeline on Google Cloud. Your input data is in CSV format. You want to minimize the cost of querying aggregate values for multiple users who will query the data in Cloud Storage with multiple engines. Which storage service and schema design should you use?

A.

Use Cloud Bigtable for storage. Install the HBase shell on a Compute Engine instance to query the Cloud Bigtable data.

B.

Use Cloud Bigtable for storage. Link as permanent tables in BigQuery for query.

C.

Use Cloud Storage for storage. Link as permanent tables in BigQuery for query.

D.

Use Cloud Storage for storage. Link as temporary tables in BigQuery for query.
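
Options C and D differ in whether the BigQuery table over the Cloud Storage CSVs is permanent or temporary. A sketch of the permanent external table variant with the Python client; bucket, dataset, and table names are hypothetical.

```python
# Sketch: permanent BigQuery external table over CSV files in Cloud Storage,
# so the data stays in place and can be queried by BigQuery and other engines.
# Bucket, dataset, and table names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

external_config = bigquery.ExternalConfig("CSV")
external_config.source_uris = ["gs://my-text-data/*.csv"]
external_config.autodetect = True             # infer the schema from the CSV files
external_config.options.skip_leading_rows = 1

table = bigquery.Table("my-project.analytics.text_files")
table.external_data_configuration = external_config
client.create_table(table)  # permanent table; queries read the files in place
```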
