Professional-Data-Engineer Exam Dumps - Google Professional Data Engineer Exam

Looking for practical ways to pass the Google Professional-Data-Engineer exam? You're in the right place. ExamCert offers realistic, trusted exam prep tools to help you earn the credential. ExamCert's Professional-Data-Engineer PDF Study Guide, Testing Engine and Exam Dumps provide relevant, up-to-date study material in an easy-to-learn question-and-answer format. The study tools aim to simplify the exam's complex and confusing concepts and, through the testing engine and exam dumps, let you practice under a realistic exam scenario.

Question # 9

Your company's customer and order databases are often under heavy load. This makes performing analytics against them difficult without harming operations. The databases are in a MySQL cluster, with nightly backups taken using mysqldump. You want to perform analytics with minimal impact on operations. What should you do?

Add a node to the MySQL cluster and build an OLAP cube there.

Use an ETL tool to load the data from MySQL into Google BigQuery.

Connect an on-premises Apache Hadoop cluster to MySQL and perform ETL.

Mount the backups to Google Cloud SQL, and then process the data using Google Cloud Dataproc.

Question # 10

You work for a car manufacturer and have set up a data pipeline using Google Cloud Pub/Sub to capture anomalous sensor events. You are using a push subscription in Cloud Pub/Sub that calls a custom HTTPS endpoint that you have created to take action on these anomalous events as they occur. Your custom HTTPS endpoint keeps getting an inordinate number of duplicate messages. What is the most likely cause of these duplicate messages?

The message body for the sensor event is too large.

Your custom endpoint has an out-of-date SSL certificate.

The Cloud Pub/Sub topic has too many messages published to it.

Your custom endpoint is not acknowledging messages within the acknowledgement deadline.
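For background on how push delivery works: Pub/Sub treats a pushed message as acknowledged when the endpoint returns a success status code (for example 200 or 204) within the acknowledgement deadline, and redelivers it otherwise. The following is a minimal sketch of an endpoint that acknowledges quickly and defers slow work. It is an illustration only: the names `PushHandler`, `pending` and `post_event` are hypothetical, the server runs over plain HTTP on localhost (a real push endpoint must be HTTPS), and the in-memory queue stands in for whatever durable hand-off a real service would use.

```python
import json
import queue
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory hand-off; a real service would likely use something durable.
pending = queue.Queue()


class PushHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        envelope = json.loads(self.rfile.read(length) or b"{}")
        # Hand the event off for later processing instead of doing slow work here.
        pending.put(envelope.get("message", {}))
        # Returning a success status promptly is what acknowledges the message.
        self.send_response(204)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the example quiet


def post_event(url, payload):
    """Send one push-style request and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Local demonstration: start the server on a free port and send one event.
server = HTTPServer(("127.0.0.1", 0), PushHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
status = post_event(
    f"http://127.0.0.1:{server.server_port}/",
    {"message": {"messageId": "1", "data": "c2Vuc29y"}},
)
server.shutdown()
server.server_close()
```

The design point is that the handler does nothing slow before responding; any work that might exceed the deadline happens after the message has been taken off the request path.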

Question # 11

Your company is using wildcard tables to query data across multiple tables with similar names. The SQL statement is currently failing with the following error:

# Syntax error : Expected end of statement but got "-" at [4:11]
SELECT age
FROM
  bigquery-public-data.noaa_gsod.gsod
WHERE
  age != 99
  AND _TABLE_SUFFIX = '1929'
ORDER BY
  age DESC

Which table name will make the SQL statement work correctly?

'bigquery-public-data.noaa_gsod.gsod'

bigquery-public-data.noaa_gsod.gsod*

'bigquery-public-data.noaa_gsod.gsod'*

`bigquery-public-data.noaa_gsod.gsod*`
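As a reference for the syntax involved: in BigQuery standard SQL, a table name that contains a hyphen or a wildcard has to be enclosed in backticks, and `_TABLE_SUFFIX` filters on the part of the table name matched by `*`. The sketch below only builds the query text; the helper names `wildcard_table` and `build_query` are hypothetical, and the `age` column is taken from the question as written rather than verified against the public dataset.

```python
def wildcard_table(project, dataset, prefix):
    """Return a backtick-quoted wildcard table reference."""
    return f"`{project}.{dataset}.{prefix}*`"


def build_query(table_ref, suffix):
    """Assemble the question's query against a wildcard table reference."""
    return (
        "SELECT age\n"
        f"FROM {table_ref}\n"
        "WHERE age != 99\n"
        f"  AND _TABLE_SUFFIX = '{suffix}'\n"
        "ORDER BY age DESC"
    )


table_ref = wildcard_table("bigquery-public-data", "noaa_gsod", "gsod")
query = build_query(table_ref, "1929")
print(query)
```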

Question # 12

Your company is migrating their 30-node Apache Hadoop cluster to the cloud. They want to re-use Hadoop jobs they have already created and minimize the management of the cluster as much as possible. They also want to be able to persist data beyond the life of the cluster. What should you do?

Create a Google Cloud Dataflow job to process the data.

Create a Google Cloud Dataproc cluster that uses persistent disks for HDFS.

Create a Hadoop cluster on Google Compute Engine that uses persistent disks.

Create a Cloud Dataproc cluster that uses the Google Cloud Storage connector.

Create a Hadoop cluster on Google Compute Engine that uses Local SSD disks.

Question # 13

Your startup has never implemented a formal security policy. Currently, everyone in the company has access to the datasets stored in Google BigQuery. Teams have freedom to use the service as they see fit, and they have not documented their use cases. You have been asked to secure the data warehouse. You need to discover what everyone is doing. What should you do first?

Use Google Stackdriver Audit Logs to review data access.

Get the identity and access management (IAM) policy of each table.

Use Stackdriver Monitoring to see the usage of BigQuery query slots.

Use the Google Cloud Billing API to see what account the warehouse is being billed to.

Question # 14

You are creating a model to predict housing prices. Due to budget constraints, you must run it on a single resource-constrained virtual machine. Which learning algorithm should you use?

Linear regression

Logistic classification

Recurrent neural network

Feedforward neural network

Question # 15

Your company handles data processing for a number of different clients. Each client prefers to use their own suite of analytics tools, with some allowing direct query access via Google BigQuery. You need to secure the data so that clients cannot see each other's data. You want to ensure appropriate access to the data. Which three steps should you take? (Choose three.)

Load data into different partitions.

Load data into a different dataset for each client.

Put each client's BigQuery dataset into a different table.

Restrict a client's dataset to approved users.

Only allow a service account to access the datasets.

Use the appropriate identity and access management (IAM) roles for each client's users.

Question # 16

You are building a model to predict whether or not it will rain on a given day. You have thousands of input features and want to see if you can improve training speed by removing some features while having a minimum effect on model accuracy. What can you do?

Eliminate features that are highly correlated to the output labels.

Combine highly co-dependent features into one representative feature.

Instead of feeding in each feature individually, average their values in batches of 3.

Remove the features that have null values for more than 50% of the training records.
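To make the notion of "highly co-dependent features" concrete, here is a small sketch that groups features whose pairwise Pearson correlation exceeds a threshold and replaces each group with the mean of its standardized columns. It is one possible illustration, not a prescribed method: the function names, the 0.95 threshold and the toy weather data are all assumptions, and only strongly positive correlation is grouped so that averaged columns do not cancel out.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def group_correlated(features, threshold=0.95):
    """Greedily group each feature with the first group whose lead feature it tracks."""
    groups = []
    for name, values in features.items():
        for group in groups:
            if pearson(features[group[0]], values) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def zscores(values):
    """Standardize a column so features on different scales can be averaged."""
    n = len(values)
    mean = sum(values) / n
    std = sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std if std else 0.0 for v in values]


def combine(features, groups):
    """Replace each group with the row-wise mean of its standardized columns."""
    combined = {}
    for group in groups:
        columns = [zscores(features[name]) for name in group]
        combined["+".join(group)] = [sum(row) / len(row) for row in zip(*columns)]
    return combined


# Toy data (hypothetical): two temperature readings that move together, plus humidity.
features = {
    "temp_c": [10, 15, 20, 25],
    "temp_f": [50, 59, 68, 77],
    "humidity": [80, 30, 65, 40],
}
groups = group_correlated(features)
combined = combine(features, groups)
print(groups)
```

With the toy data, the two temperature columns end up in one group and humidity stays on its own, so three input features reduce to two.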
