11.11 Special Sale Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: scxmas70

DP-203 Exam Dumps - Data Engineering on Microsoft Azure

Question # 4

You have two Azure SQL databases named DB1 and DB2.

DB1 contains a table named Table 1. Table1 contains a timestamp column named LastModifiedOn. LastModifiedOn contains the timestamp of the most recent update for each individual row.

DB2 contains a table named Watermark. Watermark contains a single timestamp column named WatermarkValue.

You plan to create an Azure Data Factory pipeline that will incrementally upload into Azure Blob Storage all the rows in Table1 for which the LastModifiedOn column contains a timestamp newer than the most recent value of the WatermarkValue column in Watermark.

You need to identify which activities to include in the pipeline. The solution must meet the following requirements:

• Minimize the effort to author the pipeline.

• Ensure that the number of data integration units allocated to the upload operation can be controlled.

What should you identify? To answer, select the appropriate options in the answer area.

Full Access
Question # 5

You have two fact tables named Flight and Weather. Queries targeting the tables will be based on the join between the following columns.

You need to recommend a solution that maximizes query performance.

What should you include in the recommendation?

A.

In the tables use a hash distribution of ArrivalDateTime and ReportDateTime.

B.

In the tables use a hash distribution of ArrivaIAirportID and AirportlD.

C.

In each table, create an identity column.

D.

In each table, create a column as a composite of the other two columns in the table.

Full Access
Question # 6

You have the Azure Synapse Analytics pipeline shown in the following exhibit.

You need to add a set variable activity to the pipeline to ensure that after the pipeline’s completion, the status of the pipeline is always successful.

What should you configure for the set variable activity?

A.

a success dependency on the Business Activity That Fails activity

B.

a failure dependency on the Upon Failure activity

C.

a skipped dependency on the Upon Success activity

D.

a skipped dependency on the Upon Failure activity

Full Access
Question # 7

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are designing an Azure Stream Analytics solution that will analyze Twitter data.

You need to count the tweets in each 10-second window. The solution must ensure that each tweet is counted only once.

Solution: You use a hopping window that uses a hop size of 10 seconds and a window size of 10 seconds.

Does this meet the goal?

A.

Yes

B.

No

Full Access
Question # 8

You have an Azure Storage account that generates 200,000 new files daily. The file names have a format of {YYYY}/{MM}/{DD}/{HH}/{CustomerID}.csv.

You need to design an Azure Data Factory solution that will load new data from the storage account to an Azure Data Lake once hourly. The solution must minimize load times and costs.

How should you configure the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 9

You plan to create a dimension table in Azure Synapse Analytics that will be less than 1 GB.

You need to create the table to meet the following requirements:

• Provide the fastest Query time.

• Minimize data movement during queries.

Which type of table should you use?

A.

hash distributed

B.

heap

C.

replicated

D.

round-robin

Full Access
Question # 10

you have a project in Azure DevOps that contains a repository named Repo1. Repo1 contains a branch named main.

You create a new Azure Synapse workspace named Workspace1.

You need to create data processing pipelines in Workspace1. The solution must meet the following requirements:

• Pipeline artifacts must be stored in Repo1.

• Source control must be provided for pipeline artifacts.

• All development must be performed in a feature branch.

which four actions should you perform in sequence in Synapse Studio? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Full Access
Question # 11

You have an Azure Stream Analytics query. The query returns a result set that contains 10,000 distinct values for a column named clusterID.

You monitor the Stream Analytics job and discover high latency.

You need to reduce the latency.

Which two actions should you perform? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

A.

Add a pass-through query.

B.

Add a temporal analytic function.

C.

Scale out the query by using PARTITION BY.

D.

Convert the query to a reference query.

E.

Increase the number of streaming units.

Full Access
Question # 12

You have an Azure subscription that contains an Azure Synapse Analytics dedicated SQL pool named Pool1. Pool1 receives new data once every 24 hours.

You have the following function.

You have the following query.

The query is executed once every 15 minutes and the @parameter value is set to the current date.

You need to minimize the time it takes for the query to return results.

Which two actions should you perform? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

A.

Create an index on the avg_f column.

B.

Convert the avg_c column into a calculated column.

C.

Create an index on the sensorid column.

D.

Enable result set caching.

E.

Change the table distribution to replicate.

Full Access
Question # 13

You are designing an Azure Stream Analytics job to process incoming events from sensors in retail environments.

You need to process the events to produce a running average of shopper counts during the previous 15 minutes, calculated at five-minute intervals.

Which type of window should you use?

A.

snapshot

B.

tumbling

C.

hopping

D.

sliding

Full Access
Question # 14

You have an Azure Data Lake Storage account that contains a staging zone.

You need to design a dairy process to ingest incremental data from the staging zone, transform the data by executing an R script, and then insert the transformed data into a data warehouse in Azure Synapse Analytics.

Solution: You use an Azure Data Factory schedule trigger to execute a pipeline that copies the data to a staging table in the data warehouse, and then uses a stored procedure to execute the R script.

Does this meet the goal?

A.

Yes

B.

No

Full Access
Question # 15

You are designing a partition strategy for a fact table in an Azure Synapse Analytics dedicated SQL pool. The table has the following specifications:

• Contain sales data for 20,000 products.

• Use hash distribution on a column named ProduclID,

• Contain 2.4 billion records for the years 20l9 and 2020.

Which number of partition ranges provides optimal compression and performance of the clustered columnstore index?

A.

40

B.

240

C.

400

D.

2,400

Full Access
Question # 16

You are designing a real-time dashboard solution that will visualize streaming data from remote sensors that connect to the internet. The streaming data must be aggregated to show the average value of each 10-second interval. The data will be discarded after being displayed in the dashboard.

The solution will use Azure Stream Analytics and must meet the following requirements:

    Minimize latency from an Azure Event hub to the dashboard.

    Minimize the required storage.

    Minimize development effort.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point

Full Access
Question # 17

You are designing an enterprise data warehouse in Azure Synapse Analytics that will contain a table named Customers. Customers will contain credit card information.

You need to recommend a solution to provide salespeople with the ability to view all the entries in Customers.

The solution must prevent all the salespeople from viewing or inferring the credit card information.

What should you include in the recommendation?

A.

data masking

B.

Always Encrypted

C.

column-level security

D.

row-level security

Full Access
Question # 18

You have a trigger in Azure Data Factory configured as shown in the following exhibit.

Use the drop-down menus to select the answer choice that completes each statement based upon the information presented in the graphic.

Full Access
Question # 19

You have an Azure Factory instance named DF1 that contains a pipeline named PL1.PL1 includes a tumbling window trigger.

You create five clones of PL1. You configure each clone pipeline to use a different data source.

You need to ensure that the execution schedules of the clone pipeline match the execution schedule of PL1.

What should you do?

A.

Add a new trigger to each cloned pipeline

B.

Associate each cloned pipeline to an existing trigger.

C.

Create a tumbling window trigger dependency for the trigger of PL1.

D.

Modify the Concurrency setting of each pipeline.

Full Access
Question # 20

Vou have an Azure Data factory pipeline that has the logic flow shown in the following exhibit.

For each of the following statements, select Yes if the statement is true. Otherwise, select No.

NOTE: Each coned selection is worth one point.

Full Access
Question # 21

You need to design the partitions for the product sales transactions. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 22

You need to implement an Azure Synapse Analytics database object for storing the sales transactions data. The solution must meet the sales transaction dataset requirements.

What solution must meet the sales transaction dataset requirements.

What should you do? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 23

You need to ensure that the Twitter feed data can be analyzed in the dedicated SQL pool. The solution must meet the customer sentiment analytics requirements.

Which three Transaction-SQL DDL commands should you run in sequence? To answer, move the appropriate commands from the list of commands to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Full Access
Question # 24

You need to implement versioned changes to the integration pipelines. The solution must meet the data integration requirements.

In which order should you perform the actions? To answer, move all actions from the list of actions to the answer area and arrange them in the correct order.

Full Access
Question # 25

You need to design a data ingestion and storage solution for the Twitter feeds. The solution must meet the customer sentiment analytics requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area

NOTE: Each correct selection b worth one point.

Full Access
Question # 26

You need to integrate the on-premises data sources and Azure Synapse Analytics. The solution must meet the data integration requirements.

Which type of integration runtime should you use?

A.

Azure-SSIS integration runtime

B.

self-hosted integration runtime

C.

Azure integration runtime

Full Access
Question # 27

You need to design a data retention solution for the Twitter feed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

A.

change feed

B.

soft delete

C.

time-based retention

D.

lifecycle management

Full Access
Question # 28

You need to design a data retention solution for the Twitter teed data records. The solution must meet the customer sentiment analytics requirements.

Which Azure Storage functionality should you include in the solution?

A.

time-based retention

B.

change feed

C.

soft delete

D.

Iifecycle management

Full Access
Question # 29

You need to design an analytical storage solution for the transactional data. The solution must meet the sales transaction dataset requirements.

What should you include in the solution? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 30

What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

A.

a server-level virtual network rule

B.

a database-level virtual network rule

C.

a database-level firewall IP rule

D.

a server-level firewall IP rule

Full Access
Question # 31

What should you do to improve high availability of the real-time data processing solution?

A.

Deploy identical Azure Stream Analytics jobs to paired regions in Azure.

B.

Deploy a High Concurrency Databricks cluster.

C.

Deploy an Azure Stream Analytics job and use an Azure Automation runbook to check the status of the job and to start the job if it stops.

D.

Set Data Lake Storage to use geo-redundant storage (GRS).

Full Access
Question # 32

You need to implement the surrogate key for the retail store table. The solution must meet the sales transaction

dataset requirements.

What should you create?

A.

a table that has an IDENTITY property

B.

a system-versioned temporal table

C.

a user-defined SEQUENCE object

D.

a table that has a FOREIGN KEY constraint

Full Access
Question # 33

Which Azure Data Factory components should you recommend using together to import the daily inventory data from the SQL server to Azure Data Lake Storage? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Full Access
Question # 34

What should you recommend using to secure sensitive customer contact information?

A.

data labels

B.

column-level security

C.

row-level security

D.

Transparent Data Encryption (TDE)

Full Access