Databricks-Certified-Data-Engineer-Associate Exam Dumps - Databricks Certified Data Engineer Associate Exam

Go to page:

Question # 25

A data engineer is attempting to drop a Spark SQL table my_table. The data engineer wants to delete all table metadata and data.

They run the following command:

DROP TABLE IF EXISTS my_table

While the object no longer appears when they run SHOW TABLES, the data files still exist.

Which of the following describes why the data files still exist and the metadata files were deleted?

The tableâ€™s data was larger than 10 GB

The tableâ€™s data was smaller than 10 GB

The table was external

The table did not have a location

The table was managed

Full Access

Question # 26

An engineering manager uses a Databricks SQL query to monitor ingestion latency for each data source. The manager checks the results of the query every day, but they are manually rerunning the query each day and waiting for the results.

Which of the following approaches can the manager use to ensure the results of the query are updated each day?

They can schedule the query to refresh every 1 day from the SQL endpoint's page in Databricks SQL.

They can schedule the query to refresh every 12 hours from the SQL endpoint's page in Databricks SQL.

They can schedule the query to refresh every 1 day from the query's page in Databricks SQL.

They can schedule the query to run every 1 day from the Jobs UI.

They can schedule the query to run every 12 hours from the Jobs UI.

Full Access

Question # 27

A data analyst has a series of queries in a SQL program. The data analyst wants this program to run every day. They only want the final query in the program to run on Sundays. They ask for help from the data engineering team to complete this task.

Which of the following approaches could be used by the data engineering team to complete this task?

They could submit a feature request with Databricks to add this functionality.

They could wrap the queries using PySpark and use Pythonâ€™s control flow system to determine when to run the final query.

They could only run the entire program on Sundays.

They could automatically restrict access to the source table in the final query so that it is only accessible on Sundays.

They could redesign the data model to separate the data used in the final query into a new table.

Full Access

Question # 28

A Delta Live Table pipeline includes two datasets defined using STREAMING LIVE TABLE. Three datasets are defined against Delta Lake table sources using LIVE TABLE.

The table is configured to run in Production mode using the Continuous Pipeline Mode.

Assuming previously unprocessed data exists and all definitions are valid, what is the expected outcome after clicking Start to update the pipeline?

All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will persist to allow for additional testing.

All datasets will be updated once and the pipeline will persist without any processing. The compute resources will persist but go unused.

All datasets will be updated at set intervals until the pipeline is shut down. The compute resources will be deployed for the update and terminated when the pipeline is stopped.

All datasets will be updated once and the pipeline will shut down. The compute resources will be terminated.

All datasets will be updated once and the pipeline will shut down. The compute resources will persist to allow for additional testing.

Full Access

Question # 29

Which of the following describes the relationship between Bronze tables and raw data?

Bronze tables contain less data than raw data files.

Bronze tables contain more truthful data than raw data.

Bronze tables contain aggregates while raw data is unaggregated.

Bronze tables contain a less refined view of data than raw data.

Bronze tables contain raw data with a schema applied.

Full Access

Question # 30

Which of the following Structured Streaming queries is performing a hop from a Silver table to a Gold table?

Full Access

Question # 31

A data engineer needs to create a table in Databricks using data from their organization's existing SQLite database. They run the following command:

CREATE TABLE jdbc_customer360

USING

OPTIONS (

url "jdbc:sqlite:/customers.db", dbtable "customer360"

)

Which line of code fills in the above blank to successfully complete the task?

autoloader

org.apache.spark.sql.jdbc

sqlite

org.apache.spark.sql.sqlite

Full Access

Question # 32

A data engineer wants to create a new table containing the names of customers who live in France.

They have written the following command:

CREATE TABLE customersInFrance

_____ AS

SELECT id,

firstName,

lastName

FROM customerLocations

WHERE country = â€™FRANCEâ€™;

A senior data engineer mentions that it is organization policy to include a table property indicating that the new table includes personally identifiable information (Pll).

Which line of code fills in the above blank to successfully complete the task?

COMMENT "Contains PIT

511

"COMMENT PII"

TBLPROPERTIES PII

Full Access