Databricks-Certified-Associate-Developer-for-Apache-Spark-3.0 Exam Dumps - Databricks Certified Associate Developer for Apache Spark 3.0 Exam

Question # 25

The code block shown below should set the number of partitions that Spark uses when shuffling data for joins or aggregations to 100. Choose the answer that correctly fills the blanks in the code block to accomplish this.

__1__.__2__.__3__(__4__, 100)

1. spark

2. conf

3. set

4. "spark.sql.shuffle.partitions"

1. pyspark

2. config

3. set

4. spark.shuffle.partitions

1. spark

2. conf

3. get

4. "spark.sql.shuffle.partitions"

1. pyspark

2. config

3. set

4. "spark.sql.shuffle.partitions"

1. spark

2. conf

3. set

4. "spark.sql.aggregate.partitions"

Question # 26

The code block displayed below contains an error. The code block should arrange the rows of DataFrame transactionsDf using information from two columns in an ordered fashion, arranging first by column value, showing smaller numbers at the top and greater numbers at the bottom, and then by column predError, for which all values should be arranged in the inverse way of the order of items in column value. Find the error.

Code block:

transactionsDf.orderBy('value', asc_nulls_first(col('predError')))

Two orderBy statements with calls to the individual columns should be chained, instead of having both columns in one orderBy statement.

Column value should be wrapped by the col() operator.

Column predError should be sorted in a descending way, putting nulls last.

Column predError should be sorted by desc_nulls_first() instead.

Instead of orderBy, sort should be used.

Question # 27

Which of the following code blocks returns a one-column DataFrame of all values in column supplier of DataFrame itemsDf that do not contain the letter X? In the DataFrame, every value should only be listed once.

Sample of DataFrame itemsDf:

[Sample table not recoverable: only its border rows survived extraction. The borders indicate four columns.]

itemsDf.filter(col(supplier).not_contains('X')).select(supplier).distinct()

itemsDf.select(~col('supplier').contains('X')).distinct()

itemsDf.filter(not(col('supplier').contains('X'))).select('supplier').unique()

itemsDf.filter(~col('supplier').contains('X')).select('supplier').distinct()

itemsDf.filter(!col('supplier').contains('X')).select(col('supplier')).unique()
