Databricks Certified Data Engineer Professional Sample Questions:
1. A data engineer is bringing an existing production Databricks job under asset bundle management and wants to ensure that:
- The job's current configuration is captured as YAML, and all referenced files are included in the bundle project.
- Future changes to the bundle's YAML will update the existing job in place (not create a new job).
How should the data engineer successfully move the production job under asset bundle management?
A) Manually create the YAML configuration for the job in your bundle project, ensuring all settings match the existing job. Then run databricks bundle deploy, which will update the existing job in your workspace.
B) Run databricks bundle generate job --existing-job-id to generate the YAML and download referenced files. Then run databricks bundle deploy, which will always update the existing job automatically.
C) Export the job definition as JSON, convert it to YAML, and place it in your bundle. Then run databricks bundle deploy to update the existing job.
D) Run databricks bundle generate job --existing-job-id to generate the YAML and download referenced files. Then run databricks bundle deployment bind to link the bundle's job resource to the existing job in Databricks.
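The generate-then-bind workflow named in the options can be sketched as an ordered list of CLI invocations. This is a minimal sketch rather than a definitive procedure: the job ID `123456789` and the resource key `my_existing_job` are hypothetical placeholders, and the helper `bundle_adoption_commands` exists only for illustration. It prints the commands instead of running them.

```python
import shlex
from typing import List


def bundle_adoption_commands(job_id: int, resource_key: str) -> List[List[str]]:
    """Return the CLI calls, in order, for adopting an existing job into a bundle."""
    return [
        # 1. Generate YAML for the existing job and download the files it references.
        ["databricks", "bundle", "generate", "job", "--existing-job-id", str(job_id)],
        # 2. Link the bundle's job resource to the existing job so that later
        #    deploys update it in place instead of creating a new job.
        ["databricks", "bundle", "deployment", "bind", resource_key, str(job_id)],
        # 3. Deploy the bundle; with the binding in place this targets the bound job.
        ["databricks", "bundle", "deploy"],
    ]


# Hypothetical job ID and resource key, for illustration only.
for command in bundle_adoption_commands(123456789, "my_existing_job"):
    print(shlex.join(command))
```

The resource key passed to `bind` is assumed to match the job's key in the generated YAML.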
2. A data engineer is ingesting JSON files from cloud object storage using Databricks Auto Loader.
The source folder may occasionally receive large files, which risks overwhelming the stream. To ensure predictable micro-batch sizes, the team wants to throttle ingestion at 1 GB of data scanned per micro-batch, regardless of the number of files. Which Auto Loader configuration should the data engineer use to achieve this?
A) Configure cloudFiles.maxPartitionBytes with 1 GB to limit data in each partition.
B) Configure cloudFiles.maxFilesPerTrigger and estimate the average file size to approximate a size-based throttle of 1 GB.
C) Configure cloudFiles.maxSizePerTrigger with 1 GB to place a limit.
D) Configure cloudFiles.maxBytesPerTrigger with 1 GB to place a limit.
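A size-based throttle on an Auto Loader stream can be sketched as follows. This is a minimal sketch under stated assumptions: the helper names `auto_loader_options` and `read_json_stream`, the paths, and the byte string `"1g"` are illustrative, and `spark` is an existing session passed in by the caller (the `cloudFiles` source is only available on Databricks).

```python
from typing import Dict


def auto_loader_options(schema_location: str, max_bytes: str = "1g") -> Dict[str, str]:
    """Build Auto Loader options that cap the data read per micro-batch."""
    return {
        "cloudFiles.format": "json",
        # Where Auto Loader tracks the inferred schema (hypothetical path supplied by caller).
        "cloudFiles.schemaLocation": schema_location,
        # Size-based throttle: limit the new bytes processed in each trigger.
        "cloudFiles.maxBytesPerTrigger": max_bytes,
    }


def read_json_stream(spark, source_path: str, schema_location: str):
    """Return a streaming DataFrame over source_path using the options above."""
    return (
        spark.readStream.format("cloudFiles")
        .options(**auto_loader_options(schema_location))
        .load(source_path)
    )
```

Databricks documents this limit as a soft maximum, so a single file larger than the limit can still be processed in one micro-batch.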
3. A Delta Lake table representing metadata about content from users has the following schema:
user_id LONG, post_text STRING, post_id STRING, longitude FLOAT, latitude FLOAT, post_time TIMESTAMP, date DATE
Based on the above schema, which column is a good candidate for partitioning the Delta table?
A) date
B) post_id
C) user_id
D) post_time
E) latitude
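A common heuristic for choosing a partition column is to prefer low cardinality, so that each partition holds a substantial amount of data. The sketch below counts distinct values per column over a handful of made-up rows; the sample values and the helper `distinct_counts` are hypothetical and only illustrate the comparison.

```python
from datetime import date, datetime
from typing import Dict, List

# Made-up rows following the schema above (post_text and longitude omitted for brevity).
SAMPLE_ROWS: List[dict] = [
    {"user_id": 1, "post_id": "p1", "latitude": 40.71,
     "post_time": datetime(2024, 1, 1, 9, 0), "date": date(2024, 1, 1)},
    {"user_id": 2, "post_id": "p2", "latitude": 51.51,
     "post_time": datetime(2024, 1, 1, 17, 30), "date": date(2024, 1, 1)},
    {"user_id": 1, "post_id": "p3", "latitude": 40.72,
     "post_time": datetime(2024, 1, 2, 8, 15), "date": date(2024, 1, 2)},
    {"user_id": 3, "post_id": "p4", "latitude": 35.68,
     "post_time": datetime(2024, 1, 2, 22, 45), "date": date(2024, 1, 2)},
]


def distinct_counts(rows: List[dict]) -> Dict[str, int]:
    """Count distinct values per column; fewer distinct values means fewer partitions."""
    return {column: len({row[column] for row in rows}) for column in rows[0]}


for column, count in distinct_counts(SAMPLE_ROWS).items():
    print(f"{column}: {count} distinct value(s)")
```

In PySpark the chosen column would be passed to `DataFrameWriter.partitionBy` when writing the table.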
4. Which approach demonstrates a modular and testable way to use DataFrame.transform for ETL code in PySpark?
A)
B)
C)
D) 
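As an illustration of the pattern the question refers to, a pipeline can be split into small functions that each take and return a DataFrame and are then chained with `DataFrame.transform`. This is a minimal sketch, not a reproduction of any answer option: the function names and the `post_text` and `post_id` columns are assumptions made for the example.

```python
from typing import TYPE_CHECKING, Callable

if TYPE_CHECKING:  # imported for type hints only; defining the steps needs no Spark session
    from pyspark.sql import DataFrame


def drop_empty_posts(df: "DataFrame") -> "DataFrame":
    """Keep only rows that have post text."""
    return df.filter("post_text IS NOT NULL")


def dedupe_posts(df: "DataFrame") -> "DataFrame":
    """Keep one row per post_id."""
    return df.dropDuplicates(["post_id"])


def longer_than(min_chars: int) -> Callable[["DataFrame"], "DataFrame"]:
    """Build a parameterized step that keeps posts of at least min_chars characters."""
    def _step(df: "DataFrame") -> "DataFrame":
        return df.filter(f"length(post_text) >= {min_chars}")
    return _step


def clean_posts(df: "DataFrame") -> "DataFrame":
    """Compose the steps with DataFrame.transform so the pipeline reads top to bottom."""
    return (
        df.transform(drop_empty_posts)
        .transform(dedupe_posts)
        .transform(longer_than(10))
    )
```

Because each step is an ordinary function, it can be unit tested on its own with a small hand-built DataFrame and reused in other pipelines.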
5. A data engineer wants to run unit tests using common Python testing frameworks on Python functions defined across several Databricks notebooks currently used in production. How can the data engineer run unit tests against functions that work with data in production?
A) Run unit tests against non-production data that closely mirrors production
B) Define unit tests and functions within the same notebook
C) Define and unit test functions using Files in Repos
D) Define and import unit test functions from a separate Databricks notebook
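A unit test that exercises a function on small stand-in inputs instead of production data might look like the sketch below. It assumes a pytest-style runner that discovers `test_` functions; `normalize_post_text` and the sample strings are hypothetical.

```python
from typing import List


def normalize_post_text(text: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(text.split())


# Small hand-written inputs standing in for production data.
SAMPLE_POSTS: List[str] = ["  hello   world ", "line one\nline two", ""]


def test_normalize_post_text() -> None:
    """Check the function against the sample inputs only; no production tables are read."""
    cleaned = [normalize_post_text(post) for post in SAMPLE_POSTS]
    assert cleaned == ["hello world", "line one line two", ""]
```

Keeping the function free of notebook state makes it callable from any test runner with whatever sample data the team maintains.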
Solutions:
| Question # 1 Answer: D | Question # 2 Answer: D | Question # 3 Answer: A | Question # 4 Answer: A | Question # 5 Answer: A |
