Databricks • DCMLEP
Validates advanced expertise in designing and managing enterprise-scale machine learning solutions on Databricks, covering scalable model development with distributed training, MLOps practices including testing and deployment with Databricks Asset Bundles, and model monitoring with Lakehouse Monitoring.
Questions: 622
Duration: 120 minutes
Passing Score: 70%
Difficulty: Professional
Last Updated: Feb 2026
The Databricks Certified Machine Learning Professional certification validates advanced expertise in designing, implementing, and managing enterprise-scale machine learning solutions on the Databricks Lakehouse Platform. It covers the full spectrum of production ML engineering: building scalable pipelines with SparkML, implementing distributed training and hyperparameter tuning using Ray and Optuna, and leveraging advanced MLflow capabilities such as nested runs, custom metrics, model flavors, PyFunc custom models, and Model Registry webhooks.
The certification also emphasizes modern MLOps practices, including ML pipeline testing strategies (unit and integration tests), environment management via Databricks Asset Bundles (DABs) for infrastructure-as-code, automated retraining workflows, and production monitoring with Lakehouse Monitoring for detecting feature drift, label drift, prediction drift, and concept drift. Model deployment topics include blue-green and canary deployment strategies, custom model serving endpoints, and rollout management through Databricks Model Serving and the MLflow Deployments SDK. The exam was updated in September 2025 to consolidate its structure into three core domains.
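As an illustration of what detecting feature drift involves, the sketch below computes a Population Stability Index (PSI) between a baseline sample and a current sample of one numeric feature. This is a generic, standard-library-only technique chosen for illustration; it is not the Lakehouse Monitoring API, and the function name, bin count, and smoothing constant are all assumptions made for this example.

```python
import math
from typing import Sequence


def population_stability_index(
    baseline: Sequence[float],
    current: Sequence[float],
    bins: int = 10,      # assumed bin count, not a platform default
    eps: float = 1e-6,   # assumed floor to avoid log(0) on empty bins
) -> float:
    """Return the PSI of `current` relative to `baseline` for one numeric feature."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")
    lo, hi = min(baseline), max(baseline)
    if hi == lo:
        raise ValueError("baseline must contain at least two distinct values")
    width = (hi - lo) / bins

    def bucket_shares(values: Sequence[float]) -> list:
        # Equal-width bins are derived from the baseline range; values that
        # fall outside that range are clamped into the first or last bin.
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        return [max(c / len(values), eps) for c in counts]

    base_shares = bucket_shares(baseline)
    curr_shares = bucket_shares(current)
    # Every term is non-negative, so PSI is 0 only when the bin shares match.
    return sum(
        (c - b) * math.log(c / b) for b, c in zip(base_shares, curr_shares)
    )


# Hypothetical data: a uniform baseline and a copy shifted upward by 5.
baseline_sample = [i / 100 for i in range(1000)]
shifted_sample = [x + 5 for x in baseline_sample]

psi_same = population_stability_index(baseline_sample, baseline_sample)
psi_shifted = population_stability_index(baseline_sample, shifted_sample)
```

An often-cited rule of thumb treats a PSI above roughly 0.25 as a significant shift, but that threshold is a convention rather than a platform setting and should be tuned per feature.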
This certification is designed for senior ML engineers, MLOps engineers, and data scientists with at least one year of hands-on experience building and operationalizing machine learning workflows on Databricks. It is appropriate for professionals who work at enterprise scale—managing multi-environment ML deployments, automated retraining pipelines, and production monitoring—rather than those focused solely on model experimentation.
Typical candidates hold roles such as Machine Learning Engineer, MLOps Engineer, Senior Data Scientist, or ML Platform Engineer. Those already holding the Databricks Certified Machine Learning Associate credential who want to demonstrate deeper production-level expertise are also a natural fit for this exam.
There are no formal prerequisites for registering for this exam. However, Databricks strongly recommends at least one year of hands-on experience performing the advanced ML engineering tasks outlined in the official exam guide. Candidates are expected to have practical familiarity with the Databricks platform, Apache Spark and SparkML, MLflow experiment tracking and model registry, and Python-based ML workflows.
Databricks recommends completing the instructor-led courses 'Machine Learning at Scale' and 'Advanced MLOps on Databricks' before attempting the exam. Candidates who hold the Databricks Certified Machine Learning Associate credential will find that foundational knowledge helpful, though it is not a prerequisite.
The exam consists of 59 scored multiple-choice questions to be completed within 120 minutes. There are no hands-on labs or interactive coding tasks. Many questions are scenario-based, presenting real-world Databricks ML workflows and asking candidates to select the most appropriate approach. The exam may also include a small number of unscored items collected for statistical research purposes. These items are not identified on the exam form and do not affect the final score, and additional time is factored in to accommodate them.
The exam is delivered online through Databricks' exam delivery platform and costs USD $200. A passing score of 70% is required. Certification is valid for two years, after which recertification requires retaking the current version of the exam. The current version of the exam is the September 2025 edition.
Earning this certification positions ML engineers and data scientists for senior and staff-level roles that require end-to-end ownership of production ML systems. Job titles commonly associated with this credential include Senior Machine Learning Engineer, MLOps Engineer, ML Platform Engineer, and AI/ML Architect. Organizations adopting the Databricks Lakehouse Platform at scale—particularly in finance, healthcare, retail, and technology sectors—actively seek professionals who can demonstrate validated expertise in production ML workflows rather than just model development.
1. An MLOps engineer is configuring a production Model Serving endpoint for a classification model. The endpoint must guarantee availability during deployments and handle at least 16 concurrent requests. Which configuration meets these requirements with minimum cost? (Select one!)
2. An ML engineer is migrating models from the workspace Model Registry to Unity Catalog. In the legacy workspace registry, they used stages to manage model deployments with multiple versions in Production stage simultaneously. How should they replicate this pattern in Unity Catalog? (Select one!)
3. A data science team is using MLflow autolog for TensorFlow model training. They want to log model checkpoints and training metrics but need to disable logging of input examples due to data privacy concerns and memory constraints. They also want to ensure model signatures are still captured. Which autolog configuration should they use? (Select one!)
4. An MLOps engineer needs to evaluate a multiclass classification model built with SparkML. The business stakeholder wants to optimize for overall classification accuracy across all classes. Which evaluator and metric combination should they use? (Select one!)
5. A machine learning team is using SparkML to build a classification pipeline. They need to encode a categorical feature city with values San Francisco, New York, and Los Angeles before feeding it to a RandomForestClassifier. The city feature has high cardinality and some cities appear infrequently. Which parameter setting in StringIndexer will handle unseen cities during prediction without causing errors? (Select one!)