Databricks is one of the most powerful data platforms available. It combines a data platform and an AI platform into the Databricks Data Intelligence Platform, which enables organizations to capture the full potential of their data and their data teams.

To help users make the most of its many features, Databricks offers free overview courses to partners and customers in the Databricks Academy. These courses are linked to paid certifications targeting different roles, including ML professionals. Together, the courses and certification paths are a great way of learning more about Databricks.

There are two relevant Databricks certifications for Machine Learning:

Machine Learning Associate
Machine Learning Professional

These are often pursued by Machine Learning Engineers (of all flavours, including MLOps) and Data Scientists who want to prove their Databricks expertise and acquire a more comprehensive view of what the platform offers.

This was the case for me. I am a Machine Learning Engineer. While preparing for the Machine Learning Associate certification, I noticed a lack of good public material for candidates preparing for it. I have since passed the exam, and I have written this article to help you earn your badge too.

This article covers relevant topics for the Associate certification. This is not an introduction to Machine Learning or Databricks; rather, this is a topic review for the Databricks Machine Learning Associate certification. Be sharp on the following topics before taking the exam.

The Exam

To obtain the Databricks Machine Learning Associate certification, you need to pass an online-proctored exam. The certification is intended for professionals with at least six months of experience with the platform, and it is valid for two years.

In under 90 minutes, you will need to achieve a 70% mark on 45 questions. Some of these questions focus on code and the Databricks platform, others on modelling and theoretical Machine Learning concepts.

The exam is only available in English.

It is divided into four sections with the following weights and topics:

Databricks Machine Learning (29%): Databricks ML, Databricks Runtime for Machine Learning, AutoML, Feature Store, Managed MLflow
ML Workflows (29%): Exploratory Data Analysis, Feature Engineering, Training, Evaluation and Selection
Spark ML (33%): Distributed ML Concepts, Spark ML Modelling APIs, Hyperopt, Pandas API on Spark, Pandas UDFs/Function APIs
Scaling ML Models (9%): Model Distribution, Ensembling Distribution
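To make these numbers more concrete, here is a minimal sketch of the arithmetic behind them. It assumes that every question counts equally towards the final mark and that the section weights translate proportionally into question counts. Neither is stated above, so treat the per-section counts as rough estimates. The function name exam_breakdown is my own, purely for illustration.

```python
import math

# Figures taken from the exam description above.
TOTAL_QUESTIONS = 45
TIME_LIMIT_MINUTES = 90
PASS_MARK_PERCENT = 70

# Section weights in percent, as listed above.
SECTION_WEIGHTS = {
    "Databricks Machine Learning": 29,
    "ML Workflows": 29,
    "Spark ML": 33,
    "Scaling ML Models": 9,
}


def exam_breakdown():
    """Return (minimum correct answers, minutes per question, approx. questions per section).

    Assumption: all questions are weighted equally, and section weights
    map proportionally to question counts.
    """
    # Round up: you cannot answer a fraction of a question correctly.
    min_correct = math.ceil(TOTAL_QUESTIONS * PASS_MARK_PERCENT / 100)
    minutes_per_question = TIME_LIMIT_MINUTES / TOTAL_QUESTIONS
    approx_questions = {
        section: round(TOTAL_QUESTIONS * weight / 100)
        for section, weight in SECTION_WEIGHTS.items()
    }
    return min_correct, minutes_per_question, approx_questions


if __name__ == "__main__":
    min_correct, minutes_per_question, approx_questions = exam_breakdown()
    print(f"Correct answers needed to pass: {min_correct}")
    print(f"Average time per question: {minutes_per_question:.1f} minutes")
    for section, count in approx_questions.items():
        print(f"{section}: about {count} questions")
```

Under these assumptions, you can afford to miss roughly a dozen questions, and Spark ML is the section where preparation pays off most.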

More details are available in the official Exam Guide.

Databricks ML Associate Exam: 34 useful things to know

Below are all 34 topics, each linking to the relevant section of the blog post, which was originally published on Medium.com.

Databricks Machine Learning

Good luck!

Enroll now for the Databricks Machine Learning Associate certification, and go get your badge. Good luck with your exam!

If you found this article useful, please like, comment, share, or buy me a coffee.