authors: valeriiakuka
description: Learn MLOps principles and take your projects from the notebook to production in 9 weeks
image: images/posts/2024-03-07-mlops-zoomcamp/image9.png
layout: post
subtitle: Learn MLOps principles and take your projects from the notebook to production in 9 weeks
tags: courses, mlops
title: MLOps Zoomcamp

In this article, we take a closer look at the MLOps Zoomcamp, a free nine-week course that covers practical aspects of productionizing ML services — from training and experimenting to model deployment and monitoring. It is perfect for people who plan to work with ML services at any stage.

Course Curriculum

This article covers the following aspects of the course:

  • Key course features
  • Who is the course for?
  • Course curriculum
  • Course project for your portfolio
  • Course assignments
  • Learning in public
  • DataTalks.Club community

The next cohort of the course starts on Monday, 13 May 2024. If you’re ready to join, sign up here{:target="_blank"}.

Why is it important?

MLOps (machine learning operations) is becoming a must-know skill for many data professionals. With all the noise around the topic, it can be difficult to find a single source that covers the basics of each stage of the MLOps cycle and gives you practical knowledge you can apply to your work.

That's why DataTalks.Club created the free MLOps course. It is practical and focused, designed to help you take your projects from the notebook to production.

Key features of MLOps Zoomcamp

  • Comprehensive curriculum: The course covers each part of the MLOps cycle.
  • Hands-on project: A final project lets you apply the skills learned in the course and enhance your portfolio.
  • Diverse materials: Video lectures, code samples, community notes, and weekly homework for practice.
  • Supportive community: A course channel in Slack where you can ask questions and interact with peers and instructors.
  • Expert instructors: Cristian Martinez, Alexey Grigorev, Emeli Dral, and others.

Who is the course for?

This course is for people who want to understand MLOps, the process of putting machine learning code into production:

  • Data scientists
  • ML engineers
  • Software developers

MLOps involves transitioning the raw code from a development environment into a deployed model within a live service, including stages for performance monitoring and problem-solving.

Before we get into the details, it’s important to know what skills you should have to follow the course comfortably.

Here are the main prerequisites for the course:

  • Prior programming experience (at least 1 year)

  • Prior exposure to machine learning (at work or from other courses, e.g. from ML Zoomcamp{:target="_blank"})

  • Being comfortable with the command line

  • Python

  • Docker (you can check ML Zoomcamp{:target="_blank"} for that)

GitHub repository of the course

Course Curriculum

The course curriculum guides you step by step through each stage of the MLOps cycle, from experimentation and model selection to model deployment and monitoring. You’ll spend the first six weeks learning and practicing each part of the MLOps cycle. In the final three weeks (two for development and one for peer review), you will apply your acquired knowledge and skills to develop an end-to-end machine learning project{:target="_blank"}.

Course Curriculum
  • Week 1: Introduction & Prerequisites
  • Week 2: Experiment tracking and model management
  • Week 3: Orchestration and ML Pipelines
  • Week 4: Model Deployment
  • Week 5: Model Monitoring
  • Week 6: Best Practices
  • Weeks 7, 8, 9: Project

Let's quickly go over each week, focusing on the main points and the tech you'll use.

Course opening presentation from the previous cohort

Week 1: Introduction & Prerequisites

Tech: Docker, AWS

Focus: Week 1 introduces the concept of MLOps and why it is needed, and sets up the key tools and technologies you'll use throughout the course.

Week 2: Experiment tracking and model management

Tech: MLflow

Focus: Week 2 covers experiment tracking: storing and organizing relevant information about your experiments, such as the input data, source code, model architecture parameters, and the corresponding model outputs.
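To make the idea of experiment tracking concrete, here is a minimal sketch that uses only the Python standard library. It is not MLflow's API; the `log_run` and `best_run` helpers, the file name `runs.jsonl`, and all values are hypothetical and only illustrate what kind of information a tracking tool records for each run.

```python
import json
import tempfile
import time
from pathlib import Path


def log_run(store_dir, params, metrics, data_path, code_version):
    """Append one experiment run to a JSON-lines file and return the record."""
    record = {
        "timestamp": time.time(),
        "params": params,              # model hyperparameters
        "metrics": metrics,            # model outputs, e.g. validation error
        "data_path": data_path,        # which input data was used
        "code_version": code_version,  # e.g. a git commit hash
    }
    store = Path(store_dir) / "runs.jsonl"
    with store.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")
    return record


def best_run(store_dir, metric):
    """Return the logged run with the lowest value of `metric`."""
    store = Path(store_dir) / "runs.jsonl"
    with store.open(encoding="utf-8") as f:
        runs = [json.loads(line) for line in f]
    return min(runs, key=lambda r: r["metrics"][metric])


# Usage with made-up numbers, written to a temporary directory.
store_dir = tempfile.mkdtemp()
log_run(store_dir, {"max_depth": 5}, {"rmse": 6.9}, "data/train.parquet", "abc123")
log_run(store_dir, {"max_depth": 10}, {"rmse": 6.4}, "data/train.parquet", "abc123")
print(best_run(store_dir, "rmse")["params"])
```

Because every run stores its parameters, data reference, and code version next to its metrics, you can later answer which configuration produced the best model. A dedicated tool adds a UI, artifact storage, and a model registry on top of the same idea.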

Week 3: Orchestration and ML Pipelines

Tech: Mage

Focus: Week 3 focuses on creating a production-ready pipeline for training machine learning models, meaning a pipeline that can be easily reproduced and re-run in a fully automated way.
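As a rough illustration of what "reproducible and re-runnable" means, the sketch below chains a few plain Python functions into one training pipeline. It does not use Mage or any other orchestrator; the step names, the toy data generator, and the one-parameter model are all assumptions made for the example.

```python
import random


def load_data(seed):
    """Stand-in for reading a dataset: noisy points around y = 2x."""
    rng = random.Random(seed)
    return [(x, 2.0 * x + rng.gauss(0, 0.1)) for x in range(1, 21)]


def split(rows):
    """Hold out the last quarter of the rows for validation."""
    cut = len(rows) * 3 // 4
    return rows[:cut], rows[cut:]


def train(rows):
    """Fit y = w * x by least squares (no intercept)."""
    return sum(x * y for x, y in rows) / sum(x * x for x, _ in rows)


def evaluate(w, rows):
    """Mean absolute error on the held-out rows."""
    return sum(abs(y - w * x) for x, y in rows) / len(rows)


def run_pipeline(seed=42):
    """Run every step in order; the same seed gives the same result."""
    train_rows, val_rows = split(load_data(seed))
    w = train(train_rows)
    return {"weight": w, "mae": evaluate(w, val_rows)}


first = run_pipeline(seed=42)
second = run_pipeline(seed=42)
print(first == second)
```

Each step takes explicit inputs and returns explicit outputs, and the only source of randomness is a seeded generator, so the whole run can be repeated without manual intervention. An orchestration tool schedules and monitors steps like these instead of calling them by hand.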

Week 4: Model Deployment

Tech: Flask, Docker, MLflow, Mage, AWS Lambda & AWS Kinesis

Focus: Week 4 introduces three ways to deploy a model (batch, as a web service, and streaming) and demonstrates how to work with each of them.
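The project requirements later in this article name the deployment modes as batch, web service, and streaming. The sketch below shows how one prediction function could be wrapped for each mode. It is a simplified illustration, not the course code: the `predict` formula and the field names are hypothetical, and a real service would sit behind a framework such as Flask or a stream such as Kinesis rather than plain functions.

```python
import json


def predict(features):
    """Hypothetical model: estimates trip duration from distance."""
    return 3.0 * features["distance_km"] + 2.0


def predict_batch(rows):
    """Batch: score a whole dataset at once, e.g. on a schedule."""
    return [predict(row) for row in rows]


def handle_request(body):
    """Web service: score one JSON request and return a JSON response."""
    features = json.loads(body)
    return json.dumps({"duration_min": predict(features)})


def consume_stream(events):
    """Streaming: score events one by one as they arrive."""
    for event in events:
        yield {"id": event["id"], "duration_min": predict(event)}


rows = [{"id": 1, "distance_km": 1.0}, {"id": 2, "distance_km": 4.0}]
print(predict_batch(rows))
print(handle_request('{"distance_km": 2.0}'))
print(list(consume_stream(iter(rows))))
```

The model logic stays in one place, and only the wrapper changes: batch scoring suits periodic jobs, a web service suits on-demand requests, and a stream consumer suits continuous event flows.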

Week 5: Model Monitoring

Tech: Prometheus, Evidently AI, and Grafana

Focus: Week 5 is about monitoring machine learning models, including service health, model performance, data quality and integrity, and data drift and concept drift.
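To give a feel for what a data drift check does, here is a minimal sketch that compares a reference sample with current data using the Population Stability Index (PSI). This is one common drift statistic chosen for illustration; it is not necessarily what Evidently computes, and the bin count, the `1e-6` floor, and the 0.2 alert threshold (a frequently quoted rule of thumb) are assumptions.

```python
import math


def psi(reference, current, bins=5):
    """Population Stability Index between two numeric samples."""
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if all reference values are equal.
    width = (hi - lo) / bins or 1.0

    def shares(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in an edge bin.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref, cur = shares(reference), shares(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


reference = [float(x) for x in range(100)]
shifted = [x + 40 for x in reference]

print(psi(reference, reference))      # identical distributions
print(psi(reference, shifted) > 0.2)  # shifted data exceeds the threshold
```

A monitoring job would run a check like this on each new batch of inputs and raise an alert or trigger retraining when the score crosses the chosen threshold.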

Week 6: Best Practices

Tech: Python, Docker, LocalStack, GitHub Actions

Focus: Week 6 covers best practices such as unit tests, integration tests, code quality checks, and automating deployments with CI/CD and GitHub Actions.
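As a small example of the unit-testing practice mentioned above, the sketch below tests a feature-preparation function with Python's built-in `unittest` module. The `prepare_features` function and its field names are hypothetical and stand in for whatever preprocessing your own service performs.

```python
import unittest


def prepare_features(ride):
    """Hypothetical feature step: combine pickup and drop-off zone IDs."""
    return {
        "route": f"{ride['pickup_zone']}_{ride['dropoff_zone']}",
        "distance_km": float(ride["distance_km"]),
    }


class PrepareFeaturesTest(unittest.TestCase):
    def test_builds_route_and_casts_distance(self):
        ride = {"pickup_zone": 130, "dropoff_zone": 205, "distance_km": "3.5"}
        expected = {"route": "130_205", "distance_km": 3.5}
        self.assertEqual(prepare_features(ride), expected)

    def test_missing_field_raises(self):
        with self.assertRaises(KeyError):
            prepare_features({"pickup_zone": 130})


def run_tests():
    """Run the test case programmatically and report whether it passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PrepareFeaturesTest)
    return unittest.TextTestRunner(verbosity=0).run(suite).wasSuccessful()


print(run_tests())
```

In a CI workflow, the same tests would typically be run with a command such as `python -m unittest` on every push, so a failing test blocks the deployment step.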

Weeks 7, 8, 9: Project

Duration: 2 weeks for development, 1 week for peer review

Objective: The project focuses on applying your acquired skills to build an end-to-end machine learning project from scratch. Completing this hands-on project not only validates your skills but also enhances your portfolio, offering a competitive edge in job searches.

Peer Review: To complete the project, you are required to evaluate projects from at least three of your peers. Failure to do so will result in your project being marked as incomplete. For detailed peer review criteria, check this link{:target="_blank"}.

Project Requirements:

  • Choose a dataset of interest
  • Train a model on the selected dataset and track your experiments
  • Develop a pipeline for model training
  • Deploy the model in a batch, as a web service, or in a streaming format
  • Monitor your model's performance
  • Adhere to best practices

Star history of the MLOps Zoomcamp GitHub repository

To support us, star the repository of the MLOps Zoomcamp. You can do it here{:target="_blank"}.

The course description{:target="_blank"} on GitHub provides a detailed overview of the topics covered each week. You can see the video lectures, slides, code, and community notes for each week of the course to dive into the content. By the end of the course, you will have acquired the fundamental skills needed to take ML projects from the notebook to production.

If you’re ready to join the next cohort of the course, submit this form{:target="_blank"} to register and stay updated.

Course assignments and scoring

Homework and getting feedback

To reinforce your learning, you can submit a homework assignment at the end of each week. Your scores are added to an anonymous leaderboard, creating friendly competition among course members and motivating you to do your best.

The leaderboard with scored homework

For support, we have an FAQ{:target="_blank"} section with quick answers to common questions. If you need more help, our Slack community{:target="_blank"} is always available for technical questions, clarifications, or guidance. Additionally, we host live Q&A sessions called "office hours" where you can interact with instructors and get immediate answers to your questions.

A screenshot of a FAQ document

Learning in public approach

A unique feature is our "learning in public" approach, inspired by Shawn @swyx Wang{:target="_blank"}'s article{:target="_blank"}. We believe that everyone has something valuable to contribute, regardless of their expertise level.

An extract from Shawn @swyx Wang's article about learning in public

Throughout the course, we actively encourage and incentivize learning in public. By sharing your progress, insights, and projects online, you earn additional points for your homework and projects.

Anonymous leaderboard from the previous cohort of the course. On the right, you can see the bonus points for learning in public

This not only demonstrates your knowledge but also builds a portfolio of valuable content. Sharing your work online also helps you get noticed by social media algorithms, reaching a broader audience and creating opportunities to connect with individuals and organizations you may not have encountered otherwise.

DataTalks.Club community

DataTalks.Club has a supportive community of like-minded individuals in our Slack{:target="_blank"}. It is the perfect place to enhance your skills, deepen your knowledge, and connect with peers who share your passion. These connections can lead to lasting friendships, potential collaborations in future projects, and exciting career prospects.

Course channel in our Slack community

Conclusion

The MLOps Zoomcamp offers a comprehensive and hands-on approach to mastering the essentials of machine learning operations. It provides a solid foundation for anyone looking to integrate ML services into real-world applications. Whether you’re looking to enhance your portfolio, boost your career prospects, or simply deepen your understanding of MLOps, this course can help you achieve your goals.

Again, the next cohort starts on Monday, 13 May 2024. You can register{:target="_blank"} for the MLOps Zoomcamp now.