How to Win a Data Science Competition:
Learn from Top Kagglers

If you want to break into competitive data science, then this course is for you!

  • The course is related to the online specialization "Advanced Machine Learning"
  • Flexible Terms
  • 5 weeks (3 credits)
  • Time to completion: 52 hours
  • Online course
  • Certificate
Apply for the specialization

About the Course

Participating in predictive modelling competitions can help you gain practical experience and sharpen your data modelling skills in domains such as credit, insurance, marketing, natural language processing, sales forecasting and computer vision, to name a few. You do this in a competitive setting against thousands of participants, each trying to build the most predictive algorithm. Pushing each other to the limit can result in better performance and smaller prediction errors. Achieving high ranks consistently can help accelerate your career in data science. In this course, you will learn to analyse such predictive modelling tasks and solve them competitively.

Disclaimer: This is not an online machine learning course in the general sense. It teaches you how to build high-ranking solutions against thousands of competitors, with a focus on the practical use of machine learning methods rather than their theoretical underpinnings.

Course Objectives


01

Understand how to solve predictive modelling competitions efficiently and learn which of the skills you gain apply to real-world tasks


02

Learn how to preprocess the data and generate new features from various sources such as text and images


03

Learn advanced feature engineering techniques such as generating mean encodings, using aggregated statistics or finding nearest neighbors to improve your predictions


04

Be able to build reliable cross-validation schemes that help you benchmark your solutions and avoid overfitting or underfitting on unseen (test) data


05

Gain experience in analysing and interpreting data. You will become aware of inconsistencies, high noise levels, errors and other data-related issues such as leakage, and you will learn how to overcome them


06

Acquire knowledge of different algorithms and learn how to tune their hyperparameters efficiently to achieve top performance
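
To give a flavour of objectives 03 and 04, here is a minimal sketch of an out-of-fold mean encoding: each row's category is replaced by the average target computed only on the other folds, which limits target leakage. It is an illustration under stated assumptions, not the course's own material. The function name oof_mean_encode, the toy DataFrame and its columns city and target are all hypothetical.

```python
import pandas as pd
from sklearn.model_selection import KFold


def oof_mean_encode(df, cat_col, target_col, n_splits=5, seed=0):
    """Return an out-of-fold mean encoding of cat_col as a float Series.

    Hypothetical helper written for illustration only.
    """
    encoded = pd.Series(float("nan"), index=df.index, dtype="float64")
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    for fit_idx, enc_idx in kf.split(df):
        fit_part = df.iloc[fit_idx]
        # Category means are computed only on rows outside the current fold,
        # so a row's own target never contributes to its encoding.
        means = fit_part.groupby(cat_col)[target_col].mean()
        mapped = df.iloc[enc_idx][cat_col].map(means)
        # Categories absent from fit_part fall back to fit_part's overall mean.
        fallback = fit_part[target_col].mean()
        encoded.iloc[enc_idx] = mapped.fillna(fallback).to_numpy()
    return encoded


# Toy data (made up for this example).
toy = pd.DataFrame({
    "city": ["a", "a", "a", "a", "b", "b", "b", "b", "c", "c"],
    "target": [1, 0, 1, 1, 0, 0, 1, 0, 1, 1],
})
toy["city_mean_oof"] = oof_mean_encode(toy, "city", "target", n_splits=5)
print(toy)
```

The same fold structure can be reused when validating a model, so that the encoded feature is never computed from the rows it is evaluated on.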

Learning Outcomes

1. Data Analysis

2. Feature Extraction

3. Feature Engineering

4. XGBoost

Course Syllabus

Week 1. Introduction & Recap. Feature Preprocessing and Generation with Respect to Models. Final Project Description

Week 2. Exploratory Data Analysis. Validation. Data Leakages

Week 3. Metrics Optimization. Advanced Feature Engineering I

Week 4. Hyperparameter Optimization. Advanced Feature Engineering II. Ensembling

Week 5. Competition Walkthroughs. Final Project




Teachers
Dmitry Ulyanov

HSE Faculty of Computer Science: Visiting lecturer

Alexander Guschin

HSE Faculty of Computer Science: Visiting lecturer

Mikhail Trofimov

HSE Faculty of Computer Science: Visiting lecturer

Dmitry Altukhov

HSE Faculty of Computer Science: Visiting lecturer

Marios Michailidis

H2O.ai: Research Data Scientist

Prerequisites

Python: work with DataFrames in pandas, plot figures in matplotlib, import and train models from scikit-learn, XGBoost, LightGBM

Machine Learning: basic understanding of linear models, K-NN, random forest, gradient boosting and neural networks

Graduation Document

Earn a Certificate upon completion

 

 

Learning Activities


Lectures

Online


Low-Stakes Assignments

Tests


High-Stakes Assignments

Final project


Cost and Conditions


21 000 ₽

Full access to the learning materials + Graduation document

More: public offer