Introduction to Text Mining with R

In this online course, you will learn about the next big thing in applied analytics: text analysis. The course is self-contained: you will learn everything from basic programming skills to advanced natural language modelling for topic discovery. It takes a problem-oriented approach, meaning that we will not spend too much time on theoretical concepts but will instead focus on applying them to practical problems.

  • The course is related to the online specialization "Network Analytics for Business"
  • Flexible Terms
  • 5 weeks (2 credits)
  • Time to completion: 11 hours
  • Online course
  • Certificate

About the Course

The goal of this online course is to give students an opportunity to learn natural language processing (NLP) methods and then apply them to problems in their own areas of interest.

Each week of the course is accompanied by tests, graded and ungraded programming assignments, and links to additional material for those who want to dig deeper. At the end of the course, you’ll have to complete a project and then review your peers' projects.

This course is heavily tilted toward practical skills. Students will dive into the basics of R for text analysis, the tidy text approach, regular expressions, algorithms for topic modelling, text classification with machine learning and deep learning approaches, and much more. Various synthetic and real-world datasets will help participants see how to apply these techniques to extract insights from user reviews, social media posts, and short product descriptions. This distance learning opportunity is brought to you by HSE University, one of the top think tanks in Russia, and is taught by instructors experienced in using text analysis for business-oriented projects.
The online course consists of short pre-recorded lectures, 5 to 15 minutes in length.
Each week will have a graded test with 10 to 15 questions. At the end of the last week, students will have to complete a project utilising the skills learned in the course, and then review and grade the projects of their peers.

Course Objectives


01

To train and evaluate unsupervised learning models on text data


02

To equip students with the necessary knowledge and skills for analysing text data with the R programming language


03

To learn natural language processing (NLP) methods and then apply them

Learning Outcomes

1. Use the R programming language to work with both structured and unstructured text data

2. Prepare text data for analysis

3. Interpret the results of unsupervised and supervised modelling

4. Apply both supervised and unsupervised machine learning techniques

Course Syllabus

Week 1. R and RStudio Basics

Week 2. Working with Tidyverse

Week 3. Supervised machine learning with the bag-of-words approach

Week 4. Unsupervised machine learning

Week 5. Final Project




Teacher
Alexander Byzov

Faculty of Social Sciences / School of Sociology: Instructor

Prerequisites

Some knowledge of natural language processing or R programming may make it easier to get into the course materials.

Graduation Document

Earn a Certificate upon completion

 

 

Learning Activities


Lectures

Online


Low-Stakes Assignments

Tests


High-Stakes Assignments

Final project


Cost and Conditions


17 000 ₽

Full access to the learning materials + Graduation document

More: public offer agreement