Beginner track for machine learning

Two books and a video course

Two free textbooks for math and classic machine learning (ML) and a video course on deep learning (DL) make a solid entry track for beginners:

  • MML for math (2020);
  • ISLP for classic ML (2023);
  • DLS for deep learning (2024).

Links to other textbooks and supplementary materials are provided below.

Roadmap

Math   ML       DL                  Subfields and data types
=====  =======  =================   ============================

         +------------------------> Tabular data and time series
         |
MML  -> ISLP  -> deeplearning.ai -> Text and speech (NLP)
(free)  (free)   Deep Learning      Transformers (the T in ChatGPT)
         |       Specialisation     Computer vision (CV)
         |       + any of           Reinforcement learning (RL)
         |       3 free textbooks
         |
        Practical manuals:
        - scipy lectures (free)
        - Muller (paid), Geron (paid) or Burkov (free preview)

Python packages

Math:     ML:             DL:
- numpy   - scikit-learn  - torch
- scipy                   - tensorflow
                          - keras
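
Before starting, it can help to see which of the packages listed above are already installed. The snippet below is a minimal sketch using only the standard library; the grouping dictionary and the helper name `installed` are illustrative choices, not part of any of the packages. Note that scikit-learn is imported as `sklearn` and TensorFlow as `tensorflow`.

```python
from importlib.util import find_spec

# Import names for the packages in the table above.
# The grouping mirrors the Math / ML / DL columns.
PACKAGES = {
    "Math": ["numpy", "scipy"],
    "ML": ["sklearn"],
    "DL": ["torch", "tensorflow", "keras"],
}


def installed(name):
    """Return True if a top-level module with this name can be found."""
    return find_spec(name) is not None


# Map every import name to True (found) or False (missing).
status = {
    name: installed(name)
    for names in PACKAGES.values()
    for name in names
}

for group, names in PACKAGES.items():
    for name in names:
        mark = "ok" if status[name] else "missing"
        print(f"{group:5} {name:12} {mark}")
```

You do not need all three deep learning packages at once; one of them is enough to follow a course.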

Prerequisites

You will need a working knowledge of Python and the ability to work with mathematical concepts and notation from linear algebra and calculus.
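
As a rough self-check of these prerequisites, the sketch below combines a little Python, linear algebra and calculus. It assumes numpy is installed; the matrix `A`, the vector `x` and the function `f` are made-up illustration values. It compares the analytic gradient of the quadratic form f(x) = xᵀAx, which is (A + Aᵀ)x, with a finite-difference estimate. If you can follow why the two agree, you are likely ready for MML Part 1.

```python
import numpy as np

# Illustrative values, chosen only for this example.
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
x = np.array([1.0, -1.0])


def f(v):
    """Quadratic form f(v) = v^T A v."""
    return v @ A @ v


# Calculus result: the gradient of v^T A v is (A + A^T) v.
analytic_grad = (A + A.T) @ x

# Numerical check with central differences along each coordinate axis.
eps = 1e-6
numeric_grad = np.array([
    (f(x + eps * e) - f(x - eps * e)) / (2 * eps)
    for e in np.eye(len(x))
])

gradients_match = np.allclose(analytic_grad, numeric_grad, atol=1e-5)
print("analytic:", analytic_grad)
print("numeric: ", numeric_grad)
print("match:   ", gradients_match)
```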

Core path

  1. Check your math knowledge with Mathematics for Machine Learning (MML) Part 1.
  2. Read chapter 8 “When Models Meet Data” in MML for an introduction to machine learning.
  3. Proceed to the Introduction to Statistical Learning with Python (ISLP) textbook.
  4. Read the scikit-learn documentation on neural network models.
  5. Start Andrew Ng's Deep Learning Specialization.
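
Step 4 bridges classic ML and deep learning, and a small experiment can make that reading more concrete. The sketch below trains a multi-layer perceptron from scikit-learn on the built-in iris dataset. The dataset, the single hidden layer of 16 units, the split ratio and the random seeds are arbitrary choices for illustration, not recommendations from the texts above.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Small built-in dataset: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 30% of the data for evaluation, keeping class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Neural networks train better on standardized inputs,
# so scaling and the classifier are combined in one pipeline.
model = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=0),
)
model.fit(X_train, y_train)

# Mean accuracy on the held-out test set.
accuracy = model.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

The same fit and score interface is used by the classic models covered in ISLP, which is why scikit-learn is a convenient place to meet neural networks before moving to torch or tensorflow.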

Reference texts

Some textbooks are denser than ISLP or the Andrew Ng course; you can use them as references.

For classic machine learning they are Bishop (2006) and Murphy (2022).

For deep learning there are several open textbooks. DLB (2016) is a reference text that enjoys a continuous stream of citations, while d2l and UDL are newer and keep updating their code and content.

About d2l, from its authors:

This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code. The entire book is drafted in Jupyter notebooks, seamlessly integrating exposition figures, math, and interactive examples with self-contained code. Our goal is to offer a resource that could (i) be freely available for everyone; (ii) offer sufficient technical depth to provide a starting point on the path to actually becoming an applied machine learning scientist; (iii) include runnable code, showing readers how to solve problems in practice; (iv) allow for rapid updates, both by us and also by the community at large; (v) be complemented by a forum for interactive discussion of technical details and to answer questions. [arxiv abstract]

What else?

You can supplement the core path above with the following:

Python packages

Tabular data:

Numeric computation:

Visualisation:

Machine learning:

Deep learning:

Does reading these materials make you a machine learning engineer?

Not until you build projects for real tasks on real data with real constraints, which are quite different from textbook examples.

Not in scope

This page makes no recommendations on other skills that are also important for a quantitative modeller or an engineer:

  • Python programming, Linux and cloud computing;
  • data processing, pipelines and model productisation;
  • experiment design and iterative workflows;
  • advanced topics in statistics and machine learning;
  • modelling methods outside machine learning;
  • domain knowledge, business sense and outcomes of ML adoption.

Please refer to the larger MLMW guide for coverage of these topics.