Machine Learning My Way (MLMW) and Models, Data, Tools and Productisation (MDTP) notes


MLMW and MDTP are two collections of notes that describe the world of machine learning from the following angles:

  • modelling concepts and techniques (the “formulas”),
  • data analysis and control systems workflows (the “pipelines”),
  • data acquisition, storage and operations (the “data”),
  • computer science and software tools (the “code”),
  • model and data productisation (the “money”),
  • investor sentiment, societal impact and regulation (the “markets”).

Both collections are still incomplete, yet they come with links to textbooks, reports, summary articles, industry cases and exercises. Only a limited number of passive learning resources such as videos or tutorials made it onto the lists.

Get access and download

The collections are available as a topic list, a longer guide and a website.

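Artifact         Intent                                                                       Link
===============  ===========================================================================  ===============================================
MDTP topic list  A slim list of topics I wish I knew well (models, tools and productisation)  Access granted upon request.
The MLMW guide   A collection of topics with links and quotes.                                Earlier public PDF or view online upon request.
Website          Best of MLMW                                                                 Browse at https://trics.me.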

Changelog




Beginner track for machine learning

Two books and a video course

  • MML for math (2020)
  • ISLP for classic ML (2023)
  • DLS for deep learning (2024)

Two free textbooks for math and classic machine learning (ML) and a video course on deep learning (DL) make a solid entry track for beginners.

Links to other textbooks and supplementary materials are provided below.

Roadmap

Math   ML       DL                  Subfields and data types
=====  =======  =================   ============================

         +------------------------> Tabular data and time series
         |
MML  -> ISLP  -> deeplearning.ai -> Text and speech (NLP)
(free)  (free)   Deep Learning      Transformers (the T in ChatGPT)
         |       Specialisation     Computer vision (CV)
         |       + any of           Reinforcement learning (RL)
         |       3 free textbooks
         |
        Practical manuals:
        - scipy lectures (free)
        - Muller (paid), Geron (paid) or Burkov (free preview)

Python packages

Math:     ML:             DL:
- numpy   - scikit-learn  - torch
- scipy                   - tf
                          - keras

Prerequisites

You will need a working knowledge of Python and the ability to work with mathematical concepts and notation from linear algebra and calculus.

Core path

  1. Check your math knowledge with Mathematics for Machine Learning (MML) Part 1.
  2. Read chapter 8 “When Models Meet Data” in MML for an introduction to machine learning.
  3. Proceed to the Introduction to Statistical Learning with Python (ISLP) textbook.
  4. Read the scikit-learn documentation on neural network models.
  5. Start Andrew Ng’s Deep Learning Specialization.

Reference texts

There are denser textbooks than ISLP or Andrew Ng’s course; you can use them as references.

For classic machine learning they are Bishop (2006) and Murphy (2022).

For deep learning there are several open textbooks:

DLB (2016) is a reference text that enjoys a continuous stream of citations, while d2l and UDL are newer and keep updating their code and content.


This open-source book represents our attempt to make deep learning approachable, teaching readers the concepts, the context, and the code. The entire book is drafted in Jupyter notebooks, seamlessly integrating exposition figures, math, and interactive examples with self-contained code. Our goal is to offer a resource that could (i) be freely available for everyone; (ii) offer sufficient technical depth to provide a starting point on the path to actually becoming an applied machine learning scientist; (iii) include runnable code, showing readers how to solve problems in practice; (iv) allow for rapid updates, both by us and also by the community at large; (v) be complemented by a forum for interactive discussion of technical details and to answer questions. [arxiv abstract]

What else?

You can supplement the core path above with the following:

Python packages

Tabular data:

Numeric computation:

Visualisation:

Machine learning:

Deep learning:

Does reading these materials make you a machine learning engineer?

Not until you build projects for real tasks on real data with real constraints (which will be quite different from textbook examples).

Not in scope

This page makes no recommendations for various skills that are also important for a quantitative modeller or an engineer:

  • Python programming, Linux and cloud computing;
  • data processing, pipelines and model productisation;
  • experiment design and iterative workflows;
  • advanced topics in statistics and machine learning;
  • modelling methods outside machine learning;
  • domain knowledge, business sense and outcomes of ML adoption.

Please refer to the larger MLMW guide for coverage of these topics.

Interviews

randomlyCoding on production pipelines, engineering skills and job roles

randomlyCoding, a head of AI at a startup who has been working in the field for over a decade: “I certainly don’t know everything, but I like to get my feet wet and touch on anything I find interesting. I’ve trained ML models to do all sorts of tasks and will likely have at least heard of most things.”

MLMW: Can one summarize a production pipeline as the following: choosing a business and then a modelling hypothesis – dataset – model selection – training – validation – model rollout followed by business metrics? What are the weak links in this process and where may a pipeline break?

randomlyCoding: In general that’s about on point. I’d say there’s certainly a lot more recursion. For example you might pick a dataset, build a model and train it, only to realize you’re massively overfitting because you don’t have enough data – thus you go looking for a bigger additional dataset. Weak links often occur at either end of the process – you pick a dataset that isn’t suited to your problem and thus end up with a solution that solves a problem you weren’t trying to solve, or the model is 100% perfect but the business case requires inference to happen in real time and it takes 20 minutes based on the size of the model. I’ve also seen cases of trying to extend a model to do more than it was initially designed for. This isn’t always a bad idea but if the person leading this doesn’t understand the underlying model there can often be misalignment between their expectations and reality.
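One way to notice the overfitting mentioned above is to compare accuracy on the training data with accuracy on held-out data. The sketch below is only an illustration and not randomlyCoding's method: the synthetic data, the 20% label noise and the 1-nearest-neighbour "model" (which simply memorises the training set) are all assumptions chosen to keep the example dependency-free.

```python
import random


def make_data(n, rng, noise=0.2):
    """Points on [0, 1] labelled by x > 0.5, with a share of labels flipped."""
    data = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label  # simulate noisy real-world labels
        data.append((x, label))
    return data


def predict_1nn(train, x):
    """Return the label of the closest training point (pure memorisation)."""
    nearest = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


def accuracy(train, data):
    """Share of points in `data` that the memorising model labels correctly."""
    hits = sum(predict_1nn(train, x) == y for x, y in data)
    return hits / len(data)


rng = random.Random(42)
train = make_data(30, rng)    # deliberately small training set
valid = make_data(200, rng)   # held-out data the model never saw

train_acc = accuracy(train, train)
valid_acc = accuracy(train, valid)
gap = train_acc - valid_acc   # a large gap is a warning sign of overfitting

print(f"train={train_acc:.2f} valid={valid_acc:.2f} gap={gap:.2f}")
```

The model scores perfectly on the data it memorised and noticeably worse on held-out data. In a real project the same comparison would be made with the actual model and a proper validation split.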

MLMW: What skills would you expect an ML engineer (MLE) to know? How can a decent econometrician upgrade to an MLE?

randomlyCoding: I would expect any ML engineer to know one of three Python packages that are the core of most ML processes (either pytorch, tensorflow or keras), but on top of that I’d expect familiarity with some domain-specific packages: that might be NLTK if you’re working on natural language processing; it might be scikit-learn if you’re looking at random forests. One thing I would say is usually a must is familiarity with Linux and a cloud provider (AWS, GCP, Azure). You don’t need to know all 3 cloud providers (pick AWS if you don’t know any yet – it has 50% market share) but if you don’t know any of them it’ll be harder to onboard you and your first few weeks would be a lot more overwhelming – even knowing a different one to the one you use at a specific job will help as they all have similar functionality.

MLMW: Who puts an ML model into production? You got the weights after training, the validation stage passed OK, then does it always become a small API? Who wraps a notebook into an API, a designated engineer?

randomlyCoding: Who puts the ML model into production can vary depending on the system in use, it’s often an API but not always. I would expect any ML engineer to at least be able to put together a notebook (or similar) that can be used to run inference on the model; in some cases if the organisation is small enough it will be someone who has directly worked on the model; in other cases they may be using a specific orchestration package that abstracts away this process; in yet other cases it could be hidden behind a message broker. Obviously not all ML models need to be hosted all the time, some are run periodically and they might not require anything more than ingesting a CSV file into a single Python script.
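The last case, a periodic job that ingests a CSV file in a single Python script, can be sketched with the standard library alone. Everything specific here is hypothetical: the column names `id`, `x1`, `x2`, the `predict` function and its fixed weights stand in for a real trained model and a real file on disk.

```python
import csv
import io


def predict(features):
    """Hypothetical stand-in for a trained model: a fixed linear score."""
    weights = [0.4, -0.2]  # assumed values, not from any real model
    bias = 0.1
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def run_batch(csv_text):
    """Read rows from CSV text, score each one, return (id, prediction) pairs."""
    reader = csv.DictReader(io.StringIO(csv_text))
    results = []
    for row in reader:
        features = [float(row["x1"]), float(row["x2"])]
        results.append((row["id"], predict(features)))
    return results


# In a scheduled job the text would come from a file, e.g. open(path).read().
sample = "id,x1,x2\na,1.0,0.5\nb,-1.0,2.0\n"
predictions = run_batch(sample)
print(predictions)  # [('a', 1), ('b', 0)]
```

A script like this needs no hosting at all: a scheduler runs it, it reads the input, writes predictions and exits.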

MLMW: Does a full-stack data scientist role still exist?

randomlyCoding: I think the full-stack data scientist role does still exist, it will always exist as long as there are start-ups that have limited budgets and big ideas. If you’re in a larger team your remit will often be constrained to a specific task, but depending on the organisation your role within that task could change regularly (e.g. today you’re handling data ingestion because the model we’re working on is a transformer and you don’t have much experience with transformers, but tomorrow we’re building a reinforcement learning system and you’re the team’s expert in RL). In most teams I’d expect the architect of the model to also do a fair amount of the modelling itself; anyone doing modelling will have to work closely with the data engineers, etc. I think this means the roles aren’t as well defined as in SWE and I think this is because there’s a lot of trial and error in ML so it’s not as simple as, for example, ingest the data and pass the process on.

MLMW: What is the most unexpected case of a model you thought would not work but it did?

randomlyCoding: Diffusion feels like it shouldn’t work. In general it’s a multi-step process of removing noise from an image until you end up with the image without any noise; but to do that you start with a completely noisy image and then predict a small percentage of the noise that was added (the previous step of noise added) and then subtract that noise from the image. The maths behind it is reasonably simple, but it just feels like it shouldn’t work!
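The noise-then-subtract idea can be illustrated with a toy sketch. It assumes a DDPM-style linear noise schedule (the `beta_start` and `beta_end` values are conventional choices, not from the interview) and replaces the trained noise-prediction network with an oracle that already knows the added noise. Real samplers remove the noise over many small steps using a learned predictor; this sketch only shows that knowing the noise is enough to get the clean signal back.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Forward process: blend the clean signal with Gaussian noise."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    keep = math.sqrt(alpha_bar)
    spread = math.sqrt(1.0 - alpha_bar)
    xt = [keep * x + spread * e for x, e in zip(x0, eps)]
    return xt, eps


def remove_predicted_noise(xt, eps_pred, alpha_bar):
    """Invert the forward blend given an estimate of the added noise."""
    keep = math.sqrt(alpha_bar)
    spread = math.sqrt(1.0 - alpha_bar)
    return [(x - spread * e) / keep for x, e in zip(xt, eps_pred)]


rng = random.Random(0)
x0 = [0.0, 0.5, 1.0, -1.0]            # a tiny "image" of four pixels
alpha_bars = make_alpha_bars(1000)

# After the last step almost none of the original signal is left in xt.
xt, eps = add_noise(x0, alpha_bars[-1], rng)

# Oracle: pass the true noise where a trained network's prediction would go.
x0_hat = remove_predicted_noise(xt, eps, alpha_bars[-1])
```

With a perfect noise estimate `x0_hat` matches `x0` up to floating-point error, which is why training a network to predict the noise is the whole game.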

Video series

MLMW is inherently a text-format guide (books, articles, code) with an exception for these quality videos and podcasts.

Probability and statistics

Beginner

Courses

Reference

Advanced