How to Learn Machine Learning and Deep Learning: a guide for Software Engineers

Introduction

The subject of Artificial Intelligence piques my interest and I’m constantly studying and trying new things in this field.

It is remarkable how the technologies related to Natural Language Processing, Computer Vision and the like have emerged and evolved into solutions used by millions of people every day.

Even though people use the term "Artificial Intelligence", we are still far away from something as advanced as Skynet from the Terminator movies.

The most widely used subfield of AI today is Machine Learning, which in turn has Deep Learning as a subfield that has been growing steeply for quite some time now.

In this guide, I aim to describe a path software engineers can follow to begin understanding how Machine Learning works and how to apply it to their projects.

Yeah, you can just go to Google or Amazon and pick some magical API to do Speech Recognition for you, but there is incredible value in knowing how it works, why it works and, even more, how to build your own API as a service and tune it to your specific needs.

Remember, as a developer, every tool is a new power.

I’ve read, watched and gone through all these resources to the end, and even got a paid certification for some of them. A certification is not necessary to learn, but I find myself more engaged to finish when I have a deadline and an assessment to prove I actually learned the material.

Let’s dive into the topics.

Python

Python is the main language these days when working with Data Science, Machine Learning, and Deep Learning.

If you need a crash course on Python, here is your guide: The Python Guide for Beginners.

The Basics: Math!

Maybe you never had the chance to study some college-level math, or you did study it but you can’t remember most of the stuff because JavaScript and CSS took all the memory of those topics away.

There are 3 topics you must know beforehand, or at least have a decent grasp of, to follow any good material on ML and DL: Linear Algebra, Calculus and Statistics.

If you’d like to go deep into the math needed for ML and DL, you can look for MIT OpenCourseWare classes like Professor Strang’s renowned Linear Algebra class.

I watched it in college in parallel with my regular class and it is very good.

But, let’s face it, most people have neither the time nor the patience for that.

So I will give you the crash course for the 3 topics mentioned above.

Linear Algebra

Just watch the whole series Essence of Linear Algebra from the YouTube channel 3Blue1Brown.

His visual explanations make once-hard concepts incredibly easy!

It covers far less content than Professor Strang’s class, but it’s enough to begin with, and you can go after other topics as you advance in ML and DL.

Calculus

Guess what?

3Blue1Brown also has a whole series on Calculus on YouTube for you to watch for free: Essence of Calculus.

Again, he is very good at giving you the intuition of why and how rather than just throwing random equations in your face.

Statistics

This is a whole field that, in my opinion, you can learn as needed. A good reference is Practical Statistics for Data Scientists: 50 Essential Concepts.

An objective book with some good examples for every concept.

Fast to read too.

As the title implies, it is more suitable for Data Scientists, but understanding some basics of statistics is always good and this is what this book is for.

You won’t become a statistician after reading it, but you will learn some good stuff.

The Bypassed: Machine Learning

Everybody wants to jump straight into Deep Learning and be the cool guy training a single model for a week on a 12GB GPU.

But to get Deep Learning right, you need to go through Machine Learning first!

Start from the beginning

The concepts, the train of thought, the "feeling" of how things work start here, and there is no one more capable of teaching those concepts than Professor Andrew Ng in his course Machine Learning.

You may think this course is old and outdated. Technology-wise, maybe, but conceptually it is better than anything else out there.

Professor Ng makes it easy to understand the math applied in every technique he teaches and gives you a solid understanding of what happens underneath in a very short and concise course.

All the exercises are done in Octave, a free alternative to MATLAB of sorts, and you finish the course implementing your own Neural Network!

The syntax in Octave is easy to grasp for any programmer, so don’t let that be a barrier for you.

Once you finish the course, you will have implemented all the major algorithms and will be able to solve several prediction problems.
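To give a flavor of what implementing one of these algorithms by hand looks like, here is a minimal sketch of linear regression trained with batch gradient descent. It is written in Python rather than Octave and is not taken from the course; the function name `fit_line`, the learning rate, the number of steps and the toy data are all assumptions made for illustration.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=5000):
    """Fit y = w * x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every training example with the current w and b.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Move both parameters a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical, noise-free toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = fit_line(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

On this toy data the fitted values should end up very close to the slope 2 and intercept 1 used to generate it. With real, noisy data you would also want to scale the inputs and watch the error as it trains.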

Random Forests

I said all the major algorithms, right?

Actually, there is but one flaw in Andrew Ng’s course: he doesn’t cover Random Forests.

An awesome complement to his course is fast.ai’s Introduction to Machine Learning for Coders.

Jeremy Howard takes a very practical approach to the piece missing from Ng’s course, covering a topic that is, for many classical problems, the best solution out there.

Fast.ai’s approach is what is called Top-Down, meaning they show you how to solve the problem and then explain why it worked, which is the total opposite of what we are used to in school.

Jeremy also uses real-world tools and libraries, so you learn by coding with industry-tested solutions.
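If you have never seen a Random Forest in code, here is a minimal sketch of training one. It assumes scikit-learn is installed and uses its small built-in iris dataset; it is not taken from the fast.ai course, and the split size, number of trees and random seeds are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Small built-in dataset: 150 flowers, 4 measurements each, 3 species.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the data to check how the model generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# 100 decision trees, each trained on a bootstrap sample of the training
# set; the forest combines their predictions into a single answer.
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Notice how little code it takes to get a working model, which is exactly why the Top-Down approach works: you get a result first and then dig into why it worked.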

Deep Learning

Finally!

The reason we are all here: Deep Learning!

Again, the best resource for it is Professor Ng’s course or, actually, a series of courses.

The Deep Learning Specialization consists of 5 courses in total, starting from the basics and moving on to specific topics such as language, images, and time-series data.

One nice thing is that he continues from the very end of his classical Machine Learning course, so it just feels like an extension of the first course.

The math, the concepts, the notion of how and why it works, he delivers it all very concisely like few I’ve seen.

~~The only drawback is that he uses TensorFlow 1.x (Google’s DL framework) in this course, but that’s a minor detail in my opinion since the explanations and exercises are so well delivered.~~

~~You can pick up the most recent version of the framework relatively easily, and to do so there is the final piece of this guide, a book.~~

UPDATE APRIL 2021: The course was updated and now features TensorFlow 2 and some extra topics.

Too much stuff, give me something faster

This book might be the only thing you need to start: Aurélien Géron’s Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems.

It covers a lot, from classical Machine Learning to the most recent Deep Learning topics. Good examples and exercises using industry-grade frameworks and libraries.

I dare say that, if you are really in a rush, you can skip everything I said before and just go for the book.

You will miss a good amount of information contained in the other resources mentioned, but the practical and actionable knowledge from Géron’s book is enough to work on many ideas for your next project.

If you feel limited after only reading the book, go back and study the rest of the material. It will fill in the gaps you might have and give you a more solid understanding.

What about Framework X or Y?

"Hey, I’ve heard about PyTorch and that other framework or library X everybody talks about".

As a Software Engineer, you know better than anyone how fast technology evolves.

Don’t go crazy over that. After you learn the basics in this guide, you can easily go to, for instance, the PyTorch documentation, or that of any other library or framework, and learn how to use it in a week or two.

The techniques and the concepts are all the same; it is only a matter of syntax and application, or even of personal taste for a given tool.

Conclusion

To wrap it up, I want to say that, even though it might seem like a lot, I tried to remove all the noise. At the end of the process, you will feel confident that you understand what is happening behind the curtains, know the jargon, and can even read some papers published in the field to keep up with the latest advances.

TL;DR Here is the list of resources mentioned in sequence:
