Grokking Deep Learning
I'm taking a short detour (or more likely, a parallel path) from my work on Claude and other agentic coding tools to dig into deep learning concepts. I've heard good things about Grokking Deep Learning by Andrew Trask as a foundational text, so I'm starting there. In particular, I like a few things about the book so far:
1. Limited math skills required
It's easy to get overwhelmed trying to refresh myself on linear algebra, matrices, and the other things I only sort of remember from high school and college. This book explains the core concepts in an accessible way without dwelling too much on the heavier math. I actually took a refresher linear algebra course online a few months ago, but I found that focusing too much on the math (and not enough on building stuff) really took the wind out of my sails.
2. Building with foundational Python components, rather than diving straight into libraries like PyTorch
While most real work in the field is done with PyTorch or other sophisticated libraries, it's easy to let the library do all of the heavy lifting and never fully grasp what is happening under the hood. Building with basic data structures (first plain Python lists, then NumPy arrays) strips away the magic and reinforces what is actually going on.
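To give a flavor of that, here's a minimal sketch in the spirit of the book's early chapters (my own illustration, not the book's code): a single "neuron" is just a weighted sum of its inputs, written first with plain lists and then as a NumPy dot product. The input and weight values here are made up.

```python
import numpy as np

# Plain-Python version: a single neuron is just a weighted sum of its inputs.
def neural_network(inputs, weights):
    assert len(inputs) == len(weights)
    prediction = 0.0
    for x, w in zip(inputs, weights):
        prediction += x * w
    return prediction

inputs = [8.5, 0.65, 1.2]   # three made-up input features
weights = [0.1, 0.2, 0.0]   # one weight per input

print(neural_network(inputs, weights))  # ~0.98

# NumPy version: the exact same weighted sum, expressed as a dot product.
print(np.array(inputs).dot(np.array(weights)))  # ~0.98
```

Writing the loop by hand once makes it obvious what the `dot` call is doing for you later.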
3. Project-driven work in Jupyter Notebooks
The book is structured around discrete exercises that can be written, run, and experimented with in Jupyter notebooks. This lowers the friction and makes playing with the results easy and engaging.
I used ChatGPT to scaffold a notebook for each chapter, where I can reproduce the code samples and make notes to myself to make sure I'm really nailing down the concepts. I'll share my progress as I go, both in occasional posts and in this repo.
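If you'd rather script that scaffolding than ask ChatGPT for it, here's a rough sketch of how it could be done with the nbformat library; the chapter names are placeholders I invented, not the book's actual table of contents.

```python
import nbformat
from nbformat.v4 import new_notebook, new_markdown_cell, new_code_cell

# Placeholder chapter names -- swap in whatever naming scheme you prefer.
chapters = ["ch03_forward_propagation", "ch04_gradient_descent"]

for name in chapters:
    nb = new_notebook()
    nb.cells = [
        new_markdown_cell(f"# {name}\n\nCode samples and notes for this chapter."),
        new_code_cell("import numpy as np"),
    ]
    # Writes an empty, ready-to-open .ipynb file for the chapter.
    nbformat.write(nb, f"{name}.ipynb")
```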