Below is a non-exhaustive list of resources, including blogs, courses, books, podcasts, and video lectures, that I have found extremely useful for learning Python, statistics, and machine-learning concepts.
Highly recommended machine-learning starting point:
For practical purposes, I’ve noticed that it is not always necessary to dive super deep into a concept. Rather, it’s helpful to get a concise version of the concept, understand its core assumptions, and start applying it right away, figuring out your knowledge gaps along the way. I strongly believe in the 80-20 rule (80% of the output from 20% of the input). In that spirit, the following are my top five sources for getting up to speed on the basics of ML.
- Hands-On Machine Learning with Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems. GitHub
I started reading about ML and data analysis with this wonderful book by Aurélien Géron. It is one of the best, if not the best, introductory books on machine learning. It is concise and easy to read, and it comes with Jupyter notebooks for applying the concepts it teaches. The initial chapters (Part 1) of the book offer a strong foundation in traditional ML algorithms.
Besides focusing on ML itself, experience with data wrangling using the PyData stack (NumPy, Pandas, and friends) is always a plus. In fact, most of the time the bottleneck in setting up an ML model is massaging the data into a machine-readable format.
Deep learning is the most popular sub-branch of ML and something you should have a general understanding of. Jeremy Howard and team have set up this wonderful didactic course using PyTorch (my personal preference), comprising a useful collection of walkthroughs and practical examples.
A fantastic high-level, math-focused introduction to algorithms.
An approachable compendium of key ML concepts boiled down to their essential insights; it offers a nice way to articulate concepts concisely.
Nice (free) online courses:
Machine Learning
- MIT’s Intro to Deep Learning
- Google’s ML crash course
- Stanford’s CS - CNN course
- NYU’s PyTorch Deep learning
- AI Summer
Data Science and Computation
Miscellaneous
Python in general
Learning Python
- Automate Boring Stuff with Python
- Scientific programming with Python
- Visual Guide to NumPy
- Python Data Science Handbook
- Chris Albon’s notes
- NumPy Visual Introduction
Tutorials / Projects
- Pynative
- Python Workout
- Reuven Lerner’s Python Interview Prep
- Project Euler
- Calm code tips on Python code
Writing better code
- Corey Schafer’s Tips for writing better code
- Refactory blog
- RealPython blog
Datasets
Books and websites
Statistics & Exploratory Analysis
Data Science
- Jakevdp Python Data Science Notebook
- Introduction to Cultural Analytics & Python: a nice collection of tips for web scraping, network analysis, geotagging, and language processing using Python.
Data Visualization
Machine-Learning
Machine-learning focused key commentaries, perspectives, and reviews
Area reviews
General tips
- How to avoid machine learning pitfalls: a guide for academic researchers
- Scikit-learn documentation on common pitfalls
- Machine Learning that Matters
- Three pitfalls to avoid in machine learning
- A Few Useful Things to Know about Machine Learning
Commentaries
- Statistical Modeling: The Two Cultures
- The Hardware Lottery
- Machine Learning that Matters
- Why AI is Harder Than We Think
In Chemical Sciences:
Molecular science:
Graph networks
Cheat Sheets
I’ve compiled some nice cheat sheets covering the basics of ML, data science, and statistics, alongside some tips on NumPy, Pandas, and Scikit-learn. These compilations are particularly useful for brushing up on details before a job interview. Link to dataset repository
Video series
Explanations
- Neural networks series by 3Blue1Brown
- Machine learning zero to hero
- Ali Ghodsi’s video lecture series (I highly recommend his lecture on Variational Autoencoders)
- Khan Academy’s Multivariate Calculus
- Khan Academy’s Statistics + Probability
PyCon talks
AI talks / commentaries
Blogs
Data Science focused
- Nate Silver’s 538
- Jim Vallandingham
- Pudding’s data viz
- Flowing Data
- Mike Bostock
- Spurious Correlations
- Understanding uncertainty
- Math3ma Blog
- Max Woolf
- Chris Albon
- Caitlin Hudon
Statistics Blogs
- Statistics by Jim
- Probably Overthinking It by Allen Downey
ML inclined
- Chris Olah
- Andrej Karpathy, wonderfully didactic posts
- Jay Alammar’s NLP-focused posts
ML code examples and tutorials
- Keras Code Examples
- TensorFlow Examples
- PyTorch Examples
General compilations
- Distill Blog
- KDNuggets
Data-inspired Podcasts
Fantastic resources: you can be a fly on the wall and listen to experts talk about a topic that interests you.
- AI in Business
- McKinsey AI
- a16z podcast
- Data Skeptic
- Lex Fridman / AI podcast
- Microsoft Research Podcast
YouTubers
A list of YouTube channels that never fail to inspire me
1. Science and Technology
Statistics
2. Food
3. Videography and Design
- Casey Neistat
- Dan Mace
- Peter McKinnon
- Andrew Price - Blender
- CGMatter
- CG Figures - Scientific visualization in Blender
4. Journalism
Diversity & Inclusion
- Deloitte’s Women in AI
- Account from Dow employees going from R&D to D&I
- Diversity in STEM: What it is and Why it Matters
- How Diversity Makes Us Smarter
- Increasing Gender Diversity in the STEM Research Workforce
- Without Inclusion, Diversity Initiatives May Not Be Enough
- Work organization and mental health problems in PhD students