There's also a newer book by Hastie (and Efron!), Computer Age Statistical Inference, which I very much prefer to The Elements of Statistical Learning.
It's really well motivated and, unlike ESL, discusses many different schools, including classical inference, empirical Bayes, and deep learning. Without these different perspectives, newcomers often find statistics very obscure, as it just looks like a bag of tricks.
So much of that book just goes over my head, whereas I didn't have that problem with ESL. I don't know if it's Murphy's writing style or just the way he approaches the topic, but I found his book significantly more difficult to process.
I love this book so much. It takes a strong Bayesian point of view that makes things so clear to me. It's well written and well structured. It starts with a summary chapter on ML which, honestly, by itself gets you to a very good place in understanding the basics of ML.