Great to see they have a nice introductory section on feature engineering! Feature engineering is often the most impactful thing you can do to improve the quality of your models, and it's a place where I often see beginners (and experts, for that matter) get stuck. Google walks through how to work with JSON files and categorical variables: https://developers.google.com/machine-learning/crash-course/....
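For anyone who hasn't seen the categorical-variable part before, here is a rough sketch of the idea in plain Python. It is not taken from the course; the records and field names (`street`, `rooms`) are made up for illustration, and real projects would usually reach for a library rather than hand-rolling this.

```python
import json

# Hypothetical records; the field names are invented for this example.
raw = (
    '[{"street": "Main St", "rooms": 3},'
    ' {"street": "Oak Ave", "rooms": 2},'
    ' {"street": "Main St", "rooms": 4}]'
)
records = json.loads(raw)


def one_hot(records, field):
    """Return (vocabulary, one binary indicator vector per record)."""
    vocab = sorted({r[field] for r in records})
    index = {value: i for i, value in enumerate(vocab)}
    vectors = []
    for r in records:
        vec = [0] * len(vocab)
        vec[index[r[field]]] = 1  # mark the category this record belongs to
        vectors.append(vec)
    return vocab, vectors


vocab, street_vectors = one_hot(records, "street")

# Numeric fields pass through unchanged; the categorical field becomes
# one indicator column per distinct value.
feature_matrix = [[r["rooms"]] + vec for r, vec in zip(records, street_vectors)]

print(vocab)
print(feature_matrix)
```

Each row ends up as the numeric value followed by one 0/1 column per street name, which is the kind of feature matrix most models expect.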
If anyone is looking to go more in depth, I work on an open source Python library for automated feature engineering called Featuretools: https://github.com/featuretools/featuretools/. It can help when your data is more complex, such as when it is composed of multiple tables.
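To make the multiple-tables case concrete, here is a minimal hand-written sketch of the general idea: summarizing rows of a child table into features on a parent table. It deliberately does not use the Featuretools API; the tables, column names, and the `aggregate_children` helper are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical parent table (customers) and child table (orders).
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 20.0},
    {"order_id": 11, "customer_id": 1, "amount": 40.0},
    {"order_id": 12, "customer_id": 2, "amount": 15.0},
]

# Aggregations applied to a numeric child column, one feature per entry.
AGGREGATIONS = {"count": len, "sum": sum, "mean": mean, "max": max}


def aggregate_children(parents, children, key, column):
    """Build one feature row per parent by aggregating its child rows."""
    grouped = defaultdict(list)
    for row in children:
        grouped[row[key]].append(row[column])

    features = []
    for parent in parents:
        values = grouped.get(parent[key], [])
        row = dict(parent)
        for name, func in AGGREGATIONS.items():
            # Parents with no child rows get None rather than a made-up value.
            row[f"{name}_{column}"] = func(values) if values else None
        features.append(row)
    return features


customer_features = aggregate_children(customers, orders, "customer_id", "amount")
for row in customer_features:
    print(row)
```

Writing this by hand for every table, column, and aggregation gets tedious quickly, which is the part an automated tool is meant to take over.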
Your comment got me interested in this course. However, all I could find about feature engineering there is what you linked to, directly.
Given that entire scientific careers, books, and conferences are built around the topic of feature engineering, and that, at least IMO, good ML tools live or die with good feature engineering (in its broadest sense, for you deep learning fanatics :-)), that doesn't seem like more than the bare minimum I'd expect from any ML "crash course" that is to be taken seriously (and I wouldn't expect an ounce less from Google... :-)).
Am I missing something, maybe?
In any case, nice work of your own, and thanks for sharing it!
Although I'm normally skeptical of AI/ML courses, that section on feature engineering dos and don'ts is new and covers a surprisingly under-discussed topic. It's very useful even outside of AI/ML.
I expect that as companies increase their focus on finding practical applications of ML/AI, the topic will start to get more attention in these tutorials, as well as from researchers. Right now, too many people assume you already have a feature matrix, which is rarely the case when working on real-world problems.
How do you select features created with Featuretools? The problem with automated feature engineering is that you end up with too many irrelevant features, and I haven't found a good guide on feature selection.
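For context, the kind of baseline filtering I have in mind looks roughly like the sketch below: drop near-constant columns, then drop columns that are almost perfectly correlated with one already kept. This is a generic approach, not something specific to Featuretools; the thresholds, column names, and the `select_features` helper are assumptions for illustration, and it ignores the target entirely, so it only removes redundancy, not irrelevance.

```python
from math import sqrt
from statistics import pvariance


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def select_features(columns, min_variance=1e-8, max_abs_corr=0.95):
    """Keep columns that vary and are not near-duplicates of a kept column."""
    kept = []
    for name, values in columns.items():
        if pvariance(values) <= min_variance:
            continue  # (near-)constant column carries no signal
        if any(abs(pearson(values, columns[k])) > max_abs_corr for k in kept):
            continue  # redundant with a feature we already kept
        kept.append(name)
    return kept


# Hypothetical generated features, one list of values per column.
columns = {
    "sum_amount": [60.0, 15.0, 30.0, 80.0],
    "sum_amount_cents": [6000.0, 1500.0, 3000.0, 8000.0],  # rescaled duplicate
    "always_one": [1.0, 1.0, 1.0, 1.0],  # constant
    "num_orders": [2.0, 1.0, 3.0, 1.0],
}

selected = select_features(columns)
print(selected)
```

Anything beyond this (for example ranking by a model's feature importances) needs the target variable, which is where I'd like more guidance.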
We have several demos you can run yourself to apply Featuretools to real datasets here: https://www.featuretools.com/demos.