Data Science for Business: What you need to know about data mining and data-analytic thinking by Foster Provost
My rating: 5 of 5 stars
What Provost and Fawcett have done is write a book on data mining that focuses on the why of data mining technique, which is a great complement to all the books that focus on the how of data mining. And because it focuses on the why for the myriad methods that fall under the heading of data mining, it would be a good source for a manager of a project in which data mining is merely one part, or a source of good explanations when you need to explain to others what data mining methods (or buzzwords) can and cannot do.
I've come across a number of data mining books. Some go deep into the mathematics and statistics that underlie the methods of data mining. Others focus on how to implement the methods. While both help with technique, they leave a niche unfilled: the why, or the morality, of data mining methods. Provost and Fawcett go over a range of methods, but their focus is on the task: recognizing what kinds of questions can be asked in a situation, then how to answer them. This is different from a methods book that has chapters focused on PCA, SVM, trees and forests, or other techniques. A methods-focused book can lead to tossing out buzzwords. This book is task-focused, and is for having conversations about how to get a task done.
While I've read and worked through examples from books that focused on methods and implementations, I think that my understanding of data mining has improved significantly on reading this book. I'm recommending it to a former student who has since had to learn and implement these methods in practice, so he can better explain what he has done and its significance at his company. My only nit to pick is the title. The book clearly focuses on data mining, not on other aspects of data science. Within that realm, I recommend it unreservedly.
Disclaimer: I received a free electronic copy of this book through the O'Reilly Blogger program.