
Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking 1st Edition
Written by renowned data science experts Foster Provost and Tom Fawcett, Data Science for Business introduces the fundamental principles of data science, and walks you through the "data-analytic thinking" necessary for extracting useful knowledge and business value from the data you collect. This guide also helps you understand the many data-mining techniques in use today.
Based on an MBA course Provost has taught at New York University over the past ten years, Data Science for Business provides examples of real-world business problems to illustrate these principles. You’ll not only learn how to improve communication between business stakeholders and data scientists, but also how to participate intelligently in your company’s data science projects. You’ll also discover how to think data-analytically, and fully appreciate how data science methods can support business decision-making.
- Understand how data science fits in your organization, and how you can use it for competitive advantage
- Treat data as a business asset that requires careful investment if you’re to gain real value
- Approach business problems data-analytically, using the data-mining process to gather good data in the most appropriate way
- Learn general concepts for actually extracting knowledge from data
- Apply data science principles when interviewing data science job candidates
- Similarity matching attempts to identify similar individuals based on data known about them. (Highlighted by 2,744 Kindle readers)
- Causal modeling attempts to help us understand what events or actions actually influence others. (Highlighted by 2,525 Kindle readers)
Editorial Reviews
Review
"This book goes beyond data analytics 101. It's the essential guide for those of us (all of us?) whose businesses are built on the ubiquity of data opportunities and the new mandate for data-driven decision-making." -- Tom Phillips, CEO of Media6Degrees and Former Head of Google Search and Analytics
"Data is the foundation of new waves of productivity growth, innovation, and richer customer insight. Only recently viewed broadly as a source of competitive advantage, dealing well with data is rapidly becoming table stakes to stay in the game. The authors' deep applied experience makes this a must read--a window into your competitor's strategy." -- Alan Murray, Serial Entrepreneur; Partner at Coriolis Ventures
"This timely book says out loud what has finally become apparent: in the modern world, Data is Business, and you can no longer think business without thinking data. Read this book and you will understand the Science behind thinking data." -- Ron Bekkerman, Chief Data Officer at Carmel Ventures
"A great book for business managers who lead or interact with data scientists, who wish to better understand the principles and algorithms available without the technical details of single-disciplinary books." -- Ronny Kohavi, Partner Architect at Microsoft Online Services Division
Product details
- Publisher : O'Reilly Media
- Publication date : September 17, 2013
- Edition : 1st
- Language : English
- Print length : 413 pages
- ISBN-10 : 1449361323
- ISBN-13 : 978-1449361327
- Item Weight : 1.49 pounds
- Dimensions : 7 x 0.9 x 9.19 inches
- Best Sellers Rank: #35,513 in Books
- #10 in Data Mining (Books)
- #13 in Business Statistics
- #21 in Statistics (Books)
About the authors
Tom Fawcett holds a Ph.D. in machine learning and has worked in industry R&D for more than two decades for companies such as GTE Laboratories, NYNEX/Verizon Labs, and HP Labs. His published work has become standard reading in data science.
Foster Provost is Professor of Data Science at NYU and Ira Rennert Professor of Entrepreneurship and Information Systems at the NYU Stern School of Business. His award-winning research is read and cited broadly. Prof. Provost has co-founded several successful companies focusing on data science for marketing, fraud control, and other business applications.
Customer reviews
Customers say
Customers find the book an excellent overview of data science concepts, providing a great framework for approaching analytics and machine learning. Moreover, the text is well-written in easy-to-understand language, making it helpful for beginners. Additionally, customers appreciate the book's design, with one noting its well-thought-out structure, and consider it well worth the price.
AI Generated from the text of customer reviews
Customers find the book provides an excellent overview of data science concepts, breaking them down into clear explanations and offering a great framework for approaching analytics and machine learning.
"...Reproducible research; Experimental design; R programming (or python, or perhaps SAS or Octave, but some mathy language for sure); Exploratory data..."
"...real world examples are interesting, and their guidance and insights are extremely valuable. My criticisms are limited to their website...."
"...For future versions though, correcting for verbosity and greater specificity (essence) will make it a true winner."
"...There is also a chapter on evaluating and critiquing data mining proposals, which nicely ties together the algorithmic, business, and practical..."
Customers find the book well-written and easy to understand, with chapters that are straightforward to read.
"...It's great to read this stuff! This is followed by a concise discussion of CRISP-DM, a well-defined data mining process, whose concepts are..."
"...I found the book to be both pragmatic and concise, while covering a lot of ground...."
"...The math is certainly discussed, but kept to a minimum, and coupled with comprehensible, plain English explanations of each algorithm...."
"...The section about Bayes rule is very well written. Data Science for Business is also an excellent resource to avoid data mining pitfalls...."
Customers find the book easy to use, particularly helpful for beginners, with one customer noting its intuitive, non-mathy approach.
"...Its style allows the book to be read by beginners, but its wide coverage and detailed case studies makes it a reference for experts as well...."
"...and index, as well as a detailed table of contents, makes it easy to navigate...."
"...It is also a great introduction for technical people who have never worked in data science before."
"...want to just start plugging away with Python or R then this is a great starter book that will teach you the lingo and the concepts that surround..."
Customers appreciate the book's design, with reviews noting its well-thought-out structure, pleasant style, and effective presentation of content.
"...So what's so good? The book is really well thought out and the grammatical style is top notch...."
"...half of the book (interspersed between the algorithms) deals with issues relating to design, implementation, evaluation, and deployment of models...."
"...The title of the book is appropriate as it is not just about analyzing data, but figuring out the business case...."
"...The style is very pleasant since authors have made efforts to put the reader in specific situations to better understand a problem...."
Customers find the book well worth its price, particularly appreciating the profit curve as an excellent centerpiece, and one customer mentions the expected value framework for evaluating models.
"...It's a testament to this book's strong value that you can do a lot more based on its material. Nice work. Recommended."
"...for Business by Foster Provost and Tom Fawcett is well worth the price of admission and the reading time you'll invest...."
"...this can eventually lead to improved efficiency focus and profitability for the company...."
"...It was a long read, especially with the holidays, but well worth it, and more enjoyable than almost every technical book I have ever read...."
Customers appreciate the book's foundation material, with one customer noting it provides a solid grounding in both business and technical aspects.
"...I've read are short in either of two domains: interpretability and rigorousness. This fills in the hole pretty well...."
"...This book gives a really solid grounding in both the business (strategic) and data (analytic, technical) aspects of modern data analytics...."
"...Clear examples are used and the fundamentals reinforced in many places. Of special interest is the example proposal and evaluation in the appendix...."
"Excellent !! This book is very "direct to the point", with good real examples, it really teaches us about the use of Data Science on Business."
Top reviews from the United States
- Reviewed in the United States on October 14, 2015
It's an excellent, even mandatory book for your Data Science shelf. I am glad I bought it. I am 67% of the way through reading this book. It has nowhere near enough material on some areas, though, and is just missing some material that you need for DS. That's actually OK because of course no single book is enough to cover everything you need to know in a field. Look how many books you may have bought just to get an undergrad degree, and I bet it was not just one book.
So here is a list of good and bad about this excellent book.
Its good points:
The profit curve. After reading this book, I will never use Accuracy to select a model any more, as that's nearly a worthless metric especially when there are marginal costs and marginal profits involved in an application scenario. The book is just amazingly good on describing how to select models based on estimated profit, and foremost the profit curve, and selected other supporting curves like ROC area under curve.
The expected profit computation and the cost-benefit matrix as a partner to the confusion matrix. This is great stuff. It's not even described in other data science courses that I have taken.
Other good points: ...And don't worry about the other good points (there are some). The profit curve analysis, and the lead-up to that, are superior.
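The expected-profit and profit-curve ideas praised above can be illustrated with a small sketch. This is not code from the book; the function names, the matrix layout (rows are the decision, columns are the actual class), and the dollar figures are all hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def expected_profit(confusion: Sequence[Sequence[int]],
                    cost_benefit: Sequence[Sequence[float]]) -> float:
    """Average profit per instance: sum over cells of p(cell) * value(cell).

    Both matrices must share one layout. Here (an assumption) rows are the
    decision (target / do not target) and columns are the actual class
    (responder / non-responder).
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return sum(
        (count / total) * value
        for counts, values in zip(confusion, cost_benefit)
        for count, value in zip(counts, values)
    )


def profit_curve(scores: Sequence[float], labels: Sequence[int],
                 benefit_tp: float, cost_fp: float) -> List[Tuple[int, float]]:
    """Cumulative profit when targeting the k highest-scored instances, k = 0..n."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    curve = [(0, 0.0)]
    profit = 0.0
    for k, (_, label) in enumerate(ranked, start=1):
        # A targeted responder earns benefit_tp; a targeted non-responder costs cost_fp.
        profit += benefit_tp if label == 1 else -cost_fp
        curve.append((k, profit))
    return curve


if __name__ == "__main__":
    # Hypothetical counts and values, not taken from the book.
    confusion = [[30, 20],   # targeted:     30 responders, 20 non-responders
                 [10, 40]]   # not targeted: 10 responders, 40 non-responders
    cost_benefit = [[99.0, -1.0],
                    [0.0, 0.0]]
    print(f"expected profit per instance: {expected_profit(confusion, cost_benefit):.2f}")

    curve = profit_curve([0.9, 0.8, 0.7, 0.4, 0.2], [1, 0, 1, 0, 0], 99.0, 1.0)
    best_k, best_profit = max(curve, key=lambda point: point[1])
    print(f"most profitable cutoff: top {best_k} instances, profit {best_profit:.2f}")
```

Ranking by score and reading off the peak of the curve is what lets models be compared by profit rather than by accuracy, which is the point the review is making.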
Its bad points:
p.224: "We will train on the complete dataset and then test on the same dataset we trained on." What follows for the rest of the chapter is an inappropriate error analysis, because it is overly optimistic (but otherwise the techniques are great). The models have already seen the training data. We should never assess (test) a model, or base the entire remainder of the chapter material, on error (accuracy) estimates produced from data that the models have already seen.
In most chapters, there is just not enough detail in the material, to enable this book to be used as a "correct reference" basis against which to write your own working code as you follow along with the text in whatever computer language you want to use for analysis.
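The objection to testing on the training data can be made concrete with a toy experiment. The sketch below is an assumption-laden illustration, not the book's example: it uses a hand-written 1-nearest-neighbor classifier and synthetic points whose labels are coin flips, so there is nothing real to learn.

```python
import random


def nearest_neighbor_predict(train, point):
    """Return the label of the training example closest to `point` (1-NN)."""
    _, label = min(
        train,
        key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
    )
    return label


def accuracy(train, examples):
    """Fraction of `examples` whose label the 1-NN model predicts correctly."""
    hits = sum(nearest_neighbor_predict(train, x) == y for x, y in examples)
    return hits / len(examples)


def make_noisy_data(n, rng):
    """Synthetic 2-D points with coin-flip labels (hypothetical data)."""
    return [((rng.random(), rng.random()), rng.randint(0, 1)) for _ in range(n)]


rng = random.Random(0)
train = make_noisy_data(200, rng)
holdout = make_noisy_data(200, rng)

# Each training point is its own nearest neighbor, so this estimate is inflated.
train_accuracy = accuracy(train, train)
# Unseen points give the honest estimate, which should hover near chance here.
holdout_accuracy = accuracy(train, holdout)

print(f"accuracy on training data: {train_accuracy:.2f}")
print(f"accuracy on held-out data: {holdout_accuracy:.2f}")
```

The gap between the two numbers is the optimism the review complains about; a held-out set or cross-validation removes it.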
In summary:
The book is outstanding. It is necessary for your DS bookshelf, but on the other hand it is nowhere near sufficient.
The data science course sequence by Johns Hopkins University identifies many of the elements of a nice overall outline as to what DS practitioners need to be able to do (and this is not even sufficient either):
Reproducible research; Experimental design; R programming (or python, or perhaps SAS or Octave, but some mathy language for sure); Exploratory data analysis; Regression models; Statistical inference; Practical machine learning; Scientific writing; Developing data products; Big data techniques (e.g. Apache Spark programming or at least MapReduce-style programming); SQL and NoSQL databases; Concurrent, distributed, and parallel programming; Advanced statistics (such as multiple testing corrections).
This book by Provost et al gives just a part of the necessary DS material. However the part it provides, is essential. I wish the biological data scientists in academia would adopt and integrate the cost-benefit matrix idea and the profit curve idea into their model selection techniques instead of just using the accuracy metric mostly.
Also a data scientist could do several follow-on added-value extensions to the profit curve chapter. You could produce Revenue curve (or Cost) since sometimes that matters more. You could quickly find alternatives which are nearly equi-profitable to the optimal profit but which exhibit (less revenue, less cost) or (more revenue, more cost). You could detail the model selection and profit consequences of fixed budgets. You could further assess the implications of marginal profit analysis on the optimal quantity when the profitability ratio changes. You could directly assess the data science solution against the best business wisdom solution and estimate what amount of profit is lost when using the old business wisdom decisions. It's a testament to this book's strong value that you can do a lot more based on its material.
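One of the extensions suggested above, choosing how many customers to target under a fixed budget while tracking revenue and cost separately, might look roughly like this. The function name, parameters, and figures are hypothetical and only sketch the idea.

```python
from typing import Sequence, Tuple


def best_targeting_under_budget(scores: Sequence[float],
                                labels: Sequence[int],
                                revenue_per_response: float,
                                cost_per_contact: float,
                                budget: float) -> Tuple[int, float, float, float]:
    """Return (k, revenue, cost, profit) for the most profitable affordable cutoff.

    Instances are contacted in descending score order; k is capped by how many
    contacts the budget can pay for.
    """
    if cost_per_contact <= 0:
        raise ValueError("cost_per_contact must be positive")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    max_contacts = min(len(ranked), int(budget // cost_per_contact))
    best = (0, 0.0, 0.0, 0.0)
    revenue = 0.0
    for k in range(1, max_contacts + 1):
        revenue += revenue_per_response * ranked[k - 1][1]
        cost = k * cost_per_contact
        profit = revenue - cost
        if profit > best[3]:
            best = (k, revenue, cost, profit)
    return best


if __name__ == "__main__":
    # Hypothetical scores and outcomes (1 = customer responded).
    scores = [0.9, 0.8, 0.7, 0.4, 0.2]
    labels = [1, 0, 1, 0, 0]
    unconstrained = best_targeting_under_budget(scores, labels, 100.0, 1.0, 100.0)
    constrained = best_targeting_under_budget(scores, labels, 100.0, 1.0, 2.0)
    print("no effective budget limit:", unconstrained)
    print("budget of 2.0:", constrained)
    print("profit given up to the budget:", unconstrained[3] - constrained[3])
```

Returning revenue and cost alongside profit makes it easy to compare cutoffs that are nearly equally profitable but differ in spend, which is another of the review's suggestions.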
Nice work. Recommended.
- Reviewed in the United States on March 7, 2015
Data Science for Business by Foster Provost and Tom Fawcett is a very important book about data mining and data analytic thinking. In 1971, Abbie Hoffman shocked the world when he demanded hippie readers (at the time, a likely oxymoron) "Steal This Book". While I wouldn't go so far as to encourage current and future data scientists to shoplift, I will demand that they READ THIS BOOK!
Not long ago, data was difficult and expensive to come by. Today, we're living in a world of far too much data, vast amounts of cheap computing power, and way too many poorly defined questions. Mix them all together and you're guaranteed to make a mess.
Going from data dearth to plethora presents substantive issues. In business, the balance between gut feel decision-making and analysis paralysis is changing, rapidly. Whether it moves too far from gut to paralysis, only time will tell. Through Data Science for Business, Provost and Fawcett offer practitioners a guide to equilibrium.
Read this book and you'll find yourself moving briskly down the road towards data analytic enlightenment. While not highly technical, the authors cover each topic with enough rigor to appreciate the tools being presented and the insights being offered.
From the outset, the authors are clear about the book's objectives: "The primary goals of this book are to help you view business problems from a data perspective and understand principles of extracting useful knowledge from data. There is fundamental structure to data-analytic thinking, and basic principles that should be understood. There are also particular areas where intuition, creativity, common sense, and domain knowledge must be brought to bear… As you get better at data-analytic thinking you will develop intuition as to how and where to apply creativity and domain knowledge."
This paragraph makes me think of all those undergrad and graduate students studying Statistics at Universities all over the world, my daughter included, who are being bombarded by one math or statistics class after another (Calculus III, Math Stat I and II, Linear Algebra, etc.). Yet, far too often, they enter the real world lacking "data analytic thinking" or a sense of "basic principles". They do, however, have a sense of being overwhelmed and under prepared. The epic battle between "frequentists" and "Bayesians" takes a back seat to what should be the real controversy in statistics departments around the world, the balance between "application" and "theory". The book's "primary goals" should be the marching orders of every statistics program at any college or university anywhere.
From the outset (page 2), the authors state, "Data mining is a craft. It involves the application of a substantial amount of science and technology, but the proper application still involves art as well." Absolutely true! It's great to read this stuff! This is followed by a concise discussion of CRISP-DM, a well-defined data mining process, whose concepts are elementary, essential, and integral to the responsible, proper, and successful practice of data mining.
From this point on, the authors proceed to accomplish their primary goals. They present such topics as predictive modeling, correlation, classification, clustering, regression, logistic regression, linear discriminants, and much more. Their presentations are user friendly, their real world examples are interesting, and their guidance and insights are extremely valuable.
My criticisms are limited to their website. The Data Science for Business site leaves me wanting more real world examples to enjoy, access to more resources and tools of the trade, more references to peruse, and a more rigorous approach to some of the solutions. Perhaps Data Science for Business the sequel is on the horizon?
Whether you're a seasoned statistician (or, data scientist), a young aspiring novice, or an adventurous business person looking to expand his/her horizons, Data Science for Business by Foster Provost and Tom Fawcett is well worth the price of admission and the reading time you'll invest.
Foster Provost and Tom Fawcett state, "[i]deally, we envision a book that any data scientist would give to his collaborators…" I'll do them one better, I'm giving it to my daughter!
- Reviewed in the United States on September 5, 2017
Context: I'm an MD, needing to communicate with data scientists to build a product.
I've thus far only read two chapters. My pattern recognition ;) thus far, on the assessment that it will apply to the rest of the book, is two-fold:
1) Too verbose!
Too much stuff on explaining the structure and purpose of the book. Could've been said way more succinctly, and therefore more clearly. The effect is that I start skimming.
2) Not 'sharp' enough.
The best non-fiction written for non-experts manages to reduce the complex to its essence. Not making it simpler and losing crucial comprehension, but reducing the complex to its crucial essence.
When the book goes over the different types of tasks (classification, regression, similarity matching, clustering, co-occurrence grouping), the way they are described leaves essentially no difference between, e.g., clustering and similarity matching; they all read as classification. Yes, regression is different.
In order for this to be truly helpful even for an absolute layman like myself, it needs to add enough crucial, essential distinctions to make the categories mutually exclusive. I can think about it, I can look it up. The book would, however, have been better if the information had been more 'sharply' communicated.
So why 4-star?
Because it is a beautiful balance for the amateur. Explaining basic concepts instead of trendy-applications.
For future versions though, correcting for verbosity and greater specificity (essence) will make it a true winner.
Top reviews from other countries
- Zzz, reviewed in Germany on July 2, 2015
5.0 out of 5 stars Top introductory book
My introductory book for data science. It was very good! I learned a lot! In hindsight I can see that the book covers most topics.
- CustomerKing, reviewed in Italy on January 9, 2014
5.0 out of 5 stars Very well done and balanced
Anyone expecting to find in this book a detailed explanation of statistical and machine learning techniques may be disappointed, but the aim of Prof. Provost's book is to recall methodologies and techniques that should already be clear in the reader's mind, and to apply them in a way anyone can understand when tackling business problems such as segmentation, optimization of commercial marketing campaigns, assessment of customer churn risk, etc.
At first glance this may seem a given for a book with such a title (and indeed there are many of this kind around), but the superiority of this text lies in the linearity and clarity of its exposition and in its pragmatism in tackling the various business cases.
- Luna, reviewed in Spain on March 23, 2020
5.0 out of 5 stars Simple and good
Easy to understand
- Payam Mokhtarian, reviewed in Australia on March 6, 2018
5.0 out of 5 stars Five Stars
Highly recommended book for those who want hands-on data science and the business principles of machine learning
- 2501, reviewed in France on November 11, 2021
5.0 out of 5 stars Very interesting
Perhaps the most interesting book I have read on machine learning. It is not a book for beginners, because if you do not already have a grasp of the subject you will not get much out of it, but if you already have some experience with the subject, it will help you understand quite a few subtleties that are usually never mentioned.