Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® by Nitin R. Patel

Wiley

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® presents an applied and interactive approach to data mining.

Featuring hands-on applications with JMP Pro®, a statistical package from the SAS Institute, the book uses engaging, real-world examples to build a theoretical and practical understanding of key data mining methods, especially predictive models for classification and prediction. Topics include data visualization, dimension reduction techniques, clustering, linear and logistic regression, classification and regression trees, discriminant analysis, naive Bayes, neural networks, uplift modeling, ensemble models, and time series forecasting.

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® also includes:

Detailed summaries that supply an outline of key topics at the beginning of each chapter
End-of-chapter examples and exercises that allow readers to expand their comprehension of the presented material
Data-rich case studies to illustrate various applications of data mining techniques
A companion website (www.dataminingbook.com) with over two dozen data sets, exercises and case study solutions, and slides for instructors

Data Mining for Business Analytics: Concepts, Techniques, and Applications with JMP Pro® is an excellent textbook for advanced undergraduate and graduate-level courses on data mining, predictive analytics, and business analytics. The book is also a one-of-a-kind resource for data scientists, analysts, researchers, and practitioners working with analytics in the fields of management, finance, marketing, information technology, healthcare, education, and any other data-rich field.

Galit Shmueli, PhD, is Distinguished Professor at National Tsing Hua University's Institute of Service Science. She has designed and instructed data mining courses since 2004 at University of Maryland, Statistics.com, Indian School of Business, and National Tsing Hua University, Taiwan. Professor Shmueli is known for her research and teaching in business analytics, with a focus on statistical and data mining methods in information systems and healthcare. She has authored over 70 journal articles, books, textbooks, and book chapters, including Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner®, Third Edition, also published by Wiley.

Peter C. Bruce is President and Founder of the Institute for Statistics Education at www.statistics.com. He has written multiple journal articles and is the developer of Resampling Stats software. He is the author of Introductory Statistics and Analytics: A Resampling Perspective and co-author of Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner®, Third Edition, both published by Wiley.

Mia Stephens is Academic Ambassador at JMP®, a division of SAS Institute. Prior to joining SAS, she was an adjunct professor of statistics at the University of New Hampshire and a founding member of the North Haven Group LLC, a statistical training and consulting company. She is the co-author of three other books, including Visual Six Sigma: Making Data Analysis Lean, Second Edition, also published by Wiley.

Nitin R. Patel, PhD, is Chairman and cofounder of Cytel, Inc., based in Cambridge, Massachusetts. A Fellow of the American Statistical Association, Dr. Patel has also served as a Visiting Professor at the Massachusetts Institute of Technology and at Harvard University. He is a Fellow of the Computer Society of India and was a professor at the Indian Institute of Management, Ahmedabad, for 15 years. He is co-author of Data Mining for Business Analytics: Concepts, Techniques, and Applications with XLMiner®, Third Edition, also published by Wiley.

FOREWORD xvii

PREFACE xix

ACKNOWLEDGMENTS xxi

PART I PRELIMINARIES

1 Introduction 3

1.1 What Is Business Analytics? 3

Who Uses Predictive Analytics? 4

1.2 What Is Data Mining? 5

1.3 Data Mining and Related Terms 5

1.4 Big Data 6

1.5 Data Science 7

1.6 Why Are There So Many Different Methods? 7

1.7 Terminology and Notation 8

1.8 Roadmap to This Book 10

Order of Topics 11

Using JMP Pro, Statistical Discovery Software from SAS 11

2 Overview of the Data Mining Process 14

2.1 Introduction 14

2.2 Core Ideas in Data Mining 15

Classification 15

Prediction 15

Association Rules and Recommendation Systems 15

Predictive Analytics 16

Data Reduction and Dimension Reduction 16

Data Exploration and Visualization 16

Supervised and Unsupervised Learning 16

2.3 The Steps in Data Mining 17

2.4 Preliminary Steps 19

Organization of Datasets 19

Sampling from a Database 19

Oversampling Rare Events in Classification Tasks 19

Preprocessing and Cleaning the Data 20

Changing Modeling Types in JMP 20

Standardizing Data in JMP 25

2.5 Predictive Power and Overfitting 25

Creation and Use of Data Partitions 25

Partitioning Data for Crossvalidation in JMP Pro 27

Overfitting 27

2.6 Building a Predictive Model with JMP Pro 29

Predicting Home Values in a Boston Neighborhood 29

Modeling Process 30

Setting the Random Seed in JMP 34

2.7 Using JMP Pro for Data Mining 38

2.8 Automating Data Mining Solutions 40

Data Mining Software Tools: The State of the Market by Herb Edelstein 41

Problems 44

PART II DATA EXPLORATION AND DIMENSION REDUCTION

3 Data Visualization 51

3.1 Uses of Data Visualization 51

3.2 Data Examples 52

Example 1: Boston Housing Data 53

Example 2: Ridership on Amtrak Trains 53

3.3 Basic Charts: Bar Charts, Line Graphs, and Scatterplots 54

Using the JMP Graph Builder 54

Distribution Plots: Boxplots and Histograms 56

Tools for Data Visualization in JMP 59

Heatmaps (Color Maps and Cell Plots): Visualizing Correlations and Missing Values 59

3.4 Multidimensional Visualization 61

Adding Variables: Color, Size, Shape, Multiple Panels, and Animation 62

Manipulations: Rescaling, Aggregation and Hierarchies, Zooming, Filtering 65

Reference: Trend Lines and Labels 68

Adding Trendlines in the Graph Builder 69

Scaling Up: Large Datasets 70

Multivariate Plot: Parallel Coordinates Plot 71

Interactive Visualization 72

3.5 Specialized Visualizations 73

Visualizing Networked Data 74

Visualizing Hierarchical Data: More on Treemaps 75

Visualizing Geographical Data: Maps 76

3.6 Summary of Major Visualizations and Operations, According to Data Mining Goal 77

Prediction 77

Classification 78

Time Series Forecasting 78

Unsupervised Learning 79

Problems 79

4 Dimension Reduction 81

4.1 Introduction 81

4.2 Curse of Dimensionality 82

4.3 Practical Considerations 82

Example 1: House Prices in Boston 82

4.4 Data Summaries 83

Summary Statistics 83

Tabulating Data (Pivot Tables) 85

4.5 Correlation Analysis 87

4.6 Reducing the Number of Categories in Categorical Variables 87

4.7 Converting a Categorical Variable to a Continuous Variable 90

4.8 Principal Components Analysis 90

Example 2: Breakfast Cereals 91

Principal Components 95

Normalizing the Data 97

Using Principal Components for Classification and Prediction 100

4.9 Dimension Reduction Using Regression Models 100

4.10 Dimension Reduction Using Classification and Regression Trees 100

Problems 101

PART III PERFORMANCE EVALUATION

5 Evaluating Predictive Performance 105

5.1 Introduction 105

5.2 Evaluating Predictive Performance 106

Benchmark: The Average 106

Prediction Accuracy Measures 107

Comparing Training and Validation Performance 108

5.3 Judging Classifier Performance 109

Benchmark: The Naive Rule 109

Class Separation 109

The Classification Matrix 109

Using the Validation Data 111

Accuracy Measures 111

Propensities and Cutoff for Classification 112

Cutoff Values for Triage 112

Changing the Cutoff Values for a Confusion Matrix in JMP 114

Performance in Unequal Importance of Classes 115

False-Positive and False-Negative Rates 116

Asymmetric Misclassification Costs 116

Asymmetric Misclassification Costs in JMP 119

Generalization to More Than Two Classes 120

5.4 Judging Ranking Performance 120

Lift Curves 120

Beyond Two Classes 122

Lift Curves Incorporating Costs and Benefits 122

5.5 Oversampling 123

Oversampling the Training Set 126

Stratified Sampling and Oversampling in JMP 126

Evaluating Model Performance Using a Nonoversampled Validation Set 126

Evaluating Model Performance If Only Oversampled Validation Set Exists 127

Applying Sampling Weights in JMP 128

Problems 129

PART IV PREDICTION AND CLASSIFICATION METHODS

6 Multiple Linear Regression 133

6.1 Introduction 133

6.2 Explanatory versus Predictive Modeling 134

6.3 Estimating the Regression Equation and Prediction 135

Example: Predicting the Price of Used Toyota Corolla Automobiles 136

Coding of Categorical Variables in Regression 138

Additional Options for Regression Models in JMP 140

6.4 Variable Selection in Linear Regression 141

Reducing the Number of Predictors 141

How to Reduce the Number of Predictors 142

Manual Variable Selection 142

Automated Variable Selection 142

Coding of Categorical Variables in Stepwise Regression 143

Working with the All Possible Models Output 145

When Using a Stopping Algorithm in JMP 147

Other Regression Procedures in JMP Pro—Generalized Regression 149

Problems 150

7 k-Nearest Neighbors (k-NN) 155

7.1 The 𝑘-NN Classifier (Categorical Outcome) 155

Determining Neighbors 155

Classification Rule 156

Example: Riding Mowers 156

Choosing 𝑘 157

𝑘 Nearest Neighbors in JMP Pro 158

The Cutoff Value for Classification 159

𝑘-NN Predictions and Prediction Formulas in JMP Pro 161

𝑘-NN with More Than Two Classes 161

7.2 𝑘-NN for a Numerical Response 161

Pandora 161

7.3 Advantages and Shortcomings of 𝑘-NN Algorithms 163

Problems 164

8 The Naive Bayes Classifier 167

8.1 Introduction 167

Naive Bayes Method 167

Cutoff Probability Method 168

Conditional Probability 168

Example 1: Predicting Fraudulent Financial Reporting 168

8.2 Applying the Full (Exact) Bayesian Classifier 169

Using the “Assign to the Most Probable Class” Method 169

Using the Cutoff Probability Method 169

Practical Difficulty with the Complete (Exact) Bayes Procedure 170

Solution: Naive Bayes 170

Example 2: Predicting Fraudulent Financial Reports, Two Predictors 172

Using the JMP Naive Bayes Add-in 174

Example 3: Predicting Delayed Flights 174

8.3 Advantages and Shortcomings of the Naive Bayes Classifier 179

Spam Filtering 179

Problems 180

9 Classification and Regression Trees 183

9.1 Introduction 183

9.2 Classification Trees 184

Recursive Partitioning 184

Example 1: Riding Mowers 185

Categorical Predictors 186

9.3 Growing a Tree 187

Growing a Tree Example 187

Classifying a New Observation 188

Fitting Classification Trees in JMP Pro 191

Growing a Tree with CART 192

9.4 Evaluating the Performance of a Classification Tree 192

Example 2: Acceptance of Personal Loan 192

9.5 Avoiding Overfitting 193

Stopping Tree Growth: CHAID 194

Growing a Full Tree and Pruning It Back 194

How JMP Limits Tree Size 196

9.6 Classification Rules from Trees 196

9.7 Classification Trees for More Than Two Classes 198

9.8 Regression Trees 199

Prediction 199

Evaluating Performance 200

9.9 Advantages and Weaknesses of a Tree 200

9.10 Improving Prediction: Multiple Trees 204

Fitting Ensemble Tree Models in JMP Pro 206

9.11 CART and Measures of Impurity 207

Problems 207

10 Logistic Regression 211

10.1 Introduction 211

Logistic Regression and Consumer Choice Theory 212

10.2 The Logistic Regression Model 213

Example: Acceptance of Personal Loan (Universal Bank) 214

Indicator (Dummy) Variables in JMP 216

Model with a Single Predictor 216

Fitting One Predictor Logistic Models in JMP 218

Estimating the Logistic Model from Data: Multiple Predictors 218

Fitting Logistic Models in JMP with More Than One Predictor 221

10.3 Evaluating Classification Performance 221

Variable Selection 222

10.4 Example of Complete Analysis: Predicting Delayed Flights 223

Data Preprocessing 225

Model Fitting, Estimation and Interpretation: A Simple Model 226

Model Fitting, Estimation and Interpretation: The Full Model 227

Model Performance 229

Variable Selection 230

Regrouping and Recoding Variables in JMP 232

10.5 Appendixes: Logistic Regression for Profiling 234

Appendix A: Why Linear Regression Is Problematic for a Categorical Response 234

Appendix B: Evaluating Explanatory Power 236

Appendix C: Logistic Regression for More Than Two Classes 238

Nominal Classes 238

Problems 241

11 Neural Nets 245

11.1 Introduction 245

11.2 Concept and Structure of a Neural Network 246

11.3 Fitting a Network to Data 246

Example 1: Tiny Dataset 246

Computing Output of Nodes 248

Preprocessing the Data 251

Activation Functions and Data Processing Features in JMP Pro 251

Training the Model 251

Fitting a Neural Network in JMP Pro 254

Using the Output for Prediction and Classification 256

Example 2: Classifying Accident Severity 258

Avoiding Overfitting 259

11.4 User Input in JMP Pro 260

Unsupervised Feature Extraction and Deep Learning 263

11.5 Exploring the Relationship between Predictors and Response 264

Understanding Neural Models in JMP Pro 264

11.6 Advantages and Weaknesses of Neural Networks 264

Problems 265

12 Discriminant Analysis 268

12.1 Introduction 268

Example 1: Riding Mowers 269

Example 2: Personal Loan Acceptance (Universal Bank) 269

12.2 Distance of an Observation from a Class 270

12.3 From Distances to Propensities and Classifications 272

Linear Discriminant Analysis in JMP 275

12.4 Classification Performance of Discriminant Analysis 275

12.5 Prior Probabilities 277

12.6 Classifying More Than Two Classes 278

Example 3: Medical Dispatch to Accident Scenes 278

Using Categorical Predictors in Discriminant Analysis in JMP 279

12.7 Advantages and Weaknesses 280

Problems 282

13 Combining Methods: Ensembles and Uplift Modeling 285

13.1 Ensembles 285

Why Ensembles Can Improve Predictive Power 286

The Wisdom of Crowds 287

Simple Averaging 287

Bagging 288

Boosting 288

Creating Ensemble Models in JMP Pro 289

Advantages and Weaknesses of Ensembles 289

13.2 Uplift (Persuasion) Modeling 290

A-B Testing 290

Uplift 290

Gathering the Data 291

A Simple Model 292

Modeling Individual Uplift 293

Using the Results of an Uplift Model 294

Creating Uplift Models in JMP Pro 294

Using the Uplift Platform in JMP Pro 295

13.3 Summary 295

Problems 297

PART V MINING RELATIONSHIPS AMONG RECORDS

14 Cluster Analysis 301

14.1 Introduction 301

Example: Public Utilities 302

14.2 Measuring Distance between Two Observations 305

Euclidean Distance 305

Normalizing Numerical Measurements 305

Other Distance Measures for Numerical Data 306

Distance Measures for Categorical Data 308

Distance Measures for Mixed Data 308

14.3 Measuring Distance between Two Clusters 309

Minimum Distance 309

Maximum Distance 309

Average Distance 309

Centroid Distance 309

14.4 Hierarchical (Agglomerative) Clustering 311

Hierarchical Clustering in JMP and JMP Pro 311

Hierarchical Agglomerative Clustering Algorithm 312

Single Linkage 312

Complete Linkage 313

Average Linkage 313

Centroid Linkage 313

Ward’s Method 314

Dendrograms: Displaying Clustering Process and Results 314

Validating Clusters 316

Two-Way Clustering 318

Limitations of Hierarchical Clustering 319

14.5 Nonhierarchical Clustering: The 𝑘-Means Algorithm 320

𝑘-Means Clustering Algorithm 321

Initial Partition into 𝐾 Clusters 322

𝐾-Means Clustering in JMP 322

Problems 329

PART VI FORECASTING TIME SERIES

15 Handling Time Series 335

15.1 Introduction 335

15.2 Descriptive versus Predictive Modeling 336

15.3 Popular Forecasting Methods in Business 337

Combining Methods 337

15.4 Time Series Components 337

Example: Ridership on Amtrak Trains 337

15.5 Data Partitioning and Performance Evaluation 341

Benchmark Performance: Naive Forecasts 342

Generating Future Forecasts 342

Partitioning Time Series Data in JMP and Validating Time Series Models 342

Problems 343

16 Regression-Based Forecasting 346

16.1 A Model with Trend 346

Linear Trend 346

Fitting a Model with Linear Trend in JMP 348

Creating Actual versus Predicted Plots and Residual Plots in JMP 350

Exponential Trend 350

Computing Forecast Errors for Exponential Trend Models 352

Polynomial Trend 352

Fitting a Polynomial Trend in JMP 353

16.2 A Model with Seasonality 353

16.3 A Model with Trend and Seasonality 356

16.4 Autocorrelation and ARIMA Models 356

Computing Autocorrelation 356

Improving Forecasts by Integrating Autocorrelation Information 360

Fitting AR (Autoregression) Models in the JMP Time Series Platform 361

Fitting AR Models to Residuals 361

Evaluating Predictability 363

Summary: Fitting Regression-Based Time Series Models in JMP 365

Problems 366

17 Smoothing Methods 377

17.1 Introduction 377

17.2 Moving Average 378

Centered Moving Average for Visualization 378

Trailing Moving Average for Forecasting 379

Computing a Trailing Moving Average Forecast in JMP 380

Choosing Window Width (𝑤) 382

17.3 Simple Exponential Smoothing 382

Choosing Smoothing Parameter 𝛼 383

Fitting Simple Exponential Smoothing Models in JMP 384

Creating Plots for Actual versus Forecasted Series and Residuals Series Using the Graph Builder 386

Relation between Moving Average and Simple Exponential Smoothing 386

17.4 Advanced Exponential Smoothing 387

Series with a Trend 387

Series with a Trend and Seasonality 388

Problems 390

PART VII CASES

18 Cases 401

18.1 Charles Book Club 401

The Book Industry 401

Database Marketing at Charles 402

Data Mining Techniques 403

Assignment 405

18.2 German Credit 409

Background 409

Data 409

Assignment 409

18.3 Tayko Software Cataloger 410

Background 410

The Mailing Experiment 413

Data 413

Assignment 413

18.4 Political Persuasion 415

Background 415

Predictive Analytics Arrives in US Politics 415

Political Targeting 416

Uplift 416

Data 417

Assignment 417

18.5 Taxi Cancellations 419

Business Situation 419

Assignment 419

18.6 Segmenting Consumers of Bath Soap 420

Business Situation 420

Key Problems 421

Data 421

Measuring Brand Loyalty 421

Assignment 421

18.7 Direct-Mail Fundraising 423

Background 423

Data 424

Assignment 425

18.8 Predicting Bankruptcy 425

Predicting Corporate Bankruptcy 426

Assignment 428

18.9 Time Series Case: Forecasting Public Transportation Demand 428

Background 428

Problem Description 428

Available Data 428

Assignment Goal 429

Assignment 429

Tips and Suggested Steps 429

References 431

Data Files Used in the Book 433

Index 435
