Wiley, 2019. — 329 p. — ISBN: 978-1-119-29626-3.
A guide to the principles and methods of data analysis that does not require knowledge of statistics or programming
This book is an essential guide to understanding and using data analytics. It is written in easy-to-understand terms and does not require familiarity with statistics or programming. The authors, noted experts in the field, explain the intuition behind the basic data analytics techniques. The text also contains exercises and illustrative examples.
Designed to be easily accessible to non-experts, the book motivates the need to analyze data. It explains how to visualize and summarize data, and how to find natural groups and frequent patterns in a dataset. The book also explores predictive tasks, whether classification or regression. Finally, the book discusses popular data analytics applications, such as web mining, information retrieval, social network analysis, working with text, and recommender systems. The learning resources offer:
A guide to the reasoning behind data mining techniques
A unique illustrative example that extends throughout all the chapters
Exercises at the end of each chapter and larger projects at the end of each of the text’s two main parts
Together with these learning resources, the book can serve as the guide for a 13-week course, with one chapter per course topic.
The book is written so that non-mathematicians, non-statisticians and non-computer scientists seeking an introduction to data science can understand the main data analytics concepts. A General Introduction to Data Analytics is a basic guide to data analytics written in highly accessible terms.
Introductory Background
What Can We Do With Data?
Big Data and Data Science
Big Data Architectures
Small Data
What is Data?
A Short Taxonomy of Data Analytics
Examples of Data Use
A Project on Data Analytics
How this Book is Organized
Who Should Read this Book
Getting Insights from Data
Descriptive Statistics
Scale Types
Descriptive Univariate Analysis
Descriptive Bivariate Analysis
Final Remarks
Exercises
Descriptive Multivariate Analysis
Multivariate Frequencies
Multivariate Data Visualization
Multivariate Statistics
Infographics and Word Clouds
Final Remarks
Exercises
Data Quality and Preprocessing
Data Quality
Converting to a Different Scale Type
Converting to a Different Scale
Data Transformation
Dimensionality Reduction
Final Remarks
Exercises
Clustering
Distance Measures
Clustering Validation
Clustering Techniques
Final Remarks
Exercises
Frequent Pattern Mining
Frequent Itemsets
Association Rules
Behind Support and Confidence
Other Types of Pattern
Final Remarks
Exercises
Cheat Sheet and Project on Descriptive Analytics
Cheat Sheet of Descriptive Analytics
Project on Descriptive Analytics
Predicting the Unknown
Regression
Predictive Performance Estimation
Finding the Parameters of the Model
Technique and Model Selection
Final Remarks
Exercises
Classification
Binary Classification
Predictive Performance Measures for Classification
Distance-based Learning Algorithms
Probabilistic Classification Algorithms
Final Remarks
Exercises
Additional Predictive Methods
Search-based Algorithms
Optimization-based Algorithms
Final Remarks
Exercises
Advanced Predictive Topics
Ensemble Learning
Algorithm Bias
Non-binary Classification Tasks
Advanced Data Preparation Techniques for Prediction
Description and Prediction with Supervised Interpretable Techniques
Exercises
Cheat Sheet and Project on Predictive Analytics
Cheat Sheet on Predictive Analytics
Project on Predictive Analytics
Popular Data Analytics Applications
Applications for Text, Web and Social Media
Working with Texts
Recommender Systems
Social Network Analysis
Exercises
Appendix A: Comprehensive Description of the CRISP-DM Methodology