Introduction to Probability and Statistics for Data Science
Introduction to Probability and Statistics for Data Science provides a solid course in the fundamental concepts, methods and theory of statistics for students in statistics, data science, biostatistics, engineering, and physical science programs. It teaches students to understand, use, and build on modern statistical techniques for complex problems. The authors develop the methods from both an intuitive and mathematical angle, illustrating with simple examples how and why the methods work. More complicated examples, many of which incorporate data and code in R, show how the methods are used in practice. Through this guidance, students get the big picture of how statistics works and how it can be applied. This text covers more modern topics, such as regression trees, large-scale hypothesis testing, bootstrapping, MCMC, and time series, and fewer theoretical topics, such as the Cramer-Rao lower bound and the Rao-Blackwell theorem. It features more than 250 high-quality figures, 180 of which involve actual data. Data and R code are available on our website so that students can reproduce the examples and do hands-on exercises.
- Provides a solid course in the fundamental concepts, methods and theory of statistics for a wide array of students in statistics, data science, biostatistics, engineering, and physical science programs
- Teaches students to understand, use, and build on modern statistical techniques for complex problems
- Develops statistical methods from both an intuitive and mathematical angle. Simple examples illustrate how and why the methods work, while more complicated examples show how the methods are used in practice
- All theory is developed with immediate and direct applications
- Covers modern topics, such as regression trees, large-scale hypothesis testing, bootstrapping, MCMC, and time series
- Accompanied by data and code repositories
Reviews & endorsements
'This book serves as an excellent resource for students with diverse backgrounds, offering a thorough exploration of fundamental topics in statistics. The clear explanation of concepts, methods, and theory, coupled with an abundance of practical examples, provides a solid foundation to help students understand statistical principles and bridge the gap between theory and application. This book offers invaluable insights and guidance for anyone seeking to master the principles of statistics. I highly recommend adopting this book for my future statistics class.' Haijun Gong, Saint Louis University
'Professors Rigdon, Fricker and Montgomery have put together an impressive volume that covers not only basic probability and basic statistics, but also includes extensions in a number of directions, all of which have immediate relevance to the work of practitioners in quantitative fields. Suffused with common sense and insights about real data and problems, it is both approachable and precise. I'm excited about the inclusion of material on power and on multiple testing, both of which will help users become smarter about what their analyses can do, and I applaud their omission of too much theory. I also appreciate their use of R and of real data. This would be an excellent text for undergraduate or graduate-level data analysts.' Sam Buttrey, Naval Postgraduate School (NPS)
'This is a comprehensive and rich book that extends foundational concepts in statistics and probability in easily accessible form into data science as an integrated discipline. The reader applies and validates theoretical concepts in R and connects results from R back to the theory across many methods: from descriptive statistics to Bayesian models, time series, generalized linear models and more. Thoroughly enjoyable!' Oliver Schabenberger, Virginia Tech Academy of Data Science
Product details
October 2024
Adobe eBook Reader
9781009573344
This ISBN is for an eBook version which is distributed on our behalf by a third party.
Table of Contents
- Part I. Descriptive Statistics & Data Science:
- 1. Introduction
- 2. Descriptive statistics
- 3. Data visualization
- Part II. Probability:
- 4. Basic probability
- 5. Random variables
- 6. Discrete distributions
- 7. Continuous distributions
- Part III. Classical Statistical Inference:
- 8. About data & data collection
- 9. Sampling distributions
- 10. Point estimation
- 11. Confidence intervals
- 12. Hypothesis testing
- 13. Hypothesis tests for two or more samples
- 14. Hypothesis tests for discrete data
- 15. Regression
- Part IV. Bayesian and Other Computer Intensive Methods:
- 16. Bayesian methods
- 17. Time series methods
- 18. The jackknife and bootstrap
- Part V. Advanced Topics in Inference & Data Science:
- 19. Generalized linear models and regression trees
- 20. Cross-validation and estimates of prediction error
- 21. Large-scale hypothesis testing and the false discovery rate
- Appendix. More About R.