

Foundations of Data Science

Avrim Blum, Toyota Technological Institute at Chicago
John Hopcroft, Cornell University, New York
Ravindran Kannan, Microsoft Research, India
January 2020
Hardback
9781108485067
£44.99

    This book provides an introduction to the mathematical and algorithmic foundations of data science, including machine learning, high-dimensional geometry, and analysis of large networks. Topics include the counterintuitive nature of data in high dimensions, important linear algebraic techniques such as singular value decomposition, the theory of random walks and Markov chains, the fundamentals of and important algorithms for machine learning, algorithms and analysis for clustering, probabilistic models for large networks, representation learning including topic modelling and non-negative matrix factorization, wavelets, and compressed sensing. Important probabilistic techniques are developed, including the law of large numbers, tail inequalities, analysis of random projections, generalization guarantees in machine learning, and moment methods for analysis of phase transitions in large random graphs. Additionally, important structural and complexity measures are discussed, such as matrix norms and VC-dimension. This book is suitable for both undergraduate and graduate courses in the design and analysis of algorithms for data.

    • Contains over 350 end-of-chapter exercises
    • Includes over ninety figures which illustrate key concepts in the text

    Awards

    Winner, 2020 Choice Outstanding Academic Title

    Reviews & endorsements

    'This beautifully written text is a scholarly journey through the mathematical and algorithmic foundations of data science. Rigorous but accessible, and with many exercises, it will be a valuable resource for advanced undergraduate and graduate classes.' Peter Bartlett, University of California, Berkeley

    'The rise of the Internet, digital media, and social networks has brought us to the world of data, with vast sources from every corner of society. Data Science - aiming to understand and discover the essences that underlie the complex, multifaceted, and high-dimensional data - has truly become a 'universal discipline', with its multidisciplinary roots, interdisciplinary presence, and societal relevance. This timely and comprehensive book presents - by bringing together from diverse fields of computing - a full spectrum of mathematical, statistical, and algorithmic materials fundamental to data analysis, machine learning, and network modeling. Foundations of Data Science offers an effective roadmap to approach this fascinating discipline and engages more advanced readers with rigorous mathematical/algorithmic theory.' Shang-Hua Teng, University of Southern California

    'A lucid account of mathematical ideas that underlie today's data analysis and machine learning methods. I learnt a lot from it, and I am sure it will become an invaluable reference for many students, researchers and faculty around the world.' Sanjeev Arora, Princeton University, New Jersey

    'It provides a very broad overview of the foundations of data science that should be accessible to well-prepared students with backgrounds in computer science, linear algebra, and probability theory … These are all important topics in the theory of machine learning and it is refreshing to see them introduced together in a textbook at this level.' Brian Borchers, MAA Reviews

    'One plausible measure of [Foundations of Data Science's] impact is the book's own citation metrics. Semantic Scholar (https://www.semanticscholar.org) reports 81 citations with 42 citations related to background or methods; [Foundations of Data Science] appears to be on course to becoming influential.' M. Mounts, Choice

    Product details

    January 2020
    Hardback
    9781108485067
    432 pages
    259 × 182 × 27 mm
    0.93kg
    Available

    Table of Contents

    • 1. Introduction
    • 2. High-dimensional space
    • 3. Best-fit subspaces and Singular Value Decomposition (SVD)
    • 4. Random walks and Markov chains
    • 5. Machine learning
    • 6. Algorithms for massive data problems: streaming, sketching, and sampling
    • 7. Clustering
    • 8. Random graphs
    • 9. Topic models, non-negative matrix factorization, hidden Markov models, and graphical models
    • 10. Other topics
    • 11. Wavelets
    • 12. Appendix