Introduction to Parallel Programming

Subodh Kumar, Indian Institute of Technology, Delhi
January 2023
Paperback
9781009069533

$59.99 USD (Paperback)

    Modern computing systems are no longer truly sequential, and most advanced programming is parallel programming. This is particularly evident in application domains such as scientific computation, data science and machine intelligence. This lucid introductory textbook will be invaluable to students of computer science and technology, acting as a self-contained primer to parallel programming. It takes the reader from introduction to expertise, addressing a broad range of issues: it covers parallel programming styles, describes parallel architecture, introduces parallel programming frameworks and techniques, presents algorithmic and analysis techniques, and discusses parallel design and performance issues. With its broad coverage, the book can be useful in a wide range of courses and can also serve as a ready reference for professionals in the field.

    • Features a holistic and in-depth look at basic parallel programming
    • Places equal emphasis on the theoretical underpinnings and the practical applications of the subject
    • Includes a large number of illustrations to explain concepts and techniques
    • Highlights the objectives and important ideas introduced in each chapter
    • Provides a supplements package comprising a solutions manual and teaching aids for instructors

    Product details

    January 2023
    Paperback
    9781009069533
    350 pages
    241 × 185 × 15 mm
    0.5 kg
    Available

    Table of Contents

    • List of Figures
    • Introduction
    • Concurrency and Parallelism
    • Why Study Parallel Programming
    • What is in this Book
    • 1. An Introduction to Parallel Computer Architecture
    • 1.1 Parallel Organization
    • SISD: Single Instruction, Single Data
    • SIMD: Single Instruction, Multiple Data
    • MIMD: Multiple Instruction, Multiple Data
    • MISD: Multiple Instruction, Single Data
    • 1.2 System Architecture
    • 1.3 CPU Architecture
    • 1.4 Memory and Cache
    • 1.5 GPU Architecture
    • 1.6 Interconnect Architecture
    • Routing
    • Links
    • Types and Quality of Networks
    • Torus Network
    • Hypercube Network
    • Cross-Bar Network
    • Shuffle-Exchange Network
    • Clos Network
    • Tree Network
    • Network Comparison
    • 1.7 Summary
    • 2. Parallel Programming Models
    • 2.1 Distributed-Memory Programming Model
    • 2.2 Shared-Memory Programming Model
    • 2.3 Task Graph Model
    • 2.4 Variants of Task Parallelism
    • 2.5 Summary
    • 3. Parallel Performance Analysis
    • 3.1 Simple Parallel Model
    • 3.2 Bulk-Synchronous Parallel Model
    • BSP Computation Time
    • BSP Example
    • 3.3 PRAM Model
    • PRAM Computation Time
    • PRAM Example
    • 3.4 Parallel Performance Evaluation
    • Latency and Throughput
    • Speed-up
    • Cost
    • Efficiency
    • Scalability
    • Iso-efficiency
    • 3.5 Parallel Work
    • Brent's Work-Time Scheduling Principle
    • 3.6 Amdahl's Law
    • 3.7 Gustafson's Law
    • 3.8 Karp–Flatt Metric
    • 3.9 Summary
    • 4. Synchronization and Communication Primitives
    • 4.1 Threads and Processes
    • 4.2 Race Condition and Consistency of State
    • Sequential Consistency
    • Causal Consistency
    • FIFO and Processor Consistency
    • Weak Consistency
    • Linearizability
    • 4.3 Synchronization
    • Synchronization Condition
    • Protocol Control
    • Progress
    • Synchronization Hazards
    • 4.4 Mutual Exclusion
    • Lock
    • Peterson's Algorithm
    • Bakery Algorithm
    • Compare and Swap
    • Transactional Memory
    • Barrier and Consensus
    • 4.5 Communication
    • Point-to-Point Communication
    • RPC
    • Collective Communication
    • 4.6 Summary
    • 5. Parallel Program Design
    • 5.1 Design Steps
    • Granularity
    • Communication
    • Synchronization
    • Load Balance
    • 5.2 Task Decomposition
    • Domain Decomposition
    • Functional Decomposition
    • Task Graph Metrics
    • 5.3 Task Execution
    • Preliminary Task Mapping
    • Task Scheduling Framework
    • Centralized Push Scheduling Strategy
    • Distributed Push Scheduling
    • Pull Scheduling
    • 5.4 Input/Output
    • 5.5 Debugging and Profiling
    • 5.6 Summary
    • 6. Middleware: The Practice of Parallel Programming
    • 6.1 OpenMP
    • Preliminaries
    • OpenMP Thread Creation
    • OpenMP Memory Model
    • OpenMP Reduction
    • OpenMP Synchronization
    • Sharing a Loop's Work
    • Other Work-Sharing Pragmas
    • SIMD Pragma
    • Tasks
    • 6.2 MPI
    • MPI Send and Receive
    • Message-Passing Synchronization
    • MPI Data Types
    • MPI Collective Communication
    • MPI Barrier
    • MPI Reduction
    • One-Sided Communication
    • MPI File IO
    • MPI Groups and Communicators
    • MPI Dynamic Parallelism
    • MPI Process Topology
    • 6.3 Chapel
    • Partitioned Global Address Space
    • Chapel Tasks
    • Chapel Variable Scope
    • 6.4 Map-Reduce
    • Parallel Implementation
    • Hadoop
    • 6.5 GPU Programming
    • OpenMP GPU Off-Load
    • Data and Function on Device
    • Thread Blocks in OpenMP
    • CUDA
    • CUDA Programming Model
    • CPU–GPU Memory Transfer
    • Concurrent Kernels
    • CUDA Synchronization
    • CUDA Shared Memory
    • CUDA Parallel Memory Access
    • False Sharing
    • 6.6 Summary
    • 7. Parallel Algorithms and Techniques
    • 7.1 Divide and Conquer: Prefix-Sum
    • Parallel Prefix-Sum: Method 1
    • Parallel Prefix-Sum: Method 2
    • Parallel Prefix-Sum: Method 3
    • 7.2 Divide and Conquer: Merge Two Sorted Lists
    • Parallel Merge: Method 1
    • Parallel Merge: Method 2
    • Parallel Merge: Method 3
    • Parallel Merge: Method 4
    • 7.3 Accelerated Cascading: Find Minima
    • 7.4 Recursive Doubling: List Ranking
    • 7.5 Recursive Doubling: Euler Tour
    • 7.6 Recursive Doubling: Connected Components
    • 7.7 Pipelining: Merge-Sort
    • Basic Merge-Sort
    • Pipelined Merges
    • 4-Cover Property Analysis
    • Merge Operation per Tick
    • 7.8 Application of Prefix-Sum: Radix-Sort
    • 7.9 Exploiting Parallelism: Quick-Sort
    • 7.10 Fixing Processor Count: Sample-Sort
    • 7.11 Exploiting Pa