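Out of stock; ordered from the supplier after purchase (procurement takes about 45 working days)
Programming Massively Parallel Processors (Reprint Edition) [大規模併行處理器程序設計(影印版)(簡體書)]
RMB list price: ¥36
List price: NT$216
Reward points earned: 6 points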


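Programming Massively Parallel Processors (Reprint Edition) introduces the basic concepts of parallel programming and GPU architecture and examines in detail the techniques used to build parallel programs. Case studies walk through the full development process, from the initial parallel-computing idea to a practical, efficient parallel program.
Authors: David B. Kirk (USA), Wen-mei W. Hwu (USA)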
Preface
Acknowledgments
Dedication
CHAPTER 1 INTRODUCTION
1.1 GPUs as Parallel Computers
1.2 Architecture of a Modern GPU
1.3 Why More Speed or Parallelism?
1.4 Parallel Programming Languages and Models
1.5 Overarching Goals
1.6 Organization of the Book

CHAPTER 2 HISTORY OF GPU COMPUTING
2.1 Evolution of Graphics Pipelines
2.1.1 The Era of Fixed-Function Graphics Pipelines
2.1.2 Evolution of Programmable Real-Time Graphics
2.1.3 Unified Graphics and Computing Processors
2.1.4 GPGPU: An Intermediate Step
2.2 GPU Computing
2.2.1 Scalable GPUs
2.2.2 Recent Developments
2.3 Future Trends

CHAPTER 3 INTRODUCTION TO CUDA
3.1 Data Parallelism
3.2 CUDA Program Structure
3.3 A Matrix-Matrix Multiplication Example
3.4 Device Memories and Data Transfer
3.5 Kernel Functions and Threading
3.6 Summary
3.6.1 Function declarations
3.6.2 Kernel launch
3.6.3 Predefined variables
3.6.4 Runtime API

CHAPTER 4 CUDA THREADS
4.1 CUDA Thread Organization
4.2 blockIdx and threadIdx
4.3 Synchronization and Transparent Scalability
4.4 Thread Assignment
4.5 Thread Scheduling and Latency Tolerance
4.6 Summary
4.7 Exercises

CHAPTER 5 CUDA™ MEMORIES
5.1 Importance of Memory Access Efficiency
5.2 CUDA Device Memory Types
5.3 A Strategy for Reducing Global Memory Traffic
5.4 Memory as a Limiting Factor to Parallelism
5.5 Summary
5.6 Exercises

CHAPTER 6 PERFORMANCE CONSIDERATIONS
6.1 More on Thread Execution
6.2 Global Memory Bandwidth
6.3 Dynamic Partitioning of SM Resources
6.4 Data Prefetching
6.5 Instruction Mix
6.6 Thread Granularity
6.7 Measured Performance and Summary
6.8 Exercises

CHAPTER 7 FLOATING POINT CONSIDERATIONS
7.1 Floating-Point Format
7.1.1 Normalized Representation of M
7.1.2 Excess Encoding of E
7.2 Representable Numbers
7.3 Special Bit Patterns and Precision
7.4 Arithmetic Accuracy and Rounding
7.5 Algorithm Considerations
7.6 Summary
7.7 Exercises

CHAPTER 8 APPLICATION CASE STUDY: ADVANCED MRI RECONSTRUCTION
8.1 Application Background
8.2 Iterative Reconstruction
8.3 Computing F^H d
Step 1. Determine the Kernel Parallelism Structure
Step 2. Getting Around the Memory Bandwidth Limitation
Step 3. Using Hardware Trigonometry Functions
Step 4. Experimental Performance Tuning
8.4 Final Evaluation
8.5 Exercises

CHAPTER 9 APPLICATION CASE STUDY: MOLECULAR VISUALIZATION AND ANALYSIS
9.1 Application Background
9.2 A Simple Kernel Implementation
9.3 Instruction Execution Efficiency
9.4 Memory Coalescing
9.5 Additional Performance Comparisons
9.6 Using Multiple GPUs
9.7 Exercises

CHAPTER 10 PARALLEL PROGRAMMING AND COMPUTATIONAL THINKING
10.1 Goals of Parallel Programming
10.2 Problem Decomposition
10.3 Algorithm Selection
10.4 Computational Thinking
10.5 Exercises

CHAPTER 11 A BRIEF INTRODUCTION TO OPENCL™
11.1 Background
11.2 Data Parallelism Model
11.3 Device Architecture
11.4 Kernel Functions
11.5 Device Management and Kernel Launch
11.6 Electrostatic Potential Map in OpenCL
11.7 Summary
11.8 Exercises

CHAPTER 12 CONCLUSION AND FUTURE OUTLOOK
12.1 Goals Revisited
12.2 Memory Architecture Evolution
12.2.1 Large Virtual and Physical Address Spaces
12.2.2 Unified Device Memory Space
12.2.3 Configurable Caching and Scratch Pad
12.2.4 Enhanced Atomic Operations
12.2.5 Enhanced Global Memory Access
12.3 Kernel Execution Control Evolution
12.3.1 Function Calls within Kernel Functions
12.3.2 Exception Handling in Kernel Functions
12.3.3 Simultaneous Execution of Multiple Kernels
12.3.4 Interruptible Kernels
12.4 Core Performance
12.4.1 Double-Precision Speed
12.4.2 Better Control Flow Efficiency
12.5 Programming Environment
12.6 A Bright Outlook
APPENDIX A MATRIX MULTIPLICATION HOST-ONLY VERSION SOURCE CODE
A.1 matrixmul.cu
A.2 matrixmul_gold.cpp
A.3 matrixmul.h
A.4 assist.h
A.5 Expected Output
APPENDIX B GPU COMPUTE CAPABILITIES
B.1 GPU Compute Capability Tables
B.2 Memory Coalescing Variations
Index
