A Minimal Introductory Tutorial to CUDA Programming - Zhihu

Summary | 19 Annotations
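host refers to the CPU and its memory
2020/09/02 11:25
device refers to the GPU and its memory
2020/09/02 11:25
Use <<<grid, block>>> to specify the number of threads that execute the kernel
2020/09/02 11:25
A kernel function is declared with the __global__ qualifier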
2020/09/02 11:25
each thread
2020/09/02 11:25
threadIdx
2020/09/02 11:25
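The host does not wait for the kernel to finish before moving on to the next step
2020/09/02 11:26
but [__host__] can be combined with __device__, in which case the function is compiled for both the device and the host
2020/09/02 11:26
When a kernel runs on the device, it actually launches many threads
2020/09/02 11:28
A thread block contains
2020/09/02 11:28
many threads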
2020/09/02 11:28
blockIdx
2020/09/02 11:29
threadIdx
2020/09/02 11:29
blockDim
2020/09/02 11:29
gridDim
2020/09/02 11:30
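A thread block can be scheduled on only one SM. An SM can generally schedule multiple thread blocks
2020/09/02 11:33
The block size should generally be set to a multiple of 32
2020/09/02 11:34
Use the cudaMallocManaged function to allocate managed memory
2020/09/02 11:36
Because managed memory transfers data automatically, cudaDeviceSynchronize() must be called here to keep the device and host synchronized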
2020/09/02 11:37
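
The notes above can be tied together in one small sketch. It is not taken from the annotated article: the names `add_pair`, `vector_add`, and `run_vector_add`, the input values, and the block size of 256 are all assumptions chosen for illustration. It shows a `__global__` kernel launched with `<<<grid, block>>>`, a global index computed from `blockIdx`, `blockDim`, and `threadIdx`, a `__host__ __device__` helper compiled for both sides, managed memory from `cudaMallocManaged`, and a `cudaDeviceSynchronize()` call before the host reads the result.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Hypothetical helper: __host__ __device__ makes the compiler build it
// for both the CPU and the GPU, so the kernel and the host check share it.
__host__ __device__ inline float add_pair(float a, float b) { return a + b; }

// __global__ marks a kernel: it runs on the device and is launched from the host.
__global__ void vector_add(const float* x, const float* y, float* z, int n) {
    // Global index of this thread within a 1D grid of 1D blocks.
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {  // the last block may contain threads past the end of the data
        z[i] = add_pair(x[i], y[i]);
    }
}

// Hypothetical driver: returns the number of wrong elements, or -1 on a CUDA error.
int run_vector_add(int n) {
    if (n <= 0) return -1;
    float *x = nullptr, *y = nullptr, *z = nullptr;
    size_t bytes = static_cast<size_t>(n) * sizeof(float);

    // Managed memory is reachable from host and device through the same pointer.
    cudaError_t err = cudaMallocManaged(&x, bytes);
    if (err == cudaSuccess) err = cudaMallocManaged(&y, bytes);
    if (err == cudaSuccess) err = cudaMallocManaged(&z, bytes);
    if (err != cudaSuccess) {
        cudaFree(x); cudaFree(y); cudaFree(z);  // freeing a null pointer is a no-op
        return -1;
    }

    // Initialize the inputs on the host (values are arbitrary).
    for (int i = 0; i < n; ++i) { x[i] = 10.0f; y[i] = 20.0f; }

    // Block size is a multiple of 32; the grid is rounded up to cover all n elements.
    dim3 block(256);
    dim3 grid((n + block.x - 1) / block.x);
    vector_add<<<grid, block>>>(x, y, z, n);
    err = cudaGetLastError();  // reports launch-configuration errors

    // The launch returns immediately, so wait for the kernel before reading z.
    if (err == cudaSuccess) err = cudaDeviceSynchronize();

    int mismatches = -1;
    if (err == cudaSuccess) {
        mismatches = 0;
        const float expected = add_pair(10.0f, 20.0f);  // same helper, host side
        for (int i = 0; i < n; ++i) {
            if (z[i] != expected) ++mismatches;
        }
    }

    cudaFree(x); cudaFree(y); cudaFree(z);
    return mismatches;
}
```

Compiled with `nvcc` and called from a `main` function, for example as `run_vector_add(1 << 20)`, this should return 0 on a machine with a CUDA-capable GPU. Without the `cudaDeviceSynchronize()` call, the host loop could read `z` before the kernel has finished writing it.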