OpenCL
The long-standing open standard for heterogeneous compute. Runs on GPUs, CPUs and accelerators from most major vendors, so a single kernel is broadly portable. The trade-off is a verbose host API and less cutting-edge tooling than CUDA.
Khronos · C/C++ · many · cross-vendor · open standard
Install
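# Linux (Debian/Ubuntu): install the ICD loader (development package) and clinfo
sudo apt-get install -y ocl-icd-opencl-dev clinfo
# a vendor runtime (ICD) is also needed, usually from the GPU driver; pocl-opencl-icd is a CPU-only option
clinfo # lists available OpenCL platforms/devices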
Hello, GPU
add.cl — an OpenCL vector-add kernel
// add.cl: the device kernel
__kernel void add(__global const float* a,
                  __global const float* b,
                  __global float* c) {
    // One work-item per element; assumes the global size equals the array length.
    int i = get_global_id(0);
    c[i] = a[i] + b[i];
}
/* Host (C, abbreviated):
   clGetPlatformIDs / clGetDeviceIDs  -> pick a device
   clCreateContext / clCreateCommandQueue
       (clCreateCommandQueueWithProperties in OpenCL 2.0+)
   clCreateProgramWithSource + clBuildProgram  (source of add.cl)
   clCreateKernel(program, "add")
   clCreateBuffer x3                  -> a, b, c
   clSetKernelArg x3
   clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &n, NULL, ...)
   clEnqueueReadBuffer                -> read c back */
Run it:
# compile the host program against -lOpenCL, then run it (host.c is a placeholder file name)
gcc host.c -o add -lOpenCL && ./add