OpenCL

The long-standing open standard for heterogeneous compute. It runs on GPUs, CPUs and accelerators from most major vendors, so a single kernel is broadly portable. The trade-off is a verbose host API and tooling that lags behind CUDA's.

Khronos · C/C++ · many · cross-vendor · open standard


Install

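# Linux (Debian/Ubuntu): install the ICD loader, its development files and clinfo
# A vendor runtime (your GPU driver's OpenCL package, or e.g. pocl-opencl-icd for CPUs) must be installed separately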
sudo apt-get install -y ocl-icd-opencl-dev clinfo
clinfo            # lists available OpenCL platforms/devices

Hello, GPU

add.cl — an OpenCL vector-add kernel

// add.cl — the device kernel
__kernel void add(__global const float* a,
                  __global const float* b,
                  __global float* c) {
    int i = get_global_id(0);
    c[i] = a[i] + b[i];
}
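One way to read the kernel is to picture each work-item as a single iteration of a loop. The sketch below is a plain C reference, not OpenCL code, and the name `add_reference` is hypothetical. The loop index plays the role of `get_global_id(0)`, and the function can double as a CPU check for the values the device returns.

```c
#include <stddef.h>

/* Hypothetical CPU reference for the add kernel: iteration i does the work
   that the work-item with get_global_id(0) == i does on the device. */
void add_reference(const float *a, const float *b, float *c, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        c[i] = a[i] + b[i];
    }
}
```

On a device the iterations run in no guaranteed order and usually in parallel. The kernel itself has no bounds check, so the global work size passed by the host must not exceed the number of elements in the buffers.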

/* Host (C, abbreviated):
   clGetPlatformIDs / clGetDeviceIDs   -> pick a device
   clCreateContext / clCreateCommandQueue
   clCreateProgramWithSource + clBuildProgram(add.cl)
   clCreateKernel(program, "add")      -> kernel object
   clCreateBuffer  x3                  -> a, b, c
   clSetKernelArg  x3
   clEnqueueNDRangeKernel(queue, kernel, 1, NULL, &n, NULL, ...)
   clEnqueueReadBuffer -> read c back                              */

Run it:

# compile the host program against -lOpenCL, then run it (host.c is a placeholder name for your host source)
gcc host.c -o host -lOpenCL && ./host

Learn more