---
title: "Chapter 01: Setting Up OpenCL and Enabling GPU Acceleration"
author: "Kjell Nygren"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Chapter 01: Setting Up OpenCL and Enabling GPU Acceleration}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
```

## Overview

`nmathopencl` can run in two modes:

- **CPU-only mode**: all distribution functions fall back to `stats::` or base-R equivalents. No special setup is needed. This is the default when the package is installed from a prebuilt binary (e.g., from CRAN or R-universe). Most downstream use of the nmath `.cl` library does not require a GPU-enabled build of `nmathopencl` itself; the `.cl` source files are always present on disk regardless.
- **GPU-accelerated mode**: kernels execute on an OpenCL device. This requires compiling the package from source on a machine with an OpenCL SDK installed. It is necessary when you want to use the exported `*_opencl` wrappers to validate GPU pipeline operation, or when your downstream package links directly against the nmathopencl kernel runners.

This vignette describes how to read the attach-time advisory, understand the **`opencltools`** dependency, and enable OpenCL specifically for **`nmathopencl`**. Detailed per-OS driver, header, and ICD instructions are maintained in **`opencltools`** vignette **Chapter 01**; end-user GPU setup notes also appear in **Chapter 02**.

## When you load nmathopencl

Attaching the package in an interactive session may print a short **`packageStartupMessage`**. This is intentional: it gives a first-pass diagnosis of whether GPU acceleration is *feasible* on your machine and whether *this install* was compiled with OpenCL support. CPU execution is always available.
The message appears only when there is a plausible GPU/OpenCL path that is not yet enabled in the installed binaries, or when the host stack looks incomplete. When both **`nmathopencl`** and **`opencltools`** were built with OpenCL, attach is silent.

To suppress the message in scripts or CI:

```{r, eval = FALSE}
options(nmathopencl.quiet_opencl_startup = TRUE)
```

Further detail: `?gpu_diagnostics`, `?nmathopencl-package`, this vignette, and `vignette("Chapter-12", package = "nmathopencl")`.

### What the message means

| Situation | Typical message gist | Next step |
|-----------|---------------------|-----------|
| **`opencltools` OK, `nmathopencl` not** | opencltools was built with OpenCL but nmathopencl was not | [Step 3](#step-3-if-opencltools-is-enabled-but-nmathopencl-is-not) --- reinstall nmathopencl from source |
| **`nmathopencl` OK, `opencltools` not** | nmathopencl was built with OpenCL but opencltools was not | Reinstall **opencltools** from source (unusual; it is an Imports dependency) |
| **Neither OK, headers/runtime detected** | Reinstall both packages from source | [Step 2](#step-2-if-opencl-is-not-enabled-at-all) then [Step 3](#step-3-if-opencltools-is-enabled-but-nmathopencl-is-not) |
| **Neither OK, GPU present, stack incomplete** | Install or repair drivers, ICD, headers | [Step 2](#step-2-if-opencl-is-not-enabled-at-all) --- follow **opencltools** Chapter 01 |
| **Neither OK, no GPU/stack** | No suitable GPU/OpenCL environment | CPU-only is expected; GPU setup is optional |

## opencltools as imported dependency

**`nmathopencl` lists `opencltools` in `Imports`** (>= 0.8.0). It is installed automatically with nmathopencl and provides two categories of functionality:

- **Host/runtime diagnostics** --- GPU vendor detection, driver and ICD checks, `verify_opencl_runtime()`, and related probes. These are **not** re-exported; call `opencltools::…` directly.
- **Kernel-library authoring tools** --- dependency tagging, sorting, and subset loading (`load_library_for_kernel`, `extract_library_subset`, ...). These **are** re-exported from nmathopencl for downstream kernel authors.

Work through **opencltools** first when enabling OpenCL: fix the host environment and confirm `opencltools::has_opencl()` before reinstalling nmathopencl from source.

Useful **opencltools** vignettes:

| Vignette | Topic |
|----------|-------|
| `vignette("Chapter-01", package = "opencltools")` | Platform install: drivers, headers, ICD, verification |
| `vignette("Chapter-02", package = "opencltools")` | Assembling kernel programs from ported library shards |
| `vignette("Chapter-03", package = "opencltools")` | Kernel runners and wrappers (the glmbayes pattern) |

## What "compiling with OpenCL" means

At compile time, each package's `configure` script detects whether `CL/cl.h` is present and a linkable `OpenCL` library exists. If both are found, it sets `-DUSE_OPENCL` and links against `-lOpenCL`. All OpenCL-specific C++ code is guarded by `#ifdef USE_OPENCL`, so the package compiles cleanly without OpenCL headers.

A binary built **without** `USE_OPENCL` is a fully valid CPU-only package. **`nmathopencl_has_opencl()`** returns `FALSE` and every `*_opencl` wrapper silently uses its `stats::` fallback.

Because **`nmathopencl` Imports `opencltools`**, both packages must be built with OpenCL for the exported `*_opencl` wrappers to dispatch to a GPU. The attach message compares `nmathopencl_has_opencl()` (nmathopencl) and `opencltools::has_opencl()` (opencltools) and tells you which reinstall is needed.

## Enabling OpenCL for nmathopencl

Work through these steps in order.

### Step 1: Read the load message

```{r, eval = FALSE}
library(nmathopencl)
```

If attach is silent and `nmathopencl_has_opencl()` returns `TRUE`, you are done --- skip to [Verifying the setup](#verifying-the-setup). Otherwise, use the table above or continue to Step 2 or 3.
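
The choice between the steps depends on the two build flags. As a minimal sketch of that mapping, the plain-R function below takes the values returned by `nmathopencl_has_opencl()` and `opencltools::has_opencl()` and returns the step this vignette suggests. `next_step()` is a hypothetical helper written for illustration; it is not part of either package, and the flags alone cannot tell whether a GPU is present at all.

```r
# Hypothetical helper, not exported by nmathopencl or opencltools.
# Maps the two compile-time flags to the step suggested in this vignette.
next_step <- function(nmath_ok, tools_ok) {
  stopifnot(
    is.logical(nmath_ok), length(nmath_ok) == 1L, !is.na(nmath_ok),
    is.logical(tools_ok), length(tools_ok) == 1L, !is.na(tools_ok)
  )
  if (nmath_ok && tools_ok) {
    "Done: verify the setup"
  } else if (tools_ok) {
    "Step 3: reinstall nmathopencl from source"
  } else if (nmath_ok) {
    "Reinstall opencltools from source"
  } else {
    "Step 2: check the host stack (GPU setup is optional)"
  }
}

# With both packages installed, the real flags could be passed in:
# next_step(nmathopencl::nmathopencl_has_opencl(), opencltools::has_opencl())
next_step(nmath_ok = FALSE, tools_ok = TRUE)
```

Keeping the flags as arguments, rather than calling the packages inside the function, lets the mapping be read and tested on a machine without either package installed.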

### Step 2: If OpenCL is not enabled at all

When `opencltools::has_opencl()` and `nmathopencl_has_opencl()` are both `FALSE`, the host stack or compile flags are missing for both packages.

```{r, eval = FALSE}
opencltools::has_opencl()          # opencltools build flag
nmathopencl_has_opencl()           # nmathopencl build flag
opencltools::diagnose_glmbayes()   # host/runtime report
```

Follow **`vignette("Chapter-01", package = "opencltools")`** for drivers, OpenCL headers, the ICD loader/runtime, and building **opencltools** from source until `opencltools::has_opencl()` is `TRUE`.

Enabling GPU acceleration requires three independent host components:

| Component | What it provides | Where to get it |
|-----------|-----------------|-----------------|
| **GPU driver** | Exposes the hardware to the OS | Vendor website or OS package manager |
| **OpenCL headers** (`CL/cl.h`) | Needed at compile time | GPU vendor SDK or `opencl-headers` package |
| **OpenCL ICD loader / runtime** | Needed at runtime (`OpenCL.dll` / `libOpenCL.so`) | Installed with driver or as `ocl-icd-libopencl1` |

All three must be present. Having headers but not the loader (or vice versa) is the most common failure mode. Platform-specific install commands are maintained in **opencltools** Chapter 01.

### Step 3: If opencltools is enabled but nmathopencl is not

When `opencltools::has_opencl()` is `TRUE` but `nmathopencl_has_opencl()` is `FALSE`, the host environment is likely fine; **nmathopencl** was installed from a CPU binary or compiled before the SDK was present.

Reinstall **nmathopencl** from source:

```{r, eval = FALSE}
# From CRAN or R-universe:
install.packages("nmathopencl", type = "source")
```

The `configure` / `configure.win` script runs automatically and prints whether `USE_OPENCL` was activated.
Then confirm:

```{r, eval = FALSE}
nmathopencl_has_opencl()
#> [1] TRUE
```

## Verifying the setup

After Steps 2--3, confirm the full stack:

```{r, eval = FALSE}
library(nmathopencl)

# nmathopencl compile-time flag
nmathopencl_has_opencl()
#> [1] TRUE

# opencltools compile-time flag (imported dependency)
opencltools::has_opencl()
#> [1] TRUE

# Host GPU inventory via opencltools (not the compile flag)
opencltools::gpu_names()
#> [1] "NVIDIA GeForce RTX 4090"

# Host/runtime diagnostic report (opencltools)
opencltools::diagnose_glmbayes()
```

`opencltools::diagnose_glmbayes()` runs a layered host check (environment, GPU vendors, drivers, OpenCL headers and ICD, Linux/WSL runtime probe, PATH / `LD_LIBRARY_PATH`). Pair it with `nmathopencl_has_opencl()` for this package's compile-time OpenCL build flag. On Windows, the Linux/WSL runtime probe is skipped; rely on driver and ICD checks instead.

### Validating GPU calls with the exported wrappers

Once `nmathopencl_has_opencl()` is `TRUE`, the exported `*_opencl` functions dispatch to the GPU. These are most useful as a **validation tool**: running them on large vectors confirms the OpenCL pipeline is working end-to-end before you invest time in a downstream package.

```{r, eval = FALSE}
x <- rnorm(1e7)
system.time(dnorm_opencl(x, mean = 0, sd = 1))
system.time(dnorm(x, mean = 0, sd = 1))
```

Use large vectors (millions of elements) for this test. For small to moderate vectors the GPU will often be *slower* than the CPU due to the overhead of kernel compilation and host-to-device data transfer. That is expected behavior, not a bug.

The real-world acceleration from using the `nmathopencl` nmath library happens when statistical math calls are embedded *inside* a downstream package's GPU kernels, where they share device memory with the rest of the computation and avoid the round-trip transfer entirely.
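
Timing alone does not show that the GPU result agrees with the CPU result. The sketch below adds that check: `compare_with_reference()` is a hypothetical helper (not part of `nmathopencl`) that runs a candidate and a reference function on the same input and reports elapsed times and the largest absolute difference. It assumes `dnorm_opencl()` accepts `mean` and `sd` as in the example above. What difference counts as acceptable depends on the device's floating-point precision, which this vignette does not specify.

```r
# Hypothetical helper: run a candidate and a reference function on the same
# input, returning elapsed seconds for each and the largest absolute
# difference between the two results.
compare_with_reference <- function(candidate, reference, x, ...) {
  t_cand <- system.time(y_cand <- candidate(x, ...))[["elapsed"]]
  t_ref  <- system.time(y_ref  <- reference(x, ...))[["elapsed"]]
  list(
    candidate_seconds = t_cand,
    reference_seconds = t_ref,
    max_abs_diff      = max(abs(y_cand - y_ref))
  )
}

set.seed(1)
x <- rnorm(1e5)

# Use the exported wrapper only when nmathopencl is installed; otherwise
# compare stats::dnorm with itself so the sketch still runs anywhere.
candidate <- if (requireNamespace("nmathopencl", quietly = TRUE)) {
  nmathopencl::dnorm_opencl
} else {
  stats::dnorm
}

res <- compare_with_reference(candidate, stats::dnorm, x, mean = 0, sd = 1)
str(res)
```

The vector here is kept small so the sketch runs quickly; for a meaningful timing comparison, increase it to millions of elements as recommended above.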

When OpenCL is available, the default `fallback = FALSE` surfaces kernel errors rather than silently falling back, which helps debugging during development.

## Troubleshooting common failures

### `nmathopencl_has_opencl()` returns `FALSE` after driver installation

The package was compiled before the SDK was installed, or from a prebuilt binary. If `opencltools::has_opencl()` is already `TRUE`, go to [Step 3](#step-3-if-opencltools-is-enabled-but-nmathopencl-is-not). Otherwise, fix the host stack first ([Step 2](#step-2-if-opencl-is-not-enabled-at-all)).

### Compilation fails with `CL/cl.h: No such file or directory`

The OpenCL headers are not on the compiler include path. See **`vignette("Chapter-01", package = "opencltools")`** and, if necessary, set `OPENCL_HOME` or `OPENCL_SDK` to the SDK root before installing.

### Runtime error: `clGetPlatformIDs: CL_PLATFORM_NOT_FOUND_KHR`

Headers were found at compile time but no runtime platform is available. Check (details in **opencltools** Chapter 01):

- Is the GPU driver installed and up to date?
- Is the ICD file present (e.g., `/etc/OpenCL/vendors/nvidia.icd` on Linux)?
- On WSL2: is the NVIDIA driver installed in Windows, not just inside WSL?

Use `opencltools::verify_opencl_runtime()` for a programmatic probe.

### `PATH` warnings from `opencltools::diagnose_glmbayes()`

On Windows, the CUDA Toolkit bin directory may need to be in **`PATH`** so that `OpenCL.dll` is found at runtime. The diagnostic report lists any missing entries; fixing them via system settings or your shell profile is the recommended approach. Advanced developers may use `opencltools::add_to_path_windows()` and related helpers directly --- see `?opencltools::add_to_path`.

## For package developers: next steps

Once OpenCL is working for **`nmathopencl`** itself, the next step is adding `USE_OPENCL` and `nmathopencl_has_opencl()` to your own downstream package so that it also builds cleanly on CRAN.
See **`vignette("Chapter-02", package = "nmathopencl")`** for the `use_opencl_configure()` / `port_to_opencl_configure()` workflow.
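
On the R side, a downstream function can guard its GPU path with the same build flag. The sketch below is one possible pattern, not the workflow described in Chapter 02: `gpu_build_available()` and `my_dnorm()` are hypothetical names, and the code assumes the downstream package treats `nmathopencl` as an optional dependency, so it checks that the package is installed before calling it.

```r
# Hypothetical guard: TRUE only when nmathopencl is installed and reports
# that it was built with OpenCL.
gpu_build_available <- function() {
  requireNamespace("nmathopencl", quietly = TRUE) &&
    isTRUE(nmathopencl::nmathopencl_has_opencl())
}

# Hypothetical downstream function: use the exported wrapper when the GPU
# build is available, otherwise use stats::dnorm.
my_dnorm <- function(x, mean = 0, sd = 1) {
  if (gpu_build_available()) {
    nmathopencl::dnorm_opencl(x, mean = mean, sd = sd)
  } else {
    stats::dnorm(x, mean = mean, sd = sd)
  }
}

my_dnorm(c(-1, 0, 1))
```

Because the `*_opencl` wrappers already fall back to `stats::` in a CPU-only build, the explicit guard mainly matters when `nmathopencl` might not be installed at all.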