---
title: "Chapter 16: Large models — GPU acceleration using OpenCL"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Chapter 16: Large models — GPU acceleration using OpenCL}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

# Introduction

GPU acceleration is an **optional** feature of `glmbayes`. All modeling functions --- `glmb()`, `lmb()`, `rglmb()`, and related tools --- run fully on the CPU regardless of whether OpenCL is available. No setup is needed for standard use.

Where GPU acceleration pays off is with **large models**: high-dimensional predictor sets or large posterior sample sizes. The computationally intensive work in `glmbayes` is envelope construction and evaluation --- the gradient and log-posterior calculations at each point of the tangency grid grow with model dimension and are embarrassingly parallel. Dispatching them to a GPU with `use_opencl = TRUE` can substantially reduce wall time for these cases. See **Chapter A10** for a technical explanation of what is accelerated and why.

This chapter describes how to enable GPU acceleration. The process closely resembles a source install of any compiled R package: the only extra step is ensuring that the **'opencltools'** dependency is in an OpenCL-ready state before the source install.

# What you see when you load glmbayes

When `glmbayes` is loaded in an interactive session it checks, silently, whether GPU acceleration appears feasible. If `has_opencl()` is already `TRUE` --- meaning this build was compiled with OpenCL support --- attach is completely silent.

If `has_opencl()` is `FALSE` **and** the package detects a GPU or OpenCL stack on the host, you will see a message like:

```
Note: glmbayes provides full CPU capability in this session (e.g. glmb(), lmb(), Prior_Setup()).
GPU acceleration is recommended for bigger models and appears available.
Reinstall glmbayes from source with OpenCL at compile time to enable it; see vignette("Chapter-16", "glmbayes") for install instructions.
```

On a machine with no GPU and no OpenCL stack, attach is silent --- the CPU-only install is entirely appropriate and no action is needed.

To suppress the message in scripts or automated workflows:

```r
options(glmbayes.quiet_opencl_startup = TRUE)
```

# Enabling GPU acceleration: three steps

Work through these steps in order. After each step you can check whether you are done and skip the rest.

## Step 1: Check whether OpenCL is already enabled

```r
library(glmbayes)
has_opencl()
```

If this returns `TRUE`, GPU acceleration is already compiled in. Pass `use_opencl = TRUE` to `glmb()` and you are done. Otherwise continue to Step 2.

## Step 2: Ensure 'opencltools' is OpenCL-ready

`opencltools` is installed automatically as a dependency of `glmbayes`. It provides the host diagnostics and runtime checks that `glmbayes` relies on. For GPU acceleration to work in `glmbayes`, `opencltools` must itself be built with OpenCL support.

Check:

```r
opencltools::has_opencl()
```

If this returns `FALSE`, follow **`vignette("Chapter-01", package = "opencltools")`** to install the required OpenCL components (GPU driver, headers, ICD loader) for your platform and reinstall `opencltools` from source. The `opencltools` Chapter 01 vignette is the maintained home for per-OS installation instructions and keeps them current.

For a host-level diagnostic that does not depend on the `glmbayes` build state:

```r
opencltools::diagnose_glmbayes()
```

Once `opencltools::has_opencl()` returns `TRUE`, proceed to Step 3.
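
The decision flow of Steps 1 and 2 can be condensed into a small script. The following is a minimal sketch, not part of either package: `next_step()` and `pkg_has_opencl()` are hypothetical helpers introduced here for illustration, and the only package functions they rely on are the two `has_opencl()` checks described above. A missing package or a failed call is treated as `FALSE`.

```r
# Map the two has_opencl() results onto the step to work on next.
# next_step() is a hypothetical helper, not part of glmbayes or opencltools.
next_step <- function(glmbayes_ok, opencltools_ok) {
  if (isTRUE(glmbayes_ok)) {
    "done: pass use_opencl = TRUE to glmb()"
  } else if (isTRUE(opencltools_ok)) {
    "step 3: reinstall glmbayes from source"
  } else {
    "step 2: make opencltools OpenCL-ready"
  }
}

# Query one package's has_opencl(); any failure (package not installed,
# function not exported, runtime error) counts as FALSE.
pkg_has_opencl <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) return(FALSE)
  isTRUE(tryCatch(
    getExportedValue(pkg, "has_opencl")(),
    error = function(e) FALSE
  ))
}

next_step(pkg_has_opencl("glmbayes"), pkg_has_opencl("opencltools"))
```

Keeping the decision logic separate from the package queries makes it easy to reuse in setup scripts that run on machines where neither package is installed yet.
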

**What you need on your system** (brief summary; details in 'opencltools' Chapter 01):

| Component | What it provides | Needed for |
|-----------|-----------------|------------|
| GPU driver | Exposes hardware to the OS | Runtime |
| OpenCL headers (`CL/cl.h`) | Required at compile time | Source build |
| OpenCL ICD loader (`OpenCL.dll` / `libOpenCL.so`) | Dispatches to vendor runtime | Runtime |

All three must be present. The most common failure mode is having the driver but not the headers, or the headers but not the ICD loader.

## Step 3: Reinstall glmbayes from source

With the OpenCL environment confirmed, reinstall `glmbayes` from source. The `configure` / `configure.win` script runs automatically, detects the OpenCL headers and library, and sets `-DUSE_OPENCL` if everything is found.

### Windows

Windows users typically need **`devtools`** (or `remotes`) for source installs. Install it first if you do not have it:

```r
install.packages("devtools")
```

Then install `glmbayes` from source. From CRAN with source compilation:

```r
install.packages("glmbayes", type = "source")
```

Or from GitHub if you need a development version:

```r
devtools::install_github("knygren/glmbayes")
```

Rtools must be installed and on your `PATH`. If you have not yet installed Rtools, follow the prompt at .

### Linux / macOS

```r
install.packages("glmbayes", type = "source")
```

On macOS, Xcode Command Line Tools and GCC (via Homebrew) are required; see **`vignette("Chapter-01", package = "opencltools")`** for details.
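
Because `-DUSE_OPENCL` is only set when the headers are found, a quick look for `CL/cl.h` can explain a source build that ends up CPU-only. The sketch below is an illustration under stated assumptions: `find_cl_header()` is a hypothetical helper, and the default directory list is a guess at common Linux-style locations, not the list that the package's `configure` script actually searches.

```r
# Return the candidate include directories that contain CL/cl.h.
# The default directories are assumptions (common Linux-style locations),
# not the search path used by the glmbayes configure script.
find_cl_header <- function(include_dirs = c("/usr/include",
                                            "/usr/local/include",
                                            "/opt/rocm/include")) {
  hits <- file.exists(file.path(include_dirs, "CL", "cl.h"))
  include_dirs[hits]
}

found <- find_cl_header()
if (length(found) == 0L) {
  message("CL/cl.h not found in the directories checked; ",
          "see the opencltools Chapter 01 vignette.")
} else {
  message("CL/cl.h found under: ", paste(found, collapse = ", "))
}
```

An empty result does not prove the headers are absent, only that they are not in the directories listed; pass your own `include_dirs` if your SDK lives elsewhere.
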

### After the install

Confirm the build succeeded:

```r
library(glmbayes)
has_opencl()
#> [1] TRUE
```

# Verifying the setup

Once `has_opencl()` returns `TRUE`, run a full diagnostic to confirm the complete stack:

```r
diagnose_glmbayes()
```

A clean report looks like:

```
=== glmbayes OpenCL Diagnostic Report ===
Environment: linux
GPU: NVIDIA
[OK] Driver installed
[OK] OpenCL headers found (CL/cl.h)
[OK] OpenCL runtime found (OpenCL.dll / ICD)
[OK] OpenCL fully available (headers + runtime)
[OK] Required PATH and library dirs present
[OK] OpenCL runtime probe succeeded (platform available)
[OK] glmbayes was compiled with OpenCL support.
=== End of Diagnostic Report ===
```

Each line reports one layer of the stack. If any line shows `[FAIL]` or `[WARN]`, the report indicates what is missing. Common resolutions:

- **Driver not installed** → install or update your GPU vendor driver.
- **Headers not found** → install the OpenCL SDK; see 'opencltools' Chapter 01.
- **Runtime not found** → install the ICD loader (`ocl-icd-libopencl1` on Linux, included with the CUDA Toolkit on Windows).
- **Runtime probe failed** (`CL_PLATFORM_NOT_FOUND_KHR`) → the ICD loader is present but no vendor platform is registered. On Linux, run `clinfo` outside R to check for visible platforms, and ensure the vendor ICD file is in `/etc/OpenCL/vendors/`.
- **glmbayes not compiled with OpenCL** → the source install did not find OpenCL at compile time; check `opencltools::has_opencl()` and retry Step 2.

On Windows, the Linux/WSL runtime probe step is skipped; rely on the driver and ICD checks instead.

For PATH-related warnings on Windows (CUDA Toolkit bin directory not in PATH), the diagnostic report lists the missing entries. Fix them via system settings or your shell profile; advanced users may use the helpers in `opencltools` directly (see `?opencltools::add_to_path`).
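
For the `CL_PLATFORM_NOT_FOUND_KHR` case on Linux, the vendor ICD registrations can also be inspected from within R. This is a minimal sketch using base R only: `list_icd_files()` is a hypothetical helper, not an `opencltools` function, and it simply lists the `.icd` files in the directory mentioned above.

```r
# List vendor ICD registrations; returns character(0) if the directory
# is missing or contains no .icd files.
# list_icd_files() is a hypothetical helper, not part of opencltools.
list_icd_files <- function(dir = "/etc/OpenCL/vendors") {
  if (!dir.exists(dir)) return(character(0))
  list.files(dir, pattern = "\\.icd$")
}

icds <- list_icd_files()
if (length(icds) == 0L) {
  message("No vendor ICD files found; the ICD loader has no platform to dispatch to.")
} else {
  message("Registered ICD files: ", paste(icds, collapse = ", "))
}

# clinfo is an external tool; Sys.which() returns "" when it is not on PATH.
if (nzchar(Sys.which("clinfo"))) {
  message("clinfo is available; run it in a terminal to list visible platforms.")
}
```

A registered ICD file is necessary but not sufficient: the library it names must also load, which is what `clinfo` and the runtime probe in the diagnostic report test.
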

# Running a GPU-accelerated model

Once set up, pass `use_opencl = TRUE` to `glmb()` or `rglmb()`:

```r
example(Cleveland)
```

The built-in Cleveland example runs a CPU vs OpenCL comparison and is a convenient end-to-end test. The chunks below illustrate the pattern (not executed during the vignette build):

```{r, eval=FALSE}
library(glmbayes)
data("Cleveland")

ps <- Prior_Setup(
  hd ~ age + sex + cp + trestbps + chol + fbs + restecg +
    thalach + exang + oldpeak + slope + ca + thal,
  family = binomial(logit),
  data = Cleveland
)

t_cpu <- system.time({
  fit_cpu <- glmb(
    hd ~ age + sex + cp + trestbps + chol + fbs + restecg +
      thalach + exang + oldpeak + slope + ca + thal,
    family = binomial(link = "logit"),
    pfamily = dNormal(mu = ps$mu, Sigma = ps$Sigma),
    data = Cleveland,
    n = 20000,
    Gridtype = 2,
    use_parallel = TRUE,
    use_opencl = FALSE,
    verbose = FALSE
  )
})

t_gpu <- system.time({
  fit_gpu <- glmb(
    hd ~ age + sex + cp + trestbps + chol + fbs + restecg +
      thalach + exang + oldpeak + slope + ca + thal,
    family = binomial(link = "logit"),
    pfamily = dNormal(mu = ps$mu, Sigma = ps$Sigma),
    data = Cleveland,
    n = 20000,
    Gridtype = 2,
    use_parallel = TRUE,
    use_opencl = TRUE,
    verbose = FALSE
  )
})

t_cpu
t_gpu
```

```{r, echo=FALSE, out.width="100%"}
knitr::include_graphics(
  system.file("extdata", "cleveland_non_opencl_output_01.png", package = "glmbayes")
)
```

```{r, echo=FALSE, out.width="100%"}
knitr::include_graphics(
  system.file("extdata", "cleveland_opencl_output_01.png", package = "glmbayes")
)
```

```{r, eval=FALSE}
summary(fit_gpu)
```

```{r, echo=FALSE, out.width="100%"}
knitr::include_graphics(
  system.file("extdata", "cleveland_summary_output_01.png", package = "glmbayes")
)
knitr::include_graphics(
  system.file("extdata", "cleveland_summary_output_02.png", package = "glmbayes")
)
```

The GPU path gives the same posterior results as the CPU path; only timing differs. GPU gains are most visible with larger models (more predictors, larger `n`, higher-dimensional tangency grids).
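
Two small helpers can make a comparison like the one above portable across machines. This is a sketch under stated assumptions: `gpu_ready()` and `speedup()` are hypothetical names introduced here, not `glmbayes` functions. The first wraps `has_opencl()` so a script can fall back to the CPU path; the second turns two `system.time()` results into a wall-time ratio.

```r
# TRUE only when glmbayes is installed and reports an OpenCL build.
# gpu_ready() is a hypothetical helper, not part of glmbayes.
gpu_ready <- function() {
  requireNamespace("glmbayes", quietly = TRUE) &&
    isTRUE(tryCatch(glmbayes::has_opencl(), error = function(e) FALSE))
}

# Ratio of elapsed wall times from two system.time() results;
# values above 1 mean the second run finished faster than the first.
speedup <- function(t_ref, t_new) {
  unname(t_ref[["elapsed"]] / t_new[["elapsed"]])
}

# Compute the flag once, then pass it as use_opencl = use_gpu in glmb().
use_gpu <- gpu_ready()
```

With the objects from the comparison above, `speedup(t_cpu, t_gpu)` gives the CPU-to-GPU ratio. A single pair of timings is noisy, so repeat the runs before drawing conclusions about a particular model size.
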

# Appendix A: AMD GPUs on Linux (ROCm OpenCL)

AMD provides multiple OpenCL implementations on Linux, but only **ROCm OpenCL** is fully supported and stable. If you are using an AMD GPU, install ROCm OpenCL on **Ubuntu 22.04 or 24.04 LTS**:

```sh
sudo apt-get install rocm-opencl-runtime
```

This installs the AMD OpenCL runtime, the ICD file (`amdocl64.icd`), and ROCm's optimized OpenCL implementation.

**Supported AMD GPUs** (ROCm):

- Radeon RX 7900 XTX / XT / GRE
- Radeon RX 7800 XT / 7700 XT
- Radeon Pro W7900 / W7800 / W7700
- Instinct MI200 / MI300 accelerators

Older GPUs (Polaris, Vega, Navi 1x/2x) are **not supported** by ROCm. Mesa Rusticl is a community alternative that may work but is not officially supported. AMDGPU-PRO OpenCL is legacy and not recommended.

For full per-distribution instructions and verification steps, see **`vignette("Chapter-01", package = "opencltools")`**.