--- title: "Chapter 02-S05: Gamma–Gamma Conjugacy for One Response Rate" author: "Kjell Nygren" date: "`r Sys.Date()`" output: rmarkdown::html_vignette bibliography: REFERENCES.bib reference-section-title: References vignette: > %\VignetteIndexEntry{Chapter 02-S05: Gamma–Gamma Conjugacy for One Response Rate} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup} library(glmbayes) ``` --- ## 1. The model ### 1.1 Likelihood Suppose we observe \(n\) positive continuous responses \(y_1, \dots, y_n\) — waiting times, survival times, monetary amounts — drawn from a Gamma distribution with **known shape** \(k > 0\) and **unknown rate** \(\beta > 0\): \[ Y_i \mid \beta \;\sim\; \mathrm{Gamma}(k,\, \beta), \qquad i = 1,\dots,n. \] The mean response is \(\mu = k/\beta\) and the variance is \(k/\beta^2\). The log-likelihood (up to constants) is: \[ \ell(\beta) \;\propto\; n k \log\beta - \beta\,\textstyle\sum_i y_i. \] **Why fix the shape?** Conjugacy requires \(k\) to be treated as known. In practice you either know it from domain knowledge (e.g., exponential responses have \(k = 1\)) or estimate it by method-of-moments: \(\hat k = \bar y^2 / s^2\). When both \(k\) and \(\beta\) are unknown, conjugacy breaks and a rejection sampler is needed — that case is handled by `Gamma(log)` with `glmbdisp()`. **The special case \(k = 1\)** gives the **Exponential distribution**, the natural model for inter-event waiting times under a constant-hazard assumption. ### 1.2 Prior Because \(\beta > 0\), the Gamma distribution is again the natural conjugate prior: \[ \beta \;\sim\; \mathrm{Gamma}(\alpha_0,\, b_0), \] with prior mean \(\alpha_0 / b_0\) and variance \(\alpha_0 / b_0^2\). **Relationship to the dispersion.** In GLM terms the Gamma dispersion is \(\phi = 1/k\), so the shape \(k\) is the precision \(1/\phi\). Specifying `lik_shape = k` in `dGamma(Inv_Dispersion = FALSE)` is equivalent to specifying a known dispersion. **The "identity" link.** When you pass `family = Gamma(link = "identity")` to **`glmb()`**, the single regression coefficient is the rate \(\beta\) directly. This matches the conjugate parameterization. ### 1.3 Posterior derivation Multiply the likelihood kernel \(\beta^{n k}\,e^{-\beta \sum y_i}\) by the prior kernel \(\beta^{\alpha_0 - 1}\,e^{-b_0 \beta}\): \[ p(\beta \mid y) \;\propto\; \beta^{(\alpha_0 + n k) - 1}\,e^{-(b_0 + \sum_i y_i)\,\beta}. \] This is a Gamma kernel, giving the conjugate posterior: \[ \boxed{\beta \mid y \;\sim\; \mathrm{Gamma}\!\left(\alpha_0 + n k,\;\; b_0 + \textstyle\sum_i y_i\right).} \] **Update rule:** - Add \(n \times k\) to the shape (\(n\) itself when \(k = 1\)). - Add the sum of observations \(\sum_i y_i\) to the rate. The posterior mean **rate** is \((\alpha_0 + n k)/(b_0 + \sum y_i)\), and the posterior mean **response** is: \[ \mathbb{E}[\mu \mid y] \;\approx\; \frac{k}{\mathbb{E}[\beta \mid y]} \;=\; \frac{k(b_0 + \sum_i y_i)}{\alpha_0 + n k}. \] --- ## 2. Illustration with base-R plots The **`bayesrules`** package does not currently include a `plot_gamma_gamma()` helper, so we plot the prior, the (normalized) likelihood, and the posterior directly with `curve()`. **Scenario.** You record the waiting time between calls to a customer service hotline. Exponential waiting times (\(k = 1\)) are a standard model. You place a **\(\Gamma(2, 4)\)** prior on the call rate \(\beta\) (prior mean = 0.5 calls/minute). You then observe \(n = 20\) inter-arrival times with sum \(\sum y = 35\) minutes. ```{r ggamma-illustration-setup} alpha0 <- 2; b0 <- 4 ## Gamma(2, 4) prior: mean = 0.5 k <- 1 ## known shape (exponential) n_obs <- 20L; sum_y <- 35 ## data: 20 observations, sum = 35 post_shape <- alpha0 + n_obs * k post_rate <- b0 + sum_y ggamma_analytic <- data.frame( Example = "Call inter-arrivals (illustration)", n = n_obs, `sum(y)` = sum_y, Posterior = sprintf("Gamma(%d, %d)", post_shape, post_rate), `Mean (rate)` = post_shape / post_rate, `SD (rate)` = sqrt(post_shape) / post_rate, check.names = FALSE ) knitr::kable(ggamma_analytic, digits = 4, caption = "Conjugate Gamma--Gamma posterior for rate beta") ``` ```{r ggamma-illustration-plot, fig.width = 6, fig.height = 4} beta_grid <- seq(0.01, 1.5, length.out = 500) ## Normalized likelihood: Gamma(n*k + 1, sum_y) shape, treated as density lik_unnorm <- dgamma(beta_grid, shape = n_obs * k + 1, rate = sum_y) ## Scale likelihood to have same peak height as the prior (visual only) prior_vals <- dgamma(beta_grid, alpha0, b0) post_vals <- dgamma(beta_grid, post_shape, post_rate) lik_scaled <- lik_unnorm * (max(prior_vals) / max(lik_unnorm)) plot(beta_grid, prior_vals, type = "l", lwd = 2, col = "steelblue", xlab = expression(beta), ylab = "Density", main = "Gamma–Gamma update: prior, likelihood, posterior", ylim = c(0, max(post_vals) * 1.1)) lines(beta_grid, lik_scaled, lwd = 2, col = "darkgreen", lty = 2) lines(beta_grid, post_vals, lwd = 2, col = "tomato") legend("topright", legend = c("Prior Gamma(2, 4)", "Likelihood (scaled)", sprintf("Posterior Gamma(%d, %d)", post_shape, post_rate)), col = c("steelblue", "darkgreen", "tomato"), lty = c(1, 2, 1), lwd = 2, bty = "n") abline(v = alpha0 / b0, lty = 3, col = "steelblue") abline(v = post_shape / post_rate, lty = 3, col = "tomato") ``` The posterior is pulled from the prior mean (0.5) toward the data-based MLE (\(n/\sum y = 20/35 \approx 0.57\)) and is considerably tighter than either. --- ## 3. Real data: Boston Marathon finishing times The `bayesrules::cherry_blossom_sample` dataset records net finishing times (minutes) for runners in a 10-mile cherry blossom race [@JohnsonLauEtAl2022]. Finishing times are continuous and positive; we model them as Gamma-distributed with a method-of-moments estimate of the shape \(k\). ```{r cb-data, eval = requireNamespace("bayesrules", quietly = TRUE)} library(bayesrules) ## Use net finishing times (minutes); drop missing values y_cb <- cherry_blossom_sample$net y_cb <- y_cb[is.finite(y_cb)] n_cb <- length(y_cb) ybar_cb <- mean(y_cb) s2_cb <- var(y_cb) ## Method-of-moments estimate of shape k: k_hat = ybar^2 / s^2 k_hat_cb <- ybar_cb^2 / s2_cb cat(sprintf("n = %d, mean = %.2f min, var = %.2f, k_hat = %.3f\n", n_cb, ybar_cb, s2_cb, k_hat_cb)) ``` The estimated shape \(\hat k \approx\) `r if(requireNamespace("bayesrules",quietly=TRUE)) round(mean(bayesrules::cherry_blossom_sample$net[is.finite(bayesrules::cherry_blossom_sample$net)])^2 / var(bayesrules::cherry_blossom_sample$net[is.finite(bayesrules::cherry_blossom_sample$net)]),2) else "4"` is treated as fixed for the conjugate analysis. **Prior.** Based on general knowledge that competitive 10-mile race times cluster around 85–100 minutes, we set a **\(\Gamma(5, 0.06)\)** prior on the rate \(\beta\) (prior mean rate \(\approx 0.083\), corresponding to a prior mean time \(k/\beta \approx 85\) minutes for the estimated \(k\)). ```{r cb-prior, eval = requireNamespace("bayesrules", quietly = TRUE)} ## Set prior centered at rate corresponding to ~85 min mean finish time ## E[mu] = k / beta_prior => beta_prior = k_hat / 85 alpha0_cb <- 5 b0_cb <- alpha0_cb / (k_hat_cb / ybar_cb) ## prior mean rate = k_hat / ybar cat(sprintf("Prior: Gamma(%.2f, %.4f), prior mean rate = %.4f, implied mean time = %.1f min\n", alpha0_cb, b0_cb, alpha0_cb / b0_cb, k_hat_cb / (alpha0_cb / b0_cb))) ``` ```{r cb-posterior, eval = requireNamespace("bayesrules", quietly = TRUE)} post_shape_cb <- alpha0_cb + n_cb * k_hat_cb post_rate_cb <- b0_cb + sum(y_cb) post_mean_rate_cb <- post_shape_cb / post_rate_cb post_sd_rate_cb <- sqrt(post_shape_cb) / post_rate_cb post_mean_time_cb <- k_hat_cb / post_mean_rate_cb cb_analytic <- data.frame( Dataset = "Cherry blossom finish times", n = n_cb, `k (fixed)` = k_hat_cb, Posterior = sprintf("Gamma(%.2f, %.4f)", post_shape_cb, post_rate_cb), `Mean (rate beta)` = post_mean_rate_cb, `SD (rate beta)` = post_sd_rate_cb, `Mean time (min)` = post_mean_time_cb, check.names = FALSE ) knitr::kable(cb_analytic, digits = 4, caption = "Conjugate Gamma--Gamma posterior for rate beta (shape--rate)") ``` ### 3.1 Fitting with **`glmb()`** ```{r cb-glmb, eval = requireNamespace("bayesrules", quietly = TRUE)} df_cb <- data.frame(y = y_cb) cb_beta <- matrix(alpha0_cb / b0_cb, nrow = 1, ncol = 1, dimnames = list(NULL, "(Intercept)")) cb_pf <- dGamma( shape = alpha0_cb, rate = b0_cb, beta = cb_beta, Inv_Dispersion = FALSE, lik_shape = k_hat_cb ) set.seed(2026) fit_cb <- glmb( n = 20000, y ~ 1, data = df_cb, family = Gamma(link = "identity"), pfamily = cb_pf ) print(fit_cb) ``` ```{r cb-compare, eval = requireNamespace("bayesrules", quietly = TRUE)} cb_compare <- data.frame( Dataset = "Cherry blossom finish times", Posterior = cb_analytic$Posterior, `Analytic Mean (rate)` = cb_analytic$`Mean (rate beta)`, `Analytic SD (rate)` = cb_analytic$`SD (rate beta)`, `glmb Post.Mean` = fit_cb$coef.means["(Intercept)"], `glmb Post.Sd` = sd(fit_cb$coefficients[, "(Intercept)", drop = TRUE]), check.names = FALSE ) knitr::kable(cb_compare, digits = 4, caption = "Analytic vs. glmb() posterior mean and SD for rate beta") ``` The **`glmb Post.Mean`** and **`glmb Post.Sd`** columns refer to the **rate** \(\beta\). Mean finish time is \(\hat k\) divided by posterior mean rate (see **Mean time** in the analytic table). --- ## 4. Connection to glmbayes The scalar Gamma–Gamma model is the conjugate prototype for **Gamma `glmb()` with `family = Gamma(link = "identity")`**. In that setting the single coefficient is the rate \(\beta\), the prior is `dGamma(shape, rate, beta, Inv_Dispersion = FALSE, lik_shape = k)`, and the posterior update is the closed form above. When the shape \(k\) is **unknown**, the Gamma prior on \(k\) is no longer conjugate. That case is handled by **`Gamma(link = "log")`** with `dNormal()` priors on the regression coefficients plus **`glmbdisp()`** for the dispersion \(\phi = 1/k\) (Chapter 10). --- ## See also - **Chapter-02-S01** — introduction to Bayesian inference and conjugacy. - **Chapter-02-S02** — Normal–Normal conjugacy for one mean. - **Chapter-02-S03** — Beta–Binomial conjugacy for one proportion. - **Chapter-02-S04** — Gamma–Poisson conjugacy for one count rate. - **Chapter 10** — non-conjugate Gamma regression with `Gamma(log)`.