---
title: "Chapter 02-S04: Gamma–Poisson Conjugacy for One Count Rate"
author: "Kjell Nygren"
date: "`r Sys.Date()`"
output: rmarkdown::html_vignette
bibliography: REFERENCES.bib
reference-section-title: References
vignette: >
  %\VignetteIndexEntry{Chapter 02-S04: Gamma–Poisson Conjugacy for One Count Rate}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(glmbayes)
if (requireNamespace("bayesrules", quietly = TRUE)) library(bayesrules)
```

---

## 1. The model

### 1.1 Likelihood

Suppose we observe \(n\) independent counts \(y_1, \dots, y_n\) drawn from a Poisson distribution with unknown rate \(\lambda > 0\):

\[
  Y_i \mid \lambda \;\sim\; \mathrm{Poisson}(\lambda), \qquad i = 1,\dots,n.
\]

The joint likelihood is

\[
  p(y_1,\dots,y_n \mid \lambda)
  \;\propto\; e^{-n\lambda}\,\lambda^{\sum_i y_i}.
\]

Everything relevant for updating beliefs about \(\lambda\) is captured by two sufficient statistics: \(n\) (number of observations) and \(\sum_i y_i\) (total observed count).

**With exposure.**  In many applications the counts are observed over different exposure periods \(e_i\) (time, area, number of opportunities).  The model becomes \(Y_i \mid \lambda \sim \mathrm{Poisson}(e_i \lambda)\), and the sufficient statistics become \(\sum_i e_i\) and \(\sum_i y_i\).  An exposure-weighted example appears in **Appendix A** ([@Albert2009]).

### 1.2 Prior

The rate \(\lambda > 0\) calls for a positive prior.  The **Gamma distribution** in shape–rate form is the natural conjugate choice:

\[
  p(\lambda) \;=\; \frac{\beta^\alpha}{\Gamma(\alpha)}\,\lambda^{\alpha-1}\,e^{-\beta\lambda},
  \qquad \lambda > 0,
\]

with shape \(\alpha > 0\) and rate \(\beta > 0\).  Its mean is \(\alpha/\beta\) and its variance is \(\alpha/\beta^2\).

**Interpreting the parameters.**  Think of \(\alpha - 1\) as a "prior total count" and \(\beta\) as a "prior number of observations".  The prior mean \(\alpha/\beta\) is your pre-data estimate of the rate, and larger \(\beta\) encodes a stronger commitment to that estimate.

Note: many textbooks use **shape–scale** form (\(\theta = 1/\beta\)); **glmbayes** uses the **shape–rate** convention throughout.

### 1.3 Posterior derivation

Multiply the likelihood kernel \(e^{-n\lambda}\lambda^{\sum y_i}\) by the prior kernel \(\lambda^{\alpha-1}e^{-\beta\lambda}\):

\[
  p(\lambda \mid y)
  \;\propto\;
  \lambda^{(\alpha + \sum y_i) - 1}\,e^{-(\beta + n)\lambda}.
\]

This is a Gamma kernel, so the conjugate posterior is:

\[
  \boxed{\lambda \mid y \;\sim\; \mathrm{Gamma}\!\left(\alpha + \textstyle\sum_i y_i,\;\; \beta + n\right).}
\]

**Update rule:** add the total count to the shape, add the sample size (or total exposure) to the rate.

The posterior mean is

\[
  \mathbb{E}[\lambda \mid y]
  \;=\; \frac{\alpha + \sum y_i}{\beta + n}
  \;=\;
  \frac{\beta}{\beta + n}\cdot\frac{\alpha}{\beta}
  \;+\;
  \frac{n}{\beta + n}\cdot\bar{y},
\]

again a weighted average of the prior mean and the data mean.

---

## 2. `bayesrules` illustration

The **`bayesrules`** package [@JohnsonLauEtAl2022] provides `plot_gamma_poisson()` and `summarize_gamma_poisson()` for visualizing the Gamma–Poisson update.  The same scenario appears in the Bayes Rules! conjugate vignette and in Chapter 12 of the book.

**Scenario.**  You observe daily bicycle rental counts \(y_1, \dots, y_9\) from the first nine days of a year.  The data are \(y = (1, 1, 1, 0, 0, 0, 0, 0, 0)\), giving \(\sum y = 3\) across \(n = 9\) days.  Your prior belief is a **\(\Gamma(3, 4)\)** distribution on the daily rate \(\lambda\) (prior mean = 0.75 rentals/day).

```{r gp-br-data}
br_y_df  <- data.frame(y = c(1, 1, 1, rep(0, 6)))  ## n = 9, sum(y) = 3
br_n     <- nrow(br_y_df)
br_shape <- 3
br_rate  <- 4

## Analytic posterior
post_shape_br <- br_shape + sum(br_y_df$y)   ## 3 + 3 = 6
post_rate_br  <- br_rate  + br_n              ## 4 + 9 = 13
```

```{r gp-br-plot, eval = requireNamespace("bayesrules", quietly = TRUE)}
library(bayesrules)
plot_gamma_poisson(
  shape      = br_shape,
  rate       = br_rate,
  sum_y      = sum(br_y_df$y),
  n          = br_n,
  prior      = TRUE,
  likelihood = TRUE,
  posterior  = TRUE
)
```

```{r gp-br-summary, eval = requireNamespace("bayesrules", quietly = TRUE)}
summarize_gamma_poisson(shape = br_shape, rate = br_rate,
                        sum_y = sum(br_y_df$y), n = br_n)
```

**Reading the output.**  The posterior is \(\Gamma(6, 13)\) with mean \(6/13 \approx 0.46\).  The data (mean \(\bar y = 0.33\)) have pulled the posterior mean down from the prior mean (0.75), and the posterior is noticeably tighter.

```{r gp-analytic}
gp_analytic <- data.frame(
  Example = "Bayes Rules! daily counts",
  n = br_n,
  `sum(y)` = sum(br_y_df$y),
  Posterior = sprintf("Gamma(%d, %d)", post_shape_br, post_rate_br),
  Mean = post_shape_br / post_rate_br,
  SD = sqrt(post_shape_br) / post_rate_br,
  check.names = FALSE
)
knitr::kable(gp_analytic, digits = 4,
  caption = "Conjugate Gamma--Poisson posterior (Bayes Rules! scenario)")
```

---

## 3. Fitting with **`glmb()`**

The scalar Gamma–Poisson model maps to **`glmb()`** with `family = poisson(link = "identity")` and `dGamma(shape, rate, beta, Inv_Dispersion = FALSE)`.  The single coefficient is the rate \(\lambda\).

```{r gp-glmb}
gp_beta <- matrix(br_shape / br_rate, 1L, 1L, dimnames = list(NULL, "(Intercept)"))
gp_pf   <- dGamma(shape = br_shape, rate = br_rate,
                  beta = gp_beta, Inv_Dispersion = FALSE)

set.seed(2026)
fit_gp <- glmb(
  n       = 20000,
  y ~ 1,
  data    = br_y_df,
  weights = rep(1L, br_n),
  family  = poisson(link = "identity"),
  pfamily = gp_pf
)
print(fit_gp)
```

```{r gp-compare}
gp_compare <- data.frame(
  Example = "Bayes Rules! daily counts",
  Posterior = gp_analytic$Posterior,
  `Analytic Mean` = gp_analytic$Mean,
  `Analytic SD`   = gp_analytic$SD,
  `glmb Post.Mean` = fit_gp$coef.means["(Intercept)"],
  `glmb Post.Sd`   = sd(fit_gp$coefficients[, "(Intercept)", drop = TRUE]),
  check.names = FALSE
)
knitr::kable(gp_compare, digits = 4,
  caption = "Analytic vs. glmb() posterior mean and SD")
```

The **`glmb Post.Mean`** and **`glmb Post.Sd`** columns should match the analytic **Mean** and **SD** to Monte Carlo error because `dGamma(Inv_Dispersion = FALSE)` implements the exact conjugate update.

---

## 4. Connection to glmbayes

The scalar Gamma–Poisson model is the conjugate prototype for **Poisson `glmb()` with `family = poisson(link = "identity")`**.  In that setting the single coefficient is the rate \(\lambda\), the prior is `dGamma(shape, rate, beta, Inv_Dispersion = FALSE)`, and the posterior update is the closed form above.

For general Poisson **regression** with covariates (log link), conjugacy is lost and `glmb()` uses envelope-based accept–reject sampling with a `dNormal()` prior on the regression coefficients (Chapter 10).  The Bayes Rules! **equality_index** Poisson regression example is worked in **Chapter 10, Appendix A**.

---

# Appendix A. Heart transplant mortality (@Albert2009)

The **`hearttransplants`** dataset [@Albert2020LearnBayes; @ChristiansenMorris1995] records, for each of 94 U.S. hospitals, the **exposure** `e` (expected deaths under the national baseline rate) and **observed deaths** `y` within 30 days of surgery.  The model is \(y_i \mid \lambda_i \sim \mathrm{Poisson}(e_i \lambda_i)\) where \(\lambda_i\) is the hospital's standardized mortality ratio.

```{r ht-load, eval = requireNamespace("LearnBayes", quietly = TRUE)}
library(LearnBayes)
data("hearttransplants")
cat(sprintf("%d hospitals  |  sum(y) = %d  |  sum(e) = %.0f\n",
            nrow(hearttransplants), sum(hearttransplants$y), sum(hearttransplants$e)))
```

**Albert's prior.**  Albert (Ch. 3.2) uses a \(\Gamma(16, 15174)\) prior on \(\lambda\), derived from historical national data (prior mean \(\approx 0.00105\)).  He focuses on two specific hospitals:

| Hospital | Exposure \(e\) | Deaths \(y\) | Analytic posterior | Posterior mean | Posterior SD |
|----------|-----------------|-------------|-------------------|----------------|--------------|
| A | 66 | 1 | \(\Gamma(17,\; 15240)\) | \(17/15240 \approx 0.001115\) | \(\sqrt{17}/15240 \approx 0.000271\) |
| B | 1767 | 4 | \(\Gamma(20,\; 16941)\) | \(20/16941 \approx 0.001181\) | \(\sqrt{20}/16941 \approx 0.000264\) |

```{r ht-prior}
alpha0_ht <- 16L;  beta0_ht <- 15174L
ex_A <- 66L;   yobs_A <- 1L
ex_B <- 1767L; yobs_B <- 4L
```

```{r ht-albert-analytic}
ht_albert <- data.frame(
  Hospital = c("A", "B"),
  e = c(ex_A, ex_B),
  y = c(yobs_A, yobs_B),
  Posterior = c("Gamma(17, 15240)", "Gamma(20, 16941)"),
  Mean = c(
    (alpha0_ht + yobs_A) / (beta0_ht + ex_A),
    (alpha0_ht + yobs_B) / (beta0_ht + ex_B)
  ),
  SD = c(
    sqrt(alpha0_ht + yobs_A) / (beta0_ht + ex_A),
    sqrt(alpha0_ht + yobs_B) / (beta0_ht + ex_B)
  ),
  check.names = FALSE
)
knitr::kable(ht_albert, digits = 6,
  caption = "Albert Ch. 3.2: analytic posterior mean and SD (shape--rate Gamma)")
```

### A.1 Fitting with **`glmb()`**

The exposure enters as `weights`.  The conjugate sampler updates \(\Gamma(\alpha_0 + y,\; \beta_0 + e)\) exactly.

```{r ht-glmb-A, eval = requireNamespace("LearnBayes", quietly = TRUE)}
ht_beta <- matrix(alpha0_ht / beta0_ht, 1L, 1L, dimnames = list(NULL, "(Intercept)"))
ht_pf   <- dGamma(shape = alpha0_ht, rate = beta0_ht,
                  beta = ht_beta, Inv_Dispersion = FALSE)

set.seed(2026)
fit_A <- glmb(n = 20000, y ~ 1, data = data.frame(y = yobs_A),
              weights = ex_A, family = poisson(link = "identity"), pfamily = ht_pf)
print(fit_A)
```

```{r ht-glmb-B, eval = requireNamespace("LearnBayes", quietly = TRUE)}
set.seed(2026)
fit_B <- glmb(n = 20000, y ~ 1, data = data.frame(y = yobs_B),
              weights = ex_B, family = poisson(link = "identity"), pfamily = ht_pf)
print(fit_B)
```

```{r ht-glmb-compare, eval = requireNamespace("LearnBayes", quietly = TRUE)}
ht_compare <- data.frame(
  Hospital = c("A", "B"),
  e = c(ex_A, ex_B),
  y = c(yobs_A, yobs_B),
  Posterior = c("Gamma(17, 15240)", "Gamma(20, 16941)"),
  `Albert Mean` = ht_albert$Mean,
  `Albert SD`   = ht_albert$SD,
  `glmb Post.Mean` = c(fit_A$coef.means["(Intercept)"],
                       fit_B$coef.means["(Intercept)"]),
  `glmb Post.Sd`   = c(sd(fit_A$coefficients[, "(Intercept)", drop = TRUE]),
                       sd(fit_B$coefficients[, "(Intercept)", drop = TRUE])),
  check.names = FALSE
)
knitr::kable(ht_compare, digits = 6,
  caption = "Albert analytic vs. glmb() posterior mean and SD")
```

Each fit is for **one hospital**: `(Intercept)` is the posterior draw for that hospital's rate \(\lambda\).  The **`glmb Post.Mean`** and **`glmb Post.Sd`** columns should match Albert's **Mean** and **SD** to Monte Carlo error.

---

## See also

- **Chapter-02-S01** — introduction to Bayesian inference and conjugacy.
- **Chapter-02-S02** — Normal–Normal conjugacy for one mean.
- **Chapter-02-S03** — Beta–Binomial conjugacy for one proportion.
- **Chapter-02-S05** — Gamma–Gamma conjugacy for a Gamma response rate.
- **Chapter 10, Appendix A** — Bayes Rules! Poisson regression (`equality_index`).
- **Chapter 18** — hierarchical Poisson mixed model (BikeSharing; block Gibbs with `rglmb`).