Exponential family and `family` objects

family objects specify characteristics of the models used by functions such as glm. The families implemented in the stats package include binomial, gaussian, Gamma, inverse.gaussian, and poisson, which obvious corresponding distributions. These distributions are all special cases of the exponential family of distributions with probability mass or density function of the form \[ f_{Y}(y) = \exp\left\{\frac{y \theta - b(\theta) - c_1(y)}{\phi/m} - \frac{1}{2}a\left(-\frac{m}{\phi}\right) + c_2(y) \right\} \] for some sufficiently smooth functions \(b(.)\), \(c_1(.)\), \(a(.)\) and \(c_2(.)\), and a fixed weight \(m\). The expected value and the variance of \(Y\) are then \[\begin{align*} E(Y) & = \mu = b'(\theta) \\ Var(Y) & = \frac{\phi}{m}b''(\theta) = \frac{\phi}{m}V(\mu) \end{align*}\] where \(V(\mu)\) and \(\phi\) are the variance function and the dispersion parameter, respectively. Below we list the characteristics of the distributions supported by family objects.

Normal with mean \(\mu\) and variance \(\phi/m\)

\(\theta = \mu\), \(\displaystyle b(\theta) = \frac{\theta^2}{2}\), \(\displaystyle c_1(y) = \frac{y^2}{2}\), \(\displaystyle a(\zeta) = -\log(-\zeta)\), \(\displaystyle c_2(y) = -\frac{1}{2}\log(2\pi)\)

Binomial with index \(m\) and probability \(\mu\)

\(\displaystyle \theta = \log\frac{\mu}{1- \mu}\), \(\displaystyle b(\theta) = \log(1 + e^\theta)\), \(\displaystyle \phi = 1\), \(\displaystyle c_1(y) = 0\), \(\displaystyle a(\zeta) = 0\), \(\displaystyle c_2(y) = \log{m\choose{my}}\)

Poisson with mean \(\mu\)

\(\displaystyle \theta = \log\mu\), \(\displaystyle b(\theta) = e^\theta\), \(\displaystyle \phi = 1\), \(\displaystyle c_1(y) = 0\), \(\displaystyle a(\zeta) = 0\), \(\displaystyle c_2(y) = -\log\Gamma(y + 1)\)

Gamma with mean \(\mu\) and shape \(1/\phi\)

\(\displaystyle \theta = -\frac{1}{\mu}\), \(\displaystyle b(\theta) = -\log(-\theta)\), \(\displaystyle c_1(y) = -\log y\), \(\displaystyle a(\zeta) = 2 \log \Gamma(-\zeta) + 2 \zeta \log\left(-\zeta\right)\), \(\displaystyle c_2(y) = -\log y\)

Inverse Gaussian with mean \(\mu\) and variance \(\phi\mu^3\)

\(\displaystyle \theta = -\frac{1}{2\mu^2}\), \(\displaystyle b(\theta) = -\sqrt{-2\theta}\), \(\displaystyle c_1(y) = \frac{1}{2y}\), \(\displaystyle a(\zeta) = -\log(-\zeta)\), \(\displaystyle c_2(y) = -\frac{1}{2}\log\left(\pi y^3\right)\)

Components in `family` objects

family objects provide functions for the variance function (variance), a specification of deviance residuals (dev.resids) and the Akaike information criterion (aic). For example

inverse.gaussian()$dev.resids

## function (y, mu, wt) 
## wt * ((y - mu)^2)/(y * mu^2)
## <bytecode: 0x7fbe690297d0>
## <environment: 0x7fbe6902f4c0>

inverse.gaussian()$variance

## function (mu) 
## mu^3
## <bytecode: 0x7fbe69029990>
## <environment: 0x7fbe5f0335b0>

inverse.gaussian()$aic

## function (y, n, mu, wt, dev) 
## sum(wt) * (log(dev/sum(wt) * 2 * pi) + 1) + 3 * sum(log(y) * 
##     wt) + 2
## <bytecode: 0x7fbe69029108>
## <environment: 0x7fbe5f08ea50>

Enrichment options for `family` objects

The enrichwith R package provides methods for the enrichment of family objects with a function that links the natural parameter \(\theta\) with \(\mu\), the function \(b(\theta)\), the first two derivatives of \(V(\mu)\), \(a(\zeta)\) and its first four derivatives, and \(c_1(y)\) and \(c_2(y)\). To illustrate, let’s write a function that reconstructs the densities and probability mass functions from the components that result from enrichment

library("enrichwith")
dens <- function(y, m = 1, mu, phi, family) {
    object <- enrich(family)
    with(object, {
        c2 <- if (family == "binomial") c2fun(y, m) else c2fun(y)
        exp(m * (y * theta(mu) - bfun(theta(mu)) - c1fun(y))/phi -
            0.5 * afun(-m/phi) + c2)
    })
}

The following chunks test dens for a few distributions against the standard d* functions

## Normal
all.equal(dens(y = 0.2, m = 3, mu = 1, phi = 3.22, gaussian()),
          dnorm(x = 0.2, mean = 1, sd = sqrt(3.22/3)))

## [1] TRUE

## Gamma
all.equal(dens(y = 3, m = 1.44, mu = 2.3, phi = 1.3, Gamma()),
          dgamma(x = 3, shape = 1.44/1.3, 1.44/(1.3 * 2.3)))

## [1] TRUE

## Inverse gaussian
all.equal(dens(y = 0.2, m = 7.23, mu = 1, phi = 3.22, inverse.gaussian()),
          SuppDists::dinvGauss(0.2, nu = 1, lambda = 7.23/3.22))

## [1] TRUE

## Binomial
all.equal(dens(y = 0.34, m = 100, mu = 0.32, phi = 1, binomial()),
          dbinom(x = 34, size = 100, prob = 0.32))

## [1] TRUE

Enriching `family` objects: exponential family of distributions

Ioannis Kosmidis

2020-01-09

Introduction

Exponential family and `family` objects

Normal with mean \(\mu\) and variance \(\phi/m\)

Binomial with index \(m\) and probability \(\mu\)

Poisson with mean \(\mu\)

Gamma with mean \(\mu\) and shape \(1/\phi\)

Inverse Gaussian with mean \(\mu\) and variance \(\phi\mu^3\)

Components in `family` objects

Enrichment options for `family` objects

Enriching family objects: exponential family of distributions

Ioannis Kosmidis

2020-01-09

Introduction

Exponential family and family objects

Normal with mean \(\mu\) and variance \(\phi/m\)

Binomial with index \(m\) and probability \(\mu\)

Poisson with mean \(\mu\)

Gamma with mean \(\mu\) and shape \(1/\phi\)

Inverse Gaussian with mean \(\mu\) and variance \(\phi\mu^3\)

Components in family objects

Enrichment options for family objects

Enriching `family` objects: exponential family of distributions

Exponential family and `family` objects

Components in `family` objects

Enrichment options for `family` objects