1 Poisson-lognormal distribution

A compound Poisson-lognormal distribution is a Poisson probability distribution where its parameter \(\lambda\) is a random variable with lognormal distribution, that is to say log\(\lambda\) are normally distributed with mean \(\mu\) and variance \(\sigma^2\) (Bulmer 1974). The density function is

\[ P(k) = \frac{1}{\sqrt{2\pi\sigma^2}k!}\int^\infty_0\lambda^{k-1}exp(-\lambda)exp(\frac{-(log\lambda-\mu)^2}{2\sigma^2})d\lambda, \; \text{where} \; k = 0, 1, 2, ... \;. \]

The likelihood of the Poisson-lognormal distribution needs to be evaluated by numerical integration. Writing its log-likelihood function from scratch is a pain. In Stan, we can define the likelihood of the Poisson-lognormal distribution using priors literally (i.e., log\(\lambda\) are normally distributed with mean \(\mu\) and variance \(\sigma^2\)) as follows;

\[ \sigma \sim \text{Half-}Cauchy(0, 5) \] \[ \mu \sim N(0, 10) \] \[ log\lambda_i \sim N(\mu, \sigma) \] \[ y_i \sim Pois(\lambda_i) \; \text{or} \; y_i \sim PoisLog(log\lambda_i) \]

where \(y_i\) is given as data and \(PoisLog\) is log parameterization of the Poisson (it’s different from a compound Poisson-lognormal distribution) (Stan Version 2.17.0 Manual page 520).

writeLines(readLines("poilog.stan"))

## data{
##   int<lower=0> N;
##   int y[N];
## }
## 
## parameters {
##   real mu;
##   real<lower=0> sigma;
##   vector[N] log_lambda;
## }
## 
## model {
##   sigma ~ cauchy(0, 5);
##   mu ~ normal(0, 10);
##   log_lambda ~ normal(mu, sigma);
##   for (n in 1:N)
##     target += poisson_log_lpmf(y[n] | log_lambda[n]);
## }

2 Testing

We generate dummy data, a sample of 100 random positive integers drawn from a Poisson distribution with mean of log\(\lambda_i\) which are normally distributed with a mean of log(50) \(\simeq\) 3.91 and a standard deviation of 1.0.

set.seed(123)
N <- 100
mu <- log(50)
sigma <- 1
log_lambda <- rnorm(N, mu, sigma)
Y <- rpois(N, exp(log_lambda))

library(ggthemes)
theme_set(theme_solarized(light=FALSE))
dat <- data_frame(Y)

ggplot(dat, aes(Y)) +
  geom_histogram(bins=30, fill = "#268bd2", col = "black")

We then use the Stan to fit the data.

library(rstan)
library(loo)
rstan_options(auto_write = TRUE)
options(mc.cores = parallel::detectCores())

list_dat <- list(N = N, y = Y)

fit <- stan(file = "poilog.stan",
          data = list_dat,
          iter = 1000,
          warmup = 500,
          thin = 1,
          chains =  4,
          refresh = 500,
          control = list(adapt_delta = 0.9, max_treedepth = 20))

SAMPLING FOR MODEL 'poilog' NOW (CHAIN 1).

Gradient evaluation took 6.4e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.64 seconds.
Adjust your expectations accordingly!


Iteration:   1 / 1000 [  0%]  (Warmup)
Iteration: 500 / 1000 [ 50%]  (Warmup)
Iteration: 501 / 1000 [ 50%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)

 Elapsed Time: 0.262078 seconds (Warm-up)
               0.187632 seconds (Sampling)
               0.44971 seconds (Total)


SAMPLING FOR MODEL 'poilog' NOW (CHAIN 2).

Gradient evaluation took 4.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.45 seconds.
Adjust your expectations accordingly!


Iteration:   1 / 1000 [  0%]  (Warmup)
Iteration: 500 / 1000 [ 50%]  (Warmup)
Iteration: 501 / 1000 [ 50%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)

 Elapsed Time: 0.254045 seconds (Warm-up)
               0.1933 seconds (Sampling)
               0.447345 seconds (Total)


SAMPLING FOR MODEL 'poilog' NOW (CHAIN 3).

Gradient evaluation took 3.5e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.35 seconds.
Adjust your expectations accordingly!


Iteration:   1 / 1000 [  0%]  (Warmup)
Iteration: 500 / 1000 [ 50%]  (Warmup)
Iteration: 501 / 1000 [ 50%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)

 Elapsed Time: 0.229415 seconds (Warm-up)
               0.185794 seconds (Sampling)
               0.415209 seconds (Total)


SAMPLING FOR MODEL 'poilog' NOW (CHAIN 4).

Gradient evaluation took 4.2e-05 seconds
1000 transitions using 10 leapfrog steps per transition would take 0.42 seconds.
Adjust your expectations accordingly!


Iteration:   1 / 1000 [  0%]  (Warmup)
Iteration: 500 / 1000 [ 50%]  (Warmup)
Iteration: 501 / 1000 [ 50%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)

 Elapsed Time: 0.230132 seconds (Warm-up)
               0.187833 seconds (Sampling)
               0.417965 seconds (Total)

The resulting estimates well recovered the original values (\(\mu\) and \(\sigma\)) 🍺


print(fit)

Inference for Stan model: poilog.
4 chains, each with iter=1000; warmup=500; thin=1;
post-warmup draws per chain=500, total post-warmup draws=2000.

                   mean se_mean   sd    2.5%     25%     50%     75%   97.5% n_eff Rhat
mu                 4.00    0.00 0.10    3.80    3.93    4.00    4.06    4.18  2000 1.00
sigma              0.94    0.00 0.07    0.82    0.89    0.94    0.99    1.10  2000 1.00
log_lambda[1]      3.20    0.00 0.20    2.79    3.06    3.20    3.33    3.56  2000 1.00
log_lambda[2]      3.66    0.00 0.16    3.35    3.56    3.67    3.76    3.95  2000 1.00
log_lambda[3]      5.53    0.00 0.06    5.41    5.49    5.53    5.58    5.65  2000 1.00
.....

log_lambda[97]     6.14    0.00 0.05    6.05    6.11    6.14    6.17    6.24  2000 1.00
log_lambda[98]     5.48    0.00 0.07    5.35    5.43    5.48    5.52    5.61  2000 1.00
log_lambda[99]     3.61    0.00 0.16    3.28    3.50    3.61    3.72    3.91  2000 1.00
log_lambda[100]    2.83    0.01 0.23    2.36    2.67    2.84    3.00    3.25  2000 1.00
lp__            -385.12    0.30 7.16 -400.14 -389.95 -384.88 -380.07 -372.43   580 1.01

Samples were drawn using NUTS(diag_e) at Fri Sep 21 18:17:19 2018.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at
convergence, Rhat=1).

3 Computing Environment

writeLines(readLines(file.path(Sys.getenv("HOME"), ".R/Makevars")))

## CXXFLAGS=-O3 -mtune=native -march=native -Wno-unused-variable -Wno-unused-function
## CC=clang
## CXX=clang++ -arch x86_64 -ftemplate-depth-256

devtools::session_info("rstan")

## Session info -------------------------------------------------------------

##  setting  value                       
##  version  R version 3.5.1 (2018-07-02)
##  system   x86_64, darwin17.7.0        
##  ui       unknown                     
##  language (EN)                        
##  collate  en_US.UTF-8                 
##  tz       America/Chicago             
##  date     2018-09-23

## Packages -----------------------------------------------------------------

##  package      * version   date       source                         
##  assertthat     0.2.0     2017-04-11 CRAN (R 3.5.0)                 
##  BH             1.66.0-1  2018-02-13 CRAN (R 3.5.0)                 
##  cli            1.0.0     2017-11-05 CRAN (R 3.5.0)                 
##  colorspace     1.3-2     2016-12-14 CRAN (R 3.5.0)                 
##  crayon         1.3.4     2017-09-16 CRAN (R 3.5.0)                 
##  digest         0.6.15    2018-01-28 CRAN (R 3.5.0)                 
##  ggplot2      * 3.0.0     2018-07-03 cran (@3.0.0)                  
##  glue           1.3.0     2018-09-21 Github (tidyverse/glue@4e74901)
##  graphics     * 3.5.1     2018-07-26 local                          
##  grDevices    * 3.5.1     2018-07-26 local                          
##  grid           3.5.1     2018-07-26 local                          
##  gridExtra      2.3       2017-09-09 CRAN (R 3.5.0)                 
##  gtable         0.2.0     2016-02-26 CRAN (R 3.5.0)                 
##  inline         0.3.15    2018-05-18 CRAN (R 3.5.0)                 
##  labeling       0.3       2014-08-23 CRAN (R 3.5.0)                 
##  lattice        0.20-35   2017-03-25 CRAN (R 3.5.1)                 
##  lazyeval       0.2.1     2017-10-29 CRAN (R 3.5.0)                 
##  magrittr       1.5       2014-11-22 CRAN (R 3.5.0)                 
##  MASS           7.3-50    2018-04-30 CRAN (R 3.5.1)                 
##  Matrix         1.2-14    2018-04-13 CRAN (R 3.5.1)                 
##  methods      * 3.5.1     2018-07-26 local                          
##  mgcv           1.8-24    2018-06-23 CRAN (R 3.5.1)                 
##  munsell        0.5.0     2018-06-12 CRAN (R 3.5.0)                 
##  nlme           3.1-137   2018-04-07 CRAN (R 3.5.1)                 
##  pillar         1.2.3     2018-05-25 CRAN (R 3.5.0)                 
##  plyr           1.8.4     2016-06-08 CRAN (R 3.5.0)                 
##  R6             2.2.2     2017-06-17 CRAN (R 3.5.0)                 
##  RColorBrewer   1.1-2     2014-12-07 CRAN (R 3.5.1)                 
##  Rcpp           0.12.18   2018-07-23 cran (@0.12.18)                
##  RcppEigen      0.3.3.4.0 2018-02-07 CRAN (R 3.5.0)                 
##  reshape2       1.4.3     2017-12-11 CRAN (R 3.5.0)                 
##  rlang          0.2.2     2018-08-16 cran (@0.2.2)                  
##  rstan        * 2.17.3    2018-01-20 CRAN (R 3.5.0)                 
##  scales         1.0.0     2018-08-09 cran (@1.0.0)                  
##  StanHeaders  * 2.17.2    2018-01-20 CRAN (R 3.5.0)                 
##  stats        * 3.5.1     2018-07-26 local                          
##  stats4         3.5.1     2018-07-26 local                          
##  stringi        1.2.3     2018-06-12 CRAN (R 3.5.0)                 
##  stringr      * 1.3.1     2018-05-10 CRAN (R 3.5.0)                 
##  tibble       * 1.4.2     2018-01-22 CRAN (R 3.5.0)                 
##  tools          3.5.1     2018-07-26 local                          
##  utf8           1.1.4     2018-05-24 CRAN (R 3.5.0)                 
##  utils        * 3.5.1     2018-07-26 local                          
##  viridisLite    0.3.0     2018-02-01 CRAN (R 3.5.0)                 
##  withr          2.1.2     2018-03-15 CRAN (R 3.5.0)

Fitting a Poisson-lognormal distribution in Stan (rstan)

Masatoshi Katabuchi

September 2018

1 Poisson-lognormal distribution

2 Testing

3 Computing Environment

Fitting a Poisson-lognormal distribution in Stan (rstan)

Masatoshi Katabuchi

September 2018

1 Poisson-lognormal distribution

2 Testing

3 Computing Environment

4 Related