class: center, middle, inverse, title-slide

# Lec 20 - More PyMC3
## Statistical Computing and Computation
### Sta 663 | Spring 2022
### Dr. Colin Rundel

---
exclude: true

```python
import numpy as np

import matplotlib as mpl
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

import scipy

import pymc3 as pm
import arviz as az

plt.rcParams['figure.dpi'] = 200

np.set_printoptions(
  edgeitems=30, linewidth=200,
  precision = 5, suppress=True
  #formatter=dict(float=lambda x: "%.5g" % x)
)

pd.set_option("display.width", 150)
pd.set_option("display.max_columns", 10)
pd.set_option("display.precision", 6)
```

```r
knitr::opts_chunk$set(
  fig.align="center",
  cache=FALSE
)

library(lme4)
```

```
## Loading required package: Matrix
```

```r
local({
  hook_err_old <- knitr::knit_hooks$get("error")  # save the old hook
  knitr::knit_hooks$set(error = function(x, options) {
    # now do whatever you want to do with x, and pass
    # the new x to the old hook
    x = sub("## \n## Detailed traceback:\n.*$", "", x)
    x = sub("Error in py_call_impl\\(.*?\\)\\: ", "", x)
    #x = stringr::str_wrap(x, width = 100)
    hook_err_old(x, options)
  })

  hook_warn_old <- knitr::knit_hooks$get("warning")  # save the old hook
  knitr::knit_hooks$set(warning = function(x, options) {
    x = sub("<string>:1: ", "", x)
    #x = stringr::str_wrap(x, width = 100)
    hook_warn_old(x, options)
  })

  hook_msg_old <- knitr::knit_hooks$get("output")  # save the old hook
  knitr::knit_hooks$set(output = function(x, options) {
    x = stringr::str_replace(x, "(## ).* ([A-Za-z]+Warning:)", "\\1\\2")
    x = stringr::str_split(x, "\n")[[1]]
    #x = stringr::str_wrap(x, width = 120, exdent = 3)
    x = stringr::str_remove_all(x, "\r")
    x = stringi::stri_wrap(x, width=120, exdent = 3, normalize=FALSE)
    x = paste(x, collapse="\n")
    #x = stringr::str_wrap(x, width = 100)
    hook_msg_old(x, options)
  })
})
```

---

## Demo 1 - Bayesian Lasso

```python
n = 50
k = 100

np.random.seed(1234)
X = np.random.normal(size=(n, k))

beta = np.zeros(shape=k)
beta[[10,30,50,70]] =  10
beta[[20,40,60,80]] = -10

y = X @ beta + np.random.normal(size=n)
```

.footnote[Based on [Bayesian Sparse Regression](https://betanalpha.github.io/assets/case_studies/bayes_sparse_regression.html) and [Lasso regression with block updating](https://docs.pymc.io/en/v3/pymc-examples/examples/pymc3_howto/lasso_block_update.html)]

---

## Naive Model

```python
with pm.Model() as bayes_lasso:
  b = pm.Laplace("beta", 0, 1, shape=k) #lam*tau, shape=k)
  y_est = X @ b
  s = pm.HalfNormal('sigma', sd=1)

  likelihood = pm.Normal("y", mu=y_est, sigma=s, observed=y)

  trace = pm.sample(return_inferencedata=True, random_seed=1234)
```

```
## █
## Auto-assigning NUTS sampler...
## Initializing NUTS using jitter+adapt_diag...
## Multiprocess sampling (4 chains in 4 jobs)
## NUTS: [sigma, beta]
## Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 19 seconds.
## There were 2 divergences after tuning. Increase `target_accept` or reparameterize.
## The acceptance probability does not match the target. It is 0.878942077718847, but should be close to 0.8. Try to increase the number of tuning steps.
## The estimated number of effective samples is smaller than 200 for some parameters.
```
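
---

## Aside - sampler tuning

The warnings above suggest increasing `target_accept` (or reparameterizing). Below is a minimal sketch, not from the original slides, of re-running the same model with a higher acceptance target and a longer tuning phase; `trace_tuned` and the specific settings are illustrative.

```python
with bayes_lasso:
  # higher target_accept -> smaller NUTS step sizes; longer tuning phase
  # (illustrative settings, results not shown in the lecture)
  trace_tuned = pm.sample(
    tune=2000, target_accept=0.95,
    return_inferencedata=True, random_seed=1234
  )
```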
---

```python
az.summary(trace)
```

```
##             mean     sd   hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
## beta[0]    0.067  0.861   -1.650    1.681      0.015    0.015    3234.0    1938.0   1.00
## beta[1]    0.215  0.729   -1.133    1.693      0.012    0.013    3632.0    2284.0   1.00
## beta[2]   -0.080  0.852   -1.789    1.501      0.014    0.015    3866.0    2652.0   1.00
## beta[3]   -0.290  0.814   -1.926    1.193      0.016    0.015    2870.0    1729.0   1.00
## beta[4]    0.079  0.809   -1.479    1.691      0.014    0.014    3577.0    2158.0   1.00
## ...          ...    ...      ...      ...        ...      ...       ...       ...    ...
## beta[96]   0.106  0.726   -1.271    1.542      0.013    0.013    3471.0    2487.0   1.00
## beta[97]  -0.156  0.716   -1.591    1.160      0.013    0.013    3188.0    1798.0   1.00
## beta[98]   0.289  0.763   -1.076    1.827      0.014    0.015    3107.0    2408.0   1.00
## beta[99]  -0.278  0.768   -1.747    1.205      0.013    0.013    3575.0    2568.0   1.00
## sigma      0.980  0.478    0.275    1.859      0.046    0.032     102.0     211.0   1.05
## 
## [101 rows x 9 columns]
```

--

```python
az.summary(trace).iloc[[0,10,20,30,40,50,60,70,80,100]]
```

```
##             mean     sd   hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
## beta[0]    0.067  0.861   -1.650    1.681      0.015    0.015    3234.0    1938.0   1.00
## beta[10]   8.327  1.242    5.945   10.622      0.027    0.019    2075.0    2710.0   1.00
## beta[20]  -8.288  1.335  -10.697   -5.733      0.030    0.021    2003.0    1746.0   1.00
## beta[30]   8.610  1.023    6.678   10.447      0.023    0.017    2011.0    1702.0   1.00
## beta[40]  -8.765  1.507  -11.485   -5.929      0.030    0.022    2461.0    2531.0   1.00
## beta[50]   8.966  1.016    6.995   10.860      0.023    0.016    2035.0    1842.0   1.00
## beta[60]  -9.248  1.121  -11.381   -7.162      0.022    0.015    2708.0    2371.0   1.00
## beta[70]   8.521  1.210    6.220   10.788      0.026    0.018    2238.0    2575.0   1.00
## beta[80]  -9.923  0.899  -11.634   -8.253      0.018    0.013    2436.0    2003.0   1.00
## sigma      0.980  0.478    0.275    1.859      0.046    0.032     102.0     211.0   1.05
```

---

```python
ax = az.plot_forest(trace)
plt.tight_layout()
plt.show()
```

<img src="Lec20_files/figure-html/unnamed-chunk-5-1.png" width="40%" style="display: block; margin: auto;" />

---

## Plot helper

```python
def plot_slope(trace, prior="beta", chain=0):
  # posterior draws for the requested chain as a long data frame
  post = (trace.posterior[prior]
          .to_dataframe()
          .reset_index()
          .query("chain == @chain")
  )

  sns.catplot(x="beta_dim_0", y="beta", data=post, kind="boxen",
              linewidth=0, color='blue', aspect=2, showfliers=False)
  plt.tight_layout()
  plt.show()
```

---

```python
plot_slope(trace)
```

<img src="Lec20_files/figure-html/unnamed-chunk-7-3.png" width="960" style="display: block; margin: auto;" />

---

## Weakly Informative Prior

```python
with pm.Model() as bayes_weak:
  b = pm.Normal("beta", 0, 10, shape=k)
  y_est = X @ b

  s = pm.HalfNormal('sigma', sd=2)

  likelihood = pm.Normal("y", mu=y_est, sigma=s, observed=y)

  trace = pm.sample(return_inferencedata=True, random_seed=12345)
```

```
## █
## Auto-assigning NUTS sampler...
## Initializing NUTS using jitter+adapt_diag...
## Multiprocess sampling (4 chains in 4 jobs)
## NUTS: [sigma, beta]
## Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 57 seconds.
## The acceptance probability does not match the target. It is 0.9760397075294559, but should be close to 0.8. Try to increase the number of tuning steps.
## The chain reached the maximum tree depth. Increase max_treedepth, increase target_accept or reparameterize.
## There was 1 divergence after tuning. Increase `target_accept` or reparameterize.
## There were 15 divergences after tuning. Increase `target_accept` or reparameterize.
## The acceptance probability does not match the target. It is 0.7066410867916934, but should be close to 0.8. Try to increase the number of tuning steps.
## There was 1 divergence after tuning. Increase `target_accept` or reparameterize.
## The acceptance probability does not match the target. It is 0.9310317960474614, but should be close to 0.8. Try to increase the number of tuning steps.
## The chain reached the maximum tree depth. Increase max_treedepth, increase target_accept or reparameterize.
## The rhat statistic is larger than 1.05 for some parameters. This indicates slight problems during sampling.
## The estimated number of effective samples is smaller than 200 for some parameters.
```
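
---

## Checking the recovered coefficients

A quick sketch (not from the original slides) of checking whether the large-magnitude posterior means line up with the true nonzero coefficients, which were set to ±10 at indices 10, 20, ..., 80:

```python
# posterior mean of each beta, averaging over chains and draws
post_means = trace.posterior["beta"].mean(dim=("chain", "draw")).values

# indices of the 8 largest |posterior mean| values -- ideally 10, 20, ..., 80
np.sort(np.argsort(np.abs(post_means))[-8:])
```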
---

```python
ax = az.plot_forest(trace)
plt.tight_layout()
plt.show()
```

<img src="Lec20_files/figure-html/unnamed-chunk-9-5.png" width="40%" style="display: block; margin: auto;" />

---

```python
plot_slope(trace)
```

<img src="Lec20_files/figure-html/unnamed-chunk-10-7.png" width="960" style="display: block; margin: auto;" />

---

## Demo 2 - Gaussian Process

```python
np.random.seed(12345)

n = 50
x = np.linspace(0, 1, n)
X = x.reshape(-1,1)

nugget = 0.75
sigma2_true = 4.0
l_true = 10

cov_func = sigma2_true * pm.gp.cov.ExpQuad(1, 1/l_true)
mean_func = pm.gp.mean.Zero()

y_true = np.random.multivariate_normal(
  mean_func(X).eval(), cov_func(X).eval(), 1
).flatten()

y = y_true + nugget * np.random.randn(n)
```

---

.small[
```python
fig = plt.figure(figsize=(12, 5))
plt.plot(X, y_true, "-b", lw=3)
plt.plot(X, y, "ok", ".")
plt.show()
```

<img src="Lec20_files/figure-html/unnamed-chunk-12-9.png" width="1152" style="display: block; margin: auto;" />
]
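
---

## Fitting the GP - sketch

The slides above only simulate and plot the data. The sketch below shows one way these data could be fit with `pm.gp.Marginal`; the hyperpriors, variable names (`gp_model`, `trace_gp`), and settings are illustrative assumptions, not taken from the lecture.

```python
with pm.Model() as gp_model:
  # illustrative hyperpriors for the lengthscale, scale, and noise
  l     = pm.Gamma("l", alpha=2, beta=1)
  eta   = pm.HalfCauchy("eta", beta=5)
  sigma = pm.HalfCauchy("sigma", beta=5)

  cov = eta**2 * pm.gp.cov.ExpQuad(1, ls=l)
  gp  = pm.gp.Marginal(cov_func=cov)

  # marginal likelihood integrates out the latent GP analytically
  y_obs = gp.marginal_likelihood("y_obs", X=X, y=y, noise=sigma)

  trace_gp = pm.sample(return_inferencedata=True, random_seed=12345)
```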