---
title: "Bioequivalence Tests for Parallel Trial Designs with Log-Normal Data"
output: 
  rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Bioequivalence Tests for Parallel Trial Designs with Log-Normal Data}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
bibliography: 'references.bib'
link-citations: yes
---

```{r setup, include=FALSE, message = FALSE, warning = FALSE}
knitr::opts_chunk$set(echo = TRUE)
knitr::opts_chunk$set(comment = "#>", collapse = TRUE)
options(rmarkdown.html_vignette.check_title = FALSE) #title of doc does not match vignette title
doc.cache <- T #for cran; change to F
```

In the `SimTOST` R package, which is specifically designed for sample size estimation for bioequivalence studies, hypothesis testing is based on the Two One-Sided Tests (TOST) procedure. [@sozu_sample_2015]  In TOST, the equivalence test is framed as a comparison between the the null hypothesis of ‘new product is worse by a clinically relevant quantity’ and the alternative hypothesis of ‘difference between products is too small to be clinically relevant’. This vignette focuses on a parallel design, with 2 arms/treatments and 5 primary endpoints.

In the following two examples, we demonstrate the use of **SimTOST** for parallel trial designs with data assumed to follow a normal distribution on the log scale. We start by loading the package.

```{r, echo = T, message=F}
library(SimTOST)
```

# Multiple Independent Co-Primary Endpoints
Here, we consider a bio-equivalence trial with 2 treatment arms and $m=5$ endpoints. The sample size is calculated to ensure that the test and reference products are equivalent with respect to all 5 endpoints. The true ratio between the test and reference products is assumed to be 1.05. It is assumed that the standard deviation of the log-transformed response variable is $\sigma = 0.3$, and that all tests are independent ($\rho = 0$). The equivalence limits are set at 0.80 and 1.25. The significance level is 0.05. The sample size is determined at a power of 0.8. 

This example is adapted from @mielke_sample_2018, who employed a difference-of-means test on the log scale. The sample size calculation can be conducted using two approaches, both of which are illustrated below.

## Approach 1: Using sampleSize_Mielke
In the first approach, we calculate the required sample size for 80% power using the [sampleSize_Mielke()](../reference/sampleSize_Mielke.html) function. This method directly follows the approach described in @mielke_sample_2018, assuming a difference-of-means test on the log-transformed scale with specified parameters.

```{r, eval = TRUE}
ssMielke <- sampleSize_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 5, rho = 0, 
                              sigma = 0.3, true.diff = log(1.05), 
                              equi.tol = log(1.25), design = "parallel", 
                              alpha = 0.05, adjust = "no", seed = 1234, 
                              nsim = 10000)
ssMielke
```
For 80\% power, `r ssMielke["SS"]` subjects per sequence (`r ssMielke["SS"] * 2` in total) would be required. 

## Approach 2: Using sampleSize
Alternatively, the sample size calculation can be performed using the [sampleSize()](../reference/sampleSize.html) function. This method assumes that effect sizes are normally distributed on the log scale and uses a difference-of-means test (`ctype = "DOM"`) with user-specified values for `mu_list` and `sigma_list`. This method allows for greater flexibility than Approach 1 in specifying parameter distributions.

```{r, eval = TRUE}
mu_r <- setNames(rep(log(1.00), 5), paste0("y", 1:5))
mu_t <- setNames(rep(log(1.05), 5), paste0("y", 1:5))
sigma <- setNames(rep(0.3, 5), paste0("y", 1:5))
lequi_lower <- setNames(rep(log(0.8), 5), paste0("y", 1:5))
lequi_upper <- setNames(rep(log(1.25), 5), paste0("y", 1:5))

ss <- sampleSize(power = 0.8, alpha = 0.05,
                 mu_list = list("R" = mu_r, "T" = mu_t),
                 sigma_list = list("R" = sigma, "T" = sigma),
                 list_comparator = list("T_vs_R" = c("R", "T")),
                 list_lequi.tol = list("T_vs_R" = lequi_lower),
                 list_uequi.tol = list("T_vs_R" = lequi_upper),
                 dtype = "parallel", ctype = "DOM", lognorm = FALSE, 
                 adjust = "no", ncores = 1, nsim = 10000, seed = 1234)
ss
```     
For 80\% power, a total of `r ss$response$n_total` subjects would be required. 

Consider an alternative scenario in which the standard deviation of the log-transformed response variable is unknown, but the standard deviation on the original scale is known ($\sigma = 1$). In such cases, the [sampleSize()](../reference/sampleSize.html) function can still accommodate adjustments to handle these uncertainties by transforming parameters accordingly. We now provide all data on the raw scale, including the equivalence bounds, and set `ctype = ROM` and `lognorm = TRUE`.


# Multiple Correlated Co-Primary Endpoints
In the second example, we set $k=m=5$, $\sigma = 0.3$ and $\rho = 0.8$. This example is also adapted from @mielke_sample_2018, who employed a difference-of-means test on the log scale. The sample size calculation can again be conducted using two approaches, both of which are illustrated below.

## Approach 1: Using sampleSize_Mielke
In the first approach, we calculate the required sample size for 80% power using the [sampleSize_Mielke()](../reference/sampleSize_Mielke.html) function. This method directly follows the approach described in @mielke_sample_2018, assuming a difference-of-means test on the log-transformed scale with specified parameters.

```{r, eval = TRUE}
ssMielke <- sampleSize_Mielke(power = 0.8, Nmax = 1000, m = 5, k = 5, rho = 0.8, 
                              sigma = 0.3, true.diff = log(1.05), 
                              equi.tol = log(1.25), design = "parallel", 
                              alpha = 0.05, adjust = "no", seed = 1234, 
                              nsim = 10000)
ssMielke
```
For 80\% power, `r ssMielke["SS"]` subjects per sequence (`r ssMielke["SS"] * 2` in total) would be required. 

## Approach 2: Using sampleSize
Alternatively, the sample size calculation can be performed using the [sampleSize()](../reference/sampleSize.html) function. This method assumes that effect sizes are normally distributed on the log scale and uses a difference-of-means test (`ctype = "DOM"`) with user-specified values for `mu_list`, `sigma_list`, and the correlation parameter `rho`. 

```{r}
mu_r <- setNames(rep(log(1.00), 5), paste0("y", 1:5))
mu_t <- setNames(rep(log(1.05), 5), paste0("y", 1:5))
sigma <- setNames(rep(0.3, 5), paste0("y", 1:5))
lequi_lower <- setNames(rep(log(0.8), 5), paste0("y", 1:5))
lequi_upper <- setNames(rep(log(1.25), 5), paste0("y", 1:5))

ss <- sampleSize(power = 0.8, alpha = 0.05,
                 mu_list = list("R" = mu_r, "T" = mu_t),
                 sigma_list = list("R" = sigma, "T" = sigma),
                 rho = 0.8, # high correlation between the endpoints
                 list_comparator = list("T_vs_R" = c("R", "T")),
                 list_lequi.tol = list("T_vs_R" = lequi_lower),
                 list_uequi.tol = list("T_vs_R" = lequi_upper),
                 dtype = "parallel", ctype = "DOM", lognorm = FALSE, 
                 adjust = "no", ncores = 1, k = 5, nsim = 10000, seed = 1234)
ss
```

# References