## Introduction
This vignette documents the use of the `GFD` function for the analysis of general factorial designs. The `GFD` function
calculates the Wald-type statistic (WTS), the ANOVA-type statistic (ATS) and a permuted Wald-type statistic (WTPS).
These test statistics can be used for general factorial designs (crossed or nested) with an arbitrary number of
factors, unequal covariance matrices among groups and unbalanced data even for small sample sizes.
## Data Example (crossed design)
For illustration purposes, we will use the data set `pizza` which is included in the `GFD` package. We first load the
`GFD` package and the data set.
```{r}
library(GFD)
data(pizza)
```
The objective of the study was to see how the delivery time in minutes would be affected by
three different factors: whether thick or thin crust was ordered (factor A), whether Coke was
ordered with the pizza or not (factor B), and whether or not garlic bread was ordered as a
side (factor C).
```{r}
head(pizza)
```
This is a three-way crossed design, where each factor has two levels. We will now analyze this design with the `GFD` function.
The `GFD` function takes as arguments:
* `formula`: A formula consisting of the outcome variable on the left hand side of a \~ operator and the factor
variables of interest on the right hand side. An interaction term must be specified.
* `data`: A data.frame, list or environment containing the variables in `formula`.
* `nperm`: The number of permutations. Default value is 10000.
* `alpha`: The significance level, default is 0.05.
* `CI.method`: Specifies the method used for calculating the CIs, either `t-quantile` (default) or `perm`.
```{r}
set.seed(1234)
model1 <- GFD(Delivery ~ Crust * Coke * Bread, data = pizza, nperm = 1000, alpha = 0.05)
summary(model1)
```
The output consists of three parts: `model1$Descriptive` gives an overview of the descriptive statistics: The number of observations,
mean and variance as well as confidence intervals (based on quantiles of the t-distribution or the permutation distribution) are displayed for each factor level combination.
`model1$WTS` contains the results for the Wald-type test: The test statistic, degree of freedom and p-values based on the asymptotic $\chi^2$ distribution
and the permutation procedure, respectively, are displayed. Note that the $\chi^2$ approximation is very liberal for small sample sizes and therefore the WTPS is
recommended for such situations. Finally, `model1$ATS` contains the corresponding results based on the ATS. This test statistic tends to rather
conservative decisions in the case of small sample sizes and is even asymptotically only an approximation, thus not providing an asymptotic level $\alpha$ test.
We find a significant influence of the factors Crust and Bread. The WTS and WTPS also
suggest a significant interaction between the factors Crust and Coke at 5% level, which is only
borderline significant when using the ATS.
## Data example (nested design)
Nested designs can also be analyzed using the `GFD` function.
Note that in nested designs, the levels of the nested factor usually have the same labels
for all levels of the main factor, i.e., for each level $i=1, ..., a$ of the main factor A
the nested factor levels are labeled as $j=1, ..., b_i$. If the levels of the nested factor
are named uniquely, this has to be specified by setting the parameter `nested.levels.unique`
to TRUE.
In this package, only analysis of balanced
nested designs is possible, that is, the same number of levels of the nested factor for each level
of the main factor.
We consider the data set `curdies` from the `GFD` package:
```{r}
data("curdies")
set.seed(987)
nested <- GFD(dugesia ~ season + season:site, data = curdies, nested.levels.unique = TRUE)
summary(nested)
```
The aim of the study was to describe basic patterns of variation in a small flatworm, Dugesia, in the Curdies
River, Western Victoria. Therefore, worms were sampled at two different seasons and three
different sites within each season. For our analyses we consider both factors as fixed (e.g.,
some sites may only be accessed in summer). In this setting, both WTS and WTPS detect a significant influence of the season whereas
the ATS, again, only shows a borderline significance at 5% level. The effect of the site is not
significant.
## Plotting
The `GFD` package is equipped with a plotting function, displaying the calculated means along with $(1-\alpha)$ confidence intervals.
The `plot` function takes a `GFD` object as an argument. In addition, the factor of interest may be specified. If this argument is
omitted in a two- or higher-way layout, the user is asked to specify the factor for plotting. Furthermore, additional graphical parameters
can be used to customize the plots. The optional argument `legendpos` specifies the position of the legend in higher-way layouts.
```{r}
plot(model1, factor = "Crust:Coke:Bread", legendpos = "center", main = "Delivery time of pizza", xlab = "Bread")
plot(model1, factor = "Crust:Coke", legendpos = "topleft", main = "Two-way interaction", xlab = "Coke", col = 3:5, pch = 17)
plot(nested, factor = "season:site", xlab = "site")
```
## optional GUI
The `GFD` package is equipped with an optional graphical user interface, which is based on `RGtk2`. The GUI may be started in `R` (if `RGtk2` is installed) using the
command `calculateGUI()`.
The user can specify the data location
(either directly or via the "load data" button), the formula, the number of permutations and
the significance level. Additionally, one can specify whether or not headers are
included in the data file, and which separator (e.g., ',' for *.csv files) and character symbols are used for decimals
in the data file. The GUI also provides a plotting option, which generates a new window
for specifying the factors to be plotted (in higher-way layouts) along with a few plotting
parameters.