--- title: "policy_data" output: rmarkdown::html_vignette: fig_caption: true toc: true toc_depth: 2 vignette: > %\VignetteIndexEntry{policy_data} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} bibliography: ref.bib --- ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) ``` ```{r setup, message = FALSE} library(polle) ``` This vignette is a guide to `policy_data()`. As the name suggests, the function creates a `policy_data` object with a specific data structure making it easy to use in combination with `policy_def()`, `policy_learn()`, and `policy_eval()`. The vignette is also a guide to some of the associated S3 functions which transform or access parts of the data, see `?policy_data` and `methods(class="policy_data")`. We will start by looking at a simple single-stage example, then consider a fixed two-stage example with varying actions sets and data in wide format, and finally we will look at an example with a stochastic number of stages and data in long format. # Single-stage: wide data Consider a simple single-stage problem with covariates/state variables $(Z, L, B)$, binary action variable $A$, and utility outcome $U$. We use `sim_single_stage()` to simulate data: ```{r single stage data} (d <- sim_single_stage(n = 5e2, seed=1)) |> head() ``` We give instructions to `policy_data()` which variables define the `action`, the state `covariates`, and the `utility` variable: ```{r pdss} pd <- policy_data(d, action="A", covariates=list("Z", "B", "L"), utility="U") pd ``` In the single-stage case the history $H$ is just $(B, Z, L)$. We access the history and actions using `get_history()`: ```{r gethistoryss} get_history(pd)$H |> head() get_history(pd)$A |> head() ``` Similarly, we access the utility outcomes $U$: ```{r get} get_utility(pd) |> head() ``` ```{r cleanup, include=FALSE} rm(list = ls()) ``` # Two-stage: wide data Consider a two-stage problem with observations $O = (B, BB, L_{1}, C_{1}, U_{1}, A_1, L_2, C_{2}, U_{2}, A_2, U_{3})$. Following the general notation introduced in Section 3.1 of [@nordland2023policy], $(B,BB)$ are the baseline covariates, $S_k =(L_{k, C_{k}})$ are the state covariates at stage k, $A_{k}$ is the action at stage k, and $U_k$ is the reward at stage $k$. The utility is the sum of the rewards $U=U_{1}+U_{2}+U_{3}$. We use `sim_two_stage_multi_actions()` to simulate data: ```{r simtwostage} d <- sim_two_stage_multi_actions(n=2e3, seed = 1) colnames(d) ``` Note that the data is in wide format. The data is transformed using `policy_data()` with instructions on which variables define the actions, baseline covariates, state covariates, and the rewards: ```{r pdtwostage} pd <- policy_data(d, action = c("A_1", "A_2"), baseline = c("B", "BB"), covariates = list(L = c("L_1", "L_2"), C = c("C_1", "C_2")), utility = c("U_1", "U_2", "U_3")) pd ``` The length of the character vector `action` determines the number of stages `K` (in this case 2). If the number of stages is 2 or more, the `covariates` argument must be a named list. Each element must be a character vector with length equal to the number of stages. If a covariate is not available at a given stage we insert an `NA` value, e.g., `L = c(NA, "L_2")`. Finally, the `utility` argument must be a single character string (the utility is observed after stage K) or a character vector of length K+1 with the names of the rewards. In this example, the observed action sets vary for each stage. 
In this example, the observed action sets vary for each stage. `get_action_set()` returns the global action set and `get_stage_action_sets()` returns the action set for each stage:

```{r getactionsets}
get_action_set(pd)
get_stage_action_sets(pd)
```

The full histories $H_1 = (B, BB, L_1, C_1)$ and $H_2 = (B, BB, L_1, C_1, A_1, L_2, C_2)$ are available using `get_history()` and `full_history = TRUE`:

```{r gethistwostage}
get_history(pd, stage = 1, full_history = TRUE)$H |> head()
get_history(pd, stage = 2, full_history = TRUE)$H |> head()
```

Similarly, we access the associated actions at each stage via the list element `A`:

```{r}
get_history(pd, stage = 1, full_history = TRUE)$A |> head()
get_history(pd, stage = 2, full_history = TRUE)$A |> head()
```

Alternatively, the state/Markov type history and actions are available using `full_history = FALSE`:

```{r gethisstate}
get_history(pd, full_history = FALSE)$H |> head()
get_history(pd, full_history = FALSE)$A |> head()
```

Note that `policy_data()` overrides the action variable names to `A_1`, `A_2`, ... in the full history case and `A` in the state/Markov history case.

As in the single-stage case we access the utility, i.e., the sum of the rewards, using `get_utility()`:

```{r getutiltwo}
get_utility(pd) |> head()
```

# Multi-stage: long data

In this example we illustrate how `polle` handles decision processes with a stochastic number of stages, see Section 3.5 in [@nordland2023policy]. The data is simulated using `sim_multi_stage()`. Detailed information on the simulation is available in `?sim_multi_stage`. We simulate data from 2000 iid subjects:

```{r sim_data}
d <- sim_multi_stage(2e3, seed = 1)
```

As described, the stage data is in long format:

```{r view_data}
d$stage_data[, -(9:10)] |> head()
```

The `id` variable is important for identifying which rows belong to each subject. The baseline data uses the same `id` variable:

```{r view_b_data}
d$baseline_data |> head()
```

The data is transformed using `policy_data()` with `type = "long"`. The names of the `id`, `stage`, `event`, `action`, and `utility` variables must be specified. The event variable, inspired by the event variable in `survival::Surv()`, is `0` whenever an action occurs and `1` for a terminal event.

```{r pd}
pd <- policy_data(data = d$stage_data,
                  baseline_data = d$baseline_data,
                  type = "long",
                  id = "id",
                  stage = "stage",
                  event = "event",
                  action = "A",
                  utility = "U")
pd
```

In some cases we are only interested in analyzing a subset of the decision stages. `partial()` trims the maximum number of decision stages:

```{r partial}
pd3 <- partial(pd, K = 3)
pd3
```

# SessionInfo

```{r sessionInfo}
sessionInfo()
```

# References