---
title: "Cartesian Products and Partitions of Groups"
author: "Joseph Wood"
date: "01/31/2025"
output:
  rmarkdown::html_vignette:
    toc: true
    number_sections: false
vignette: >
  %\VignetteIndexEntry{Cartesian Products and Partitions of Groups}
  %\VignetteEngine{knitr::rmarkdown}
   \usepackage[utf8]{inputenc}
---

In combinatorics, sometimes it can be difficult to figure out exactly which tool/method we need to attack our problem. Do we need combinations? permutations? Cartesian product? partitions? compositions? What about repetition or multiplicity? The list goes on. Oftentimes the solution ends up being overly complicated and prone to error, or more commonly, a simple brute force solution is employed. The latter is okay in some situations, but in many real world problems, this approach becomes untenable very quickly. We address two such problems in this article.

Before we dive into the details, we will first introduce `expandGrid`, which generates the Cartesian product of its inputs.

## `expandGrid`

Just like its base R counterpart `expand.grid`, we can generate the Cartesian product using `expandGrid`. There are a few caveats that are discussed in detail in the docs (see `?expandGrid`). The main difference is that `expandGrid` varies the first column the slowest and if all of the inputs are of the same type, a `matrix` will be returned.

``` r
library(RcppAlgos)

ht <- function(d, m = 5, n = m) {
    ## print the head and tail together
    cat("head -->\n")
    print(head(d, m))
    cat("--------\n")
    cat("tail -->\n")
    print(tail(d, n))
}

## Base R first. Example inspired by expand.grid docs.
expand.grid(height = seq(60, 80, 10), weight = seq(100, 200, 50),
            sex = c("Male","Female"))
#>    height weight    sex
#> 1      60    100   Male
#> 2      70    100   Male
#> 3      80    100   Male
#> 4      60    150   Male
#> 5      70    150   Male
#> 6      80    150   Male
#> 7      60    200   Male
#> 8      70    200   Male
#> 9      80    200   Male
#> 10     60    100 Female
#> 11     70    100 Female
#> 12     80    100 Female
#> 13     60    150 Female
#> 14     70    150 Female
#> 15     80    150 Female
#> 16     60    200 Female
#> 17     70    200 Female
#> 18     80    200 Female

## Now RcppAlgos::expandGrid
expandGrid(height = seq(60, 80, 10), weight = seq(100, 200, 50),
           sex = c("Male","Female"))
#>    height weight    sex
#> 1      60    100   Male
#> 2      60    100 Female
#> 3      60    150   Male
#> 4      60    150 Female
#> 5      60    200   Male
#> 6      60    200 Female
#> 7      70    100   Male
#> 8      70    100 Female
#> 9      70    150   Male
#> 10     70    150 Female
#> 11     70    200   Male
#> 12     70    200 Female
#> 13     80    100   Male
#> 14     80    100 Female
#> 15     80    150   Male
#> 16     80    150 Female
#> 17     80    200   Male
#> 18     80    200 Female
```

### Matrix vs Data.Frame Output

``` r
lst = Map(\(x, y) x:y, 8:12, 13:17)

class(expand.grid(lst))
#> [1] "data.frame"

## ht defined above
ht(expand.grid(lst))
#> head -->
#>   Var1 Var2 Var3 Var4 Var5
#> 1    8    9   10   11   12
#> 2    9    9   10   11   12
#> 3   10    9   10   11   12
#> 4   11    9   10   11   12
#> 5   12    9   10   11   12
#> --------
#> tail -->
#>      Var1 Var2 Var3 Var4 Var5
#> 7772    9   14   15   16   17
#> 7773   10   14   15   16   17
#> 7774   11   14   15   16   17
#> 7775   12   14   15   16   17
#> 7776   13   14   15   16   17

class(expandGrid(lst))
#> [1] "matrix" "array"

ht(expandGrid(lst))
#> head -->
#>      Var1 Var2 Var3 Var4 Var5
#> [1,]    8    9   10   11   12
#> [2,]    8    9   10   11   13
#> [3,]    8    9   10   11   14
#> [4,]    8    9   10   11   15
#> [5,]    8    9   10   11   16
#> --------
#> tail -->
#>         Var1 Var2 Var3 Var4 Var5
#> [7772,]   13   14   15   16   13
#> [7773,]   13   14   15   16   14
#> [7774,]   13   14   15   16   15
#> [7775,]   13   14   15   16   16
#> [7776,]   13   14   15   16   17
```

### Always Return `data.frame`

If you really need to always return a `data.frame`, we can utilize the argument `return_df`:

``` r
class(expandGrid(lst, return_df = TRUE))
#> [1] "data.frame"

ht(expandGrid(lst, return_df = TRUE))
#> head -->
#>   Var1 Var2 Var3 Var4 Var5
#> 1    8    9   10   11   12
#> 2    8    9   10   11   13
#> 3    8    9   10   11   14
#> 4    8    9   10   11   15
#> 5    8    9   10   11   16
#> --------
#> tail -->
#>      Var1 Var2 Var3 Var4 Var5
#> 7772   13   14   15   16   13
#> 7773   13   14   15   16   14
#> 7774   13   14   15   16   15
#> 7775   13   14   15   16   16
#> 7776   13   14   15   16   17
```

### Familiar `RcppAlgos` API Components

Just as in other `RcppAlgos` functions, we can take advantage of the arguments `lower`, `upper`, and `nThreads`. For example, we can see a decrease in execution time by using `nThreads`:

``` r
library(microbenchmark)
options(digits = 4)
stdThreadMax()
#> [1] 8

numThreads = as.integer(stdThreadMax() / 2)
lst_med = Map(\(x, y) x:y, 8:17, 11:20)
expandGridCount(lst_med)
#> [1] 1048576

microbenchmark(
    baseR = expand.grid(lst_med),
    RcppAlgos_Ser = expandGrid(lst_med),
    RcppAlgos_Par = expandGrid(lst_med, nThreads = numThreads),
    unit = "relative"
)
#> Warning in microbenchmark(baseR = expand.grid(lst_med), RcppAlgos_Ser =
#> expandGrid(lst_med), : less accurate nanosecond times to avoid potential
#> integer overflows
#> Unit: relative
#>           expr    min     lq  mean median    uq   max neval
#>          baseR 13.911 11.572 7.035 10.306 8.829 2.589   100
#>  RcppAlgos_Ser  3.516  2.999 1.962  2.762 2.324 1.415   100
#>  RcppAlgos_Par  1.000  1.000 1.000  1.000 1.000 1.000   100
```

### `expandGridSample`

If we want a random sample of the Cartesian product, we can call upon `expandGridSample`. Just as in other `RcppAlgos` sampling functions, we can utlize the `n`, `sampleVec`, `nThreads`, and `namedSample` arguments.

``` r
## lst_med is defined above
all_carts = expandGrid(lst_med)

cart_samp = expandGridSample(lst_med, n = 5, seed = 42, namedSample = TRUE)
cart_samp
#>        Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10
#> 61413     8    9   13   13   15   16   17   17   17    17
#> 54425     8    9   13   12   13   13   16   16   18    17
#> 623844   10   10   12   11   13   13   17   17   16    20
#> 74362     8   10   10   13   12   15   15   18   18    18
#> 46208     8    9   12   14   13   13   15   18   19    20

as.numeric(rownames(cart_samp))
#> [1]  61413  54425 623844  74362  46208

## cart_samp has same output as subsetting all_carts
all_carts[as.numeric(rownames(cart_samp)), ]
#>      Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10
#> [1,]    8    9   13   13   15   16   17   17   17    17
#> [2,]    8    9   13   12   13   13   16   16   18    17
#> [3,]   10   10   12   11   13   13   17   17   16    20
#> [4,]    8   10   10   13   12   15   15   18   18    18
#> [5,]    8    9   12   14   13   13   15   18   19    20
```

### Powerful Iterators with expandGridIter

As with many other functions in `RcppAlgos`, there is an iterator offering for the Cartesian product with `expandGridIter`. These iterators are both memory efficient and computationally efficient. They are flexible as well allowing users to grab only the next iteration, the next *n* iterations, random access, and more.

The example below is from the docs (see `?expandGridIter`):

``` r
a = expandGridIter(factor(state.abb), euro, islands)
a@nextIter()
#>   Var1  Var2  Var3
#> 1   AL 13.76 11506

a@nextNIter(3)
#>   Var1  Var2  Var3
#> 1   AL 13.76  5500
#> 2   AL 13.76 16988
#> 3   AL 13.76  2968

a@front()
#>   Var1  Var2  Var3
#> 1   AL 13.76 11506

all_remaining = a@nextRemaining()
dim(all_remaining)
#> [1] 26399     3

a@summary()
#> $description
#> [1] "Cartesian Product of the source (see the sourceVector method for more info)"
#> 
#> $currentIndex
#> [1] 26401
#> 
#> $totalResults
#> [1] 26400
#> 
#> $totalRemaining
#> [1] -1

a@back()
#>   Var1  Var2 Var3
#> 1   WY 200.5   82

a[[5]]
#>   Var1  Var2 Var3
#> 1   AL 13.76   16

a@summary()
#> $description
#> [1] "Cartesian Product of the source (see the sourceVector method for more info)"
#> 
#> $currentIndex
#> [1] 5
#> 
#> $totalResults
#> [1] 26400
#> 
#> $totalRemaining
#> [1] 26395

a[[c(1, 17, 3)]]
#>   Var1  Var2  Var3
#> 1   AL 13.76 11506
#> 2   AL 13.76    13
#> 3   AL 13.76 16988

a@summary()
#> $description
#> [1] "Cartesian Product of the source (see the sourceVector method for more info)"
#> 
#> $currentIndex
#> [1] 5
#> 
#> $totalResults
#> [1] 26400
#> 
#> $totalRemaining
#> [1] 26395
```

Now we will discuss two problems that can get unwieldy very quickly.

## Cartesian Product where Order does not Matter

Given a list of vectors, *v<sub>1</sub>*, *v<sub>2</sub>*, … , *v<sub>n</sub>*, where the intersection of two or more vectors in non-empty, find all unique combinations (order does not matter) of elements of the Cartesian product of all of the vectors.

For example, lets say we have: `v1 = 1:4` and `v2 = 2:5`. The Cartesian product is given by `expand.grid(v1, v2)` (We continue to use the `ht` function defined in the [Combination and Permutation Basics](<https://jwood000.github.io/RcppAlgos/articles/GeneralCombinatorics.html>) vignette):

``` r
expand.grid(1:4, 2:5)
#>    Var1 Var2
#> 1     1    2
#> 2     2    2
#> 3     3    2  ### <-- Same as row 6
#> 4     4    2  ### <-- Same as row 10
#> 5     1    3
#> 6     2    3  ### <-- Same as row 3
#> 7     3    3
#> 8     4    3  ### <-- Same as row 11
#> 9     1    4
#> 10    2    4  ### <-- Same as row 4
#> 11    3    4  ### <-- Same as row 8
#> 12    4    4
#> 13    1    5
#> 14    2    5
#> 15    3    5
#> 16    4    5
```

If we don’t care about order, the following row pairs are considered equal and can therefore be pruned to obtain our desired results:

- (r3, r6)
- (r4, r10)
- (r8, r11)

With `comboGrid` no duplicates are generated:

``` r
comboGrid(1:4, 2:5)
#>       Var1 Var2
#>  [1,]    1    2
#>  [2,]    1    3
#>  [3,]    1    4
#>  [4,]    1    5
#>  [5,]    2    2
#>  [6,]    2    3
#>  [7,]    2    4
#>  [8,]    2    5
#>  [9,]    3    3
#> [10,]    3    4
#> [11,]    3    5
#> [12,]    4    4
#> [13,]    4    5
```

Note that the order of `expand.grid` and `comboGrid` differ. The order of `comboGrid` is lexicographical meaning that the last column will vary the fastest whereas with `expand.grid`, the first column will vary the fastest.

You will also note that the output of `expand.grid` is a `data.frame` whereas with `comboGrid`, we get a `matrix`. With `comboGrid`, we only get a `data.frame` when the classes of each vector are different as generally speaking, working with matrices is preferable.

With the small example above, we only had to filter out 3 out of 16 total results (less than 20%). That isn’t that bad. If this was the general case, we might as well just stick with `expand.grid` as it is very efficient. Unfortunately, this is not the general case and as the number of vectors with overlap increases, filtering will become impractical.

Consider the following example:

``` r
pools = list(c(1, 10, 14, 6),
             c(7, 2, 4, 8, 3, 11, 12),
             c(11, 3, 13, 4, 15, 8, 6, 5),
             c(10, 1, 3, 2, 9, 5,  7),
             c(1, 5, 10, 3, 8, 14),
             c(15, 3, 7, 10, 4, 5, 8, 6),
             c(14, 9, 11, 15),
             c(7, 6, 13, 14, 10, 11, 9, 4),
             c(6,  3,  2, 14,  7, 12,  9),
             c(6, 11,  2,  5, 15,  7))

## If we used expand.grid, we would have to filter
## more than 100 million results
prod(lengths(pools))
#> [1] 101154816

## With comboGrid, this is no problem
system.time(myCombs <- comboGrid(pools))
#>    user  system elapsed 
#>   0.256   0.022   0.279

print(object.size(myCombs), unit = "Mb")
#> 92 Mb

ht(myCombs)
#> head -->
#>      Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10
#> [1,]    1    2    3    1    1    3    9    4    2     2
#> [2,]    1    2    3    1    1    3    9    4    2     5
#> [3,]    1    2    3    1    1    3    9    4    2     6
#> [4,]    1    2    3    1    1    3    9    4    2     7
#> [5,]    1    2    3    1    1    3    9    4    2    11
#> --------
#> tail -->
#>            Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10
#> [1205736,]   14   12   15   10   14   15   15   11   12    15
#> [1205737,]   14   12   15   10   14   15   15   13   12    15
#> [1205738,]   14   12   15   10   14   15   15   13   14    15
#> [1205739,]   14   12   15   10   14   15   15   14   12    15
#> [1205740,]   14   12   15   10   14   15   15   14   14    15

## This is just the time to create the cartesian product
## Generating keys, then filtering will take much more time
system.time(cartProd <- expand.grid(pools))
#>    user  system elapsed 
#>   3.565   0.755   4.358

## Creates huge object
print(object.size(cartProd), unit = "Mb")
#> 7717.5 Mb

## What if we want results with unique values...
## Simply set repetition = FALSE
system.time(myCombsNoRep <- comboGrid(pools, repetition = FALSE))
#>    user  system elapsed 
#>   0.004   0.000   0.004

ht(myCombsNoRep)
#> head -->
#>      Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10
#> [1,]    1    2    3    5    8    4    9    6    7    11
#> [2,]    1    2    3    5    8    4    9    6    7    15
#> [3,]    1    2    3    5    8    4    9    6   12     7
#> [4,]    1    2    3    5    8    4    9    6   12    11
#> [5,]    1    2    3    5    8    4    9    6   12    15
#> --------
#> tail -->
#>         Var1 Var2 Var3 Var4 Var5 Var6 Var7 Var8 Var9 Var10
#> [2977,]   14    3    4    5    8   15    9   13   12    11
#> [2978,]   14    3    4    7    5   15    9   13   12    11
#> [2979,]   14    3    4    7    8   15    9   13   12    11
#> [2980,]   14    3    5    7    8   15    9   13   12    11
#> [2981,]   14    4    5    7    8   15    9   13   12    11
```

The function `comboGrid` was highly inspired by the following question on stackoverflow:

- [Picking unordered combinations from pools with overlap](<https://stackoverflow.com/q/51834467/4408538>)

Currenlty, the underlying algorithm is not the gold standard. By that, we mean that results are not generated one by one. Efforts are underway to achieve this, but up until this point it has proven quite difficult (See the comprehensive answer by Tim Peters (yes, that [Tim Peters](<https://stackoverflow.com/users/2705542/tim-peters>))).

The algorithm in `comboGrid` leverages [The Fundamental Theorem of Arithmetic](<https://en.wikipedia.org/wiki/Fundamental_theorem_of_arithmetic>) to efficiently generate keys that will be used in a hash function to determine if a particular combination of elements have been encountered. For greater efficiency, we make use of deduplication as [user2357112](<https://stackoverflow.com/a/51886857/4408538>) suggests.

### In the Wild

- [R - Expand Grid Without Duplicates](<https://stackoverflow.com/q/68047141/4408538>)
- [Non-redundant version of expand.grid](<https://stackoverflow.com/a/68050873/4408538>)

## Partitions of Groups with `comboGroups`

Given a vector of length *n* and *k* groups, where *k* divides *n*, each group is comprised of a combination of the vector chosen *g = n / k* at a time. As is stated in the documentation (see `?comboGroups`), these can be constructed by first generating all permutations of the vector and subsequently removing entries with permuted groups. Let us consider the following example. Given `v = 1:12`, generate all partitions `v` into 3 groups each of size 4.

``` r
funBruteGrp <- function(myLow = 1, myUp) {
    mat <- do.call(
        rbind,
        permuteGeneral(12, lower = myLow, upper = myUp,
            FUN = \(x) {
                sapply(seq(0, 8, 4), \(y) {
                     paste(c("(", x[(y + 1):(y + 4)], ")"), collapse = " ")
                })
            }
        )
    )
    colnames(mat) <- paste0("Grp", 1:3)
    rownames(mat) <- myLow:myUp
    mat
}

## All of these are the same as only the 3rd group is being permuted
funBruteGrp(myUp = 6)
#>   Grp1          Grp2          Grp3            
#> 1 "( 1 2 3 4 )" "( 5 6 7 8 )" "( 9 10 11 12 )"
#> 2 "( 1 2 3 4 )" "( 5 6 7 8 )" "( 9 10 12 11 )"
#> 3 "( 1 2 3 4 )" "( 5 6 7 8 )" "( 9 11 10 12 )"
#> 4 "( 1 2 3 4 )" "( 5 6 7 8 )" "( 9 11 12 10 )"
#> 5 "( 1 2 3 4 )" "( 5 6 7 8 )" "( 9 12 10 11 )"
#> 6 "( 1 2 3 4 )" "( 5 6 7 8 )" "( 9 12 11 10 )"

## We found our second distinct partition
funBruteGrp(myLow = 23, myUp = 26)
#>    Grp1          Grp2          Grp3            
#> 23 "( 1 2 3 4 )" "( 5 6 7 8 )" "( 12 11 9 10 )"
#> 24 "( 1 2 3 4 )" "( 5 6 7 8 )" "( 12 11 10 9 )"
#> 25 "( 1 2 3 4 )" "( 5 6 7 9 )" "( 8 10 11 12 )"  ### <-- 2nd distinct partition of groups
#> 26 "( 1 2 3 4 )" "( 5 6 7 9 )" "( 8 10 12 11 )"

funBruteGrp(myLow = 48, myUp = 50)
#>    Grp1          Grp2           Grp3            
#> 48 "( 1 2 3 4 )" "( 5 6 7 9 )"  "( 12 11 10 8 )"
#> 49 "( 1 2 3 4 )" "( 5 6 7 10 )" "( 8 9 11 12 )"  ### <-- 3rd distinct partition of groups
#> 50 "( 1 2 3 4 )" "( 5 6 7 10 )" "( 8 9 12 11 )"
```

We are starting to see a pattern. Each new partition is exactly 24 spots away. This makes sense as there are `factorial(4) = 24` permutations of size 4. Now, this is an oversimplification as if we simply generate every *24<sup>th</sup>* permutation, we will still get duplication as they start to carry over to the other groups. Observe:

``` r
do.call(rbind, lapply(seq(1, 169, 24), function(x) {
    funBruteGrp(myLow = x, myUp = x)
}))
#>     Grp1          Grp2           Grp3            
#> 1   "( 1 2 3 4 )" "( 5 6 7 8 )"  "( 9 10 11 12 )"
#> 25  "( 1 2 3 4 )" "( 5 6 7 9 )"  "( 8 10 11 12 )"
#> 49  "( 1 2 3 4 )" "( 5 6 7 10 )" "( 8 9 11 12 )" 
#> 73  "( 1 2 3 4 )" "( 5 6 7 11 )" "( 8 9 10 12 )" 
#> 97  "( 1 2 3 4 )" "( 5 6 7 12 )" "( 8 9 10 11 )" 
#> 121 "( 1 2 3 4 )" "( 5 6 8 7 )"  "( 9 10 11 12 )"  ### <-- This is the same as the 1st
#> 145 "( 1 2 3 4 )" "( 5 6 8 9 )"  "( 7 10 11 12 )"  ### partition. The only difference is
#> 169 "( 1 2 3 4 )" "( 5 6 8 10 )" "( 7 9 11 12 )"   ### that the 2nd Grp has been permuted
```

This only gets more muddled as the number of groups increases. It is also very inefficient, however this exercise hopefully serves to better illustrate these structures.

The algorithm in `comboGroups` avoids all of this duplication by implementing a novel algorithm akin to [std::next_permutation](<https://en.cppreference.com/w/cpp/algorithm/next_permutation>) from the algorithm library in `C++`.

``` r
system.time(comboGroups(12, numGroups = 3))
#>    user  system elapsed 
#>       0       0       0

ht(comboGroups(12, numGroups = 3))
#> head -->
#>      Grp1 Grp1 Grp1 Grp1 Grp2 Grp2 Grp2 Grp2 Grp3 Grp3 Grp3 Grp3
#> [1,]    1    2    3    4    5    6    7    8    9   10   11   12
#> [2,]    1    2    3    4    5    6    7    9    8   10   11   12
#> [3,]    1    2    3    4    5    6    7   10    8    9   11   12
#> [4,]    1    2    3    4    5    6    7   11    8    9   10   12
#> [5,]    1    2    3    4    5    6    7   12    8    9   10   11
#> --------
#> tail -->
#>         Grp1 Grp1 Grp1 Grp1 Grp2 Grp2 Grp2 Grp2 Grp3 Grp3 Grp3 Grp3
#> [5771,]    1   10   11   12    2    5    8    9    3    4    6    7
#> [5772,]    1   10   11   12    2    6    7    8    3    4    5    9
#> [5773,]    1   10   11   12    2    6    7    9    3    4    5    8
#> [5774,]    1   10   11   12    2    6    8    9    3    4    5    7
#> [5775,]    1   10   11   12    2    7    8    9    3    4    5    6
```

Just as in `{combo|permute}General`, we can utilize the arguments `lower`, `upper`, `Parallel`, and `nThreads`.

``` r
comboGroupsCount(30, 6)
#> Big Integer ('bigz') :
#> [1] 123378675083039376

system.time(a1 <- comboGroups(30, numGroups = 6,
                              lower = "123378675000000000",
                              upper = "123378675005000000"))
#>    user  system elapsed 
#>   0.121   0.034   0.164

## Use specific number of threads
system.time(a2 <- comboGroups(30, numGroups = 6,
                              lower = "123378675000000000",
                              upper = "123378675005000000", nThreads = 4))
#>    user  system elapsed 
#>   0.144   0.060   0.109

## Use n - 1 number of threads (in this case, there are 7)
system.time(a3 <- comboGroups(30, numGroups = 6,
                              lower = "123378675000000000",
                              upper = "123378675005000000", Parallel = TRUE))
#>    user  system elapsed 
#>   0.161   0.074   0.078

identical(a1, a2)
#> [1] TRUE

identical(a1, a3)
#> [1] TRUE
```

### Partitions of Groups of Varying Sizes

As of `2.8.+` we can generate partitions of groups of varying sizes. For example, say we want to generate all partitions of the vector `v = 1:15` into 2 groups of 3, 1 groups of 4, and 1 group of 5:

``` r
system.time(a4 <- comboGroups(15, grpSizes = c(3, 3, 4, 5)))
#>    user  system elapsed 
#>   0.125   0.030   0.164

ht(a4)
#> head -->
#>      Grp1 Grp1 Grp1 Grp2 Grp2 Grp2 Grp3 Grp3 Grp3 Grp3 Grp4 Grp4 Grp4 Grp4 Grp4
#> [1,]    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
#> [2,]    1    2    3    4    5    6    7    8    9   11   10   12   13   14   15
#> [3,]    1    2    3    4    5    6    7    8    9   12   10   11   13   14   15
#> [4,]    1    2    3    4    5    6    7    8    9   13   10   11   12   14   15
#> [5,]    1    2    3    4    5    6    7    8    9   14   10   11   12   13   15
#> --------
#> tail -->
#>            Grp1 Grp1 Grp1 Grp2 Grp2 Grp2 Grp3 Grp3 Grp3 Grp3 Grp4 Grp4 Grp4
#> [6306296,]   10   14   15   11   12   13    5    6    7    8    1    2    3
#> [6306297,]   10   14   15   11   12   13    5    6    7    9    1    2    3
#> [6306298,]   10   14   15   11   12   13    5    6    8    9    1    2    3
#> [6306299,]   10   14   15   11   12   13    5    7    8    9    1    2    3
#> [6306300,]   10   14   15   11   12   13    6    7    8    9    1    2    3
#>            Grp4 Grp4
#> [6306296,]    4    9
#> [6306297,]    4    8
#> [6306298,]    4    7
#> [6306299,]    4    6
#> [6306300,]    4    5

dim(a4)
#> [1] 6306300      15
```

All of the flexibility offered with groups of equal size are present with groups of varying sizes as well. For example, we can generate groups of varying sizes in parallel:

``` r
system.time(a5 <- comboGroups(15, grpSizes = c(3, 3, 4, 5), nThreads = 4))
#>    user  system elapsed 
#>   0.143   0.034   0.058

identical(a4, a5)
#> [1] TRUE
```

There is one additional argument (i.e. `retType`) not present in the other two general functions that allows the user to specify the type of object returned. The user can select between `"matrix"` (the default) and `"3Darray"`. This structure has a natural connection to 3D space when the size of each group is uniform. We have a particular result (*1<sup>st</sup>* dimension) broken down into groups (*2<sup>nd</sup>* dimension) of a certain size (*3<sup>rd</sup>* dimension).

``` r
my3D <- comboGroups(factor(month.abb), 4, retType = "3Darray")
my3D[1, , ]
#>      Grp1 Grp2 Grp3 Grp4
#> [1,] Jan  Apr  Jul  Oct 
#> [2,] Feb  May  Aug  Nov 
#> [3,] Mar  Jun  Sep  Dec 
#> Levels: Apr Aug Dec Feb Jan Jul Jun Mar May Nov Oct Sep

comboGroupsCount(12, 4)
#> [1] 15400

my3D[15400, , ]
#>      Grp1 Grp2 Grp3 Grp4
#> [1,] Jan  Feb  Mar  Apr 
#> [2,] Nov  Sep  Jul  May 
#> [3,] Dec  Oct  Aug  Jun 
#> Levels: Apr Aug Dec Feb Jan Jul Jun Mar May Nov Oct Sep
```

### Relevant Posts on Stackoverflow as well as OEIS.

- [Iterating through combinations of groups of 4 within a group of 16](<https://stackoverflow.com/a/51754958/4408538>)
- [Create Combinations in R by Groups](<https://stackoverflow.com/q/57732672/4408538>)
- [Algorithm that can create all combinations and all groups of those combinations](<https://stackoverflow.com/q/39126712/4408538>)
- [R expand.grid for repeated combinations of a vector in groups](<https://stackoverflow.com/q/74160916/4408538>)
- [Find all possible teams in R](<https://stackoverflow.com/q/76064565/4408538>)
- <https://oeis.org/A025035> (See also sequences A025036-A025042)