Pattern heterogeneity between two variables across conditions is often fundamental to a scientific inquiry. For example, a biologist might ask whether the co-expression between two genes in a cancer cell differs from that in a normal cell. The ‘DiffXTables’ R package addresses such questions by evaluating, from observed data and without a parametric mathematical model, the statistical evidence for distributional changes in the variables involved.
The package provides statistical methods for hypothesis testing of differences in the underlying distributions across two or more contingency tables. They include five statistical tests.
The package also provides a comparative type analysis of difference in association across contingency tables to reveal the highest order of their differences.
Under the null hypothesis, the statistics of the five tests all asymptotically follow a chi-squared distribution. The tests detect heterogeneous patterns that differ in either the first order (the marginal distributions) or the second order (the deviation of the joint distribution from the product of the marginals). Second-order differences may reveal more fundamental changes across heterogeneous patterns than first-order differences.
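To make the distinction between first-order and second-order differences concrete, here is a minimal base-R sketch on made-up tables. It does not call any ‘DiffXTables’ function and is not the package's test statistic; the helper name `independence_residual` and all counts are hypothetical, chosen only for illustration.

```r
# Deviation of the observed joint distribution from the product of its
# marginals. A matrix of zeros means the two variables are independent.
independence_residual <- function(tab) {
  p <- tab / sum(tab)                        # joint relative frequencies
  p - outer(rowSums(p), colSums(p))          # joint minus product of marginals
}

# First-order difference only: both tables are exactly independent,
# but their row marginals differ.
a1 <- outer(c(10, 30), c(1, 1))              # row totals 20 and 60
a2 <- outer(c(30, 10), c(1, 1))              # row totals 60 and 20

# Second-order difference only: identical marginals, opposite association.
b1 <- matrix(c(30, 10, 10, 30), nrow = 2)    # mass on the diagonal
b2 <- matrix(c(10, 30, 30, 10), nrow = 2)    # mass off the diagonal

# Largest change in the independence residual between the paired tables.
first_order_gap  <- max(abs(independence_residual(a1) - independence_residual(a2)))
second_order_gap <- max(abs(independence_residual(b1) - independence_residual(b2)))
```

In this toy setting, `first_order_gap` is zero because the pair `a1`, `a2` differs only in its marginals, whereas `second_order_gap` is positive because `b1` and `b2` share marginals but differ in how the joint distribution departs from independence.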
This package takes a model-free approach: it assumes no underlying parametric model for the relationship between variables, in contrast to differential correlation, which is based on differences between linear models. Its input is contingency tables that store the counts or frequencies of discrete variables, so continuous variables must be discretized before the tests are applied. One option for discretization is optimal univariate clustering, provided by the ‘Ckmeans.1d.dp’ R package.
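As a rough illustration of the discretization step, the sketch below turns two simulated continuous variables into a contingency table of counts using base R only. Equal-frequency binning is used here as a simple stand-in, not the optimal univariate clustering mentioned above; the helper `discretize`, the choice of three bins, and the simulated data are all assumptions made for the example.

```r
set.seed(1)                                  # reproducible toy data
n <- 200
x <- rnorm(n)                                # continuous variable 1
y <- x + rnorm(n, sd = 0.5)                  # continuous variable 2, related to x

# Stand-in discretizer: k equal-frequency bins from quantile cut points.
discretize <- function(v, k = 3) {
  breaks <- quantile(v, probs = seq(0, 1, length.out = k + 1))
  bins <- cut(v, breaks = breaks, include.lowest = TRUE, labels = FALSE)
  factor(bins, levels = seq_len(k))          # keep all k levels in the table
}

# Cross-tabulate the discretized variables into a k-by-k table of counts.
tab <- table(x = discretize(x), y = discretize(y))
```

One such table would be built per condition, and the resulting tables would then be compared. With ‘Ckmeans.1d.dp’ installed, cluster labels from its clustering routine could presumably replace `discretize` here.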
install.packages("DiffXTables")
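Once installed, the tests are applied to a collection of contingency tables. The following is only a sketch: the function names `cp.chisq.test` and `sharma.song.test`, and the convention of passing the tables as a list of matrices, are assumptions about the package interface that should be checked against its reference manual. The counts are made up.

```r
# Two hypothetical 2x2 tables of counts, one per condition.
tables <- list(
  matrix(c(30, 10, 10, 30), nrow = 2),       # e.g. normal condition
  matrix(c(10, 30, 30, 10), nrow = 2)        # e.g. cancer condition
)

# Run only if the package is available; the calls below are assumed names.
if (requireNamespace("DiffXTables", quietly = TRUE)) {
  print(DiffXTables::cp.chisq.test(tables))
  print(DiffXTables::sharma.song.test(tables))
}
```

The `requireNamespace` guard keeps the snippet from failing on systems where the package has not been installed.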