
<!-- README.md is generated from README.Rmd. Please edit that file -->

# inphr

<!-- badges: start -->

[![Codecov test
coverage](https://codecov.io/gh/tdaverse/inphr/graph/badge.svg)](https://app.codecov.io/gh/tdaverse/inphr)
[![R-CMD-check](https://github.com/tdaverse/inphr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/tdaverse/inphr/actions/workflows/R-CMD-check.yaml)
[![CRAN
status](https://www.r-pkg.org/badges/version/inphr)](https://CRAN.R-project.org/package=inphr)
<!-- badges: end -->

The goal of [{inphr}](https://tdaverse.github.io/inphr/) is to provide a
set of functions for performing null hypothesis testing on samples of
persistence diagrams using the theory of permutations. Currently, only
two-sample testing is implemented. Inputs can be either samples of
persistence diagrams themselves or vectorizations. In the former case,
they are embedded in a metric space using either the Bottleneck or
Wasserstein distance. In the latter case, persistence data becomes
functional data and inference is performed using tools available in the
[{fdatest}](https://permaverse.github.io/fdatest/) package.

## Installation

You can install the development version of inphr from
[GitHub](https://github.com/) with:

``` r
# install.packages("pak")
pak::pak("tdaverse/inphr")
```

## Usage

Let us start by loading the package:

``` r
library(inphr)
```

### Toy data

The package contains three toy data sets of persistence diagrams, which
can be used for testing. They are available in the package as
`trefoils1`, `trefoils2`, and `archspirals`. The first two sets contain
persistence diagrams computed from noisy samples of trefoil knots, while
the third set contains persistence diagrams computed from noisy samples
of 2-armed Archimedean spirals. Each set contains 24 persistence
diagrams, each computed from 120 points sampled from the respective
shape, with Gaussian noise added (standard deviation = 0.05).
The persistence diagrams were computed using the
[`TDA::ripsDiag()`](https://www.rdocumentation.org/packages/TDA/versions/1.9.1/topics/ripsDiag)
function with a maximum scale of 6 and up to dimension 2.

### Test in the space of diagrams

You can use the
[`two_sample_diagram_test()`](https://tdaverse.github.io/inphr/reference/two_sample_diagram_test.html)
function to perform a two-sample test on these persistence diagrams in
the space of diagrams themselves. For example, to test whether the
persistence diagrams from `trefoils1` are significantly different from
the persistence diagrams from `trefoils2`, you can run:

``` r
two_sample_diagram_test(trefoils1, trefoils2, B = 100L)
#> [1] 1
```

To test whether the persistence diagrams from `trefoils1` are
significantly different from the persistence diagrams from
`archspirals`, you can run:

``` r
two_sample_diagram_test(trefoils1, archspirals, B = 100L)
#> [1] 0.00990099
```

Optionally, the `two_sample_diagram_test()` function can also output
the distribution of the test statistic under the null hypothesis as
estimated by the permutation scheme. To do that, you can use the
optional argument `keep_null_distribution = TRUE`. It is also possible
to ask for the permutations themselves to be saved as part of the
output. To do that, you can use the optional argument
`keep_permutations = TRUE`.

Testing in the space of diagrams is performed using test statistics that
rely only on distances between sampled diagrams. By default, two such
statistics are used, which mimic Student’s t-statistic and Fisher’s
F-statistic, as proposed in Lovato, I., Pini, A., Stamm, A., & Vantini,
S. (2020), *Model-free two-sample test for network-valued data*.
Computational Statistics & Data Analysis, **144**, 106896.
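
To give a feel for how a permutation test can work from distances alone,
here is a minimal base R sketch. It is not the {inphr} implementation:
the distance matrix is a toy stand-in built from scalar values (not
Bottleneck or Wasserstein distances), the helper `between_within_stat()`
is hypothetical, and its statistic is a simple between-minus-within
contrast rather than the t-like or F-like statistics cited above. The
p-value formula is also an assumption.

``` r
set.seed(1234)

# Toy stand-in for pairwise distances between diagrams (assumption):
# Euclidean distances between scalar values from two shifted groups
n1 <- 10L
n2 <- 10L
x <- c(rnorm(n1, mean = 0), rnorm(n2, mean = 3))
D <- as.matrix(dist(x))

# Hypothetical distance-based statistic: mean between-group distance
# minus the average of the two mean within-group distances
between_within_stat <- function(D, idx1) {
  idx2 <- setdiff(seq_len(nrow(D)), idx1)
  d11 <- D[idx1, idx1]
  d22 <- D[idx2, idx2]
  within1 <- mean(d11[upper.tri(d11)])
  within2 <- mean(d22[upper.tri(d22)])
  mean(D[idx1, idx2]) - (within1 + within2) / 2
}

B <- 100L
observed <- between_within_stat(D, seq_len(n1))

# Re-compute the statistic after randomly re-assigning group labels
null_stats <- replicate(B, between_within_stat(D, sample(n1 + n2, n1)))

# Count the observed labelling as one of the B + 1 arrangements
p_value <- (1 + sum(null_stats >= observed)) / (B + 1)
p_value
```

With this formula, the smallest attainable p-value is `1 / (B + 1)`,
which for `B = 100L` equals 0.00990099. This is consistent with the
value printed for the `trefoils1` versus `archspirals` comparison above,
although the exact formula used by {inphr} is not confirmed here.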

### Test in functional spaces

You can use the
[`two_sample_functional_test()`](https://tdaverse.github.io/inphr/reference/two_sample_functional_test.html)
function to perform a two-sample test on these persistence diagrams in
functional spaces using one of five functional representations of
persistence diagrams, namely: (i) Betti, (ii) Euler characteristic,
(iii) normalized life, (iv) silhouette and (v) entropy curves.
Computation of these functional representations is powered by the
[{TDAvec}](https://cran.r-project.org/package=TDAvec) package. For
example, to test whether the persistence diagrams from `trefoils1` are
significantly different from the persistence diagrams from
`archspirals`, you can use the Betti curve representation and run:

``` r
out <- two_sample_functional_test(
  trefoils1,
  archspirals,
  representation = "betti",
  B = 100L
)
```

The output is a length-4 list. The first two elements are `xfd` and
`yfd` which are numeric matrices storing evaluations of the functional
representation of the diagrams on a grid stored as the third element
`scale_seq`. You can therefore have a look at the functional data that
the function produced using something like:

``` r
matplot(
  out$scale_seq[-1],
  t(rbind(out$xfd, out$yfd)),
  type = "l",
  col = c(rep(1, length(trefoils1)), rep(2, length(archspirals)))
)
```

<img src="man/figures/README-unnamed-chunk-7-1.png" width="100%" />
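
To make the Betti curve representation more concrete, the following base
R sketch evaluates a pointwise Betti curve on a small made-up diagram.
The diagram, the grid and the helper `betti_curve()` are all
hypothetical, and the convention used here (a feature is alive at scale
`t` when `birth <= t < death`) is the textbook one; {TDAvec} may use a
different convention on the grid.

``` r
# Hypothetical toy diagram: one row per feature, all of the same dimension
diagram <- data.frame(
  birth = c(0.0, 0.1, 0.3, 0.5),
  death = c(0.4, 0.9, 0.6, 1.2)
)

# Pointwise Betti curve: number of features alive at each scale value
betti_curve <- function(birth, death, scale_seq) {
  vapply(
    scale_seq,
    function(t) sum(birth <= t & t < death),
    FUN.VALUE = integer(1)
  )
}

scale_seq <- seq(0, 1.25, by = 0.25)
betti <- betti_curve(diagram$birth, diagram$death, scale_seq)
betti
#> [1] 1 2 3 2 1 0
```

Each diagram in a sample yields one such curve, so a sample of diagrams
becomes a matrix with one row per diagram and one column per grid point.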

In the case of testing in functional spaces, {inphr} uses the
interval-wise testing (IWT) procedure proposed in Pini, A., & Vantini,
S. (2017), *Interval-wise testing for functional data*. Journal of
Nonparametric Statistics, **29**(2), 407-424, as implemented in the
[{fdatest}](https://cran.r-project.org/package=fdatest) package.

The output indicates the portions of the scale sequence on which the
two samples differ, providing strong control of the familywise error
rate:

``` r
plot(out$iwt, xrange = range(out$scale_seq))
```

<img src="man/figures/README-unnamed-chunk-8-1.png" width="100%" /><img src="man/figures/README-unnamed-chunk-8-2.png" width="100%" />
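
The idea behind interval-wise adjustment can be illustrated with a
simplified base R sketch on simulated curves. It is not the {fdatest}
code: the data, the helper `sq_mean_diff()` and the p-value formula are
assumptions, and only contiguous intervals of grid points are tested.
The adjusted p-value at a grid point is taken as the largest permutation
p-value over all intervals that contain it.

``` r
set.seed(42)

# Hypothetical functional data: n curves per group on p grid points;
# the second group is shifted upward on the last three grid points only
n <- 8L
p <- 6L
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- matrix(rnorm(n * p), nrow = n, ncol = p)
y[, 4:6] <- y[, 4:6] + 3
pooled <- rbind(x, y)

# Hypothetical pointwise statistic: squared difference of group means
sq_mean_diff <- function(data, idx) {
  mean1 <- colMeans(data[idx, , drop = FALSE])
  mean2 <- colMeans(data[-idx, , drop = FALSE])
  (mean1 - mean2)^2
}

B <- 200L
observed <- sq_mean_diff(pooled, seq_len(n))
# B x p matrix of pointwise statistics under random relabelling
permuted <- t(replicate(B, sq_mean_diff(pooled, sample(2L * n, n))))

# Unadjusted pointwise permutation p-values
exceed <- sweep(permuted, 2, observed, ">=")
pointwise_p <- (1 + colSums(exceed)) / (B + 1)

# Adjusted p-values: maximum over all contiguous intervals [i, j]
adjusted_p <- numeric(p)
for (i in seq_len(p)) {
  for (j in i:p) {
    cols <- i:j
    t_obs <- sum(observed[cols])
    t_perm <- rowSums(permuted[, cols, drop = FALSE])
    p_interval <- (1 + sum(t_perm >= t_obs)) / (B + 1)
    adjusted_p[cols] <- pmax(adjusted_p[cols], p_interval)
  }
}

round(rbind(pointwise_p, adjusted_p), 3)
```

Because every single grid point is itself an interval, the adjusted
p-values can never be smaller than the pointwise ones.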

## Contributions

### Code of Conduct

Contributions are welcome! Please feel free to open an issue or a pull
request if you have any suggestions or improvements. The package is
still in its early stages, so any feedback is appreciated.

Please note that the {inphr} project is released with a [Contributor
Code of
Conduct](https://contributor-covenant.org/version/2/1/CODE_OF_CONDUCT.html).
By contributing to this project, you agree to abide by its terms.

### Acknowledgements

This project was funded by [an ISC grant from the R
Consortium](https://r-consortium.org/all-projects/2024-group-1.html#modular-interoperable-and-extensible-topological-data-analysis-in-r)
and done in coordination with Jason Cory Brunson and with guidance from
Bertrand Michel and Paul Rosen. It builds upon conversations with
Mathieu Carrière and Vincent Rouvreau who are among the authors of the
[GUDHI](https://gudhi.inria.fr) library. Package development also
benefited from the support of colleagues at the [Department of
Mathematics Jean Leray](https://www.math.sciences.univ-nantes.fr) and
the use of equipment at [Nantes
University](https://english.univ-nantes.fr).
