# Nonasymptotic one- and two-sample tests in high dimension with unknown covariance structure

2 DATASHAPE - Understanding the Shape of Data
CRISAM - Inria Sophia Antipolis - Méditerranée , Inria Saclay - Ile de France
Abstract : Let $\mathbf{X} = (X_i)_{1\leq i \leq n}$ be an i.i.d. sample of square-integrable variables in $\mathbb{R}^d$, with common expectation $\mu$ and covariance matrix $\Sigma$, both unknown. We consider the problem of testing whether $\mu$ is $\eta$-close to zero, i.e. $\|\mu\| \leq \eta$, against the alternative $\|\mu\| \geq (\eta + \delta)$; we also tackle the more general two-sample mean closeness (also known as *relevant difference*) testing problem. The aim of this paper is to obtain nonasymptotic upper and lower bounds on the minimal separation distance $\delta$ such that both the Type I and Type II errors can be controlled at a given level. The main technical tools are concentration inequalities, first for a suitable estimator of $\|\mu\|^2$ used as a test statistic, and second for estimators of the operator and Frobenius norms of $\Sigma$, which enter the quantiles of that test statistic. These properties are obtained for Gaussian and bounded distributions. Particular attention is given to the dependence on the pseudo-dimension $d_*$ of the distribution, defined as $d_* := \|\Sigma\|_2^2/\|\Sigma\|_\infty^2$. In particular, for $\eta=0$, the minimum separation distance is ${\Theta}( d_*^{\frac{1}{4}}\sqrt{\|\Sigma\|_\infty/n})$, in contrast with the minimax estimation distance for $\mu$, which is ${\Theta}(d_e^{\frac{1}{2}}\sqrt{\|\Sigma\|_\infty/n})$ (where $d_e:=\|\Sigma\|_1/\|\Sigma\|_\infty$). This generalizes a phenomenon spelled out in particular by Baraud (2002).
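
The two dimension-like quantities in the abstract can be computed directly from the eigenvalues of $\Sigma$. The sketch below is illustrative only and is not taken from the paper. It assumes that $\|\Sigma\|_1$, $\|\Sigma\|_2$ and $\|\Sigma\|_\infty$ are the trace, Frobenius and operator norms respectively, and it uses a standard unbiased U-statistic as a stand-in for the "suitable estimator of $\|\mu\|^2$" (the paper's exact statistic may differ). The function names `effective_dimensions` and `sq_norm_mean_ustat` are hypothetical.

```python
import numpy as np


def effective_dimensions(sigma):
    """Return (d_star, d_e) for a symmetric PSD covariance matrix.

    Assumption: the norms in the abstract are Schatten norms, i.e.
    ||.||_1 = trace, ||.||_2 = Frobenius, ||.||_inf = operator norm.
    """
    eig = np.linalg.eigvalsh(sigma)   # real eigenvalues of a symmetric matrix
    op_norm = eig.max()               # operator norm (largest eigenvalue, PSD case)
    frob_sq = np.sum(eig ** 2)        # squared Frobenius norm
    trace = eig.sum()                 # trace norm (PSD case)
    d_star = frob_sq / op_norm ** 2   # d_* = ||Sigma||_2^2 / ||Sigma||_inf^2
    d_e = trace / op_norm             # d_e = ||Sigma||_1 / ||Sigma||_inf
    return d_star, d_e


def sq_norm_mean_ustat(x):
    """Unbiased U-statistic estimate of ||mu||^2 from an (n, d) sample.

    Averages <X_i, X_j> over all ordered pairs i != j, using
    sum_{i != j} <X_i, X_j> = ||sum_i X_i||^2 - sum_i ||X_i||^2.
    This is an assumed stand-in, not necessarily the paper's statistic.
    """
    n = x.shape[0]
    total = x.sum(axis=0)
    return (total @ total - np.sum(x * x)) / (n * (n - 1))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sigma = np.diag([4.0, 1.0, 1.0])
    d_star, d_e = effective_dimensions(sigma)
    sample = rng.multivariate_normal(np.zeros(3), sigma, size=200)
    print(d_star, d_e, sq_norm_mean_ustat(sample))
```

For $\Sigma = \mathrm{diag}(4, 1, 1)$ the definitions give $d_* = 18/16$ and $d_e = 6/4$, so both quantities sit well below the ambient dimension $d = 3$ when one eigenvalue dominates; for an isotropic $\Sigma$ both equal $d$.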
Document type :
Preprints, Working Papers, ...

https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03329848
Contributor : Gilles Blanchard
Submitted on : Thursday, October 7, 2021 - 11:49:50 AM
Last modification on : Wednesday, April 20, 2022 - 3:44:11 AM

### Files

Blanchard_Fermanian_clean_HAL_...
Files produced by the author(s)

### Identifiers

• HAL Id : hal-03329848, version 2

### Citation

Gilles Blanchard, Jean-Baptiste Fermanian. Nonasymptotic one- and two-sample tests in high dimension with unknown covariance structure. 2021. ⟨hal-03329848v2⟩
