Publications
A book
2018
François Husson, Eric Matzner-Løber, Arnaud Guyader, Pierre Cornillon, Julie Josse, Laurent Rouvière, Nicolas Klutchnikoff, Benoît Thieurmel, Nicolas Jégou, Erwann Le Pennec.
R pour la statistique et la science des données, 2018.
A submitted article
2025
Hanen Daayeb, Christian Genest, Salah Khardani, Nicolas Klutchnikoff, Frédéric Ouimet.
A comparison of Dirichlet kernel regression methods on the simplex, 2025.
Abstract
An asymmetric Dirichlet kernel version of the Gasser-Müller estimator is introduced for regression surfaces on the simplex, extending the univariate analog proposed by Chen [Statist. Sinica, 10(1) (2000), pp. 73-91]. Its asymptotic properties are investigated under the condition that the design points are known and fixed, including an analysis of its mean integrated squared error (MISE) and its asymptotic normality. The estimator is also applicable in a random design setting. A simulation study compares its performance with two recently proposed alternatives: the Nadaraya–Watson estimator with Dirichlet kernel and the local linear smoother with Dirichlet kernel. The results show that the local linear smoother consistently outperforms the others. To illustrate its applicability, the local linear smoother is applied to the GEMAS dataset to analyze the relationship between soil composition and pH levels across various agricultural and grazing lands in Europe.
Twenty-one published articles
2025
Karine Bertin, Nicolas Klutchnikoff, Frédéric Ouimet.
A new adaptive local polynomial density estimation procedure on complicated domains. Bernoulli, 2025, 31(3).
Abstract
This paper presents a novel approach for pointwise estimation of multivariate density functions on known domains of arbitrary dimensions using nonparametric local polynomial estimators. Our method is highly flexible, as it applies to both simple domains, such as open connected sets, and more complicated domains that are not star-shaped around the point of estimation. This enables us to handle domains with sharp concavities, holes, and local pinches, such as polynomial sectors. Additionally, we introduce a data-driven selection rule based on the general ideas of Goldenshluger and Lepski. Our results demonstrate that the local polynomial estimators are minimax under the \(L^2\) risk across a wide range of Hölder-type functional classes. In the adaptive case, we provide oracle inequalities and explicitly determine the convergence rate of our statistical procedure. Simulations on polynomial sectors show that our oracle estimates outperform those of the most popular alternative method, found in the sparr package for the R software. Our statistical procedure is implemented in an online R package which is readily accessible.
Omar Kassi, Nicolas Klutchnikoff, Valentin Patilea.
Learning the regularity of multivariate functional data. Electronic Journal of Statistics, 2025, 19(2).
Abstract
Combining information both within and between sample realizations, we propose a simple estimator for the local regularity of surfaces in the functional data framework. The independently generated surfaces are measured with error at possibly random discrete times. Non-asymptotic exponential bounds for the concentration of the regularity estimators are derived. An indicator of anisotropy is proposed and an exponential bound on its risk is derived. Two applications are proposed. We first consider the class of multi-fractional, bi-dimensional Brownian sheets with domain deformation, and study the nonparametric estimation of the deformation. As a second application, we build optimal, bivariate kernel estimators for the reconstruction of the surfaces.
Steven Golovkine, Nicolas Klutchnikoff, Valentin Patilea.
Adaptive optimal estimation of irregular mean and covariance functions. Bernoulli, 2025, 31(2).
Abstract
We propose straightforward nonparametric estimators for the mean and the covariance functions of functional data. Our setup covers a wide range of practical situations. The random trajectories are not necessarily differentiable, have unknown regularity, and are measured with error at discrete design points. The measurement error could be heteroscedastic. The design points could be either randomly drawn or common for all curves. The definition of our nonparametric estimators depends on the local regularity of the stochastic process generating the functional data. We first propose a simple estimator of this local regularity which takes strength from the replication and regularization features of functional data. Next, we use the “smoothing first, then estimate” approach for the mean and the covariance functions. The new nonparametric estimators achieve optimal rates of convergence. They can be applied with both sparsely and densely sampled curves, are easy to calculate and to update, and perform well in simulations. Simulations built upon a real data example on household power consumption illustrate the effectiveness of the new approach.
2024
Arthur Stéphanovitch, Ugo Tanielian, Benoît Cadre, Nicolas Klutchnikoff, Gérard Biau.
Optimal 1-Wasserstein distance for WGANs. Bernoulli, 2024, 30(4).
Abstract
The mathematical forces at work behind Generative Adversarial Networks raise challenging theoretical issues. Motivated by the important question of characterizing the geometrical properties of the generated distributions, we provide a thorough analysis of Wasserstein GANs (WGANs) in both the finite sample and asymptotic regimes. We study the specific case where the latent space is univariate and derive results valid regardless of the dimension of the output space. We show in particular that for a fixed sample size, the optimal WGANs are closely linked with connected paths minimizing the sum of the squared Euclidean distances between the sample points. We also highlight the fact that WGANs are able to approach (for the 1-Wasserstein distance) the target distribution as the sample size tends to infinity, at a given convergence rate and provided the family of generative Lipschitz functions grows appropriately. We derive in passing new results on optimal transport theory in the semi-discrete setting.
Sunny Wang, Valentin Patilea, Nicolas Klutchnikoff.
Adaptive functional principal components analysis. Journal of the Royal Statistical Society. Series B. Methodological, 2024.
Abstract
Functional data analysis almost always involves smoothing discrete observations into curves, because they are never observed in continuous time and rarely without error. Although smoothing parameters affect the subsequent inference, data-driven methods for selecting these parameters are not well-developed, frustrated by the difficulty of using all the information shared by curves while being computationally efficient. On the one hand, smoothing individual curves in an isolated, albeit sophisticated way, ignores useful signals present in other curves. On the other hand, bandwidth selection by automatic procedures such as cross-validation after pooling all the curves together quickly becomes computationally unfeasible due to the large number of data points. In this paper we propose a new data-driven, adaptive kernel smoothing, specifically tailored for functional principal components analysis through the derivation of sharp, explicit risk bounds for the eigen-elements. The minimization of these quadratic risk bounds provides refined, yet computationally efficient bandwidth rules for each eigen-element separately. Both common and independent design cases are allowed. Rates of convergence for the estimators are derived. An extensive simulation study, designed in a versatile manner to closely mimic the characteristics of real data sets, supports our methodological contribution. An illustration on a real data application is provided.
2023
Karine Bertin, Christian Genest, Frédéric Ouimet, Nicolas Klutchnikoff.
Minimax properties of Dirichlet kernel density estimators. Journal of Multivariate Analysis, 2023, 195, 105158.
Abstract
This paper is concerned with the asymptotic behavior in \(\beta\)-Hölder spaces and under \(L^p\) losses of a Dirichlet kernel density estimator introduced by Aitchison & Lauder (1985) and studied theoretically by Ouimet & Tolosana-Delgado (2021). It is shown that the estimator is minimax when \(p \in [1, 3)\) and \(\beta \in (0, 2]\), and that it is never minimax when \(p \in [4, \infty)\) or \(\beta \in (2, \infty)\). These results rectify in a minor way and, more importantly, extend to all dimensions those already reported in the univariate case by Bertin & Klutchnikoff (2011).
2022
Nicolas Klutchnikoff, Audrey Poterie, Laurent Rouvière.
Statistical analysis of a hierarchical clustering algorithm with outliers. Journal of Multivariate Analysis, 2022, 192, article n° 105075.
Abstract
It is well known that the classical single linkage algorithm usually fails to identify clusters in the presence of outliers. In this paper, we propose a new version of this algorithm, and we study its mathematical performance. In particular, we establish an oracle-type inequality which ensures that our procedure recovers the clusters with large probability under minimal assumptions on the distribution of the outliers. We deduce from this inequality the consistency and some rates of convergence of our algorithm in various situations. The performance of our approach is also assessed through simulation studies, and a comparison with classical clustering algorithms on simulated data is presented.
Steven Golovkine, Nicolas Klutchnikoff, Valentin Patilea.
Learning the smoothness of noisy curves with application to online curve estimation. Electronic Journal of Statistics, 2022, 16(1), 1485-1560.
Abstract
Combining information both within and across trajectories, we propose a simple estimator for the local regularity of the trajectories of a stochastic process. Independent trajectories are measured with errors at randomly sampled time points. Non-asymptotic bounds for the concentration of the estimator are derived. Given the estimate of the local regularity, we build a nearly optimal local polynomial smoother from the curves from a new, possibly very large sample of noisy trajectories. We derive non-asymptotic pointwise risk bounds uniformly over the new set of curves. Our estimates perform well in simulations. Real data sets illustrate the effectiveness of the new approaches.
Steven Golovkine, Nicolas Klutchnikoff, Valentin Patilea.
Clustering multivariate functional data using unsupervised binary trees. Computational Statistics and Data Analysis, 2022, 168, article n° 107376.
Abstract
We propose a model-based clustering algorithm for a general class of functional data for which the components could be curves or images. The random functional data realizations could be measured with error at discrete, and possibly random, points in the definition domain. The idea is to build a set of binary trees by recursive splitting of the observations. The number of groups is determined in a data-driven way. The new algorithm provides easily interpretable results and fast predictions for online data sets. Results on simulated datasets reveal good performance in various complex settings. The methodology is applied to the analysis of vehicle trajectories on a German roundabout.
2021
Karine Bertin, Nicolas Klutchnikoff.
Adaptive regression with Brownian path covariate. Annales de l’Institut Henri Poincaré (B) Probabilités et Statistiques, 2021, 57(3), 1495-1520.
Abstract
This paper deals with estimation with functional covariates. More precisely, we aim at estimating the regression function m of a continuous outcome Y against a standard Wiener coprocess W. Following Cadre and Truquet (2015) and Cadre, Klutchnikoff, and Massiot (2017), the Wiener-Itô decomposition of m(W) is used to construct a family of estimators. The minimax rate of convergence over specific smoothness classes is obtained. A data-driven selection procedure is defined following the ideas developed by Goldenshluger and Lepski (2011). An oracle-type inequality is obtained which leads to adaptive results.
2020
Karine Bertin, Nicolas Klutchnikoff, Fabien Panloup, Maylis Varvenne.
Adaptive estimation of the stationary density of a stochastic differential equation driven by a fractional Brownian motion. Statistical Inference for Stochastic Processes, 2020, 23(2), 271-300.
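The entry above concerns nonparametric estimation of the stationary density of a diffusion observed over time. As a loose illustration of the general idea, not of the paper's procedure (which handles a fractional driving noise and selects the bandwidth adaptively), the sketch below simulates an Ornstein–Uhlenbeck path with an Euler scheme and applies a plain Gaussian kernel density estimator; the parameters `theta`, `sigma`, the burn-in, the subsampling step, and the bandwidth `h` are all illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(1)

# Euler scheme for the SDE dX_t = -theta * X_t dt + sigma dW_t, driven here
# by a standard Brownian motion for simplicity (the paper treats a fractional
# driving noise). The stationary law is N(0, sigma^2 / (2 * theta)).
theta, sigma, dt, n = 1.0, 1.0, 0.01, 200_000
x = np.empty(n)
x[0] = 0.0
noise = rng.normal(scale=np.sqrt(dt), size=n - 1)
for i in range(n - 1):
    x[i + 1] = x[i] - theta * x[i] * dt + sigma * noise[i]

# Discard a burn-in segment, then subsample to reduce serial correlation.
samples = x[n // 10 :: 50]

def kde(t, data, h):
    # Plain Gaussian kernel density estimator at the point t.
    return float(np.mean(np.exp(-0.5 * ((data - t) / h) ** 2)) / (h * np.sqrt(2 * np.pi)))

est = kde(0.0, samples, h=0.15)
true = 1.0 / np.sqrt(2 * np.pi * sigma**2 / (2 * theta))  # N(0, 1/2) density at 0
```

With these settings the kernel estimate at 0 lands close to the true stationary density value \(1/\sqrt{\pi}\approx 0.564\), despite the serial dependence of the subsampled path.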
Karine Bertin, Nicolas Klutchnikoff, José León, Clémentine Prieur.
Adaptive density estimation on bounded domains under mixing conditions. Electronic Journal of Statistics, 2020, 14(1), 2198-2237.
Abstract
In this article, we propose a new adaptive estimator for compactly supported density functions, in the framework of multivariate mixing processes. Several procedures have been proposed in the literature to tackle the boundary bias issue encountered using classical kernel estimators on the unit d-dimensional hypercube. We extend such results to more general bounded domains in \(\mathbb{R}^d\). We introduce a specific family of kernel-type estimators adapted to the estimation of compactly supported density functions. We then propose a data-driven Goldenshluger and Lepski type procedure to jointly select a kernel and a bandwidth. We prove the optimality of our procedure in the adaptive framework, stating an oracle-type inequality. We illustrate the good behavior of our new class of estimators on simulated data. Finally, we apply our procedure to a real dataset.
2019
Karine Bertin, Salima El Kolei, Nicolas Klutchnikoff.
Adaptive density estimation on bounded domains. Annales de l’Institut Henri Poincaré, 2019, 55(4), 1916-1947.
Abstract
We study the estimation, in \(L^p\)-norm, of density functions defined on \([0,1]^d\). We construct a new family of kernel density estimators that do not suffer from the so-called boundary bias problem and we propose a data-driven procedure based on the Goldenshluger and Lepski approach that jointly selects a kernel and a bandwidth. We derive two estimators that satisfy oracle-type inequalities. They are also proved to be adaptive over a scale of anisotropic or isotropic Sobolev-Slobodetskii classes (which are particular cases of classical Besov or Sobolev classes). The main interest of the isotropic procedure is to obtain adaptive results without any restriction on the smoothness parameter.
2017
Benoît Cadre, Nicolas Klutchnikoff, Gaspar Massiot.
Minimax regression estimation for Poisson coprocess. ESAIM: Probability and Statistics, 2017, 21, 138-158.
Abstract
For a Poisson point process X, Itô’s famous chaos expansion implies that every square integrable regression function r with covariate X can be decomposed as a sum of multiple stochastic integrals called chaos. In this paper, we consider the case where r can be decomposed as a sum of \(\delta\) chaos. In the spirit of Cadre and Truquet (2015), we introduce a semiparametric estimate of r based on i.i.d. copies of the data. We investigate the asymptotic minimax properties of our estimator when \(\delta\) is known. We also propose an adaptive procedure when \(\delta\) is unknown.
Karine Bertin, Nicolas Klutchnikoff.
Pointwise adaptive estimation of the marginal density of a weakly dependent process. Journal of Statistical Planning and Inference, 2017, 187, 115-129.
Abstract
This paper is devoted to the estimation of the common marginal density function of weakly dependent processes. The accuracy of estimation is measured using pointwise risks. We propose a data-driven procedure using kernel rules. The bandwidth is selected using the approach of Goldenshluger and Lepski and we prove that the resulting estimator satisfies an oracle type inequality. The procedure is also proved to be adaptive (in a minimax framework) over a scale of Hölder balls for several types of dependence: strong mixing processes, \(\lambda\)-dependent processes or i.i.d. sequences can be considered using a single procedure of estimation. Some simulations illustrate the performance of the proposed method.
2016
2015
Stéphane Auray, Nicolas Klutchnikoff, Laurent Rouvière.
On clustering procedures and nonparametric mixture estimation. Electronic Journal of Statistics, 2015, 9, 266-297.
Abstract
This paper deals with nonparametric estimation of conditional densities in mixture models in the case when additional covariates are available. The proposed approach consists of performing a preliminary clustering algorithm on the additional covariates to guess the mixture component of each observation. Conditional densities of the mixture model are then estimated using kernel density estimates applied separately to each cluster. We investigate the expected \(L^1\)-error of the resulting estimates and derive optimal rates of convergence over classical nonparametric density classes provided the clustering method is accurate. Performances of clustering algorithms are measured by the maximal misclassification error. We obtain upper bounds of this quantity for a single linkage hierarchical clustering algorithm. Lastly, applications of the proposed method to mixture models involving electricity distribution data and simulated data are presented.
2014
Nicolas Klutchnikoff.
Pointwise adaptive estimation of a multivariate function. Mathematical Methods of Statistics, 2014, 23(2), 132-150.
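Lepski-type bandwidth selection, used in this paper and in several others on this page, can be sketched at a single point as follows: compute kernel estimates along a bandwidth grid, then keep the largest bandwidth whose estimate agrees, up to a noise-level threshold, with every estimate obtained from a smaller bandwidth. The Python sketch below illustrates this mechanism for a local-constant regression estimate; the Gaussian kernel, the grid, the known error level `sigma`, and the constant `kappa` are illustrative assumptions, not the calibrated choices of the paper.

```python
import numpy as np

def nw_estimate(x0, x, y, h):
    # Local-constant (Nadaraya-Watson) estimate at x0 with a Gaussian kernel.
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return float(np.sum(w * y) / np.sum(w))

def noise_level(x0, x, h, sigma):
    # Standard deviation of the estimate when the errors have known std sigma.
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    return sigma * np.sqrt(np.sum(w ** 2)) / np.sum(w)

def lepski_select(x0, x, y, grid, sigma, kappa=2.0):
    """Keep the largest bandwidth whose estimate stays within
    kappa * noise_level of every smaller-bandwidth estimate."""
    grid = sorted(grid, reverse=True)
    est = {h: nw_estimate(x0, x, y, h) for h in grid}
    for i, h in enumerate(grid):
        if all(abs(est[h] - est[h2]) <= kappa * noise_level(x0, x, h2, sigma)
               for h2 in grid[i + 1:]):
            return h, est[h]
    return grid[-1], est[grid[-1]]

# Noisy observations of f(x) = sin(4x); estimate f at x0 = 0.5.
rng = np.random.default_rng(0)
n, sigma = 400, 0.1
x = rng.uniform(size=n)
y = np.sin(4 * x) + rng.normal(scale=sigma, size=n)
h_sel, f_hat = lepski_select(0.5, x, y, [0.5, 0.25, 0.12, 0.06, 0.03], sigma)
```

The rule automatically discards the large, heavily biased bandwidths (whose estimates disagree with the small-bandwidth ones by more than the noise allows) and returns an estimate close to \(\sin 2 \approx 0.909\).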
Karine Bertin, Nicolas Klutchnikoff.
Adaptive estimation of a density function using beta kernels. ESAIM: Probability and Statistics, 2014.
Abstract
In this paper we are interested in the estimation of a density, defined on a compact interval of \(\mathbb{R}\), from n independent and identically distributed observations. In order to avoid boundary effects, beta kernel estimators are used and we propose a procedure (inspired by Lepski’s method) in order to select the bandwidth. Our procedure is proved to be adaptive in an asymptotically minimax framework. Our estimator is compared with both the cross-validation algorithm and the oracle estimator using simulated data.
2011
Karine Bertin, Nicolas Klutchnikoff.
Minimax properties of beta kernel estimators. Journal of Statistical Planning and Inference, 2011, 141(7), 2287-2297.
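Beta kernel estimators, the subject of this last entry, replace a fixed symmetric kernel by a Beta density whose parameters vary with the point of estimation, so that the kernel support always matches the interval [0, 1] and boundary bias is avoided. A minimal Python sketch of a Chen-type beta kernel density estimator follows; the smoothing parameter `b`, the simulated sample, and the clipping constant are illustrative choices, not the adaptive procedure studied in these papers.

```python
import numpy as np
from math import lgamma

def beta_kernel_density(x, samples, b):
    """Beta kernel density estimate at x in [0, 1].

    Each sample point is weighted by the Beta(x/b + 1, (1-x)/b + 1) density
    evaluated at that point, so the kernel support adapts to the boundary
    of [0, 1] instead of spilling probability mass outside it.
    """
    a1 = x / b + 1.0
    a2 = (1.0 - x) / b + 1.0
    # log of the Beta normalizing constant, via log-gamma for stability.
    log_norm = lgamma(a1 + a2) - lgamma(a1) - lgamma(a2)
    t = np.clip(samples, 1e-12, 1.0 - 1e-12)
    log_pdf = log_norm + (a1 - 1.0) * np.log(t) + (a2 - 1.0) * np.log(1.0 - t)
    return float(np.mean(np.exp(log_pdf)))

rng = np.random.default_rng(0)
data = rng.uniform(size=2000)  # true density is 1 everywhere on [0, 1]
est_boundary = beta_kernel_density(0.0, data, b=0.05)
est_middle = beta_kernel_density(0.5, data, b=0.05)
```

On uniform data the estimate stays close to 1 even at the endpoint 0, where a fixed symmetric kernel would roughly halve the density because part of its support falls outside [0, 1].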
Three unpublished results
2016
Nicolas Klutchnikoff, Gaspar Massiot.
Kernel estimation of the intensity of Cox processes, 2016.
Abstract
Counting processes, often written \(N=(N_t)_{t\in\mathbb{R}^+}\), are used in several applications of biostatistics, notably for the study of chronic diseases. In the case of respiratory illness it is natural to suppose that the count of the visits of a patient can be described by such a process whose intensity depends on environmental covariates. Cox processes (also called doubly stochastic Poisson processes) allow one to model such situations. The random intensity then takes the form \(\lambda(t)=\theta(t,Z_t)\) where \(\theta\) is a non-random function, \(t\in\mathbb{R}^+\) is the time variable and \((Z_t)_{t\in\mathbb{R}^+}\) is the \(d\)-dimensional covariates process. For a longitudinal study over \(n\) patients, we observe \((N_t^k,Z_t^k)_{t\in\mathbb{R}^+}\) for \(k=1,\ldots,n\). The aim is to estimate the intensity of the process using these observations and to study the properties of this estimator.
2005
Nicolas Klutchnikoff.
Adaptive estimation on anisotropic Hölder spaces Part II. Partially adaptive case, 2005.
Abstract
In this paper, we consider a problem of estimation. The model is the same as in the first part of this paper. Our goal is to construct an adaptive estimator over a family of Hölder spaces when an additional piece of information is known. Typically, we suppose that the “effective smoothness parameter” is known. Knowledge of this type is to be understood as follows: to estimate at a given rate (precision), one has to fix this parameter. We construct an estimator which is minimax over the union of all Hölder spaces defined using this effective smoothness. This problem is linked to the maxiset theory.
Nicolas Klutchnikoff.
Adaptive estimation on anisotropic Hölder spaces Part I. Fully adaptive case, 2005.
Abstract
In this paper, we consider the following problem: We want to estimate a noisy signal. The main difficulty is to find a “good” estimator. To do so, we propose a new criterion to choose, among all possible estimators, the best one. This criterion is useful to define the best family of rates of convergence for any adaptive problem (in a minimax sense). Then, we construct an adaptive estimator (over a family of anisotropic Hölder spaces) for the pointwise loss and we prove its optimality in our sense. This estimator is similar to Lepski’s procedure.
A thesis
2005
Nicolas Klutchnikoff.
Sur l’estimation adaptative de fonctions anisotropes, 2005.