Last year's top ten most cited papers
Top ten most cited papers in 2020 according to Web of Science (WOS)

Twenty years of P-splines (invited article)
Paul H.C. Eilers, Brian D. Marx and Maria Durbán
Abstract: P-splines first appeared in the limelight twenty years ago. Since then they have become popular in applications and in theoretical work. The combination of a rich B-spline basis and a simple difference penalty lends itself well to a variety of generalizations, because it is based on regression. In effect, P-splines allow the building of a “backbone” for the “mixing and matching” of a variety of additive smooth structure components, while inviting all sorts of extensions: varying-coefficient effects, signal (functional) regressors, two-dimensional surfaces, non-normal responses, quantile (expectile) modelling, among others. Strong connections with mixed models and Bayesian analysis have been established. We give an overview of many of the central developments during the first two decades of P-splines.
Keywords: B-splines, penalty, additive model, mixed model, multidimensional smoothing.
Pages: 149–186
DOI: 10.2436/20.8080.02.25
Vol 39 (2) 2015

On the interpretation of differences between groups for compositional data
Josep-Antoni Martín-Fernández, Josep Daunis-i-Estadella and Glòria Mateu-Figueras
Abstract: Social policies are designed using information collected in surveys, such as the Catalan Time Use survey. Accurate comparisons of time use data among population groups are commonly analysed using statistical methods. The total daily time expended on different activities by a single person is equal to 24 hours. Because this type of data is compositional, its sample space has particular properties that statistical methods should respect. The critical points required to interpret differences between groups are provided and described in terms of log-ratio methods. These techniques facilitate the interpretation of the relative differences detected in multivariate and univariate analysis.
Keywords: Log-ratio transformations, MANOVA, perturbation, simplex, subcomposition.
Pages: 231–252
DOI: 10.2436/20.8080.02.28
Vol 39 (2) 2015

Estimation in the Birnbaum-Saunders distribution based on scale-mixture of normals and the EM-algorithm
Narayanaswamy Balakrishnan, Víctor Leiva, Antonio Sanhueza and Filidor Vilca
Abstract: Scale mixtures of normal (SMN) distributions are used for modeling symmetric data. Members of this family have appealing properties such as robust estimates, easy number generation, and efficient computation of the ML estimates via the EM-algorithm. The Birnbaum-Saunders (BS) distribution is a positively skewed model that is related to the normal distribution and has received considerable attention. We introduce a type of BS distributions based on SMN models, produce a lifetime analysis, develop the EM-algorithm for ML estimation of parameters, and illustrate the obtained results with real data showing the robustness of the estimation procedure.
Keywords: Birnbaum-Saunders distribution, EM-algorithm, kurtosis, maximum likelihood methods, robust estimation, scale mixtures of normal distribution.
Pages: 171–192
Vol 33 (2) 2009

The normal distribution in some constrained sample spaces
Glòria Mateu-Figueras, Vera Pawlowsky-Glahn and Juan-José Egozcue
Abstract: Phenomena with a constrained sample space appear frequently in practice. This is the case, for example, with strictly positive data, or with compositional data, such as percentages or proportions. If the natural measure of difference is not the absolute one, simple algebraic properties show that it is more convenient to work with a geometry different from the usual Euclidean geometry in real space, and with a measure different from the usual Lebesgue measure, leading to alternative models that better fit the phenomenon under study. The general approach is presented and illustrated using the normal distribution, both on the positive real line and on the D-part simplex. The original ideas of McAlister in his introduction to the lognormal distribution in 1879, are recovered and updated.
Keywords: Additive logistic normal distribution, Aitchison measure, Lebesgue measure, lognormal distribution, orthonormal basis, simplex.
Pages: 29–56
Vol 37 (1) 2013

Thirty years of progeny from Chao’s inequality: Estimating and comparing richness with incidence data and incomplete sampling
Anne Chao and Robert K. Colwell
Abstract: In the context of capture-recapture studies, Chao (1987) derived an inequality among capture frequency counts to obtain a lower bound for the size of a population based on individuals’ capture/non-capture records for multiple capture occasions. The inequality has been applied to obtain a non-parametric lower bound of species richness of an assemblage based on species incidence (detection/non-detection) data in multiple sampling units. The inequality implies that the number of undetected species can be inferred from the species incidence frequency counts of the uniques (species detected in only one sampling unit) and duplicates (species detected in exactly two sampling units). In their pioneering paper, Colwell and Coddington (1994) gave the name “Chao2” to the estimator for the resulting species richness. (The “Chao1” estimator refers to a similar type of estimator based on species abundance data). Since then, the Chao2 estimator has been applied to many research fields and led to fruitful generalizations. Here, we first review Chao’s inequality under various models and discuss some related statistical inference questions: (1) Under what conditions is the Chao2 estimator an unbiased point estimator? (2) How many additional sampling units are needed to detect any arbitrary proportion (including 100%) of the Chao2 estimate of asymptotic species richness? (3) Can other incidence frequency counts be used to obtain similar lower bounds? We then show how the Chao2 estimator can be also used to guide a non-asymptotic analysis in which species richness estimators can be compared for equally-large or equally-complete samples via sample-size-based and coverage-based rarefaction and extrapolation. We also review the generalization of Chao’s inequality to estimate species richness under other sampling-without-replacement schemes (e.g. a set of quadrats, each surveyed only once), to obtain a lower bound of undetected species shared between two or multiple assemblages, and to allow inferences about undetected phylogenetic richness (the total length of undetected branches of a phylogenetic tree connecting all species), with associated rarefaction and extrapolation. A small empirical dataset for Australian birds is used for illustration, using online software SpadeR, iNEXT, and PhD.
Keywords: Cauchy-Schwarz inequality, Chao2 estimator, extrapolation, Good-Turing frequency formula, incidence data, phylogenetic diversity, rarefaction, sampling effort, shared species richness, species richness.
Pages: 3–54
DOI: 10.2436/20.8080.02.49
Vol 41 (1) 2017
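
As a rough illustration of the idea summarized in the abstract above, the following Python sketch computes a Chao2-type lower bound on species richness from the uniques and duplicates. The function name `chao2`, the made-up incidence counts, and the fallback used when there are no duplicates (the commonly quoted bias-corrected form) are assumptions for illustration; they are not taken from this listing and this is not the SpadeR or iNEXT implementation.

```python
def chao2(incidence_counts, num_units):
    """Chao2-type lower bound on species richness (illustrative sketch).

    incidence_counts: for each species, the number of sampling units
                      in which it was detected (zeros are ignored).
    num_units:        total number of sampling units, T.
    """
    detected = [c for c in incidence_counts if c > 0]
    s_obs = len(detected)                        # observed richness
    q1 = sum(1 for c in detected if c == 1)      # uniques
    q2 = sum(1 for c in detected if c == 2)      # duplicates
    k = (num_units - 1) / num_units              # finite-sample factor (T - 1) / T
    if q2 > 0:
        undetected = k * q1 * q1 / (2 * q2)
    else:
        # Assumed bias-corrected form for the case of no duplicates.
        undetected = k * q1 * (q1 - 1) / 2
    return s_obs + undetected


# Hypothetical data: 8 detected species across T = 10 sampling units.
counts = [1, 1, 1, 1, 2, 2, 3, 5]
estimate = chao2(counts, num_units=10)
print(round(estimate, 2))
```

With these invented counts there are four uniques and two duplicates, so the estimated number of undetected species is 0.9 × 16 / 4 = 3.6 on top of the 8 observed.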

A statistical learning based approach for parameter fine-tuning of metaheuristics
Laura Calvet, Angel A. Juan, Carles Serrat and Jana Ries
Abstract: Metaheuristics are approximation methods used to solve combinatorial optimization problems. Their performance usually depends on a set of parameters that need to be adjusted. The selection of appropriate parameter values causes a loss of efficiency, as it requires time, and advanced analytical and problem-specific skills. This paper provides an overview of the principal approaches to tackle the Parameter Setting Problem, focusing on the statistical procedures employed so far by the scientific community. In addition, a novel methodology is proposed, which is tested using an already existing algorithm for solving the Multi-Depot Vehicle Routing Problem.
Keywords: Parameter fine-tuning, metaheuristics, statistical learning, biased randomization.
Pages: 201–224
DOI: 10.2436/20.8080.02.41
Vol 40 (1) 2016

Stress-strength reliability of Weibull distribution based on progressively censored samples
Akbar Asgharzadeh, Reza Valiollahi, and Mohammad Z. Raqab
Abstract: Based on progressively Type-II censored samples, this paper deals with inference for the stress-strength reliability R = P(Y < X) when X and Y are two independent Weibull distributions with different scale parameters, but having the same shape parameter. The maximum likelihood estimator, and the approximate maximum likelihood estimator of R are obtained. Different confidence intervals are presented. The Bayes estimator of R and the corresponding credible interval using the Gibbs sampling technique are also proposed. Further, we consider the estimation of R when the same shape parameter is known. The results for exponential and Rayleigh distributions can be obtained as special cases with different scale parameters. Analysis of a real data set as well as a Monte Carlo simulation have been presented for illustrative purposes.
Keywords: Maximum likelihood estimator, Approximate maximum likelihood estimator, Bootstrap confidence interval, Bayesian estimation, Metropolis-Hastings method, Progressive Type-II censoring.
Pages: 103–124
Vol 35 (2) 2011
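
To make the quantity R = P(Y < X) in the abstract above concrete, here is a minimal Python sketch for complete (uncensored) samples. It assumes the parametrization used by Python's `random.weibullvariate` (survival function exp(-(x/scale)^shape)), which may differ from the one in the paper; under that assumption, with a common shape, R has the closed form scale_x^shape / (scale_x^shape + scale_y^shape). The function names and parameter values are hypothetical, and the progressive censoring and estimators studied in the paper are not reproduced here.

```python
import random


def stress_strength_closed_form(scale_x, scale_y, shape):
    """R = P(Y < X) for independent Weibull X, Y with a common shape.

    Assumes survival function exp(-(x / scale) ** shape).
    """
    return scale_x ** shape / (scale_x ** shape + scale_y ** shape)


def stress_strength_monte_carlo(scale_x, scale_y, shape, n=200_000, seed=0):
    """Estimate R = P(Y < X) by simulating n independent (X, Y) pairs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        x = rng.weibullvariate(scale_x, shape)  # strength
        y = rng.weibullvariate(scale_y, shape)  # stress
        if y < x:
            hits += 1
    return hits / n


# Hypothetical parameter values, chosen only for illustration.
exact = stress_strength_closed_form(2.0, 1.0, 1.5)
approx = stress_strength_monte_carlo(2.0, 1.0, 1.5)
print(f"closed form: {exact:.4f}, Monte Carlo: {approx:.4f}")
```

The two numbers should agree up to simulation error, which gives a quick sanity check on the closed form under the stated parametrization.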

A comparison of some confidence intervals for estimating the population coefficient of variation: a simulation study
Monika Gulhar, B. M. Golam Kibria, Ahmed N. Albatineh and Nasar U. Ahmed
Abstract: This paper considers several confidence intervals for estimating the population coefficient of variation based on parametric, nonparametric and modified methods. A simulation study has been conducted to compare the performance of the existing and newly proposed interval estimators. Many intervals were modified in our study by estimating the variance with the median instead of the mean and these modifications were also successful. Data were generated from normal, chi-square, and gamma distributions for CV = 0.1, 0.3, and 0.5. We reported coverage probability and interval length for each estimator. The results were applied to two public health data: child birth weight and cigarette smoking prevalence. Overall, good intervals included an interval for chi-square distributions by McKay (1932), an interval estimator for normal distributions by Miller (1991), and our proposed interval.
Keywords: Average width, coefficient of variation, inverted coefficient of variation, confidence interval, coverage probability, simulation study, skewed distributions.
Pages: 45–68
Vol 36 (1) 2012

The new class of Kummer beta generalized distributions
Rodrigo R. Pescim, Gauss M. Cordeiro, Clarice G. B. Demétrio, Edwin M. M. Ortega and Saralees Nadarajah
Abstract: Ng and Kotz (1995) introduced a distribution that provides greater flexibility to extremes. We define and study a new class of distributions called the Kummer beta generalized family to extend the normal, Weibull, gamma and Gumbel distributions, among several other well-known distributions. Some special models are discussed. The ordinary moments of any distribution in the new family can be expressed as linear functions of probability weighted moments of the baseline distribution. We examine the asymptotic distributions of the extreme values. We derive the density function of the order statistics, mean absolute deviations and entropies. We use maximum likelihood estimation to fit the distributions in the new class and illustrate its potentiality with an application to a real data set.
Keywords: Generalized distribution, Kummer beta distribution, likelihood ratio test, moment, order statistic, Weibull distribution.
Pages: 153–180
Vol 36 (2) 2012

Hurdle negative binomial regression model with right censored count data
Seyed Ehsan Saffari, Robiah Adnan and William Greene
Abstract: A Poisson model typically is assumed for count data. In many cases because of many zeros in the response variable, the mean is not equal to the variance value of the dependent variable. Therefore, the Poisson model is no longer suitable for this kind of data. Thus, we suggest using a hurdle negative binomial regression model to overcome the problem of overdispersion. Furthermore, the response variable in such cases is censored for some values. In this paper, a censored hurdle negative binomial regression model is introduced on count data with many zeros. The estimation of regression parameters using maximum likelihood is discussed and the goodness-of-fit for the regression model is examined.
Keywords: Hurdle negative binomial regression, censored data, maximum likelihood method, simulation.
Pages: 181–194
Vol 36 (2) 2012