Extreme value theory (EVT) deals with extreme (rare) events, which are sometimes reported as outliers. Certain textbooks encourage readers to remove outliers—in other words, to correct reality i
Data mining is concerned with the analysis of databases large enough that various anomalies, including outliers, incomplete data records, and more subtle phenomena such as misalignment errors, are vir
Cluster analysis finds groups in data automatically. Most methods have been heuristic and leave open such central questions as: how many clusters are there? Which method should I use? How should I handle outliers? Classification assigns new observations to groups given previously classified observations, and also has open questions about parameter tuning, robustness and uncertainty assessment. This book frames cluster analysis and classification in terms of statistical models, thus yielding principled estimation, testing and prediction methods, and sound answers to the central questions. It builds the basic ideas in an accessible but rigorous way, with extensive data examples and R code; describes modern approaches to high-dimensional data and networks; and explains such recent advances as Bayesian regularization, non-Gaussian model-based clustering, cluster merging, variable selection, semi-supervised and robust classification, clustering of functional data, text and images, and co-cl
Though the proportion of women in national assemblies still barely scrapes 16% on average, the striking outliers – Rwanda with 49% of its assembly female, Argentina with 35%, Liberia and Chile with ne