Statistics for Big Data
Welcome to the site of the Big Statistics section! Our aim is to link Big Data to clinical response by novel, problem-specific statistical methods. Here, 'Big' means: big in sample size (n) and/or big in number of variables (p), including high-dimensional ("omics") data.
As part of the Department of Epidemiology and Data Science of the Amsterdam UMC, our section is involved in consultancy, research and teaching. More information about who we are, our work and how to reach us can be found in these pages.
Networks: Developing methods to learn molecular networks from omics data.
Statistical omics: Building dedicated statistical models to test associations of omics variables with clinical parameters.
Co-data learning: Improving prediction and variable selection by accounting for complementary data.
Big longitudinal data: Modelling high-dimensional longitudinal data.
Causal inference: Drawing conclusions about cause and effect in complex learning problems involving (big) data.
Record linkage: Linking clinical information with anonymized information at an aggregated level (e.g. postal code).
Software & Support
Big data analysis support is the core business of our group. We supply tailored solutions for a variety of big data analysis questions in the Amsterdam UMC, covering study design, preprocessing and downstream analysis. We collaborate with researchers from a variety of disciplines, such as oncology, cardiology and neurology.
We love data. The more, the better. Many of us have experience with omics data, that is, data from the high-throughput quantification of a pool of molecules. Often, these data are high-dimensional, meaning they have more features than observations. Our group provides statistical support for the processing and analysis of a wide variety of omics data, such as genomic, metabolomic, and radiomic data. Our expertise ranges from next-generation sequencing platforms for genomics to various platforms for proteomics/metabolomics and imaging features (radiomics).
We also support analysis of truly big n data, with a focus on observational cohorts. We are, for example, involved in the exposome project for the analysis of large longitudinal data.
multiridge package on CRAN, 13-04-2021
The multiridge package is now available on CRAN: https://cran.r-project.org/web/packages/multiridge/index.html
Paper accepted: multiridge, 15-03-2021
The paper "Fast cross-validation for multi-penalty high-dimensional ridge regression" by Mark van de Wiel, Mirrelijn van Nee and Armin Rauschenberger has been accepted for publication in the Journal of Computational and Graphical Statistics.