Home - Big Statistics

Welcome to the site of the Big Statistics section! Our aim is to link Big Data to clinical response by novel, problem-specific statistical methods. Here, 'Big' means: big in sample size (n) and/or big in number of variables (p), including high-dimensional ("omics") data.

As part of the Department of Epidemiology and Data Science of the Amsterdam UMC, our section is involved in consultancy, research and teaching. More information about who we are, our work and how to reach us can be found in these pages.

Statistics & Machine Learning

Co-data learning: Improving prediction and variable selection by accounting for complementary data.

Interpretable machine learning: Application, development and interpretation of machine learners (with a focus on tree-based learners) for a variety of big data applications.

Causal inference & treatment heterogeneity: Estimating treatment effects in complex learning problems.

Record linkage: Linking clinical information with anonymized information at an aggregated level (e.g. postal code).

Statistical omics: Building dedicated statistical models to test associations of omics variables with clinical parameters.

Networks: Developing methods to learn molecular networks from omics data.

Software & Support

Research Support & Collaborations

Big data analysis support is core business for our group. We supply tailored solutions for a variety of big data analysis questions, covering study design, preprocessing and downstream analysis. We collaborate with researchers from a variety of disciplines, such as oncology, cardiology and neurology. We are also part of Adore, the onco-neuro campus at Amsterdam UMC.

Data

We love data. The more, the better. Many of us have experience with omics data, which are obtained by high-throughput quantification of some pool of molecules. Often, these data are high-dimensional, meaning they have more features than observations. Our group provides statistical support for the processing and analysis of a wide variety of omics data, such as genomic, metabolomic, and proteomic data.

We also support analysis of truly big-n data, with a focus on observational cohorts. We have expertise in enriching data by linking them to repositories. Finally, for analysis, we combine machine learning and statistical techniques to get the best of both worlds.
