In cancer research, omics data hold the promise of enabling early detection of cancer and personalisation of treatment plans. With this goal in mind, researchers focus on finding the best prediction models to answer questions such as: Does this patient have cancer? How long will it take for a tumor to recur? Which treatment will work? Which omics markers are related to cancer?
Because omics data contain, by definition, a large number of markers, it is hard to find the best prediction model and the most relevant markers. Results may be improved by including more information in the form of co-data: auxiliary information on the markers. Just like regular data, co-data may differ in type (for example, categorical or continuous) and in content: one co-data source may represent groups of genes corresponding to biological functions, while another may represent groups corresponding to cellular processes.
My research focuses on developing prediction methods that make structural use of multiple and diverse co-data sources to improve the answers to questions like those mentioned above.
ecpc: an R-package for flexibly learning from multiple co-data for prediction and covariate selection.
Mirrelijn M. van Nee, Lodewyk F.A. Wessels, and Mark A. van de Wiel. "Flexible co-data learning for high-dimensional prediction". Preliminary version (arXiv).