Novel bioinformatics approaches and software
We develop approaches that connect quantitative and biological knowledge. Few fields have generated higher expectations, deeper frustrations, or more anxiety about clinical translation than human genomics. Early on, we argued (alongside others) that genomics and big data would deliver on personalized medicine only if the field were grounded in rigorous epidemiology and statistics. We laid out this framing under the name Systems Epidemiology (Lund & Dumeaux, 2008 CEBP; Lund & Dumeaux, 2010 Int J Epi): integrate human -omics data with measurements from observational epidemiologic studies, characterize the factors that drive complex diseases, and infer causation rather than settling for correlation. The Norwegian Women and Cancer Study biobank (Dumeaux et al., 2008 BMC) was an early proving ground for this approach.

These efforts depend on computational methods that support the integration and interpretation of complex real-world data. Specifically, we have developed novel methods for:
- detecting low-amplitude changes in blood profiles across healthy individuals with high sensitivity (developed in PLoS Genetics 2010);
- identifying genes, pathways, and processes that co-vary and interact across tissues and environments;
- predicting molecular pathway activation patient-by-patient under the constraints that clinical practice imposes.
We have also contributed additional methodologies to collaborative manuscripts (Huttenhower et al., 2009 Genome Research; Barutcuoglu et al., 2009 Bioinformatics; Bettauer et al., 2022 Microbiology Spectrum).
More recently, we have extended this methodological program in two directions: transcriptomics from formalin-fixed paraffin-embedded (FFPE) material, through PREFFECT, and microbial functional archetypes, through deep-fMC. In both settings, much is technically measurable but relatively little is yet clinically actionable, and that gap is what pushed us to build new tools.
Selected papers
- Generative and integrative modeling for transcriptomics with formalin-fixed paraffin-embedded material
- Detecting gene-signature activation in breast cancer in an absolute, single-patient manner
- Building applications for interactive data exploration in systems biology
- A deep learning approach to capture the essence of Candida albicans morphologies
- Reproducible data analysis pipelines for precision medicine
Selected software
- MIxT — multi-tissue transcriptional integration.
- PREFFECT — generative modelling for FFPE RNA-seq.
- Candescence — C. albicans morphology classifier.
- deep-fMC — functional microbial configurations from gut metagenomes.