MLLS

Machine Learning in Life Science

The Center for Basic Machine Learning in Life Science (MLLS) was established on January 21, 2021 with the generous support of the Novo Nordisk Foundation. We bring together leading machine learning research groups in Denmark to establish a solid foundation for future data science progress in the life sciences.

Artificial intelligence and data science are rapidly changing how science is conducted. Researchers increasingly take a data-centric view, in which experimental data is used to discover patterns and formulate new scientific hypotheses.

To extract information from data, modern data science techniques often “convert” the raw sensory measurements into abstract data representations. These representations are often poorly understood and difficult to interpret, which hinders their use for formulating robust hypotheses. This is particularly true in the life sciences, where data is typically “noisy” and incomplete.

To provide solutions to these problems, we conduct basic research in machine learning that is motivated and informed by fundamental problems in biology and biomedicine.

Our Mission

We conduct the basic machine learning research needed to estimate representations of biomedical data that are:

Robust

Interpretable

Data efficient

Reflective of inherent data uncertainty

Able to leverage existing knowledge

These representations support both predictive and knowledge discovery tasks.

Research

Our research focuses on four themes. Each advances a different aspect of representation learning for the life sciences, and the themes support one another:

Meaningful representations of data, and development of the computational and mathematical tools needed to realize them.
Geometric constructions that incorporate existing knowledge into representations and ensure that the result is understandable by humans.
Representation of data types that often appear in the life sciences, such as trees, graphs, and sequences.
Inclusion of real, “noisy” data and investigation of how the associated uncertainty is best encoded.

TEAM MEMBER	RESEARCH AREA
Ole Winther	Hierarchical generative models, gene regulation and sequence bioinformatics. NLP and appropriate inferences.

Anders Krogh	Representation learning for gene expression data and DNA sequences. Machine learning in bioinformatics.

Wouter Boomsma	Sequence modelling, protein representation learning, Bayesian inference (Markov chain Monte Carlo).

Aasa Feragen	Geometric modeling, machine learning for biomedical imaging, structured data (trees, networks, …).

Jes Frellsen	Deep generative models, missing values, Bayesian modeling, approximate inference.

Søren Hauberg	Random geometry, uncertainty quantification, deep generative models.

News & Events

News

07.12.2021 - 11.12.2021

NeurIPS 2021 Meetup

This NeurIPS 2021 meetup takes place in Copenhagen and aims to act as the central meetup for this area. The meetup will feature a collection of focused reading groups. PhD students within ELLIS Copenhagen may get ECTS credit for joining these reading groups. The meetup is organized by ELLIS Copenhagen. If the situation allows, the meetup will be held in person with on-site activities.

Relevant Publications

1. Hauberg, S., Freifeld, O. & Black, M. J. A Geometric Take on Metric Learning, NeurIPS (2012)

2. Sønderby, C. K., Raiko, T., Maaløe, L., Sønderby, S. K. & Winther, O. Ladder variational autoencoders, NeurIPS (2016)

3. Boomsma, W., Mardia, K. V., Taylor, C. C., Ferkinghoff-Borg, J., Krogh, A., & Hamelryck, T. A generative, probabilistic model of local protein structure. PNAS 105, 8932–8937 (2008)

4. Boomsma, W., Tian, P., Frellsen, J., Ferkinghoff-Borg, J., Hamelryck, T., Lindorff-Larsen, K., & Vendruscolo, M. Equilibrium simulations of proteins using molecular fragment replacement and NMR chemical shifts. PNAS 111, 13852–13857 (2014)

5. Boomsma, W. & Frellsen, J. Spherical convolutions and their application in molecular modelling, NeurIPS (2017)

6. Arvanitidis, G., Hansen, L. K. & Hauberg, S. Latent space oddity: On the curvature of deep generative models, ICLR (2018)

7. Weiler, M., Geiger, M., Welling, M., Boomsma, W. & Cohen, T. S. 3D Steerable CNNs: Learning rotationally equivariant features in volumetric data, NeurIPS (2018)

8. Mallasto, A., Hauberg, S. & Feragen, A. Probabilistic Riemannian submanifold learning with wrapped Gaussian process latent variable models, AISTATS (2019)

9. Hauberg, S. Principal curves on Riemannian manifolds. IEEE Trans. Pattern Anal. Mach. Intell 38, 1915–1921 (2015)

10. Skafte, N., Jørgensen, M. & Hauberg, S. Reliable training and estimation of variance networks, NeurIPS (2019)

11. Palm, R., Paquet, U. & Winther, O. Recurrent relational networks, NeurIPS (2018)

12. Krogh, A., Larsson, B., Von Heijne, G. & Sonnhammer, E. L. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567–580 (2001)

13. Armenteros, J.J.A., Tsirigos, K.D., Sønderby, C.K., Petersen, T.N., Winther, O., Brunak, S., von Heijne, G. and Nielsen, H. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat. Biotechnol. 37, 420–423 (2019)

14. Feragen, A., Kasenburg, N., Petersen, J., de Bruijne, M. & Borgwardt, K. Scalable kernels for graphs with continuous attributes, NeurIPS (2013)

15. Feragen, A., Lo, P., de Bruijne, M., Nielsen, M. & Lauze, F. Toward a theory of statistical treeshape analysis. IEEE Trans. Pattern Anal. Mach. Intell 35, 2008–2021 (2012)

16. Hauberg, S., Schober, M., Liptrot, M., Hennig, P. & Feragen, A. A random Riemannian metric for probabilistic shortest-path tractography, MICCAI (2015)

17. Navarro, A., Frellsen, J. & Turner, R. The Multivariate Generalised von Mises Distribution: Inference and Applications, AAAI (2017)

18. Mattei, P.-A. & Frellsen, J. MIWAE: Deep Generative Modelling and Imputation of Incomplete Data Sets, ICML 97 (2019)

19. Maaløe, L., Sønderby, C. K., Sønderby, S. K. & Winther, O. Auxiliary deep generative models, ICML (2016)

20. Fraccaro, M., Sønderby, S. K., Paquet, U. & Winther, O. Sequential neural models with stochastic layers, NeurIPS (2016)

21. Maaløe, L., Fraccaro, M., Lievin, V. & Winther, O. BIVA: A very deep hierarchy of latent variables for generative modeling, NeurIPS (2019)

Contact MLLS Team

Name	Affiliation	Email
Ole Winther, PhD (PI)	University of Copenhagen	ole.winther@bio.ku.dk
Anders Krogh, PhD (co-PI)	University of Copenhagen	akrogh@di.ku.dk
Wouter Boomsma, PhD (co-PI)	University of Copenhagen	wb@di.ku.dk
Aasa Feragen, PhD (co-PI)	Technical University of Denmark	afhar@dtu.dk
Jes Frellsen, PhD (co-PI)	Technical University of Denmark	jefr@dtu.dk
Søren Hauberg, PhD (co-PI)	Technical University of Denmark	sohau@dtu.dk

Advisory Board

Oliver Stegle, PhD, EMBL-EBI
Deborah Marks, PhD, Harvard University
Corinna Cortes, PhD, Google Research

Coordination

Lisbeth Borbye, PhD, University of Copenhagen – lisbeth.borbye@bio.ku.dk