Machine Learning in Life Science

    The Center for Basic Machine Learning in Life Science (MLLS) was established on January 21, 2021, with the generous support of the Novo Nordisk Foundation. We bring together leading machine learning research groups in Denmark to build a solid foundation for future data science progress in the life sciences.

    Artificial intelligence and data science are rapidly changing how science is conducted. Researchers increasingly take a data-centric view, in which experimental data is used to discover patterns and to formulate new scientific hypotheses.

    To extract information from data, modern data science techniques often "convert" raw sensory measurements into abstract data representations. These representations are often ill-understood and difficult to interpret, which hinders their use in formulating robust hypotheses. This is particularly true in the life sciences, where data is typically "noisy" and incomplete.
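
    As a purely illustrative sketch (not code from the center), the snippet below shows the basic idea in PyTorch: an autoencoder compresses a raw, high-dimensional measurement vector into a low-dimensional representation and reconstructs it again. All names, dimensions, and data here are invented for illustration.

    # Illustrative only: a small autoencoder that "converts" raw, high-dimensional
    # measurements (e.g. expression-like profiles) into an abstract low-dimensional
    # representation. Dimensions and data are synthetic.
    import torch
    import torch.nn as nn

    class AutoEncoder(nn.Module):
        def __init__(self, n_features=2000, latent_dim=16):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Linear(n_features, 256), nn.ReLU(),
                nn.Linear(256, latent_dim),        # the learned representation
            )
            self.decoder = nn.Sequential(
                nn.Linear(latent_dim, 256), nn.ReLU(),
                nn.Linear(256, n_features),
            )

        def forward(self, x):
            z = self.encoder(x)                    # abstract representation of x
            return self.decoder(z), z

    model = AutoEncoder()
    x = torch.randn(32, 2000)                      # a batch of synthetic "measurements"
    x_hat, z = model(x)
    loss = nn.functional.mse_loss(x_hat, x)        # reconstruction objective
    loss.backward()

    If the reconstruction error is low, the low-dimensional representation z retains most of the information in x; interpreting such representations reliably is exactly where the difficulties described above arise.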

    To address these problems, we conduct basic research in machine learning that is motivated and informed by fundamental problems in biology and biomedicine.


Our Mission

    We conduct the basic machine learning research needed to estimate representations of biomedical data that are

    • Robust

    • Interpretable

    • Data efficient

    • Reflective of inherent data uncertainty

    • Able to leverage existing knowledge

    These representations support both predictive and knowledge-discovery tasks.

Research

Our research focuses on four themes; each advances a different aspect of representation learning for the life sciences, and the themes support one another:

  1. Meaningful representations of data, and the development of the computational and mathematical tools needed to realize them.

  2. Geometric constructions that incorporate existing knowledge into representations and ensure that the result is understandable to humans.

  3. Representation of data types that frequently appear in the life sciences, such as trees, graphs, and sequences.

  4. Inclusion of real data that is "noisy" and investigation of how the associated uncertainty is best encoded (see the sketch below).
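
The fourth theme can be illustrated with a minimal, hypothetical sketch of one way uncertainty is sometimes encoded: a network that predicts both a mean and a variance for each input, trained with a Gaussian negative log-likelihood. This is not a description of the center's methods; the architecture, dimensions, and data below are invented.

    # Illustrative only: a regression network that outputs a per-input mean and
    # (log-)variance, so predictions on noisy data carry an explicit uncertainty.
    import torch
    import torch.nn as nn

    class MeanVarianceNet(nn.Module):
        def __init__(self, in_dim=10, hidden=64):
            super().__init__()
            self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
            self.mean_head = nn.Linear(hidden, 1)
            self.log_var_head = nn.Linear(hidden, 1)   # log-variance for numerical stability

        def forward(self, x):
            h = self.backbone(x)
            return self.mean_head(h), self.log_var_head(h)

    def gaussian_nll(y, mean, log_var):
        # Negative log-likelihood of y under N(mean, exp(log_var)), up to a constant
        return 0.5 * (log_var + (y - mean) ** 2 / log_var.exp()).mean()

    net = MeanVarianceNet()
    x, y = torch.randn(128, 10), torch.randn(128, 1)   # synthetic noisy observations
    mean, log_var = net(x)
    loss = gaussian_nll(y, mean, log_var)
    loss.backward()

Inputs for which the model predicts a large variance are flagged as uncertain rather than treated as confident point estimates.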

 TEAM MEMBER       RESEARCH AREA
 Ole Winther       Hierarchical generative models, gene regulation and sequence bioinformatics, NLP and approximate inference.
 Anders Krogh      Representation learning for gene expression data and DNA sequences; machine learning in bioinformatics.
 Wouter Boomsma    Sequence modelling, protein representation learning, Bayesian inference (Markov chain Monte Carlo).
 Aasa Feragen      Geometric modelling, machine learning for biomedical imaging, structured data (trees, networks, …).
 Jes Frellsen      Deep generative models, missing values, Bayesian modelling, approximate inference.
 Søren Hauberg     Random geometry, uncertainty quantification, deep generative models.

News & Events

News

NeurIPS 2021 Meetup

This NeurIPS 2021 meetup takes place in Copenhagen and aims to serve as the central meetup for the area. It will feature a collection of focused reading groups, and PhD students within ELLIS Copenhagen may receive ECTS credit for joining them. The meetup is organized by ELLIS Copenhagen. If the situation allows, it will be held in person with on-site activities.


Relevant Publications

    1. Hauberg, S., Freifeld, O. & Black, M. J. A Geometric Take on Metric Learning, NeurIPS (2012)

    2. Sønderby, C. K., Raiko, T., Maaløe, L., Sønderby, S. K. & Winther, O. Ladder variational autoencoders, NeurIPS (2016)

    3. Boomsma, W., Mardia, K. V., Taylor, C. C., Ferkinghoff-Borg, J., Krogh, A., & Hamelryck, T. A generative, probabilistic model of local protein structure. PNAS 105, 8932–8937 (2008)

    4. Boomsma, W., Tian, P., Frellsen, J., Ferkinghoff-Borg, J., Hamelryck, T., Lindorff-Larsen, K., & Vendruscolo, M. Equilibrium simulations of proteins using molecular fragment replacement and NMR chemical shifts. PNAS 111, 13852–13857 (2014)

    5. Boomsma, W. & Frellsen, J. Spherical convolutions and their application in molecular modelling, NeurIPS (2017)

    6. Arvanitidis, G., Hansen, L. K. & Hauberg, S. Latent space oddity: On the curvature of deep generative models, ICLR (2018)

    7. Weiler, M., Geiger, M., Welling, M., Boomsma, W. & Cohen, T. S. 3D Steerable CNNs: Learning rotationally equivariant features in volumetric data, NeurIPS (2018)

    8. Mallasto, A., Hauberg, S. & Feragen, A. Probabilistic Riemannian submanifold learning with wrapped Gaussian process latent variable models, AISTATS (2019)

    9. Hauberg, S. Principal curves on Riemannian manifolds. IEEE Trans. Pattern Anal. Mach. Intell. 38, 1915–1921 (2015)

    10. Skafte, N., Jørgensen, M. & Hauberg, S. Reliable training and estimation of variance networks, NeurIPS (2019)

    11. Palm, R., Paquet, U. & Winther, O. Recurrent relational networks, NeurIPS (2018)

    12. Krogh, A., Larsson, B., Von Heijne, G. & Sonnhammer, E. L. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567–580 (2001)

    13. Armenteros, J.J.A., Tsirigos, K.D., Sønderby, C.K., Petersen, T.N., Winther, O., Brunak, S., von Heijne, G. and Nielsen, H. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat. Biotechnol. 37, 420–423 (2019)

    14. Feragen, A., Kasenburg, N., Petersen, J., de Bruijne, M. & Borgwardt, K. Scalable kernels for graphs with continuous attributes, NeurIPS (2013)

    15. Feragen, A., Lo, P., de Bruijne, M., Nielsen, M. & Lauze, F. Toward a theory of statistical tree-shape analysis. IEEE Trans. Pattern Anal. Mach. Intell. 35, 2008–2021 (2012)

    16. Hauberg, S., Schober, M., Liptrot, M., Hennig, P. & Feragen, A. A random Riemannian metric for probabilistic shortest-path tractography, MICCAI (2015)

    17. Navarro, A., Frellsen, J. & Turner, R. The Multivariate Generalised von Mises Distribution: Inference and Applications, AAAI (2017)

    18. Mattei, P.-A. & Frellsen, J. MIWAE: Deep Generative Modelling and Imputation of Incomplete Data Sets, ICML 97 (2019)

    19. Maaløe, L., Sønderby, C. K., Sønderby, S. K. & Winther, O. Auxiliary deep generative models, ICML (2016)

    20. Fraccaro, M., Sønderby, S. K., Paquet, U. & Winther, O. Sequential neural models with stochastic layers, NeurIPS (2016)

    21. Maaløe, L., Fraccaro, M., Lievin, V. & Winther, O. BIVA: A very deep hierarchy of latent variables for generative modeling, NeurIPS (2019)

Contact MLLS Team  

    Name                          Affiliation                       Email
    Ole Winther, PhD (PI)         University of Copenhagen          ole.winther@bio.ku.dk
    Anders Krogh, PhD (co-PI)     University of Copenhagen          krogh@binf.ku.dk
    Wouter Boomsma, PhD (co-PI)   University of Copenhagen          wb@di.ku.dk
    Aasa Feragen, PhD (co-PI)     Technical University of Denmark   afhar@dtu.dk
    Jes Frellsen, PhD (co-PI)     Technical University of Denmark   jefr@dtu.dk
    Søren Hauberg, PhD (co-PI)    Technical University of Denmark   sohau@dtu.dk

Advisory Board 

  • Oliver Stegle, PhD, EMBL-EBI

  • Deborah Marks, PhD, Harvard University

  • Corinna Cortes, PhD, Google Research

Coordination