How to learn from complex data I: simulation-based inference in Physics and Life Sciences

A hands-on workshop

when: Sep 9-10, 2024 (10:00-16:00)
where: Faculty of Physics, Friedrich-Hund-Platz 1, 37077 Göttingen, room: C 00.110 (PC pool)
available seats: 20
lecturers: Prof. Dr. Michael Wibral (CIDBN, Dep. Data-driven Analysis of Biological Networks), Dr. Matthias Häring (CIDBN, Dep. Physics of Biological Systems)
organizers: Dr. Bernhard Bandow (GWDG / CIDBN), Prof. Dr. Fred Wolf (CIDBN, Dep. Physics of Biological Systems)
prerequisites for participation: familiarity with Linux and Python, basic knowledge of probability and statistics
registration: via e-mail to cidbn@uni-goettingen.de by Sep 2, 2024.
(Please state your full name and affiliation when registering. Thanks!)

day 1: An introduction to simulation-based inference using pandemic spread as a case study


lecturer: Michael Wibral

abstract:
Simulation-based inference usually involves three central ingredients:

  • a process to be simulated
  • a simulation environment
  • an inference algorithm for the parameters that govern the dynamics of the process



hands-on outline:
theory block:
The course introduces the fundamentals of Bayesian inference needed to understand in what sense the parameters of a dynamic process are inferred, and how to interpret the results of this inference.
It also introduces the fundamental concepts of sampling-based Bayesian inference needed to set up sampling in PyMC v4 and to interpret the sampler output.
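As a brief orientation, the relation underlying the theory block can be written as Bayes' theorem. The symbols are notation assumed here for illustration and are not taken from the course material: \theta denotes the model parameters, D the observed data, and M the model.

```latex
p(\theta \mid D, M) = \frac{p(D \mid \theta, M)\, p(\theta \mid M)}{p(D \mid M)},
\qquad
p(D \mid M) = \int p(D \mid \theta, M)\, p(\theta \mid M)\, \mathrm{d}\theta
```

The left-hand side is the posterior that a sampler draws from. The normalizing integral p(D | M) is the Bayesian evidence, which is the quantity used for model comparison.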

practice block:
We will use observed data on the spreading dynamics of the COVID-19 pandemic in combination with a parametrized model of disease dynamics.
We will simulate disease spreading over a range of parameters using simple Python scripts, and then use these simulations to generate samples in a sampler-based approach to Bayesian inference using PyMC v4.
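To give a flavour of such a script, here is a minimal sketch of a parameter scan. The announcement does not name the disease model, so the SIR model, the forward-Euler integration, the function name `simulate_sir`, and all parameter values are assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, s0=0.99, i0=0.01, days=160, dt=0.1):
    """Integrate the SIR equations with a forward-Euler scheme.

    beta is the infection rate and gamma the recovery rate (both per day).
    Returns a list of (susceptible, infected, recovered) fractions,
    one entry per day including day 0.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    steps_per_day = round(1.0 / dt)
    history = [(s, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


# Scan a few hypothetical infection rates at a fixed recovery rate.
peak_infected = {}
for beta in (0.2, 0.3, 0.5):
    trajectory = simulate_sir(beta=beta, gamma=0.1)
    peak_infected[beta] = max(i for _, i, _ in trajectory)
    print(f"beta={beta:.1f}: peak infected fraction = {peak_infected[beta]:.3f}")
```

In an inference setting, a simulator of this kind would be wrapped so that a sampler proposes `beta` and `gamma` and the simulated curve is compared with the observed case numbers.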


day 2: Nested sampling for model selection and parameter inference in physics, biology and beyond


lecturer: Matthias Häring

abstract:
Nested Sampling is an algorithm used in computational physics and Bayesian inference to evaluate complex, multi-dimensional integrals. Nested Sampling offers a way to explore unknown parameter spaces, to infer and sample from multimodal and non-Gaussian posterior distributions, and to directly estimate the Bayesian evidence for model comparison. Successful applications of Nested Sampling were first established in cosmology, gravitational wave detection, and statistical mechanics. Users of Nested Sampling can profit from a variety of versatile and efficient open-source implementations tailored, e.g., to high-performance computing systems or to the inference of complex dynamical systems.

In this course we will explore the value of Nested Sampling for biological systems, where inference from large, high-dimensional and multimodal data sets is a rapidly evolving frontier. Understanding Nested Sampling can give you a powerful toolset to efficiently perform parameter inference and model comparison in modern biological and biophysical research.
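The core idea of estimating the evidence can be illustrated with a minimal sketch in plain Python. This is not the implementation used by dynesty or any other package: the toy problem (a standard normal likelihood with a uniform prior on [-5, 5]), the function names, and the settings are assumptions for illustration. New live points are drawn by rejection sampling from the prior, which is only practical for toy problems of this size.

```python
import math
import random


def log_likelihood(theta):
    """Toy log-likelihood: a standard normal density evaluated at theta."""
    return -0.5 * theta * theta - 0.5 * math.log(2.0 * math.pi)


def sample_prior(rng):
    """Draw one point from the uniform prior on [-5, 5]."""
    return rng.uniform(-5.0, 5.0)


def nested_sampling_log_evidence(n_live=200, n_iter=1200, seed=0):
    """Estimate the log-evidence with a basic nested sampling loop."""
    rng = random.Random(seed)
    live = [sample_prior(rng) for _ in range(n_live)]
    live_logl = [log_likelihood(t) for t in live]

    evidence = 0.0
    x_prev = 1.0  # enclosed prior volume before the first iteration
    for it in range(1, n_iter + 1):
        # The live point with the lowest likelihood defines the current shell.
        worst = min(range(n_live), key=live_logl.__getitem__)
        logl_min = live_logl[worst]

        # Expected shrinkage of the prior volume: X_i = exp(-i / n_live).
        x_curr = math.exp(-it / n_live)
        evidence += math.exp(logl_min) * (x_prev - x_curr)
        x_prev = x_curr

        # Replace the worst point by a prior draw with higher likelihood
        # (plain rejection sampling, feasible only for toy problems).
        while True:
            candidate = sample_prior(rng)
            candidate_logl = log_likelihood(candidate)
            if candidate_logl > logl_min:
                break
        live[worst] = candidate
        live_logl[worst] = candidate_logl

    # Add the contribution of the remaining live points.
    evidence += x_prev * sum(math.exp(l) for l in live_logl) / n_live
    return math.log(evidence)


log_z = nested_sampling_log_evidence()
# Analytic reference for this toy problem: Z = erf(5 / sqrt(2)) / 10.
exact_log_z = math.log(math.erf(5.0 / math.sqrt(2.0)) / 10.0)
print(f"estimated log Z = {log_z:.3f}, exact log Z = {exact_log_z:.3f}")
```

Production implementations replace the rejection step with bounding or random-walk strategies so that new live points can still be found efficiently in high-dimensional parameter spaces.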

hands-on outline:
The hands-on course offers practical experience with interactive application of Nested Sampling across a set of problems. We will begin by working together on a shared set of problems. Subsequently, you will have the opportunity to explore self-defined projects according to your research interests. Below is a preliminary set of possible topics.

part 1: fundamentals

  • Configuration: Python environment, Jupyter notebooks, and dynesty
  • Exploring dynesty: nested sampling using pre-existing examples
  • Visualizing and analyzing nested sampling results
  • Hyperparameters (boundaries, live points, ...), parallelization, and dynamic nested sampling



part 2: application of nested sampling

  • Solving complex integrals
  • Navigating multimodal distributions and non-trivial areas of parameter space
  • Inference examples in biological systems
  • Inference examples in physics



part 3: advanced topics

  • Statistical mechanics: applying nested sampling to solve the Potts model for magnetization
  • Target state alignment: inference for biological processes that achieve functional goals
  • Inference for cytokinetic ring constriction in a TSA ensemble
  • Applying nested sampling to simulation-based inference tasks