Deciphering biological systems from data is a longstanding challenge in cell biology. Accurate models of these systems enable in silico screening of perturbations, facilitate interpretable experimental design, and streamline target identification for drug discovery. However, the inherent stochasticity, nonlinearity, and limited observability of the dynamic processes governing the behaviour of these systems make such models difficult to construct. With the rapid advancement of high-throughput sequencing technology, high-dimensional single-cell perturbational datasets have become increasingly available. Consequently, artificial intelligence and machine learning methods have become the de facto tools for addressing computational problems in cell biology. In particular, dynamics-based generative models have emerged as core approaches for modelling high-dimensional distributions and complex dynamic processes in the natural sciences.
The first part of this thesis addresses the inference of biological systems. A novel generative flow network approach is introduced for the Bayesian inference of gene regulatory networks from single-cell data. The generalization capabilities of generative flow networks for modelling unnormalized probability mass functions are then investigated on graph generation tasks, informing their application to large-scale structure learning of gene regulatory networks.
The second part of this thesis addresses the challenges of response prediction and control of biological systems. A flow-based generative framework is introduced that models the evolution of cells over time and is applied to predicting patient-specific treatment response. In addition, an inference-time method for systematically combining pre-trained diffusion models is introduced, with the aim of generating more novel and more designable proteins.
Overall, the success of dynamics-based generative models, such as generative flow networks, flow-based models, and diffusion models, has inspired their rapid adoption across scientific domains such as physics, chemistry, and biology. This thesis takes a step towards addressing existing problems in systems biology through the advancement of such models.