Applicants to our CDT for 2026 start can choose from the following selection of projects.

Applications for 2026 entry are now open! Apply by 20 January if you'd like to join our CDT in September 2026.

(1) Data-Driven Insights for Improving Patient Journeys in Unscheduled Care: A Comprehensive Analysis of Healthcare Services in Scotland

This project will use the Public Health Scotland Unscheduled Care Data Mart (UCD), a linked patient-level dataset covering all of Scotland since 2011, to improve the efficiency and equity of unscheduled care. We will map patient pathways across NHS 24, ambulance, emergency, acute, and mental health services using descriptive statistics, pathway visualisation, and machine learning. Predictive models will identify factors affecting outcomes, while clustering will reveal common pathways and bottlenecks. Embedded within PHS, the project will deliver actionable policy recommendations to enhance data collection, optimise patient flows, and guide equitable redesign of urgent and unscheduled care services.

Supervisory team
Syed Ahmar Shah and Saturnino Luz Filho

Project Background
Healthcare systems worldwide face growing pressure to deliver timely, efficient care while managing rising demand and constrained resources. The COVID-19 pandemic exposed critical vulnerabilities, disrupting routine services and overwhelming urgent care. In the UK, the first lockdown caused substantial drops in hospital admissions and major backlogs in elective care, with enduring impacts on waiting times and health outcomes [1–2].
In Scotland, the pandemic created sustained excess demand, prolonged delays, and excess mortality, underscoring the need for a resilient, responsive system capable of recovering from shocks and maintaining routine care [3–4].

In response, the Scottish Government launched the Urgent and Unscheduled Care Collaborative as part of its NHS Recovery Plan to modernise services, strengthen coordination between primary and secondary care, and improve patient flow through initiatives such as Hospital at Home. However, unscheduled care remains fragmented, with patient information dispersed across multiple services, impeding timely decision-making, evaluation, and resource planning. These gaps disproportionately affect socioeconomically disadvantaged and minority ethnic groups, contributing to health inequalities.

Public Health Scotland’s Unscheduled Care Data Mart (UCD) now offers a unique opportunity to link patient-level data across care settings, enabling system-wide analyses to identify bottlenecks and support data-driven redesign of unscheduled care pathways [5].

Project Aims
This project aims to generate data-driven insights to improve the efficiency, equity, and resilience of unscheduled care services in Scotland. Specifically, it will:
1. Map end-to-end patient pathways across NHS 24, ambulance, emergency, acute, and mental health services using the Unscheduled Care Data Mart (UCD);
2. Identify systemic bottlenecks, data gaps, and their impact on patient outcomes;
3. Develop predictive and clustering models to uncover drivers of delays and adverse outcomes; and
4. Produce evidence-based policy recommendations to support service redesign, optimise patient flows, and reduce health inequalities in unscheduled care.

Data and Methodology
This project will leverage the Public Health Scotland (PHS) Unscheduled Care Data Mart (UCD), a comprehensive, linked dataset covering the entire Scottish population since 2011.
The UCD integrates patient-level data across NHS 24, the Scottish Ambulance Service, Primary Care Out of Hours, Emergency Departments, Acute Hospital Admissions, Mental Health Admissions, and Death Records, capturing approximately 2.8 million unscheduled care pathways annually. Linkage is enabled via the Community Health Index (CHI) number, allowing complete tracking of individual patient journeys from first contact to discharge or death.

Descriptive and Exploratory Analyses
Initial analyses will involve descriptive statistics and data visualisation to characterise service use and patient flows across unscheduled care. Network and Sankey-style pathway visualisations will map patient transitions between services, highlighting frequent routes, points of delay, and high-demand groups. These analyses will help identify candidate variables and outcomes for subsequent modelling.

Predictive Modelling
We will develop supervised machine learning models (e.g. logistic regression, random forests, gradient boosting) to predict key outcomes such as hospital admission, waiting time, and length of stay. Models will be evaluated using standard performance metrics (AUROC, accuracy, calibration) and validated via k-fold cross-validation. Regularisation will be applied to prevent overfitting, and feature importance techniques will support model interpretability.

Unsupervised Clustering
Clustering methods (e.g. k-medoids, hierarchical clustering) will be used to identify common patient pathways and systemic bottlenecks. Clusters will be profiled on demographics, comorbidities, and outcomes, and findings will be reviewed with PHS and clinical stakeholders to ensure real-world relevance.

Implementation Approach
The student will be embedded within PHS two days per week, enabling close engagement with data engineers, analysts, and policymakers.
This will facilitate timely access to data, iterative feedback on analysis, and co-production of actionable outputs to inform service redesign.

Translational Potential and Expected Impact
Although centred on Scotland, this project addresses challenges shared by healthcare systems worldwide: managing surges in unscheduled care demand, reducing bottlenecks, and improving system resilience. By leveraging large-scale linked datasets to map patient journeys, the project will generate transferable methods and insights relevant to other national health systems. The analytical framework, which combines pathway visualisation, predictive modelling, and clustering, can be adapted to different contexts, informing service redesign internationally. Findings will be disseminated through peer-reviewed publications, policy briefs, and international networks, contributing to global efforts to optimise urgent care delivery and strengthen health system preparedness.

Training and Development Outcomes for the Student
This project will provide comprehensive interdisciplinary training spanning data science, health informatics, and applied health services research. The student will gain advanced skills in data engineering, statistical analysis, machine learning, and pathway visualisation using large-scale linked health datasets. Embedding within Public Health Scotland (PHS) two days per week will offer hands-on experience with real-world data pipelines, governance procedures, and policy translation. They will also develop transferable skills in stakeholder engagement, responsible AI, scientific writing, and presenting findings to technical and non-technical audiences. This training will prepare the student for leadership roles in data-driven healthcare innovation.

References
[1]: Shah SA, Robertson C, Sheikh A. Effects of the COVID-19 pandemic on NHS England waiting times for elective hospital care: a modelling study. The Lancet.
2024 Jan 20;403(10423):241-3.
[2]: Shah SA, Brophy S, Kennedy J, Fisher L, Walker A, Mackenna B, Curtis H, Inglesby P, Davy S, Bacon S, Goldacre B. Impact of first UK COVID-19 lockdown on hospital admissions: Interrupted time series study of 32 million people. EClinicalMedicine. 2022 Jul 1;49.
[3]: Shah SA, Jeffrey K, Robertson C, Sheikh A. Impact of COVID-19 pandemic on elective care backlog trends, recovery efforts, and capacity needs to address backlogs in Scotland (2013–2023): a descriptive analysis and modelling study. The Lancet Regional Health–Europe. 2025;50.
[4]: Shah SA, Mulholland RH, Wilkinson S, Katikireddi SV, Pan J, Shi T, Kerr S, Agrawal U, Rudan I, Simpson CR, Stock SJ. Impact on emergency and elective hospital-based care in Scotland over the first 12 months of the pandemic: interrupted time-series analysis of national lockdowns. Journal of the Royal Society of Medicine. 2022 Nov;115(11):429-38.
[5]: Public Health Scotland. Unscheduled Care Datamart (UCD) [Internet]. Available from: https://publichealthscotland.scot/services/national-data-catalogue/national-datasets/search-the-datasets/unscheduled-care-datamart-ucd/. Accessed 3 October 2024.

(2) Identifying Emerging Zoonotic Disease Hotspots

This project aims to address threats to human health associated with zoonotic and emerging infectious diseases by developing new AI-based solutions for identifying disease hotspots and supporting interventions to help mitigate their spread. Using Highly Pathogenic Avian Influenza as a motivating example, the project will develop techniques for identifying potential pandemic risk hotspots by integrating information from a variety of sources, including human population data, remote sensing data, and species observation data.
The ultimate goal is to provide new tools to anticipate potential emerging disease risks and thus limit their impact on human health.

Supervisory team
Oisin Mac Aodha and Rowland Kao

Project Partner
Animal and Plant Health Agency (APHA)
The APHA is an executive agency of the Department for Environment, Food and Rural Affairs (Defra) of the United Kingdom. They work to safeguard animal and plant health for the benefit of people, the environment, and the economy. They will provide advice about data and methods, in addition to mentorship of the student. Importantly, they will also link the work to real-world use cases relevant to human health.

Project Background
It is estimated that 60%–75% of emerging infectious diseases in humans originate from zoonotic pathogens from wildlife [1]. Increased interactions between humans and animals, driven by climate change and by habitat loss and fragmentation, are intensifying the prevalence of zoonotic diseases [2]. As a result, predicting and combatting their emergence is a public health priority [3].

To mitigate the worst of these impacts, we need computational methods that can identify disease hotspots and pinpoint risk factors for disease spread. This is especially important in the context of Highly Pathogenic Avian Influenza, a likely candidate for the next pandemic [4]. Driven by advances in AI [5], attempts have been made to predict likely spillover events from specific species and pathogens in specific regions [6].
However, our ability to predict these events globally across larger species groups and diseases is hampered by the lack of information about the spatial ranges of species, their preferences for different habitat types, and changes in their propensity to come into contact with each other and with humans as a result of habitat loss.

Project Aims
This project aims to develop new tools, powered by recent advances in multi-modal AI, to predict the regions most at risk for the emergence of zoonotic diseases. The goal is to provide practical solutions to benefit human health. To achieve this, the project will (i) develop spatial distribution modelling techniques to generate estimates of global biodiversity at geographical scales relevant to the circulation of pathogens and landscape management decisions, (ii) identify likely zoonotic disease hotspots using existing datasets of infection events, and (iii) develop methods to recommend land management measures that increase resilience to infectious disease spread.

Translational Potential and Expected Impact
The main outputs of this project will be (i) new models and data products for spatial prioritisation of zoonotic disease spread risk, which will give practitioners the information they need to carry out interventions that increase resilience to zoonotic diseases at local scales, and (ii) new models and data products for estimating the spatial distributions of 100,000 different species. The data and code generated by the project will be made available to researchers under open and permissive licences, respecting the licences of the original training data. These outputs will provide valuable information for human health as well as ecological research and further the objectives of the UK Biological Security Strategy [12].

Training and Development Outcomes for the Student
In addition to the training provided by the CDT, the recruited student will benefit from being integrated into the groups of Oisin Mac Aodha (OMA) and Rowland Kao (RK).
Upon starting, OMA will help them identify any knowledge gaps and co-develop an action plan so that they can acquire the necessary domain knowledge over years one and two. They will attend OMA’s larger weekly group meetings in the SoI to present work in progress and receive feedback. They will also be connected to the larger Edinburgh Infectious Diseases network, in addition to OMA and RK’s network of collaborators, which will provide opportunities for future collaboration.

References
[1] Jones et al., Global trends in emerging infectious diseases, Nature 2008
[2] Wang et al., Emerging zoonotic viral diseases, Rev Sci Tech 2014
[3] Allen et al., Global hotspots and correlates of emerging zoonotic diseases, Nature Communications 2017
[4] Possas et al., Highly pathogenic avian influenza: pandemic preparedness for a scenario of high lethality with no vaccines, Front Public Health 2025
[5] Guo et al., Innovative applications of artificial intelligence in zoonotic disease management, Science in One Health 2023
[6] Sedricke Lapuz et al., Mapping the Potential Risk of Coronavirus Spillovers in a Global Hotspot, Global Change Biology 2025
[7] Cole et al., Spatial Implicit Neural Representations for Global-Scale Species Mapping, ICML 2023
[8] Lange et al., Active Learning-Based Species Range Estimation, NeurIPS 2023
[9] Beery et al., Species distribution modeling for machine learning practitioners: A review, Conference on Computing and Sustainable Societies 2021
[10] Daroya et al., WildSAT: Learning Satellite Image Representations from Wildlife Observations, ICCV 2025
[11] iNaturalist, https://www.inaturalist.org
[12] UK Biological Security Strategy, https://www.gov.uk/government/publications/uk-biological-security-strategy/uk-biological-security-strategy-html
[13] Gamża et al., Using sequence data to study spatial scales of interactions driving spread of Highly Pathogenic Avian Influenza in Great Britain, arXiv 2024

(3) From sequence to function: next-generation deep
learning tools for precision gene therapies

The success of gene therapies relies on the ability to precisely control the function of engineered DNA sequences. This project aims to harness artificial intelligence and machine learning to predict molecular function directly from genetic sequences, using a combination of high-throughput genotype-phenotype data and validation with our partners Trogenix. We will develop computational frameworks capable of learning the structural and contextual dependencies within biological sequences, and how these correlate with delivery of therapeutic payloads. The resulting models will enable accurate sequence-to-function predictions to optimize the design of gene therapies against aggressive cancers and other challenging conditions.

Supervisory team
Grzegorz Kudla and Diego Oyarzún

Project Partner
Trogenix

About the Project
We are seeking candidates to join our multidisciplinary team to tackle one of the most pressing challenges in gene therapy: predicting how DNA sequences determine molecular function. Gene therapy holds transformative potential for treating a range of serious health conditions, including aggressive cancers and genetic disorders previously deemed untreatable. The core of our project is to develop next-generation AI and machine learning models that can accurately predict the function of DNA sequences. Partnering with the leading biotechnology company Trogenix, we aim to create fit-for-purpose frameworks that can decipher the complex associations between DNA and molecular phenotypes, enabling precision in the design of gene therapies.

We will build predictors of function trained on libraries of regulatory DNA sequences using a combination of deep learning, geometric learning, and genomic language modelling, in tandem with global optimization algorithms for robust sequence design.
We aim to develop context-aware technology, suitable for low-N training, that ensures that therapeutic payloads are delivered at the right dose, at the right time and in the right tissue. As a member of our team, you will gain unparalleled experience and training in a vibrant ecosystem, with access to cross-domain knowledge as well as numerous networks and resources for career growth.

(4) Computational analysis bridging multi-omic data from skin organoid culture to therapeutic targets for atopic eczema

This PhD project will integrate lipidomic, metabolomic, and proteomic data from skin organoid models to uncover molecular drivers of eczema and identify candidate therapeutic targets. Under the supervision of Prof. Sara Brown (IGC, University of Edinburgh) and Prof. Mark Parsons (EPCC), the student will develop computational pipelines combining network biology, machine learning, and drug-target mapping. By bridging omics data with drug discovery resources, the project aims to define mechanistic pathways underlying skin barrier dysfunction and inflammation. Input from Eczema Outreach Support (EOS) will guide translation toward patient benefit and public communication, advancing responsible, data-driven dermatology.

Supervisory team
Sara Brown and Mark Parsons

Project Partner
TBC

Project Background
Eczema (atopic dermatitis) is a chronic, relapsing inflammatory skin disease affecting millions worldwide. Despite advances in immunomodulatory therapies, the molecular mechanisms driving its onset and persistence remain incompletely understood, particularly regarding lipid and metabolite dysregulation in the skin barrier. Prof. Sara Brown’s group has generated a rich multi-omics dataset, including lipidomics, metabolomics, and proteomics, derived from patient-relevant skin cells and organoid models that mimic human epidermal physiology.
These data offer an exceptional opportunity to decode the molecular pathways driving eczema and to identify actionable therapeutic targets.

This interdisciplinary project will leverage computational and systems-biology approaches to integrate these omic layers, define disease-associated molecular signatures, and link them to existing drug–target resources for discovery and repurposing. Collaboration with the Edinburgh Parallel Computing Centre (EPCC) ensures access to secure, high-performance computing environments. Sara Brown is a medical adviser and long-term collaborator of the patient support group Eczema Outreach Support (EOS). EOS will provide translational and patient-centred perspectives, supporting prioritisation of computational findings for real-world benefit and effective public engagement.

Project Aims
1. Integrate lipidomic, metabolomic, and proteomic profiles from skin organoids to model molecular networks underlying skin differentiation, barrier formation, and eczema.
2. Identify key dysregulated pathways and candidate biomarkers associated with barrier dysfunction and inflammation.
3. Develop computational pipelines linking molecular signatures to drug–target interaction databases to propose therapeutic candidates.
4. Establish reproducible, privacy-preserving workflows for multi-omics analysis within secure computing environments (EPCC).

Data and Methodology
The student will analyse existing multi-omics datasets generated by the Brown group, encompassing lipidomics, metabolomics, and proteomics from skin organoid models under varying experimental and disease-relevant conditions.

1. Data Processing and Integration: Pre-processing will involve normalization, quality control, and batch correction across modalities. Integration strategies will include similarity network fusion, canonical correlation analysis, and deep representation learning to capture cross-layer molecular relationships.

2.
Network and Machine Learning Approaches: Graph-based clustering, network propagation, and representation learning (e.g., graph neural networks, multi-view autoencoders) will be explored to detect modules of co-regulated features. Biological interpretation will rely on pathway enrichment and ontology analyses.

3. Drug Repurposing and Therapeutic Targeting: Using proteomic signatures, the student will perform connectivity mapping (CMap) and perturbation analysis to identify compounds that reverse disease-associated expression profiles. Network pharmacology approaches will map dysregulated proteins to known drug–target interaction graphs (DrugBank, STITCH, ChEMBL). Structural bioinformatics and docking tools may be explored for selected targets to evaluate compound–target affinity. This integrative pipeline will prioritize drug candidates for experimental validation.

4. Computing Environment: Analyses will be conducted using EPCC’s secure high-performance computing resources to ensure scalability, reproducibility, and compliance with data governance frameworks. The student will have access to EPCC’s wide range of supercomputing, data science and AI systems.

Deliverables: A reproducible computational pipeline, interpretable multi-omic networks, and a ranked list of candidate therapeutic targets linked to potential repurposing compounds.

Translational Potential and Expected Impact
This project unites expertise in dermatology (Brown Lab), computational science (EPCC), and translational engagement (EOS), fostering collaboration across academia, clinical research, and the third sector. By producing a scalable computational framework for integrating complex multi-omics data and linking findings to drug discovery pipelines, the project will have broad relevance for inflammatory and metabolic diseases. Outcomes will include novel mechanistic insights into skin development and eczema, prioritized therapeutic targets, and publicly accessible computational tools.
EOS’s involvement ensures patient-centred prioritization and effective dissemination to lay audiences, maximising societal and international impact.

Training and Development Outcomes for the Student
The student will gain cross-disciplinary expertise spanning computational biology, systems medicine, and drug discovery informatics. They will develop advanced skills in data integration, network modelling, and high-performance computing through EPCC, as well as bioinformatics and translational research methods under Prof. Brown’s supervision. Interaction with EOS will offer experience in public engagement and third-sector collaboration. The project provides professional development in scientific communication, responsible research, and reproducible software engineering, equipping the candidate for future roles in academia, healthcare data science, or the pharmaceutical sector.

References
Elias MS, Wright SC, Nicholson WV et al. Functional and proteomic analysis of a full thickness filaggrin-deficient skin organoid model [version 2; peer review: 3 approved]. Wellcome Open Res 2019, 4:134 (https://doi.org/10.12688/wellcomeopenres.15405.2)
Brown, Sara J. Keratinocytes Listen, Respond, and Actively Contribute to Crosstalk in the Epidermal Community and Beyond. Journal of Investigative Dermatology, 2024, Volume 144, Issue 12, 2628–2630
Budu-Aggrey, A., Kilanowski, A., Sobczyk, M.K. et al. European and multi-ancestry genome-wide association meta-analysis of atopic dermatitis highlights importance of systemic immune regulation. Nat Commun 2023; 14, 6172.
Standl et al. Gene-environment Interaction Affects Risk of Atopic Eczema: Population and In Vitro Studies. Allergy 2025 https://doi.org/10.1111/all.16605

(5) Clinically actionable insights into endometriosis symptom trajectories using longitudinal self-reports, biological samples, and data from digital technologies

Endometriosis is a chronic debilitating condition affecting about 10% of women of reproductive age.
There is an unmet clinical need to facilitate accurate, timely diagnosis, remote symptom monitoring, and intervention assessments. The project will focus on mining the largest longitudinal multimodal datasets in endometriosis (ongoing data collection from our team as part of two large-scale grants) to provide new clinically actionable insights into how self-reports, home-collected biological samples, and data from wearable sensors can facilitate endometriosis telemonitoring.

Supervisory team
Thanasis Tsanas, Andrew Horne and Philippa Saunders

Project Partner
Roche Diagnostics

Project Background
Endometriosis is a chronic condition associated with debilitating pain, fatigue, and heterogeneous symptom manifestation. It affects ~10% of women of reproductive age and may take ~8 years to diagnose, and monitoring of symptom progression typically relies on sparse clinical assessments. There is an urgent call for action to capitalize on recent biological and technical developments to improve diagnosis and symptom monitoring [1]. We have recently proposed developing a pioneering framework to transform endometriosis assessment, capitalizing on digital technologies [2].

Standardised patient-reported outcome measures (PROMs), in which people living with endometriosis regularly self-report on their symptoms, are increasingly used to monitor the progression of symptom severity. Similarly, regularly collected biological samples may offer insights into symptom trajectories over time. The use of digital health technologies can provide additional continuous and passively collected data, which can be mined to obtain new insights complementing clinical reports, lab tests, and PROMs. We recently reported on the largest endometriosis study of its kind, demonstrating how self-reports and wearable sensors can provide longitudinal insights into symptom trajectories and objective surgical intervention assessments [3].
Specifically, we have developed new signal processing and statistical machine learning algorithms for assessing physical activity, sleep, and diurnal rhythm variability, demonstrating how these could complement and inform clinical assessments.

Project Aims
The recruited student will further extend the algorithmic framework developed in the group to mine multimodal data (PROMs, lab-based results and clinical reports, data from wearables), to provide new clinically useful insights into endometriosis towards facilitating (a) longitudinal symptom monitoring, (b) objective intervention assessments, and (c) cohort stratification, capitalizing on our recently collected data and ongoing international data collection (£6m EUMetriosis project).

Ultimately, the goal is to develop clinical decision support tools for endometriosis assessment that will be embedded within the NHS/EXPPECT team (led by co-supervisor Prof. Andrew Horne) and potentially translated by the industrial project partner (Roche).

Data and Methodology
The student will explore multimodal datasets from recently completed and ongoing large international studies including (i) the ENDO1000, EUMetriosis and ADVANTAGE projects that the supervisory team is leading (collectively >500 people living with endometriosis, collected longitudinally, comprising PROMs, biological samples and actigraphy data), and (ii) additional unique actigraphy datasets to facilitate algorithm development with external measures of ground truth (e.g. actigraphy and polysomnography data; >100 participants already collected).
These are unique resources that the student will have direct access to from the point they start the project: they will not need to carry out any ethics applications or data collection, and their focus will be exclusively on data analysis.

They will develop and apply signal processing, time-series analysis, and multimodal data processing and information fusion algorithms to provide new clinical insights and facilitate longitudinal symptom monitoring in visceral pain. The student will need to have or develop an in-depth understanding of statistics, signal processing, and machine learning algorithms, including feature engineering, feature selection, model selection and validation. Moreover, the student will need to have or develop strong programming skills in a high-level programming language (e.g. MATLAB, R, or Python). Specifically, the student will develop methods to mine the questionnaires (e.g. item response assessments) and the data from wearables (actigraphy-based algorithms). They will also need to develop machine learning (feature selection, statistical mapping, information fusion) algorithms to provide insights into how the different modalities contribute to assessing endometriosis symptoms such as pain and fatigue longitudinally, and how these change as a result of interventions (e.g. dietary or surgical).

The student will receive additional input, if required, from colleagues based at partnering institutions and, regarding sleep and circadian health, from the Circadian Mental Health Network (Prof. Tsanas is a Co-I in the network and can make introductions if required).

Translational Potential and Expected Impact
The proposed PhD project builds on strong national and international partnerships in ongoing projects that the supervisory team leads: (1) ADVANTAGE, a £4.3m grant, and (2) EUMetriosis, a £6m grant.
The former is UK-based, with partners at the University of Cambridge, UCL etc., and the latter is international (led by colleagues from Belgium, with data collection in the UK and Croatia).

The student will focus on the data analytics, and there is a clear expectation in the project to assess how findings generalize to international cohorts (which in turn gives the resulting work strong potential for publication in high-impact-factor journals and for wider impact).

Training and Development Outcomes for the Student
We will train a T-shaped researcher with an understanding of both the technical work (signal processing, machine learning, programming) and the biomedical aspects (from PPIE to engaging with the clinical team in the NHS/EXPPECT), and also the clinical translation of work through the collaboration with Roche. We envisage the PhD graduate will have developed much sought-after skills in biomedical data science and will be exceptionally well placed to pursue their career in academia or industry.

References
[1] P.T.K. Saunders, A.W. Horne: Endometriosis: new insights and opportunities for relief of symptoms, Biology of Reproduction, (in press), https://doi.org/10.1093/biolre/ioaf164
[2] K. Edgley, A.W. Horne, P.T.K. Saunders, A. Tsanas: Symptom tracking in endometriosis using digital technologies: knowns, unknowns and future prospects, Cell Reports Medicine, Vol. 4(9), 101192, 2023
[3] K. Edgley, P.T.K. Saunders, L.H.R. Whitaker, A.W. Horne, A. Tsanas: Insights into endometriosis symptom trajectories and assessment of surgical intervention outcomes using longitudinal actigraphy, npj Digital Medicine, Vol. 8:236, 2025
[4] K. Woodward, E. Kanjo, A. Tsanas: Combining deep transfer learning with signal-image encoding for multi-modal mental wellbeing classification, ACM Transactions on Computing for Healthcare, Vol.
5(1):3, 2024

(6) Agent-Based Active Learning Model for Knowledge-Guided Molecular Design

This project will develop an agent-based active learning framework that integrates human medicinal chemistry expertise into AI-driven molecular design. By embedding domain knowledge within iterative learning cycles, the project aims to create models that not only predict compound performance but also account for synthesisability and design feasibility. The resulting agent-based “human-in-the-loop” system will enable adaptive compound selection informed by both data and expert reasoning. The outcome will be an interpretable, industrially deployable tool that bridges computational discovery and experimental validation, advancing the translation of AI innovations into real-world drug discovery workflows.

Supervisory team
Antonia Mey and Andrea Weisse

Project Partner
BioAscent
BioAscent will contribute to the success of this project through mentorship, knowledge exchange, and an industrial internship placement. The student will gain exposure to real-world discovery pipelines, compound design, and the practical constraints that shape medicinal chemistry decisions. BioAscent scientists will provide guidance on chemical feasibility assessment, design strategy, and how AI-driven compound selection can be effectively implemented within industrial workflows.

Project Background
Drug discovery remains a costly, time-intensive process, with high attrition rates often resulting from the synthesis of compounds that are theoretically promising but practically unfeasible. Artificial intelligence has demonstrated strong predictive capabilities in molecular property estimation, yet these models frequently overlook the implicit knowledge of experienced medicinal chemists, such as judgment on synthesisability, structural novelty, and project-specific priorities.

Active learning provides a mechanism for models to iteratively query new data, focusing on the most informative compounds to test.
However, traditional implementations are limited by purely statistical reasoning, which can diverge from the nuanced decision-making of human chemists and often lack explainability.

This project seeks to integrate medicinal chemists’ knowledge into active learning frameworks through agent-based modelling, enabling algorithms to reason and adapt more like expert practitioners. By incorporating synthesisability assessments and heuristic rules derived from expert feedback, the system will create a more realistic, human-aligned decision process. Medicinal chemistry and industrial expertise from the University of Edinburgh and BioAscent will ensure that the developed models are grounded in practical constraints and can be validated within real-world discovery pipelines.

Project Aims
The project aims to design and evaluate an agent-based active learning framework that integrates human expertise into molecular design. Specific objectives include:
1. Developing methods to encode medicinal chemist knowledge into AI decision-making loops.
2. Implementing active learning agents capable of balancing exploration (novel structures) and exploitation (synthetic feasibility).
3. Testing and validating the system using real-world compound datasets and expert-in-the-loop simulations, and providing explainable reasoning for model choices.
4. Demonstrating how expert-informed active learning improves both the efficiency and industrial relevance of AI-driven compound selection.

Translational Potential and Expected Impact
This project directly targets the translation of academic AI research into deployable drug discovery tools. By embedding chemist expertise into active learning systems, the resulting framework will improve compound prioritisation, reduce experimental waste, and accelerate lead optimisation. Industrial partners can apply the methodology to enhance decision-making efficiency and integrate AI seamlessly within discovery workflows.
The project will produce open, interpretable models and validated industrial case studies, contributing to the broader adoption of human-aligned AI systems in medicinal chemistry and advancing the UK’s position in data-driven biomedical innovation.

Training and Development Outcomes for the Student
The student will acquire interdisciplinary expertise spanning machine learning, computational chemistry, and medicinal chemistry. Training will include advanced AI model development, cheminformatics, and experimental design principles. Through collaboration with BioAscent, the student will gain valuable industrial experience via an internship and ongoing mentorship, developing practical insight into real-world discovery pipelines. The project’s interdisciplinary nature will cultivate transferable skills in data science, research ethics, communication, and innovation management, preparing the student for a career at the interface of AI research and pharmaceutical R&D.

References
1. Schneider, G. (2018) Automating drug discovery. Nat Rev Drug Discov 17, 97–113
2. Gorantla, R. et al. (2024) J. Chem. Inf. Model. 2024, 64, 6, 1955–1965
3. Ramos, M. et al. (2025) Chem. Sci., 16, 2514–2572
4. MacDermott-Opeskin, H. et al. (2025) 10.26434/chemrxiv-2025-zd9mr-v4

(7) From molecular mechanisms and cell states to Real-World Evidence and back in immunological disease

A major challenge and opportunity in genomic medicine is integrating data across scales to identify and link disease-causal variants, molecular mechanisms and cell types/states to clinical outcomes. This is necessary for efficient drug target candidate identification, as well as investigation of heterogeneous clinical outcomes with respect to disease progression trajectories or treatment response.
We have developed stat/ML methodologies, Stator and TarGene, for high resolution disease cell type/state identification from single-cell RNA-seq data and disease-causal DNA variant prioritisation from large-scale biobanks, respectively. Here, we aim to develop novel stat/ML methodologies to integrate molecular state quantification with genotype-phenotype inference for application in disease state stratification in immunological disease.

Supervisory team
Ava Khamseh and Sara Brown

Project Partner
Janssen Pharmaceutica NV

Janssen’s primary interest in this project is in genotype-phenotype causal inference for the purpose of identifying patient populations in which a treatment may exhibit differential efficacy across distinct subgroups, in the presence of multiplicity problems in characterizing such subgroups. Janssen’s Innovative Medicine department has extensive expertise in biostatistics, AI/ML, molecular biomedicine applications, and Real-World Evidence generation, which are of great value to this project. Janssen will delegate a representative to the advisory board of this project.

Project Background
Modern molecular biology, genomics and population medicine take advantage of thousands of variables at contrasting scales. Biology is only rarely conveyed by marginal variation involving a single molecule or phenotype at a time, or pair-wise correlation between two molecules or two phenotypes. We have recently developed two fully general state-of-the-art stat/ML methodologies, backed up by mathematical theory: (1) Stator, to identify cell types and states at high resolution from scRNA-seq data of disease vs healthy controls by taking advantage of high-order expression dependencies, (2) TarGene, for double-robust quantification of the effects of DNA variants and their interactions on disease outcomes in large-scale genotype-phenotype biobanks, with minimum bias and maximum power.
TarGene can be, and has been, used to integrate population genetics with functional genomics to study epistatic contributions to human traits via transcription factor mechanisms, thus prioritising candidate variants and genes for disease via molecular mechanisms. Given that Stator works on the RNA scale, and TarGene on the genotype-to-phenotype scale, we now wish to integrate these data modalities to link DNA variants to gene expression, mechanisms and disease phenotypes, which are expected to be heterogeneous for complex traits. This heterogeneity is in turn expected to lead to differences in disease trajectory, severity and treatment response.

Project Aims
The first aim of the project is to develop novel stat/ML methodologies for linking disease (severity/response) genes derived from genotype-phenotype population studies to cell states and corresponding RNA expression programmes derived from scRNA-seq data. The second aim of the project is to investigate how the identified strata of cell states/genes relate to differences in disease trajectory and/or severity and/or response to treatment. The key element of this project is to prioritise causation with respect to disease-relevance of cell (sub)types and states and genotype-phenotype inference. This is important to identify genomic contributions to subpopulations of the disease spectrum, in order to apply targeted therapies.

Data and Methodology
Stator utilises structure learning and model-free non-parametric estimators of higher-order interactions, implemented as a Nextflow software pipeline and Shiny app. TarGene utilises Targeted Learning (TL), involving diverse machine learning libraries and double-robust estimation strategies, such as Targeted Maximum Likelihood Estimation. TL also applies to quantification of treatment effects on disease outcomes under different treatment interventions (for TarGene, DNA variants are the analogue of “treatment interventions” in Real-World Evidence studies).
Broadly, the approach is analogous to LD score regression, which integrates GWAS summary statistics and gene expression data to investigate how genes prioritised from population studies of disease can be stratified by combinatorial gene expression in different cell (sub)types or states. The main differences are 3-fold: (1) Stator offers a higher resolution of cell (sub)types and states, with a focus on cell states, (2) TarGene can be utilised to discover new candidate variants/genes, both with and without functional genomics integration, depending on the type of input data, (3) the focus here is to identify strata of disease, and link these back to molecular differences amongst the individuals.

The methodology proposed is completely general and applicable to a diversity of disease areas. In this project, we develop and apply the proposed approach in the context of immunology, taking atopic dermatitis (AD) as an exemplar. We will utilise publicly available scRNA-seq data of AD and healthy controls, as well as large-scale biobanks such as the UK Biobank, All of Us and Our Future Health.

Translational Potential and Expected Impact
Drug discovery is generally an inefficient and costly process due to limited understanding of tissue heterogeneity, specifically related to identification of disease-relevant cell populations, their biological states, and the molecular mechanisms involved. Beyond initial discovery, treatments are often only successful in subpopulations of patients. There is therefore a need to prioritise causal variants, genes and cell types/states leading to disease trajectories and treatment response for optimal development of drug targets for various patient subpopulations who would otherwise respond differently to various treatments.
The focus here is on quantification of heterogeneous genomic contribution to disease outcome and/or treatment response.

Training and Development Outcomes for the Student
On the methodological front of this cross-disciplinary project, the student will develop technical skills in the development and application of rigorous statistical inference (semi-parametric efficiency theory) and machine learning techniques, throughout the PhD and by attending MSc-level courses in these areas and beyond. On the biomedical front, by working with biomedical data at various scales, the student will develop a deep understanding of molecular biology via scRNA-seq, genotype-phenotype inference in large-scale biobanks and Real-World Evidence generation. The student will further develop essential cross-disciplinary and translational communication skills, with access to a supervisory team with diverse expertise ranging across AI/ML, biostatistics and molecular biomedicine.

References
1. Review article: “A brief history of human disease genetics”, Nature, 2020, https://doi.org/10.1038/s41586-019-1879-7
2. Review article: “Refining the impact of genetic evidence on clinical success”, Nature, 2024, https://doi.org/10.1038/s41586-024-07316-0
3. Review article: “Applications of single-cell RNA sequencing in drug discovery and development”, Nature Reviews Drug Discovery, https://doi.org/10.1038/s41573-023-00688-4
4. Stator: “High order expression dependencies finely resolve cryptic states and subtypes in single cell data”, EMBO Molecular Systems Biology, 2025, https://doi.org/10.1038/s44320-024-00074-1
5. TarGene: “Semiparametric efficient estimation of small genetic effects in large-scale population cohorts”, Oxford Biostatistics, 2025, https://doi.org/10.1093/biostatistics/kxaf030
6. TarGene application: “Epistatic contributions to human traits via transcription factor mechanisms”, medRxiv, 2025, https://doi.org/10.1101/2025.09.28.25336826
7.
“Atopic Eczema: How Genetic Studies Can Contribute to the Understanding of this Complex Trait”, Journal of Investigative Dermatology, 2022, https://doi.org/10.1016/j.jid.2021.12.020
8. “Multi-omic triangulation identifies molecular candidates of atopic dermatitis severity”, medRxiv, 2025, https://doi.org/10.1101/2025.08.04.25332125

(8) AI-Based Design and Cell-Free Synthesis of Next-Generation Phage Therapeutics

This interdisciplinary project will develop an AI-based pipeline to engineer bacteriophage specificity, moving beyond discovery to active design. Leveraging the "Phrameworks" cell-free assembly platform developed with the external partner, Biophoundry, the student will train ML models to identify highly conserved regions on the bacterial surface. Using AI-based protein design methods, the student will design novel Receptor Binding Domains (RBDs) for the phage tail fibre, which will be assembled using Biophoundry’s proprietary "Trinity" technology for experimental validation. This computational-experimental cycle aims to develop effective antibacterial therapies by generating synthetic phages with a broad host range and a reduced risk of resistance evolution.

Supervisory team
Chris Wood and David Gally

Project Partner
Biophoundry

Biophoundry will serve as the industrial co-supervisor, leveraging their pioneering expertise in phage engineering and cell-free synthetic biology. Their primary contribution, aside from expertise provided through their co-supervision, will be providing access to their proprietary PHAX Foundry platform. This end-to-end platform unifies AI-driven design with cell-free production, and they will supply proprietary genomic and structural data from their T7 and K1f model phage systems to guide the student's machine learning model development.
They will also provide intellectual and scientific input into both the generative AI models and cell-free synthesis methodologies, alongside supervision focused on the translational pathway and commercial viability of the research for developing engineered phage assets.

Project Background
Antimicrobial resistance (AMR) is a global health crisis, demanding innovative therapeutic solutions. Phage therapy, the use of viruses to kill bacteria, is a powerful alternative to traditional antibiotics, yet its narrow host range and susceptibility to bacterial resistance mechanisms limit its clinical use. Our approach tackles this limitation by developing a generalisable, non-host-dependent design and manufacturing platform, based on cutting-edge protein design methods and sector-leading cell-free phage assembly methods. The student will design novel receptor binding domains (RBDs) and test them in collaboration with Biophoundry using their “Trinity” platform, which facilitates the rapid exchange of RBDs. The goal is to establish a predictive AI-based phage design pipeline, guided by evolutionary data to ensure sustained therapeutic efficacy against diverse pathogens.

Project Aims
1) Develop deep learning models to predict the binding affinity and killing efficacy of phage Receptor Binding Domains (RBDs) against diverse bacterial strains.
2) Design a generative ML model to propose novel RBD amino acid sequences optimised for broad-spectrum killing, targeting functionally constrained regions on bacterial cell-surface proteins.
3) Experimentally validate the best ML-designed RBDs using the Trinity engineering system and the Phrameworks cell-free assembly platform in collaboration with Biophoundry.

Data and Methodology
The project will draw on pre-existing data from Biophoundry and the supervisory team, as well as publicly available data, including:
1) Genomic and structural data for model phages T7 and K1f.
2) Genomic data from large, diverse panels of K.
pneumoniae (100 strains) and uropathogenic E. coli provided by the Gally lab.
3) Results from a 100x100 cross-infection experiment mapping host range provided by Biophoundry, which will be augmented by synthetic training data already identified by Biophoundry’s PHAX pipeline.

The student will develop a robust ML methodology in the following stages:

Dataset Generation - This involves creating sophisticated representations of bacterial receptor targets and phage RBDs using techniques such as structural prediction/modelling, sequence analysis and protein language model embeddings. These will be combined with the cross-infection data provided by Biophoundry to create the initial dataset.

Predictive Modelling - Machine learning models will be trained to predict bacterial receptor targets utilising data provided by Biophoundry, augmented with publicly available data.

RBD Design - A generative model will be designed to propose novel RBD sequences that optimise for target binding, host range, and compatibility with the Trinity engineering platform.

Wet-Lab Validation - The student will work closely with Biophoundry to synthesise and validate the top ML-designed RBDs. These will be integrated into the phage scaffold and assembled via the Phrameworks cell-free system. Efficacy and host range will be assessed using high-throughput plaque assays against the target bacterial panels, performed with the Gally Lab. Data generated experimentally will be fed back into the design pipeline to improve the models.

Translational Potential and Expected Impact
This work will deliver a high-value, translational platform for the rapid, intelligent design and generalisable manufacture of bacteriophage therapies. By shifting phage development from empirical discovery to precision, ML-guided engineering, we offer a scalable solution to the AMR crisis. The focus on conserved receptor targets yields broad-spectrum agents, while the cell-free assembly platform eliminates host-dependence in manufacturing.
The external partner, Biophoundry, is perfectly positioned to translate the intellectual property and validated ML pipeline into commercial drug assets, ensuring immediate societal and economic impact in infectious disease treatment.

Training and Development Outcomes for the Student
The student will receive truly interdisciplinary training, becoming an expert in the convergence of AI and synthetic biology. Core ML Skills: Advanced training in deep learning architectures (GNNs, Transformers), protein sequence modelling, and generative design. Core Biomedical Skills: Expertise in synthetic biology (cell-free systems, phage engineering), molecular virology, and microbiology, including bacterial resistance mechanisms. The placement/collaboration with Biophoundry will provide invaluable experience in the drug development lifecycle, commercialisation strategy, IP management, and industrial-scale project delivery, making the student highly competitive for both academic and industrial careers.

References
https://doi.org/10.1073/pnas.2313574121
https://doi.org/10.1002/pro.5148
https://doi.org/10.1021/acssynbio.2c00244
https://doi.org/10.1038/s41586-025-09429-6
https://doi.org/10.1126/sciadv.adt6432
https://doi.org/10.1101/2025.09.12.675911

(9) Unlocking the Image: Enhancing the use of Medical Scans for Brain Health Prediction through Radiology Report Analysis

This project explores the use of radiology reports, combined with medical imaging on clinical data. By extracting more nuanced information from free text, this will enable richer phenotyping for research purposes, help identify referral reasons (improving generalisability) and improve image quality assessment. Additionally, through integrating Vision-Language Models, we will improve the prediction of brain health conditions.
We will use data collected during general healthcare (facilitating future integration into clinical workflows), and process it within Trusted Research Environments (TREs) to ensure patient privacy.

Supervisory team
Michael Camilleri, Beatrice Alex and Grant Mair

Project Partner
Public Health Scotland

Project Background
The use of health data in research is often constrained to structured entries (e.g. ICD codes [1]), while most of the qualitative and nuanced understanding of the patient's health is recorded in free text, such as GP notes or radiology reports [2].

At the same time, the recent successes in Natural Language Processing (NLP) [3] provide a relatively untapped opportunity to extract value from such unstructured data. Automated processing of clinical notes can help ascertain existing conditions [4] or, as proposed herein, identify biases in the data [2], which can feed into improving the robustness of AI tools applied to health data. Additionally, integrating language with visual models promises to improve performance of downstream tasks such as disease classification and prediction [5].

This is accelerated by the rising availability of Trusted Research Environments (TREs) [6], which aim to open up clinical data for research purposes, ensuring that any methods developed can be more easily integrated into clinical workflows. Chief among these is the Brain Health Data-Pilot (BHDP) [7], within the Scottish National Safe Haven (NSH), with more than 1.2 million brain scans and linked Electronic Health Records (EHRs) from across Scotland.

Project Aims
The primary goals of this project will be to process free-text radiology reports accompanying medical images (MRI/CT) to: (a) extract key conditions, artefacts and image quality features, (b) identify the reason for the scan (why the subject was referred to have a scan), and (c) as a stretch goal, integrate with an Imaging module as a Vision-Language-Model (VLM) [5] to improve prediction of brain health conditions (e.g.
Dementia).

Data and Methodology
This project uses clinical datasets, which provide orders of magnitude more data and heterogeneity than publicly available sources [7], while offering novel research opportunities due to their 'raw' nature. Access to the TRE (ensuring patient privacy) will be facilitated through having eDRIS as our external partner for the Scottish NSH (BHDP). Furthermore, there is scope for using consented data (e.g. Generation Scotland [10] or UK Biobank) as an alternative source of data to complement the above.

Methods
The project has 3 work packages:
1. Enrich Research Value of Radiology reports: The Language Technology Group [8] developed a rule-based system, EDIE-R [4], to identify 24 brain-scan phenotypes. This will provide a starting point to develop newer neural models (e.g. Transformers [9] or Large Language Models [3]) to extract relevant concepts. Using neural models will also allow us to extend to other relevant phenotypes, and also to image quality metrics (e.g. movement artefacts).
2. Understanding Scanning Bias: The next step is to infer the reason for the scan. This will involve eliciting signal from the clinical history portion of the report. Furthermore, this may be missing in some scans, and hence will necessitate learning a mapping from the radiologist report to the referral context in a semi-supervised setting, allowing reasoning about selection bias in scanned individuals.
3. Improving Prediction of Brain Health: This can be extended to disease progression models, incorporating condition codes [11] or MRI/CT scans themselves (using a VLM [5]) to improve prediction of brain health conditions, e.g. Dementia.

Translational Potential and Expected Impact
The use of clinical data and input from domain experts (and the project partner) will ensure that the aforementioned systems can more easily be deployed in clinical workflows. Concretely, this work will:
1.
Develop systems to accelerate health research by increasing the value of free text reports, and which can, in clinical settings, summarise patient trajectory for new consultations.
2. Provide a path to analysing biases in referrals to scanning, improving fairness and trustworthiness of predictive models for diseases.
3. Develop and advance TRE functionality in collaboration with eDRIS.

Training and Development Outcomes for the Student
* Developing skills in applying/implementing deep learning for NLP and medical imaging
* Data Science for curation of raw data within constrained environments (TREs)
* Experience in using and developing the emerging field of TREs, including ethics and governance procedures.
* Experience in working with real-world health data and collaborating with clinical domain experts
* Experience in Patient and Public Involvement to shape the direction of research.

References
- [1] International Statistical Classification of Diseases and Related Health Problems. https://www.who.int/standards/classifications/classification-of-diseases
- [2] Tang, A.S., Woldemariam, S.R., Miramontes, S. et al. "Harnessing EHR data for health research". Nat Med 30, 1847–1855 (2024). https://doi.org/10.1038/s41591-024-03074-8
- [3] Artsi Y., Klang E. et al. "Large language models in radiology reporting - A systematic review of performance, limitations, and clinical implications". Intelligence-Based Medicine, 12 (2025), ISSN 2666-5212, https://doi.org/10.1016/j.ibmed.2025.100287
- [4] Alex, B., Grover, C., Tobin, R. et al. Text mining brain imaging reports. J Biomed Semant 10 (Suppl 1), 23 (2019). https://doi.org/10.1186/s13326-019-0211-7
- [5] Li X., Li L. et al. "Vision-Language Models in medical image analysis: From simple fusion to general large models". Information Fusion, 118 (2025), ISSN 1566-2535, https://doi.org/10.1016/j.inffus.2025.102995
- [6] Trusted Research Environments.
https://www.hdruk.ac.uk/access-to-health-data/trusted-research-environments/
- [7] Camilleri M., Gouzou D. et al. "A large dataset of brain imaging linked to health systems data: a whole system national cohort" (in preparation).
- [8] Language Technology Group (website) https://www.ltg.ed.ac.uk/
- [9] Tay Y., Dehghani M., et al. "Efficient Transformers: A Survey". ACM Comput. Surv. 55, 6, Article 109 (June 2023), https://doi.org/10.1145/3530811
- [10] Generation Scotland https://genscot.ed.ac.uk/
- [11] Shmatko, A., Jung, A.W., Gaurav, K. et al. Learning the natural history of human disease with generative transformers. Nature (2025). https://doi.org/10.1038/s41586-025-09529-3

(10) Addressing patient mortality in hemodialysis via AI applied to metabolomics and material science

Patients undergoing hemodialysis (HD) exhibit significantly higher mortality rates compared to those who had kidney transplants. This disparity is largely attributed to the accumulation of uremic toxins that standard HD treatments fail to completely remove. Despite this acknowledged issue, systematic identification of specific uremic toxins impacting mortality in patients receiving maintenance HD has not been effectively addressed. This project integrates AI, metabolomics, and biomedical materials science to accelerate the identification of key metabolites and biological pathways involved in the mortality of dialysis patients and to discover biocompatible filtering materials that could enhance HD efficacy in toxin removal.
By leveraging data from existing literature and collaborations, this synergistic approach seeks to elucidate the mechanisms behind elevated mortality in HD patients and develop solutions to mitigate these risks, with the ultimate goal of reducing patient mortality.

Supervisory team
Grazia De Angelis, Karl Burgess and Bryan Conway

Project Partner
Kidney Research UK

Project Background
Approximately 2 million individuals globally suffer from kidney failure, necessitating treatment options such as transplantation and dialysis. Transplantation is limited by donor availability, forcing many to rely on HD. Whereas transplant recipients exhibit approximately 80% survival rates five years post-procedure, those undergoing HD have less than a 50% chance of surviving the same period due to what is known as “residual uremic syndrome.” This condition results from the incomplete removal of certain uremic toxins during HD, significantly contributing to the higher mortality observed in these patients [1]. Current HD technologies rely on membranes that are limited by size and are thus unable to effectively eliminate larger uremic toxins from the patient's bloodstream. This approach lacks precision and effectiveness, as it is designed around small molecules like urea and fails to address other, more harmful toxins.

Project Aims
Our research aims to enhance HD treatment effectiveness and reduce mortality rates through a multidisciplinary strategy. Initially, we must identify metabolites linked to adverse effects, leveraging metabolomics combined with AI to uncover key molecules influencing kidney failure patient outcomes. Prior studies show inconsistent results, highlighting the complexity of metabolite impacts on patient mortality and emphasizing the need for deeper investigation. We plan to use an integrated metabolomics and AI approach to better understand these mechanisms, paving the way for future comprehensive studies and the development of materials tailored to remove toxic metabolites.
AI will play a crucial role in rapidly advancing these objectives, tackling the vast scope of toxins and potential materials.

Project Activities
As a PhD student on this project, your primary role will involve:
* Utilizing data from landmark studies carried out over the past decade, enhancing your understanding of clinical outcomes in hemodialysis.
* Engaging in molecular simulations to assess databases containing thousands of porous materials, focusing particularly on Covalent Organic Frameworks, to identify those capable of efficiently removing harmful toxins from the bloodstream.
* Applying sophisticated machine learning techniques to screen these materials on a large scale, a methodology currently being developed by our Engineering group.
* Synthesizing and/or selecting optimal materials based on the unique properties required for effective toxin removal, thereby directly contributing to the design of more efficient and patient-centered hemodialysis treatments.

Collaboration with Kidney Research UK and access to their NURTuRE biobank provides a rich, real-world context for your research, offering the opportunity to validate your findings against an extensive range of patient data.

Translational Potential and Expected Impact
This project not only aims to lead to significant academic contributions but also holds the potential to translate into real-world clinical applications that could drastically reduce patient mortality. We expect this project to lay the basis for interdisciplinary research between the involved groups and provide evidence for larger studies.

Training and Development Outcomes for the Student
Through this project, the student will gain invaluable skills in both the practical and theoretical aspects of biomedical research. They will develop proficiency in metabolomics and artificial intelligence techniques, learning to interpret complex biological data and to apply machine learning algorithms for real-world applications.
Additionally, the student will enhance their capabilities in molecular simulations and materials science, crucial for addressing clinical challenges. Through collaboration with external partners, such as Kidney Research UK, and interdisciplinary teamwork, they will also improve their communication and project management skills. This comprehensive training will prepare them for a successful career in bioinformatics and materials engineering.

References
[1] The Kidney Project, University of California San Francisco, https://pharm.ucsf.edu/kidney
[2] S. Al Awadhi et al., A Metabolomics Approach to Identify Metabolites Associated With Mortality in Patients Receiving Maintenance Hemodialysis, Kidney Int Rep 2024 9, 2718–26.
[3] S. Kalim et al., A Plasma Long‐Chain Acylcarnitine Predicts Cardiovascular Mortality in Incident Dialysis Patients, J American Heart Association 2, 2013.
[4] Hu, J.-R., et al., Serum Metabolites and Cardiac Death in Patients on Hemodialysis, Clin J Am Society of Nephrology 14(5): 747-749, 2019.
[5] https://nurturebiobank.org/ , visited on 4th October 2025.
[6] T. Fabiani et al., In silico screening of nanoporous materials for urea removal in hemodialysis applications, Phys. Chem. Chem. Phys., 2023, 25, 24069.
[7] REDIAL, redefining hemodialysis with data-driven materials innovation, project https://www.suspromgroup.eng.ed.ac.uk/redial
[8] Zarghamidehagani and De Angelis, Machine learning-driven computational screening of covalent organic frameworks for gas separation applications, Separation and Purification Technology, 2025, 377, 134358.
[9] Zarghamidehagani et al., Chemical engineering contribution to hemodialysis innovation: achieving the wearable artificial kidneys with nanomaterial based dialysate regeneration, Physical Sciences Reviews, 2025, 10(3), pp. 279–299

(11) AI for Enhanced Decision-Making for Imaged Abnormalities of the Pancreas

Early detection of pancreas cancers and pre-malignant lesions offers the best chance of cure for pancreatic cancer.
Currently, patients at risk are managed through frequent imaging and clinical assessment, processes that are manual, time-consuming, and prone to error. This project will develop an AI system integrating imaging models and clinical data to detect early malignant transformations in the pancreas. Supervisory teamEleonora D’Arnese, Amir Vaxman and Damian MoleProject PartnerNHS LothianProject BackgroundAbnormalities in the pancreas detected on CT carry a risk of malignant transformation and require long-term surveillance. The incidence of such referrals is rising rapidly due to the increased number of scans done for other reasons, placing increasing demand on skilled specialists who must manually compare scans over time. This process is labour-intensive, costly, and prone to error: false negatives can delay treatment or allow cancers to go undetected, while false positives may lead to unnecessary surgery. Moreover, patients with low- or negligible-risk abnormalities are subjected to prolonged and expensive monitoring, impacting their well-being. Early detection of malignant transformation of abnormal areas could substantially improve survival.Project AimsThe primary goal of this project is to develop an AI-based image analysis and decision-making augmentation solution for the surveillance and early detection of cancers or pre-cancers in the pancreas. This project will create a new tool that, starting from routinely acquired images and clinical data, will monitor, analyse, and inform decision-making.Training and Development Outcomes for the StudentThe student will train in: AI for scientific computation, medical image processing, geometry processing, and clinical imaging diagnostics. The research will begin by developing the algorithms in a sandbox on training examples (which could be synthetic), and will then progress to the clinical dataset to develop the algorithms further. 
Concrete development outcomes are: 1) Acquiring fundamental AI, scientific computation, and diagnostic skills. 2) Creating a mature proof-of-concept for pancreatic abnormality analysis. 3) Developing an algorithm using real-world clinical data to meet standardized diagnostic metrics. (12) Machine learning driven clinical prediction models using multimodal data for robot-assisted surgical informatics With the remarkable progress in Artificial Intelligence (AI), particularly in the field of Transformers, machine learning-driven clinical prediction models (CPM) are gaining prominence in the literature [1]. However, most of these models are yet to be applied in practice for real-world clinical decision-making. To translate these tools into real-world applications, they need to be accessible, adaptable, and actionable. In this project, we will develop usable models and assess their translation potential to decision-making in robot-assisted surgery (RAS). Recent advances in RAS have revolutionized healthcare and allowed the collection of real-time data before, during and after surgery that can assist critical decision-making around when these surgeries should be offered and what potential complications might arise from them. A usable predictive model will facilitate this and lead to safer decision-making, reducing the burden on individuals and the healthcare system.Supervisory teamSohan Seth and Ewen HarrisonProject PartnerIntuitiveProject BackgroundRecent years have witnessed significant progress in machine learning driven clinical prediction models [1]. These models have been shown to be robust, accurate and well calibrated on various publicly available benchmark datasets, e.g., MIMIC-IV. 
Using these models in practice, however, is not straightforward: it additionally requires them to be accessible, adaptable, and actionable, such that they can deal with multimodal data under competing risks, predict various outcomes of interest simultaneously in real time, and present their decisions in a human-interpretable manner to guide practical decisions under various resource and safety constraints. This is challenging, and particularly difficult in high-stakes environments such as lifesaving surgeries. Therefore, these models are yet to be widely applied to decision-making in clinical practice. Recent technological advancements have witnessed the advent of robot-assisted surgeries, making them safer and providing real-time measurements, paving the way for data-driven decision-making. But critical decisions remain to be made around the selection of surgery, in the context of whether the benefit from surgery outweighs the complications for postoperative care. Having a better sense of factual and counterfactual situations over multiple outcomes and constraints provides a holistic view of treatment that helps with more informed decision-making at an individual level, and with resource allocation at a healthcare level.Project AimsWe aim to develop predictive models that are accessible, i.e., the model’s decision is understandable to the end-users and traceable to the features responsible for the decision; adaptable, i.e., the model can be transferred to different populations relatively easily and can be adapted to a changing environment; and actionable, i.e., the model can integrate various data sources at potentially multiple resolutions and can provide real-time outcomes from longitudinal data. 
We aim to assess the model's ability to uncover the mechanisms that drive complications, resilience, and recovery, or to test whether different surgical approaches truly minimise physiological stress across diverse patient groups.Translational Potential and Expected ImpactThe project develops machine learning driven clinical prediction models with the goal of making them usable. The project aims to assess the translation of a recently developed method into a real-world application. The current technology is at Technology Readiness Level 3, and we expect the project to explore its performance on real data beyond publicly available benchmarks, potentially moving it towards Technology Readiness Level 4. In addition, we expect the project to evaluate performance beyond accuracy and calibration, and to establish the method on various usability metrics based on transparency, traceability, accessibility, privacy, adaptability, etc. We expect the project to push the boundaries of translation-ready clinical predictive models and set standards in data and methods practices in healthcare informatics. The successful completion of the project will enable clinicians to make real-world decisions around life-saving surgeries and post-operative care.Training and Development Outcomes for the StudentWe expect the project to train the prospective student in cutting-edge AI tools and health informatics. The project requires developing machine learning models and deploying these models in clinical decision-making. The project also involves an understanding of the clinical variables, pre-processing and interpretation. The student will be based in the Data Science Unit (DSU) at the School of Informatics. DSU hosts a diverse range of researchers working in various disciplines, including health, social science, chemistry, geosciences, etc. This gives the student diverse exposure. 
The student will also be based in the Surgical Informatics group, hosting researchers with a range of clinical and health informatics expertise, allowing the student to learn from a different discipline besides informatics.References[1] https://doi.org/10.1038/s41586-025-09529-3 (13) AI-Driven Multimodal Alignment for Predicting Treatment Outcomes in Renal Cell Carcinoma Immune checkpoint inhibitors (ICIs) have markedly improved survival for several cancers, but safe, effective deployment in the NHS requires better tools and data to optimise use and manage toxicities. A major unmet need is robust biomarkers that distinguish responders from non-responders, predict immune-related adverse events and guide personalised therapy.This project addresses that gap by integrating RNA sequencing, whole-exome sequencing and immunofluorescence imaging to predict treatment outcomes and discover novel biomarkers. It will develop advanced AI-based multimodal learning methods to align and combine these diverse data types, aiming to deliver a more accurate, comprehensive picture of tumour–immune interactions.Supervisory teamAjitha Rajan, Siddarth N. and Alexander LairdProject PartnersNHS Lothian, Francis Crick Institute and University of CalgaryThe external partner will provide patient data including RNA seq, Exome seq and immunohistochemistry data, accompanied by high-quality clinical data with the treatment given. NHS Lothian and Francis Crick Institute will also provide clinical expertise in understanding the data modalities, guidance in alignment of modalities, and interpretation and validation of results from the AI models. Javier Alfaro at the University of Calgary will help with preprocessing raw sequencing data and interpreting RNA seq and Exome seq data.Project BackgroundRenal cell carcinoma (RCC) is the most common form of kidney cancer and the eighth most prevalent cancer in the UK. 
Immunotherapy, which activates the immune system to target cancer cells, has transformed outcomes in several malignancies, including melanoma, lung cancer, and metastatic RCC (mRCC). However, most patients with mRCC ultimately develop intrinsic/acquired resistance to ICI and die of their disease. Moreover, immune-related adverse events (irAEs) can restrict the safe use of immune checkpoint inhibitors, affecting treatment efficacy and patient quality of life.Identifying robust biomarkers that predict treatment response, resistance, and irAEs remains an urgent and unmet clinical need. Addressing this gap would enable more precise patient stratification, improve therapeutic outcomes, and optimise allocation of healthcare resources.This project will develop novel AI innovations in multimodal data integration and modelling using in-depth patient profiles containing RNA sequencing, whole-exome sequencing, and immunofluorescence data. The key contribution will be understanding the role of these modalities for immunotherapy response, what information they convey, and how they align with each other to discover biomarkers and predict treatment outcomes.Understanding and analysing patient data will involve close collaboration with Dr. Alex Laird, a consultant urological surgeon at the Western General Hospital, and clinical researchers at Francis Crick Institute.Project Aims- Biomarker Discovery and Treatment outcomes (Progression Free Survival)- Assess Unimodal efficacy- Multimodal alignment and efficacy for biomarkers- Uncertainty quantification in the unimodal and multimodal settings for prediction- Evaluation using patient dataData and MethodologyData -Data from 122 patients for all three modalities (incl. clinical and blood) is already available to use. Additional data sources (e.g. through Glasgow hospitals) will also be explored in the first year of the project. 
The supervision team has experience in data sharing agreements and accessing NHS data for renal cancer (prior H2020 project: KATY). The PhD will also use the existing multimodal data on renal cell carcinoma (multi-omics, histopathology and clinical data) collected as part of the MANIFEST project, which is funded by MRC and led by Francis Crick Institute (and which Dr. Amy Strange, Prof. Ajitha Rajan and Dr. Alex Laird are a part of), to design predictive AI models for biomarker discovery and for predicting treatment outcomes. In particular, the MANIFEST project has access to the following clinical trial data: RAMPART – Sample size 551. The study looks at two new immunotherapy treatments. The aim was to find out whether taking one drug (durvalumab) or a combination of two drugs (durvalumab and tremelimumab) for one year can prevent or delay kidney cancer from coming back compared to the current standard of care (active monitoring after surgery). PRISM [5] – Sample size 192. The aim of the PRISM study is to assess whether less frequent dosing of ipilimumab (12-weekly versus 3-weekly), in combination with nivolumab, is associated with a favourable toxicity profile without adversely impacting efficacy. MITRE [6] – Sample size 81. The MITRE study explores and validates a microbiome signature in a larger scale prospective study across several different cancer types. The Cancer Genome Atlas Program (TCGA) will also be used.Methodology -Research challenges to be addressed in this project are as follows:1. Unimodal: Explore SOTA methods for prediction and review results across modalities (late fusion) to:- Understand which modalities provide the best predictors- Understand the relationships between modalities- Understand the optimal combination and ordering of a pipeline of models which could be mapped to clinical care (and collection of samples).2. Integration: Explore data-integration to see where and how modalities can be combined to provide further insight (via intermediate fusion). 
Situations involving varying amounts of overlap across modalities will require developing novel approaches. For example, similar modalities such as omics might allow earlier integration (i.e., jointly learning representations), whereas other cases involving distinct image modalities may be integrated later. A hierarchy of integrations can ensure fusion between modalities is effective at each level.3. Alignment: Evaluate the extent to which information can be aligned between different modalities and whether such alignment enhances the prediction of treatment outcomes. This involves using information-theoretic measures to identify and quantify alignment, investigating methods to merge aligned information at different stages, and learning such alignment from scratch. Techniques such as AJIVE and latent-space exploration will be employed to assess and interpret cross-modal relationships.4. UQ: Develop metrics to associate model predictions with a degree of confidence for clinical correlations. Explore Stochastic Weight Averaging-Gaussian (SWAG) and EpiNets as initial techniques.5. Evaluation: Conduct an evaluation study with clinicians to evaluate the accuracy of the model outputs and explanations for treatment outcomes with patient data from NHS Lothian and MANIFEST cohorts.Translational Potential and Expected Impact- A suite of unimodal models to predict immunotherapy response and toxicities- Multimodal fusion for modular integration of unimodal immunotherapy response and understanding modalities with shared and disjoint information on immunotherapy response.- Uncertainty quantification as a human oversight measure for confident predictionsThe project has immense scope for impact in the clinic and industry through the network of clinical partners in a recently completed H2020 project, KATY, and other NHS collaborators, and through a vast network of industry partners in the MANIFEST project.Training and Development Outcomes for the StudentThe PhD student will develop expertise in multimodal AI for 
predicting renal cancer treatment response, gaining skills in integrating imaging, omics, and clinical data using deep learning, multi modal alignment and uncertainty quantification of AI methods. They will learn robust model development, validation, and reproducibility practices while ensuring ethical and responsible use of patient data. Domain knowledge in renal cell carcinoma, biomarkers, and treatment mechanisms will be strengthened. The student will enhance scientific communication, collaboration with clinicians, and project management abilities, contributing to multidisciplinary research outputs and publications, preparing for careers in academia, healthcare AI, or precision oncology.References1. Maddox, W.J., Izmailov, P., Garipov, T., Vetrov, D.P. and Wilson, A.G., 2019. A simple baseline for bayesian uncertainty in deep learning. Advances in neural information processing systems, 32.2. Osband, I., Wen, Z., Asghari, S. M., Dwaracherla, V., Ibrahimi, M., Lu, X., & Van Roy, B. (2023). Epistemic neural networks. Advances in Neural Information Processing Systems, 36, 2795-2823.3. Feng, Qing, et al. "Angle-based joint and individual variation explained." Journal of multivariate analysis 166 (2018): 241-265.4. Qu, Linhao, et al. "Multi-modal data binding for survival analysis modeling with incomplete data and annotations." International Conference on Medical Image Computing and Computer-Assisted Intervention. Cham: Springer Nature Switzerland, 2024.5. Buckley HL, Collinson FJ, Ainsworth G, Poad H, Flanagan L, Katona E, Howard HC, Murden G, Banks RE, Brown J, Velikova G, Waddell T, Fife K, Nathan PD, Larkin J, Powles T, Brown SR, Vasudev NS. PRISM protocol: a randomised phase II trial of nivolumab in combination with alternatively scheduled ipilimumab in first-line treatment of patients with advanced or metastatic renal cell carcinoma. BMC Cancer. 2019 Nov 14;19(1):1102. doi: 10.1186/s12885-019-6273-1. PMID: 31727024; PMCID: PMC6854710.6. 
Thompson NA, Stewart GD, Welsh SJ, Doherty GJ, Robinson MJ, Neville BA, Vervier K, Harris SR, Adams DJ, Dalchau K, Bruce D, Demiris N, Lawley TD, Corrie PG. The MITRE trial protocol: a study to evaluate the microbiome as a biomarker of efficacy and toxicity in cancer patients receiving immune checkpoint inhibitor therapy. BMC Cancer. 2022 Jan 24;22(1):99. doi: 10.1186/s12885-021-09156-x. PMID: 35073853; PMCID: PMC8785032. (14) BRAID: Breast Radiological AI-Integrated Cancer Diagnosis – A Clinician-Centric Framework Deep learning has demonstrated strong performance in medical imaging, yet its clinical adoption remains limited due to the opaque, black-box nature of many models. In high-stakes settings like cancer diagnostics, accuracy alone is insufficient; clinicians need clear, interpretable explanations to ensure patient safety and build confidence in AI-assisted decisions. Therefore, this project, by specifically focusing on breast cancer, will be developing a clinician-driven AI framework for breast cancer diagnosis. It will develop a transparent, explainable, and robust solution to support effective, safe, and trustworthy decision-making in real clinical settings.Supervisory teamAjitha Rajan, Eleonora D'Arnese and Rishi RamaeshProject PartnersNHS Lothian and Erasmus MC (Rotterdam)NHS Lothian will support this project by helping gain access to anonymised patient mammograms and breast MRI data, together with associated clinical and demographic information, through Public Health Scotland and eDRiS in accordance with ethical and data governance regulations.Rishi's clinical team, including consultant radiologists and breast imaging specialists, will contribute clinical insights to guide model design and interpretation, ensuring alignment with diagnostic workflows and clinical reasoning. They will also assist in defining clinically relevant causal relationships to inform the project’s causal graph design. 
Additionally, they will participate in the evaluation of AI outputs, providing structured feedback from several radiologists to assess interpretability, usability, and clinical impact. Finally, Rishi, as an NHS Innovation Fellow, will support translation of the project into clinical practice.Jacob Visser at Erasmus MC in Rotterdam will provide guidance on identifying diagnostically relevant image features and clinical variables to inform model development and causal analysis. He will also assist in defining clinically meaningful causal relationships that can be used to build and validate the project’s causal reasoning framework, ensuring that the resulting AI models align with real-world diagnostic reasoning. As the project progresses, Erasmus MC radiologists will participate in the clinical evaluation of AI-generated outputs, offering qualitative and quantitative feedback on interpretability, trustworthiness, and clinical utility.Having experts at two different sites will also mitigate expert and location bias for the project.Project BackgroundBreast cancer is the most common type of cancer in women, with around 55,000 people being diagnosed with the disease yearly. Currently, UK women between 50 and 71 are invited to be screened every 2 to 3 years to help detect cases. This equates to around 2.1 million breast cancer screens carried out annually, helping to prevent around 1,300 deaths. Accurate and timely diagnosis is critical in the management of breast cancer, with early detection through imaging significantly improving patient outcomes, reducing the need for invasive interventions, and easing the financial and operational burden on healthcare services. However, recent reports have highlighted persistent challenges in interpreting medical imaging within healthcare services like the NHS. 
In response, several artificial intelligence (AI) tools are being trialled in hospitals to assist radiographers by triaging images, prioritising abnormal findings, and expediting urgent cases. While these developments show the potential of AI to enhance diagnostic workflows, the opaque, black-box nature of many deep learning–based systems poses a significant barrier to clinical integration. For AI tools to be fully adopted and trusted in sensitive, high-stakes settings such as breast cancer imaging, it is essential to develop interpretable, transparent models that provide clear, understandable reasoning alongside diagnostic outputs.Project AimsThe project aims to develop a clinician-driven AI framework for breast cancer diagnosis - one that is transparent, explainable, and robust to support effective, safe, and trustworthy decision-making. The approach will prioritise clinical interpretability and reasoning, aiming to build models that perform well while providing meaningful insights that clinicians can trust and act upon. The framework will require the development and integration of transparent and interpretable AI models, causal reasoning, and robustness evaluation, the last of which will rely on generative AI-based synthetic images.Data and MethodologyData -We aim to have 100K mammograms and 10K breast MRIs from patients in Scotland. The number of mammograms is larger because mammography is the primary modality for routine scanning. We plan to gain HSC-PBPP approval for accessing this data before January 2027. The supervisory team have experience in obtaining HSC-PBPP approval for imaging data from Public Health Scotland through eDRiS and working in the national safe haven. This prior experience will mitigate the risk around access to patient image data.Methodology -To ensure model transparency and clinical interpretability, a set of clinically meaningful concepts for both mammograms and MRIs will be defined. 
These sets will be derived from the BI-RADS atlas and refined through collaboration with radiologists to ensure clinical relevance. These concepts will form the basis of a concept bottleneck model, a two-stage classification architecture designed to enhance interpretability. The first model predicts the presence of individual clinical concepts directly from images, while the second model takes these predicted concepts and outputs the final diagnostic label, the BI-RADS score (similar to our recent work in [1]). In addition, causal structures will be defined to reflect expert understanding of the relationships between imaging features and diagnostic outcomes, as specified in the BI-RADS lexicon. The resulting causal graphs will encode how specific imaging features (e.g., calcifications) causally contribute to BI-RADS scores, distinguishing them from mere associations.Finally, to evaluate the proposed solution's robustness to misdiagnosis, a generative AI pipeline will be developed to produce synthetic mammograms and MRIs that simulate real-world diagnostic uncertainties and misinterpretations (similar to our work for chest X-rays in [2]). These adversarial cases will be generated by perturbing the concept vectors - modifying the presence, absence, or expression of clinical features based on clinical input. These altered vectors will be used to produce synthetic reports describing the perturbed findings. These reports, in turn, will condition image generation models to produce corresponding synthetic scans, ensuring coherence between the report and the visual content.Translational Potential and Expected ImpactThe project aims to deliver a trustworthy, interpretable, and robust AI system for breast cancer diagnosis, co-designed with clinicians and validated on real patient data. This has the potential to greatly improve healthcare by enabling faster, more accurate diagnoses, reducing patient wait times, and easing the burden on radiologists. 
Currently, two specialists are required per mammogram; the proposed solution could reduce this to one without compromising safety, thanks to its transparent, explainable outputs and ability to triage and highlight abnormalities. This will allow radiologists to focus on complex cases, reduce diagnostic errors, and generate significant operational and economic benefits.Training and Development Outcomes for the StudentThe PhD will train the student in developing transparent, interpretable, and robust AI for breast cancer diagnosis. They will gain expertise in machine learning, explainable AI, causal inference, and generative modelling for synthetic medical images. Training includes medical image analysis, ethical data governance, and interdisciplinary collaboration with clinicians to ensure clinical relevance. The student will develop strong research, communication, and project management skills through publications, presentations, and teamwork with NHS and academic partners. By completion, they will be equipped to lead research in trustworthy AI for healthcare, bridging technical innovation and clinical translation.References1. Amy Rafferty, Rishi Ramaesh, and Ajitha Rajan. Explainability Through Human-Centric Design for XAI in Lung Cancer Detection. The 34th International Joint Conference on Artificial Intelligence (IJCAI-25), Human-Centred AI track.2. Amy Rafferty, Rishi Ramaesh and Ajitha Rajan. CoRPA: Adversarial Image Generation for Chest X-rays Using Concept Vector Perturbations and Generative Models. In 13th IEEE International Conference on Healthcare Informatics (ICHI 2025) . (15) Quantifying dementia progression and emotion recognition in a virtual reality environment Dementia is a progressive syndrome affecting memory, cognition, and spatial navigation, reducing quality of life for people living with dementia (PlwD) and placing strain on carers and healthcare systems. 
This project aims to improve the quantification of dementia progression, as well as emotion recognition, using data collected in a virtual reality environment that engages PlwD in personalised navigation tasks. Leveraging self-supervised deep learning and explainable artificial intelligence, the student will identify interpretable navigational biomarkers of disease progression and emotional reactions. The findings hold promise for earlier detection, personalised interventions, and scalable, cost-effective support to improve outcomes for PlwD.Supervisory teamArno Onken and Vito De FeoProject PartnerBike Labyrinth Bike Labyrinth will provide biomedical relevance in the form of use cases for virtual environments to assist in improving spatial navigation and memory abilities. The company will also provide access to their development and production facilities for conveying product development needs, and coach the student during monthly meetings.Project backgroundDementia is experienced as an ongoing decline in brain functions, including reasoning, memory, spatial navigation, and keeping track of time. Consequently, People living with Dementia (PlwD) tend to have additional difficulties affecting their cognitive, mental and physical abilities, which not only impacts their own quality of life but also poses challenges for their families and carers. For example, in its early stages, Alzheimer's Disease (AD) causes difficulties in dealing with new information. As AD progresses, memory loss affects sufferers’ ability to plan and carry out day-to-day tasks, and problems with spatial navigation make it more difficult for sufferers to reliably find their way back home from familiar places. Physical activity can help remedy some of this decline due to its benefits for brain health; however, taking part in physical activity can be particularly challenging for PlwD. 
These issues not only reduce the quality of life of PlwD but also lead to unnecessary hospitalisations and delays in hospital discharge, underlining the importance of effective preventive and supportive solutions before hospitalisation.Project aimsThe aim of this project is to improve the quantification of dementia disease progression and emotion recognition using data collected from a virtual reality (VR) environment. To this end, the student will leverage the latest developments in self-supervised deep learning and explainable artificial intelligence to find navigational features that best characterise disease progression and emotional reactions.Data and MethodologyDementia is a multi-dimensional syndrome that originates in the brain, affecting different functionalities, such as memory, cognition, spatial and temporal orientation, and emotional regulation. People living with dementia often have multiple comorbidities affecting their physical functioning as well as their overall quality of life and social health.The external industry partner Bike Labyrinth is developing a VR-enhanced exercise bike, an easy-to-use, engaging and safe training environment. PlwD can train their spatial navigation and memory abilities by moving in a 3D virtual environment simulating their local city. This device does not require permanent involvement of specialist personnel and can be used to evaluate the performance of subjects without the need to create separate physical and cognitive measures. It can be personalised to any specific city and the individual needs of subjects, allowing person-centred training and real-time adaptation. While the bike is still at the prototype stage, the VR environment can already be explored using a simulator.AI systems have been used to assess the state and predict the progression of cognitive decline (Jiang et al., 2020). Using the VR environment allows us to assess navigational and memory performance of subjects. 
There are suitable features that have already been used in early diagnosis of AD, namely average steps and path-efficiency (Jiang et al., 2020). We will build on these insights and enhance them using data-driven modelling. We will use self-supervised deep learning techniques to model subject navigation in the virtual environments and use explainable AI techniques such as LIME and SHAP (Hassija et al., 2024, Vimbi et al., 2024) to find interpretable features that characterise progression of dementia. This will allow us to accurately and automatically quantify changes in the performance of the subject as related to pathogenesis of the disease.Translational Potential and Expected ImpactThere are ~50 million people living with dementia worldwide and this is predicted to rise to 152 million by 2050. In the UK, 700,000 family carers look after the 850,000 people living with dementia, and this is expected to rise to 1.6 million by 2040. Current UK costs of dementia for older people are £34.7 billion a year, including healthcare (£4.9 billion), social care (£15.7 billion) and unpaid care (£13.9 billion). Total UK dementia care costs are projected to increase to £94.1 billion by 2040. Better quantification of disease progression holds promise to improve quality of life of PlwD.ReferencesJiang, J., Zhai, G., & Jiang, Z. (2020, June). Modeling the self-navigation behavior of patients with Alzheimer’s disease in virtual reality. In International Conference on VR/AR and 3D Displays (pp. 121-136). Singapore: Springer Singapore.Hassija, V., Chamola, V., Mahapatra, A. et al. Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence. Cogn Comput 16, 45–74 (2024). https://doi.org/10.1007/s12559-023-10179-8 Vimbi, V., Shaffi, N. & Mahmud, M. Interpreting artificial intelligence models: a systematic review on the application of LIME and SHAP in Alzheimer’s disease detection. Brain Inf. 11, 10 (2024). 
https://doi.org/10.1186/s40708-024-00222-1

(16) AI-based assessment and validation of brain mineral deposition in its different forms detected from routine clinical brain magnetic resonance images

This project will develop an AI-based method and tool for segmenting iron and calcium accumulation throughout the whole brain and in its different forms (tissue deposition, brain microbleeds, superficial siderosis, and haemorrhagic transformations from ischaemic lesions) in a large sample of MRI images acquired from different patient groups, assess the degree of mineral accumulation in the areas segmented offering a proxy for insoluble iron/calcium concentration and degree of aggregation (i.e., clustering) in different subregions (also using AI methods), and validate the AI-based imaging computational assessments using complementary biomedical analysis methods in a sample of individuals with brain MRI, retinal images, and tissue samples.

Supervisory team
Maria Valdés Hernández, Blanca Diaz-Castro and Miguel O. Bernabeu Llinares

Project Partners
Pharmatics Ltd

Project Background
Iron is involved in oxygen transport and is essential for maintaining a healthy body’s function. But an excess of it can lead to oxidative stress damage to biomolecules, as well as cellular dysfunction. This process becomes apparent with increasing age, as iron accumulates in the brain and increases the risk of neurodegenerative diseases. Overall, it is the strongest factor influencing cognitive decline in normal ageing. Although this process mainly occurs gradually and silently, it can be detected using magnetic resonance images (MRI) even in the preclinical stages when minor cognitive concerns are starting to occur and before any other clinical symptom appears. In normal ageing this toxic iron accumulation mainly occurs in the globus pallidus, a subregion at the centre of the brain. In individuals with neurodegenerative diseases it has different spatial distributional patterns.
We previously developed an automatic method to identify and segment the areas in normal ageing MRI scans and validated it with a physical phantom. But we could not establish the degree of mineral accumulation in the segmented areas, which is most important for predictive medicine. Moreover, our method was limited to a small brain region, given the computational power available at the time.

Project Aims
This project will develop an AI-based method and tool for segmenting iron and calcium accumulation throughout the whole brain and in its different forms (tissue deposition, brain microbleeds, superficial siderosis, and haemorrhagic transformations from ischaemic lesions) in a large sample of MRI images acquired from different patient groups, assess the degree of mineral accumulation in the areas segmented offering a proxy for insoluble iron/calcium concentration and degree of aggregation (i.e., clustering) in different subregions (also using AI methods), and validate the AI-based imaging computational assessments using complementary biomedical analysis methods in a sample of individuals with both brain MRI and tissue samples.

Data and Methodology
The student will use well-phenotyped data with carefully generated ground truth from studies conducted at the Centre for Clinical Brain Sciences (1,2) to develop the iron deposition assessment method, which will give as output differential probabilistic masks of various forms of iron deposits throughout the whole brain. The breadth of data available for the project includes routine clinical MRI, vascular function and blood-brain-barrier permeability measurements, and clinical, demographic, and cognitive information from each of the studies’ participants (approximately 1200). Tissue samples, from which iron concentration curves are derived, come from ~20 brains from the study on cognitive ageing, which also has brain MRI acquired in five assessment waves, one every three years (2).
The tissue samples were imaged at 7T MRI and the co-supervisor has aligned both modalities (3). The co-supervisor of the project has experience in proteomics analyses in relation to small vessel disease, to discern which imaging phenotypes involve different forms of iron deposition. Therefore, preliminary data held by the co-supervisor of the project may be useful in further validating the developed method. More tissue-MRI pair samples have also been acquired from tissue banks, reaching a total of 80 samples (3).

Once the AI assessment method is validated using the in-house data from different studies, MRI data from online repositories will be downloaded to test and re-train the AI model for increased robustness and reduced bias. Finally, and given the strong association between these deposits and dementia progression, we will upload the model to the National Safe Haven to apply it to the National Scottish Registry MRI data and estimate dementia prediction accuracy relative to that achieved using currently available methods.

(1) Clancy et al 2021 https://doi.org/10.1177/2396987320929617
(2) Taylor et al 2018 https://doi.org/10.1093/ije/dyy022
(3) Humphreys et al 2019 https://doi.org/10.1177/1747493018799962

Translational Potential and Expected Impact
This project offers a rare opportunity to work on a clinically relevant theme, address a clinical need, and work with a breadth of data of different modalities and nature. It goes beyond the conventional use of computational descriptors for validating the AI-based method, using clinically relevant data to ensure its impact and further applicability in clinical research and practice.
The project might enhance existing MRI instruments and methods by integrating Artificial Intelligence and has substantial scientific and commercial potential.

Training and Development Outcomes for the Student
The student will be trained in MRI mineral deposition identification and in all the current knowledge around it, and exposed to real-world clinical and research neuroimaging work, as well as the laboratory (biological, proteomics) work underpinning the clinical neuroimages. The student will also be trained in medical image processing methods, and exposed to commercial (industry), translational and research environments with an emphasis on fair AI. By the end of the PhD the student is expected to have acquired a high level of knowledge on the theme and developed a prototype that can be commercially viable as an add-on module for clinical and research MRI platforms.

References
(1) Ji Y, Zheng K, Li S, et al. Insight into the potential role of ferroptosis in neurodegenerative diseases. Front Cell Neurosci. 2022 Oct 27;16:1005182. https://doi.org/10.3389/fncel.2022.1005182
(2) Valdés Hernández M, Allerhand M, Glatz A, et al. Do white matter hyperintensities mediate the association between brain iron deposition and cognitive abilities in older people? Eur J Neurol. 2016 Jul;23(7):1202-9. https://doi.org/10.1111/ene.13006
(3) Valdés Hernández, M., Ritchie, S., Glatz, A. et al. Brain iron deposits and lifespan cognitive ability. AGE 37, 100 (2015). https://doi.org/10.1007/s11357-015-9837-2
(4) Valdés Hernández Mdel C, Glatz A, Kiker AJ, et al. Differentiation of calcified regions and iron deposits in the ageing brain on conventional structural MR images. J Magn Reson Imaging. 2014 Aug;40(2):324-33. https://doi.org/10.1002/jmri.24348
(5) Glatz A, Bastin ME, Kiker AJ, Deary IJ, Wardlaw JM, Valdés Hernández MC. Automated segmentation of multifocal basal ganglia T2*-weighted MRI hypointensities. Neuroimage. 2015 Jan 15;105:332-46.
https://doi.org/10.1016/j.neuroimage.2014.10.001
(6) Clancy U, Garcia DJ, Stringer MS, Thrippleton MJ, Valdés-Hernández MC, Wiseman S, Hamilton OK, Chappell FM, Brown R, Blair GW, Hewins W, Sleight E, Ballerini L, Bastin ME, Maniega SM, MacGillivray T, Hetherington K, Hamid C, Arteaga C, Morgan AG, Manning C, Backhouse E, Hamilton I, Job D, Marshall I, Doubal FN, Wardlaw JM. Rationale and design of a longitudinal study of cerebral small vessel diseases, clinical and imaging outcomes in patients presenting with mild ischaemic stroke: Mild Stroke Study 3. Eur Stroke J. 2021 Mar;6(1):81-88. https://doi.org/10.1177/2396987320929617
(7) Taylor AM, Pattie A, Deary IJ. Cohort Profile Update: The Lothian Birth Cohorts of 1921 and 1936. Int J Epidemiol. 2018 Aug 1;47(4):1042-1042r. https://doi.org/10.1093/ije/dyy022
(8) Humphreys CA, Jansen MA, Muñoz Maniega S, González-Castro V, Pernet C, Deary IJ, Al-Shahi Salman R, Wardlaw JM, Smith C. A protocol for precise comparisons of small vessel disease lesions between ex vivo magnetic resonance imaging and histopathology. Int J Stroke. 2019 Apr;14(3):310-320. https://doi.org/10.1177/1747493018799962

Document: Research Overview Slides (2.72 MB / PDF)

(17) Developing Novel Data-Driven Tools and Methodologies to Understand Inequalities in Maternity Vaccination Uptake in Scotland

This PhD project aims to investigate inequalities in maternity vaccine uptake in Scotland using advanced health data science methods. Leveraging national electronic health records and the DataLoch Respiratory Registry, the study will develop computational models to identify patterns in vaccine delivery and access across demographic and clinical factors. Complementary qualitative research will explore maternal attitudes and healthcare delivery models, particularly among underserved groups.
By integrating biomedical, clinical, and behavioural data, the project will create a novel, data-driven framework to inform targeted public health interventions and improve maternal and neonatal health outcomes across diverse populations.

Supervisory team
Ting Shi, Louisa Pollock and Cheryl Gibbons

Project Partners
Public Health Scotland

Project Background
Vaccination during pregnancy is a well-established public health strategy that protects both mother and infant from potentially severe infectious diseases. Vaccines against influenza, pertussis and, most recently, respiratory syncytial virus (RSV) have been shown to significantly reduce disease burden in the infant period. In the UK, these vaccines are recommended for all pregnant individuals as part of routine antenatal care. However, uptake remains inconsistent. In Scotland, uptake varies substantially between NHS Health Boards and across demographic groups. Notably, uptake of the maternal RSV vaccine was approximately 50% during its first year of introduction, but ranged from 38.2% in the most deprived group to 56.1% in the least deprived (a 17.9 percentage point difference), and from 23.1% to 57.6% across NHS Health Boards. This could highlight variation or gaps in vaccine delivery and engagement.

Understanding the factors driving this variation is crucial for improving health outcomes and ensuring equitable access to care. Yet these drivers remain poorly understood, partly due to challenges in how maternity vaccination data are captured, standardised, and made available for analysis. Data are often fragmented across systems, with limited integration between vaccination records and clinical or sociodemographic information. Additionally, differences in service delivery models, access to care, and patient attitudes likely contribute to disparities but have not been comprehensively studied.
This project addresses these gaps through the development of new tools and methodologies to enable more effective analysis and targeted intervention.

Project Aims
This PhD project will develop innovative tools and methodologies to explore and explain variation in maternity vaccination uptake, with a focus on health inequalities and access. Specifically, our objectives are to:
1. Design a new methodological pipeline that integrates and links multiple data sources, including health board-level vaccination records, service delivery data, and patient-level EHRs.
2. Develop visualisation tools to identify gaps in access and inform targeted interventions.
3. Use qualitative methodologies, including interviews and focus groups, to capture maternal perceptions, beliefs, and barriers to vaccination, particularly in underserved groups.

Data and Methodology
This project will adopt a mixed-methods approach, integrating advanced computational modelling with qualitative research to investigate inequalities in maternity vaccine uptake. The primary quantitative component will involve the use of routinely collected, linked electronic health records (EHRs) accessed through Scotland’s robust data infrastructure, including the DataLoch Respiratory Registry and the electronic Data Research and Innovation Service (eDRIS).

These datasets will be linked at the patient level using the Community Health Index (CHI) number to enable population-scale analyses of vaccine uptake in relation to gestational age, appointment timing, healthcare setting (e.g. hospital vs community), delivery staff (maternity vs immunisation teams), and key socio-demographic factors such as age, ethnicity, deprivation, and rurality. Machine learning models and statistical techniques (e.g.
logistic regression, clustering, classification algorithms) will be used to identify patterns, high-risk subgroups, and potential intervention points.

To complement the quantitative analysis, qualitative methods will explore maternal attitudes and barriers to vaccination. This will involve interviews and/or focus groups with pregnant individuals and healthcare providers, with a particular focus on underserved groups. Insights from this component will help interpret data patterns and support the development of equitable, context-sensitive recommendations.

Patient and public involvement (PPI) will be integrated throughout the project. PPI members will contribute to shaping the research questions, ensuring relevance to public health priorities, and co-developing effective dissemination strategies to reach diverse audiences, including the general public, clinicians, and policymakers.

Translational Potential and Expected Impact
This project will deliver a replicable, data-driven framework for understanding and addressing inequalities in maternity vaccine uptake. By integrating computational modelling with qualitative insights and lived experience, the findings will inform tailored public health strategies and support NHS and government efforts to optimise vaccine delivery. The methodologies developed will be scalable across the UK and internationally, enabling targeted interventions for underserved groups. Embedding Patient and Public Involvement (PPI) ensures real-world relevance and impact. Ultimately, the project will contribute to more equitable, efficient, and responsive maternal vaccination programmes and inform national vaccination policy.

Training and Development Outcomes for the Student
Core technical areas of learning will include epidemiology, health data science, computational modelling, machine learning and computational approaches to using large datasets from different sources and modalities, as well as qualitative research and science-policy translation.
The student will develop or extend their expertise in programming languages such as R or Python. We will encourage developing and sharing code for the wider scientific community through platforms such as GitHub. The student will have the opportunity to learn in an academic and public health setting, understanding the applied aspects and context of epidemiology and data analyses. Soft skills in scientific communication and collaboration will be fostered via the interdisciplinary supervisory team, participation in conferences, and publications.

(18) Efficient design of binders using surrogate models

This PhD project addresses the complexity of designing therapeutic protein binders by developing a novel, interpretable abstract representation of proteins informed by Molecular Dynamics (MD) simulations. Current inverse design methods are challenged by the vast sequence-structure space and computational cost. Our primary aim is to create an abstract framework that significantly streamlines the design process, allowing for fast exploration of the protein sequence space to achieve target properties, including specific binding, stability, and desired immunogenicity. This data-driven abstraction, grounded in molecular dynamics, will capture essential features for accurate and efficient design. The approach will be validated using the therapeutically relevant PD-1 system with AstraZeneca. The aim is to overcome limitations in current design methodologies, paving the way for innovations in targeted drug delivery and biosensing.

Supervisory team
Kartic Subr and Chris Wood

Project Partners
AstraZeneca

Project Background
The design of proteins using inverse methods plays a pivotal role in developing therapeutic binders: proteins engineered to attach to specific drug targets with high affinity.
This PhD project aims to enhance binder design by integrating molecular dynamics (MD) simulations with simplified, interpretable representations of proteins to design highly specific binders. Protein folding, dictated by amino acid sequences, presents significant challenges due to its complexity, especially when designing proteins for specific interactions.

To facilitate the design of large protein complexes with targeted binding capabilities, this research focuses on creating and validating simplified, interpretable representations of proteins informed by molecular dynamics simulations. These models will streamline the design process, making it feasible to design sequences that achieve desired binding properties at a fraction of the computational cost. The project will utilize computational approaches to optimize these sequences, ensuring that the resulting proteins exhibit the necessary stability, specificity, and immunogenicity profiles. Ultimately, this work intends to produce novel protein binders with significant applications in therapeutics, diagnostics, and synthetic biology. By advancing the methodologies for protein design, this project seeks to overcome limitations in creating efficient binders, paving the way for innovations in targeted drug delivery and biosensing technologies.

Project Aims
The primary aim for this project is to develop an abstract representation for proteins that will enable the design of specific therapeutic binders. The representation will be interpretable and explainable while also enabling fast exploration of the design space, based on target properties such as shape, dynamics, and functionality. By integrating molecular dynamics with data-driven abstraction, the framework will capture essential features required for accurate and efficient binder design.
The approach will be tested using the PD-1 system, a well-known immune pathway with established therapeutic relevance in cancer immunotherapy.

Data and Methodology
The proposed methodology will focus on developing an abstract representation for protein binder design, using an anti-PD1 as a model system targeting the PD-1 receptor involved in immune checkpoint pathways.

The research will collect a dataset of known PD-1 structures and existing binders, including peptide sequences, 3D structures, and binding affinities. The dataset will serve as the basis for training machine learning models. The approach involves developing simplified protein representations through techniques like autoencoders, which will distil essential structural features critical for effective binding. Molecular dynamics simulations will validate these models, testing their capacity to predict how novel sequences fold and interact with the PD-1 receptor. The simulations will assess the dynamics and potential energy landscapes of candidate peptides designed using this abstract representation.

The project will develop search algorithms based on genetic algorithms to explore the design space, identifying peptide sequences optimized for binding PD-1. This computational search will focus on sequences predicted to exhibit high specificity, stability and immunogenicity profiles.

For validation, the project will perform in silico experiments to screen these candidates, employing binding free energy calculations and docking simulations to estimate their affinity for the PD-1 receptor. Successful candidates will move to experimental validation, where peptides will be synthesized and tested at AstraZeneca, Cambridge (external partner) using Surface Plasmon Resonance and isothermal titration calorimetry, measuring real-world binding affinities and kinetics.

The feedback loop, incorporating these experimental findings, will continuously refine the model and search algorithms.
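For readers unfamiliar with genetic-algorithm search over peptide sequences, the idea can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not the project's method: the function names, the parameter values, and above all `toy_fitness`, which simply counts matches to a made-up target string. In the project itself the score would come from the learned representation, simulations, or docking.

```python
import random

# The 20 standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def toy_fitness(seq, target):
    """Hypothetical stand-in score: fraction of positions matching `target`."""
    return sum(a == b for a, b in zip(seq, target)) / len(target)


def mutate(seq, rate, rng):
    """Replace each residue with a random amino acid with probability `rate`."""
    return "".join(
        rng.choice(AMINO_ACIDS) if rng.random() < rate else aa for aa in seq
    )


def crossover(a, b, rng):
    """Single-point crossover of two equal-length sequences (length >= 2)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(fitness, length, pop_size=50, generations=100,
           mutation_rate=0.05, elite=5, seed=0):
    """Return the best sequence found by a simple elitist genetic algorithm."""
    rng = random.Random(seed)
    population = [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Rank candidates; the top half become parents.
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        # Elitism: carry the best candidates over unchanged.
        children = population[:elite]
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), mutation_rate, rng))
        population = children
    return max(population, key=fitness)


if __name__ == "__main__":
    target = "ACDEFGHIKL"  # made-up motif, not a real binder
    best = evolve(lambda s: toy_fitness(s, target), length=len(target))
    print(best, toy_fitness(best, target))
```

Because the top candidates are copied unchanged into each new generation, the best score never decreases for a fixed seed, which makes the loop easy to sanity-check before the toy score is swapped for a more expensive one.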
By integrating computational insights with empirical data, this iterative process aims to enhance the precision and effectiveness of designing PD-1 binders, contributing to advancements in cancer immunotherapy.

Translational Potential and Expected Impact
The proposed methodology offers a general framework for designing therapeutic protein binders using abstract representations. By representing proteins through simple fragment-based abstraction, the method enables broader exploration of conformational space while retaining interpretability. The added interpretability allows for a greater understanding of the mechanisms underlying binding and enables steering the design towards desired pharmacological profiles. Using PD-1 as a motivating example, the framework will demonstrate how simplified, explainable models can accelerate the discovery of selective immune checkpoint binders. Ultimately, this approach aims to shorten the path from computational design to effective therapeutics, improving outcomes for patients.

Training and Development Outcomes for the Student
The student will gain expertise in computational modelling, specifically in protein design and molecular dynamics simulations, enhancing their technical skills. They will develop proficiency in machine learning techniques for data analysis and representation design, directly applicable to modern drug discovery.

Additionally, the student will learn to conduct interdisciplinary research, bridging computational methods with experimental validation. This includes collaborating with laboratories for in vitro testing, providing practical insights into experimental protocols.

The student will also cultivate strong problem-solving abilities and the capacity to translate scientific findings into real-world applications, particularly in drug design and therapy development. Effective communication skills will be enhanced through reporting and presenting research findings to diverse audiences.
Furthermore, the student will gain a thorough understanding of ethical research practices, preparing them for responsible scientific leadership in their future career.

References
Castorina LV, Wood CW, Subr K. From Atoms to Fragments: A Coarse Representation for Functional and Efficient Protein Design. bioRxiv; 2025. DOI: 10.1101/2025.03.19.644162.
Tudor-Stefan Cotet, Igor Krawczuk, Filippo Stocco, Noelia Ferruz, Anthony Gitter, Yoichi Kurumida, Lucas de Almeida Machado, Francesco Paesani, Cianna N. Calia, Chance A. Challacombe, Nikhil Haas, Ahmad Qamar, Bruno E. Correia, Martin Pacesa, Lennart Nickel, Kartic Subr, Leonardo V. Castorina, Maxwell J. Campbell, Constance Ferragu, Patrick Kidger, Logan Hallee, Christopher W. Wood, Michael J. Stam, Tadas Kluonis, Süleyman Mert Ünal, Elian Belot, Alexander Naka, Adaptyv Competition Organizers. Crowdsourced Protein Design: Lessons From the Adaptyv EGFR Binder Competition. bioRxiv 2025.04.17.648362; doi: https://doi.org/10.1101/2025.04.17.648362
North B, Lehmann A, Dunbrack RL Jr. A new clustering of antibody CDR loop conformations. J Mol Biol. 2011 Feb 18;406(2):228-56. doi: 10.1016/j.jmb.2010.10.030. Epub 2010 Oct 28. PMID: 21035459; PMCID: PMC3065967.

(19) Embedded AI for neurodegenerative disease monitoring

Accurate tracking of symptoms and progression of multiple sclerosis is essential for drug discovery and disease management. However, current measurement tools are tedious, prone to bias, and do not reflect what people with MS experience. Many symptoms remain invisible and unrecognised. Of these, fatigue is the most debilitating and reported by most patients. Yet no methodology to measure and tackle fatigue exists.

Supervisory team
Paul Patras and Thanasis Tsanas

Project Partners
Hoffmann-La Roche

Project Background
Multiple Sclerosis (MS) is a neurodegenerative autoimmune disease that affects approximately 3 million people worldwide.
The disease primarily affects the central nervous system, with the immune system attacking the myelin sheath around nerve cells. The symptoms of MS vary broadly and can have debilitating effects. These include loss of vision, numbness, mobility problems, cognitive decline, etc., which increase as the disease progresses. Recent studies report that the annual cost of MS-related disability exceeds per capita gross domestic product (GDP), which confirms the major societal cost of this condition.

Project Aims
1) Develop new fine-grained data-driven fatigue monitoring methods that build upon detailed telemetry that will be gathered using wearable devices and personal living space sensors (motion, pressure, LiDAR, etc.).
2) Data labelling using correlation analysis with image biomarkers, fluid biomarkers, and patient-reported outcomes.
3) Baselining with volunteer patients using medical-grade wearable devices.
4) Develop deep learning models that can analyse multi-modal spatio-temporal data to detect early disease-specific symptoms, health improvements, or decline.
5) Develop lightweight compact/approximate data structures and deep learning models that can be deployed on computationally constrained devices.

Translational Potential and Expected Impact
Ultimately the outcomes of the project seek to improve fatigue management and potentially the efficacy assessment of new drugs. Long-term, the methods developed may assist consultant neurologists in exploring personalised treatment and improve long-term patient outcomes.

(20) Using Data-Driven Methods and AI to Uncover the Causes and Consequences of Youth Depression Trajectories through Longitudinal Data

Depression is a complex, multifaceted disorder with profound developmental, social, and biological impacts. This project applies data-driven and AI methods to longitudinal youth data to uncover key factors shaping short- and long-term depression pathways.
It tackles three challenges: (1) the heterogeneity of depression, including distinct subtypes such as persistent and adolescent-onset forms; (2) identifying the most important predictors among vast biological, psychological, and social data; and (3) capturing short-term changes using ecological momentary assessment and wearable technology. Through AI and machine learning, the project aims to characterise heterogeneity, identify predictive factors, and inform personalised, real-time interventions for youth depression.

Supervisory team
Alex Kwong, Ahmar Shah and Heather Whalley

Project Partners
eMoodie
eMoodie is a digital mental health research platform which enables clinical and academic research groups to conduct ecological momentary assessment studies coupled with passive sensing of behavioural markers. The student will work with data scientists and digital engineers at eMoodie to 1) develop methods and analytical techniques in signal processing and deep learning, 2) derive digital biomarkers for prediction and machine learning and 3) develop algorithms that could be implemented in later digital interventions.

Project Background
Depression is a complex, multi-dimensional disorder with profound social, psychological, and biological impacts across the lifespan. Despite major research efforts, its prevalence continues to rise, reflecting key challenges: defining depression consistently, identifying the most relevant risk and protective factors, and distinguishing between long- and short-term trajectories. This project applies artificial intelligence (AI) and data-driven methods to longitudinal youth data to uncover how genetic, environmental, and psychological factors shape depression pathways over both years and days.

The research addresses three key challenges. First, depression is heterogeneous, comprising distinct subtypes, such as persistent and adolescent-onset forms, with differing causes and outcomes.
Longitudinal data can capture these life-course trajectories, highlighting how genetic vulnerability, adversity, and socioeconomic disadvantage contribute to persistence. Second, vast neurobiological, psychological, and social datasets remain underutilised; AI methods can identify which factors best predict specific depression subtypes, supporting personalised prevention and treatment. Third, depression fluctuates over short timescales, which traditional studies rarely capture. Integrating ecological momentary assessment (EMA) via smartphones and wearable technologies allows real-time tracking of symptoms, sleep, and behaviour.

By combining AI with intensive longitudinal and EMA data, this project aims to advance understanding of youth depression, enhance prediction, and support development of personalised, time-sensitive interventions.

Project Aims
1. Predict different long-term depression trajectories from a wealth of neuro-bio-psycho-social life course data using AI and ML, followed by replication and testing across multiple datasets. See figure 1.
2. Predict which factors influence short-term depression trajectories and specific depression symptoms from data collected on smartphones and wearables (sleep, exercise, screen time and more) using AI and ML.
3. Use information from aims 1 and 2 to either (a) test new data/models or (b) develop a digital intervention to be implemented in a follow-up of participants (i.e., a digital intervention targeting sleep or screen time that can be implemented in the EMA design).

Data and Methodology
This PhD will use three existing datasets that have rich neuro-bio-psycho-social data spanning childhood to early adulthood. They include: the Avon Longitudinal Study of Parents and Children (ALSPAC; n=14K), the Adolescent Brain Cognitive Development Study (ABCD; n=12K) and the Twins Early Development Study (TEDS; n=11K).
These are secondary data studies with data available to analyse now.

In ALSPAC and TEDS, we have collected daily mental health and behavioural data (via smartphone/wearables) in a sub-sample of 800 individuals (with all the previous life-course data), which are now available for analysis. In 2026, we will complete a follow-up in a smaller sub-sample of these individuals using smartphones/wearables, with the option of either testing new data/models or implementing a data-driven intervention, both derived from the earlier results and collaboration with the study partner eMoodie.

The PhD will use innovative data-driven methods including machine learning, deep learning and AI prediction analyses to identify the most robust markers and best combinations of markers for both long- and short-term depression trajectories. These could then be adapted or tested in a digital intervention using the EMA/wearables design in a follow-up study. The student will also work with patients with lived experience to ensure results are transferable and have the greatest possible impact.

The PhD will be highly interdisciplinary, covering themes from epidemiology, statistics, social science, bioinformatics, genetics, public health and computational psychology. The student will build upon ongoing work in this area.

Translational Potential and Expected Impact
This project has strong translational potential to transform how depression is understood, monitored, and treated. By combining AI, ecological momentary assessment, and wearable technologies, it will identify digital biomarkers and predictive algorithms for personalised, real-time interventions. Collaboration with eMoodie Labs enables direct application in digital health, developing scalable, data-driven tools. Working closely with individuals with lived experience ensures the research remains ethical, person-centred, and clinically relevant.
Insights from large longitudinal datasets will inform prevention, screening, and early intervention, bridging academia, technology, and healthcare to advance youth mental health globally.

Training and Development Outcomes for the Student
This project offers extensive interdisciplinary training across institutions, including genomics, trajectory modelling, epidemiology, MRI processing, AI, and data science, with opportunities for lab visits and collaboration. Students will use R and Python with a focus on reproducibility, open science, and co-production with people with lived experience. Working with unique large-scale datasets, the student will gain transferable skills in data science, digital interventions, and digital engineering. Collaboration with an industry partner will provide hands-on experience developing and testing digital mental health tools and applying advanced data-driven models to complex longitudinal data, preparing the student for impactful interdisciplinary research careers.

References
1. Malhi, G.S. and J.J. Mann, Depression. Lancet, 2018. 392: p. 2299-2312.
2. Thapar A, et al., Depression in young people. Lancet, 2022. 400(10532): p. 617-631.
3. Nguyen, T.D., et al., Genetic heterogeneity and subtypes of major depression. Mol Psychiatry, 2022. 27(3): p. 1667-1675.
4. Grimes, P.Z., et al., Genetic Architectures of Adolescent Depression Trajectories in 2 Longitudinal Population Cohorts. JAMA Psychiatry, 2024.
5. Kwong, A.S.F., et al., Genetic and Environmental Risk Factors Associated with Trajectories of Depressive Symptoms from Adolescence to Young Adulthood. JAMA Netw Open, 2019. 2(6).
6. Zhao, Y., et al., The brain structure, immunometabolic and genetic mechanisms underlying the association between lifestyle and depression. Nature Mental Health, 2023. 1(10): p. 736-750.
7. Xiang, Q., et al., Prediction of the trajectories of depressive symptoms among children in the adolescent brain cognitive development (ABCD) study using machine learning approach.
J Affect Disord, 2022. 310: p. 162-171.
8. Jiang, T., J.L. Gradus, and A.J. Rosellini, Supervised Machine Learning: A Brief Primer. Behav Ther, 2020. 51(5): p. 675-687.
9. Speyer, L.G., et al., The role of moment-to-moment dynamics of perceived stress and negative affect in co-occurring ADHD and internalising symptoms. J Autism Dev Disord, 2023. 53(3): p. 1213-1223.

(21) Development of data analysis pipeline and investigation of neurocognitive mechanisms underlying gaming and gamification in order to improve mental health and well-being
Online gaming has become increasingly prevalent, yet research into the effects and impact of long-term gaming on mental health is limited and often lacks an interdisciplinary focus. This project, in collaboration with HealthyGaming, aims to develop a data analysis pipeline that focuses on individual traits and states, game-related decision making, and mental health outcomes. Employing techniques from data science (including machine learning) and neurocognitive sciences (encompassing questionnaires, computational modelling, biometrics, and neuroimaging) we aim to understand what determines the mental health outcomes in gaming and gamification settings. This could lead to proposed interventions with the objective of improving those outcomes.

Supervisory team
Gedi Luksys and Robin Hill

Project Partners
HealthyGaming

Project Background
Due to their increasing popularity, online platforms that act as information gateways across domains such as news, social media and gaming have been gaining prominence in research, helping to better understand decision making patterns, unravel their neurocognitive mechanisms, and determine impact on mental health.
Gaming-related decision making takes place at many levels: from a decision to initiate playing a game (and the causes and triggers of that decision) to further decisions to continuously invest time, effort, and sometimes money into the play, to decisions within the games such as team interaction and building (as many games are team-based) and many game-specific decisions that have an impact on competitive outcomes. Similar dynamics occur on news and social media platforms that employ gamification in a substantial way.

In order to understand how such decision making links to mental health, computational psychiatry focuses on building models of decision making, fitting them to the observed behaviours and linking parameters and variables of such models to biometric markers (e.g. emotional expressions), neuroimaging markers (e.g. brain activity patterns) and responses from standard personality and mental health questionnaires. Such an approach, if effective, can predict neuropsychiatric conditions in a more cost-efficient way than standard clinical assessments.

Project Aims
We aim to understand how individual traits and states can drive gaming-related decisions, what the impact of competitive feedback is in driving continued involvement, and how all these actions can lead to beneficial or detrimental mental health outcomes in the medium and the long term for individuals. In addition, we aim to understand neurocognitive mechanisms underlying such decisions and what is referred to as “suspension of disbelief”.
Finally, we want to find strategies to improve mental health outcomes, which could include advising individuals on better paths for them as well as communities on more sustainable recruitment and engagement strategies.

Data and Methodology
Building on the supervisors’ experience with news-related decision making and human information processing (through the development of the MyNewsScan news aggregator platform, mynewsscan.eu), computational modelling and cognitive science, and in conjunction with the industrial partner’s (HealthyGaming) experience with health-related gaming, this project will investigate the impact of gaming and gamification on mental health. In particular, it will focus on incentivisation and decision making (both in-game and in digital adjacent gaming environments).

Our research will involve case studies of gaming that may include both amateur and professional gamers, in single and multiplayer games, as well as gamification in non-gaming platforms, such as MyNewsScan. In cooperation with HealthyGaming, the student will analyse core game and gamification aspects and dynamics, as well as the structure of selected gaming communities. We will also have gaming and gamification metadata (covering the usage and access of the games) as well as, in some cases, in-game data which we could analyse using data science approaches, including machine learning.

In coordination with mental health professionals and our gaming partners (including but not limited to various esports events around the world), we will use standard personality and mental health questionnaires combined with gaming-related questionnaires. We will explore the roles of modulators such as stress, sleep, and motivation on decision making.
We will also develop computational psychiatry models (such as reinforcement learning, motivation and drift diffusion models) that could provide insights into key parameters underlying game and gamification-related decision making, and will aim to validate them using mental health datasets. Finally, we will recruit a sample of gamers in Edinburgh whose decision making could be studied in more depth in the lab using neurocognitive techniques such as collection of biometrics (e.g. eye tracking, heart rate, pupil dilation, skin conductance and emotional expressions) and neuroimaging (particularly EEG) data.

Translational Potential and Expected Impact
Overall, our research effort will help identify games that can best be used for psychotherapeutic purposes, thereby improving the well-being of millions of gamers around the world. Through our computational psychiatry and cognitive neuroscience efforts, we also aim to develop effective methodologies for using gaming and gamification-related data to predict mental health patterns and outcomes, which could then lead to a set of proposed interventions with the objective of improving those outcomes. Eventually our work may help improve the management of gaming addiction as well as gambling disorders, device and social media addictions, and AI companionship dependency.

Training and Development Outcomes for the Student
We expect that a successful PhD student will develop a data analysis pipeline, making use of existing data and data collection opportunities and collaborations, synthesising different types of data (such as questionnaires, game and gamification metadata, in-game data) and employing effective data analysis tools (machine learning or computational modelling) to gain insights. The student will also be involved in neurocognitive experiments that will aim to link our behavioural data analysis pipeline to neural substrates and mental health patterns.
Due to the methodologically diverse and highly collaborative nature of the project, it will provide numerous technical development, entrepreneurial training and networking opportunities.

References
Huckvale et al., “Toward clinical digital phenotyping: a timely opportunity to consider purpose, quality, and safety”, npj Digital Medicine 2019 https://www.nature.com/articles/s41746-019-0166-1
Aeberhard et al., “Introducing COSMOS: a Web Platform for Multimodal Game-Based Psychological Assessment Geared Towards Open Science Practice”, Journal of Technology in Behavioural Science 2019 https://link.springer.com/article/10.1007/s41347-018-0071-5
Paquin et al., “Trajectories of Adolescent Media Use and Their Associations With Psychotic Experiences”, JAMA Psychiatry 2024 https://jamanetwork.com/journals/jamapsychiatry/fullarticle/2817594
Vosoughi et al., “The spread of true and false news online”, Science 2018 https://science.sciencemag.org/content/359/6380/1146.full
Kramer et al., “Experimental evidence of massive-scale emotional contagion through social networks”, PNAS 2014 https://www.pnas.org/content/111/24/8788
Strasser et al., “Glutamine-to-glutamate ratio in the nucleus accumbens predicts effort-based motivated performance in humans”, Neuropsychopharmacology 2020 https://www.nature.com/articles/s41386-020-0760-6
Luksys et al., “Stress, genotype and norepinephrine in the prediction of mouse behavior using reinforcement learning”, Nature Neuroscience 2009 https://www.nature.com/articles/nn.2374

(22) Development and validation of homecage behavioural biomarkers to improve preclinical translation in laboratory rodents
Preclinical drug research and development relies heavily on animal models, particularly for central nervous system (CNS) indications. However, classic behavioural assays often fail to translate to the clinic. Home Cage Analysis systems have been developed to provide improved welfare (refinement) by replacing stressful isolated tests with continuous measurements.
Supported by recent evidence, the key hypothesis behind this research is that longitudinal measures such as those collected in home cage monitoring align more closely with clinical outcomes. This project will explore, develop and evaluate computer vision and machine learning methods to extract robust behavioural biomarkers from large-scale longitudinal rodent home-cage datasets.

Supervisory team
Douglas Armstrong and Michael Camilleri

Project Partners
Actual Analytics

Project Background
Drug research and development relies heavily on the use of laboratory animal models at key stages in the pipeline to test for efficacy and safety. Both of these attempt to translate the physiology, including behaviour, of the animal model to the clinical condition in humans. Successful translation across species is critically dependent on identifying relevant biomarkers. In some cases molecular biomarkers can be identified and these have led to significant advances in the replacement of animals with innovative in vitro cellular and/or molecular assays. However, for CNS indications (diseases) and CNS safety indications, translation still relies heavily on behaviour.

Traditional behavioural observational assays produce snapshot data in carefully controlled stimulus-response scenarios [1]. Such data do not take into account the entire richness of the animals’ behaviour [2], are often influenced by the presence of researchers and do not translate well to behaviour “in the wild”. For this reason, there has been a recent shift towards long-term analysis of animals in their home-cage [3]. For example, in a recent safety pharmacology study, the continuous behavioural measures extracted from home cage data correlated much more closely with adverse clinical outcome than the traditional tests [4].

Project Aims
- Explore, develop and evaluate methods to extract behavioural biomarkers from longitudinal rodent homecage data. These could be single behaviours, e.g.
grooming, seizure or, more likely, complex interactions of multiple behaviours.
- Correlation of behavioural biomarkers with clinical outcomes using a mix of historical end points and collaboration with ongoing research with collaborators.
- Test the hypothesis that continuous home cage derived behavioural biomarkers have improved translational accuracy in pharmaceutical R&D.

Data and Methodology
We do not perform laboratory studies ourselves; rather, we collaborate with end-users. We have a wide range of data in-hand. This includes datasets collected at academic research institutions through to groups in the pharmaceutical industry. We have permission to use these data for research and development, we have the active engagement of these end users and we have the agreement in principle to publish along with these users any new findings. Essentially we have an excess of data in place, freedom to operate and favourable agreements to co-publish. Additional datasets are also now in the public domain with appropriate public licenses.

In addition, we have well-established collaborations with a number of pharmaceutical research companies (e.g. [4]) as well as academic users (e.g. [5]) where we can get access to new datasets and validate progress ‘in the wild’. For all of these collaborations we have general agreements in principle for student/research access and co-publishing of research findings. For very specific data access we may need to update agreements, but this poses no significant risk to the project.

During the project we will explore methods at the intersection of computer vision and machine learning to assess which are best suited to extracting behavioural biomarkers. This is a rapidly moving field, but we will build on existing algorithms/methods that we have developed for identifying [6] and classifying high-level behaviours of group housed mice [7].

Translational Potential and Expected Impact
The project and its underpinning hypothesis are fundamentally translational.
If successful, we will identify new preclinical biomarkers that correlate better with clinical outcomes than the current traditional measures [8]. While there is room to explore, a possible example would be to define and validate a CNS safety liability biomarker from continuous home cage data that is more accurate than the classical functional observational battery.

Training and Development Outcomes for the Student
- The role and application of preclinical research in laboratory animal models.
- The application of new methodologies to promote 3Rs, in particular Refinement.
- Computer vision methods for animal identification and tracking.
- Applied AI/ML methodologies for behaviour analytics.
- Development of new methodologies and approaches for the definition of digital biomarkers for translational drug research and development.

References
[1] P. Van Meer and J. Raber, “Mouse behavioural analysis in systems biology.” The Biochemical Journal, vol. 389, no. Pt 3, pp. 593–610, 2005.
[2] A. Gomez-Marin and A. A. Ghazanfar, “The Life of Behavior,” Neuron, vol. 104, no. 1, pp. 25–36, 2019.
[3] S. D. M. Brown and M. W. Moore, “The International Mouse Phenotyping Consortium: past and future perspectives on mouse phenotyping.” Mammalian genome : official journal of the International Mammalian Genome Society, vol. 23, no. 9-10, pp. 632–640, 2012.
[4] Sillito et al. Rodent home cage monitoring for preclinical safety pharmacology assessment: results of a multi-company validation evaluating nonclinical and clinical data from three compounds. Frontiers in Toxicology, in press.
[5] Bains et al. Analysis of Individual Mouse Activity in Group Housed Animals of Different Inbred Strains Using a Novel Automated Home Cage Analysis System. Front. Behav. Neurosci., 10 June 2016 https://doi.org/10.3389/fnbeh.2016.00106
[6] Camilleri, M.P.J., Zhang, L., Bains, R.S. et al. Persistent animal identification leveraging non-visual markers. Machine Vision and Applications 34, 68 (2023).
[7] Camilleri, M.P.J., Bains, R.S.
& Williams, C.K.I. Of Mice and Mates: Automated Classification and Modelling of Mouse Behaviour in Groups Using a Single Model Across Cages. Int J Comput Vis 132, 5491–5513 (2024).
[8] Baran et al. Emerging Role of Translational Digital Biomarkers Within Home Cage Monitoring Technologies in Preclinical Drug Discovery and Development. Front. Behav. Neurosci., 14 February 2022. https://doi.org/10.3389/fnbeh.2021.758274

(23) Generative Models for Off-Target Aware Novel Drug Design
Off-target action is a major challenge in novel drug design; molecules that bind with the target protein are also likely to bind with other similar proteins and introduce unintended side effects. In this project, we will develop generative models that incorporate the requirement of not binding with off-target proteins. This model will be combined with a pipeline of interpretable analysis to help medicinal chemists understand structural features that reduce off-target binding. The project has been co-developed with Oxford Drug Design, who will be the industrial partner and advisor on the project.

Supervisory team
Rik Sarkar and Chris Wood

Project Partner
Oxford Drug Design

Project Background
Off-target actions, where a drug binds with proteins in addition to its intended target, are a major challenge across multiple domains of drug design as they can cause toxicity and side effects. For example, the inhibition of the hERG ion channels by small molecule drugs is closely associated with lethal cardiac arrhythmia [1]. Another important and well-known area of off-target action is the family of kinase proteins that have a key role in regulating cellular processes, and are often therapeutic targets to treat diseases from cancer to autoimmune disorders. The similarity of kinase targets makes them particularly susceptible to unintended inhibition, e.g.
cardiotoxicity mediated by off-target inhibition of AMP-activated protein kinase by sunitinib [2] or cardiovascular toxicity associated with ponatinib due to off-target kinase inhibition [3].

Increasingly, generative models are used to design candidate drug molecules in large numbers, but rather than incorporating off-target information in the generative sampling method itself, extensive post-hoc filtering is used to screen the candidates. This process does not ensure that a suitable outcome will be produced. It is also difficult to apply where detailed structures of the off-target proteins are not known.

Project Aims
The core challenge can be called a negative design problem: the drug must fit the target, while reducing the likelihood of fit with off-targets. The objective in this project is to incorporate this negative design objective into the generative model itself, to maximize the probability that the generated molecule in fact avoids off-targets. While many models assume static shapes, an accurate design process must take into consideration the flexibility and changing conformations of molecules to minimize the likelihood of off-target actions.

Training and Development Outcomes for the Student
The student in this project will have the opportunity to learn about a range of areas, from drug design to generative AI and topological data analysis. They will have close supervision at Edinburgh, as well as from Dr Asaad and Prof. Cooper at Oxford Drug Design, where they will be able to undertake an internship and gain closer exposure to industrial research. The research will start with aspects that are easier to implement given the student’s aptitude, and increase in complexity with the student’s experience. The techniques and approaches can be customised to the student’s interest.

References
[1] Michael C. Sanguinetti & Martin Tristani-Firouzi. hERG potassium channels and cardiac arrhythmia.
Nature volume 440, pages 463–469 (2006)
[2] Kerkela, R., Woulfe, K.C., Durand, J.-B., Vagnozzi, R., Kramer, D., Chu, T.F., Beahm, C., Chen, M.H. and Force, T. (2009), Sunitinib-Induced Cardiotoxicity Is Mediated by Off-Target Inhibition of AMP-Activated Protein Kinase. Clinical and Translational Science, 2: 15-25. https://doi.org/10.1111/j.1752-8062.2008.00090.x
[3] Jonathon R. Green, Prathap Kumar S. Mahalingaiah, Sujatha M. Gopalakrishnan, Michael J. Liguori, Scott W. Mittelstadt, Eric A.G. Blomme, Terry R. Van Vleet, Off-target pharmacological activity at various kinases: Potential functional and pathological side effects, Journal of Pharmacological and Toxicological Methods, 123, 2023, 107468, https://doi.org/10.1016/j.vascn.2023.107468.
[4] Zaixi Zhang et al., FlexSBDD: Structure-Based Drug Design with Flexible Protein Modeling, NeurIPS 2024.
[5] Ballester, P.J. and Richards, W.G. (2007), Ultrafast shape recognition to search compound databases for similar molecular shapes. J. Comput. Chem., 28: 1711-1723.
[6] Armstrong MS, Morris GM, Finn PW, Sharma R, Moretti L, Cooper RI, Richards WG. ElectroShape: fast molecular similarity calculations incorporating shape, chirality and electrostatics. J Comput Aided Mol Des. 2010 Sep;24(9):789-801
[7] Ali, Dashti & Asaad, Aras & Jiménez, María & Nanda, Vidit & Paluzo-Hidalgo, Eduardo & Soriano-Trigueros, M. (2023). A Survey of Vectorization Methods in Topological Data

(24) SENTINEL: Multi-modal dynamic prediction for proactive, personalised IBD care
SENTINEL will integrate electronic health records, patient-reported outcomes, and wearable signals to predict inflammatory bowel disease outcomes, updating in response to new measurements being observed. The student will introduce multi-omics in collaboration with our industrial partner, Nightingale Health, to further improve predictions.
Embedded within Edinburgh’s IBD service and the Lees and Vallejos data-science research groups, the work delivers an interpretable pipeline ready to power SENTINEL’s proactive, EHR-adjacent decision support. The models will be trained on population-based local NHS data (Lothian IBD Registry fully integrated with DataLoch; n=10,000 IBD patients) and Danish national registries.

Supervisory team
Charlie Lees and Catalina Vallejos

Project Partners
Nightingale Health

Role of the external partner
- Provide finger-prick sampling kits and high-throughput NMR metabolomics/protein panels for use during SENTINEL onboarding.
- Advise on assay QC, feature engineering and integration into multi-modal models.
- Host a short placement for the student focused on data standards and translational pipelines, and industry experience.
- Co-supervise on multi-omic integration.

Project Background
IBD is associated with unpredictable flares, unscheduled admissions, major surgeries and delays in therapy optimisation. In Lothian, >10,000 IBD patients are registered and two decades of longitudinal biomarker data (such as CRP and faecal calprotectin) are available, alongside rich phenotyping data, yet these data are siloed and rarely available at the point of clinical care. We have a world-leading IBD service, yet all too often IBD care is reactive and crisis-driven.

SENTINEL is our clinician-led response: an EHR-adjacent service that ingests live laboratory feeds to deliver risk predictions and prompts for nurse-led triage and treatment optimisation.
A patient companion app will show both historical and recent trends in disease behaviour, whilst predicting future disease course, collecting patient-reported outcomes, and providing hyper-personalised information and signposting.

The analytics are being built by three full-time post-doctoral data scientists within the Lees IBD research team and the Vallejos biomedical data-science group, using a landmarking framework for dynamic risk prediction, with latent-class mixed models to capture heterogeneous biomarker trajectories.

Our prior work shows that longitudinal faecal calprotectin/CRP profiles characterise disease-course heterogeneity and predict disease progression, outlining modern principles for dynamic IBD monitoring.

This PhD addresses a translational gap: integrating additional modalities (such as patient-reported outcomes, wearable streams and finger-prick multi-omics from Nightingale Health) to enhance prediction of flares, admission and surgery, and to extend the modelling to non-IBD outcomes with health-system relevance.

Embedded in Edinburgh’s IBD service and the Lees and Vallejos groups, the student will deliver an interpretable pipeline to assist proactive clinical decision support.

Project Aims
Develop an interpretable, EHR-adjacent pipeline that integrates longitudinal laboratory test results (such as CRP and faecal calprotectin), prescribing, patient-reported outcomes, wearables and finger-prick multi-omics to generate dynamic risk predictions for flare, admission and surgery.
Quantify incremental value, calibration, fairness and utility of each modality and algorithmic choice, and externally validate across Lothian and Danish datasets.
Extend the pipeline to non-conventional IBD outcomes (e.g., cardiovascular events, venous thromboembolism, mental-health crises), and produce a decision-curve playbook mapping risk thresholds to nurse-led actions in SENTINEL for equitable, safe deployment.

Data and Methodology
Datasets.
The project will use (i) the Lothian IBD Registry (LIBDR) with two decades of routine laboratory data (CRP and faecal calprotectin, among others), prescribing and outcomes, linked via DataLoch; (ii) patient-reported outcomes and wearables collected through the SENTINEL app (pilot phase begins January 2026); and (iii) external validation cohorts from Danish national registries curated within PREDICT.

Finger-prick multi-omics (Nightingale Health NMR metabolomics/proteins) will be layered as samples are processed under existing collaborations.

Core modelling. We will implement dynamic prediction via landmarking coupled to latent-class mixed models (LCMMs) to capture individual and subgroup trajectories in longitudinal biomarker measurements, then augment with additional clinical features and patient-reported outcomes. A generic landmarking framework has already been built by our team as ready-to-use software that accommodates flexible modelling strategies, including the option to incorporate modern deep-learning based approaches. However, it lacks functionality for multimodal data integration.

End points. Primary: time-updated risk of flare, unplanned admission and IBD surgery within one year from the prediction time-point; Secondary: steroid exposure, quality-of-life decline and non-IBD outcomes (cardiovascular events, venous thromboembolism, mental-health crises).

Evaluation. Internal-external cross-validation using metrics of discrimination (time-dependent AUC/PR-AUC), calibration, and net benefit (decision curves). We will quantify the incremental value of each modality (e.g. change in time-dependent AUC or Brier score) and differential performance by age, sex, deprivation and ethnicity within an ML fairness framework.

Deployment-readiness.
Models will be packaged as services with audit logging, uncertainty quantification, data-shift monitoring and fallback rules for missingness, so outputs can flow to the EHR-adjacent SENTINEL portal for nurse-led triage and clinical review. Generic code (e.g. to incorporate the fairness evaluation) will be added to our landmarking software as an open-source tool.

Translational Potential and Expected Impact
Outputs will be models and code that plug into SENTINEL’s EHR-adjacent portal to drive risk-prioritised lists and nurse-led triage. In Lothian, the service aims to increase 12-month flare-free status by ≥15 percentage points, cut IBD bed-days by ≥20%, and halve time to treatment optimisation, with projected net savings of ~£800 to £2000 per patient-year.

The pipeline relies on routine labs that are widely available in healthcare settings. Because patient inputs are optional, it can also benefit those who rarely engage with the health system directly, while improving when patient-reported outcomes (e.g. IBD symptoms; depression and anxiety scores) and wearables are available.

External validation and the Nightingale Health partnership provide a path from PhD outputs to adoption.

Training and Development Outcomes for the Student
The student will gain skills in: statistical and ML methods for longitudinal and time-to-event data analysis (landmarking, LCMMs, joint models); multi-omic integration; predictive modelling and evaluation, and ML fairness; and post-deployment monitoring and model updating. Clinical immersion will occur within the Edinburgh IBD service and weekly meetings with the Lees and Vallejos groups (co-supervision/mentorship).

They will complete CDT training, present at IBD/data-science meetings, and pursue papers and software releases. A placement with Nightingale Health will provide exposure to high-throughput NMR workflows and interfaces between omics and care pathways, aligned to SENTINEL.

References
Plevris N, Lees CW.
Disease Monitoring in IBD: Evolving Principles and Possibilities. Gastroenterology 2022. (Framework for integrative monitoring/targets.)
Constantine-Cooke N et al. Longitudinal Faecal Calprotectin Profiles Characterize Disease Course Heterogeneity in Crohn’s Disease. Clin Gastroenterol Hepatol 2023. (Dynamic trajectories underpin predictions.)
Constantine-Cooke N et al. Large-scale clustering of longitudinal FCP and CRP profiles in IBD. medRxiv 2025. (Joint FCP/CRP modelling.)
Ebert AC et al. IBD and risk of >1,500 comorbidities. Am J Gastroenterol 2025. (Motivates non-IBD outcomes.)
Hracs L et al. Global evolution of IBD across epidemiologic stages. Nature 2025. (Health-system context.)
Elford AT et al. Twenty-Year Trends in Colectomy and Advanced Therapy Prescribing in Lothian. AP&T 2025. (Local real-world trends.)

(25) Human behaviour models as digital biomarkers for cognitive function
Recent advances in AI-based modelling of human behaviour enable a novel, flexible and potentially more quantitatively precise method for measuring human cognitive function, detecting cognitive decline, and improving assistive technologies. This PhD project combines mechanistic theories of human cognitive function with deep reinforcement learning to develop models capturing how cognitive decline manifests in observable human behaviour, enabling new types of digital biomarker. The project provides the student with cross-disciplinary experience across both cognitive neuroscience and machine learning, as well as direct opportunities for translation and impact.

Supervisory team
Subramanian Ramamoorthy, Susan Shenkin and Gustav Markkula

Project Partner
NHS Borders

Project Background
Research has shown that cognitive markers for conditions such as mild cognitive decline and Alzheimer’s disease can be obtained from computerised tests and “serious games” or directly from e.g., in-home activity or smartphone data [1–3], but so far the design of these evaluation methods has been largely heuristic.
If we had human behaviour models which accurately represent how cognitive decline affects observable behaviour in these various tasks, this could unlock a dramatic improvement in test specificity. In addition, such models would be beneficial in assistive technologies, such as adaptive interaction and dialogue-based prompting systems, to infer cognitive function directly from an unfolding interaction, and to guide the actions taken by a combination of the person and the assistive system.

Emerging results in cognitive science and machine learning have recently enabled an approach to modelling of human behaviour with high fidelity across a variety of task contexts, by combining mechanistic modelling of human perceptual, cognitive, and motor limitations with deep reinforcement learning [4–6]. This approach, with its emphasis on cognitive modelling of human limitations, and its overall flexibility, holds promise for modelling and measuring how human behaviour is affected by cognitive decline, and could thus be used to improve tools for detecting such decline and better supporting individuals experiencing it.

Project Aims
The overall aim of this PhD project will be to investigate the use of advanced human behaviour models as a potential marker for useful cognitive function (e.g. executive function and working memory). More specifically, human behaviour models will be developed and tested for a few selected tasks, to investigate to what extent the models can capture empirically observed effects of variations in cognitive function on the human behaviour in these tasks. These models will then be used for estimating cognitive function directly from observed human behaviour in these tasks.

Translational Potential and Expected Impact
The developed models and methods have direct applicability for measuring and monitoring cognitive function.
If the results are positive, steps toward translation and impact can be taken.

As mentioned, a second potential application is integration of the models in assistive technologies, where the models can be used both to infer user cognitive function and to optimise adaptive interfaces for prompting and assistance. This is of timely interest to practitioners involved in dementia and other age-related conditions.

Training and Development Outcomes for the Student
This project will allow the student to build a rare cross-disciplinary skillset, across the state of the art in both cognitive neuroscience and machine learning.

The student will also benefit from experiencing research that is both scientifically cutting-edge and directly linked to a specific patient group, and from developing methods that can be of near-term value for these end users.

References
1. Ding, Z., Lee, T. & Chan, A. S. Digital Cognitive Biomarker for Mild Cognitive Impairments and Dementia: A Systematic Review. Journal of Clinical Medicine 11, 4191 (2022).
2. Chen, Y., Gerling, K., Verbert, K. & Vanden Abeele, V. Video Games and Gamification for Assessing Mild Cognitive Impairment: Scoping Review. JMIR Ment Health 12, e71304 (2025).
3. Park, J.-H. Discriminant Power of Smartphone-Derived Keystroke Dynamics for Mild Cognitive Impairment Compared to a Neuropsychological Screening Test: Cross-Sectional Study. J Med Internet Res 26, e59247 (2024).
4. Gershman, S. J., Horvitz, E. J. & Tenenbaum, J. B. Computational rationality: A converging paradigm for intelligence in brains, minds, and machines. Science 349, 273–278 (2015).
5. Oulasvirta, A., Jokinen, J. P. P. & Howes, A. Computational Rationality as a Theory of Interaction. in Proceedings of the 2022 CHI Conference on Human Factors in Computing Systems 1–14 (Association for Computing Machinery, New York, NY, USA, 2022). doi:10.1145/3491102.3517739.
6. Wang, Y., Srinivasan, A. R., Lee, Y. M. & Markkula, G.
Modeling Pedestrian Crossing Behavior: A Reinforcement Learning Approach With Sensory Motor Constraints. IEEE Transactions on Intelligent Transportation Systems 1–12 (2025) doi:10.1109/TITS.2025.3581693.
7. O’Reilly, R. C. & Frank, M. J. Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia. Neural Computation 18, 283–328 (2006).
8. Hazy, T. E., Frank, M. J. & O’Reilly, R. C. Computational Neuroscientific Models of Working Memory. (2021).
9. Yoo, A. H. & Collins, A. G. E. How Working Memory and Reinforcement Learning Are Intertwined: A Cognitive, Neural, and Computational Perspective. Journal of Cognitive Neuroscience 34, 551–568 (2022).
10. Ahmed, S., Lytton, W. & Crystal, H. Computational Models of Age-associated Cognitive Slowing and Memory Loss (P6-9.010). Neurology 102, 4069 (2024).

AI4BI-SIDB Projects

Our CDT has partnered with the Simons Initiative for the Developing Brain (SIDB) to offer additional PhD studentships focused on understanding the neurological basis of, and testing new therapies for, monogenic forms of autism and intellectual disability, such as Fragile X Syndrome (FXS), SYNGAP1 haploinsufficiency and CDKL5 deficiency disorder (CDD). Check out available projects below.
Find out more about SIDB

(26) Automated data collection and analysis for deep behavioural phenotyping of rat models of neurodevelopmental disorder in a complex housing environment

Supervisory team
Raven Hickson, Peter Kind, Marino Pagan, Angus Chadwick

Project Background
The richness and flexibility of the rat behavioural repertoire make rats well suited as models of the cognitive and social aspects of neurodevelopmental disorders (NDDs). Standard laboratory housing drastically reduces opportunities for rats to express natural behaviours, thereby vastly diminishing the behavioural repertoire available to study [1].
The Habitat was designed to provide an environment that more closely aligns with the ecology of the Norway rat to address this mismatch and provide opportunities to observe the development of adaptive behaviours in their functional environment. Housing in the Habitat results in observable effects on the transcriptome, and living in the Habitat has different behavioural effects in two different models as compared with living in standard housing. Habitat housing also appears to alter behaviour at a micromovement scale, as captured by RatSeq (unpublished data; method described in [2]). However, little is known about what aspects of the Habitat experience may contribute to these effects on behaviour. The Habitat allows the capture of multiple modes of data (RFID tracking, video, audio, etc.) for characterising the animal’s behavioural repertoire with the ultimate goal of predicting genotype. The overarching goal is to generate testable hypotheses about circuit-level differences between models of NDD and wild-types.

Project Aims
Develop a data analysis pipeline that allows for integration of multimodal (RFID tracking, video, audio, etc.) individual and group-level data collected from the Habitat.
Apply the latest AI and machine learning technologies to the problem of behaviour analysis at individual, dyadic and group levels, on both long and short time-scales.
Build behavioural models that allow individuals to be clustered based on behavioural patterns in the Habitat, and test whether these ‘behavioural profiles’ explain variability (inter- and/or intra-) in empirical behavioural tasks.

Training Outcomes
Be able to critically examine and synthesize literature from multiple fields (e.g.
ecology, neuroscience, psychology, information theory, machine learning) to develop novel approaches to experimental design, data collection, and analysis.
Be able to apply knowledge of machine learning and/or AI technologies to behavioural data collection, integration, and analysis in multiple modalities.
Be able to communicate complex data effectively to colleagues and third-party stakeholders across a range of disciplines to facilitate collaboration.

References
1. Shemesh, Y., & Chen, A. (2023). A paradigm shift in translational psychiatry through rodent neuroethology. Molecular Psychiatry, 1–11. https://doi.org/10.1038/s41380-022-01913-z
2. Wiltschko, A. B., Tsukahara, T., Zeine, A., Anyoha, R., Gillis, W. F., Markowitz, J. E., Peterson, R. E., Katon, J., Johnson, M. J., & Datta, S. R. (2020). Revealing the structure of pharmacobehavioral space through motion sequencing. Nature Neuroscience, 23(11), 1433–1443. https://doi.org/10.1038/s41593-020-00706-3

(27) Discovering biomarkers for sensory sensitivities in Fragile X Syndrome and SYNGAP1

Supervisory team
Andrew Stanfield, Leena Williams, Peggy Series

Project Background
Sensory perceptual disturbances are a widespread feature of neurodevelopmental disorders. They are linked to stress and maladaptive coping mechanisms, and are a prevalent but understudied feature of leading monogenically inherited forms of intellectual disability and autism, such as Fragile X Syndrome (FXS) and SYNGAP1. Current strategies for their identification and quantification are primarily based on informant reporting, and are therefore subjective and vulnerable to misclassification, and there are no available treatments. What, then, are the key objective biomarkers for sensory sensitivities, and what are the neuronal circuit impairments underpinning these features in sensory cortices?
Presently, we are conducting a human study testing the use of electroencephalography (EEG) and a commercially available Brain Gauge tactile stimulator device to objectively quantify tactile impairments in FXS. We also have EEG data from auditory and visual stimuli in both FXS and SYNGAP1 individuals. This AI4BI PhD project aims to accelerate transformation of our human data into a novel biomarker of sensory sensitivities by (Aim 1) using machine learning trained on clinical, behavioural and EEG data to develop a classifier; (Aim 2) refining the classifier to determine which features are most predictive and how acquisition time could be shortened for future clinical trials; and (Aim 3) in parallel, using the framework of Bayesian inference and computational psychiatry to build mathematical and computer models that identify key diagnostic markers of tactile sensitivities in FXS and novel objective biomarkers for sensory sensitivities. The novel biomarkers uncovered could then be employed in planned therapeutic research, and the findings may be applicable to similar and associated conditions, increasing impact.

Project Aims
Aim 1: Pre-process EEG and employ the brain activity along with clinical / behavioural data (tasks and questionnaires) to develop an explainable machine learning classifier.
Aim 2: Refine the machine learning classifier and determine the inputs needed to shorten acquisition.
Aim 3: Using the framework of Bayesian inference and computational psychiatry, build mathematical and computer models to identify key diagnostic markers of sensory sensitivities in FXS & SYNGAP1.

Training Outcomes
Strong experience in machine learning applied to experimental sensory-evoked EEG and behavioral data.
Strong experience using a Bayesian inference framework for building mathematical and computer models to address the outlined aims.
Experience designing and analysing data from psychophysical, imaging, and electrophysiological experiments in humans and mice.
Expertise in neuroscience, sensory processing, neuronal circuits in sensory cortices, and perceptual learning.

References
N. A. J. Puts, E. L. Wodka, M. Tommerdahl, S. H. Mostofsky, and R. A. E. Edden, ‘Impaired tactile processing in children with autism spectrum disorder’, Journal of Neurophysiology, vol. 111, no. 9, pp. 1803–1811, May 2014, doi: 10.1152/jn.00890.2013.
Jeste SS, Nelson CA 3rd. Event related potentials in the understanding of autism spectrum disorders: an analytical review. J Autism Dev Disord. 2009 Mar;39(3):495-510. doi: 10.1007/s10803-008-0652-9. Epub 2008 Oct 11. PMID: 18850262; PMCID: PMC4422389.
Espenhahn S, Godfrey KJ, Kaur S, Ross M, Nath N, Dmitrieva O, McMorris C, Cortese F, Wright C, Murias K, Dewey D, Protzner AB, McCrimmon A, Bray S, Harris AD. Tactile cortical responses and association with tactile reactivity in young children on the autism spectrum. Mol Autism. 2021 Apr 1;12(1):26. doi: 10.1186/s13229-021-00435-9. PMID: 33794998; PMCID: PMC8017878.
P. Karvelis, A. Seitz, S. Lawrie and P. Series (2018). Autistic traits, but not schizotypy, predict overweighting of sensory information in Bayesian visual integration, eLife, 7:e34115.
P Seriès (Ed.). (2020). Computational psychiatry: A primer. MIT Press.

(28) Machine Learning approaches for identification of connectopathies and vulnerable cell groups in neurodevelopmental disorders using barcoded anatomy

Supervisory team
Gulsen Surmeli, Sara Wade, Steven McDonagh

Project Partners
Dr Xiaoyin Chen, The Allen Institute

Project Background
At the core of the behavioural abnormalities that manifest in ASDs are deficits in brain-wide connectivity. A lack of coordinated activity has been demonstrated using low resolution functional and structural imaging technologies.
The specific manifestations of connectivity deficits at the level of individual neurons and brain areas remain largely unknown, hindering mechanistic understanding of a wide range of neurodevelopmental disorders. A major obstacle to progress is the challenge of investigating neural connectivity and vulnerable neuronal populations in a sufficiently high-throughput manner. While the technologies for collecting high-throughput data are now available, application to autism research requires optimization of analytical tools. This project will address this challenge by tailoring AI and machine learning based approaches to develop analysis tools for barcoded anatomy. By applying these to datasets acquired from mouse models of Fragile X syndrome, we aim to identify changes in the transcriptomic and projectomic characteristics of neurons.

Project Aims
Develop machine learning approaches for data extraction (cell segmentation, base calling) from complex microscopy images for barcoded anatomy data (BARseq) with improved accuracy.
Develop a probabilistic multiview clustering framework that builds on our recently developed analysis pipeline, HBMAP, to link transcriptomically defined cell types with projection profiles.
Develop analytical strategies for the integration of projection motif and cell type classification. Develop modelling-based statistical approaches that use machine learning strategies for probabilistically identifying projection motifs and enable group comparisons between wildtype and mutant animal cohorts.

Training Outcomes
The student will gain expertise in:
Developing, training, and optimizing AI approaches for image processing tasks critical in spatial genomics and anatomy, including cell segmentation and base calling from complex microscopy data (e.g., BARseq).
Statistical Modelling and Machine Learning (ML) Implementation: Designing and applying modelling-based statistical approaches and ML strategies for probabilistically identifying projection motifs and performing group comparisons (wildtype vs. mutant cohorts).
Framework Development: Creating and extending complex analytical pipelines, specifically developing a probabilistic multiview clustering framework that links multi-modal biological data (transcriptomics and projectomics) and integrates with existing tools like HBMAP.
High-Throughput Data Management and Processing: Handling, cleaning, and processing high-throughput datasets acquired from advanced technologies, ensuring data quality and readiness for complex analytical pipelines.

The student will master skills specific to the project's biological domain:
Barcoded Anatomy (BARseq) Data Analysis: Deep understanding and practical experience in analyzing data generated by barcoded anatomy technologies, including the interpretation of transcriptomic and projectomic characteristics of individual neurons.
Integration of Multi-Modal Data: Proficiency in analytical strategies for the integration of projection motif and cell type classification, effectively combining spatial, genetic (transcriptomic), and connectivity (projectomic) information to define neuronal populations.
Neurodevelopmental Disorder Context: A comprehensive understanding of the deficits in brain-wide connectivity associated with Autism Spectrum Disorders (ASDs) and the specific application of research to Fragile X disorder mouse models.
Biological Interpretation: The ability to translate complex statistical and ML findings (e.g., changes in projection motifs or cell type characteristics) back into
mechanistic understanding relevant to neurodevelopmental disorders.

References
Wade, S., Agboraw, E., Liu, J., Zhang, H. & Surmeli, G. HBMAP: Bayesian inference of neural circuits from DNA barcoded projection mapping. bioRxiv (2025)
Chen X, Sun YC, Zhan H, Kebschull JM, Fischer S, Matho K, Huang ZJ, Gillis J, Zador AM. High-Throughput Mapping of Long-Range Neuronal Projection Using In Situ Sequencing. Cell. 2019 Oct 17;179(3):772-786.e19. doi: 10.1016/j.cell.2019.09.023. PMID: 31626774; PMCID: PMC7836778.

(29) Investigating network alterations underpinning altered multisensory integration in neurodevelopmental disorders

Supervisory team
Emma Wood, Matthias Hennig, Adrian Duszkiewicz

Project Background
Autism spectrum disorder and intellectual disability (ASD/ID) are comorbid conditions characterized by abnormalities in early cognitive development that persist into adulthood. In many cases, their symptoms are linked to de novo or inherited mutations in genes involved in neuronal function (Manoli and State, 2021). It is currently not well understood how such genetic changes are mechanistically related to the circuit dynamics, computations, and behavior that together constitute ASD phenotypes. In this project the head-direction (HD) system, a well conserved network in mammals that computes an animal's heading direction, will be analysed to address this question.

In mammals, information about heading direction is maintained by a network of neurons across multiple brain regions known as the head-direction system (Laurens and Angelaki, 2018). Head-direction neurons combine two main sources of information: signals from the inner ear and body that track self-motion, and external cues such as visual landmarks that provide orienting reference points. Recent research indicates that the brain integrates these cues in a near-optimal way by giving more weight to the more reliable source at any moment.
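As a rough illustration of what reliability-weighted cue integration means for a circular variable such as heading, the sketch below fuses a visual and a self-motion estimate. It is a minimal example under stated assumptions, not the group's model: each cue is assumed to be von Mises distributed around the true heading with a concentration kappa (higher means more reliable), the prior over heading is assumed flat, and the function and parameter names are hypothetical. Under those assumptions the combined estimate is the direction of the concentration-weighted vector sum of the two cues.

```python
import cmath
import math


def combine_heading_cues(mu_visual, kappa_visual, mu_self, kappa_self):
    """Fuse two heading estimates (radians) by a reliability-weighted vector sum.

    Illustrative assumptions: each cue is von Mises distributed around the
    true heading with concentration kappa, and the prior on heading is flat.
    Returns (combined heading in [-pi, pi], combined concentration).
    """
    # Represent each cue as a vector whose length is its reliability.
    resultant = (kappa_visual * cmath.exp(1j * mu_visual)
                 + kappa_self * cmath.exp(1j * mu_self))
    return cmath.phase(resultant), abs(resultant)


# Equally reliable cues: the estimate falls midway between them.
balanced, _ = combine_heading_cues(0.0, 2.0, math.pi / 2, 2.0)

# Visual cue far more reliable: the estimate is pulled towards the landmark.
visual_dominated, _ = combine_heading_cues(0.0, 20.0, math.pi / 2, 2.0)
```

In this simplified picture, an internal heading estimate that is dominated by landmarks corresponds to an effective `kappa_visual` that is large relative to `kappa_self`.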
Recent work from our group shows that in a rat model of Fragile X syndrome, a neurodevelopmental disorder characterised by intellectual disability with a high prevalence of autism in humans, this balance is disrupted. Their head-direction system relies too heavily on external visual landmarks and too little on self-motion. This causes the internal representation of direction to become overly dominated by the visual landmarks, losing the normal partial adjustment that reflects balanced cue integration. This PhD project will use computational modelling in combination with data from real neural recordings to understand why this imbalance arises. By combining concepts from neuroscience and artificial intelligence, the student will explore how changes in network connectivity or plasticity could explain the altered information weighting seen in the mutant animals.

Project Aims
Analyse existing neural recordings from the head direction (HD) system in Fmr1 KO and control animals to determine neural and circuit-level differences.
Build a computational model of the rat head-direction network, based on a ring attractor circuit, that can combine visual and self-motion cues, to replicate the experiments and explore hypotheses to explain the differences observed in Fmr1 animals.
Use a data-driven approach to rule in and rule out hypotheses, and to generate testable predictions for new experiments.
Use the modelling approach to analyse differences in the development of the HD system in Fmr1 animals.

Training Outcomes
This position will offer comprehensive training at the intersection of basic and clinical/translational neuroscience, including exposure to related research across SIDB. Interdisciplinary working is at the core of the centre, and the student will learn to communicate research and findings to different audiences including experimental neuroscientists, clinicians, computational scientists, and non-specialist audiences such as patients and their carers.
The student will gain skills in data-driven computational modelling in neuroscience, and in data analysis in neuroscience, including processing of neural/behavioral data. The project will promote reproducible and open science and will offer ample opportunities for training in this area.

References
Hulse, B. K., & Jayaraman, V. (2020). Mechanisms underlying the neural computation of head direction. Annual Review of Neuroscience, 43(1), 31-54.
Laurens, J., & Angelaki, D. E. (2018). The brain compass: a perspective on how self-motion updates the head direction cell attractor. Neuron, 97(2), 275-289.
Manoli, D. S., & State, M. W. (2021). Autism spectrum disorder genetics and the search for pathological mechanisms. American Journal of Psychiatry, 178(1), 30-38.
Redish, A. D., Elga, A. N., & Touretzky, D. S. (1996). A coupled attractor model of the rodent head direction system. Network: Computation in Neural Systems, 7(4), 671.

(30) Neuronal encoding of contextual information in the visual cortex of mouse models of autistic spectrum disorders

Supervisory team
Nathalie Rochefort, Arno Onken

Project Background
Children on the autism spectrum (ASD) differ from typically developing children in many aspects of their processing of sensory stimuli. One proposed mechanism for these differences is an imbalance in higher-order feedback to primary sensory regions, leading to an increased focus on local object features rather than global context (1, 2).

The aim of this project is to use mouse models to reveal the neuronal encoding of contextual information in the visual cortex. We will determine how visual feedback processing may be disrupted in mouse models of ASD.

This project will leverage artificial intelligence tools to uncover how natural stimuli statistics are encoded and how local neuronal populations form both reliable and context-dependent representations of natural scenes in the primary visual cortex.
To investigate these mechanisms, we will use a combination of large-scale, high-throughput recordings of neuronal activity and artificial neural networks. By using electrophysiological recordings with high-density silicon probes (4), we will record neuronal responses to natural scenes in all layers of the primary visual cortex (V1) in awake, head-fixed adult mice. We will use a recently developed modelling framework to generate optimized surround images and movies in order to systematically investigate the rules that determine contextual excitation versus inhibition in a naturalistic setting (5, 6). This closed-loop paradigm was developed through a collaboration between the two co-supervisors of this project: the team of Dr Arno Onken, School of Informatics, and Dr Nathalie Rochefort, CDBS. The approach is based on a new type of deep learning data-driven model that can accurately predict V1 responses to new (unseen) stimuli (5, 6). The experimental design integrates large-scale neuronal recordings, a model capable of accurately predicting responses to diverse natural stimuli, the in silico optimization of non-parametric images and movies, and in vivo validation of the predictions.

Project Aims
Determine the spatio-temporal features of contextual modulation of visual responses in primary visual cortex.
Causally test the impact of feedback inputs from a higher visual area (LM) on the contextual modulation of visual responses in primary visual cortex.

We will systematically compare the results obtained in Syngap heterozygous mice and wild-type littermate controls. Depending on the results, we will be able to use our artificial neural network model to generate a database of visual stimuli, aimed at specifically testing contextual perception in individuals affected by neurodevelopmental disorders.
This will be done in collaboration with the team of Dr Andrew Stanfield (NHS, Division of Psychiatry, University of Edinburgh).

Training Outcomes
In vivo recordings in awake behaving mice: training in Neuropixels recordings; in vivo surgery and viral injections in mouse brain.
Computational methods: model-based analysis of the data, computational modeling of neural circuits; programming skills in Python.
Data management: managing and analyzing large datasets.
Research ethics and animal research regulations.
Presentation of data, in writing and orally.

References
1. Emily J. Knight, Edward G. Freedman, Evan J. Myers, Alaina S. Berruti, Leona A. Oakes, Cody Zhewei Cao, Sophie Molholm, John J. Foxe, Severely Attenuated Visual Feedback Processing in Children on the Autism Spectrum, Journal of Neuroscience, 2023, 43 (13) 2424-2438
2. Smith D, Ropar D, Allen HA (2015) Visual integration in autism. Front Hum Neurosci 9:387.
3. Walker EY, Sinz FH, Cobos E, et al. Inception loops discover what excites neurons most using deep predictive models. Nat Neurosci. 2019;22(12):2060-2065.
4. Bimbard C, Takács F, Catarino JA, et al. An adaptable, reusable, and light implant for chronic Neuropixels probes. 2024, eLife. https://doi.org/10.7554/eLife.98522.1.
5. Li B, Cornacchia I, Rochefort N, Onken A. V1T: large-scale mouse V1 response prediction using a Vision Transformer. Transactions on Machine Learning Research. 2023. https://openreview.net/pdf?id=qHZs2p4ZD4
6. Bryan M. Li, Wolf De Wulf, Danai Katsanevaki, Arno Onken, Nathalie L. Rochefort, Movie-trained transformer reveals novel response properties to dynamic stimuli in mouse visual cortex, bioRxiv 2025.09.16.676524; doi: https://doi.org/10.1101/2025.09.16.676524

This article was published on 2025-11-05