Cloud & High Performance Computing – CompBioMed Conference

Invited Speakers

Dr Alexandre Bonvin, Utrecht University
Dr Andrew Grant, Atos
Dr Wolfgang Gentzsch, UberCloud

Symposium Chair

Marco Verdicchio, IT Consultant, SURFsara

Symposium Description

High fidelity biomedical simulation addresses systems of immense complexity, in three dimensional space as well as time. This frequently calls for access to very powerful computers, including the most powerful ones available globally today. For applications in clinical medicine, modelling and simulation based on personal data must be conducted not only with high fidelity, and in a secure manner to address information governance regulations, but also rapidly since the outcomes should be actionable, in the sense of providing real time decision support for clinicians.

Topic for consideration include, but are not limited to, high performance computing (HPC) requirements for biomedical applications, such as software development, scalability of codes and other associated performance metrics, as well as support for access mechanisms which are unconventional by traditional HPC centre standards (such as advance reservation and urgent computing), the use of visualisation and computational steering along with secure data staging and storage. Modern commercial cloud environments can be brought increasingly seamlessly into juxtaposition with HPC resources, and the two technologies further aligned through new technologies such as containerisation. Cybersecurity and quality of service are key metrics for assessing the reliability and performance of these technologies.

Thursday 26th September, 15:30 – 17:00; Watson Watt Room
and
Friday 27th September, 10:15 – 12:00; Watson Watt Room

Time	Speaker	Title
15:30	Wolfgang Gentzsch (Invited Speaker)	Advancing Personalized Healthcare with High-Performance Cloud Computing for the Living Heart Project Abstract preview Major factors contributing to the acceleration of personalized healthcare in recent years come from advances in high-performance computing (HPC), data analytics, machine learning, and artificial intelligence, enabling scientists now to perform the most sophisticated simulations, in genomics, proteomics, and many other fields, using methods like genome analysis, molecular dynamics, and more general computer aided analysis methods widely applied and proven in other areas of scientific and engineering modelling. To just select one, we demonstrate the impact of computer simulations on personalized health care and present a research project aiming at living heart simulations, which has recently been rewarded with several prestigious international awards. Full Abstract
15:50	Stefan Zasada	Large Scale Binding Affinity Calculations on Commodity Compute Clouds Abstract Preview In recent years it has become possible to calculate the binding affinities of compounds bound to proteins via rapid, accurate, precise and reproducible free energy calculations. The ability to do so has applications from drug discovery to personalised medicine. This approach is based on molecular dynamics simulations and draws on sequence and structural information of the protein and compound concerned. Free energies are determined by ensemble averages of many molecular dynamics replicas, each of which requires hundreds of CPU or GPU cores, typically requiring a supercomputer class resources to be calculated on a tractable timescale. In order to perform such calculations there are also requirements for initial model building and subsequent data analysis stages. Full Abstract
16:05	Piotr Nowakowski	Processing Complex Medical Workflows in the EurValve Environment Abstract Preview In this paper we present the outcome of three years of development work in the EurValve project [1] which resulted in the creation of an integrated solution for medical simulations referred to as the Model Execution Environment (MEE). Starting with a definition of the problem (which involves simulating valvular heart conditions and the outcomes of treatment procedures) we provide a description of the high-performance computational environment utilized to process retrospective use cases in order to create a knowledge base which underpins the EurValve Decision Support System (DSS). We also provide specific examples of MEE usage and the corresponding statistics. Full Abstract
16:20	Terry Sloan	The HemeLB Offloader Abstract Preview is paper outlines an approach for enabling access to HPC applications as Software as a Service (SaaS) on conventional high-end HPC hosts such as EPCC’s Cirrus and cloud providers such as Microsoft’s Azure service. The focus for the approach is enabling access to the HemeLB application with the Polnet biomedical workflow. The paper reports on an implementation of this approach that allows Polnet workflows to run on the Cirrus and LISA supercomputing services at EPCC and SURFsara respectively. Full Abstract
16:35	Alexandre Bonvin (Invited Speaker)	Structural biology in the clouds: Past, present and future Abstract Preview Structural biology deals with the characterization of the structural (atomic coordinates) and dynamic (fluctuation of atomic coordinates over time) properties of biological macromolecules and adducts thereof. Gaining insight into 3D structures of biomolecules is highly relevant with numerous application in health and food sciences. Since 2010, the WeNMR project (www.wenmr.eu) has implemented numerous web-based services to facilitate the use of advanced computational tools by researchers in the field, using the grid computational infrastructure provided by EGI. These services have been further developed in subsequent initiatives under the H2020 EGI-ENGAGE West-Life project and the BioExcel Center of Excellence for Biomolecular Computational Research (www.bioexcel.eu). The WeNMR services are currently operating under the European Open Science Cloud with the H2020 EOSC-Hub project (www.eosc-portal.eu), with the HADDOCK portal (haddock.science.uu.nl) sending >10 millions jobs and using ~2700 CPU years per year. In my talk, I will summarize 10 years of successful use of e-infrastructure solutions to serve a large worldwide community of users (>13’500 to date), providing them with user-friendly, web-based solutions that allow to run complex workflows in structural biology. Full Abstract
16:55	END OF SESSION AND DAY 2

Time	Speaker	Title
10:15	Tomasz Piontek	Supporting advanced HPC/HTC scientific workloads with QCG services Abstract preview All researchers agree that efficiency, flexibility and ease of execution of scientific computation workloads were always key requirements of in-silico experiments. Nowadays, when the computational facilities reach exascale and the complexity of applications increases, the situation hasn’t changed. There are still the same questions and unresolved practical problems how to perform computations easily and effectively. Full Abstract
10:30	Christos Kotsalos	Digital Blood in Massively Parallel CPU/GPU Systems for the Study of Platelets deposition Abstract Preview We propose a novel high-performance computational framework for the simulation of fully resolved whole blood flow. The framework models blood constituents like red blood cells (RBCs) and platelets individually, including their detailed non-linear elastic properties and the complex interactions among them. This kind of simulations are particularly challenging because the large number of blood cells (up to billions) stand in contrast with the high computational requirement of individual constituents. While classical approaches address this challenge through simplified structural modelling of the deformable bodies (e.g., through mass-spring systems), the present framework guarantees accurate physics, desirable numerical properties through a fully featured FEM model and computational efficiency at the same order as the more simplified state-of-the-art models. The required numerical performance is achieved through a hybrid implementation, using CPUs for the blood plasma and GPUs for the blood cells. Full Abstract
10:45	Craig Lucas	The POP Centre of Excellence – Improving Parallel Codes Abstract Preview The Performance Optimisation and Productivity (POP) Centre of Excellence [1] is funded through Horizon 2020, like CompBioMed, and is made up of eight partners across Europe [2]. Our remit is to improve the performance of both academic and commercial parallel codes. Working with developers and users we promote a methodology for understanding a code’s performance which helps us go on to improve it. Full Abstract
11:00	Narges Zarrabi	Secure Processing of Sensitive Data on shared HPC systems Abstract Preview In this work we present a novel method for creating secure computing environments on traditional multi-tenant high-performance computing clusters. Typically, current HPC clusters operate in a shared and batch mode, which can be incompatible with the security requirements set by data providers for processing their highly sensitive data. We propose a solution using hardware and network virtualization, which runs on an existing HPC cluster, and at the same time, meets strict security requirements. We show how this setup was used in two real-world cases. The solution can be used generally for processing sensitive data. Full Abstract
11:15	Shantenu Jha	Zettascale Computing on an Exascale Platform Abstract Preview We outline the vision of “Learning Everywhere” which captures the possibility and impact of learning methods coupled to traditional HPC methods. A primary driver of such coupling is the promise that learning will give major effective performance improvements for traditional HPC simulations. We will discuss how learning methods and HPC simulations are being integrated, and provide representative examples. We will discuss specific applications and software systems developed for ML driven MD simulations on Summit at Oak Ridge. Full Abstract
11:30	Phil Tooley	Parallelising Image Registration and the HPC Porting Journey Abstract Preview Image registration is widely used in many areas of computational biomedicine, both as a research tool and a component in work flows providing clinical decision support. There exists a wide range of both open-source and commercial tools for performing image registration based on a variety of dierent methods.[1] However, these tools are designed to be run on a single machine, with the associated limitations of computational performance and available memory of the system, placing a limit on the maximum size of images which can be handled. A key application for image registration at the University of Sheeld is strain measurement of bone samples using digital volume correlation (DVC).[2] This makes use of tomographic imaging from synchrotron light sources, which can be many tens to hundreds of gigabytes in size \| too large to be handled at full resolution by these existing codes. The solution is therefore to create a parallelised image registration code, capable of leveraging HPC infrastructure to register images of such sizes using the memory and computational capacity of multiple HPC nodes. Full Abstract
11:45	Andy Grant (Invited Speaker)	Integrating HPC and Deep Learning in converged workflows Abstract Preview In the last few years the use of AI and specifically deep learning techniques has emerged as a key pillar in scientific discovery. While many of the underlying techniques have been around for some time the computational power and data volumes required to make them effective have only recently become available. Deep Learning provides new methods to improve predictive accuracy, response times and insight into new phenomena, often using data sets that would previously have been considered unmanageable. Full Abstract
12:05	LUNCH