2/2/2021 Daniel Inafuku for Illinois Physics
Big data problems need big data solutions
With the number of COVID-19 cases mounting worldwide, the need to understand the disease's biology, spread, and behavioral effects is now greater than ever. Many aspects of SARS-CoV-2, the novel coronavirus that causes COVID-19, are still not well understood, and good treatments remain elusive. At the same time, responses to its spread need to be informed by accurate models. To address these problems, physicists at the University of Illinois at Urbana-Champaign are contributing to our understanding of the disease, as well as helping to organize efforts on the ground to provide aid.
Illinois Physics Professor Mark Neubauer is among those leading efforts within the physics community to confront the pandemic. Neubauer is an experimental particle physicist who uses data from the Large Hadron Collider (LHC) to study the fundamental interactions of elementary particles and to search for new physics that lies beyond the standard model. Neubauer is a member of the ATLAS Experiment at the LHC, which searches for elementary particles produced when other particles collide at high energies. The experiment yields extremely large datasets, so Neubauer and his colleagues are equipped with a variety of tools to analyze big data, such as data mining, machine learning, and data visualization. Perhaps not surprisingly, these tools are precisely those that can help scientists probe the complexities of COVID-19.
In tackling the SARS-CoV-2 virus, biologists are particularly interested in the proteins that the virus possesses. Proteins, which are essentially long strings of chemicals known as amino acids, carry out many biological functions, including those in humans. The functionality of proteins depends on their ability to fold into complex shapes. In the case of SARS-CoV-2, proteins stud the surface of the viral particle and allow it to invade cells, giving the virus access to those cells' machinery to make more viruses. Being able to predict viral protein shapes could give insight into the design of potential treatments. However, protein folding is a notoriously complex and computationally expensive problem because of the large number of conformations each protein string can adopt. Viral biology, like particle physics, requires access to large amounts of computing power and the tools to properly harness it.
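To get a feel for why the number of conformations makes the problem so expensive, consider a rough back-of-the-envelope count. The short Python sketch below assumes, purely for illustration, that every amino acid in a chain can independently take one of three local states and that a computer can check a trillion conformations per second. Both numbers are assumptions chosen for this example, not measured properties of any SARS-CoV-2 protein, and real folding simulations do not enumerate shapes this way.

```python
# Illustrative arithmetic only: the per-residue state count and the sampling
# rate are assumptions, not measured values for any real protein.

def conformation_count(num_residues: int, states_per_residue: int = 3) -> int:
    """Number of chain conformations if each residue independently picks a state."""
    return states_per_residue ** num_residues


def years_to_enumerate(num_conformations: int, samples_per_second: float = 1e12) -> float:
    """Time to visit every conformation once at a fixed (assumed) sampling rate."""
    seconds_per_year = 365 * 24 * 3600
    return num_conformations / samples_per_second / seconds_per_year


if __name__ == "__main__":
    for n in (10, 50, 100):
        count = conformation_count(n)
        print(f"{n:>3} residues: {count:.3e} conformations, "
              f"{years_to_enumerate(count):.3e} years to enumerate")
```

Because the count grows exponentially with chain length, even modest proteins are far beyond brute-force enumeration under these assumptions, which is why the work is spread over many machines and guided by smarter sampling.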
Given their background in big data analytics, particle physicists like Neubauer are well-primed to study COVID-19's biology. Neubauer is currently serving on the executive boards of three groups that bring together scientists of diverse backgrounds to confront the pandemic: the Institute for Research and Innovation in Software for High Energy Physics (IRIS-HEP), Open Science Grid, and Science Responds. These groups are coordinating their efforts and contributing valuable computing resources to understand the virus at the biological level.
Illinois Physics Professor Mark Neubauer stands in the Illinois Campus Cluster (ICC) at the U of I Advanced Computation Building on the Urbana campus. Neubauer is the principal investigator for the Midwest Tier-2 Computing Center, which uses the ICC to process LHC data and to run numerical simulations in collaboration with the National Center for Supercomputing Applications.
IRIS-HEP and the Open Science Grid
IRIS-HEP brings together physicists, data scientists, and computer scientists from across the US and around the globe to address computing challenges in particle physics, in particular the analysis of data coming from facilities such as the LHC. IRIS-HEP’s primary mission is to support the search for exotic physics beyond the standard model, and to understand the ramifications of any such discoveries, by creating new and improved data analysis software and algorithms.
Working in close collaboration with IRIS-HEP is the Open Science Grid (OSG). With funding from the National Science Foundation and the US Department of Energy, OSG supports computational scientists in various fields, including particle physics, protein chemistry, and materials science, by giving users access to tailored software and a computational grid that lets them collaborate remotely while running numerical simulations. The grid works by spreading data analysis over a large network of computers, each one working independently.
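The idea of spreading an analysis over many independent workers can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not OSG's actual software: threads on a single machine stand in for grid nodes, the function names are hypothetical, and the "analysis" is just a sum of squares. The point is the pattern of splitting the data into chunks, processing each chunk without reference to the others, and combining the partial results at the end.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(items, num_chunks):
    """Split a list into at most num_chunks contiguous, near-equal pieces."""
    chunk_size = max(1, -(-len(items) // num_chunks))  # ceiling division
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def analyze_chunk(chunk):
    """Stand-in for a real analysis job: here, a sum of squares."""
    return sum(x * x for x in chunk)


def run_distributed(items, num_workers=4):
    """Process each chunk independently, then combine the partial results."""
    chunks = split_into_chunks(items, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(analyze_chunk, chunks))
    return sum(partial_results)


if __name__ == "__main__":
    print(run_distributed(list(range(10))))
```

Because no chunk depends on any other, the same pattern scales from a handful of threads to thousands of machines; on a real grid, a scheduler rather than a thread pool would hand out the chunks.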
With the COVID-19 pandemic raging on, IRIS-HEP and OSG have expanded their work to include the study of SARS-CoV-2’s biology. The two groups have contributed to a variety of COVID-19-related projects, including the development of critical infrastructure for ventilator monitoring software and the adaptation of existing algorithms for epidemiological efforts. In particular, Illinois Physics postdoctoral researcher and ATLAS physicist Matthew Feickert has created Docker images, self-contained packages that bundle software with its dependencies so that it runs the same way on different operating systems. Docker images are often useful when dealing with a computational grid whose nodes differ in their hardware and installed software. The Docker images that Feickert created contain software from Folding@Home, a project that allocates computational resources from citizen scientists to solve problems in protein folding. Placing Folding@Home within these Docker images allows users to run it on OSG sites using donated resource time from ATLAS. Moreover, users (professional and citizen scientists alike) can run Folding@Home software efficiently on their own computing devices without needing detailed familiarity with Folding@Home beforehand.
“Placing software into Docker images builds a program that contains versions of that software in a closed-off environment. This makes the software ‘think’ that it’s running on a native Linux machine regardless of what computer it is actually running on. This idea is perfect for running Folding@Home at computing sites around the world because it abstracts away the operating system and dependency requirements, placing immediate focus instead on matching available computing resources with Docker images and data,” says Feickert.
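In practice, the appeal of this approach is that launching the software looks the same on every machine that has Docker installed. The Python sketch below illustrates that under stated assumptions: the image name is hypothetical and is not the image Feickert published, and the helper functions are invented for this example. It only assembles a standard `docker run --rm` command and, if Docker is present, hands it to the operating system.

```python
import shutil
import subprocess

# Hypothetical image name, used only for illustration.
EXAMPLE_IMAGE = "example/folding-at-home:latest"


def build_run_command(image, extra_args=()):
    """Build a `docker run` command; --rm removes the container when it exits."""
    return ["docker", "run", "--rm", image, *extra_args]


def launch(image, extra_args=()):
    """Run the container if Docker is installed; otherwise report and skip."""
    command = build_run_command(image, extra_args)
    if shutil.which("docker") is None:
        print("Docker not found; would have run:", " ".join(command))
        return None
    return subprocess.run(command, check=False)


if __name__ == "__main__":
    print(" ".join(build_run_command(EXAMPLE_IMAGE)))
```

Nothing in the command refers to the host's operating system or installed libraries, which is the sense in which the image abstracts those details away.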
Science Responds
Another important effort in the battle against COVID-19, Science Responds is an organization that helps connect the science community internationally, promoting collaboration between its members on projects ranging from estimating the availability of hospital resources to developing applications that can assess a person's epidemiological susceptibility. This organization was born out of the recognition that scientists in non-medical fields have know-how and tools that can substantially contribute to the fight against COVID-19.
Science Responds provides easy access to publicly available biological and epidemiological data and allows anyone with interest to see active, open, and ongoing projects. Projects are organized both by type (simulations/modeling, public health, etc.) and by required skills (data visualization, web development, or statistics), grouping scientists into communities with common goals. One advantage of this arrangement is that any researcher can quickly join a project that fits their unique skill set. Science Responds enables rapid connectivity among physicists at a time when in-person collaborations have all but disappeared. Since the organization’s inception in March 2020, its Slack channel has garnered nearly 250 members and counting.
Taken together, these new initiatives and large-scale collaborations are affording physicists whose traditional expertise may not lie in the life sciences the opportunity to apply their skills to COVID-19 research.
Feickert notes, “Physicists love to see if they can jump into new areas and collaborate across fields—often to the point of being accused of hubris. But for something like COVID-19, the stakes are too high and we want to let the epidemiologists do their jobs without adding noise. At the same time, physicists have a wealth of experience in attacking tricky problems with global-scale distributed computing. We saw an area where we could apply our computational skill sets to support and accelerate the work of other scientists. It was a great fit and the obvious way to responsibly use our skills to help.”
Champaign Urbana (CU) Mutual Aid
University of Illinois research programmer Ben Galewsky is also attacking the pandemic from several different angles. In collaboration with the Champaign PTA Council and a handful of local community groups, Galewsky has helped organize Champaign Urbana (CU) Mutual Aid, a website that mobilizes community volunteers and addresses issues raised by the pandemic in the CU region. CU Mutual Aid has led efforts along many different fronts, including the delivery of critical supplies such as food and medicine to those in need; the distribution of crucial pandemic information in multiple languages; and the creation of cloth masks. Galewsky's work involves adapting data from the CU Health District to provide CU Mutual Aid with information on food distribution volunteers.
In addition to his community work, Galewsky has facilitated the operation of a simulation program called COVID-19 Mesa. This program was developed at the National Center for Supercomputing Applications and models the community spread of COVID-19 by accounting for a number of different social factors. COVID-19 Mesa is an agent-based model that uses publicly available data to suggest potential approaches to limiting the spread of the disease in the CU region. To ensure that the Mesa code runs efficiently, Galewsky employs a high-performance function-serving system called funcX, an initiative sponsored by the University of Illinois, the University of Chicago, and Argonne National Laboratory. The funcX system enables data analysis workloads to be spread across an array of computing units, from smaller ones such as laptops to larger ones such as campus clusters and supercomputers.
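For readers unfamiliar with the term, an agent-based model simulates each individual separately and lets the overall outbreak emerge from their interactions. The Python sketch below is a deliberately minimal illustration of that idea. It is not the COVID-19 Mesa code and uses neither the Mesa library nor funcX; every parameter value (population size, contacts per day, transmission probability, recovery time) is an assumption chosen for the example rather than a fitted epidemiological quantity.

```python
import random


def simulate(num_agents=200, initial_infected=5, contacts_per_day=4,
             transmission_prob=0.05, recovery_days=10, days=60, seed=0):
    """Return a list of (susceptible, infected, recovered) counts, one per day."""
    rng = random.Random(seed)
    # Each agent is 'S', 'I', or 'R'; days_sick tracks time since infection.
    state = ["I"] * initial_infected + ["S"] * (num_agents - initial_infected)
    days_sick = [0] * num_agents
    history = []
    for _day in range(days):
        newly_infected = set()
        for agent in range(num_agents):
            if state[agent] != "I":
                continue
            # Each infected agent meets a few randomly chosen agents per day.
            for _contact in range(contacts_per_day):
                other = rng.randrange(num_agents)
                if state[other] == "S" and rng.random() < transmission_prob:
                    newly_infected.add(other)
        # Infected agents recover after a fixed number of days.
        for agent in range(num_agents):
            if state[agent] == "I":
                days_sick[agent] += 1
                if days_sick[agent] >= recovery_days:
                    state[agent] = "R"
        for agent in newly_infected:
            state[agent] = "I"
        history.append((state.count("S"), state.count("I"), state.count("R")))
    return history


if __name__ == "__main__":
    baseline = simulate()
    fewer_contacts = simulate(contacts_per_day=2)
    print("never infected, baseline:      ", baseline[-1][0])
    print("never infected, fewer contacts:", fewer_contacts[-1][0])
```

Rerunning the simulation with different settings, such as fewer daily contacts, is the toy analogue of comparing interventions. Each run is independent of the others, which is what makes this kind of workload a natural fit for a system that farms out function calls to many machines.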
The battle rages on
As the global community continues to face public health, economic, and social restrictions worldwide, the search for solutions has never been more urgent. Neubauer and his research group remain committed to applying their unique skills to support their colleagues in the fight against the pandemic.
“In March of 2020, as the COVID-19 pandemic began to take hold in the US, I along with many of my colleagues in the ‘big data’ scientific realm and computational science community organized grassroots efforts aimed at leveraging our large, motivated, and organized community with certain technical and scientific capabilities to help out in the COVID-19 effort,” Neubauer sums up. “As overcoming COVID-19 will be a marathon rather than a sprint, our group will continue efforts such as these and, along with the broader particle physics community, pursue new opportunities to support research aimed at confronting this global crisis.”
If you'd like to learn more about or to volunteer with the organizations above, please visit:
- IRIS-HEP COVID-19 Response: https://iris-hep.org/covid-19.html
- Open Science Grid COVID-19 Response: https://opensciencegrid.org/covid-19.html
- Science Responds: https://science-responds.org/
- CU Mutual Aid: https://www.cu-mutual-aid.org/