Modeling the Human Immune System – Sponsor Content
When the Human Genome Project mapped and sequenced the entire set of DNA instructions in a human cell (a 13-year, $2.7 billion global endeavor completed in 2003), it produced one of humanity’s greatest accomplishments. The project was a historic milestone that dramatically accelerated biomedical research and forever altered the way we practice medicine. But it didn’t fully answer the question of why we get sick, since genetics accounts for only roughly 20 percent of diseases. Social determinants of health, such as economics, education, the physical environment, racism, and sexism, explain some of that. Yet even people facing similar circumstances respond differently when exposed to the same immune system challenge, such as a pathogen or a vaccine. What accounts for the extreme variation in human immune response to the various threats and insults of life?
Those differences are embedded in one of the most complex systems in life science: the human immune system. Twenty years after the human genome was sequenced, a new moonshot is taking shape. It’s many orders of magnitude more complex than the genome and could, in turn, be orders of magnitude more useful than the genome for understanding human health and disease. The next moonshot in life sciences is decoding the human immune system.
Unlocking the secrets of our immune system offers the tantalizing possibility that we can one day understand who will get sick, how their disease will progress, and which interventions will work best for each individual. Imagine that doctors could predict who will get cancer, whether it will metastasize, and how to intervene with early treatment or even preventative measures.
“Fifteen years ago, that would have sounded like science fiction,” says Shai Shen-Orr, founder and chief scientist of CytoReason, a company building computational models to connect the molecular features of patients’ lab work to their clinical outcomes. But now, thanks to AI, it is within reach.
So far, machine learning has largely been applied to specific healthcare tasks using one type of data—for instance, retinal scans, which are being used to predict and diagnose a host of medical conditions ranging from kidney disease and heart attacks to Alzheimer’s disease. The same is true of electrocardiograms, which when boosted with deep learning are capable of determining age and sex and detecting cardiac dysfunction, anemia, and more. Computer vision is increasingly allowing us to characterize the molecular basis of tumors, which can guide the clinician in treating tissue cancers.
But as the wave of AI breaks over life sciences, biotech, and medicine, it’s time to move beyond two-dimensional mapping into three-dimensional modeling and systems thinking—and to take those capabilities into the fourth dimension by modeling people over time, from a newborn’s developing immune system to an elderly person’s failing one. That’s a giant leap, and it requires a step back from specific tasks to think more globally about the astoundingly complex system that is the human immunome.
*
The immunome includes all the molecules, proteins, cells, tissues, and organs of the immune system, as well as all their interactions with the body’s other biological systems, such as the genome, epigenome, microbiome, and metabolome. And it includes the exposome: external factors (or inputs) such as stress, pollution, and diet.
Can AI sort it out? Leading experts think so. “What’s striking is that now we’re seeing self-supervised learning and unsupervised learning,” says Eric Topol, founder of the Scripps Research Translational Institute and author of the book Deep Medicine: How Artificial Intelligence Can Make Healthcare Human Again. “We’re putting essentially everything we know about the immunome into these large language models, which can help sort and interpret generally—and not just the code, but the context and the ideas. This is a very auspicious time for AI.”
But while the AI technology is ripening, the immunological data to feed it is missing. Currently, hospitals, clinical research organizations, and medical practices generate mountains of health data on a daily basis, but detailed information about the immune system is lacking. Wearable sensors, smartphones, and other self-tracking devices and apps are contributing more health data than we’ve ever had, but even all of that will not provide enough usable information of sufficient resolution to fuel models that would explain the human immunome.
To generate the data required, researchers would need to analyze all the cells and cell types in the immune system and determine how many cells there are, what state they’re in, and the molecular signals that determine that state. Doing that requires measuring the signaling proteins and metabolites, including hormones. Even with recent developments in multiplexing, that’s still a lot of very advanced and expensive scientific testing.
Beyond data collection, there are a host of data structure issues that need to be addressed: ownership, privacy, and security considerations, and the need for informed consent. Other ethical concerns include the need to have a representative sampling of the entire human population—not just those who have access to and can afford the tests, a problem that plagued the original human genome sequencing effort.
And that’s just the beginning of the challenge. Once researchers have the data, they’ll still have to figure out how to build a model that can process a wide range of inputs, including numerical data, text-based data, and still and moving images. How do you label and tag that data in consistent and meaningful ways? Can you get sample sizes that are large enough to be statistically significant? The ultimate challenge will be to build models that not only generate predictions but also explain the underlying mechanisms at work.
While large language models are very good at predicting what the next word in a sentence will be, the extent to which they can identify some of the intricate underlying biological causal structures is not yet clear. “They may be able to predict what the next state of the immune system will be, but it’s much more difficult to know how to intervene to alter the future trajectory of the immune system so that this person will be healthy again rather than continuing down the path of disease to a worsening state,” says John Tsang, founding director of the Yale Center for Systems and Engineering Immunology.
Researchers, inspired by large language models, are building foundation models of genes and cells, akin to words and paragraphs in natural language. As more single-cell data sets become available, researchers are able to use them to help the models learn complex relationships between cells and gene expression. The models can, for instance, predict how a particular T cell would respond to an external signal, or what state the cell will be in and how it functions.
“We need to learn the language of the immune system, particularly its cells and how the cells communicate,” Tsang says. “And that’s something that we are really just at the infancy of understanding.”
*
In the summer of 2023, the Human Immunome Project (HIP), an organization I chair, unveiled a new research plan: a comprehensive, scalable open-science effort to collect the advanced cellular and molecular data needed and feed that data into building machine learning models. No matter how the endpoint is defined, the project will catapult the field of immunology into the machine learning age, creating many new possibilities for drug discovery and development, and the diagnosis and treatment of disease.
“Decoding and modeling the human immunome is a scientific and medical feat of unmatched proportions—and has the opportunity to truly transform how we think about and practice medicine,” says Hans Keirstead, a pioneering neuroscientist, a serial biotech entrepreneur, and the CEO of HIP. “We can change the trajectory of global health for the better.”
Keirstead’s plan, developed with CytoReason’s Shen-Orr and Yale’s Tsang, is to select regional scientific centers on six continents, each of which will sample and analyze a comprehensive and diverse population ranging from young to old, sick to healthy, and of multiple racial, ethnic, and socioeconomic backgrounds. To harmonize all the data captured, the team is developing a proprietary multi-modal immune monitoring toolkit that will be engineered to withstand difficult environments and to optimize for privacy, security, cost, standardization, and interoperability.
The project is also working to “avoid the pitfalls of the Human Genome Project” by maximizing representation and minimizing bias, according to Shirin Heidari, a researcher at the Graduate Institute of International and Development Studies in Geneva and a former virologist. Heidari, who has consulted with HIP, says these measures are “essential not only to ensure better population diversity but also to consider the validation, optimization, and standardization of these assays across different sexes and populations to accurately capture variations.”
The only way this plan can work is if HIP plays a significant role in directing and coordinating the project, centralizing logistics and scientific oversight, and housing administrative functions.
“This effort is so grand on the operational side of it, and the technology and the science that needs to be solved, that no one lab can do this and no one company—this is greater than all of those together. It requires a new type of thinking and a new kind of collaborative science,” says Shen-Orr.
HIP’s leadership takes inspiration from the European Organization for Nuclear Research model of global collaboration, which asks that normally competitive physicists from across the world, including countries hostile to each other in other settings, put aside politics and competition because access to the particle accelerators is so valuable. The diverse immunological database HIP is building and the AI models that will result will be a comparable asset—the largest collection of open-source, standardized, state-of-the-art immunological data in the world. “If we can pull the right people and approaches together, magic is going to happen,” Tsang says. “Things are emerging already and just need the right catalyst to make it come together.”
It’s hard to imagine getting to real precision medicine without this foundation.
Jane Metcalfe is chair of the Human Immunome Project, CEO and founder of proto.life, and co-chair of the Council of the Focused Ultrasound Foundation. She is also co-founder and former president of Wired magazine.