On a chilly January morning in Baja California, in a temporary camp surrounded by scrub, an occasional cactus and a distant ring of mountains, a group of biologists, geologists and geneticists are gathered in a rough semicircle, some on camp chairs, others perched on coolers and plastic containers. They are here to solve an enduring mystery that may shed light on the fundamental rules of life. But first, coffee. Afterward, they will spend the morning collecting tissue samples from six target species, including sour pitaya cactus, brittlebush, the Baja California spiny lizard and the packrat. In the afternoon, they will scramble over rocky terrain to check out unusual geologic features. From dawn to dusk and under the stars, they are talking, teaching each other key concepts in evolutionary biology, sedimentology, volcanology, molecular ecology and comparative genomics. Then they will hit the road and do it all again tomorrow.
Like academics anywhere, these scientists like to go to a white board to illustrate the trickier concepts. Out here, that role is played by the hood and side panels of their (white) pickup. For six days in early 2020, the inaugural field expedition of the Baja GeoGenomics consortium rolled along the back roads of the peninsula in a large Ford 4x4 that ended up covered in red and black tectonic diagrams and phylogenetic trees — a scientific Mystery Machine chasing answers through Deep Time.
The point was to gather data for experiments, but the project itself is a meta-experiment. Multimillion-dollar funding from the National Science Foundation is meant in part to test whether serious interaction across disciplines can lead to new insights into complex problems.
Looking back for answers in Baja
The Baja peninsula is some 750 miles long, but the consortium is going much further: a few million years and thousands of generations, at least. Although dozens of species of plants and animals have ranges that extend up and down the peninsula, there is a stark divide between the genomes of Northern individuals and their Southern cousins, and understanding the ancient landscape is the key to it all. The North-South genetic divergence has been observed in more than 60 species, from cactus to kangaroo rats, brittlebush to brush lizards.
There is a leading hypothesis, one that biologists continually refer to in their papers: A seaway must have cut the peninsula in two millions of years ago, where there is a topographic low today. That would have separated the species and could have caused the genetic variations. Geologists, in their studies, have not turned up evidence for this seaway, however. And there are competing theories. During times when glaciers changed temperature patterns on the peninsula, species may have retreated toward the mountains in the North and South and into isolated breeding pockets. Or differences in modern climate might be responsible; the rains come in the winter in northern Baja and in the summer in the South.
To solve the mystery, “you have to rewind which parts were there and which were not until you get to the point of this divergence,” said Greer Dolby, Ph.D., assistant professor in the Department of Biology, who joined UAB from Arizona State University in 2022. Dolby is principal investigator of the Baja GeoGenomics consortium, which is funded by a five-year, $2.6 million grant from the National Science Foundation. An evolutionary biologist by training, Dolby has been working in Baja since she was a graduate student, going on 14 years now.
“I want to understand how changes in Earth’s surface shape speciation broadly,” Dolby said. “Mountains are built and erode, river networks develop, the climate changes. You have all these characteristics that have different effects. Then there is the evolutionary part — species have different dispersal abilities, different generation times. Before I retire, my goal is to have developed a better working theory about the types of Earth processes that most shape speciation and their limits. What are the things that matter, and can we quantify those?”
What is geogenomics?
Dolby believes these answers can emerge from geogenomics, a nascent field that traces its origins to a 2014 paper and the rapidly falling cost of gene sequencing. The vision as initially stated was to “use large genomic data to answer geological questions,” Dolby said. By collecting many samples of a species across varying terrain and comparing their gene sequences to a reference genome, you could estimate the timing of a major Earth event whose age has been difficult to pin down with geological tests alone. Two examples from this hemisphere are the emergence of the Isthmus of Panama connecting North and South America and the rise of the Andes. These topographical changes should have left a record in the diverging genetic sequences of related species. But Dolby casts a wider net. “To me, the simplest definition of geogenomics is ‘a deep integration of geologic data and genomic data,’” she said. “I have no further constraints than that.”
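For readers curious how diverging sequences can yield a date, the usual starting point is the molecular clock: if mutations accumulate at a roughly steady rate, the fraction of sites that differ between two lineages scales with the time since they split. The Python sketch below is only an illustration of that idea and is not the consortium's pipeline. The two short sequences, the substitution rate of 1e-8 per site per year and the function names are all assumptions made up for the example; real analyses use whole genomes, calibrated rates and far richer models.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of aligned sites that differ (gaps and ambiguous bases skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def jukes_cantor(p):
    """Correct the raw distance for repeated hits at one site (Jukes-Cantor model)."""
    if p >= 0.75:
        raise ValueError("sequences too divergent for this correction")
    return -0.75 * math.log(1 - 4 * p / 3)


def divergence_time_years(distance, rate_per_site_per_year):
    """Both lineages accumulate changes after the split, hence the factor of 2."""
    return distance / (2 * rate_per_site_per_year)


# Hypothetical aligned fragments from a northern and a southern individual.
north = "ACGTACGTACGTACGTACGT"
south = "ACGTACGTATGTACGTACGT"

# ASSUMPTION: illustrative substitution rate, not a measured value.
ASSUMED_RATE = 1e-8

d = jukes_cantor(p_distance(north, south))
print(f"corrected distance: {d:.4f}")
print(f"estimated split: {divergence_time_years(d, ASSUMED_RATE):,.0f} years ago")
```

The estimate is only as good as the assumed rate, which is one reason independent geologic dates matter so much to this kind of work.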
“Deep integration” means developing new tools and new ways of seeing, Dolby says. It is not just a matter of adding the biologist’s genome-scale sequences to the radiometric dating, stratigraphy, pollen records, tree rings and other methods that geologists already use to measure time. “Most often, you have to go to math or physics” for fresh approaches to these problems, Dolby said. She sees her lab as “a test bed to try to understand how we can use these approaches to learn something new about the natural world.”
This involves time travel in more ways than one. “Basically, it is what Darwin did,” Dolby said. “Darwin came up with the theory of atoll formation, and a lot of his friends were geologists. But there is a lot of specialization that is needed today,” and that means biologists and geologists no longer share a common language.
Reconnecting the Earth and life sciences
In a 2022 paper, Dolby and the other members of the Baja GeoGenomics consortium wrote, “Future breakthroughs toward understanding the ‘unity of nature’ will require weaving together independent knowledge domains to reconnect the Earth and life sciences …. We are now poised to re-integrate these disciplines with a mechanism-focused perspective, inspired by the questions of past generations and the technology revolution of the present.”
The main challenge, the authors continue, “is the deep communication required across fields with different nomenclature and historical norms.” And the best way to kick-start the deep communication, they believe, is through shared fieldwork. “This knowledge integration requires the adoption of ways of seeing the world different from our own, or a transdisciplinary approach,” they wrote. “Within this framework, researchers learn to approach the questions of their field by seeing them from the angles of their collaborators.” Which is why the consortium launched its work with that six-day Baja trip in January 2020. “There is no substitute for shared discovery, particularly through joint fieldwork,” the researchers wrote in their 2022 paper.
“We are trying to bridge two completely different fields,” Dolby said. “This really is a new frontier; we don’t know what will work and what we will find.”
Convergence research among the snakes
Massive complexity and mutual incomprehensibility are not just a problem for biologists and geologists. Increasingly, breakthroughs will require transdisciplinary, rather than simply interdisciplinary, insights, or what the National Science Foundation calls “convergence research.” Which is another reason that the NSF is supporting the Baja GeoGenomics consortium work, along with a separate project that Dolby co-leads that is investigating drought tolerance in three pairs of rattlesnake species in the southwest United States.
The snake project could eventually inform conservation efforts in an era of climate change. The challenge is to integrate whole-genome, epigenome and transcriptome sequencing with behavioral and physiological data, such as when and how the snakes are active above ground. Dolby and her collaborators are comparing the different populations but also comparing which genes are turned on and off within individual snakes across seasons. “All of these can both directly and indirectly control a phenotype” (that is, an observable trait, such as behavior), Dolby said. “We look at coding sequence changes and we say, ‘These happen so slowly. How are species going to adapt to climate change?’ Over the last five to 10 years, though, we have learned that genes can be turned off transgenerationally by methylating them.” These are epigenetic changes. Biologists now understand that the levels of gene transcripts actually produced from a genetic blueprint, which transcriptome sequencing captures, vary seasonally as well.
“There are all these fine-tuning mechanisms, and we have no idea what their role is in natural environments,” Dolby said. This project, like the Baja GeoGenomics consortium study, “has a similar theme of adding complexity that we know is true and putting together the pieces we are learning in other fields. Because these snakes diverged at different times, do you have a shift in causal control? It could be behavioral adaptation first, then epigenetic, then as mutations accumulate, you see changes in the actual DNA sequence.”
Coping with a data explosion
The work is computationally intensive. “We get terabytes of data back” after samples are sent off for sequencing, Dolby said. Supercomputing clusters do the initial heavy lifting, and then a distilled version of the data is incorporated in statistical models.
"This knowledge integration requires the adoption of ways of seeing the world different from our own, or a transdisciplinary approach …. Within this framework, researchers learn to approach the questions of their field by seeing them from the angles of their collaborators."
From “Integrating Earth-life systems: a geogenomic approach,” by Dolby et al., 2022
A data explosion in biology and other fields means that researchers are finding intriguing patterns all around. But which ones are causal, and which are mere associations, or just statistical noise? It is an issue called pseudocongruence. “You observe a pattern and infer a process, and you might incorrectly infer that process because there are multiple possible causes,” Dolby said. Biologists observe genetic variance and hypothesize increased rainfall, say, as a cause. “Oftentimes you have changes in climate but also the emergence of a river at the same time,” Dolby said. “You know all these things are happening. Are all of them important? Do they have an additive effect? To better understand evolution, it becomes necessary to tease them apart and test individually.” In practice, this might involve comparing gene sequencing data with a series of statistical simulations of various hypotheses to determine which of those simulations most closely matches experimental results.
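The simulation-matching idea in the last sentence can be sketched in a few lines of Python. The version below uses a simple rejection scheme in the spirit of approximate Bayesian computation, which is a stand-in chosen for illustration and not necessarily what the consortium runs. Everything specific here is an assumption: the two hypothesis names, their time windows, the substitution rate, the observed divergence of 0.05 and the tolerance. Each hypothesis gets the same number of simulations, which amounts to treating them as equally plausible beforehand.

```python
import random


def simulate_divergence(split_years, rate, n_sites, rng):
    """Simulate the fraction of differing sites for one draw of the split time."""
    low, high = split_years
    t = rng.uniform(low, high)
    expected = min(2 * rate * t, 1.0)  # expected per-site divergence
    differing = sum(rng.random() < expected for _ in range(n_sites))
    return differing / n_sites


def compare_hypotheses(observed, hypotheses, rate=1e-8, n_sites=1000,
                       n_sims=300, tolerance=0.01, seed=0):
    """Share of accepted simulations per hypothesis (equal prior weight assumed)."""
    rng = random.Random(seed)
    accepted = {}
    for name, split_years in hypotheses.items():
        hits = 0
        for _ in range(n_sims):
            simulated = simulate_divergence(split_years, rate, n_sites, rng)
            # Keep a simulation only if it lands close to the observed value.
            if abs(simulated - observed) <= tolerance:
                hits += 1
        accepted[name] = hits
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulation matched; widen tolerance or revisit hypotheses")
    return {name: hits / total for name, hits in accepted.items()}


# ASSUMPTION: invented time windows (in years) for two competing scenarios.
HYPOTHESES = {
    "older_split": (2e6, 4e6),
    "recent_split": (1e4, 2e5),
}

# ASSUMPTION: made-up observed north-south divergence per site.
support = compare_hypotheses(observed=0.05, hypotheses=HYPOTHESES)
print(support)
```

A single summary number keeps the example short. Telling apart causes that overlap in time, as Dolby describes, would take several summary statistics across many species, along with simulations that model rivers, climate and barriers explicitly.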
“These projects are like a diamond,” Dolby said. “The geologists and biologists start in the same place, then diverge, and my job as leader of the consortium is to make sure that we are able to come back together again. It’s like Sudoku in a million dimensions, and it involves people as well as data, which is even harder.”
But the effort is worth it, Dolby believes. “Why are we doing these studies? To understand rules of life,” she said. “Eventually, all of this will translate to a local study system — freshwater or maybe cave systems,” in Alabama, Dolby added. “One of the reasons I came here was because I like the diversity — it is a culturally and demographically diverse place. There is a lot to draw from.”