Julie E. Elie and Boaz Styr, UC Berkeley

Non-Gene Regulatory DNA Identified via Artificial Intelligence Also Associated with Autism in Humans

The vocalizations of humans, bats, whales, seals, and songbirds are vastly different from each other. Humans and birds, for example, are separated by some 300 million years of evolution. Still, scientists studying how these animals learn to “speak” have time and again seen surprising similarities in the connections in brain regions that support this vocal learning.

In a paper published today in the prestigious journal Science, a multi-institutional team led by scientists at Carnegie Mellon University and the University of California, Berkeley found parts of the genome, both within genes and outside of them, that evolved and are associated with vocal learning across mammals.

Employing a machine learning approach called TACIT (Tissue-Aware Conservation Inference Toolkit) on the Pittsburgh Supercomputing Center’s Bridges-2 system, the laboratory of Andreas Pfenning, an Associate Professor in the Ray and Stephanie Lane Computational Biology Department at Carnegie Mellon University who is also affiliated with the Neuroscience Institute and the Department of Biological Sciences, identified 50 gene regulatory elements from the brains of humans, bats, whales, and seals that have a much stronger relationship to vocalization. Regulatory elements are DNA sequences outside the actual genes that direct which genes are active in which tissues. Scientists have come to understand that regulatory elements play a big role in the evolution of behaviors. But studying them has been much more difficult than studying the genes themselves.

“New artificial intelligence methods were needed to help find evolutionary signals in regulatory elements across hundreds of genomes,” said Pfenning, a corresponding author of the new study. “We’re entering an exciting era where AI is improving our ability to trace human evolutionary history.”

Studying the gene regulatory elements requires building a map of which ones are active in the relevant brain region of species with vocal learning behavior. That brain region was identified through experiments conducted at UC Berkeley in the laboratory of Michael Yartsev, another corresponding author. His team found evidence that a specific part of the Egyptian fruit bat brain has neural connections similar to those of the part of the human brain that controls speech production.

“The types of cells that form long-range connections in the human and bat brain are the same ones that we discovered as most relevant to vocal learning based on the genetic analysis,” Pfenning said. “The anatomy and genetics are both pointing to the same mechanism underlying the evolution of vocal learning across mammals and speech production in humans.”

Both the vocal learning-associated genes and the gene regulatory elements discovered in this study tend to be in parts of the genome related to autism spectrum disorder. This suggests that studying the evolutionary history of the human genome can provide clues to how these regions influence human health.

You can find the paper here.

The CMU School of Computer Science’s announcement is here.

The Computational Biology Department’s announcement is here.