Study provides strong framework for 1 billion years of green plant evolution

International collaboration generates gene sequences from more than 1,100 plant species
The 1KP study produced a data-rich framework for green plant evolution.

A billion years ago, an ancestral algal species split in two, starting the evolution of green plants that has led to the nearly half million diverse species we have today. Now, an international consortium of plant scientists has generated thousands of gene sequences from each of more than 1,100 plant species, providing the most data-rich framework to date for understanding the evolutionary history of this green tree of life.

Norman Wickett, a conservation scientist at Northwestern University and the Chicago Botanic Garden, is a longtime member of the One Thousand Plant Transcriptomes Initiative (1KP), the consortium of close to 200 researchers that conducted the nine-year study. 

The researchers sequenced tens of thousands of genes for species distributed across all major groups of land plants and their green algal relatives. Their findings were published as the cover article in the Oct. 31 print issue of the journal Nature.

One key finding of the study is that fundamental events in evolutionary history, such as the colonization of land by plants, may have followed a different set of steps than was previously thought. Another important discovery is that gene and genome duplications may have helped plants meet the challenge of colonizing land and adapt to new environments as conditions on land changed over time.

“Determining how plants -- or any organisms -- are related to each other provides the foundation for understanding the timing and significance of major events in the history of life on Earth,” said Wickett, a co-author of the study. “The more data we have to figure out these relationships, the more precisely we can do this. Our project gave us an unprecedented data set for not only putting together the evolutionary tree of plants but also for understanding the diversity and evolutionary complexity of genes at different times in their history.”

Wickett is an adjunct professor in the Program in Plant Biology and Conservation in Northwestern’s Weinberg College of Arts and Sciences and an associate conservation scientist at the Chicago Botanic Garden. His contributions to the 1KP study include helping determine how today’s plants and algae are related to one another and developing certain data processing methods necessary for making comparisons that use such a large amount of data.

Gane Ka-Shu Wong, a professor at the University of Alberta, is the collaboration’s lead investigator and a co-corresponding author of the study. The other co-corresponding author is James Leebens-Mack, a professor of plant biology at the University of Georgia. 

Flowers and fruit and other genetic innovations

“In the tree of life, everything is interrelated,” Wong said. “And if we want to understand how the tree of life works, we need to examine the relationships between species. That’s where genetic sequencing comes in.”

The findings reveal the timing of whole genome duplications and the origins, expansions and contractions of gene families contributing to fundamental genetic innovations enabling the evolution of green algae, mosses, ferns, conifer trees, flowering plants and all other green plant lineages. The history of how and when plants secured the ability to grow tall and make seeds, flowers and fruits provides a framework for understanding plant diversity around the planet, including annual crops and long-lived forest tree species. 

“Our inferred relationships among living plant species inform us that over the billion years since an ancestral green algal species split into two separate evolutionary lineages, one including flowering plants, land plants and related algal groups and the other comprising a diverse array of green algae, plant evolution has been punctuated with innovations and periods of rapid diversification,” Leebens-Mack said. 

New data and computational tools needed

“In order to link what we know about gene and genome evolution to a growing understanding of gene function in flowering plant, moss and algal organisms, we needed to generate new data to better reflect gene diversity among all green plant lineages,” Leebens-Mack said. 

The study inspired a community effort to gather and sequence diverse plant lineages derived from terrestrial and aquatic habitats on a global scale. More than 100 taxonomic specialists contributed material from field and living collections that include the Central Collection of Algal Cultures; Royal Botanic Gardens, Kew; Royal Botanic Garden Edinburgh; Atlanta Botanical Garden; New York Botanical Garden; Fairylake Botanical Garden, Shenzhen; the Florida Museum of Natural History; Duke University; University of British Columbia Botanical Garden; and the University of Alberta.

By sequencing and analyzing genes from a broad sampling of plant species, researchers are better able to reconstruct gene content in the ancestors of all crops and model plant species and gain a more complete picture of the gene and genome duplications that enabled evolutionary innovations. 

The massive scope of the project demanded development and refinement of new computational tools for sequence assembly and phylogenetic analysis.

“New algorithms were developed by software engineers at BGI to assemble the massive volume of gene sequence data generated for this project,” Wong explained.

Founder professor of computer science Tandy Warnow, of the University of Illinois at Urbana-Champaign, and Siavash Mirarab, assistant professor of electrical and computer engineering at the University of California San Diego, developed new algorithms for inferring evolutionary relationships from hundreds of gene sequences for more than 1,000 species, addressing substantial heterogeneity in evolutionary histories across the genomes.

Focus on genome duplications

The timing of 244 whole genome duplications across the green plant tree of life was one of the interrelated research foci of the project.

“Perhaps the biggest surprise of our analyses was the near absence of whole genome duplications in the algae,” said Mike Barker, associate professor of ecology and evolutionary biology at the University of Arizona.

“Building on nearly 20 years of research on plant genomes, we found that the average flowering plant genome has nearly four rounds of ancestral genome duplication dating as far back as the common ancestor of all seed plants more than 300 million years ago,” he said. “We also find multiple rounds of genome duplication in fern lineages, but there is little evidence of genome doubling in algal lineages.” 

In addition to genome duplications, the expansion of key gene families has contributed to the evolution of multicellularity and complexity in green plants.

“Gene family expansions through duplication events catalyzed diversification of plant form and function across the green tree of life,” said co-author Marcel Quint, professor of crop physiology at Halle University, Germany. “Such expansions unleashed during terrestrialization, or even before, set the stage for evolutionary innovations including the origin of the seed and, later, the origin of the flower.” 

“The view of evolutionary relationships provided by 1KP has led to new hypotheses about the origins of key structures and processes in green plants,” said co-author Pam Soltis, of the Florida Museum of Natural History, University of Florida.

Nearly a decade ago, Wong organized private funding through the Somekh Family Foundation as well as support from the Government of Alberta and a sequencing commitment from BGI in Shenzhen, China, to launch 1KP. Once the project was operational, additional resources came from other ongoing projects, including iPlant (now CyVerse) funded by the U.S. National Science Foundation. 

The paper is titled “One Thousand Plant Transcriptomes and Phylogenomics of Green Plants” and was published online Oct. 23 in Nature. Sequences, sequence alignments and tree data are available through the CyVerse Data Commons.