Using Case Studies with Large Classes

Why Use Case Studies?

Case studies are powerful tools for teaching. They explore the story behind scientific research: the phenomenon being studied, the question the scientist asked, the thinking they used to investigate it, and the data they collected. Working through that story helps students better understand both the process and the content of science.

A strength of this approach is that it gives students the chance to consider how they would investigate a topic. Their answers are often similar to what the researchers being studied did. But students also come up with novel perspectives and unique approaches to the problems.

Many BioInteractive resources lend themselves to a case study approach. In most instances, I ultimately decide to convert the resource into a case study. For example, the video Animated Life: Mary Leakey is an excellent tool to get students thinking about the logic scientists use to study fossils and extinct species. Data Point resources are also a rich source of figures and questions that can be copied and pasted into a presentation to provide a brief case study that introduces a topic.

The Challenge for Large Classes

Many BioInteractive activities are structured in a way that makes them particularly useful for smaller groups and classes. And by smaller, I am thinking of fewer than 50 students. To some colleagues, that may seem to be a large class size. Indeed, in many instances, it probably is more than is optimal.

However, when I refer to large classes, what I am thinking of are the large introductory classes encountered in many colleges and universities in which enrollment can range from 100 to 500 or more depending on the institution. Classes of this size present instructors with the dual challenges of not just numbers but also anonymity. It’s logistically unmanageable to share and distribute printed copies of handouts or worksheets.

How to Scale Up

So how can an instructor promote the interaction that is essential to the success of these types of case study activities in such a large group? These are issues I grappled with when I went from teaching at a small liberal arts college, where my classes were smaller than 30, to teaching at a large university with classes of several hundred. I have found what I think are four parts to an effective solution.

1. Define a learning objective.

First and foremost, whether I have 30 or 300 students, I try to think about why I want to use a particular BioInteractive resource. I consider what it is that I want the students to do or think about while using the resource. How do I want them to be different after completing the assignment? In essence, I define the learning objective so I can determine the most effective platform and approach for delivering the lesson contained in the resource.

2. Create presentations with strategic pause points.

PowerPoint is a common tool for delivering material in large classrooms. It is quite easy to take images and questions from BioInteractive resource PDFs and insert them into slides. After reading the teaching notes and text in the student handouts, it’s relatively simple to develop the story that weaves the slides together in an interrupted case study. This is a style of case study that progressively leads students through the information with carefully planned “reveals” of information and strategically placed questions as stopping points to ponder the material along the way.

Videos are also fabulous resources to use during interrupted case studies in class. For example, I regularly use the video Niche Partitioning and Species Coexistence, which describes Dr. Rob Pringle’s work on niche partitioning in the savanna, as the core of a video case study in class. After the class watches the video for a few minutes, I stop and ask students about the phenomenon being studied and approaches that could be used to answer different questions.

I often use the following questions/prompts:

  • Why would anyone care about factors shaping species presence or absence?
  • Think about what factors could be important influences on shaping species richness in a community.
  • How can we use modern techniques to study what an animal is eating when we can’t watch the animal eat?

The video does an excellent job of addressing these topics and showing how researchers developed a creative approach to applying molecular techniques to answer ecological questions. How awesome is it that one video can help students tie together the central dogma, ecological theory, and community concepts! Depending on how much an instructor wants to structure the video case study in advance, it is even possible to embed small video clips and questions directly into a PowerPoint presentation.

3. Have students use clickers.

How should we tell the scientific story to large numbers of students and engage them in it? Clickers are a particularly helpful tool for asking questions about experiments, concepts, or results, because they present students with a specific moment when they need to choose among different options for a survey of their opinion or decide among right and wrong answers in a multiple-choice question.

For example, I typically start a case study with survey questions asking students to identify what they think is the most important item on a list of potential phenomena or to give their feedback about an issue in a Likert-scale response. Later, as the case study develops, I ask more specific questions about the experiment that require students to predict experimental outcomes or interpret a figure. For example, when I use the video The Effects of Fungicides on Bumble Bee Colonies, I show students several bar graphs with possible outcomes for the experiment and have them pick which they think the researchers will observe. After revealing the actual results, I ask them questions about interpreting the results and whether the results support the experimental hypothesis. I always allow students to talk and help one another during clicker questions to enhance their interaction and give them a choice to go along with a group opinion or answer based on their individual thinking.

4. Flip the classroom.

Another effective way to use BioInteractive resources in large classes is to use videos to flip a class session. BioInteractive animations and short films are rich with information that can pique interest, start discussions, or provide fundamental information. For example, I recently had my students watch the Genes as Medicine short film outside of class time. I asked them to then imagine they were an alien that found this video clip and to consider what information it would give them about life on Earth. At the start of the next class meeting, this sparked a lively discussion about what life is, which was more interesting than my going through a checklist of terms and definitions. Students had to uncover the characteristics of life from the video for themselves.

Benefits and Takeaways

What I hope these hints and suggestions from my own experiences show is how relatively simple it can be to scale up these resources to engage a class of any size. When they first encounter case studies, students can be a little unsure about this approach that requires them to talk to one another in a setting where they are expecting to be a face in the crowd. However, after they experience one or two case studies, I can see groups of students talking and exchanging ideas about the case. They are no longer passive listeners sitting in a room but instead have become active problem solvers seeking answers together. I can leave the stage and mingle through the room to listen to their discussions and encourage them as they develop their answers. This also gives me an opportunity to interact with students besides those sitting in the front row and to further develop a sense of community and connection, solving one of the challenges with big classes: anonymity.

It has been my experience that students quickly adapt to and begin to enjoy this approach. Rather than sitting in class watching yet another series of PowerPoint slides flash by, they are thinking and talking about science with one another. After my students talk things through with their neighbors and “shoulder buddies” during a case study, I find that they are more likely to speak up in class during the case study and at other points during the course.

At the beginning of the semester, I can barely get anyone to answer a question. After a few case studies, students begin asking and answering questions (even when we aren’t doing case studies), and the level of participation by different students in the room is noticeably higher. So in addition to case studies being a more interesting way for me as the teacher to present material to students and explore different biological topics, this approach also has the added benefits of helping build confidence within individual students and community among students, which makes a more rewarding and exciting learning environment for everyone.

Educational case studies based on examples of simulated or real research data can engage students in the process of thinking like a scientist, even when it is not possible to get into the field or laboratory to actually run an experiment. They can help overcome the challenges of data analysis and interpretation that are at the core of science education experiences. The collections of different resources available through HHMI BioInteractive provide a menu of modules for instructors to choose from that do just that. They get students to explore important biological topics from a variety of different approaches and look at the world through the lenses of different scientists. Regardless of what the actual format of a resource is when I encounter it, I know that it is possible to scale it up in some way to meet the needs of my classes.

Phil Gibson is a professor at the University of Oklahoma, where he enjoys teaching his students that learning a little botany never hurt anyone and is probably good for them in the long run. When he’s not thinking about new resources to use in class, he enjoys hiking with his family, listening to music, and cooking outrageously large breakfasts on the weekends.

5.1 Case Study: Genes and Inheritance

Created by: CK-12/Adapted by Christine Miller

Case Study: Cancer in the Family

People tend to have traits similar to those of their biological parents, as illustrated by the family tree. Beyond just appearance, you can also inherit traits from your parents that you can’t see.

Rebecca becomes very aware of this fact when she visits her new doctor for a physical exam. Her doctor asks several questions about her family medical history, including whether Rebecca has or had relatives with cancer. Rebecca tells her that her grandmother, aunt, and uncle — who have all passed away — had cancer. They all had breast cancer, including her uncle, and her aunt also had ovarian cancer. Her doctor asks how old they were when they were diagnosed with cancer. Rebecca is not sure exactly, but she knows that her grandmother was fairly young at the time, probably in her forties.

Rebecca’s doctor explains that while the vast majority of cancers are not due to inherited factors, a cluster of cancers within a family may indicate that there are mutations in certain genes that increase the risk of getting certain types of cancer, particularly breast and ovarian cancer. Some signs that cancers may be due to these genetic factors are present in Rebecca’s family, such as cancer with an early age of onset (e.g., breast cancer before age 50), breast cancer in men, and breast cancer and ovarian cancer within the same person or family.

Based on her family medical history, Rebecca’s doctor recommends that she see a genetic counselor, because these professionals can help determine whether the high incidence of cancers in her family could be due to inherited mutations in their genes. If so, they can test Rebecca to find out whether she has the particular variations of these genes that would increase her risk of getting cancer.

When Rebecca sees the genetic counselor, he asks how her grandmother, aunt, and uncle with cancer are related to her. She says that these relatives are all on her mother’s side — they are her mother’s mother and siblings. The genetic counselor records this information in the form of a specific type of family tree, called a pedigree, indicating which relatives had which type of cancer, and how they are related to each other and to Rebecca.

He also asks her ethnicity. Rebecca says that her family on both sides are Ashkenazi Jews (Jews whose ancestors came from central and eastern Europe). “But what does that have to do with anything?” she asks. The counselor tells Rebecca that mutations in two tumor-suppressor genes called BRCA1 and BRCA2, located on chromosomes 17 and 13, respectively, are particularly prevalent in people of Ashkenazi Jewish descent and greatly increase the risk of getting cancer. About one in 40 Ashkenazi Jewish people have one of these mutations, compared to about one in 800 in the general population. Her ethnicity, along with the types of cancer, age of onset, and the specific relationships between her family members who had cancer, indicates to the counselor that she is a good candidate for genetic testing for the presence of these mutations.
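
To put the counselor's figures in perspective, the two prevalence estimates quoted above can be compared directly. The short Python sketch below is only an illustration: the variable names are hypothetical, and the inputs are the approximate one-in-40 and one-in-800 figures from the case study, not precise epidemiological data.

```python
from fractions import Fraction

# Approximate carrier frequencies quoted in the case study (assumed inputs).
ashkenazi_prevalence = Fraction(1, 40)   # about one in 40
general_prevalence = Fraction(1, 800)    # about one in 800

# How many times more common the mutations are among Ashkenazi Jewish people.
fold_difference = ashkenazi_prevalence / general_prevalence

print(f"Fold difference in prevalence: {fold_difference}")  # prints 20
```

By these rough figures, the mutations are about 20 times more common in people of Ashkenazi Jewish descent than in the general population, which is why the counselor asks about ethnicity.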

Rebecca says that her 72-year-old mother never had cancer, nor had many other relatives on that side of the family. How could the cancers be genetic? The genetic counselor explains that the mutations in the BRCA1 and BRCA2 genes, while dominant, are not inherited by everyone in a family. Also, even people with mutations in these genes do not necessarily get cancer — the mutations simply increase their risk of getting cancer. For instance, 55 to 65 per cent of women with a harmful mutation in the BRCA1 gene will get breast cancer before age 70, compared to 12 per cent of women in the general population who will get breast cancer sometime over the course of their lives.
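
The counselor's explanation rests on two pieces of arithmetic, sketched below in Python. The first part assumes that a carrier has one mutated copy and one normal copy of the gene, so each child has a one-in-two chance of inheriting the mutation; this single-copy assumption is ours and is not stated in the case study. The second part compares the 55 to 65 per cent risk quoted for BRCA1 carriers with the 12 per cent lifetime risk in the general population. Because those two figures cover slightly different age ranges, the ratio is only a rough guide. All variable names are hypothetical.

```python
# Chance that a carrier with one mutated copy passes it to a given child
# (assumption: the carrier has one mutated and one normal copy).
p_inherit = 0.5
p_not_inherit = 1 - p_inherit

# Risk figures quoted in the case study.
brca1_risk_range = (0.55, 0.65)   # breast cancer before age 70, BRCA1 carriers
general_lifetime_risk = 0.12      # general population, over a lifetime

# Rough ratio of carrier risk to general-population risk.
relative_risk_range = tuple(r / general_lifetime_risk for r in brca1_risk_range)

print(f"Chance a carrier's child does not inherit the mutation: {p_not_inherit:.0%}")
print("Approximate relative risk: "
      f"{relative_risk_range[0]:.1f} to {relative_risk_range[1]:.1f} times")
```

Under these assumptions, a child of a carrier is as likely as not to be free of the mutation, and a woman with a harmful BRCA1 mutation has roughly 4.6 to 5.4 times the breast cancer risk of the general population. An elevated risk is still not a certainty.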

Rebecca is not sure she wants to know whether she has a higher risk of cancer. The genetic counselor understands her apprehension, but explains that if she knows that she has harmful mutations in either of these genes, her doctor will screen her for cancer more often and at earlier ages. Therefore, any cancers she may develop are likely to be caught earlier when they are often much more treatable. Rebecca decides to go through with the testing, which involves taking a blood sample, and nervously waits for her results.

Chapter Overview: Genetics

At the end of this chapter, you will find out Rebecca’s test results. By then, you will have learned how traits are inherited from parents to offspring through genes, and how mutations in genes such as BRCA1 and BRCA2 can be passed down and cause disease. Specifically, you will learn about:

  • The structure of DNA.
  • How DNA replication occurs.
  • How DNA was found to be the inherited genetic material.
  • How genes and their different alleles are located on chromosomes.
  • The 23 pairs of human chromosomes, which include autosomal and sex chromosomes.
  • How genes code for proteins using codons made of the sequence of nitrogen bases within RNA and DNA.
  • The central dogma of molecular biology, which describes how DNA is transcribed into RNA, and then translated into proteins.
  • The structure, functions, and possible evolutionary history of RNA.
  • How proteins are synthesized through the transcription of RNA from DNA and the translation of protein from RNA, including how RNA and proteins can be modified, and the roles of the different types of RNA.
  • What mutations are, what causes them, different specific types of mutations, and the importance of mutations in evolution and to human health.
  • How the expression of genes into proteins is regulated and why problems in this process can cause diseases, such as cancer.
  • How Gregor Mendel discovered the laws of inheritance for certain types of traits.
  • The science of heredity, known as genetics, and the relationship between genes and traits.
  • How gametes, such as eggs and sperm, are produced through meiosis.
  • How sexual reproduction works on the cellular level and how it increases genetic variation.
  • Simple Mendelian and more complex non-Mendelian inheritance of some human traits.
  • Human genetic disorders, such as Down syndrome, hemophilia A, and disorders involving sex chromosomes.
  • How biotechnology — which is the use of technology to alter the genetic makeup of organisms — is used in medicine and agriculture, how it works, and some of the ethical issues it may raise.
  • The human genome, how it was sequenced, and how it is contributing to discoveries in science and medicine.

As you read this chapter, keep Rebecca’s situation in mind and think about the following questions:

  • BRCA1 and BRCA2 encode proteins known as breast cancer type 1 and type 2 susceptibility proteins. What do the BRCA1 and BRCA2 genes normally do? How can they cause cancer?
  • Are BRCA1 and BRCA2 linked genes? Are they on autosomal or sex chromosomes?
  • After learning more about pedigrees, draw the pedigree for cancer in Rebecca’s family. Use the pedigree to help you think about why it is possible that her mother does not have one of the BRCA gene mutations, even if her grandmother, aunt, and uncle did have it.
  • Why do you think certain gene mutations are prevalent in certain ethnic groups?

Attributions

Figure 5.1.1

Family Tree [all individual face images] from Clker.com used and adapted by Christine Miller under a CC0 1.0 public domain dedication license (https://creativecommons.org/publicdomain/zero/1.0/).

Figure 5.1.2

Rebecca by Kyle Broad on Unsplash is used under the Unsplash License (https://unsplash.com/license).

Wikipedia contributors. (2020, June 27). Ashkenazi Jews. In Wikipedia. https://en.wikipedia.org/w/index.php?title=Ashkenazi_Jews&oldid=964691647

Wikipedia contributors. (2020, June 22). BRCA1. In Wikipedia. https://en.wikipedia.org/w/index.php?title=BRCA1&oldid=963868423

Wikipedia contributors. (2020, May 25). BRCA2. In Wikipedia. https://en.wikipedia.org/w/index.php?title=BRCA2&oldid=958722957

Human Biology Copyright © 2020 by Christine Miller is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License, except where otherwise noted.

American Society for Microbiology

Uncover interesting and unusual findings in the microbiology laboratory by browsing case studies, shared by your clinical and public health microbiology colleagues. Cases can be used as a teaching tool or to further your individual knowledge of the field.

Case Studies in Biology: Climate and Health Exploration Course

This is an online course offered in the summer and open to current high school students.

As scientists, we take a lot of STEM classes, including biology, chemistry, physics, and math. But we often don’t have time to connect all of this information together. That’s where case studies are so incredibly helpful, especially to organizations such as the CDC and the World Health Organization. This course will use real-world examples to help teach students about the scientific process and how theories and hypotheses are developed. Sometimes the answers aren’t clear, and even experts can’t agree. Using case studies focused on climate and its connection to health, we will analyze data and apply biology concepts to learn how to form a solid argument, supported by evidence from published research. This is your chance to learn how to conduct systematic literature reviews and meta-analyses to analyze scientific controversies and develop your own theories. Students with an interest in both biology and environmental science are welcome.

Prerequisite: none

Experience WashU From Home

Our online Exploration Courses are ideal for students looking for the flexibility of an online experience to build college readiness skills. Courses provide students an introduction to many of the majors, fields of study, and interdisciplinary programs offered by the College of Arts & Sciences. 

  • Perspective
  • Published: 13 May 2024

Integrating population genetics, stem cell biology and cellular genomics to study complex human diseases

  • Nona Farbehi (ORCID: orcid.org/0000-0001-8461-236X)
  • Drew R. Neavin (ORCID: orcid.org/0000-0002-1783-6491)
  • Anna S. E. Cuomo
  • Lorenz Studer (ORCID: orcid.org/0000-0003-0741-7987)
  • Daniel G. MacArthur
  • Joseph E. Powell (ORCID: orcid.org/0000-0002-5070-4124)

Nature Genetics volume 56, pages 758–766 (2024)

  • Population genetics
  • Transcriptomics

Human pluripotent stem (hPS) cells can, in theory, be differentiated into any cell type, making them a powerful in vitro model for human biology. Recent technological advances have facilitated large-scale hPS cell studies that allow investigation of the genetic regulation of molecular phenotypes and their contribution to high-order phenotypes such as human disease. Integrating hPS cells with single-cell sequencing makes identifying context-dependent genetic effects during cell development or upon experimental manipulation possible. Here we discuss how the intersection of stem cell biology, population genetics and cellular genomics can help resolve the functional consequences of human genetic variation. We examine the critical challenges of integrating these fields and approaches to scaling them cost-effectively and practically. We highlight two areas of human biology that can particularly benefit from population-scale hPS cell studies: elucidating mechanisms underlying complex disease risk loci and evaluating relationships between common genetic variation and pharmacotherapeutic phenotypes.


Iwata, R. et al. Mitochondria metabolism sets the species-specific tempo of neuronal development. Science 379 , eabn4705 (2023).

Miller, J. D. et al. Human iPSC-based modeling of late-onset disease via progerin-induced aging. Cell Stem Cell 13 , 691–705 (2013).

Hergenreder, E. et al. Combined small-molecule treatment accelerates maturation of human pluripotent stem cell-derived neurons. Nat. Biotechnol. https://doi.org/10.1038/s41587-023-02031-z (2024).

Fowler, J. L., Ang, L. T. & Loh, K. M. A critical look: challenges in differentiating human pluripotent stem cells into desired cell types and organoids. Wiley Interdiscip. Rev. Dev. Biol. 9 , e368 (2020).

Jiang, S., Feng, W., Chang, C. & Li, G. Modeling human heart development and congenital defects using organoids: how close are we? J. Cardiovasc. Dev. Dis. 9 , 125 (2022).

CAS   PubMed   PubMed Central   Google Scholar  

Tremmel, D. M. et al. Validating expression of beta cell maturation-associated genes in human pancreas development. Front. Cell Dev. Biol. 11 , 1103719 (2023).

Washer, S. J. et al. Single-cell transcriptomics defines an improved, validated monoculture protocol for differentiation of human iPSC to microglia. Sci. Rep. 12 , 19454 (2022).

Wolf, F. A., Angerer, P. & Theis, F. J. SCANPY: large-scale single-cell gene expression data analysis. Genome Biol. 19 , 15 (2018).

Wilson, S. B. et al. DevKidCC allows for robust classification and direct comparisons of kidney organoid datasets. Genome Med. 14 , 19 (2022).

Subramanian, A. et al. Single cell census of human kidney organoids shows reproducibility and diminished off-target cells after transplantation. Nat. Commun. 10 , 5462 (2019).

Kammers, K. et al. Gene and protein expression in human megakaryocytes derived from induced pluripotent stem cells. J. Thromb. Haemost. 19 , 1783–1799 (2021).

De Sousa, P. A. et al. Rapid establishment of the European Bank for induced Pluripotent Stem Cells (EBiSC)—the Hot Start experience. Stem Cell Res. 20 , 105–114 (2017).

Morrison, M. et al. StemBANCC: governing access to material and data in a large stem cell research consortium. Stem Cell Rev. Rep. 11 , 681–687 (2015).

The GTEx Consortium The GTEx Consortium atlas of genetic regulatory effects across human tissues. Science 369 , 1318–1330 (2020).

Article   PubMed Central   Google Scholar  

Mitchell, J. M., Nemesh, J., Ghosh, S. & Handsaker, R. E. Mapping genetic effects on cellular phenotypes with ‘cell villages’. Preprint at bioRxiv https://doi.org/10.1101/2020.06.29.174383 (2020).

Neavin, D. R. et al. A village in a dish model system for population-scale hiPSC studies. Nat. Commun. 14 , 3240 (2023).

Kang, H. M. et al. Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat. Biotechnol. 36 , 89–94 (2018).

Wells, M. F. et al. Natural variation in gene expression and viral susceptibility revealed by neural progenitor cell villages. Cell Stem Cell 30 , 312–332 (2023).

Neavin, D. et al. Demuxafy : improvement in droplet assignment by integrating multiple single-cell demultiplexing and doublet detection methods. Genome Biol. 25 , 94 (2024).

Xu, J. et al. Genotype-free demultiplexing of pooled single-cell RNA-seq. Genome Biol. 20 , 290 (2019).

Heaton, H. et al. Souporcell: robust clustering of single-cell RNA-seq data by genotype without reference genotypes. Nat. Methods 17 , 615–620 (2020).

Huang, Y., McCarthy, D. J. & Stegle, O. Vireo: Bayesian demultiplexing of pooled single-cell RNA-seq data without genotype reference. Genome Biol. 20 , 273 (2019).

Hindson, B. J. et al. High-throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal. Chem. 83 , 8604–8610 (2011).

Dong, X. et al. powerEQTL: an R package and shiny application for sample size and power calculation of bulk tissue and single-cell eQTL analysis. Bioinformatics https://doi.org/10.1093/bioinformatics/btab385 (2021).

Schmid, K. T. et al. scPower accelerates and optimizes the design of multi-sample single cell transcriptomic studies. Nat. Commun. 12 , 6625 (2021).

Camp, J. G., Platt, R. & Treutlein, B. Mapping human cell phenotypes to genotypes with single-cell genomics. Science 365 , 1401–1405 (2019).

Datlinger, P. et al. Pooled CRISPR screening with single-cell transcriptome readout. Nat. Methods 14 , 297–301 (2017).

Dixit, A. et al. Perturb-Seq: dissecting molecular circuits with scalable single-cell RNA profiling of pooled genetic screens. Cell 167 , 1853–1866 (2016).

Rubin, A. J. et al. Coupled single-cell CRISPR screening and epigenomic profiling reveals causal gene regulatory networks. Cell 176 , 361–376 (2019).

Schraivogel, D. et al. Targeted Perturb-seq enables genome-scale genetic screens in single cells. Nat. Methods 17 , 629–635 (2020).

Download references

Acknowledgements

Figures were generated with BioRender.com and further developed by A. Garcia, a scientific illustrator from Bio-Graphics. This research was supported by a National Health and Medical Research Council (NHMRC) Investigator grant (J.E.P., 1175781), research grants from the Australian Research Council (ARC) Special Research Initiative in Stem Cell Science, an ARC Discovery Project (190100825), an EMBO Postdoctoral Fellowship (A.S.E.C.) and an Aligning Science Across Parkinson’s Grant (J.E.P., N.F., D.R.N. and L.S.). J.E.P. is supported by a Fok Family Fellowship.

Author information

These authors contributed equally: Nona Farbehi, Drew R. Neavin.

Authors and Affiliations

Garvan Weizmann Center for Cellular Genomics, Garvan Institute of Medical Research, Sydney, New South Wales, Australia

Nona Farbehi, Drew R. Neavin, Anna S. E. Cuomo & Joseph E. Powell

Graduate School of Biomedical Engineering, University of New South Wales, Sydney, New South Wales, Australia

Nona Farbehi

Aligning Science Across Parkinson’s Collaborative Research Network, Chevy Chase, MD, USA

Nona Farbehi, Lorenz Studer & Joseph E. Powell

Centre for Population Genomics, Garvan Institute of Medical Research, University of New South Wales, Sydney, New South Wales, Australia

Anna S. E. Cuomo & Daniel G. MacArthur

The Center for Stem Cell Biology and Developmental Biology Program, Sloan-Kettering Institute for Cancer Research, New York, NY, USA

Lorenz Studer

Centre for Population Genomics, Murdoch Children’s Research Institute, Melbourne, Victoria, Australia

Daniel G. MacArthur

UNSW Cellular Genomics Futures Institute, University of New South Wales, Sydney, New South Wales, Australia

Joseph E. Powell

Contributions

All authors conceived the topic and wrote and revised the manuscript.

Corresponding author

Correspondence to Joseph E. Powell.

Ethics declarations

Competing interests.

D.G.M. is a founder with equity in Goldfinch Bio, is a paid advisor to GSK, Insitro, Third Rock Ventures and Foresite Labs, and has received research support from AbbVie, Astellas, Biogen, BioMarin, Eisai, Merck, Pfizer and Sanofi-Genzyme; none of these activities is related to the work presented here. J.E.P. is a founder with equity in Celltellus Laboratory and has received research support from Illumina. The other authors declare no conflict of interest.

Peer review

Peer review information.

Nature Genetics thanks Kelly Frazer, Gosia Trynka and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Table 1.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article.

Farbehi, N., Neavin, D.R., Cuomo, A.S.E. et al. Integrating population genetics, stem cell biology and cellular genomics to study complex human diseases. Nat Genet 56 , 758–766 (2024). https://doi.org/10.1038/s41588-024-01731-9

Received: 24 January 2023

Accepted: 20 March 2024

Published: 13 May 2024

Issue Date: May 2024

DOI: https://doi.org/10.1038/s41588-024-01731-9


CBE Life Sci Educ. v.15(4); Winter 2016

A Case Study Documenting the Process by Which Biology Instructors Transition from Teacher-Centered to Learner-Centered Teaching

Gili Marbach-Ad

† College of Computer, Mathematical and Natural Sciences, University of Maryland, College Park, MD 20742

Carly Hunt Rietschel

‡ College of Education, University of Maryland, College Park, MD 20742

A case study approach was used to obtain an in-depth understanding of the change process of two university instructors who were involved with redesigning a biology course to implement learner-centered teaching. Implications for instructors wishing to transform their teaching and for administrators who wish to support them are provided.

In this study, we used a case study approach to obtain an in-depth understanding of the change process of two university instructors who were involved with redesigning a biology course. Given the hesitancy of many biology instructors to adopt evidence-based, learner-centered teaching methods, there is a critical need to understand how biology instructors transition from teacher-centered (i.e., lecture-based) instruction to teaching that focuses on the students. Using the innovation-decision model for change, we explored the motivation, decision-making, and reflective processes of the two instructors through two consecutive, large-enrollment biology course offerings. Our data reveal that the change process is somewhat unpredictable, requiring patience and persistence during inevitable challenges that arise for instructors and students. For example, the change process requires instructors to adopt a teacher-facilitator role as opposed to an expert role, to cover fewer course topics in greater depth, and to give students a degree of control over their own learning. Students must adjust to taking responsibility for their own learning, working collaboratively, and relinquishing the anonymity afforded by lecture-based teaching. We suggest implications for instructors wishing to change their teaching and administrators wishing to encourage adoption of learner-centered teaching at their institutions.

“This is the analogy I thought of, the first semester was where you drop a ball on a hard floor, and at first it bounces really high, then the next bounce is a little lower, hopefully it’s going to be a dampened thing, where we make fewer and fewer changes.” (Alex)

“It seems to take a village to send a course in a new direction!!” (Julie)

INTRODUCTION

This study documents the process by which instructors transition from teacher-centered instruction to emphasizing learner-centered teaching in an introductory biology course. Weimer (2013) defines teacher-centered instruction as lecture-based teaching wherein students are “passive recipients of knowledge” (p. 64). She characterizes learner-centered teaching as “teaching focused on learning—what the students are doing is the central concern of the teacher” (p. 15). Weimer delineates five principles of learner-centered teaching, which are 1) to engage students in their learning, 2) to motivate and empower students by providing them some control over their own learning, 3) to encourage collaboration and foster a learning community, 4) to guide students to reflect on what and how they learn, and 5) to explicitly teach students skills on how to learn. Of note, various terms are used in the literature to refer to strategies that are related to learner-centered teaching (e.g., active learning, student-centered teaching).

The literature suggests that teacher-centered instruction, as opposed to learner-centered teaching, promotes memorization (Hammer, 1994) rather than desired competencies like knowledge application, conceptual understanding, and critical thinking emphasized in national reports (American Association for the Advancement of Science [AAAS], 2011). Further, lecture-based teaching fails to promote understanding of the collaborative, interdisciplinary nature of scientific inquiry (Handelsman et al., 2007). Notably, female and minority students have expressed feelings of alienation and disenfranchisement in classrooms using teacher-centered instruction (Okebukola, 1986; Seymour and Hewitt, 1997).

A recommended practice that can support implementation of learner-centered teaching is the use of backward design (Wiggins and McTighe, 2005). The backward design model involves articulating learning goals, designing an assessment that measures achievement of those goals, and developing activities that are aligned with the assessment and the learning goals.

Despite robust evidence documenting the superiority of learner-centered teaching over teacher-centered instruction (as reviewed by Freeman et al., 2014), instructors continue to adhere to teacher-centered instruction. A recent study showed that the majority of faculty members participating in professional development programs designed to help them adopt learner-centered teaching practices continue to rely on lecture-based pedagogy, as indicated by classroom observational data (Ebert-May et al., 2011). Possible reasons for such loyalty to lecturing include the following: 1) instructors’ own personal experiences with lecture as undergraduates (Baldwin, 2009); 2) personal beliefs that transmission of knowledge to students through lecture is the best way to teach (Wieman et al., 2010); 3) the perception that lecture preparation is more time-effective than preparing learner-centered activities (Dancy and Henderson, 2010); 4) student resistance to active learning (Henderson and Dancy, 2007; Seidel and Tanner, 2013; Bourrie et al., 2014); 5) the initial difficulties often encountered when transitioning to learner-centered teaching, which can require several iterations to perfect a new teaching style; 6) learner-centered teaching encourages instructors to cover fewer topics in greater depth to promote meaningful learning (Weimer, 2013), and many instructors are uncomfortable with such loss of content coverage (Fink, 2013); and 7) the learner-centered instructor must change his/her role from an expert who delivers knowledge to a “teacher-facilitator,” giving a degree of control over the learning process to students, and many instructors are uncomfortable with the unpredictability and vulnerability that come with relinquishing control in the classroom (Weimer, 2013). Further, universities oftentimes fail to incentivize and encourage faculty members to prioritize teaching to a similar degree as research (Fairweather et al., 1996). It has been argued that the professional culture of science assigns higher status to research than to teaching, encouraging scientists to adopt a professional identity based on research that typically ignores teaching (Brownell and Tanner, 2012).

Given that many instructors face challenges and intimidation while implementing learner-centered teaching in their classrooms, there is a need to explore their experiences and learn what support instructors need as they engage in the process of transforming their courses. Science education researchers have recently emphasized the critical need “to better understand the process by which undergraduate biology instructors decide to incorporate active learning teaching strategies, sustain use of these strategies, and implement them in a way that improves student outcomes” (Andrews and Lemons, 2015, p. 1).

Case studies have been shown to be a useful tool for understanding change processes (Yin, 2003). A case study approach represents a qualitative method of inquiry that allows for in-depth description and understanding of the experience of one or more individuals (Creswell, 2003; Merriam, 2009). Yin (2003, p. 42) provides a rationale for using single, longitudinal case studies that document participants’ perspectives on two or more occasions to show how conditions and processes change over time. In this study, we used a case study approach to obtain an in-depth understanding of the change process of two university instructors (Julie and Alex) who were involved with redesigning a biology course. The instructors sought to transform the course from a teacher-centered, lecture-style class to one that incorporated learner-centered teaching. We interviewed the two instructors on multiple occasions; we also interviewed a graduate teaching assistant (GTA) and an undergraduate learning assistant (ULA) to gain their perspectives on teaching the course. We explored the motivation, challenges, and thought processes of the instructors during the interviews. We used several data sources in addition to the interviews to build the case study, including class observations by external observers and student feedback data.

Given that faculty members have difficulty changing their teaching, there are recommendations to use theoretical models of change to examine processes of change (Connolly and Seymour, 2015). We looked for theoretical models of change (Ellsworth, 2000; Rogers, 2003; Kezar et al., 2015) and found that the innovation-decision model (Rogers, 2003) has recently been used by science education researchers (Henderson, 2005; Bourrie et al., 2014; Andrews and Lemons, 2015). Therefore, we decided to use this model to theoretically approach our data. Specifically, we decided to use the adapted model developed by Andrews and Lemons (2015), which they modified to represent the change process that biology instructors experience when redesigning a course. This model includes the following stages: 1) knowledge, in which the instructor learns about the innovation and how it functions; 2) persuasion/decision, in which the instructor develops an attitude, positive or negative, toward the innovation and decides whether or not to adopt the innovation; 3) implementation, when the instructor behaviorally implements the innovation; and 4) reflection, in which the instructor considers the benefits and challenges of using the innovation. On the basis of reflection, an instructor decides to stay with the present version of the implementation or to start the process once again in an iterative manner by seeking new knowledge (see Figure 1). According to Rogers (2003), a condition to begin the change process is that an instructor must be dissatisfied with his or her current teaching approach. Such dissatisfaction is one contributing factor leading an instructor to begin seeking new knowledge about new teaching strategies. Other external and internal factors usually influence an instructor’s decision to change his or her teaching, including release time, institutional commitment, and instructor attitude (Andrews and Lemons, 2015).

Figure 1. Innovation-decision model adapted from Rogers (2003), Henderson (2005), and Andrews and Lemons (2015).

Context of the Study

This study was conducted at a research-intensive university on the East Coast of the United States. The instructors cotaught Principles of Biology III: Organismal Biology (BSCI207). BSCI207 follows two prerequisite courses, BSCI105 and BSCI106. BSCI105 covers molecular and cellular biology, while BSCI106 covers ecology, evolution, and diversity. BSCI207 requires students to synthesize concepts and principles taught in prerequisite courses, apply them across contexts in biology, and generally engage in higher-order learning (e.g., interdisciplinarity, conceptual understanding, quantitative reasoning). The course enrolls between 100 and 200 students per semester.

In Fall 2013, the provost’s office distributed a call for grant proposals encouraging instructors to redesign their courses to incorporate evidence-based teaching approaches. The call specifically required applicants to design experimental studies to evaluate their course redesign approaches in comparison with their usual teaching approaches. Julie and Alex applied for the grant and were funded. Their proposed evidence-based teaching approach was to incorporate a series of small-group active-engagement (GAE) exercises throughout the semester. The traditional section would retain the usual three 50-minute lectures per week schedule. The experimental section would replace one 50-minute lecture with a shortened 20-minute lecture followed by a 30-minute GAE exercise with content matched to the traditional class occurring that day.

The instructors designed the GAEs to accomplish a series of learning goals that were consistent with Weimer’s five principles of learner-centered teaching. For example, one of the GAE goals was to foster collaboration among students in order to mimic the scientific process of inquiry. This goal was in accord with Weimer’s (2013) learner-centered teaching principle of collaboration, creating a learning community with a shared learning agenda, and modeling how experts learn. To accomplish this goal, the instructors implemented the GAEs in a small-group setting and required students to exchange ideas and achieve consensus on a single worksheet.

A second goal, which accords with Weimer’s (2013) framework, was to engage students in their learning and motivate them to take responsibility and control over their learning process. For example, one of the GAEs asked students to complete a humorous, fictional case study involving a spaceship captain and deadly neurotoxins. In this activity, students needed to use mathematical equations to calculate membrane potentials and to create simulations of conditions that impact membrane potential. Another activity was to collaboratively create a plot of ion transport rate versus concentration. Students were given a computer simulation that they used to generate data; they then entered the data into a Google documents Excel spreadsheet. This created a classroom database that was used to build the plot, which the instructor displayed using the lecture hall projector at the end of class. This activity involved multiple components of learner-centered teaching, including collaboration, student engagement, and student responsibility for learning. Detailed descriptions of a selection of GAEs are published elsewhere (Carleton et al., in press, 2017; Haag and Marbach-Ad, in press, 2017).
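To make the membrane potential exercise concrete, the following is a minimal sketch of the kind of calculation students might perform. The study does not say which equations the GAE used, so the choice of the Nernst equation is an assumption, and the function name and ion concentrations are hypothetical, textbook-style values rather than figures from the course.

```python
import math

R = 8.314462618   # gas constant, J/(mol*K)
F = 96485.33212   # Faraday constant, C/mol


def nernst_potential_mv(conc_out_mm, conc_in_mm, valence=1, temp_c=37.0):
    """Equilibrium potential (mV) of one ion species via the Nernst equation.

    Hypothetical helper for illustration; not taken from the course materials.
    """
    if conc_out_mm <= 0 or conc_in_mm <= 0:
        raise ValueError("concentrations must be positive")
    if valence == 0:
        raise ValueError("valence must be nonzero")
    temp_k = temp_c + 273.15
    # E = (R*T / (z*F)) * ln([ion]_out / [ion]_in), converted from V to mV
    volts = (R * temp_k) / (valence * F) * math.log(conc_out_mm / conc_in_mm)
    return volts * 1000.0


# Illustrative concentrations in mM (assumed values, not data from the study)
e_k = nernst_potential_mv(conc_out_mm=5.0, conc_in_mm=140.0)    # potassium
e_na = nernst_potential_mv(conc_out_mm=145.0, conc_in_mm=12.0)  # sodium
print(f"E_K  = {e_k:.1f} mV")
print(f"E_Na = {e_na:.1f} mV")
```

Changing the outside concentration and rerunning the calculation is one simple way to simulate a condition that shifts the equilibrium potential, which is the sort of exploration the activity describes.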

The provost grant offered funding that could be used for various purposes. The instructors decided to use the funding for summer salary to develop GAEs and to pay for support from a science education expert. Grant awardees were required to participate in Faculty Learning Communities (FLCs) and teaching workshops arranged by the campus teaching and learning center.

In Fall 2014, the instructors started to implement their experiment. Jeffrey, a third instructor, joined Alex and Julie to teach both sections; each of them was responsible for teaching several topics associated with their specific research expertise. In the GAE class, students were divided into small groups to complete a learning activity pertaining to the course topic. In total, 12 GAE sessions were held during the semester. Both GAE and traditional classes were taught in large auditoriums. For each GAE session, students self-selected into groups of three to five students. Four GTAs circulated among the groups to facilitate group work. Students were asked to leave empty rows around their respective groups to allow GTAs to move throughout the groups. This same topic was covered only by lecture format in the traditional class.

In Fall 2015, the instructors no longer conducted a comparative experiment while teaching. Julie and Alex continued to coteach the course with the GAE format with many modifications to the activities and other aspects of the course (see Results). Jeffrey continued to teach a different section of the course independently. Henceforth, we will describe the experience of Julie and Alex in their process of transforming the course.

Teaching Staff

Julie and Alex are associate professors. Lisa is a doctoral-level teaching assistant (TA) in the biology department. Lisa was a GTA in the Fall 2014 and Fall 2015 semesters. Jason was a freshman student in the GAE section of the Fall 2014 semester. In Fall 2015, Jason served as a guided study session (GSS) peer leader in BSCI207. GSS leaders are students who have taken a course on implementing evidence-based teaching approaches, and who have also completed the course they are tutoring with a high grade. GSS students are expected to facilitate small-group discussions outside class. Jason also volunteered to attend all GAE sessions to help facilitate.

Data Collection Instruments

Yin (2003) notes that multiple data sources are important in building case studies. As such, we use interview data, class observations, student feedback on the course, and information written in the grant proposal.

Interview Protocol.

Julie and Alex were interviewed independently immediately following Fall 2014 for 20 minutes each. Julie was also interviewed independently in the beginning of Fall 2015 for 1 hour. Julie and Alex were interviewed together immediately following Fall 2015 for approximately 1 hour. Lisa and Jason were also interviewed following Fall 2015 for 20–30 minutes each. We used semistructured interview protocols (see the Supplemental Material) with additional questions to probe for clarification. The questions probed participants’ motivation for change, attitudes toward change, barriers and challenges, administrative supports, details about the implementation, and teaching philosophies.

Class Observations.

Two independent raters conducted class observations. Each year, raters attended six classes. In Fall 2014, they observed GAE class sessions and the parallel, content-matched class sessions that took place in the traditional class (overall 12 sessions). This procedure allowed the raters to compare the class sessions covering the same material but with differing teaching approaches (i.e., learner-centered vs. teacher-centered instruction). The two raters attended each class session together. Once in the class, the raters used a rubric to evaluate the class. In Fall 2014, raters used a rubric based on a previously constructed rubric that was created by the biology department for peer observations (http://extras.springer.com/2015/978-3-319-01651-1, in SM-Evaluation of teaching performance.pdf). In Fall 2015, to better document group work, the raters used the rubric developed by Shekhar and colleagues (2015).

Student Feedback.

Students were invited to reflect on GAEs by providing anonymous written feedback on note cards following the activity. We use some of these data in the present study.

Data Analysis

Interviews were conducted by a science education researcher, audiotaped, and transcribed. A science education researcher and a doctoral student in counseling psychology separately analyzed the interviews and the note cards to define emergent themes. Then, they negotiated the findings until they could agree upon the themes (Maykut and Morehouse, 1994). The instructors were shown the interpretation of data to verify accuracy of interpretations. We present the results in accordance with the adapted Rogers (2003) model presented in Andrews and Lemons (2015). We slightly adapted the Andrews and Lemons (2015) model to the iterative process through which our instructors progressed to modify the course (see Figure 1).

Motivation for Change

Before 2014, the traditional BSCI207 class as taught was a three-credit course with three 50-minute lectures per week. Alex described the traditional course:

Before the GAEs came into being, we taught in the very standard, traditional lecture. We used mostly PowerPoint to show text and images, occasionally we would bring a prop in, like sometimes I would bring a piece of a tree to gesture towards as I was lecturing about water transport or something like that. But it was basically standard lecture.

The instructors were dissatisfied with the traditional lecture format for the following reasons:

  • Evidence for inferiority of teacher-centered instruction compared with learner-centered teaching. The instructors expressed awareness of the empirical data documenting the superiority of learner-centered teaching over teacher-centered instruction: “There’s a lot of research that suggests that [teacher-centered instruction] may not be the best way to help the students understand what we’re trying to get them to understand” (Alex).
  • Lecture hinders understanding of the process of science. The instructors also expressed a desire to get students to learn the process of science early in their education, rather than to passively receive information. “We are being asked as science professors more and more to try and get our students to understand that science is a process, earlier and earlier in their career, and to model what real science is like in their education” (Alex).
  • Lecture promotes overreliance on memorization. The instructors discussed a goal to modify the course so as to decrease focus on memorization and increase emphasis on problem solving and conceptual understanding. Julie described: “BSCI207 is the biology majors’ class, and it’s a lot of what the pre meds are taking, and so, critical thinking I think [is important], we’re constantly trying to get them to not just memorize and regurgitate but to put the ideas together.”

We also rearranged the material. So they [the lectures] used to be in a taxonomic orientation, I would give a whole lecture titled the biology of fungi, and the students complained that this taxonomic focus seemed to resemble the structure of BSCI106 [the prerequisite course]. I decided to explode those taxonomic lectures, and take the bits of content that I still thought were valuable, and spread them into other parts. So for example the stuff on mating types, which is wacky and interesting to me, and I hope to the students, is now in a lecture on sex. And they don’t realize half the lecture is on fungi. So they’re susceptible to packaging I think, and we don’t get the complaint any more that the course is redundant to BSCI106 (Alex).

Organisms don’t care about our disciplinary boundaries of research. The organism doesn’t understand that there’s biophysics, and biochemistry, and evolutionary biology, and ecology, and genetics. All these attributes of their biology have to function simultaneously on several different spatial and temporal scales … if we think they do, then we continually miss things that otherwise would fall out naturally if we were a little less wedded to our disciplines.
Relatedly, the instructors noted that most students enrolled in BSCI207 without having taken introductory physics or chemistry, which they thought was preventing students from drawing upon highly relevant concepts (e.g., thermodynamics) from these courses for biology.
  • Underrepresented groups do poorly in traditional classes. The instructors quantitatively examined student performance for specific student subgroups (i.e., underrepresented minority students, female students) in previous BSCI207 semesters. They observed that underrepresented students received a disproportionate share of D/F/W grades. Coupled with the science education literature documenting the ability of active learning to help underrepresented groups (Preszler, 2009; Haak et al., 2011; Eddy and Hogan, 2014), this observation led the instructors to speculate that adding active learning to the traditional class might help underrepresented students.

In Fall 2014, the instructors went through the process of course revision that follows the adapted model by Rogers (2003) and Andrews and Lemons (2015; see Figure 1). In the following sections, we discuss their progression through the innovation-decision model. Table 1 shows a summary of the change process for the Fall 2014 semester.

Table 1. First iteration of the instructors’ change process

Before the Fall 2014 semester, the instructors engaged in several efforts to increase knowledge about evidence-based teaching approaches to modify the course. The knowledge sources were as follows:

I will go ask [physics education professional] questions. When something doesn’t go well I’ll meet with the postdocs [from physics education research group (PERG)] over there and say, what are they not getting here, how can we make this better, so I’m always trying to get resources to help.
  • Reading the science education literature. As a new instructor, Julie participated in the college workshop for new instructors. The workshop was led by the director of the teaching and learning center, who provided several resources for using evidence-based teaching approaches, including an article giving an overview of learning styles (Felder, 1993), a book on teaching tips (McKeachie and Svinicki, 2006), and the book Scientific Teaching (Handelsman et al., 2007). In her interview, Julie commented, “So I read a lot of books, … I think it was getting students to think about math, I read one of the books [that the director of the college teaching and learning center] had given me [Scientific Teaching].”
  • Observing other instructors teaching. The instructors had observed another instructor who implemented evidence-based teaching approaches in a small class of BSCI207 (<40 students). This pilot implementation was successful, and the instructors were interested in investigating whether the learner-centered teaching model used could be scaled up to a large-enrollment class.

Persuasion/Decision.

Following the knowledge-generation phase, the instructors felt prepared to change their teaching to a more learner-centered teaching style. They decided to conduct a comparative experiment during the first implementation of the GAEs (i.e., traditional vs. GAE classes; see Marbach-Ad et al., in press, 2017). Although the instructors were aware of the literature documenting the effectiveness of learner-centered teaching, they had several reasons to execute the experiment:

  • Obtain evidence for overall effectiveness. The instructors were unsure whether their activities were the best way to change the course (e.g., they were unsure of the challenges that would emerge, how the intervention would impact students). The instructors also wished to explore cost-effectiveness, since they knew that changing the course would require a high instructor time commitment.
  • Convince colleagues to adopt learner-centered teaching approaches. The instructors noted that faculty in the department were unconvinced of the superiority of learner-centered teaching approaches, and they thought that a comparison study bringing empirical evidence might demonstrate that changing one’s teaching style is worthwhile. Alex stated, “[A] lot of my motivation for this experiment was to try to provide some evidence that these approaches were worth the effort, and because there is resistance clearly, from some of our colleagues who have been teaching the course for a long time.”
  • Respond to grant award requirements. As mentioned earlier, the institution announced a call for proposals for instructors to revise their teaching. The instructions required applicants to propose comparative experiments during course revision to document effectiveness.

Implementation.

As proposed in the provost grant application, the instructors executed the comparison study. In the traditional class, instructors delivered a 50-minute lecture three times per week. In the GAE class, one lecture was replaced with a GAE. The GAE consisted of a brief 20-minute introductory lecture (a short version of the lecture presented to traditional class students) and a 30-minute group activity. As scientists, the instructors wished to manipulate the addition of the GAE day only and to keep remaining variables constant across classes. Therefore, homework assignments, examinations, optional computer tutorials, and office hours availability were consistent in both classes (see Table 2).

Table 2. Fall 2014 class comparison

In the GAE class, on the day of the GAEs, students were instructed to sit in groups of three to five students (of their own choosing) and to leave empty rows between groups. Students were asked to have at least one laptop per group. As discussed previously, the GAEs were designed to be more learner centered relative to traditional lecture classes. To illustrate, we give Alex’s description of the membrane transport GAE: “The students had a little computer simulation, and they used that to generate data that they then entered into a Google docs spreadsheet in real time in the class, and there were enough students in the class that their responses produced this beautiful textbook plot of transport rate versus concentration. They built that relationship in a way that otherwise I would have just told them.”

Reflection.

Following the Fall 2014 semester, the instructors reflected on the various pros and cons of the learner-centered teaching intervention in the interviews. Observers and students also provided feedback that was used by the instructors to reflect on both sections of the course and on the comparative experiment. Several themes emerged from these data:

It’s much less about my spouting facts, it’s about my thinking ahead of time to get them to draw conclusions and get them to cement ideas. My role was partly just to control the chaos sometimes, and to control that the TAs had the information they needed so they could provide guidance to the students.

Importantly, observers noted that the instructors were very actively engaged with student groups throughout the GAEs, helping students to work through problems and understand concepts. Julie also commented that teaching with GAEs requires greater proficiency with material than lecturing: “To use these activities, you have to know the material better than if you’re going to straight lecture. And I think some instructors are maybe still learning BSCI207, what is all the material in it. And until you teach it straight a couple of times you probably don’t have the background to really understand.”

We spent less time talking about dating the origins of life using various methods (fossil record, carbon dating); we got rid of a lecture on prokaryotes and had to shrink some of the nutrient assimilation information from two lectures to one.

The instructors explained that, in order to minimize loss of content coverage, they decided to have a GAE class only once per week and to pick GAEs corresponding to lecture topics for which “there was the least amount of lost material by focusing on a particular exercise” (Alex). An additional solution was to move in-class lectures to online, preclass lectures. Julie described this change: “We also ask students to review some of the material that is lost during lecture time into the prep slides they review ahead of time.” However, Julie wondered whether students would benefit from online lectures to the same degree as in-person lectures: “I am still worried they don’t get so much out of those [online lectures] and so miss much of that information.”

  • Engagement in learning. Overall, the instructors reflected that most GAEs provided a space for students to interact with one another, TAs, and instructors. Julie added, “I think it was nice to see the energy in the class and the way the students took to the activities, it was different for them.” Observers noted that the GAE class treatment condition was usually associated with increased student interactivity. Specifically, they noted that students in the GAE class were not only more engaged in the GAEs, but that they also tended to raise more questions during the PowerPoint presentations relative to students in the traditional class. Students reflected on their note cards following GAEs, and in the end-of-semester survey, noting that they felt that many of the GAEs were engaging (see Marbach-Ad et al., in press, 2017).
  • Giving students control over learning. The instructors noted, “The GAEs represented a chance to turn the class over to the students for some part of the time, where they could do something actively, instead of just sitting there listening to us” (Alex).
It’s actually a bit more how real science works, right, even as somebody who runs a lab, I don’t go into my lab and sit there and talk to my graduate students for four hours, I mean we have a brief conversation about how they should tackle something, and then they go off and work more on it. So it’s more of a checking in and then separating again. That’s kind of how this class works, the GAEs do give the students a little more of a feel of how collaborative real science works, and how no one person is sort of dictating everything, everyone needs to be a bit independent. … I think that this active model gives the students, for the first time, a real taste of how a real scientist would approach a problem.

Students commented on the opportunity afforded by GAEs to take an active role in their learning: “I learned how to apply what we learn in lecture class to actual problems”; “I kind of felt like a real scientist since I was put in a situation in which I had to make a hypothesis myself.”

  • Disengagement. The instructors noted that, for some GAEs, students were disengaged. For example, in the GAE on stress and strain, two students were doing measurements in front of the class for 10–15 minutes, and the remaining students were instructed to input data into Excel files. These data were then used to make calculations. Students also expressed their dissatisfaction with this activity on the note cards that they handed in to the instructors: “I feel I understood the concept well once Dr. Julie wrote the plots on the board. This activity was more tedious and like busy work”; “We could have easily compared values without experimentally finding them. I didn’t feel this deepened my understanding of concepts.”
  • Insufficient time for reflection. The instructors noted that most exercises were too long, which did not leave sufficient time for reflection. Alex noted, “Well I think also making sure that if we get the exercise done in the right, short amount of time, then that does give us time to add a reflection at the end. Connecting the results of our exercise back to some larger idea.”
  • Student preparation. The instructors felt that students would gain more from the exercise if they were to come to GAE classes with a better understanding of concepts relevant to the GAE. Then, more time could also be allotted for summary and reflection on important concepts. Alex commented, “We probably will need the students to do a bit of preparation before they come in to these active exercises, so that we can spend less time setting it up, and more time summing it up.”
  • Assessments and grading misaligned with GAEs. In this implementation, instructors kept the same assessment plan for both the traditional class and the GAE class in order to compare achievement across classes. This resulted in a mismatch between the course activities and the assessments in the GAE section. For example, there were no final examination questions specifically covering GAE material. Of note, the instructors analyzed their final examination questions before conducting the experiment and saw that the questions required students to demonstrate high levels of thinking (Bloom and Krathwohl, 1956; e.g., knowledge application, quantitative analysis), and they believed the GAEs would improve students’ abilities in these areas. Further, the instructors did not count GAE participation toward final grades, which instructors and observers believed had a detrimental effect on GAE attendance. Julie noted that “on the GAE days, only 60% of the students would come. That was partly because they wouldn’t get any credit for it, and they weren’t seeing that it was helping them learn the material better.” Analyses showed that students with higher grade point averages (GPAs) were those who chose to attend on the GAE days (see Marbach-Ad et al., in press, 2017). Given this, the instructors felt that attendance should be incentivized in future implementations of the learner-centered teaching intervention to motivate and benefit a wider range of students.
  • Resistance to learner-centered activities. The instructors felt that students’ low attendance specifically on GAE days may also have been because the students did not perceive the benefit of GAEs for their learning. “I feel sort of parental here, maybe the GAEs are like broccoli and brussels sprouts, they need them, they just don’t know it yet” (Alex).
  • Group dysfunction. The instructors and observers noted several issues with the groups. Some groups were not engaged, and some students were not participating within their groups (e.g., one student would be left out). In some activities, some groups would finish the activity very quickly and would then appear bored while waiting for further summary or instruction. Julie was frustrated with these occurrences and noted, “People would be sitting there on their phones.” One reason for student disengagement could be that student groups were unassigned and could include different students each week: students “would sit and associate with whoever was around them” (Julie).
  • Auditorium-setting challenges. The instructors commented on the difficulty of doing GAEs in the large auditorium: “It’s still tricky to think about how you actually stage all of this, there is a bit of theater to running a large class with 200 students, how you move from one aspect of the process to another [lecture to group activities] quickly, without losing people, without too much noise and disturbance” (Alex).
  • Little impact on grade distributions. Alex and Julie were hopeful that the GAEs would lead to large improvements in students’ grades as compared with traditional learning. However, the effect of GAEs was very small. Alex commented, “This was the biggest outcome from my perspective, and it drove much of the revisions for 2015. This is interesting, as it shows that even though we were unable to realize a big payoff in the first year, we nevertheless saw something that we thought was worth keeping and hopefully improving upon.”
  • TA training required. The instructors reflected that they did not provide adequate TA preparation for the GAEs: “We hadn’t really prepared the GAEs enough ahead of time so that we could talk about them with the TAs. The TAs at times were really clueless about what was supposed to be happening” (Julie). TAs, although instructed to guide and facilitate groups, apparently lacked the skills to engage students, as observers noted that most of them passively waited for students to ask questions rather than actively approaching students with questions, instructions, etc.

On the basis of their reflection, Julie and Alex decided to continue teaching with GAEs and to seek new knowledge to improve GAEs. In the following sections, we discuss their continued progression through the innovation-decision model (see Figure 1). A summary of the change process in Fall 2015 is shown in Table 3.

Table 3. Second iteration of the instructors’ change process

  • Learn about methods to form successful groups. The instructors reviewed the literature and consulted with the director of the teaching and learning center and other faculty members in the department to form new strategies on building effective groups in auditorium settings. The literature shows that groups work best when they are permanent and students are held accountable to other group members (Michaelsen and Black, 1994; Michaelsen et al., 2004, 2008). The literature also shows that taking student diversity into account is important in creating successful groups (Watson et al., 1993). For example, Watson and colleagues (1993) reported that, although it takes time, heterogeneous groups outperformed homogeneous groups on several performance measures, including generating perspectives and alternative solutions. The instructors also learned from the director of the teaching and learning center about the Pogil method (pogil.org), in which students are assigned different roles during group work (e.g., recorder, facilitator). They weighed the pros and cons of implementing this method in the classroom.
  • Learn about methods to flip courses. The instructors learned from models of flipped classes (Hamdan et al., 2013; Jensen et al., 2015), which highlight how to capitalize on out-of-class time to cover material to prepare for face-to-face active learning. In this regard, instructors sought assistance from the information technology office about presentation software (i.e., Camtasia) that can deliver automated lectures effectively.
  • Seek expert guidance. During the summer, the instructors again consulted with science education experts to enhance the GAEs. For example, they consulted with a science education expert on how to revise the concept map assignment. Julie described how this guidance helped her “leave the activity a bit more free form and get the students to make a graphic organizer of their own design rather than trying to fill in some pre-designed boxes.” As another example, the science educator recommended strategies about how to streamline GAEs to maximize time spent on developing conceptual understanding and minimize time spent on the mechanics of exercises.
  • Learn about strategies to enhance TA support. The instructors wished to decrease the student-to-TA ratio. However, GTAs require departmental funding, which was unavailable. The teaching and learning center director and the biological sciences administration offered to involve ULAs, who are unpaid but receive alternative benefits such as leadership and teaching experience and undergraduate course credit. This model was reported to be successful at our university (Schalk et al., 2009) and at other institutions (Otero et al., 2010).

Following reflection on the comparative experiment, instructors sought to keep improving the course and decided to make several changes:

  • Teach all sections with learner-centered teaching. Although the instructors reported that keeping the GAE class format requires more time to prepare relative to lecturing and takes time from their research (“fine tuning the GAEs—that took weeks” [Julie]), they decided to implement the GAEs in all sections and to work to improve them.
[Last semester] I had a couple of students up front doing the experiment, and everyone else was kind of twiddling their thumbs while we gathered the data. We talked about the data but we didn’t really have time [to do data analysis and summarize concepts]. I think this year I’m just going to give them last year’s data, and have each group do some analysis.

As another way to modify GAEs, instructors decided to utilize more outside resources such as published, case-based activities. Julie described, “I’d love to come up with some more case studies that we could do. You know the Buffalo site [http://sciencecases.lib.buffalo.edu/cs/collection] has all the case studies for all the science classes. So I’m constantly perusing that. A couple of the GAEs that I developed actually come from there.”

I haven’t figured out what the best prep work is. What Alex has been doing is taking the slides he showed last year and just posting them online. I’m not sure that’s the best, or really enough.… But, then they just read. I mean he tries to put more words on them. I tried to find some videos that I thought were appropriate, and I’m not sure that’s any better. I was going to do some of these with Camtasia. In this way you can actually have the slides and actually talk over them and record. But I couldn’t make the software work. I haven’t really gone there yet, I will have to figure that out.

To encourage students to prepare for the GAEs, the instructors decided to give a preclass quiz covering the out-of-class preparatory materials. Julie described,

We’re also doing a quiz this time, we’re giving that preparatory information, they have to have done it by the morning before, they have to take a little 2-point quiz [before class] to show that they’ve covered that material. Then we have the whole class time [for the GAE] so that we’re not so rushed in trying to do too many things at one time.
  • Train the TAs better, add ULAs, and involve both teams in the process of GAE development. The instructors decided to expand the team of assistants to decrease the ratio between students and TAs. Julie described the change from Fall 2014 to Fall 2015: “We have a bigger team. We have two of these ULAs, and then we have three UTAs, and two GTAs. So a team of seven helpers, and each person has a different job. The ULAs are specifically supposed to be trying out the GAEs ahead of time. So we kind of run things past them. And then we meet with all the TAs, and then talk through the GAEs beforehand. They have an assigned part of the class, where each of them is hopefully seeing the same students over and over, and hopefully getting to work with them to develop a rapport, and they go in the middle of the activity, so kind of checking in, so what do you think, kind of getting students to verbalize.” The benefit of this new format, where each TA was responsible for a subsection of the large class, was that it approximated a smaller class discussion session in which students could get to know their TAs more personally.
  • Revise group structure. On the basis of the literature and their previous experiences, the instructors decided to assign permanent, diverse groups of four at the beginning of the semester. They also decided to instruct students on how to sit in the auditorium with their groups (in two rows rather than in a single line, to enhance group communication) and to award points for completing group work exercises.

In the Fall 2015 implementation, there were several changes to the course (for a comparison of 2014 and 2015 GAE classes, see Table 4).

Table 4. GAE class comparison between Fall 2014 and Fall 2015

  • Modify the activities. The instructors devoted a full weekly class period to the GAE instead of 30 minutes. On the basis of their experiences in the previous semester, they revised some GAEs and adapted them to the time frame. Although they had more time for the GAEs, they wished to make them more efficient and interactive: “I think we had to cut some, with the GAEs, because they were taking way too long, but I think in a few cases we simplified them, took out 1/3 of them or something” (Alex). Instead of the 20-minute pre-GAE lecture that was presented in the Fall 2014 implementation, students were asked to prepare for activities at home by watching videos, reviewing lecture slides, and reading textbook materials. In contrast with Fall 2014, the students were awarded three points for participating in the GAE activity and two points for completing a quiz covering preparatory materials that was due before the GAE class. The instructors wished to assign points to these activities in order to “really give them weight” (Julie). “[The activities] formed a large part of the exams as well. So making the activities more integral to the class was a big change” (Julie).
So the next time I drew a map in the room, [which showed] two students in the front, and two students in the back. When you have a very formal auditorium, you have to try and help them assort with each other and talk with each other. The other thing we did was, we were giving each group two copies of the assignment, so they didn’t each have one. So that kind of helped, that kind of had them sharing things.

Finally, TAs and ULAs were assigned to stay with one section of the lecture hall throughout the semester. Thus, TAs and ULAs developed a rapport with a large group of students throughout the semester and were able to learn their names, which facilitated communication.

  • Add more and better-trained TAs. Before Fall 2015, the instructors trained the TAs to better engage with student groups in class. Class observation data showed that, in Fall 2014, some TAs were lacking in their ability to engage actively with students. One observer described, “When I observed the classes last year [Fall 2014], they [TAs] were standing in the side [of the auditorium], and sometimes they got to students, but just students that raised their hands. They weren’t active. They were very passive, most of them, because they didn’t know what to do.” Following the implementation in Fall 2015, the observer noticed a change in TA involvement: “Now, it’s more about instruction, they circulate between groups and encourage them to ask questions, they encourage students that aren’t participating … it’s not enough to throw them [the TAs] in the classroom.”

Overall, instructors noticed improvements in the areas that they targeted to improve, and they also felt there were areas that they wished to continue improving.

  • Student preparation. Julie described that although new techniques were put in place to increase student preparation, students often seemed unprepared for the activities: “And my data for that is essentially for the first 20 minutes of the GAE they would spend saying, what are we doing? There was a lot of flailing. It took them a lot longer to get going with the GAE than I thought, and I’m not sure if that’s because the preparatory material is not really preparing them, or that they just took the online quiz and didn’t really go through the preparatory material.” Julie thought about changing the nature of the preparatory lectures, “I would still like to explore turning those into little online lectures rather than having them read the slides.”
  • Student attendance. Alex commented that the strategy of assigning points to participating in GAEs “made a big difference in attendance […] by incentivizing their attendance, at least on GAE days, they were coming.” The instructors commented that incentivizing participation in the GAEs and the preactivity quizzes increased the amount of student–instructor interaction regarding point grabbing. Alex stated, “The downside of associating points with everything is that I think we spent the largest fraction of our student interaction time dealing with the points related to the GAEs, excused absences, non-excused absences, anxiety about the points, I mean these are tiny amounts of points, but the students took it very seriously. But I think it was one of the top 3 issues that students came up with this semester.”
  • Mechanics of exercises. Julie was very frustrated with how students could not effectively operate Excel software: “And they still don’t know Excel. My biggest frustration was that I thought Excel would make their lives easier, and it made their lives harder. I’m almost ready to go back to pencil and paper, just to get them to plot things and think about things, because they’re not getting back to the scientific inquiry and hard thinking, they’re just so stuck in which box do I click.” Alex added, “My issue is that the preparatory materials do a good job preparing them intellectually for what’s the point, but then they do get stuck on the mechanics, what they’re doing with their hands.”
  • Allocating time for reflection. The instructors described that they improved substantially in the area of summarizing major concepts and timing activities: “I think we did a pretty good job of every 15 or 20 minutes bringing them back together and saying ok, you would have done this by now. There were a couple that worked really well, and a couple where we were still pressed for time. I think that generally it was far improved” (Julie). Because the instructors had the full class period to devote to the GAE and did not need to compare learner-centered teaching with teacher-centered instruction, they felt that the timing of the activities was much improved. Julie noted, however, “I always overestimate what students can do. I’m still adjusting.”
  • Technical issues. There were difficulties with connecting to the wireless Internet in the lecture hall, particularly among students who failed to download the appropriate tools before coming to class. Further, students had different types of computers and software, and they varied in their familiarity with the software programs required for the course.
The organized approach helped students see the material as well as make a few friends, in fact, I remember coming onto my dorm floor and seeing four people from my class working together, and they were actually in that GAE group, they had made a study group because they were used to working together. One of the aims of this project gets students communicating instead of competing.

Julie commented that there is still room for improvement in the student groups: “I saw a number of groups where at least one person would be left out. I don’t know if that’s a physical orientation, if we could point them toward each other it would be better. Next year one thing we talked about is going to groups of 3, because with 3 you can always get across each other and be more … everybody can talk to each other.” The instructors considered the benefits of the POGIL (Process Oriented Guided Inquiry Learning) approach. Julie explained that they tried to appoint a different group member to act as the scribe each week during GAE activities as a way to increase student participation in groups. The instructors did not strictly enforce this policy, as they were not sure it was beneficial.

And we also had some undergraduates this year, … and I think they were really helpful because they understand what the students are capable of, more than we do … a lot of times they can give you some insight into what’s going on or what classes undergraduates are most likely taking at the same time. It was very helpful.

Finally, Lisa felt that the level of engagement among the teaching staff was higher than for a standard lecture course: “Everyone was very engaged, it’s a unique class to TA for, because I feel like the TAs and the professors are far more engaged than in a standard lecture course, so it was kind of nice.” Alex reflected that, in the future, “It would be even better,” since they will have “a whole floor of ULAs that had us for 207,” and they “will be well-positioned” to assist in the redesigned course.

This case study examines instructor change processes when moving from teacher-centered instruction toward learner-centered teaching. In this study, we examined the change process through the lens of the innovation-decision model (Rogers, 2003; Andrews and Lemons, 2015), which recognizes several stages of change: knowledge, decision/persuasion, implementation, and reflection. The model is iterative, recognizing that transforming courses may require multiple revisions as instructors reflect on the inherent challenges and imperfections that arise when changing a course (Henderson, 2005). Consistent with this literature, the first implementation of learner-centered course revision was fraught with imperfections, and the instructors persisted through two rounds of course revision before gaining satisfaction with their teaching approach, although they plan to continue enhancing the course with each semester.

Andrews and Lemons (2015) note that dissatisfaction with one’s current teaching approach is an important motivator leading instructors to change their teaching. Our instructors were dissatisfied with the lecture mode of teaching in their courses due to personal dislike for it, and the sense that it encouraged student reliance on memorization and hindered interdisciplinary thinking. Other motivators for change included 1) awareness of national recommendations to use learner-centered teaching (AAAS, 2011); 2) a hope that underrepresented students would benefit from learner-centered instruction, based on education literature documenting such benefits (Okebukola, 1986; Seymour and Hewitt, 1997); and 3) institutional support (i.e., a provost office grant initiative).

These motivations led the instructors to seek new knowledge about learner-centered teaching approaches and how to implement them, which, according to the adapted innovation-decision model (Andrews and Lemons, 2015), is a first step toward changing a biology course. In the present study, knowledge-seeking strategies included consultation with science education experts and information technology experts, reading the empirical literature, observing other faculty members who had adopted evidence-based teaching practices, and involvement with a discipline-based FLC. Following the knowledge stage, the instructors progressed through the decision/persuasion and implementation stages of change. In the reflection stage, the instructors discussed what worked well, challenges, and areas they wished to improve in the subsequent iteration. We present here implications from this study for instructors seeking to change their courses, and also for administrators wishing to promote learner-centered instruction at their institutions.

IMPLICATIONS FOR INSTRUCTORS

Weimer (2013) noted that engaging students in their own learning is messy, unpredictable, and challenging as compared with teacher-centered instruction. The process can be difficult for the faculty members who want to change as well as for the students. First, the instructor must adopt a new role as “instructor-facilitator” (Weimer, 2013), giving up a degree of control to the students to take responsibility for their own learning. Relating to their new role, our instructors reported that, on the one hand, the instructor-facilitator role felt like controlling chaos at times, particularly in the beginning, but that it was markedly beneficial for student learning and for their own teaching. For instance, it gave students an opportunity to be independent learners and to engage with their peers in collaborative problem solving, more closely modeling the process of science. Thus, although it may be intimidating to share control over the learning process with students, it appears that there are benefits for both students and instructors.

Second, learner-centered teaching encourages instructors to cover fewer topics in greater depth, as opposed to more topics in less depth (Weimer, 2013). Despite being uncomfortable with losing content coverage due to the function of BSCI207 as a preparation course for the MCAT and a prerequisite, our instructors decided to remove some course topics and consolidate others into shorter units. Next, they implemented several solutions to the necessary loss of content coverage. First, they moved lecture content to required preclass, online lectures that substituted for in-class content coverage. Second, they were strategic about which course topics they chose to redesign as GAEs. Specifically, they selected course topics that were historically conceptually challenging for students (e.g., membrane transport). Our faculty members’ transition process provides an example of how faculty members can identify and implement solutions for concerns about loss of content coverage.

Third, a fundamental principle of learner-centered teaching is to encourage collaboration in the classroom (Weimer, 2013). To this end, our instructors implemented GAEs, a series of group work–based activities. Student collaboration is important, because it promotes sharing of the learning agenda (Johnson et al., 1984; Weimer, 2013), and collaboration is a skill that is essential for the workplace (Hart Research Associates, 2015). Group work is a common and accessible strategy that instructors can use to increase learner-centered teaching in their classrooms. Our instructors experienced various challenges and implemented several revisions to group work activities throughout their change process. The most successful strategies for optimizing group work included 1) increasing the number of TAs and the amount of TA training; 2) creating diverse and permanent student groups to increase accountability (Michaelsen et al., 2004); 3) assigning grades and preparation assignments for group work activities; and 4) restructuring group work activities to provide more time for whole-class summary and reflection on concepts. Group work is just one type of teaching strategy that can increase learner-centered teaching. Each instructor needs to discover what kinds of approaches are most suitable to increase their level of learner-centered teaching. When selecting and implementing new teaching strategies, it is highly recommended to seek guidance from experts, more experienced faculty members, or from a teaching and learning community.

Transitioning away from lecture-based instruction to learner-centered instruction can be challenging for students as well as instructors. The literature has shown that students resist many learner-centered approaches that require them to engage in the classroom rather than sit anonymously in lecture (Michaelsen et al., 2008; Shekhar et al., 2015). Our instructors learned about student resistance through several means: 1) student feedback that was collected on note cards at the end of GAE classes, 2) end-of-semester surveys asking students to reflect on each activity, and 3) low attendance on GAE days as compared with lecture class days. It is important for instructors transitioning their courses to monitor student resistance and satisfaction, as our instructors used these data to modify the activities from the first to second iteration.

The instructors used several strategies to reduce student resistance. First, through student feedback, instructors learned that they needed to provide students with better explanations for the purpose of doing GAEs as opposed to sitting in lecture class. Weimer (2013) emphasizes the importance of providing students explicit instruction on how to best learn. Therefore, at the second iteration of the learner-centered implementation, the instructors were explicit about the rationale for the GAEs. At various points throughout the semester, the instructors explained how the GAEs were helpful in enhancing skills (e.g., critical thinking, problem solving, collaboration, understanding the interdisciplinary nature of science, relating course material to everyday life and to scientific research) that are recommended by national organizations (AAAS, 2011) and employers (Hart Research Associates, 2015). Second, instructors awarded class participation points for completing GAE exercises and grades for completing the preclass online quiz. This strategy resulted in better alignment between requirements of students and course assessments, which accords with Wiggins and McTighe’s (2005) backward design theory. This method of GAE grading resulted in much higher student attendance as compared with the first iteration. Third, instructors used evidence-based strategies to reduce resistance within student groups, including creating permanent, diverse groups at the start of the semester. Fourth, instructors took student feedback into account with regard to their satisfaction with specific activities and modified activities with the goal of maximizing student engagement.

IMPLICATIONS FOR ADMINISTRATORS

Given that changing one’s teaching from teacher-centered instruction to learner-centered teaching is challenging, there must be administrative support for these efforts.

First, administrators can play a key role in acknowledging the importance of learner-centered teaching. Historically, universities have failed to encourage faculty members to prioritize teaching to a similar degree as research (Fairweather et al., 1996). Unfortunately, many tenure-track faculty members at research-intensive universities fear that they may be penalized for investing the time to adopt learner-centered teaching. Research-oriented universities should prioritize teaching in order to support more widespread adoption of evidence-based teaching approaches. Julie reflected on her frustration with the university’s message that teaching is devalued relative to research:

I think for assistant professors, I was actually scolded for putting time into teaching and trying to participate in teaching improvements and so, I think it’s discouraged, perhaps rightly so, because they’re not going to value it, so if that’s going to take away from what’s required to get tenure, to get promoted, they want you to know that. So they’re just being honest perhaps.

As part of a university culture that values learner-centered teaching, administrators (e.g., chairs, promotion committees) should acknowledge instructors who are making the effort to transition their courses and understand if their teaching evaluations are lower during the initial semesters of transition.

Second, as evidenced by our study and by others in the literature, transitioning from lecture-based teaching to learner-centered teaching requires a large time commitment from instructors. Thus, funding and release time are valuable supports that administrators can provide to improve the quality of teaching at their institutions. The provost grant was a fundamental support contributing to our instructors’ success in transitioning a core biology course. Further, the fact that teaching fellowships were awarded from the university provost shows that our research-intensive university is beginning to value faculty members’ adoption of learner-centered teaching. Alex commented on these fellowships:

The message comes through that the university values teaching, otherwise we wouldn’t have these fellowships from the Provost, that’s about as high up as it gets, I mean there is this signal, a voice that says, great, please do this. But then when the rubber meets the road, are you going to get promoted? It is not considered a substitute for quality research productivity as a research-active faculty.

Third, learner-centered instruction requires more human resources relative to teacher-centered instruction (e.g., for grading, facilitating small-group discussions, demonstrations, assisting in revising course activities). Administrators should consider ways to assign more TAs to courses that use learner-centered teaching. TAs and/or ULAs could be compensated through financial means or through other methods like course credit. Our university, for example, has developed a training program for undergraduate TAs, in which they receive training in how to facilitate small groups.

Fourth, in universities where there are state-of-the-art facilities for teaching and learning, there should be a priority for courses that adopt innovative teaching approaches. In our university, such facilities are in a state of development, and administrators are planning to incentivize faculty who are using evidence-based teaching approaches by giving them priority to teach in the new, state-of-the-art teaching and learning facility, which includes classrooms with round tables, movable seats, and advanced technology.

Finally, universities should provide support for a campus teaching and learning expert and an FLC. These resources were fundamental in the transition process of our faculty members. FLCs may be discipline-based (Marbach-Ad et al., 2010) or campus-wide (Cox, 2001). FLCs and teaching and learning experts can provide pedagogical and curricular guidance, as well as emotional support for the stressors associated with teaching.


Acknowledgments.

This work has been approved by the University of Maryland Institutional Review Board (IRB protocol 601750-2). We thank our teaching team members who participated in the study.

REFERENCES

  • American Association for the Advancement of Science. Vision and Change in Undergraduate Biology Education: A Call to Action. Washington, DC: 2011. http://visionandchange.org/files/2011/03/Revised-Vision-and-Change-Final-Report.pdf (accessed 8 November 2016).
  • Andrews TC, Lemons PP. It’s personal: biology instructors prioritize personal evidence over empirical evidence in teaching decisions. CBE Life Sci Educ. 2015;14:ar7.
  • Baldwin RG. The climate for undergraduate teaching and learning in STEM fields. New Dir Teach Learn. 2009;117:9–17.
  • Bloom BS, Krathwohl DR. Taxonomy of Educational Objectives: The Classification of Educational Goals, Handbook 1: Cognitive Domain. New York: Longmans; 1956.
  • Bourrie DM, Cegielski CG, Jones-Farmer LA, Sankar CS. Identifying characteristics of dissemination success using an expert panel. Decision Sci J Innov Educ. 2014;12:357–380.
  • Brownell SE, Tanner KD. Barriers to faculty pedagogical change: lack of training, time, incentives, and… tensions with professional identity. CBE Life Sci Educ. 2012;11:339–346.
  • Carleton KL, Rietschel CH, Marbach-Ad G. Group active engagements using quantitative modeling of physiology concepts in large-enrollment biology classes. J Microbiol Biol Educ. 2017;17 (in press).
  • Connolly MR, Seymour E. Why Theories of Change Matter (WCER Working Paper No. 2015-2). 2015. www.wcer.wisc.edu/publications/workingPapers/papers.php (accessed 8 November 2016).
  • Cox MD. Faculty learning communities: change agents for transforming institutions into learning organizations. To Improve the Academy. 2001;19:69–93.
  • Creswell JW. Research Design: Qualitative, Quantitative, and Mixed Methods Approaches, 2nd ed. Thousand Oaks, CA: Sage; 2003.
  • Dancy M, Henderson C. Pedagogical practices and instructional change of physics faculty. Am J Phys. 2010;78:1056–1063.
  • Ebert-May D, Derting TL, Hodder J, Momsen JL, Long TM, Jardeleza SE. What we say is not what we do: effective evaluation of faculty professional development programs. BioScience. 2011;61:550–558.
  • Eddy S, Hogan K. Getting under the hood: how and for whom does increasing course structure work. CBE Life Sci Educ. 2014;13:453–468.
  • Ellsworth JB. Surviving Change: A Survey of Educational Change Models. Washington, DC: Office of Educational Research and Improvement; 2000.
  • Fairweather J, Colbeck C, Paulson K, Campbell C, Bjorklund S, Malewski E. Engineering Coalition of Schools for Excellence and Leadership (ECSEL): Year 6. University Park: Center for the Study of Higher Education, Penn State University; 1996.
  • Felder RM. Reaching the second tier. J Coll Sci Teach. 1993;23:286–290.
  • Fink LD. Creating Significant Learning Experiences: An Integrated Approach to Designing College Courses. Hoboken, NJ: Wiley; 2013.
  • Freeman S, Eddy S, McDonough M, Smith MK, Okoroafor N, Jordt H, Wenderoth MP. Active learning increases student performance in science, engineering, and mathematics. Proc Natl Acad Sci USA. 2014;111:8410–8415.
  • Haag ES, Marbach-Ad G. Quantitative modeling of membrane transport and anisogamy by small groups within a large-enrollment organismal biology course. J Microbiol Biol Educ. 2017;17 (in press).
  • Haak DC, HilleRisLambers J, Pitre E, Freeman S. Increased structure and active learning reduce the achievement gap in introductory biology. Science. 2011;332:1213–1216.
  • Hamdan N, McKnight P, McKnight K, Arfstrom KM. A review of flipped learning. Flipped Learning Network; 2013. flippedlearning.org/wp-content/uploads/2016/07/LitReview_FlippedLearning.pdf (accessed 8 November 2016).
  • Hammer D. Epistemological beliefs in introductory physics. Cogn Instr. 1994;12:151–183.
  • Handelsman J, Miller S, Pfund C. Scientific Teaching. New York: Freeman; 2007.
  • Hart Research Associates. Falling Short? College Learning and Career Success. Washington, DC: 2015.
  • Henderson C. The challenges of instructional change under the best of circumstances: a case study of one college physics instructor. Am J Phys. 2005;73:778–786.
  • Henderson C, Dancy MH. Barriers to the use of research-based instructional strategies: the influence of both individual and situational characteristics. Phys Rev Spec Top Phys Educ Res. 2007;3:020102.
  • Jensen JL, Kummer TA, Godoy PD. Improvements from a flipped classroom may simply be the fruits of active learning. CBE Life Sci Educ. 2015;14:ar5.
  • Johnson DW, Johnson RT, Holubec EJ, Roy P. Circles of Learning: Cooperation in the Classroom. Alexandria, VA: Association for Supervision and Curriculum Development; 1984.
  • Kezar A, Gehrke S, Elrod S. Implicit theories of change as a barrier to change on college campuses: an examination of STEM reform. Rev High Educ. 2015;38:479–506.
  • Marbach-Ad G, McAdams KC, Benson S, Briken V, Cathcart L, Chase M, El-Sayed NM, Frauwirth K, Fredericksen B, Joseph SW, Lee V. A model for using a concept inventory as a tool for students’ assessment and faculty professional development. CBE Life Sci Educ. 2010;9:408–416.
  • Marbach-Ad G, Rietschel C, Saluja N, Carleton KL, Haag E. The use of group activities in introductory biology supports learning gains and uniquely benefits high-achieving students. J Microbiol Biol Educ. 2017;17 (in press).
  • Maykut P, Morehouse R. Beginning Qualitative Research: A Philosophic and Practical Approach. Bristol, PA: Falmer; 1994.
  • McKeachie W, Svinicki M. McKeachie’s Teaching Tips: Strategies, Research, and Theory for College and University Teachers. Boston, MA: Houghton Mifflin; 2006.
  • Merriam SB. Qualitative Research: A Guide to Design and Implementation. San Francisco, CA: Jossey-Bass; 2009.
  • Michaelsen LK, Black RH. Building learning teams: the key to harnessing the power of small groups in education. In: Collaborative Learning: A Sourcebook for Higher Education, vol. 2. University Park: National Center on Postsecondary Teaching, Learning, and Assessment, Pennsylvania State University; 1994. pp. 65–81.
  • Michaelsen LK, Knight AB, Fink LD. Team-Based Learning: A Transformative Use of Small Groups in College Teaching. Sterling, VA: Stylus; 2004.
  • Michaelsen LK, Sweet M, Parmelee DX, editors. Team-Based Learning: Small Group Learning’s Next Big Step, New Directions for Teaching and Learning, Number 116. Hoboken, NJ: Wiley; 2008.
  • Okebukola PA. Cooperative learning and students’ attitudes to laboratory work. School Sci Math. 1986;86:582–590.
  • Otero V, Pollock S, Finkelstein N. A physics department’s role in preparing physics teachers: the Colorado learning assistant model. Am J Phys. 2010;78:1218–1224.
  • Preszler RW. Replacing lecture with peer-led workshops improves student learning. CBE Life Sci Educ. 2009;8:182–192.
  • Rogers EM. Diffusion of Innovations, 5th ed. New York: The Free Press; 2003.
  • Schalk KA, McGinnis JR, Harring JR, Hendrickson A, Smith AC. The undergraduate teaching assistant experience offers opportunities similar to the undergraduate research experience. J Microbiol Educ. 2009;10:32–42.
  • Seidel SB, Tanner KD. What if students revolt?—Considering student resistance: origins, options, and opportunities for investigation. CBE Life Sci Educ. 2013;12:586–595.
  • Seymour E, Hewitt NM. Talking about Leaving: Why Undergraduates Leave the Sciences. Boulder, CO: Westview; 1997.
  • Shekhar P, Demonbrun M, Borrego M, Finelli C, Prince M, Henderson C, Waters C. Development of an observation protocol to study undergraduate engineering student resistance to active learning. Int J Eng Educ. 2015;31:597–609.
  • Watson WE, Kumar K, Michaelsen LK. Cultural diversity’s impact on interaction process and performance: comparing homogeneous and diverse task groups. Acad Manage J. 1993;36:590–602.
  • Weimer M. Learner-Centered Teaching. San Francisco: Jossey-Bass; 2013.
  • Wieman C, Perkins K, Gilbert S. Transforming science education at large research universities: a case study in progress. Change. 2010;42:6–14.
  • Wiggins GP, McTighe J. Understanding by Design. Alexandria, VA: Association for Supervision and Curriculum Development; 2005.
  • Yin RK. Case Study Research: Design and Methods, 3rd ed. Thousand Oaks, CA: Sage; 2003.


Biology LibreTexts

Case Study: Unusual microbes


  • Terminology
  • Epulopiscium
  • Thiomargarita
  • Small bacteria
  • Bacteria that give birth to live young
  • Square bacteria
  • Microbes with too much DNA
  • Microbes with too many genes
  • Bacteria with "atypical" chromosomes
  • Bacteria that can count -- and talk
  • Bacteria that know where north is
  • Bacteria that eat other bacteria
  • Multicellular bacteria
  • A huge virus
  • The biggest microbe?
  • Briefly noted
  • Contributors

Introduction

This page was written in collaboration with Borislav Dopudja, a third-year science student at the University of Zagreb. It grew out of some casual but extensive discussions we were having exploring some of the oddities of the microbial world, things that don't quite fit with our common views. It seems worthwhile to share some of these. There is no attempt here to be profound, but rather to have some fun enjoying the diversity of the microbial world. Borislav's web site is: www.pluff-sky.net/. I have also listed it, with more information, on my page of Internet resources: Biology Miscellaneous in the Biology: other section.

What are "microbes"? Microbes (or microorganisms) are small organisms. For our purposes, that means single-celled organisms. The single cells may be prokaryotic or eukaryotic. The prokaryotic microbes include the bacteria and the archaea (or the eubacteria and archaebacteria, by older terminology). The eukaryotic microbes include the protists (protozoa), the fungi and at least the unicellular algae. The terms are not always used consistently, especially in older literature. Whether viruses should be included is a matter of taste, and I won't be entirely consistent there; for the most part, we will discuss cellular organisms.

Most of the links below are to web sites that are suitable for "the general audience". A few links to articles from the regular scientific literature are given in small type. In particular, in some cases I have included links to the first reports of these organisms or of key features.

Many links are to Microbe magazine -- or to its precursor, ASM News. Microbe is the news magazine of the American Society for Microbiology. Microbe is written for microbiologists, but written to be enjoyed by a wide range of non-specialists. Both the news stories and feature articles can serve well as readable material with serious scientific content, yet not too technical. Microbe is now freely available online. Links to individual items are given as they come up. If you would like to browse Microbe magazine -- recommended! -- go to http://www.microbemagazine.org/ . The current issue will come up; for more, see "Explore Microbe" at the left.

Some links are given to original articles or news stories in Science magazine. Some of these are freely available online, though you may need to create a free registration before getting access to the full text. In general, Science releases research articles -- but not news stories -- for free access 12 months after publication; their file goes back to about 1997. (Those with institutional subscription access, such as those using university computers at UC Berkeley, have full access, and will not be asked to register or log on.) The home page for Science magazine: http://www.sciencemag.org .

Big bacteria

One of the most characteristic properties of bacteria is that they are small. Microscopic. Barely visible under the microscope: we can tell their general shape, but can generally see very little structure. Typical dimensions are on the order of 1 micrometer (1 μm). So, have a look at Epulopiscium and Thiomargarita -- bacteria big enough to be seen with the naked eye. These bacteria -- at least the larger specimens -- approach 1 millimeter (1 mm) in size. One of these was first reported in 1985 -- but not understood to be a bacterium until 1993; the other was first reported in 1999.
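To put those sizes side by side, here is a minimal sketch of the arithmetic in Python. It uses only the round figures quoted above (about 1 μm for a typical bacterium, about 1 mm for the largest specimens). The variable names are illustrative, and the volume estimate rests on an assumption that is not from the text: that both cells have the same shape, so that volume scales with the cube of the linear size.

```python
# Rough scale comparison using the round figures quoted above.
# Working in whole micrometers keeps the arithmetic exact.
TYPICAL_BACTERIUM_UM = 1      # about 1 micrometer
GIANT_BACTERIUM_UM = 1_000    # about 1 millimeter = 1,000 micrometers

# Ratio of linear dimensions (length or diameter).
linear_ratio = GIANT_BACTERIUM_UM // TYPICAL_BACTERIUM_UM

# Assumption (for illustration only): both cells have the same shape,
# so volume scales with the cube of the linear dimension.
volume_ratio = linear_ratio ** 3

print(f"Linear size ratio: {linear_ratio:,}x")
print(f"Volume ratio (same-shape assumption): {volume_ratio:,}x")
```

On these figures, a giant bacterium is roughly a thousand times longer than a typical one and, under the same-shape assumption, on the order of a billion times larger in volume. Real cells differ in shape, so the volume figure is only an order-of-magnitude guide.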

Epulopiscium grows in the gut of certain fish. It has a complex life cycle, which is coordinated with the daily rhythm of its host. This complex -- and unusual -- life cycle qualifies Epulopiscium for another section of this page: Bacteria that give birth to live young .

  • http://microbewiki.kenyon.edu/index.php/Epulopiscium . From the Microbe Wiki.
  • www.microbelibrary.org/index....ium-fishelsoni. From the Microbe Library at ASM.
  • www.accessexcellence.org/LC/ST/st12bg.php. "Epulopiscium fishelsoni, Big bug baffles biologists!", an essay on this unusual bug, from Peggy E Pollak & W Linn Montgomery, both early investigators of Epulos. (Another Access Excellence page is listed on this page, in the section The biggest microbe? . The Access Excellence site is listed as a general resource on my page of Miscellaneous Internet Resources, under Of local interest... -- since it had its origins near here.)

Cornell researchers study bacterium big enough to see -- the Shaquille O'Neal of bacteria . Press release (May 6, 2008) on new work showing that an Epulopiscium cell contains 100,000 or so copies of its genome, thus has many times more DNA than a human cell. This site is worth it for the pictures alone. The upper picture is a classic, showing an Epulo, a paramecium and an ordinary E. coli bacterium. http://www.news.cornell.edu/stories/...cteria.kr.html . The paper, from Angert's lab at Cornell and collaborators in Australia and New Zealand, is: J E Mendell et al, Extreme polyploidy in a large bacterium. PNAS 105:6730-6734, 5/6/08. Online: http://www.pnas.org/content/105/18/6730.abstract .

The first report of Epulopiscium, describing it as a large and peculiar cigar-shaped organism, presumably a protist: L Fishelson et al, A unique symbiosis in the gut of tropical herbivorous surgeonfish (Acanthuridae: Teleostei) from the Red Sea. Science 229:49, 7/5/85. The abstract is freely available at http://www.sciencemag.org/content/22...08/49.abstract . You may or may not be able to get the full article at that site. If not and you have an institutional subscription to JStor, such as at UCB, try www.jstor.org/stable/1695432. The definitive report that Epulopiscium is really a bacterium: E R Angert et al, The largest bacterium. Nature 362:239, 3/18/93. http://www.nature.com/nature/journal.../362239a0.html .

Thiomargarita can be quite big, but it "cheats". It is mostly vacuole. Why? Well, it is quite like a deep sea diver carrying an oxygen tank. Thiomargarita uses nitrate ions in its respiration, rather than oxygen gas; the vacuole is a supply of nitrate that lets the bug continue to respire at great depths.

Is Life Thriving Deep Beneath the Seafloor? An article from the Woods Hole Oceanographic Institution (WHOI). http://www.whoi.edu/oceanus/viewArticle.do?id=2497 . To focus on Thiomargarita, scroll down to "The world's largest bacterium". The article is by WHOI oceanographer Carl Wirsen, April 2004. WHOI microbiologist Andreas Teske was part of the team that discovered Thiomargarita; he is a co-author of the Science paper listed below as the original report.

The original report on Thiomargarita: H N Schulz et al, Dense populations of a giant sulfur bacterium in Namibian shelf sediments. Science 284:493, 4/16/99. It is accompanied by a news story: B Wuethrich, Microbiology: Giant sulfur-eating microbe found. Science 284:415, 4/16/99. The article is freely available at: http://www.sciencemag.org/content/28...3/493.abstract .

Scientists from UC Berkeley, led by Dr Jill Banfield, have found an archaeon smaller than any cellular organism previously known. It is about 200 nm (0.2 μm) diameter. It is so small that it is very near the "limit" of what people think might be the smallest possible organism. In fact, some people think it might be below that limit! Time will tell whether the new claim is valid. An important issue is whether this is a "complete" organism, or a parasite of some kind that is absolutely dependent on other cells to provide basic functions. This is part of their work on the acidic mine drainage from the Richmond Mine at Iron Mountain, Calif.

Shotgun sequencing finds nanoorganisms. A news release from UC Berkeley, December 2006, on this discovery: http://www.berkeley.edu/news/media/releases/2006/12/21_microbes.shtml . The original report on this tiny organism: B J Baker et al, Lineages of acidophilic archaea revealed by community genomic analysis. Science 314:1933, 12/22/06. http://www.sciencemag.org/content/31.../1933.abstract .

There is another story of small bacteria, a story that has been around for several years but has not really been confirmed. The basic idea is a claim that there are tiny bacteria involved in such processes as calcification of your arteries. These bacteria, which have been termed nanobacteria , are alleged to be even smaller than those discussed above -- far below any reasonable limit of what is "possible" for a living cell. Since these alleged organisms really do not fit in any modern understanding of what cells are, solid evidence is needed -- and is lacking. Two new papers appeared in early 2008 with rather strong evidence that these "things" are not alive. They appear to be some calcium minerals, complexed with protein. They may well be interesting, and they may still be involved in disease processes, but they are not bacteria. The Wikipedia entry is a good introduction to these "nanobacteria" (or "calcifying nanoparticles"), including the uncertainties that surround them. It notes these 2008 papers, and has links to them, and to one good news story on the new findings. http://en.Wikipedia.org/wiki/Nanobacterium .

Bacteria divide by binary fission: they grow bigger, and then divide in two. But there are exceptions. An interesting type of exception occurs when bacteria seem to give birth to live young. That is, they develop new cells inside, and then liberate these daughter cells. One of the first cases where this type of bacterial reproduction was seen was with Epulopiscium, discussed in the section on Big bacteria . Then it was found in the bacterium Metabacterium -- but in a form that was easier to understand. It has long been known that some bacteria make spores. Specifically, bacteria of the genera Bacillus and Clostridium make "endospores": each cell makes one spore, a resistant structure that is capable of long term survival. Such spore formation does not increase the population, because each cell makes one spore. It merely results in a new type of cell, the resistant spore. But Metabacterium makes multiple spores per cell -- and rarely undergoes the more "ordinary" process of binary fission. Thus a variation of ordinary endospore formation has become the primary means of reproduction. With Epulopiscium, it would seem that this process has been modified further, so that what is produced is not spores but rather ordinary cells -- baby cells.

Thus both Metabacterium and Epulopiscium "give birth to live young" -- a process that can be thought of as a variation of ordinary endospore formation.

Esther Angert, Beyond binary fission: Some bacteria reproduce by alternative means. Microbe 1:127, 3/06: forms.asm.org/microbe/index.asp?bid=41230 (HTML) or forms.asm.org/ASM/files/ccLib...0306000127.pdf (PDF). Angert, at Cornell, works with both Metabacterium and Epulopiscium. She was the first to recognize that Epulopiscium was actually a bacterium.

meta1_1_2.jpg

Metabacterium with four daughter spores . The figure at the right shows a single Metabacterium polyspora cell containing four spores, with the bright appearance that is typical of bacterial endospores. The individual spores are several μm long. The figure is a trimmed version of a figure at: author.cals.cornell.edu/cals/...abacterium.cfm. That page discusses the life cycle of Metabacterium, and how it relates to the natural environment for this organism. It is part of Esther Angert's web site; the home page is author.cals.cornell.edu/cals/...-lab/intro.cfm.

More links for Epulopiscium are in the section on Big bacteria .

What is the shape of bacteria? Round. Or roundish -- such as rods with rounded ends. Certainly not square, with sharp corners. Imagine then the surprise of the scientist who, in 1980, found square bacteria with sharp corners, in concentrated salt solutions. They are not only square, but very thin -- about 200 nm (0.2 μm) thick. They seem to grow as two dimensional objects, increasing the size of their squares, but not their thickness. Square bacteria caught in the act of division look like a sheet of postage stamps.

Their thinness increases their surface to volume ratio; this may be important in helping them to maintain a proper intracellular environment. They probably spend much energy pumping ions out!
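The effect of thinness on the surface to volume ratio can be checked with a little arithmetic. The sketch below compares a flat square cell with a sphere of the same volume. The 0.2 μm thickness is from the text above; the 2 μm side length is only an assumed value for illustration, and the function names are made up for this example.

```python
import math

def box_surface_to_volume(side_um, thickness_um):
    """Surface/volume ratio (per um) of a flat square box."""
    area = 2 * side_um**2 + 4 * side_um * thickness_um
    volume = side_um**2 * thickness_um
    return area / volume

def sphere_surface_to_volume(volume_um3):
    """Surface/volume ratio (per um) of a sphere with the given volume."""
    radius = (3 * volume_um3 / (4 * math.pi)) ** (1 / 3)
    return 3 / radius  # (4*pi*r^2) / (4/3*pi*r^3) simplifies to 3/r

THICKNESS_UM = 0.2  # thickness quoted in the text
SIDE_UM = 2.0       # assumed side length, for illustration only

flat_ratio = box_surface_to_volume(SIDE_UM, THICKNESS_UM)
sphere_ratio = sphere_surface_to_volume(SIDE_UM**2 * THICKNESS_UM)

print(f"flat square: {flat_ratio:.1f} per um")
print(f"sphere of equal volume: {sphere_ratio:.1f} per um")
```

With these assumed dimensions the flat square has more than twice the surface per unit volume of the equivalent sphere. For the box, the ratio works out to 2/thickness + 4/side, so a cell that grows only by enlarging its square keeps a ratio of at least 2/thickness, however large it becomes.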

It is not known why they are square or how they achieve their squareness. The square bacteria are archaea. (That was recognized in the original report; at that time, the archaebacteria were considered a type of bacteria, whereas we now consider them a distinct group from the bacteria per se.) They have been named Haloquadratum walsbyi.

foursquares_ao_vs.jpg

The following two links both include good pictures of the square bacteria. The figure at the right is a variation of one shown at the second site, Dyall-Smith's web site.

  • Square bacteria grown in the laboratory. Press release, from the University of Melbourne, announcing the first successful growth of the square bacteria in the lab, October 2004. uninews.unimelb.edu.au/news/1855/.
  • Web site for Dr Mike Dyall-Smith, group leader for that work: http://www.haloarchaea.com/ . The broad topic of the site is Haloarchaea and Haloviruses.

Edwin Abbott would have loved them. http://www.ibiblio.org/eldritch/eaa/FL.HTM . A good read!

Genome paper: H Bolhuis et al, The genome of the square archaeon Haloquadratum walsbyi: life at the limits of water activity. BMC Genomics 7:169, 7/4/06. Free online: http://www.biomedcentral.com/1471-2164/7/169 . The original report of these organisms: A E Walsby, A square bacterium. Nature 283:69, 1/3/80. It's a delightful little paper, a brief report of a quite unexpected observation. Online: http://www.nature.com/nature/journal.../283069a0.html . Even reading the opening lines, which are freely available there, gives a nice hint of the literary quality of this paper. However, it probably requires subscription for full access.

The organism with the largest known genome? Amoeba dubia, a protist. 670 billion base pairs of DNA. That is about 200 times more than we have. What is the significance of this finding? It's not at all clear. In fact, it is hard to even find the source of the number. Is this really a measure of the haploid genome size? Or is it simply based on the cellular DNA content, with the assumption that the cell is diploid? Not only is the original source hard to pin down, there seems to be no modern work following it up. But the number is oft-quoted, so we quote it too. Someday we may understand what it means.

For a nice discussion of genome sizes, see http://www.genomesize.com/statistics.php . Scroll down to the section "A comment on the overall animal range", and what follows, including a nice graph summarizing genome sizes over all types of organisms. This is from T Ryan Gregory, Univ Guelph. Regardless of the ultimate verdict on the Amoeba dubia genome, many organisms have genomes much larger than ours.

In the web page referred to above, Gregory gives genome size in picograms (pg). Biologists often give genome sizes in base pairs (bp). 1 picogram of DNA is about 10^9 (one billion) base pairs. For example, the human genome contains about 3.5 billion bp, and weighs about 3.5 pg. Gregory's genome size site is also referred to on my Internet resources - Molecular Biology page, under Genomes , and in the Musings post Who is #1: the most DNA? (March 7, 2011) .
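The conversion between the two units is simple enough to write down. The following sketch uses the round figure of one billion base pairs per picogram given above, and applies it to the human and Amoeba dubia numbers quoted on this page. The function and constant names are just for illustration.

```python
BP_PER_PG = 1e9  # approximate conversion quoted above: 1 pg is about one billion bp

def pg_to_bp(picograms):
    """Convert a genome size in picograms to base pairs (approximate)."""
    return picograms * BP_PER_PG

def bp_to_pg(base_pairs):
    """Convert a genome size in base pairs to picograms (approximate)."""
    return base_pairs / BP_PER_PG

HUMAN_BP = 3.5e9          # human genome, as quoted on this page
AMOEBA_DUBIA_BP = 670e9   # oft-quoted (and uncertain) Amoeba dubia figure

human_pg = bp_to_pg(HUMAN_BP)
amoeba_pg = bp_to_pg(AMOEBA_DUBIA_BP)
ratio = AMOEBA_DUBIA_BP / HUMAN_BP

print(f"human: {human_pg:.1f} pg")
print(f"Amoeba dubia: {amoeba_pg:.0f} pg")
print(f"Amoeba dubia / human: about {ratio:.0f}x")
```

The ratio comes out a little over 190, which is the "about 200 times" mentioned for Amoeba dubia.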

The human genome project brought us the revelation that we have only about 22,000 genes -- not all that many more than a worm (Caenorhabditis elegans, 20,000 genes) or a fruit fly (Drosophila melanogaster, 14,000 genes). Now, Trichomonas vaginalis - a common sexually-transmitted protist (protozoan)... Its genome sequence was reported in January 2007. Preliminary analysis suggests about 60,000 genes.

Don't make too much of this. Gene counts are notoriously difficult, as we learned from the human genome project. Identification of genes simply by looking at DNA sequences is something of an art. In fact, the report offers multiple numbers for the gene count, using different criteria. Of course, I chose the higher one for this note. Further, the significance of the gene count is unclear. We now understand that many proteins can be made from a "single" gene (for example, by alternative splicing). Nevertheless, this stands, at least for now: the most genes known in any microbe, in fact, in any organism. Glossary entry: Alternative splicing .

News story in Microbe, 4/07: "Peculiar" T. vaginalis parasites are jam-packed with genes. forms.asm.org/microbe/index.asp?bid=49481.

Scientists Crack the Genome of the Parasite Causing Trichomoniasis. The press release from the New York Univ School of Medicine, one of the lead institutions for this work. January 11, 2007. http://communications.med.nyu.edu/ne...trichomoniasis .

The report of the Trichomonas genome: J M Carlton et al, Draft genome sequence of the sexually transmitted pathogen Trichomonas vaginalis. Science 315:207, 1/12/07. Online: http://www.sciencemag.org/content/315/5809/207.short .

In the early days, it was very difficult to observe bacterial chromosomes. Bacteria are small, and their chromosomes are quite tiny by comparison with eukaryotic chromosomes. Further, bacterial chromosomes do not condense into more compact and more easily visible bodies, again in contrast to eukaryotic chromosomes. So, information about bacterial chromosomes emerged slowly, with various -- and mostly indirect -- techniques.

The early work suggested that bacteria have only one chromosome, and that it is circular. In one case, one E coli chromosome was even observed -- a circle. Other work seemed consistent with this, so the generality emerged: bacteria have only one chromosome, and it is circular. Both of these features served to distinguish bacteria from the eukaryotes. And that was nice: bacteria are supposedly "simpler", and having only one chromosome is certainly "simpler". Further, having a circular chromosome avoided the difficulties of replicating linear DNA, and could also be considered "simpler".

Alas, it is not really so. Neither feature is universal among bacteria. This web site discusses bacterial chromosomes, and has a table showing the number and type of chromosomes found in many bacteria. http://www.sci.sdsu.edu/~smaloy/MicrobialGenetics/topics/chroms-genes-prots/chromosomes.html . From Stanley Maloy, San Diego State University. It is part of a larger site on the broader topic of microbial genetics: http://www.sci.sdsu.edu/~smaloy/MicrobialGenetics/ .

The following site is from a discussion of bacterial genetics in an online microbiology textbook. I include it here particularly because it contains a nice copy of a very famous figure, which I referred to above. Go to http://www.ncbi.nlm.nih.gov/books/NBK7908/ ; scroll down to Fig 5.2. This shows a single chromosome of one E coli cell, in the act of replicating. It is clearly circular.

That site is from Chapter 5, Genetics, by R K Holmes & M G Jobling, of the online book Medical Microbiology, 4th edition, edited by S Baron. This online book is listed in the Microbiology: books section of my page of Internet resources: Biology - Miscellaneous. The figure is from John Cairns, Cold Spring Harbor Symposia on Quantitative Biology 28:44, 1963.

Some bacteria can emit light -- more or less as fireflies do. The phenomenon is called bioluminescence. But the light from a single bacterial cell would be too dim to be of any use. So, isolated bacteria do not emit light. They only emit light when there are many of them together, so that -- together -- they give off a substantial amount of light. Clearly, bacteria can count how many neighbors they have.

How do bacteria count their population size? The basic logic of how they do it is actually rather simple. To make light, they need an "inducer" -- a substance that turns on the light-producing system. They make an inducer, and secrete it into the external environment. They then take it up from the environment. What does this accomplish? Well, imagine a simple situation of bacteria growing in a test tube. If there are only a few bacteria, there will be little inducer in the tube. When the bacteria try to take up inducer from the environment, they find very little -- and thus they do not emit light. But if the bacteria grow, so that there are many many bacteria in the tube, all making and secreting inducer, then the bacteria find a high level of inducer in the environment; they take it up, and emit light. Thus the bacteria sense their population size by responding to the level of inducer in the medium; as a result, they emit light only when the population size is large.
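The logic of that test tube can be put into a toy model. This is only a sketch of the reasoning above, not a description of the real biochemistry: it assumes the inducer level is simply proportional to the number of cells in a fixed volume, and that light switches on once the level passes a threshold. The starting population, secretion rate and threshold are all made-up numbers.

```python
def inducer_level(cell_count, secretion_per_cell=1.0, volume=1.0):
    """Inducer concentration, assumed proportional to cells per unit volume."""
    return cell_count * secretion_per_cell / volume

def emits_light(cell_count, threshold=1e6):
    """Cells light up only when the inducer they take up exceeds the threshold."""
    return inducer_level(cell_count) >= threshold

# Follow a culture that starts small and doubles each generation.
population = 1000  # hypothetical starting number of cells
history = []
for generation in range(20):
    history.append((generation, population, emits_light(population)))
    population *= 2

# First generation at which the culture has a "quorum".
first_lit = next(gen for gen, _, lit in history if lit)

for gen, cells, lit in history[:12]:
    print(f"generation {gen:2d}: {cells:>9d} cells, light {'ON' if lit else 'off'}")
```

No cell in the model counts anything. Each one responds only to the inducer level it finds, yet the culture as a whole stays dark while it is sparse and lights up once it is dense.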

Is that artificial situation of a test tube of bacteria relevant to the bacteria in nature? Indeed it is. Some fish cultivate these bacteria in a special pouch, called a light organ. The light organ emits light only when it contains enough bacteria to do so usefully.

The phenomenon discussed above is often called quorum sensing . That is, the bacteria check to see if a quorum is present before emitting light. The details of this are now quite well understood. And as the system was being studied, it became clear that it was simply one example of a much wider phenomenon: bacteria communicating to each other, for a range of purposes. In this case, the bacteria are communicating their population size to their own kind. But more broadly, bacteria are signaling their presence -- and numbers -- to other types of bacteria too.

What is the highest temperature at which life is possible? We all know that many of the molecules in living systems are quite sensitive to heat; the ease of cooking an egg reminds us of that regularly.

When I was in college, the highest temperature reported for life was around 60° C (degrees Celsius), the maximum temperature (T max ) for growth of the bacterium Bacillus stearothermophilus. Since then, the known maximum has increased to at least 113° C, and perhaps even to 121° C. This increase came along with the discoveries of an entirely new class of microorganism, the archaea, and of a new geological phenomenon, the deep sea thermal vent; both discoveries date from 1977. Thus the increase in known T max for life is not simply an abstract story of some biological limit, but is part of a broad series of major advances in both biology and geology.

Thermus aquaticus has a maximum growth temperature of about 80° C. It was isolated from hot springs in Yellowstone National Park, and was reported in 1969. Thermus aquaticus is perhaps the organism that ushered in the new era of the commercialization of enzymes from thermophiles -- useful precisely because of their heat stability; the "Taq" DNA polymerase made the polymerase chain reaction -- PCR -- practical.

As noted above, 1977 brought the separate discoveries of archaea and deep sea thermal vents. Over the following years, these stories converged, and a succession of hyperthermophilic archaea were discovered near the vents. 1997 brought Pyrolobus fumarii, which grows up to 113° C; this archaeon has been widely accepted as having the highest known T max . 2003 brought a report of an archaeon that could grow at 121° C -- the normal operating temperature of an autoclave commonly used to kill even the most resistant forms of life, or so we thought. This organism has been dubbed simply Strain 121 for now.

These continuing discoveries of organisms with ever higher T max , maybe even up to the common operating temperature of an autoclave, raise some questions.

Derek Lovley's web page (Univ Massachusetts) on the work that led to Strain 121: www.geobacter.org/Life-Extreme.

The first report of Strain 121. K Kashefi & D R Lovley, Extending the upper temperature limit for life. Science 301:934, 8/15/03. Online at http://www.sciencemag.org/content/301/5635/934.full . For more about Lovley's lab, see "Electricigenic bacteria" in either the Redox section of the page for Internet resources for Intro Chem or the Carbohydrates section of the page for Internet resources for Intro Organic/Biochem. For an update, see a summary of the Ninth International Conference on Thermophiles (Bergen, Norway, September 2007). T Satyanarayana, Meeting report: Thermophiles 2007. Current Science 93(10):1340, 11/25/07. Current Science, published by the Indian Academy of Sciences, is freely available online. This article is at: http://www.ias.ac.in/currsci/nov252007/1340.pdf . The article contains a number of interesting tidbits. It suggests that the finding that Strain 121 grows at 121° C has been questioned; it also suggests that another microbe has been shown to grow at 122° C at high pressure. It is normal enough that such claims are questioned. Time will tell. None of these details change the general perspective on high temperature microbes presented here.

A microbiologist looks at a sample under the microscope. He notices that the bacteria seem to be moving over to one side of the microscope slide. Why? Perhaps they are responding to the light. So he adjusts the lighting, and it has no effect. After numerous such observations and tests, the conclusion is inescapable: the bacteria go north. Now, that is novel! He looks at the bacteria further, and finds that they contain tiny magnets -- iron oxide magnets, just like simple toy magnets. And that is how magnetic bacteria were discovered -- by Richard Blakemore in 1975.

Why do these bacteria use a magnet to guide their swimming? A common idea -- not entirely accepted -- is that these bacteria benefit from following the earth's magnetic lines of force. Doing that leads them "down" into the mud, which seems good for their lifestyle. Consistent with this, it was soon found that -- for some types of magnetic bacteria -- those in the northern hemisphere swim north, whereas those in the southern hemisphere swim south.

Magnetic Microbes , by Sandi Clement. commtechlab.msu.edu/Sites/dlc.../caOc96SC.html.

Magnetosomes . The figure at the right is from Richard Frankel's page: Magnetotactic Bacteria Photo Gallery. www.calpoly.edu/~rfrankel/mtbphoto.html. The figure shows a single cell of the bacterium Magnetospirillum magnetotacticum, with a chain of magnetosomes. Each individual magnetosome in the chain is approximately 45 nm across, and surrounded by a membrane. The Gallery page listed has many more figures, showing the diversity of magnetic bacteria. And for more, go to Frankel's home page, at Cal Poly San Luis Obispo: www.calpoly.edu/~rfrankel/. Scroll down to Research Interests, then Magnetotactic Bacteria.

R B Frankel & D A Bazylinski, Magnetosome mysteries. ASM News 70:176, 4/04. The news magazine ASM News -- now called Microbe -- is free online; this item is at forms.asm.org/microbe/index.asp?bid=26445.

C N Keim et al, Magnetoglobus, Magnetic aggregates in anaerobic environments. Microbe 2:437, 9/07. Microbe, the news magazine of the American Society for Microbiology, is free online; this item is at forms.asm.org/microbe/index.asp?bid=52638. An article about a type of magnetic bacterium that normally occurs in multicellular aggregates. Should this be considered a multicellular bacterial organism? Considering that question gives insight into what multicellularity is about. This article is also listed in the section Multicellular bacteria .

W Hansen, This End Up -- Magnetic organelles point bacteria in the right direction. Berkeley Science Review, Issue 14, Spring 2008, p 8. A brief introduction to work on magnetic bacteria being done by Arash Komeili at UCB. Berkeley Science Review (BSR), published by UCB graduate students, is free online; this item is at sciencereview.berkeley.edu/ar...ticle=briefs_1.

The first report of magnetic bacteria: R Blakemore, Magnetotactic bacteria. Science 190:377-379, 10/24/75. The abstract is freely available at http://www.sciencemag.org/content/190/4212/377.abstract . You may or may not be able to get the full article at that site. If not and you have an institutional subscription to JStor, such as at UCB, try Access to Blakemore article through JStor.

The story of predatory bacteria starts with Bdellovibrio , a type of bacterium that obligately lives within other bacterial cells. Since they kill the bacteria that they infect, Bdellovibrios form clear regions on a lawn of dense bacterial growth, much like bacterial viruses form plaques. But they are not viruses. They are cellular, with rather ordinary bacterial cells. It's just that they grow in a way that we find unusual. Well, it's not the "way" that is unusual as much as it is the "where". They burrow into a bacterial cell, and grow there.

E Jurkevitch, Predatory behaviors in bacteria -- diversity and transitions. Microbe 2:67, 2/07. Microbe, the news magazine of the American Society for Microbiology, is free online; this item is at forms.asm.org/microbe/index.asp?bid=48203.

At the end of the article listed above, Jurkevitch raises an interesting speculation about the possible role of predatory bacteria in the origin of the eukaryotic cell. Biologists agree that the mitochondrion arose from a bacterium that got inside another cell. But how did it get there? Bacteria do not show phagocytosis -- do not engulf other cells. However, predatory bacteria such as Bdellovibrio offer an alternative. Perhaps mitochondria originated by a predation event that led to symbiosis. There is no evidence on this point, so it must be regarded as speculation for now. At least it is a plausible view of how one of the great events of biological history might have occurred.

Single cells. Grow, and then divide into two. That is our simple image of bacteria. However, as we learn more about this vast group of organisms, we find that bacteria can be more complex. The myxobacteria probably have the most complex bacterial life cycle. They spend part of their life as free-living individual bacterial cells, then aggregate to form a fruiting body, an organized multicellular structure visible to the naked eye. In fact, their life cycle is rather similar to that of the cellular slime molds, such as Dictyostelium -- the myxomycetes.

Myxobacteria web page : http://myxobacteria.ahc.umn.edu/ . In particular, step through the section What are the Myxobacteria? for a good introduction with some wonderful pictures. From Dr Martin Dworkin, with the help of Tim Leonard, at the University of Minnesota.

M Dworkin, Lingering Puzzles about Myxobacteria . Microbe 2:18, 1/07. Dworkin's "puzzles" include:

  • How the cells construct the multicellular, macroscopic fruiting body
  • The biochemical basis of myxospore morphogenesis
  • The mechanism and function of individual cellular motility
  • The regulation of directionality of social movement
  • The mechanism of the cells' ability to perceive physical objects at a distance
  • The role of the myxobacteria in nature.

Microbe, the news magazine of the American Society for Microbiology, is free online; this item is at forms.asm.org/microbe/index.asp?bid=47794 (HTML) or forms.asm.org/ASM/files/ccLib...0107000018.pdf (PDF).

C N Keim et al, Magnetoglobus, Magnetic aggregates in anaerobic environments. Microbe 2:437, 9/07. Microbe, the news magazine of the American Society for Microbiology, is free online; this item is at forms.asm.org/microbe/index.asp?bid=52638. An article about a type of magnetic bacterium that normally occurs in multicellular aggregates. Should this be considered a multicellular bacterial organism? Considering that question gives insight into what multicellularity is about. This article is also listed in the section Bacteria that know where north is . Other topics on this page introduce other ways in which some bacteria are more complex than we might have thought. These include:

Ordinary organisms are based on cells. The organisms reproduce by the cells growing and dividing. Viruses are different. Viruses are small and simple. Viruses do not grow and divide. They reproduce by infecting a cell, disassembling, and then directing the production of new "parts", which then assemble into new virus particles.

Small and simple? Well, usually. The smallest viruses have one millionth or so the amount of genome (DNA or RNA) we do. Some have only a handful of genes. Some have only a piece of DNA (or RNA) and a simple protein coat -- no machinery for making anything, and no enzymes.

Some viruses are not so small and not so simple. Biologists have still been able to make a clear distinction between viruses and cells, primarily by looking at their basic strategy for reproduction. Cells grow and divide; viruses disassemble and reassemble.

The most recent challenge to the simplicity of viruses is the mimivirus , which grows in the protozoan Acanthamoeba polyphaga. It has about three times more genetic material (DNA) than any previously known virus -- more DNA than some bacteria. It is bigger than some bacteria -- about 400 nm (0.4 μm) diameter. And it is quite complex, with a collection of enzymes that are supposedly not to be found in viruses. For example, mimivirus codes for several enzymes used in protein synthesis -- genes never before found in any virus. Yet its life style (and structure) make it clear that this is a virus. It was characterized as a virus only in 2003.

2008 brings new developments that make the story of mimivirus even more fascinating. First, a new mimivirus, even bigger than the first. They call it mamavirus. But perhaps more importantly, a satellite virus: a virus that can grow only in cells infected by mimivirus. A news story about this satellite virus, dubbed Sputnik: 'Sputnik' Virus Orbits, Hijacks Other Viruses, Aug. 13, 2008. dsc.discovery.com/news/2008/0...s-sputnik.html.

Discussions about mimivirus and Sputnik inevitably seem to wander onto topics such as "What is a virus?" or even "What is life?" These are fun to discuss, but a caution: they need not have simple answers, or even any answers at all beyond our common definitions. Use such questions to provide a framework for your knowledge and understanding, but forcing simple answers to complex questions is not fruitful. Mimivirus has its own website: http://www.giantvirus.org .

The organism now known as mimivirus was found in 1992. The first paper that identified it as a virus: B La Scola et al, A giant virus in Amoebae. Science 299:2033, 3/28/03. Online: http://www.sciencemag.org/content/299/5615/2033.short . The report of the Sputnik satellite virus: B La Scola et al, The virophage as a unique parasite of the giant mimivirus. Nature 455:100, 9/4/08. There is a good news story about this finding: Biggest known virus yields first-ever virophage. Microbe 3:505, 11/08. Free online: microbemagazine.org/images/st...1108000502.pdf. Scroll down to the story, on page 4 of the file.

Probably the unicellular green alga Acetabularia , whose cells can be several centimeters long. Because of the large size, Acetabularia was a favorite organism for studying the relationship between nucleus and cytoplasm. The following links introduce the organism and some classic experimental work.

  • en.Wikipedia.org/wiki/Acetabularia. Basic introduction to Acetabularia.
  • www.accessexcellence.org/RC/V...mmerling_s.php. Classic work on the role of nucleus and cytoplasm in determining cell development, done by J Hammerling in the 1930s. The large cells of Acetabularia allowed a simple but novel transplantation to be done; the results revealed the key role of the nucleus. This item is given as a link at the end of the previous one. (Another Access Excellence page is listed on this page, in the section Big bacteria . The Access Excellence site is listed as a general resource on my page of Miscellaneous Internet Resources, under Of local interest... -- since it had its origins near here.)

This section is something of a "miscellany" -- a place to briefly note some other unusual aspects of microbial life. In some cases, I may make only a single small point or note only a single paper. Perhaps some of these will grow into "full-blown" topics at some point, or perhaps we will just keep a section of "miscellany".

G forces? Humans don't do well with g forces a few times normal gravity. Microbes do better, it seems. A recent paper shows that several microbes studied, bacteria and yeast, grew in an ultracentrifuge tube with accelerations many thousands of times g. Two of the bacteria grew at the highest accelerations tested, over 400,000 x g. The authors have no information about what limits the growth as the g force increases; they speculate that it has something to do with sedimentation within the cell. The variation among organisms might help them find out what is important. It is also unclear why this is of interest. After all, such high g forces are found in nature only under extreme conditions, such as the shock waves of supernovae. For now, this paper is basically just a cute finding. It will be interesting to see where it leads.

  • News story... Bacteria Grow Under 400,000 Times Earth's Gravity (National Geographic, April 25, 2011): http://news.nationalgeographic.com/news/2011/04/110425-gravity-extreme-bacteria-e-coli-alien-life-space-science/ .
  • The paper... S Deguchi et al, Microbial growth at hyperaccelerations up to 403,627 x g. PNAS 108:7997, May 10, 2011. Online at: http://www.pnas.org/content/108/19/7997 .

Where is the inside? Some bacteria, such as the gram negatives, have a double membrane system. It is the inner membrane that is energized, and used to make ATP. Now we have a discovery of the first double membrane system of an archaeon -- and it is the outer membrane that is energized. The archaeon, Ignicoccus hospitalis , is closely associated with Nanoarchaeum equitans -- which relies on the Ignicoccus for its energy; is this energy parasitism dependent on the unusual energy system of the Ignicoccus? The authors even wonder whether Ignicoccus might be an ancestor of the eukaryotic cell. Clearly, this is an unusual and intriguing finding -- still quite incomplete.

For a fine introduction to this novel system, see the ASM blog entry by Moselio Schaechter... Of Archaeal Periplasm & Iconoclasm (February 11, 2010): http://schaechter.asmblog.org/schaechter/2010/02/of-archaeal-periplasm-iconoclasm.html .

The paper... U Küper et al, Energized outer membrane and spatial separation of metabolic processes in the hyperthermophilic Archaeon Ignicoccus hospitalis . PNAS 107:3152, 2/16/10. Online at: http://www.pnas.org/content/107/7/3152 .

Arsenic. There are bacteria that can oxidize arsenic compounds, and there are bacteria that can reduce arsenic compounds. Now there is a report of bacteria that can use arsenite -- AsO₃³⁻, containing As(III) -- as the electron donor for photosynthesis. (The most common electron donor is water -- with oxygen gas being evolved. The most common electron donor in anaerobic systems is sulfide, often with sulfur granules being produced.) Analysis of this process suggests that arsenic metabolism is quite ancient, and that it is an important part of the arsenic cycle in nature. News story: In Lake, Photosynthesis Relies on Arsenic, August 18, 2008. http://www.nytimes.com/2008/08/19/science/19obarsenic.html .

The paper... T R Kulp et al, Arsenic(III) Fuels Anoxygenic Photosynthesis in Hot Spring Biofilms from Mono Lake, California. Science 321:967, 8/15/08. Online at: http://www.sciencemag.org/content/321/5891/967.abstract .

Microbes survive the cold. Scientists have recovered DNA and even viable bacteria from ancient ice samples in the Antarctic (and other places). The idea is that bacteria were trapped in the ice, perhaps in pockets of liquid water just big enough for the one cell. The bacteria may have carried out maintenance reactions, perhaps only a few chemical reactions per day, to survive. Even with quibbling about how old each sample really is, this is still a fascinating insight into survival of life in extreme conditions. News story: Eight-million-year-old bug is alive and growing, August 7, 2007. http://www.newscientist.com/article/dn12433 .

Here are a couple of papers, both of which should be freely available. The first goes with the news story listed above, and is generally about the isolation of the old bacteria and their DNA. The second, from UC Berkeley, is about how the bacteria may metabolize and survive in the ice. K D Bidle et al, Fossil genes and microbes in the oldest ice on Earth. PNAS 104:13455, 8/14/07. Free online at: http://www.pnas.org/content/104/33/13455.abstract . R A Rohde & P B Price, Diffusion-controlled metabolism for long-term survival of single isolated microorganisms trapped within ice crystals. PNAS 104:16592, 10/16/07. Free online at: http://www.pnas.org/content/104/42/16592.abstract .

A lonely bug. Organisms live in complex communities. Seems pretty basic in our modern understanding of biology. Certainly, we expect to find bacteria in complex communities. So, it is striking when we find a report of the discovery of a bacterial growth in a South African goldmine that seems to contain only one species. Of course, it is hard to exclude some very low level of other organisms, but the analysis shows that the main bacterium, called Candidatus Desulforudis audaxviator, is at least 99.9% of the culture.

What is it growing on down there? Well, seems likely that it is using the energy from uranium decay as its main energy source. So this loner is also a nuclear-powered bug.

The paper is: D. Chivian et al, Environmental Genomics Reveals a Single-Species Ecosystem Deep Within Earth. Science 322:275, 10/10/08. Free online at: http://www.sciencemag.org/content/322/5899/275.abstract . For a good news story about this work, see Journey Toward The Center Of The Earth: One-of-a-kind Microorganism Lives All Alone, 10/10/08: http://www.sciencedaily.com/releases/2008/10/081009143708.htm .

More "Curious microbes"

While looking for some nice web sites to include in the various sections above, I came across Sandi Clement's page on Magnetic Microbes, listed for Bacteria that know where north is. Turns out that is part of a larger site with a theme rather similar to this one -- and written by students in a class on Extreme and Unusual Microbes taught by Dr. Rick Martin at the Center for Microbial Ecology, Michigan State Univ. The site is called The Curious Microbe - Essays of the Extreme and the Unusual: commtechlab.msu.edu/Sites/dlc...us/cindex.html.

  • Robert Bruner ( http://bbruner.org )

Importance of Biology for Engineers: A Case Study

  • Conference paper
  • First Online: 29 August 2023

  • Chinmaya Panda   ORCID: orcid.org/0000-0002-9575-6913 5 ,
  • R. Shreya 5 &
  • Lalit M. Pandey 5  

Included in the following conference series:

  • North-East Research Conclave

The field of biological sciences has grown manifold over the past decade to address real-life challenges and drive industrial innovations that cater to the needs of society. Different biological phenomena can be approximated in terms of physical processes of mechanical work, electrical signals, and chemical energy. Considering the immense importance that biology and the life sciences hold for humankind, simplification of complex biological phenomena is a must for imparting greater understanding to multidisciplinary researchers. Therefore, many universities have diversified their undergraduate bioscience programs to include interdisciplinary courses focusing on biomechanics, bioinformatics, nanobiotechnology, and many more. Alongside teaching, research is also being conducted across diverse areas of biotechnology. Keeping sustainability and co-existence in mind, students and engineers should be trained to initiate better innovations and contribute their ideas for protecting the ecosystem around them.

Author information

Authors and affiliations.

Department of Biosciences and Bioengineering, Indian Institute of Technology, Guwahati, Assam, India

Chinmaya Panda, R. Shreya & Lalit M. Pandey

Corresponding author

Correspondence to Lalit M. Pandey .

Editor information

Editors and affiliations.

Department of Civil Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India

Hemant B. Kaushik

Department of Mechanical Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India

Uday Shanker Dixit

Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India

Department of Biosciences and Bioengineering, Indian Institute of Technology Guwahati, Guwahati, Assam, India

Bithiah Grace Jaganathan

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper.

Panda, C., Shreya, R., Pandey, L.M. (2023). Importance of Biology for Engineers: A Case Study. In: Kaushik, H.B., Dixit, U.S., Jose, J., Jaganathan, B.G. (eds) Trends in Teaching-Learning Technologies. NERC 2022. Springer, Singapore. https://doi.org/10.1007/978-981-99-4874-1_8

DOI : https://doi.org/10.1007/978-981-99-4874-1_8

Published : 29 August 2023

Publisher Name : Springer, Singapore

Print ISBN : 978-981-99-4873-4

Online ISBN : 978-981-99-4874-1

eBook Packages : Education Education (R0)

Open Access

Peer-reviewed

Meta-Research Article

Meta-Research Articles feature data-driven examinations of the methods, reporting, verification, and evaluation of scientific research.

Assessing the evolution of research topics in a biological field using plant science as an example

Roles Conceptualization, Data curation, Formal analysis, Funding acquisition, Investigation, Methodology, Project administration, Resources, Software, Supervision, Validation, Visualization, Writing – original draft, Writing – review & editing

Affiliations Department of Plant Biology, Michigan State University, East Lansing, Michigan, United States of America, Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, Michigan, United States of America, DOE-Great Lake Bioenergy Research Center, Michigan State University, East Lansing, Michigan, United States of America

Roles Conceptualization, Investigation, Project administration, Supervision, Writing – review & editing

Affiliation Department of Plant Biology, Michigan State University, East Lansing, Michigan, United States of America

  • Shin-Han Shiu, 
  • Melissa D. Lehti-Shiu

PLOS

  • Published: May 23, 2024
  • https://doi.org/10.1371/journal.pbio.3002612

Scientific advances due to conceptual or technological innovations can be revealed by examining how research topics have evolved. But such topical evolution is difficult to uncover and quantify because of the large body of literature and the need for expert knowledge in a wide range of areas in a field. Using plant biology as an example, we used machine learning and language models to classify plant science citations into topics representing interconnected, evolving subfields. The changes in prevalence of topical records over the last 50 years reflect shifts in major research trends and recent radiation of new topics, as well as turnover of model species and vastly different plant science research trajectories among countries. Our approaches readily summarize the topical diversity and evolution of a scientific field with hundreds of thousands of relevant papers, and they can be applied broadly to other fields.

Citation: Shiu S-H, Lehti-Shiu MD (2024) Assessing the evolution of research topics in a biological field using plant science as an example. PLoS Biol 22(5): e3002612. https://doi.org/10.1371/journal.pbio.3002612

Academic Editor: Ulrich Dirnagl, Charite Universitatsmedizin Berlin, GERMANY

Received: October 16, 2023; Accepted: April 4, 2024; Published: May 23, 2024

Copyright: © 2024 Shiu, Lehti-Shiu. This is an open access article distributed under the terms of the Creative Commons Attribution License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The plant science corpus data are available through Zenodo ( https://zenodo.org/records/10022686 ). The codes for the entire project are available through GitHub ( https://github.com/ShiuLab/plant_sci_hist ) and Zenodo ( https://doi.org/10.5281/zenodo.10894387 ).

Funding: This work was supported by the National Science Foundation (IOS-2107215 and MCB-2210431 to MDL and SHS; DGE-1828149 and IOS-2218206 to SHS), Department of Energy grant Great Lakes Bioenergy Research Center (DE-SC0018409 to SHS). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Abbreviations: BERT, Bidirectional Encoder Representations from Transformers; br, brassinosteroid; ccTLD, country code Top Level Domain; c-Tf-Idf, class-based Tf-Idf; ChatGPT, Chat Generative Pretrained Transformer; ga, gibberellic acid; LOWESS, locally weighted scatterplot smoothing; MeSH, Medical Subject Heading; SHAP, SHapley Additive exPlanations; SJR, SCImago Journal Rank; Tf-Idf, Term frequency-Inverse document frequency; UMAP, Uniform Manifold Approximation and Projection

Introduction

The explosive growth of scientific data in recent years has been accompanied by a rapidly increasing volume of literature. These records represent a major component of our scientific knowledge and embody the history of conceptual and technological advances in various fields over time. Our ability to wade through these records is important for identifying relevant literature for specific topics, a crucial practice of any scientific pursuit [ 1 ]. Classifying the large body of literature into topics can provide a useful means to identify relevant literature. In addition, these topics offer an opportunity to assess how scientific fields have evolved and when major shifts took place. However, such classification is challenging because the relevant articles in any topic or domain can number in the tens or hundreds of thousands, and the literature is in the form of natural language, which takes substantial effort and expertise to process [ 2 , 3 ]. In addition, even if one could digest all literature in a field, it would still be difficult to quantify such knowledge.

In the last several years, there has been a quantum leap in natural language processing approaches due to the feasibility of building complex deep learning models with highly flexible architectures [ 4 , 5 ]. The development of large language models such as Bidirectional Encoder Representations from Transformers (BERT; [ 6 ]) and Chat Generative Pretrained Transformer (ChatGPT; [ 7 ]) has enabled the analysis, generation, and modeling of natural language texts in a wide range of applications. The success of these applications is, in large part, due to the feasibility of considering how the same words are used in different contexts when modeling natural language [ 6 ]. One such application is topic modeling, the practice of establishing statistical models of semantic structures underlying a document collection. Topic modeling has been proposed for identifying scientific hot topics over time [ 1 ], for example, in synthetic biology [ 8 ], and it has also been applied to, for example, automatically identify topical scenes in images [ 9 ] and social network topics [ 10 ], discover gene programs highly correlated with cancer prognosis [ 11 ], capture “chromatin topics” that define cell-type differences [ 12 ], and investigate relationships between genetic variants and disease risk [ 13 ]. Here, we use topic modeling to ask how research topics in a scientific field have evolved and what major changes in the research trends have taken place, using plant science as an example.

Plant science corpora allow classification of major research topics

Plant science, broadly defined, is the study of photosynthetic species, their interactions with biotic/abiotic environments, and their applications. For modeling plant science topical evolution, we first identified a collection of plant science documents (i.e., corpus) using a text classification approach. To this end, we first collected over 30 million PubMed records and narrowed down candidate plant science records by searching for those with plant-related terms and taxon names (see Materials and methods ). Because there remained a substantial number of false positives (i.e., biomedical records mentioning plants in passing), a set of positive plant science examples from the 17 plant science journals with the highest numbers of plant science publications covering a wide range of subfields and a set of negative examples from journals with few candidate plant science records were used to train 4 types of text classification models (see Materials and methods ). The best text classification model performed well (F1 = 0.96, F1 of a naïve model = 0.5, perfect model = 1) where the positive and negative examples were clearly separated from each other based on prediction probability of the hold-out testing dataset (false negative rate = 2.6%, false positive rate = 5.2%, S1A and S1B Fig ). The false prediction rate for documents from the 17 plant science journals annotated with the Medical Subject Heading (MeSH) term “Plants” in NCBI was 11.7% (see Materials and methods ). The prediction probability distribution of positive instances with the MeSH term has an expected left-skew to lower values ( S1C Fig ) compared with the distributions of all positive instances ( S1A Fig ). Thus, this subset with the MeSH term is a skewed representation of articles from these 17 major plant science journals. 
To further benchmark the validity of the plant science records, we also conducted manual annotation of 100 records, for which the false positive and false negative rates were 14.6% and 10.6%, respectively (see Materials and methods ). Using 12 other plant science journals not included as positive examples as benchmarks, the false negative rate was 9.9% (see Materials and methods ). Considering the range of false prediction rate estimates with different benchmarks, we should emphasize that the model built with the top 17 plant science journals represents a substantial fraction of plant science publications but with biases. Applying the model to the candidate plant science records led to 421,658 positive predictions, hereafter referred to as “plant science records” ( S1D Fig and S1 Data ).
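For readers less familiar with the metrics quoted above, the following is a minimal sketch of how an F1 score and false positive/negative rates can be derived from confusion-matrix counts. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's exact rate definitions may differ from the conventional ones assumed here.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute F1 and error rates from confusion-matrix counts.

    Assumes the conventional definitions:
      false negative rate = FN / (FN + TP)
      false positive rate = FP / (FP + TN)
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical hold-out counts, for illustration only.
metrics = classification_metrics(tp=90, fp=10, tn=90, fn=10)
```

With balanced classes, a model that guesses at random lands near F1 = 0.5, which is why the text reports the naïve baseline alongside the trained model's score.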

To better understand how the models classified plant science articles, we identified important terms from a more easily interpretable model (Term frequency-Inverse document frequency (Tf-Idf) model; F1 = 0.934) using Shapley Additive Explanations [ 14 ]; 136 terms contributed to predicting plant science records (e.g., Arabidopsis, xylem, seedling) and 138 terms contributed to non-plant science record predictions (e.g., patients, clinical, mice; Tf-Idf feature sheet, S1 Data ). Plant science records as well as PubMed articles grew exponentially from 1950 to 2020 ( Fig 1A ), highlighting the challenges of digesting the rapidly expanding literature. We used the plant science records to perform topic modeling, which consisted of 4 steps: representing each record as a BERT embedding, reducing dimensionality, clustering, and identifying the top terms by calculating class (i.e., topic)-based Tf-Idf (c-Tf-Idf; [ 15 ]). The c-Tf-Idf represents the frequency of a term in the context of how rare the term is to reduce the influence of common words. SciBERT [ 16 ] was the best model among those tested ( S2 Data ) and was used for building the final topic model, which classified 372,430 (88.3%) records into 90 topics defined by distinct combinations of terms ( S3 Data ). The topics contained 620 to 16,183 records and were named after the top 4 to 5 terms defining the topical areas ( Fig 1B and S3 Data ). For example, the top 5 terms representing the largest topic, topic 61 (16,183 records), are “qtl,” “resistance,” “wheat,” “markers,” and “traits,” which represent crop improvement studies using quantitative genetics.
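The class-based Tf-Idf step can be sketched as follows. This is an illustrative approximation, not the authors' code: it assumes one common formulation (term frequency within a topic multiplied by log(1 + A / f), where A is the average number of words per topic and f is the term's frequency across all topics), uses naive whitespace tokenization, and runs on made-up toy documents.

```python
import math
from collections import Counter


def class_tf_idf(topic_docs: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Score each term per topic so that topic-specific terms rank highest."""
    # Treat all documents of a topic as one bag of words.
    counts = {
        topic: Counter(word for doc in docs for word in doc.lower().split())
        for topic, docs in topic_docs.items()
    }
    # Frequency of each term across all topics.
    total = Counter()
    for topic_counts in counts.values():
        total.update(topic_counts)
    avg_words_per_topic = sum(total.values()) / len(counts)

    scores = {}
    for topic, topic_counts in counts.items():
        n_words = sum(topic_counts.values())
        scores[topic] = {
            word: (freq / n_words) * math.log(1 + avg_words_per_topic / total[word])
            for word, freq in topic_counts.items()
        }
    return scores


# Toy corpus (hypothetical), grouped by an already-assigned topic label.
toy = {
    "qtl": ["wheat qtl resistance", "wheat markers traits"],
    "flower": ["arabidopsis flowering time", "floral development arabidopsis wheat"],
}
scores = class_tf_idf(toy)
top_flower_term = max(scores["flower"], key=scores["flower"].get)
```

In this toy corpus, "wheat" appears in both topics and is therefore down-weighted within the "flower" topic relative to terms that occur only there, which is the behavior the paragraph above describes.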

(A) Numbers of PubMed (magenta) and plant science (green) records between 1950 and 2020. (a, b, c) Coefficients of the exponential function, y = ae b . Data for the plot are in S1 Data . (B) Numbers of documents for the top 30 plant science topics. Each topic is designated by an index number (left) and the top 4–6 terms with the highest cTf-Idf values (right). Data for the plot are in S3 Data . (C) Two-dimensional representation of the relationships between plant science records generated by Uniform Manifold Approximation and Projection (UMAP, [ 17 ]) using SciBERT embeddings of plant science records. All topics panel: Different topics are assigned different colors. Outlier panel: UMAP representation of all records (gray) with outlier records in red. Blue dotted circles: areas with relatively high densities indicating topics that are below the threshold for inclusion in a topic. In the 8 UMAP representations on the right, records for example topics are in red and the remaining records in gray. Blue dotted circles indicate the relative position of topic 48.

https://doi.org/10.1371/journal.pbio.3002612.g001

Records with assigned topics clustered into distinct areas in a two-dimensional (2D) space ( Fig 1C , for all topics, see S4 Data ). The remaining 49,228 outlier records not assigned to any topic (11.7%, middle panel, Fig 1C ) have 3 potential sources. First, some outliers likely belong to unique topics but have fewer records than the threshold (>500, blue dotted circles, Fig 1C ). Second, some of the many outliers dispersed within the 2D space ( Fig 1C ) were not assigned to any single topic because they had relatively high prediction scores for multiple topics ( S2 Fig ). These likely represent studies across subdisciplines in plant science. Third, some outliers are likely interdisciplinary studies between plant science and other domains, such as chemistry, mathematics, and physics. Such connections can only be revealed if records from other domains are included in the analyses.

Topical clusters reveal closely related topics but with distinct key term usage

Related topics tend to be located close together in the 2D representation (e.g., topics 48 and 49, Fig 1C ). We further assessed intertopical relationships by determining the cosine similarities between topics using cTf-Idfs ( Figs 2A and S3 ). In this topic network, some topics are closely related and form topic clusters. For example, topics 25, 26, and 27 collectively represent a more general topic related to the field of plant development (cluster a , lower left in Fig 2A ). Other topic clusters represent studies of stress, ion transport, and heavy metals ( b ); photosynthesis, water, and UV-B ( c ); population and community biology (d); genomics, genetic mapping, and phylogenetics ( e , upper right); and enzyme biochemistry ( f , upper left in Fig 2A ).
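A minimal sketch of building such a topic network is shown below: each topic is represented as a sparse vector of term weights, pairwise cosine similarities are computed, and only pairs at or above a threshold are kept as edges. The term weights are invented for illustration, and the default threshold of 0.6 follows the value stated in the Fig 2A legend; the function names are hypothetical.

```python
import math
from itertools import combinations


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse term-weight vectors."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def topic_edges(vectors: dict[int, dict[str, float]], threshold: float = 0.6):
    """Return (topic_a, topic_b, similarity) for pairs meeting the threshold."""
    edges = []
    for a, b in combinations(sorted(vectors), 2):
        similarity = cosine(vectors[a], vectors[b])
        if similarity >= threshold:
            edges.append((a, b, similarity))
    return edges


# Hypothetical term weights for three topics (not the paper's values).
vectors = {
    26: {"arabidopsis": 0.7, "light": 0.5, "development": 0.5},
    27: {"arabidopsis": 0.7, "flowering": 0.5, "development": 0.5},
    61: {"qtl": 0.9, "wheat": 0.6},
}
edges = topic_edges(vectors)
```

Here the two development-related toy topics share heavily weighted terms and are linked, while the quantitative-genetics topic shares none and remains unconnected.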

(A) Graph depicting the degrees of similarity (edges) between topics (nodes). Between each topic pair, a cosine similarity value was calculated using the cTf-Idf values of all terms. A threshold similarity of 0.6 was applied to illustrate the most related topics. For the full matrix presented as a heatmap, see S4 Fig . The nodes are labeled with topic index numbers and the top 4–6 terms. The colors and width of the edges are defined based on cosine similarity. Example topic clusters are highlighted in yellow and labeled a through f (blue boxes). (B, C) Relationships between the cTf-Idf values (see S3 Data ) of the top terms for topics 26 and 27 (B) and for topics 25 and 27 (C) . Only terms with cTf-Idf ≥ 0.6 are labeled. Terms with cTf-Idf values beyond the x and y axis limit are indicated by pink arrows and cTf-Idf values. (D) The 2D representation in Fig 1C is partitioned into graphs for different years, and example plots for every 5-year period since 1975 are shown. Example topics discussed in the text are indicated. Blue arrows connect the areas occupied by records of example topics across time periods to indicate changes in document frequencies.

https://doi.org/10.1371/journal.pbio.3002612.g002

Topics differed in how well they were connected to each other, reflecting how general the research interests or needs are (see Materials and methods ). For example, topic 24 (stress mechanisms) is the most well connected with median cosine similarity = 0.36, potentially because researchers in many subfields consider aspects of plant stress even though it is not the focus. The least connected topics include topic 21 (clock biology, 0.12), which is surprising because of the importance of clocks in essentially all aspects of plant biology [ 18 ]. This may be attributed, in part, to the relatively recent attention in this area.
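The connectivity measure described here can be illustrated with a short sketch that, for each topic, takes the median of its similarities to all other topics. The similarity values below are hypothetical and were chosen only so that the toy output loosely echoes the contrast in the text; the function name is likewise an assumption.

```python
from statistics import median


def median_connectivity(
    similarity: dict[tuple[int, int], float], topics: list[int]
) -> dict[int, float]:
    """Median similarity of each topic to every other topic."""
    result = {}
    for topic in topics:
        values = [s for pair, s in similarity.items() if topic in pair]
        if values:  # skip topics with no recorded pairs
            result[topic] = median(values)
    return result


# Hypothetical pairwise cosine similarities (unordered topic pairs).
similarity = {(21, 24): 0.20, (21, 26): 0.04, (24, 26): 0.40}
connectivity = median_connectivity(similarity, topics=[21, 24, 26])
most_connected = max(connectivity, key=connectivity.get)
least_connected = min(connectivity, key=connectivity.get)
```

Using the median rather than the mean keeps a single very close neighbor from making an otherwise isolated topic look well connected.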

Examining topical relationships and the cTf-Idf values of terms also revealed how related topics differ. For example, topic 26 is closely related to topics 27 and 25 (cluster a on the lower left of Fig 2A ). Topics 26 and 27 both contain records of developmental process studies mainly in Arabidopsis ( Fig 2B ); however, topic 26 is focused on the impact of light, photoreceptors, and hormones such as gibberellic acids (ga) and brassinosteroids (br), whereas topic 27 is focused on flowering and floral development. Topic 25 is also focused on plant development but differs from topic 27 because it contains records of studies mainly focusing on signaling and auxin with less emphasis on Arabidopsis ( Fig 2C ). These examples also highlight the importance of using multiple top terms to represent the topics. The similarities in cTf-Idfs between topics were also useful for measuring the editorial scope (i.e., diverse, or narrow) of journals publishing plant science papers using a relative topic diversity measure (see Materials and methods ). For example, Proceedings of the National Academy of Sciences , USA has the highest diversity, while Theoretical and Applied Genetics has the lowest ( S4 Fig ). One surprise is the relatively low diversity of American Journal of Botany , which focuses on plant ecology, systematics, development, and genetics. The low diversity is likely due to the relatively larger number of cellular and molecular science records in PubMed, consistent with the identification of relatively few topical areas relevant to studies at the organismal, population, community, and ecosystem levels.

Investigation of the relative prevalence of topics over time reveals topical succession

We next asked whether relationships between topics reflect chronological progression of certain subfields. To address this, we assessed how prevalent topics were over time using dynamic topic modeling [ 19 ]. As shown in Fig 2D , there is substantial fluctuation in where the records are in the 2D space over time. For example, topic 44 (light, leaves, co, synthesis, photosynthesis) is among the topics that existed in 1975 but has diminished gradually since. In 1985, topic 39 (Agrobacterium-based transformation) became dense enough to be visualized. Additional examples include topics 79 (soil heavy metals), 42 (differential expression), and 82 (bacterial community metagenomics), which became prominent in approximately 2005, 2010, and 2020, respectively ( Fig 2D ). In addition, animating the document occupancy in the 2D space over time revealed a broad change in patterns over time: Some initially dense areas became sparse over time and a large number of topics in areas previously only loosely occupied at the turn of the century increased over time ( S5 Data ).

While the 2D representations reveal substantial details on the evolution of topics, comparison over time is challenging because the number of plant science records has grown exponentially ( Fig 1A ). To address this, the records were divided into 50 chronological bins each with approximately 8,400 records to make cross-bin comparisons feasible ( S6 Data ). We should emphasize that, because of the way the chronological bins were split, the number of records for each topic in each bin should be treated as a normalized value relative to all other topics during the same period. Examining this relative prevalence of topics across bins revealed a clear pattern of topic succession over time (one topic evolved into another) and the presence of 5 topical categories ( Fig 3 ). The topics were categorized based on their locally weighted scatterplot smoothing (LOWESS) fits and ordered according to timing of peak frequency ( S7 and S8 Data , see Materials and methods ). In Fig 3 , the relative decrease in document frequency does not mean that research output in a topic is dwindling. Because each row in the heatmap is normalized based on the minimum and maximum values within each topic, there still can be substantial research output in terms of numbers of publications even when the relative frequency is near zero. Thus, a reduced relative frequency of a topic reflects only a below-average growth rate compared with other topical areas.
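The binning and normalization logic can be sketched as follows. This is a simplified illustration under stated assumptions: records are hypothetical (year, topic) pairs, the bins are made equal in record count rather than in time span, and raw per-bin counts are min-max normalized within a topic, whereas the paper's heatmap applies the normalization to its own per-bin topic values.

```python
from collections import Counter


def equal_size_bins(records: list[tuple[int, str]], n_bins: int) -> list[list[tuple[int, str]]]:
    """Sort records chronologically and split them into bins of near-equal size."""
    ordered = sorted(records, key=lambda record: record[0])
    size, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        end = start + size + (1 if i < extra else 0)
        bins.append(ordered[start:end])
        start = end
    return bins


def min_max_by_topic(bins: list[list[tuple[int, str]]], topic: str) -> list[float]:
    """Min-max normalize one topic's per-bin record counts to the range [0, 1]."""
    counts = [Counter(t for _, t in chunk)[topic] for chunk in bins]
    low, high = min(counts), max(counts)
    return [(c - low) / (high - low) if high > low else 0.0 for c in counts]


# Hypothetical (year, topic) records, for illustration only.
records = [(1975, "a"), (1980, "a"), (1990, "a"), (2000, "b"), (2010, "b"), (2020, "a")]
bins = equal_size_bins(records, n_bins=3)
relative_a = min_max_by_topic(bins, "a")
```

In this toy example topic "a" still has a record in each of the later bins, yet its normalized value there is zero, which mirrors the caution in the text that a low relative frequency does not imply an absence of research output.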

(A-E) A heat map of relative topic frequency over time reveals 5 topical categories: (A) stable, (B) early, (C) transitional, (D) sigmoidal, and (E) rising. The x axis denotes different time bins with each bin containing a similar number of documents to account for the exponential growth of plant science records over time. The sizes of all bins except the first are drawn to scale based on the beginning and end dates. The y axis lists different topics denoted by the label and top 4 to 5 terms. In each cell, the prevalence of a topic in a time bin is colored according to the min-max normalized cTf-Idf values for that topic. Light blue dotted lines delineate different decades. The arrows left of a subset of topic labels indicate example relationships between topics in topic clusters. Blue boxes with labels a–f indicate topic clusters, which are the same as those in Fig 2 . Connecting lines indicate successional trends. Yellow circles/lines 1 – 3: 3 major transition patterns. The original data are in S5 Data .

https://doi.org/10.1371/journal.pbio.3002612.g003

The first topical category is a stable category with 7 topics mostly established before the 1980s that have since remained stable in terms of prevalence in the plant science records (top of Fig 3A ). These topics represent long-standing plant science research foci, including studies of plant physiology (topics 4, 58, and 81), genetics (topic 61), and medicinal plants (topic 53). The second category contains 8 topics established before the 1980s that have mostly decreased in prevalence since (the early category, Fig 3B ). Two examples are physiological and morphological studies of hormone action (topic 45, the second in the early category) and the characterization of protein, DNA, and RNA (topic 18, the second to last). Unlike other early topics, topic 78 (paleobotany and plant evolution studies, the last topic in Fig 3B ) experienced a resurgence in the early 2000s due to the development of new approaches and databases and changes in research foci [ 20 ].

The 33 topics in the third, transitional category became prominent in the 1980s, 1990s, or even 2000s but have clearly decreased in prevalence ( Fig 3C ). In some cases, the early and the transitional topics became less prevalent because of topical succession—refocusing of earlier topics led to newer ones that either show no clear sign of decrease (the sigmoidal category, Fig 3D ) or continue to increase in prevalence (the rising category, Fig 3E ). Consistent with the notion of topical succession, topics within each topic cluster ( Fig 2 ) were found across topic categories and/or were prominent at different time periods (indicated by colored lines linking topics, Fig 3 ). One example is topics in topic cluster b (connected with light green lines and arrows, compare Figs 2 and 3 ); the study of cation transport (topic 47, the third in the transitional category), prominent in the 1980s and early 1990s, is connected to 5 other topics, namely, another transitional topic 29 (cation channels and their expression) peaking in the 2000s and early 2010s, sigmoidal topics 24 and 28 (stress response, tolerance mechanisms) and 30 (heavy metal transport), which rose to prominence in mid-2000s, and the rising topic 42 (stress transcriptomic studies), which increased in prevalence in the mid-2010s.

The rise and fall of topics can be due to a combination of technological or conceptual breakthroughs, maturity of the field, funding constraints, or publicity. The study of transposable elements (topic 62) illustrates the effect of publicity; the rise in this field coincided with Barbara McClintock’s 1983 Nobel Prize but not with the publication of her studies in the 1950s [ 21 ]. The reduced prevalence in the early 2000s likely occurred in part because analysis of transposons became a central component of genome sequencing and annotation studies, rather than the subject of dedicated studies. In addition, this example indicates that our approaches, while capable of capturing topical trends, cannot be used to directly infer major papers leading to the growth of a topic.

Three major topical transition patterns signify shifts in research trends

Beyond the succession of specific topics, 3 major transitions in the dynamic topic graph should be emphasized: (1) the relative decreasing trend of early topics in the late 1970s and early 1980s; (2) the rise of transitional topics in the late 1980s; and (3) the relative decreasing trend of transitional topics in the late 1990s and early 2000s, which coincided with a radiation of sigmoidal and rising topics (yellow circles, Fig 3 ). The large numbers of topics involved in these transitions suggest major shifts in plant science research. In transition 1, early topics decreased in relative prevalence in the late 1970s to early 1980s, which coincided with the rise of transitional topics over the following decades (circle 1, Fig 3 ). For example, there was a shift from the study of purified proteins such as enzymes (early topic 48, S5A Fig ) to molecular genetic dissection of genes, proteins, and RNA (transitional topic 35, S5B Fig ) enabled by the wider adoption of recombinant DNA and molecular cloning technologies in the late 1970s [ 22 ]. Transition 2 (circle 2, Fig 3 ) can be explained by the following breakthroughs in the late 1980s: better approaches to create transgenic plants and insertional mutants [ 23 ], more efficient creation of mutant plant libraries through chemical mutagenesis (e.g., [ 24 ]), and availability of gene reporter systems such as β-glucuronidase [ 25 ]. Because of these breakthroughs, molecular genetics studies shifted away from understanding the basic machinery to understanding the molecular underpinnings of specific processes, such as molecular mechanisms of flower and meristem development and the action of hormones such as auxin (topic 27, S5C Fig ); this type of research was discussed as a future trend in 1988 [ 26 ] and remains prevalent to date. Another example is gene silencing (topic 12), which became a focal area of study along with the widespread use of transgenic plants [ 27 ].

Transition 3 is the most drastic: A large number of transitional, sigmoidal, and rising topics became prevalent nearly simultaneously at the turn of the century (circle 3, Fig 3 ). This period also coincides with a rapid increase in plant science citations ( Fig 1A ). The most notable breakthroughs included the availability of the first plant genome in 2000 [ 28 ], increasing ease and reduced cost of high-throughput sequencing [ 29 ], development of new mass spectrometry–based platforms for analyzing proteins [ 30 ], and advancements in microscopic and optical imaging approaches [ 31 ]. Advances in genomics and omics technology also led to an increase in stress transcriptomics studies (42, S5D Fig ) as well as studies in many other topics such as epigenetics (topic 11), noncoding RNA analysis (13), genomics and phylogenetics (80), breeding (41), genome sequencing and assembly (60), gene family analysis (23), and metagenomics (82 and 55).

In addition to the 3 major transitions across all topics, there were also transitions within topics revealed by examining the top terms for different time bins (heatmaps, S5 Fig ). Taken together, these observations demonstrate that knowledge about topical evolution can be readily revealed through topic modeling. Such knowledge is typically only available to experts in specific areas and is difficult to summarize manually, as no researcher has a command of the entire plant science literature.

Analysis of taxa studied reveals changes in research trends

Changes in research trends can also be illustrated by examining changes in the taxa being studied over time ( S9 Data ). There is a strong bias in the taxa studied, with the record dominated by research models and economically important taxa ( S6 Fig ). Flowering plants (Magnoliopsida) are found in 93% of records ( S6A Fig ), and the mustard family Brassicaceae dominates at the family level ( S6B Fig ) because the genus Arabidopsis accounts for 13% of plant science records ( Fig 4A ). When examining the prevalence of taxa being studied over time, clear patterns of turnover emerged similar to topical succession ( Figs 4B , S6C, and S6D ; Materials and methods ). Given that Arabidopsis is mentioned in more publications than other species we analyzed, we further examined the trends for Arabidopsis publications. The increase in the normalized number (i.e., relative to the entire plant science corpus) of Arabidopsis records coincided with advocacy of its use as a model system in the late 1980s [ 32 ]. While it remains a major plant model, there has been a decrease in overall Arabidopsis publications relative to all other plant science publications since 2011 (blue line, normalized total, Fig 4C ). Because the same chronological bins, each with the same number of records, from the topic-over-time analysis ( Fig 3 ) were used, the decrease here does not mean that there were fewer Arabidopsis publications; in fact, the number of Arabidopsis papers has remained steady since 2011. This decrease means that Arabidopsis-related publications represent a relatively smaller proportion of plant science records. Interestingly, this decrease took place much earlier (approximately 2005) and was steeper in the United States (red line, Fig 4C ) than in all countries combined (blue line, Fig 4C ).

(A) Percentage of records mentioning specific genera. (B) Change in the prevalence of genera in plant science records over time. (C) Changes in the normalized numbers of all records (blue) and records from the US (red) mentioning Arabidopsis over time. The lines are LOWESS fits with fraction parameter = 0.2. (D) Topical overrepresentation (red) and underrepresentation (blue) among the 5 genera with the most plant science records. LLR: log2 likelihood ratio of each topic in each genus. Gray: topic-species combination not significantly enriched at the 5% level based on enrichment p-values adjusted for multiple testing with the Benjamini–Hochberg method [ 33 ]. The data used for plotting are in S9 Data . The statistics for all topics are in S10 Data .

https://doi.org/10.1371/journal.pbio.3002612.g004

Assuming that the normalized number of publications reflects the relative intensity of research activities, one hypothesis for the relative decrease in focus on Arabidopsis is that advances in, for example, plant transformation, genetic manipulation, and genome research have allowed the adoption of more previously nonmodel taxa. Consistent with this, there was a precipitous increase in the number of genera being published in the mid-90s to early 2000s during which approaches for plant transgenics became established [ 34 ], but the number has remained steady since then ( S7A Fig ). The decrease in the proportion of Arabidopsis papers is also negatively correlated with the timing of an increase in the number of draft genomes ( S7B Fig and S9 Data ). It is plausible that genome availability for other species may have contributed to a shift away from Arabidopsis. Strikingly, when we analyzed US National Science Foundation records, we found that the numbers of funded grants mentioning Arabidopsis ( S7C Fig ) have risen and fallen in near perfect synchrony with the normalized number of Arabidopsis publication records (red line, Fig 4C ). This finding likely illustrates the impact of funding on Arabidopsis research.

By considering both taxa information and research topics, we can identify clear differences in the topical areas preferred by researchers using different plant taxa ( Fig 4D and S10 Data ). For example, studies of auxin/light signaling, the circadian clock, and flowering tend to be carried out in Arabidopsis, while quantitative genetic studies of disease resistance tend to be done in wheat and rice, glyphosate research in soybean, and RNA virus research in tobacco. Taken together, joint analyses of topics and species revealed additional details about changes in preferred models over time, and the preferred topical areas for different taxa.

Countries differ in their contributions to plant science and topical preference

We next investigated whether there were geographical differences in topical preference among countries by inferring country information from 330,187 records (see Materials and methods ). The 10 countries with the most records account for 73% of the total, with China and the US contributing to approximately 18% each ( Fig 5A ). The exponential growth in plant science records (green line, Fig 1A ) was in large part due to the rapid rise in annual record numbers in China and India ( Fig 5B ). When we examined the publication growth rates using the top 17 plant science journals, the general patterns remained the same ( S7D Fig ). On the other hand, the US, Japan, Germany, France, and Great Britain had slower rates of growth compared with all non-top 10 countries. The rapid increase in records from China and India was accompanied by a rapid increase in metrics measuring journal impact ( Figs 5C and S8 and S9 Data ). For example, using citation score ( Fig 5C , see Materials and methods ), we found that during a 22-year period China (dark green) and India (light green) rapidly approached the global average (y = 0, yellow), whereas some of the other top 10 countries, particularly the US (red) and Japan (yellow green), showed signs of decrease ( Fig 5C ). It remains to be determined whether these geographical trends reflect changes in priority, investment, and/or interest in plant science research.

(A) Numbers of plant science records for countries with the 10 highest numbers. (B) Percentage of all records from each of the top 10 countries from 1980 to 2020. (C) Difference in citation scores from 1999 to 2020 for the top 10 countries. (D) Shown for each country is the relationship between the citation scores averaged from 1999 to 2020 and the slope of linear fit with year as the predictive variable and citation score as the response variable. The countries with >400 records and with <10% missing impact values are included. Data used for plots (A–D) are in S11 Data . (E) Correlation in topic enrichment scores between the top 10 countries. PCC, Pearson’s correlation coefficient, positive in red, negative in blue. Yellow rectangle: countries with more similar topical preferences. (F) Enrichment scores (LLR, log likelihood ratio) of selected topics among the top 10 countries. Red: overrepresentation, blue: underrepresentation. Gray: topic-country combination that is not significantly enriched at the 5% level based on enrichment p-values adjusted for multiple testing with the Benjamini–Hochberg method (for all topics and plotting data, see S12 Data ).

https://doi.org/10.1371/journal.pbio.3002612.g005

Interestingly, the relative growth/decline in citation scores over time (measured as the slope of linear fit of year versus citation score) was significantly and negatively correlated with average citation score ( Fig 5D ); i.e., countries with lower overall metrics tended to experience the strongest increase in citation scores over time. Thus, countries that did not originally have a strong influence on plant sciences now have increased impact. These patterns were also observed when using H-index or journal rank as metrics ( S8 Fig and S11 Data ) and were not due to increased publication volume, as the metrics were normalized against numbers of records from each country (see Materials and methods ). In addition, the fact that different metrics with different caveats and assumptions yielded consistent conclusions indicates the robustness of our observations. We hypothesize that this may be a consequence of easier scientific communication among geographically isolated research groups. It could also reflect the prevalence of open-access online journals, which make scientific information more readily accessible, or increasing international collaboration. In any case, the causes for such regression toward the mean are not immediately clear and should be addressed in future studies.
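The comparison behind Fig 5D can be sketched as follows: for each country, fit a least-squares line of citation score on year, take the slope, and correlate the slopes with the average citation scores. This is a minimal sketch in plain Python; the country labels and scores are made-up illustrative values (not taken from S11 Data), and the helper names are hypothetical.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def ols_slope(x, y):
    """Slope b of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


years = [1999, 2006, 2013, 2020]
# Hypothetical citation scores, expressed relative to a global average of 0.
scores = {
    "country_a": [0.9, 0.8, 0.7, 0.6],     # high average, declining
    "country_b": [0.1, 0.1, 0.1, 0.1],     # near average, flat
    "country_c": [-0.9, -0.6, -0.3, 0.0],  # low average, rising
}

slopes = [ols_slope(years, s) for s in scores.values()]
averages = [mean(s) for s in scores.values()]
r = pearson(averages, slopes)  # negative when low-scoring countries rise fastest
```

In this toy input the correlation is negative by construction, which is the pattern the text describes as regression toward the mean.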

We also assessed how the plant research foci of countries differ by comparing topical preference (i.e., the degree of enrichment of plant science records in different topics) between countries. For example, Italy and Spain cluster together (yellow rectangle, Fig 5E ) partly because of similar research focusing on allergens (topic 0) and mycotoxins (topic 54) and less emphasis on gene family (topic 23) and stress tolerance (topic 28) studies ( Fig 5F , for the fold enrichment and corrected p -values of all topics, see S12 Data ). There are substantial differences in topical focus between countries ( S9 Fig ). For example, research on new plant compounds associated with herbal medicine (topic 69) is a focus in China but not in the US, but the opposite is true for population genetics and evolution (topic 86) ( Fig 5F ). In addition to revealing how plant science research has evolved over time, topic modeling provides additional insights into differences in research foci among different countries, which are informative for science policy considerations.
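The enrichment p-values in Figs 4D and 5F are described as adjusted with the Benjamini–Hochberg method. A minimal sketch of that adjustment is shown below; the function name and the input p-values are hypothetical and serve only to illustrate the procedure, not to reproduce values from S10 or S12 Data.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw enrichment p-values for five topic-country combinations.
p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
adjusted = benjamini_hochberg(p_values)
# Combinations shown in color (rather than gray) at the 5% level.
significant = [q < 0.05 for q in adjusted]
```

Note that the third and fourth p-values are below 0.05 before adjustment but not after it, which is the kind of combination that would be drawn in gray.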

In this study, topic modeling revealed clear transitions among research topics, which represent shifts in research trends in plant sciences. One limitation of our study is the bias in the PubMed-based corpus. The cellular, molecular, and physiological aspects of plant sciences are well represented, but there are many fewer records related to evolution, ecology, and systematics. Our use of titles/abstracts from the top 17 plant science journals as positive examples allowed us to identify papers we typically see in these journals, but this may have led to us missing “outlier” articles, which may be the most exciting. Another limitation is the need to assign only one topic to a record when a study is interdisciplinary and straddles multiple topics. Furthermore, a limited number of large, inherently heterogeneous topics were summarized to provide a more concise interpretation, which undoubtedly underrepresents the diversity of plant science research. Despite these limitations, dynamic topic modeling revealed changes in plant science research trends that coincide with major shifts in biological science. While we were interested in identifying conceptual advances, our approach can identify the trend but the underlying causes for such trends, particularly key records leading to the growth in certain topics, still need to be identified. It also remains to be determined which changes in research trends lead to paradigm shifts as defined by Kuhn [ 35 ].

The key terms defining the topics frequently describe various technologies (e.g., topic 38/39: transformation, 40: genome editing, 59: genetic markers, 65: mass spectrometry, 69: nuclear magnetic resonance) or are indicative of studies enabled through molecular genetics and omics technologies (e.g., topic 8/60: genome, 11: epigenetic modifications, 18: molecular biological studies of macromolecules, 13: small RNAs, 61: quantitative genetics, 82/84: metagenomics). Thus, this analysis highlights how technological innovation, particularly in the realm of omics, has contributed to a substantial number of research topics in the plant sciences, a finding that likely holds for other scientific disciplines. We also found that the pattern of topic evolution is similar to that of succession, where older topics have mostly decreased in relative prevalence but appear to have been superseded by newer ones. One example is the rise of transcriptome-related topics and the correlated, reduced focus on regulation at levels other than transcription. This raises the question of whether research driven by technology negatively impacts other areas of research where high-throughput studies remain challenging.

One observation on the overall trends in plant science research is the approximately 10-year cycle in major shifts. One hypothesis is that this cycle reflects not only scientific advances but also the fashion-driven aspect of science. Nonetheless, given that there were only 3 major shifts and the sample size is small, it is difficult to speculate as to why they happened. By analyzing the country of origin, we found that China and India have been the 2 major contributors to the growth in the plant science records in the last 20 years. Our findings also show an equalizing trend in global plant science where countries without a strong plant science publication presence have had an increased impact over the last 20 years. In addition, we identified significant differences in research topics between countries reflecting potential differences in investment and priorities. Such information is important for discerning differences in research trends across countries and can be considered when making policy decisions about research directions.

Materials and methods

Collection and preprocessing of a candidate plant science corpus

For reproducibility purposes, a random state value of 20220609 was used throughout the study. The PubMed baseline files containing citation information ( ftp://ftp.ncbi.nlm.nih.gov/pubmed/baseline/ ) were downloaded on November 11, 2021. To narrow down the records to plant science-related citations, a candidate citation was identified as having, within the titles and/or abstracts, at least one of the following words: “plant,” “plants,” “botany,” “botanical,” “planta,” and “plantarum” (and their corresponding upper case and plural forms), or plant taxon identifiers from NCBI Taxonomy ( https://www.ncbi.nlm.nih.gov/taxonomy ) or USDA PLANTS Database ( https://plants.sc.egov.usda.gov/home ). Note that the search terms used here are unrelated to the values of the keyword field in PubMed records. The taxon identifiers include all taxon names at or below “Viridiplantae” down to the genus level (species names were not used). This led to 51,395 search terms. After looking for the search terms, qualified entries were removed if they were duplicated, lacked titles and/or abstracts, or were corrections, errata, or withdrawn articles. This left 1,385,417 citations, which were considered the candidate plant science corpus (i.e., a collection of texts). For further analysis, the title and abstract for each citation were combined into a single entry. Text was preprocessed by lowercasing, removing stop-words (i.e., common words), removing non-alphanumeric and non-white space characters (except Greek letters, dashes, and commas), and applying lemmatization (i.e., grouping inflected forms of a word as a single word) for comparison. Because lemmatization led to truncated scientific terms, it was not included in the final preprocessing pipeline.
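The keyword filter and the final preprocessing steps (without lemmatization) can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline used in the study: the search-term set and stop-word list are tiny hypothetical stand-ins, matching is done on whole words, and the function names are invented for this example.

```python
import re

# Hypothetical stand-ins for the 51,395 search terms and for a stop-word list.
SEARCH_TERMS = {"plant", "plants", "botany", "botanical", "planta",
                "plantarum", "arabidopsis"}
STOP_WORDS = {"the", "of", "in", "and", "a", "to", "is"}


def is_candidate(title, abstract):
    """True if any search term occurs as a whole word in title or abstract."""
    words = set(re.findall(r"\w+", f"{title} {abstract}".lower()))
    return not SEARCH_TERMS.isdisjoint(words)


def preprocess(text):
    """Lowercase, blank out disallowed characters, then drop stop-words."""
    text = text.lower()
    # \w keeps Unicode letters (including Greek) and digits; it also keeps
    # underscores, which a stricter pattern could exclude.
    text = re.sub(r"[^\w\s,-]", " ", text)
    return " ".join(tok for tok in text.split() if tok not in STOP_WORDS)


print(is_candidate("Power plant emissions", ""))  # a likely false positive
print(preprocess("The β-Glucuronidase (GUS) reporter, in Plants!"))
```

The first call shows why a keyword filter alone is not sufficient: a record about power plants passes the filter even though it is not a plant science record.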

Definition of positive/negative examples

Upon closer examination, a large number of false positives were identified in the candidate plant science records. To further narrow down citations with a plant science focus, text classification was used to distinguish plant science and non-plant science articles (see next section). For the classification task, a negative set (i.e., non-plant science citations) was defined as entries from 7,360 journals that appeared <20 times in the filtered data (total = 43,329, journal candidate count, S1 Data ). For the positive examples (i.e., true plant science citations), 43,329 plant science citations (positive examples) were sampled from 17 established plant science journals each with >2,000 entries in the filtered dataset: “Plant physiology,” “Frontiers in plant science,” “Planta,” “The Plant journal: for cell and molecular biology,” “Journal of experimental botany,” “Plant molecular biology,” “The New phytologist,” “The Plant cell,” “Phytochemistry,” “Plant & cell physiology,” “American journal of botany,” “Annals of botany,” “BMC plant biology,” “Tree physiology,” “Molecular plant-microbe interactions: MPMI,” “Plant biology,” and “Plant biotechnology journal” (journal candidate count, S1 Data ). Plant biotechnology journal was included, but only 1,894 records remained after removal of duplicates, articles with missing info, and/or withdrawn articles. The positive and negative sets were randomly split into training and testing subsets (4:1) while maintaining a 1:1 positive-to-negative ratio.
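A balanced 4:1 split of this kind can be sketched with the standard library as below. The record identifiers are placeholders and `balanced_split` is a hypothetical helper; only the random state value and the ratios come from the text.

```python
import random

RANDOM_STATE = 20220609  # random state value reported in the text


def balanced_split(positives, negatives, test_fraction=0.2, seed=RANDOM_STATE):
    """Downsample both classes to the same size, then split each 4:1."""
    rng = random.Random(seed)
    n = min(len(positives), len(negatives))
    n_test = round(n * test_fraction)
    train, test = [], []
    for label, records in ((1, positives), (0, negatives)):
        sample = rng.sample(records, n)  # sampling without replacement
        test += [(rec, label) for rec in sample[:n_test]]
        train += [(rec, label) for rec in sample[n_test:]]
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Placeholder records standing in for titles and abstracts.
positives = [f"pos_{i}" for i in range(120)]
negatives = [f"neg_{i}" for i in range(100)]
train, test = balanced_split(positives, negatives)
print(len(train), len(test))
```

Splitting each class separately keeps the 1:1 positive-to-negative ratio in both the training and the testing subsets.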

Text classification based on Tf and Tf-Idf

Instead of using the preprocessed text as features for building classification models directly, text embeddings (i.e., representations of texts in vectors) were used as features. These embeddings were generated using 4 approaches (model summary, S1 Data ): Term-frequency (Tf), Tf-Idf [ 36 ], Word2Vec [ 37 ], and BERT [ 6 ]. The Tf- and Tf-Idf-based features were generated with CountVectorizer and TfidfVectorizer, respectively, from Scikit-Learn [ 38 ]. Different maximum features (1e4 to 1e5) and n-gram ranges (uni-, bi-, and tri-grams) were tested. The features were selected based on the p-value of chi-squared tests testing whether a feature had a higher-than-expected value among the positive or negative classes. Four different p-value thresholds were tested for feature selection. The selected features were then used to retrain vectorizers with the preprocessed training texts to generate feature values for classification. The classification model used was XGBoost [ 39 ] with 5 combinations of the following hyperparameters tested during 5-fold stratified cross-validation: min_child_weight = (1, 5, 10), gamma = (0.5, 1, 1.5, 2.5), subsample = (0.6, 0.8, 1.0), colsample_bytree = (0.6, 0.8, 1.0), and max_depth = (3, 4, 5). The rest of the hyperparameters were held constant: learning_rate = 0.2, n_estimators = 600, objective = binary:logistic. RandomizedSearchCV from Scikit-Learn was used for hyperparameter tuning and cross-validation with scoring = F1-score.
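To make the size of the search concrete, the sketch below enumerates the hyperparameter grid listed above and draws 5 random combinations from it. This plain-Python sampler is only a stand-in for RandomizedSearchCV and does not train any model; in the study, each sampled combination would then be scored by 5-fold stratified cross-validation with the F1 score. The variable names are hypothetical.

```python
import itertools
import random

# Search space and fixed values as listed in the text.
search_space = {
    "min_child_weight": [1, 5, 10],
    "gamma": [0.5, 1, 1.5, 2.5],
    "subsample": [0.6, 0.8, 1.0],
    "colsample_bytree": [0.6, 0.8, 1.0],
    "max_depth": [3, 4, 5],
}
fixed = {"learning_rate": 0.2, "n_estimators": 600,
         "objective": "binary:logistic"}

# Every possible combination of the tunable hyperparameters.
names = list(search_space)
grid = [dict(zip(names, values))
        for values in itertools.product(*search_space.values())]

# Draw 5 distinct combinations and merge in the fixed hyperparameters.
rng = random.Random(20220609)
candidates = [{**fixed, **combo} for combo in rng.sample(grid, 5)]

print(len(grid), len(candidates))
```

The full grid has 3 × 4 × 3 × 3 × 3 = 324 combinations, so testing 5 of them explores only a small fraction of the space.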

Because the Tf-Idf model had a relatively high model performance and was relatively easy to interpret (terms are frequency-based, instead of embedding-based like those generated by Word2Vec and BERT), the Tf-Idf model was selected as input to SHapley Additive exPlanations (SHAP; [ 14 ]) to assess the importance of terms. Because the Tf-Idf model was based on XGBoost, a tree-based algorithm, the TreeExplainer module in SHAP was used to determine a SHAP value for each entry in the training dataset for each Tf-Idf feature. The SHAP value indicates the degree to which a feature positively or negatively affects the underlying prediction. The importance of a Tf-Idf feature was calculated as the average SHAP value of that feature among all instances. Because a Tf-Idf feature is generated based on a specific term, the importance of the Tf-Idf feature indicates the importance of the associated term.
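The importance calculation described above, the average SHAP value of a feature over all instances, can be sketched as follows. The terms and SHAP values are made up for illustration, and `mean_shap_by_term` is a hypothetical helper; in the study the per-record values would come from the TreeExplainer module.

```python
# Hypothetical SHAP values: one row per training record, one column per term.
terms = ["plant", "patient", "arabidopsis"]
shap_values = [
    [0.40, -0.30, 0.90],
    [0.20, -0.50, 0.00],
    [0.30, -0.10, 0.60],
]


def mean_shap_by_term(terms, shap_values):
    """Average the signed SHAP values of each term over all records."""
    n_records = len(shap_values)
    return {term: sum(row[j] for row in shap_values) / n_records
            for j, term in enumerate(terms)}


importance = mean_shap_by_term(terms, shap_values)
# Terms ordered from most positive to most negative average contribution.
ranked = sorted(importance, key=importance.get, reverse=True)
print(ranked)
```

Because the average is signed, a term such as "patient" in this toy input ends up with a negative importance, meaning that on average it pushes predictions toward the negative class.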

Text classification based on Word2Vec

The preprocessed texts were first split into train, validation, and test subsets (8:1:1). The texts in each subset were converted to 3 n-gram lists: a unigram list obtained by splitting tokens based on the space character, or bi- and tri-gram lists built with Gensim [ 40 ]. Each n-gram list of the training subset was next used to fit a Skip-gram Word2Vec model with vector_size = 300, window = 8, min_count = (5, 10, or 20), sg = 1, and epochs = 30. The Word2Vec model was used to generate word embeddings for train, validate, and test subsets. In the meantime, a tokenizer was trained with train subset unigrams using Tensorflow [ 41 ] and used to tokenize texts in each subset and turn each token into indices to use as features for training text classification models. To ensure all citations had the same number of features (500), longer texts were truncated, and shorter ones were zero-padded. A deep learning model was used to train a text classifier with an input layer the same size as the feature number, an attention layer incorporating embedding information for each feature, 2 bidirectional Long-Short-Term-Memory layers (15 units each), a dense layer (64 units), and a final, output layer with 2 units. During training, adam, accuracy, and sparse_categorical_crossentropy were used as the optimizer, evaluation metric, and loss function, respectively. The training process lasted 30 epochs with early stopping if validation loss did not improve in 5 epochs. An F1 score was calculated for each n-gram list and min_count parameter combination to select the best model (model summary, S1 Data ).
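The step that gives every citation the same number of features can be sketched as below. The text does not state whether truncation and padding were applied at the start or the end of a sequence, so this sketch assumes that the beginning of the text is kept and zeros are appended; `pad_or_truncate` is a hypothetical helper.

```python
def pad_or_truncate(token_ids, length=500, pad_value=0):
    """Return exactly `length` indices: cut long inputs, zero-pad short ones.

    Assumption: the start of the text is kept and padding goes at the end.
    """
    clipped = token_ids[:length]
    return clipped + [pad_value] * (length - len(clipped))


short = pad_or_truncate([12, 7, 431], length=5)    # padded with zeros
long = pad_or_truncate(list(range(1, 8)), length=5)  # truncated to 5 indices
print(short, long)
```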

Text classification based on BERT models

Two pretrained models were used for BERT-based classification: DistilBERT (Hugging face repository [ 42 ] model name and version: distilbert-base-uncased [ 43 ]) and SciBERT (allenai/scibert-scivocab-uncased [ 16 ]). In both cases, tokenizers were retrained with the training data. BERT-based models had the following architecture: the token indices (512 values for each token) and associated masked values as input layers, pretrained BERT layer (512 × 768) excluding outputs, a 1D pooling layer (768 units), a dense layer (64 units), and an output layer (2 units). The rest of the training parameters were the same as those for Word2Vec-based models, except training lasted for 20 epochs. Cross-validation F1-scores for all models were compared and used to select the best model for each feature extraction method, hyperparameter combination, and modeling algorithm or architecture (model summary, S1 Data ). The best model was the Word2Vec-based model (min_count = 20, window = 8, ngram = 3), which was applied to the candidate plant science corpus to identify a set of plant science citations for further analysis. The candidate plant science records predicted as being in the positive class (421,658) by the model were collectively referred to as the “plant science corpus.”

Plant science record classification

In PubMed, 1,384,718 citations containing “plant” or any plant taxon names (from the phylum to genus level) were considered candidate plant science citations. To further distinguish plant science citations from those in other fields, text classification models were trained using titles and abstracts of positive examples consisting of citations from 17 plant science journals, each with >2,000 entries in PubMed, and negative examples consisting of records from journals with fewer than 20 entries in the candidate set. Among the 4 models tested, the best model (built with Word2Vec embeddings) had a cross-validation F1 of 0.964 (random guess F1 = 0.5, perfect model F1 = 1, S1 Data ). When testing the model using 17,330 testing set citations independent from the training set, the F1 remained high at 0.961.
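For reference, the F1 score is the harmonic mean of precision and recall for the positive class. The sketch below computes it from true and predicted labels; the labels are a made-up toy example, not model output from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels: 1 = plant science, 0 = not plant science.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
f1 = f1_score(y_true, y_pred)
print(round(f1, 3))
```

Here two of three positives are recovered and two of three positive predictions are correct, so precision and recall are both 2/3 and so is F1.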

We also conducted another analysis attempting to use the MeSH term “Plants” as a benchmark. Records with the MeSH term “Plants” also include pharmaceutical studies of plants and plant metabolites or immunological studies of plants as allergens in journals that are not generally considered plant science journals (e.g., Acta astronautica , International journal for parasitology , Journal of chromatography ) or journals from local scientific societies (e.g., Acta pharmaceutica Hungarica , Huan jing ke xue , Izvestiia Akademii nauk . Seriia biologicheskaia ). Because we explicitly labeled papers from such journals as negative examples, we focused on 4,004 records with the “Plants” MeSH term published in the 17 plant science journals that were used as positive instances and found that 88.3% were predicted as the positive class. Thus, based on the MeSH term, there is an 11.7% false prediction rate.

We also enlisted 5 plant science colleagues (3 advanced graduate students in plant biology and genetic/genome science graduate programs, 1 postdoctoral breeder/quantitative biologist, and 1 postdoctoral biochemist/geneticist) to annotate 100 randomly selected abstracts, as a reviewer suggested. Each record was annotated by 2 colleagues. Among 85 entries where the annotations are consistent between annotators, 48 were annotated as negative but with 7 predicted as positive (false positive rate = 14.6%) and 37 were annotated as positive but with 4 predicted as negative (false negative rate = 10.8%). To further benchmark the performance of the text classification model, we identified another 12 journals that focus on plant science studies to use as benchmarks: Current opinion in plant biology (number of articles: 1,806), Trends in plant science (1,723), Functional plant biology (1,717), Molecular plant pathology (1,573), Molecular plant (1,141), Journal of integrative plant biology (1,092), Journal of plant research (1,032), Physiology and molecular biology of plants (830), Nature plants (538), The plant pathology journal (443), Annual review of plant biology (417), and The plant genome (321). Among the 12,611 candidate plant science records, 11,386 were predicted as positive. Thus, there is a 9.9% false negative rate.
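The annotation-based error rates can be reproduced from the counts given above. The sketch below rebuilds a record list that matches those counts, keeps only records on which both annotators agree, and computes the two rates. The `error_rates` helper is hypothetical, and the labels and predictions assigned to the 15 records with inconsistent annotations are placeholders, since the text does not report them.

```python
def error_rates(records):
    """records: (label_a, label_b, predicted) triples with 0/1 values."""
    # Keep only records where the two annotators agree.
    agreed = [(a, pred) for a, b, pred in records if a == b]
    neg = [pred for label, pred in agreed if label == 0]
    pos = [pred for label, pred in agreed if label == 1]
    false_positive_rate = sum(1 for p in neg if p == 1) / len(neg)
    false_negative_rate = sum(1 for p in pos if p == 0) / len(pos)
    return len(agreed), false_positive_rate, false_negative_rate


# Counts from the text: 48 agreed negatives (7 predicted positive) and
# 37 agreed positives (4 predicted negative); the remaining 15 records
# are placeholder disagreements.
records = (
    [(0, 0, 1)] * 7 + [(0, 0, 0)] * 41
    + [(1, 1, 0)] * 4 + [(1, 1, 1)] * 33
    + [(0, 1, 1)] * 15
)
n_agreed, fpr, fnr = error_rates(records)
print(n_agreed, round(100 * fpr, 1), round(100 * fnr, 1))
```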

Global topic modeling

BERTopic [15] was used for preliminary topic modeling with n-grams = (1,2) and with an embedding initially generated by DistilBERT, SciBERT, or BioBERT (dmis-lab/biobert-base-cased-v1.2; [44]). The embedding models converted preprocessed texts to embeddings. The topics generated based on the 3 embeddings were similar (S2 Data). However, SciBERT-, BioBERT-, and DistilBERT-based embedding models had different numbers of outlier records (268,848, 293,790, and 323,876, respectively) with topic index = −1. In addition to generating the fewest outliers, the SciBERT-based model led to the highest number of topics. Therefore, SciBERT was chosen as the embedding model for the final round of topic modeling. Modeling consisted of 3 steps. First, document embeddings were generated with SentenceTransformer [45]. Second, a clustering model to aggregate documents into clusters using hdbscan [46] was initialized with min_cluster_size = 500, metric = euclidean, cluster_selection_method = eom, min_samples = 5. Third, the embedding and the initialized hdbscan model were used in BERTopic to model topics with neighbors = 10, nr_topics = 500, ngram_range = (1,2). Using these parameters, 90 topics were identified. The initial topic assignments were conservative, and 241,567 records were considered outliers (i.e., documents not assigned to any of the 90 topics). The prediction scores of all records generated from the fitted topic model were then assessed; the 95th-percentile score was 0.0155. This score was used as the threshold for assigning outliers to topics: If the maximum prediction score was above the threshold and this maximum score was for topic t, then the outlier was assigned to t. After the reassignment, 49,228 records remained outliers. To assess if some of the outliers were not assigned because they could be assigned to multiple topics, the prediction scores of the records were used to put records into 100 clusters using k-means. Each cluster was then assessed to determine if the outlier records in a cluster tended to have higher prediction scores across multiple topics (S2 Fig).
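The outlier reassignment rule described above can be sketched in a few lines. This is an illustration only: the function name `reassign_outliers` and the toy scores are hypothetical, and the per-topic prediction scores are assumed to be available as one list per record.

```python
def reassign_outliers(topics, scores, threshold):
    """Assign outliers (topic index -1) to their best-scoring topic.

    topics:    list of topic indices, one per record (-1 marks an outlier).
    scores:    list of per-topic prediction score lists, one per record.
    threshold: an outlier is reassigned only if its maximum score exceeds this.
    """
    new_topics = []
    for topic, row in zip(topics, scores):
        if topic == -1:
            # Index of the topic with the highest prediction score.
            best = max(range(len(row)), key=row.__getitem__)
            if row[best] > threshold:
                topic = best
        new_topics.append(topic)
    return new_topics


# Toy example with 2 topics and the threshold quoted in the text.
toy_topics = [0, -1, -1]
toy_scores = [[0.9, 0.1], [0.01, 0.02], [0.001, 0.002]]
reassigned = reassign_outliers(toy_topics, toy_scores, threshold=0.0155)
print(reassigned)
```

Records whose best score stays at or below the threshold remain outliers, which corresponds to the records that were later clustered with k-means.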

Topics that are most and least well connected to other topics

The most well-connected topics in the network include topic 24 (stress mechanisms, median cosine similarity = 0.36), topic 42 (genes, stress, and transcriptomes, 0.34), and topic 35 (molecular genetics, 0.32, all t test p-values < 1 × 10^−22). The least connected topics include topic 0 (allergen research, median cosine similarity = 0.12), topic 21 (clock biology, 0.12), topic 1 (tissue culture, 0.15), and topic 69 (identification of compounds with spectroscopic methods, 0.15; all t test p-values < 1 × 10^−24). Topics 0, 1, and 69 are specialized topics; as explained in the main text, it is surprising that topic 21 is not better connected.

Analysis of documents based on the topic model

Topical diversity among top journals with the most plant science records

Using a relative topic diversity measure (ranging from 0 to 10), we found that there was a wide range of topical diversity among the 20 journals with the largest numbers of plant science records (S4 Fig). The 4 journals with the highest relative topical diversities are Proceedings of the National Academy of Sciences, USA (9.6), Scientific Reports (7.1), Plant Physiology (6.7), and PLOS ONE (6.4). The high diversities are consistent with the broad editorial scopes of these journals. The 4 journals with the lowest diversities are American Journal of Botany (1.6), Oecologia (0.7), Plant Disease (0.7), and Theoretical and Applied Genetics (0.3), which reflects their discipline-specific focus and audience of classical botanists, ecologists, plant pathologists, and specific groups of geneticists.

Dynamic topic modeling

The codes for dynamic modeling were based on _topic_over_time.py in BERTopic and modified to allow additional outputs for debugging and graphing purposes. The plant science citations were binned into 50 subsets chronologically (for timestamps of bins, see S6 Data). Because the numbers of documents increased exponentially over time, instead of dividing them based on equal-sized time intervals, which would result in fewer records at earlier time points and introduce bias, we divided them into time bins of similar size (approximately 8,400 documents). Thus, the earlier time subsets had larger time spans compared with later time subsets. Prior to binning the subsets, the publication dates were converted to UNIX time (timestamp) in seconds; the plant science records start on 1917-11-1 (timestamp = −1646247600.0) and end on 2021-1-1 (timestamp = 1609477201). The starting dates and corresponding timestamps for the 50 subsets including the end date are in S6 Data. The input data included the preprocessed texts, topic assignments of records from global topic modeling, and the binned timestamps of records. Three additional parameters were set for topics_over_time, namely, nr_bin = 50 (number of bins), evolution_tuning = True, and global_tuning = False. The evolution_tuning parameter specified that averaged c-Tf-Idf values for a topic be calculated in neighboring time bins to reduce fluctuation in c-Tf-Idf values. The global_tuning parameter was set to False because of the possibility that some nonexisting terms could have a high c-Tf-Idf for a time bin simply because there was a high global c-Tf-Idf value for that term.
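The date conversion and equal-count binning described above can be illustrated with the standard library alone. This is a sketch under stated assumptions, not the study's code: the helper names are hypothetical, and the fixed UTC−5 offset is an assumption, chosen because it reproduces the start timestamp quoted in the text.

```python
from datetime import datetime, timedelta, timezone

# Assumption: a fixed UTC-5 offset (no daylight saving adjustment).
ASSUMED_TZ = timezone(timedelta(hours=-5))


def to_unix_seconds(year, month, day, tz=ASSUMED_TZ):
    """Convert a calendar date (midnight in tz) to UNIX time in seconds."""
    return datetime(year, month, day, tzinfo=tz).timestamp()


def equal_count_bins(timestamps, n_bins):
    """Split timestamps chronologically into n_bins bins of near-equal size.

    Bin sizes differ by at most one record, so early bins span longer
    periods when publication counts grow over time.
    """
    ordered = sorted(timestamps)
    base, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        size = base + (1 if i < extra else 0)
        bins.append(ordered[start:start + size])
        start += size
    return bins


start_ts = to_unix_seconds(1917, 11, 1)

# Toy corpus: publication years become denser toward the present.
toy_years = (1950, 1980, 1990, 2000, 2005, 2010, 2015, 2018, 2020)
toy_bins = equal_count_bins([to_unix_seconds(y, 1, 1) for y in toy_years], 3)
for b in toy_bins:
    span_years = (b[-1] - b[0]) / (365.25 * 24 * 3600)
    print(len(b), "records spanning about", round(span_years), "years")
```

In the toy output, every bin holds the same number of records while the time span shrinks for later bins, which is the trade-off the text describes.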

The binning strategy based on similar document numbers per bin allowed us to increase signal, particularly for publications prior to the 1990s. This strategy, however, may introduce more noise for bins with smaller time durations (i.e., more recent bins) because of publication frequencies (there can be seasonal differences in the number of papers published, biased toward, e.g., the beginning of the year or the beginning of a quarter). To address this, we examined the relative frequencies of each topic over time (S7 Data) and found that recent time bins had variances in relative frequencies similar to those of other time bins. We also moderated the impact of variation using LOWESS (10% to 30% of the data points were used for fitting the trend lines) to determine topical trends for Fig 3. Thus, the influence of the noise introduced via our binning strategy is expected to be minimal.

Topic categories and ordering

The topics were classified into 5 categories with contrasting trends: stable, early, transitional, sigmoidal, and rising. To define which category a topic belongs to, the frequency of documents over time bins for each topic was analyzed using 3 regression methods. We first tried 2 forecasting methods: recursive autoregressor (the ForecasterAutoreg class in the skforecast package) and autoregressive integrated moving average (ARIMA implemented in the pmdarima package). In both cases, the forecasting results did not clearly follow the expected trend lines, likely due to the low numbers of data points (relative frequency values), which resulted in the need to extensively impute missing data. Thus, as a third approach, we sought to fit the trendlines with the data points using LOWESS (implemented in the statsmodels package) and applied additional criteria for assigning topics to categories. When fitting with LOWESS, 3 fraction parameters (frac, the fraction of the data used when estimating each y-value) were evaluated (0.1, 0.2, 0.3). While frac = 0.3 had the smallest errors for most topics, in situations where there were outliers, frac = 0.2 or 0.1 was chosen to minimize mean squared errors ( S7 Data ).

The topics were classified into 5 categories based on the slopes of the fitted line over time: (1) stable: topics with near-zero slopes over time; (2) early: topics with negative (<−0.5) slopes throughout (with the exception of topic 78, which declined early on but bounced back by the late 1990s); (3) transitional: early positive (>0.5) slopes followed by negative slopes at later time points; (4) sigmoidal: early positive slopes followed by zero slopes at later time points; and (5) rising: continuously positive slopes. For each topic, the LOWESS fits were also used to determine when the relative document frequency reached its peak, when it first reached a threshold of 0.6 (chosen after trial and error for a range of 0.3 to 0.9), and the overall trend. The topics were then ordered based on (1) whether they belonged to the stable category or not; (2) whether the trends were decreasing, stable, or increasing; (3) the time the relative document frequency first reached 0.6; and (4) the time that the overall peak was reached (S8 Data).
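Two of the ordering criteria, the peak time and the time the fit first reaches the 0.6 threshold, can be sketched as follows. The function name is hypothetical, and the threshold is assumed to be a fraction of the peak value of the fitted curve, as the S8 Data description suggests.

```python
def peak_and_first_reach(fitted, threshold=0.6):
    """Return (peak_idx, first_idx) for a list of LOWESS-fitted frequencies.

    peak_idx:  time-bin index where the fitted frequency is highest.
    first_idx: first time-bin index where the fitted frequency reaches
               threshold * peak value (assumed interpretation).
    """
    peak_idx = max(range(len(fitted)), key=fitted.__getitem__)
    cutoff = threshold * fitted[peak_idx]
    first_idx = next(i for i, value in enumerate(fitted) if value >= cutoff)
    return peak_idx, first_idx


# Toy fitted curve over 6 time bins (hypothetical values).
toy_fit = [0.1, 0.2, 0.5, 0.9, 1.0, 0.7]
peak_idx, first_idx = peak_and_first_reach(toy_fit)
print(peak_idx, first_idx)
```

Sorting topics by `first_idx` and then `peak_idx` would place topics that rose early ahead of those that rose late, in the spirit of criteria (3) and (4).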

Taxa information

To identify a taxon or taxa in all plant science records, NCBI Taxonomy taxdump datasets were downloaded from the NCBI FTP site (https://ftp.ncbi.nlm.nih.gov/pub/taxonomy/new_taxdump/) on September 20, 2022. The highest-level taxon was Viridiplantae, and all its child taxa were parsed and used as queries in searches against the plant science corpus. In addition, a species-over-time analysis was conducted using the same time bins as used for dynamic topic models. The numbers of records in different time bins for top taxa are in the genus, family, order, and additional species level sheets in S9 Data. The degree of over-/underrepresentation of a taxon X in a research topic T was assessed using the p-value of a Fisher’s exact test for a 2 × 2 table consisting of the numbers of records in both X and T, in X but not T, in T but not X, and in neither (S10 Data).
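For readers unfamiliar with the test, a two-sided Fisher's exact test on a 2 × 2 table can be written with the standard library. This is a teaching sketch with a hypothetical function name; it is practical only for small counts, and for corpus-scale tables a library routine such as `scipy.stats.fisher_exact` would be the usual choice.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    With taxon X and topic T: a = in X and T, b = in X but not T,
    c = in T but not X, d = in neither.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x records in the top-left cell.
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - (n - row1))
    hi = min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p_value, 4))
```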

For analysis of plant taxa with genome information, genome data of taxa in Viridiplantae were obtained from the NCBI Genome data-hub ( https://www.ncbi.nlm.nih.gov/data-hub/genome ) on October 28, 2022. There were 2,384 plant genome assemblies belonging to 1,231 species in 559 genera (genome assembly sheet, S9 Data ). The date of the assembly was used as a proxy for the time when a genome was sequenced. However, some species have updated assemblies and have more recent data than when the genome first became available.

Taxa being studied in the plant science records

Flowering plants (Magnoliopsida) are found in 93% of records, while most other lineages are discussed in <1% of records, with conifers and related species being exceptions (Acrogymnospermae, 3.5%, S6A Fig). At the family level, the mustard (Brassicaceae), grass (Poaceae), pea (Fabaceae), and nightshade (Solanaceae) families are in 51% of records (S6B Fig). The prominence of the mustard family in plant science research is due to the Brassica and Arabidopsis genera (Fig 4A). When examining the prevalence of taxa being studied over time, clear patterns of turnover emerged (Figs 4B, S6C, and S6D). While the study of monocot species (Liliopsida) has remained steady, there was a significant uptick in the prevalence of eudicot (eudicotyledon) records in the late 1990s (S6C Fig), which can be attributed to the increased number of studies in the mustard, myrtle (Myrtaceae), and mint (Lamiaceae) families among others (S6D Fig). At the genus level, records mentioning Gossypium (cotton), Phaseolus (bean), Hordeum (barley), and Zea (corn), similar to the topics in the early category, were prevalent until the 1980s or 1990s but have mostly decreased in number since (Fig 4B). In contrast, Capsicum, Arabidopsis, Oryza, Vitis, and Solanum research has become more prevalent over the last 20 years.

Geographical information for the plant science corpus

The geographical information (country) of authors in the plant science corpus was obtained from the address (AD) fields of first authors in Medline XML records accessible through the NCBI E-utilities API (https://www.ncbi.nlm.nih.gov/books/NBK25501/). Because only first author affiliations are available for records published before December 2014, only the first author’s location was considered to ensure consistency between records before and after that date. Among the 421,658 records in the plant science corpus, 421,585 had Medline records and 421,276 had unique PMIDs. Among the records with unique PMIDs, 401,807 contained address fields. For each of the remaining records, the AD field content was split into tokens with a “,” delimiter, and the token likely containing geographical information (referred to as the location token) was selected as either the last token or the second to last token if the last token contained “@” indicating the presence of an email address. Because of the inconsistency in how geographical information was described in the location tokens (e.g., country, state, city, zip code, name of institution, and different combinations of the above), the following 4 approaches were used to convert location tokens into countries.
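The location-token rule described above is simple enough to sketch directly. The function name and the example address strings are hypothetical; only the comma split and the "@" check come from the text.

```python
def location_token(ad_field):
    """Pick the token of an address (AD) field most likely to hold the location.

    The field is split on commas; the last token is used unless it contains
    "@" (an email address), in which case the second to last token is used.
    """
    tokens = [token.strip() for token in ad_field.split(",")]
    if len(tokens) > 1 and "@" in tokens[-1]:
        return tokens[-2]
    return tokens[-1]


# Hypothetical address strings for illustration.
plain = location_token("Department of Plant Biology, Example University, East Lansing, MI, USA")
with_email = location_token("Institute of Botany, Example University, Kyoto, Japan, author@example.jp")
print(plain, "|", with_email)
```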

The first approach was a brute force search where full names and alpha-3 codes of current countries (ISO 3166–1), current country subregions (ISO 3166–2), and historical countries (i.e., countries that no longer exist, ISO 3166–3) were used to search the address fields. To reduce false positives using alpha-3 codes, a space prior to each code was required for the match. The first approach allowed the identification of 361,242, 16,573, and 279,839 records with current country, historical country, and subregion information, respectively. The second approach was the use of a heuristic based on common address field structures to identify “location strings” toward the end of address fields that likely represent countries, then the use of the Python pycountry module to confirm the presence of country information. This approach led to 329,025 records with country information. The third approach was to parse first author email addresses (90,799 records), recover top-level domain information, and use country code Top Level Domain (ccTLD) data from the ISO 3166 Wikipedia page to define countries (72,640 records). Only a subset of email addresses contains country information because some are from companies (.com), nonprofit organizations (.org), and others. Because a large number of records with address fields still did not have country information after taking the above 3 approaches, a fourth approach was implemented to query address fields against a locally installed Nominatim server (v.4.2.3, https://github.com/mediagis/nominatim-docker) using OpenStreetMap data from GEOFABRIK (https://www.geofabrik.de/) to find locations. Initial testing indicated that the use of full address strings led to false positives, and the computing resource requirement for running the server was high. Thus, only location strings from the second approach that did not lead to country information were used as queries. Because multiple potential matches were returned for each query, the results were sorted based on their location importance values. The above steps led to an additional 72,401 records with country information.
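Two of the rules above, the alpha-3 search that requires a preceding space and the email ccTLD lookup, can be sketched as follows. The lookup tables are hypothetical and deliberately tiny (the study drew on ISO 3166 data), and the function names are illustrative only.

```python
# Hypothetical, deliberately tiny lookup tables for illustration.
ALPHA3_CODES = {"USA", "JPN", "DEU", "CHN"}
CCTLD_TO_ALPHA3 = {"jp": "JPN", "de": "DEU", "cn": "CHN"}


def alpha3_matches(address):
    """Return alpha-3 codes found in the address with a space directly before them."""
    return [code for code in sorted(ALPHA3_CODES) if f" {code}" in address]


def country_from_email(email):
    """Map the top-level domain of an email address to a country, if it is a known ccTLD."""
    tld = email.rsplit(".", 1)[-1].strip().lower()
    return CCTLD_TO_ALPHA3.get(tld)


hit = alpha3_matches("East Lansing, MI 48824, USA")
miss = alpha3_matches("JUSAN Research Institute")  # "USA" without a leading space
from_cctld = country_from_email("author@bio.example.jp")
from_generic = country_from_email("contact@example.com")
print(hit, miss, from_cctld, from_generic)
```

Generic domains such as .com return no country here, which mirrors why only a subset of email addresses yielded country information.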

Examining the overlap in country information between approaches revealed that brute force current country and pycountry searches were consistent 97.1% of the time. In addition, both approaches had high consistency with the email-based approach (92.4% and 93.9%). However, brute force subregion and Nominatim-based predictions had the lowest consistencies with the above 3 approaches (39.8% to 47.9%) and each other. Thus, a record’s country information was finalized if the information was consistent between any 2 approaches, except between the brute force subregion and Nominatim searches. This led to 330,328 records with country information.
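The finalization rule, agreement between any 2 approaches except the brute force subregion and Nominatim pair, can be expressed compactly. The approach labels and the function name are hypothetical, and returning the first agreeing pair is a simplification for cases where different pairs agree on different countries.

```python
from itertools import combinations

# The one pair of approaches whose agreement alone is not trusted.
EXCLUDED_PAIR = frozenset({"brute_subregion", "nominatim"})


def consensus_country(predictions):
    """Return a country if any allowed pair of approaches agrees, else None.

    predictions: dict mapping an approach label to a country code or None.
    """
    for first, second in combinations(sorted(predictions), 2):
        if frozenset({first, second}) == EXCLUDED_PAIR:
            continue
        country = predictions[first]
        if country is not None and country == predictions[second]:
            return country
    return None


accepted = consensus_country({
    "brute_current": "USA", "pycountry": "USA", "email": None,
    "brute_subregion": "CAN", "nominatim": "CAN",
})
rejected = consensus_country({
    "brute_subregion": "CAN", "nominatim": "CAN", "email": None,
})
print(accepted, rejected)
```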

Topical and country impact metrics

To determine annual country impact, impact scores were determined in the same way as that for annual topical impact, except that values for different countries were calculated instead of topics (S11 Data).

Topical preferences by country

To determine topical preference for a country C, a 2 × 2 table was established with the number of records in topic T from C, the number of records in T but not from C, the number of non-T records from C, and the number of non-T records not from C. A Fisher’s exact test was performed for each T and C combination, and the resulting p-values were corrected for multiple testing with the Benjamini–Hochberg method (see S12 Data). The preference of T in C was defined as the degree of enrichment calculated as the log likelihood ratio of values in the 2 × 2 table. Topic 5 was excluded because >50% of the countries did not have records for this topic.
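The enrichment score and the multiple-testing correction can be sketched with the standard library. The log likelihood ratio follows the definition given for S12 Data, log((a/b)/(c/d)); the natural logarithm is an assumption because the base is not stated, and the function names and toy numbers are hypothetical. The Benjamini–Hochberg routine is the standard step-up adjustment.

```python
from math import log


def log_likelihood_ratio(a, b, c, d):
    """log((a/b)/(c/d)) for country C and topic T.

    a = from C in T, b = from C not in T,
    c = not from C in T, d = not from C and not in T.
    Natural log is assumed; zero counts are not handled in this sketch.
    """
    return log((a / b) / (c / d))


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


llr = log_likelihood_ratio(20, 80, 10, 90)   # toy counts
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
print(round(llr, 3), [round(p, 3) for p in adjusted])
```

A positive score indicates that the topic is overrepresented among the country's records and a negative score indicates underrepresentation.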

The top 10 countries could be classified into a China–India cluster, an Italy–Spain cluster, and remaining countries (yellow rectangles, Fig 5E ). The clustering of Italy and Spain is partly due to similar research focusing on allergens (topic 0) and mycotoxins (topic 54) and less emphasis on gene family (topic 23) and stress tolerance (topic 28) studies ( Figs 5F and S9 ). There are also substantial differences in topical focus between countries. For example, plant science records from China tend to be enriched in hyperspectral imaging and modeling (topic 9), gene family studies (topic 23), stress biology (topic 28), and research on new plant compounds associated with herbal medicine (topic 69), but less emphasis on population genetics and evolution (topic 86, Fig 5F ). In the US, there is a strong focus on insect pest resistance (topic 75), climate, community, and diversity (topic 83), and population genetics and evolution but less focus on new plant compounds. In summary, in addition to revealing how plant science research has evolved over time, topic modeling provides additional insights into differences in research foci among different countries.

Supporting information

S1 Fig. Plant science record classification model performance.

(A–C) Distributions of prediction probabilities (y_prob) of (A) positive instances (plant science records), (B) negative instances (non-plant science records), and (C) positive instances with the Medical Subject Heading “Plants” (ID = D010944). The data are color coded in blue and orange if they are correctly and incorrectly predicted, respectively. The lower subfigures contain log10-transformed x axes for the same distributions as the top subfigure for better visualization of incorrect predictions. (D) Prediction probability distribution for candidate plant science records. Prediction probabilities plotted here are available in S13 Data .

https://doi.org/10.1371/journal.pbio.3002612.s001

S2 Fig. Relationships between outlier clusters and the 90 topics.

(A) Heatmap demonstrating that some outlier clusters tend to have high prediction scores for multiple topics. Each cell shows the average prediction score of a topic for records in an outlier cluster. (B) Size of outlier clusters.

https://doi.org/10.1371/journal.pbio.3002612.s002

S3 Fig. Cosine similarities between topics.

(A) Heatmap showing cosine similarities between topic pairs. Top-left: hierarchical clustering of the cosine similarity matrix using the Ward algorithm. The branches are colored to indicate groups of related topics. (B) Topic labels and names. The topic ordering was based on hierarchical clustering of topics. Colored rectangles: neighboring topics with >0.5 cosine similarities.

https://doi.org/10.1371/journal.pbio.3002612.s003

S4 Fig. Relative topical diversity for 20 journals.

The 20 journals with the most plant science records are shown. The journal names were taken from the journal list in PubMed ( https://www.nlm.nih.gov/bsd/serfile_addedinfo.html ).

https://doi.org/10.1371/journal.pbio.3002612.s004

S5 Fig. Topical frequency and top terms during different time periods.

(A-D) Different patterns of topical frequency distributions for example topics (A) 48, (B) 35, (C) 27, and (D) 42. For each topic, the top graph shows the frequency of topical records in each time bin, which are the same as those in Fig 3 (green line), and the end date for each bin is indicated. The heatmap below each line plot depicts whether a term is among the top terms in a time bin (yellow) or not (blue). Blue dotted lines delineate different decades (see S5 Data for the original frequencies, S6 Data for the LOWESS fitted frequencies and the top terms for different topics/time bins).

https://doi.org/10.1371/journal.pbio.3002612.s005

S6 Fig. Prevalence of records mentioning different taxonomic groups in Viridiplantae.

(A, B) Percentage of records mentioning specific taxa at the (A) major lineage and (B) family levels. (C, D) The prevalence of taxon mentions over time at the (C) major lineage and (D) family levels. The data used for plotting are available in S9 Data.

https://doi.org/10.1371/journal.pbio.3002612.s006

S7 Fig. Changes over time.

(A) Number of genera being mentioned in plant science records during different time bins (the date indicates the end date of that bin, exclusive). (B) Numbers of genera (blue) and organisms (salmon) with draft genomes available from the National Center for Biotechnology Information in different years. (C) Percentage of US National Science Foundation (NSF) grants mentioning the genus Arabidopsis over time with peak percentage and year indicated. The data for (A–C) are in S9 Data. (D) Number of plant science records in the top 17 plant science journals from the USA (red), Great Britain (GBR) (orange), India (IND) (light green), and China (CHN) (dark green) normalized against the total numbers of publications of each country over time in these 17 journals. The data used for plotting can be found in S11 Data.

https://doi.org/10.1371/journal.pbio.3002612.s007

S8 Fig. Change in country impact on plant science over time.

(A, B) Difference in 2 impact metrics from 1999 to 2020 for the 10 countries with the highest number of plant science records. (A) H-index. (B) SCImago Journal Rank (SJR). (C, D) Plots show the relationships between the impact metrics (H-index in (C) , SJR in (D) ) averaged from 1999 to 2020 and the slopes of linear fits with years as the predictive variable and impact metric as the response variable for different countries (A3 country codes shown). The countries with >400 records and with <10% missing impact values are included. The data used for plotting can be found in S11 Data .

https://doi.org/10.1371/journal.pbio.3002612.s008

S9 Fig. Country topical preference.

Enrichment scores (LLR, log likelihood ratio) of topics for each of the top 10 countries. Red: overrepresentation, blue: underrepresentation. The data for plotting can be found in S12 Data .

https://doi.org/10.1371/journal.pbio.3002612.s009

S1 Data. Summary of source journals for plant science records, prediction models, and top Tf-Idf features.

Sheet–Candidate plant sci record j counts: Number of records from each journal in the candidate plant science corpus (before classification). Sheet—Plant sci record j count: Number of records from each journal in the plant science corpus (after classification). Sheet–Model summary: Model type, text used (txt_flag), and model parameters used. Sheet—Model performance: Performance of different model and parameter combinations on the validation data set. Sheet–Tf-Idf features: The average SHAP values of Tf-Idf (Term frequency-Inverse document frequency) features associated with different terms. Sheet–PubMed number per year: The data for PubMed records in Fig 1A . Sheet–Plant sci record num per yr: The data for the plant science records in Fig 1A .

https://doi.org/10.1371/journal.pbio.3002612.s010

S2 Data. Numbers of records in topics identified from preliminary topic models.

Sheet–Topics generated with a model based on BioBERT embeddings. Sheet–Topics generated with a model based on distilBERT embeddings. Sheet–Topics generated with a model based on SciBERT embeddings.

https://doi.org/10.1371/journal.pbio.3002612.s011

S3 Data. Final topic model labels and top terms for topics.

Sheet–Topic label: The topic index and top 10 terms with the highest c-Tf-Idf values. Sheets–0 to 89: The top 50 terms and their c-Tf-Idf values for topics 0 to 89.

https://doi.org/10.1371/journal.pbio.3002612.s012

S4 Data. UMAP representations of different topics.

For a topic T , records in the UMAP graph are colored red and records not in T are colored gray.

https://doi.org/10.1371/journal.pbio.3002612.s013

S5 Data. Temporal relationships between published documents projected onto 2D space.

The 2D embedding generated with UMAP was used to plot document relationships for each year. The plots from 1975 to 2020 were compiled into an animation.

https://doi.org/10.1371/journal.pbio.3002612.s014

S6 Data. Timestamps and dates for dynamic topic modeling.

Sheet–bin_timestamp: Columns are: (1) order index; (2) bin_idx–relative positions of bin labels; (3) bin_timestamp–UNIX time in seconds; and (4) bin_date–month/day/year. Sheet–Topic frequency per timestamp: The number of documents in each time bin for each topic. Sheets–LOWESS fit 0.1/0.2/0.3: Topic frequency per timestamp fitted with the fraction parameter of 0.1, 0.2, or 0.3. Sheet—Topic top terms: The top 5 terms for each topic in each time bin.

https://doi.org/10.1371/journal.pbio.3002612.s015

S7 Data. Locally weighted scatterplot smoothing (LOWESS) of topical document frequencies over time.

There are 90 scatter plots, one for each topic, where the x axis is time, and the y axis is the document frequency (blue dots). The LOWESS fit is shown as orange points connected with a green line. The category a topic belongs to and its order in Fig 3 are labeled on the top left corner. The data used for plotting are in S6 Data .

https://doi.org/10.1371/journal.pbio.3002612.s016

S8 Data. The 4 criteria used for sorting topics.

Peak: the time when the LOWESS fit of the frequencies of a topic reaches maximum. 1st_reach_thr: the time when the LOWESS fit first reaches a threshold of 60% maximal frequency (peak value). Trend: upward (1), no change (0), or downward (−1). Stable: whether a topic belongs to the stable category (1) or not (0).

https://doi.org/10.1371/journal.pbio.3002612.s017

S9 Data. Change in taxon record numbers and genome assemblies available over time.

Sheet–Genus: Number of records mentioning a genus during different time periods (in Unix timestamp) for the top 100 genera. Sheet–Family: Number of records mentioning a family during different time periods (in Unix timestamp) for the top 100 families. Sheet–Order: Number of records mentioning an order during different time periods (in Unix timestamp) for the top 20 orders. Sheet–Species levels: Number of records mentioning 12 selected taxonomic levels higher than the order level during different time periods (in Unix timestamp). Sheet–Genome assembly: Plant genome assemblies available from NCBI as of October 28, 2022. Sheet–Arabidopsis NSF: Absolute and normalized numbers of US National Science Foundation funded proposals mentioning Arabidopsis in proposal titles and/or abstracts.

https://doi.org/10.1371/journal.pbio.3002612.s018

S10 Data. Taxon topical preference.

Sheet– 5 genera LLR: The log likelihood ratio of each topic in each of the top 5 genera with the highest numbers of plant science records. Sheets– 5 genera: For each genus, the columns are: (1) topic; (2) the Fisher’s exact test p -value (Pvalue); (3–6) numbers of records in topic T and in genus X (n_inT_inX), in T but not in X (n_inT_niX), not in T but in X (n_niT_inX), and not in T and X (n_niT_niX) that were used to construct 2 × 2 tables for the tests; and (7) the log likelihood ratio generated with the 2 × 2 tables. Sheet–corrected p -value: The 4 values for generating LLRs were used to conduct Fisher’s exact test. The p -values obtained for each country were corrected for multiple testing.

https://doi.org/10.1371/journal.pbio.3002612.s019

S11 Data. Impact metrics of countries in different years.

Sheet–country_top25_year_count: number of total publications and publications per year from the top 25 countries with the most plant science records. Sheet—country_top25_year_top17j: number of total publications and publications per year from the top 25 countries with the highest numbers of plant science records in the 17 plant science journals used as positive examples. Sheet–prank: Journal percentile rank scores for countries (3-letter country codes following https://www.iban.com/country-codes ) in different years from 1999 to 2020. Sheet–sjr: Scimago Journal rank scores. Sheet–hidx: H-Index scores. Sheet–cite: Citation scores.

https://doi.org/10.1371/journal.pbio.3002612.s020

S12 Data. Topical enrichment for the top 10 countries with the highest numbers of plant science publications.

Sheet—Log likelihood ratio: For each country C and topic T, it is defined as log((a/b)/(c/d)) where a is the number of papers from C in T, b is the number from C but not in T, c is the number not from C but in T, d is the number not from C and not in T. Sheet: corrected p -value: The 4 values, a, b, c, and d, were used to conduct Fisher’s exact test. The p -values obtained for each country were corrected for multiple testing.

https://doi.org/10.1371/journal.pbio.3002612.s021

S13 Data. Text classification prediction probabilities.

This compressed file contains the PubMed ID (PMID) and the prediction probabilities (y_pred) of testing data with both positive and negative examples (pred_prob_testing), plant science candidate records with the MeSH term “Plants” (pred_prob_candidates_with_mesh), and all plant science candidate records (pred_prob_candidates_all). The prediction probability was generated using the Word2Vec text classification models for distinguishing positive (plant science) and negative (non-plant science) records.

https://doi.org/10.1371/journal.pbio.3002612.s022

Acknowledgments

We thank Maarten Grootendorst for discussions on topic modeling. We also thank Stacey Harmer, Eva Farre, Ning Jiang, and Robert Last for discussion on their respective research fields and input on how to improve this study and Rudiger Simon for the suggestion to examine differences between countries. We also thank Mae Milton, Christina King, Edmond Anderson, Jingyao Tang, Brianna Brown, Kenia Segura Abá, Eleanor Siler, Thilanka Ranaweera, Huan Chen, Rajneesh Singhal, Paulo Izquierdo, Jyothi Kumar, Daniel Shiu, Elliott Shiu, and Wiggler Catt for their good ideas, personal and professional support, collegiality, fun at parties, as well as the trouble they have caused, which helped us improve as researchers, teachers, mentors, and parents.

  • 2. Blei DM, Lafferty JD. Topic Models. In: Srivastava A, Sahami M, editors. Text Mining. Cambridge: Chapman and Hall/CRC; 2009. pp. 71–93.
  • 7. ChatGPT. [cited 2023 Aug 25]. Available from: https://chat.openai.com
  • 9. Fei-Fei L, Perona P. A Bayesian hierarchical model for learning natural scene categories. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05); 2005. pp. 524–531 vol. 2. https://doi.org/10.1109/CVPR.2005.16
  • 19. Blei DM, Lafferty JD. Dynamic topic models. Proceedings of the 23rd International Conference on Machine learning. New York, NY, USA: Association for Computing Machinery; 2006. pp. 113–120. https://doi.org/10.1145/1143844.1143859
  • 35. Kuhn T. The Structure of Scientific Revolutions. Chicago: University of Chicago Press; 1962.
  • 36. CiteSeer | Proceedings of the second international conference on Autonomous agents. [cited 2023 Aug 23]. Available from: https://dl.acm.org/doi/10.1145/280765.280786
  • 39. Chen T, Guestrin C. XGBoost: A Scalable Tree Boosting System. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. New York, NY, USA: ACM; 2016. pp. 785–794. https://doi.org/10.1145/2939672.2939785
  • 40. Řehůřek R, Sojka P. Software Framework for Topic Modelling with Large Corpora. Proceedings of the LREC 2010 Workshop on New Challenges for NLP Frameworks. Valletta, Malta: ELRA; 2010. pp. 45–50.
  • 42. Hugging Face–The AI community building the future. 2023 Aug 19 [cited 2023 Aug 25]. Available from: https://huggingface.co/

A new ‘rule of biology’ may have come to light, expanding insight into evolution and aging

A molecular biologist at the USC Dornsife College of Letters, Arts and Sciences may have found a new “rule of biology.”

A rule of biology, sometimes called a biological law, describes a recognized pattern or truism among living organisms. Allen’s rule, for example, states that among warm-blooded animals, those found in colder areas have shorter, thicker limbs (to conserve body heat) than those in hotter regions, which need more body surface area to dissipate heat.

Zoologist Joel Allen formulated this idea in 1877, and though he wasn’t the first or the last to present a rule of biology, his is one of just a handful to gain acceptance among scientists.

Now, John Tower, professor of biological sciences at USC Dornsife, believes he has uncovered another rule of biology. He published his idea on May 16 in the journal Frontiers in Aging.

Life may require instability

Tower’s rule challenges long-held notions that most living organisms prefer stability over instability because stability requires less energy and fewer resources. For instance, hexagons appear frequently in nature — think honeycomb and insect eyes — because they are stable and require the least amount of material to cover a surface.

Tower centers his rule on instability, specifically a concept called “selectively advantageous instability,” or SAI, in which some volatility in biological components, such as proteins and genetic material, provides an advantage to cells.

Tower believes SAI is a fundamental part of biology. “Even the simplest cells contain proteases and nucleases and regularly degrade and replace their proteins and RNAs, indicating that SAI is essential for life,” he explains.

He says SAI also plays a key role in evolution.

As cells go about their business, building and degrading various unstable components, he explains, they will exist in one of two states — one state with an unstable component present and one state in which the unstable component is absent.

Natural selection may act differently on the two cell states. “This can favor the maintenance of both a normal gene and a gene mutation in the same cell population, if the normal gene is favorable in one cell state and the gene mutation is favorable in the other cell state,” he says. Allowing this genetic diversity can make cells and organisms more adaptable.

SAI may be at the root of aging — and more

Selectively advantageous instability may also contribute to aging. Creating and then replacing the unstable component within cells comes at the cost of materials and energy. Breaking it down may also require additional energy.

Also, since SAI sets up two potential states for a cell, allowing normal and mutated genes to co-exist, if the mutated gene is harmful, this may contribute to aging, Tower says.

In addition to evolution and aging, SAI has other far-reaching implications.

“Science has been fascinated lately with concepts such as chaos theory, criticality, Turing patterns and ‘cellular consciousness,’” says Tower. “Research in the field suggests that SAI plays an important role in producing each of these phenomena.”

Because of its apparent ubiquity in biology and its far-reaching implications, SAI may be the newest rule of biology, he says.

About the research

The research was supported by National Institute on Aging grant R01AG057741.

The Biology Corner

Biology Teaching Resources

Case Study – A Tiny Heart (old version)

This case study was revised in 2023; a new version is available.

This case study focuses on a baby boy who was born with a problem with his heart.  The story is based on a real scenario, though some of the names have been changed, and the parents gave permission to include photos of the infant.

Students will read about symptoms that occur when a baby is born with stenosis, or a narrowing of the artery. Students consider treatment options and compare the circulation of a fetus to that of an adult. Finally, the case describes the Ross procedure, in which a valve from the pulmonary artery is moved to the aorta.

This case study was made for a high school anatomy class, and may not be appropriate for younger audiences.  Students should have already completed the chapter on the circulatory system and have a strong foundation in how the circulatory system works.  Case studies are designed to be completed in small groups so that students can have discussions and help each other with difficult vocabulary.

HS-LS1-2 Develop and use a model to illustrate the hierarchical organization of interacting systems that provide specific functions within multicellular organisms

Shannan Muskopf

  • Open access
  • Published: 25 May 2024

Phylogenetic placement of the monotypic Baolia (Amaranthaceae s.l.) based on morphological and molecular evidence

  • Shuai Liu 1 ,
  • Marie Claire Veranso-Libalah 2 ,
  • Alexander P. Sukhorukov 3 , 4 ,
  • Xuegang Sun 5 ,
  • Maya V. Nilova 3 ,
  • Maria Kushunina 4 , 6 ,
  • Jannathan Mamut 1 &
  • Zhibin Wen 7 , 8 , 9 , 10  

BMC Plant Biology volume 24, Article number: 456 (2024)

Baolia H.W.Kung & G.L.Chu is a monotypic genus known only from Diebu County, Gansu Province, China. Its systematic position is contested, and its morphoanatomical characters deviate from those of all other Chenopodiaceae. A recent study regarded Baolia as a sister group to Corispermoideae. We therefore sequenced and compared the chloroplast genomes of this species and resolved its phylogenetic position based on both chloroplast genomes and marker sequences.

We sequenced 18 chloroplast genomes: 16 samples from two populations of Baolia bracteata and two Corispermum species. The Baolia genomes ranged in size from 152,499 to 152,508 bp. Simple sequence repeats (SSRs) were primarily located in the LSC region of the Baolia chloroplast genomes, and most of them consisted of single-nucleotide A/T repeats. Notably, the types and numbers of SSRs differed between the two populations of B. bracteata. Our phylogenetic analysis, based on both complete chloroplast genomes from 33 species and a combination of three markers (ITS, rbcL, and matK) from 91 species, revealed that Baolia and Corispermoideae (Agriophyllum, Anthochlamys, and Corispermum) form a well-supported clade that is sister to Acroglochin. According to our molecular dating results, a major divergence event between Acroglochin, Baolia, and Corispermeae occurred during the Middle Eocene, approximately 44.49 mya. Ancestral state reconstruction showed that Baolia shares symplesiomorphies with core Corispermoideae, including pericarp and seed coat characters.

Conclusions

Comparing the chloroplast genomes of B. bracteata with those of eleven typical Chenopodioideae and Corispermoideae species, we observed high overall similarity and one notable inversion of approximately 3,100 bp, found only in two Atriplex and four Chenopodium species. We suggest that Corispermoideae should be considered in a broader sense that includes Corispermeae (core Corispermoideae: Agriophyllum, Anthochlamys, and Corispermum) as well as two new monotypic tribes, Acroglochineae (Acroglochin) and Baolieae (Baolia).

The family Chenopodiaceae (Amaranthaceae s.l.), with approximately 110 genera and 1700 species, is a large clade divided into seven subfamilies: Betoideae, Camphorosmoideae, Chenopodioideae, Corispermoideae, Salicornioideae, Salsoloideae, and Suaedoideae [1]. These major groups within Chenopodiaceae s.s. are separated by a set of morphoanatomical characteristics that make them visually distinguishable, as demonstrated by several studies [2, 3, 4]. Over the last two decades, numerous molecular phylogenetic studies have significantly enhanced our understanding of relationships within each subfamily. For example, Hohmann et al. [5] explored Betoideae, in which they included Beta, Hablitzia, Patellifolia, Oreobliton, and Aphanisma. The origins of the Oreobliton and Aphanisma species showed an evolution towards drier habitats. Kadereit and Freitag [6] examined the relationship between Camphorosmoideae and Salsoloideae and provided a revised classification of Camphorosmoideae including Camphorosmeae, as well as descriptions of the new genera Spirobassia, Eokochia, Grubovia and Sedobassia. Fuentes-Bazan et al. [7, 8], Sukhorukov et al. [9], and Uotila et al. [10] suggested that Chenopodioideae can be divided into Anserineae, Axyrideae, Dysphanieae, and Chenopodieae (incl. Atripliceae). Shepherd et al. [11] and Kadereit et al. [12] provided insight into Salicornioideae and clarified the relationships within the clade Sarcocornia + Salicornia (Salicornia s.l.), especially among the Australian members. Additionally, Akhani et al. [13] and Wen et al. [14] focused on Salsoloideae, greatly clarifying the phylogenetic positions of its members and dividing them into Salsoleae and Caroxyleae. Schütze et al. [15] studied Suaedoideae, which led to the merger of Alexandra and Borszczowia into Suaeda.

A monotypic genus Baolia H.W.Kung & G.L.Chu, discovered only a few decades ago [ 16 ], remained enigmatic for a long time due to its limited distribution in Central China with only one collection from the type locality in Diebu [Têwo] county, Gansu province. Recently, Baolia bracteata H.W.Kung & G.L.Chu was rediscovered 15 km east from the type locality and included in a phylogenetic analysis using nuclear (nrITS) and two chloroplast markers ( rbcL and atpB - rbcL ) [ 17 ]. This analysis resolved it as a sister group to Corispermoideae, which includes Corispermum L., Agriophyllum M.Bieb., and Anthochlamys Fenzl [ 17 ]. Despite their close phylogenetic positions, Baolia and Corispermoideae exhibit high heterogeneity in morphological characteristics [ 16 , 18 , 19 ].

Unlike gene fragments, complete chloroplast genomes encompass a greater amount of genetic information and mutation sites. These attributes prove advantageous in various aspects including phylogenetic analysis, assessment of genetic diversity, and plant molecular identification [ 20 , 21 ]. Until now, chloroplast genomes from only a limited number of Chenopodiaceae species have been deposited in GenBank ( https://www.ncbi.nlm.nih.gov/sra ). However, numerous genera within the family still lack representation, and the prospect of establishing a comprehensive phylogeny based on complete plastomes of Chenopodiaceae s.s. remains a distant goal. To address this issue, a solution lies in leveraging the multitude of sequences amassed from molecular phylogenetic investigations of Chenopodiaceae over the years, which could provide a more comprehensive and in-depth sampling.

Consequently, this study aims to generate new sequences (nuclear ribosomal ITS and two plastid loci rbcL and matK ) to complement available GenBank sequences and resolve phylogenetic relationships between Baolia and closely related taxa. Furthermore, the placement of Acroglochin warrants thorough discussion. In a recent study [ 17 ], this genus was found to be a sister to the ‘ Baolia  + Corispermoideae’ clade. Considering the previously proposed phylogenetic position of Acroglochin either within Betoideae [ 1 ] or in close proximity to Corispermum [ 5 , 22 , 23 ], a reevaluation becomes imperative.

Using new and previously generated molecular data, our objectives were as follows: (1) to scrutinize variations in the structure and composition of chloroplast genomes in two Baolia populations, while conducting a comparative analysis with eleven typical Chenopodioideae and Corispermoideae species; (2) to elucidate the phylogenetic relationships between Acroglochin , Baolia , and Corispermoideae; (3) to evaluate and reconstruct ancestral states of significant morphoanatomical traits.

Genome structural variation

The chloroplast genome of Baolia bracteata exhibited the typical quadripartite circular structure, comprising the LSC (86,140–86,146 bp), SSC (20,118–20,127 bp) and two IR regions (23,118 bp each) (Additional file 1: Fig. S1). The length of the plastid genome ranged from 152,499 to 152,508 bp (Additional file 2: Table S1). In contrast to the LSC and SSC regions, the variation in length of the IR region was relatively small. The GC content of the whole genome, LSC, SSC, and IRs was 36.7%, 34.5%, 30.7%, and 43.4%, respectively. The distribution of GC content was uneven throughout the whole chloroplast genome (Additional file 2: Table S1). The chloroplast genome comprised a total of 131 genes, including 86 protein-coding genes, 38 tRNA genes, and 8 rRNA genes. Additionally, rpl23 and rps19 were identified as pseudogenes (Additional file 2: Table S2). Among these 131 genes, 14 contained one intron (atpF, ndhA, ndhB, petB, petD, rpl2, rpl16, rpoC1, trnI-GAU, trnG-GCC, trnL-UAA, trnV-UAC, trnA-UGC, and trnK-UUU), while 3 contained two introns (rps12, ycf3, and clpP) (Additional file 2: Table S3).
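
As a quick consistency check on the region lengths above, the total plastome length can be recomputed as LSC + SSC + 2 × IR. The sketch below is illustrative only: the function name and the (min, max) tuples are our own, and it assumes the reported IR length applies to each of the two copies.

```python
# Region lengths (bp) reported for Baolia bracteata, stored as (min, max).
LSC = (86_140, 86_146)
SSC = (20_118, 20_127)
IR = (23_118, 23_118)  # assumed: length of EACH of the two inverted repeats


def plastome_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a quadripartite plastome: LSC + SSC + two IR copies."""
    return lsc + ssc + 2 * ir


# Extreme combinations of the region lengths bound the possible totals.
lower = plastome_length(LSC[0], SSC[0], IR[0])
upper = plastome_length(LSC[1], SSC[1], IR[1])

# Genome size range reported in the text.
reported = (152_499, 152_508)
brackets_reported = lower <= reported[0] and reported[1] <= upper
print(lower, upper, brackets_reported)
```

The bounds obtained this way enclose the reported genome size range, which is what one would expect if the regional minima and maxima do not all occur in the same sample.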

Chloroplast genome sizes of eleven analyzed species from Chenopodioideae and Corispermoideae ranged from 150,590 to 152,237 bp. Atriplex centralasiatica Iljin had the largest plastome size while Corispermum declinatum Stephan ex Iljin had the smallest plastome size. The total number of genes among these eleven species ranged from 129 to 133, encompassing 84–88 protein coding genes, 37 tRNA genes and 8 rRNA genes. The total GC content of these chloroplast genomes (36.9–37.3%), LSC regions (34.7–35.4%), SSC regions (30.3–31.0%), and IR regions (42.7–42.8%) did not exhibit significant differences across different species (Additional file 2: Table S1).

Simple repetitive sequences (SSRs) and repetitive sequences

A total of 1,386 SSRs were identified in the 16 chloroplast genomes of the two populations of Baolia bracteata. To analyze the characteristics of these SSRs, we selected three types with different numbers of SSRs: B. bracteata 1-3 and 1-5 in population 1, and B. bracteata 2-1 in population 2. This selection allowed further investigation of the type and distribution of SSRs (Fig. 1A). Of the total SSRs, 70.59–71.26% were located in the LSC region, 9.20–13.30% in the IR region and 19.54–20.00% in the SSC region. Notably, B. bracteata 1-3 had two fewer A and T single-nucleotide repeats than B. bracteata 1-5 and 2-1. These repeats were found in the LSC, IGS (ycf4, cemA) and rpl16-intron1 (Fig. 1B; Additional file 2: Table S4). Importantly, approximately 29-73.56% of the total SSRs consisted of A/T single-nucleotide repeat sequences, suggesting an A/T nucleotide bias among the chloroplast SSRs of B. bracteata.
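
To make the notion of a single-nucleotide SSR concrete, here is a minimal sketch that scans a sequence for mononucleotide runs and reports the share that are A or T. It is not the tool used in the study; the function name, the toy sequence, and the threshold of 10 repeats are assumptions chosen for illustration.

```python
import re
from collections import Counter


def find_mono_ssrs(seq: str, min_repeats: int = 10):
    """Return (start, base, length) for every mononucleotide run of at
    least `min_repeats` bases. The threshold is an assumed, adjustable value."""
    # ([ACGT]) captures one base; \1{n,} requires n or more further copies.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_repeats - 1))
    return [(m.start(), m.group(1), len(m.group(0)))
            for m in pattern.finditer(seq.upper())]


# Hypothetical toy sequence: a 12-bp A run, a 10-bp T run, and a 9-bp A run
# that falls below the threshold.
toy = "GC" + "A" * 12 + "CG" + "T" * 10 + "GGC" + "A" * 9 + "C"

ssrs = find_mono_ssrs(toy)
counts = Counter(base for _, base, _ in ssrs)
at_share = (counts["A"] + counts["T"]) / len(ssrs)
print(ssrs, at_share)
```

Applied per genome region (LSC, SSC, IR), the same counting would give regional percentages of the kind quoted above.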

To characterize the B. bracteata chloroplast genome, we analyzed four types of repeat sequences: forward repeats (F), reverse repeats (R), palindromic repeats (P), and complementary repeats (C). All four types of repetitive sequences were detected in B. bracteata 1-2, 1-3, 1-5 in population 1, and B. bracteata 2-1 in population 2. These representative types were then used to study the positions of these repeats. We found 68–69 repetitive sequences (> 10 bp), including one R-type repetition, 34 P-type repetitions, and 33–34 F-type repetitions. However, C-type repetition was not identified. Most of the forward and palindromic repeats, as well as all the reverse and complementary repeats, were located in the LSC region. Both B. bracteata 1-2 and 2-1 exhibited an R-type repetition, located in rpl16-intron1 within the LSC region. Baolia bracteata 1-2, 1-5, and 2-1 displayed 11 F-type repetitions. In contrast, B. bracteata 1-3 had 12 F-type repetitions. Notably, B. bracteata 1-3 contained an additional F-type repeat sequence situated in ycf3-intron1 or IGS (rps12, trnV-GAC), distinguishing it from B. bracteata 1-2, 1-5, and 2-1 (Fig. 1C; Additional file 2: Table S4).
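
The four repeat types differ only in how the second copy relates to the first. The following sketch classifies a pair of equal-length sequences under the usual definitions (forward: identical; reverse: reversed; complementary: base-complemented; palindromic: reverse-complemented). The function name and toy sequences are illustrative assumptions, not part of the study's pipeline.

```python
# Translation table for base complementation.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def classify_repeat(first: str, second: str) -> str:
    """Classify how `second` relates to `first`.

    Returns "F" (forward), "R" (reverse), "C" (complementary),
    "P" (palindromic, i.e. reverse complement) or "none".
    Checks run in that order, so a pair that satisfies several
    definitions is reported under the first one that matches.
    """
    first, second = first.upper(), second.upper()
    if first == second:
        return "F"
    if first == second[::-1]:
        return "R"
    complemented = second.translate(_COMPLEMENT)
    if first == complemented:
        return "C"
    if first == complemented[::-1]:
        return "P"
    return "none"


for pair in [("AAGC", "AAGC"), ("AAGC", "CGAA"),
             ("AAGC", "TTCG"), ("AAGC", "GCTT")]:
    print(pair, classify_repeat(*pair))
```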

The lengths of the repeats ranged from 30 to 30,118 bp. Based on their length, we grouped the repeats into four categories: 30–45 bp, 46–60 bp, 61–75 bp, and > 75 bp. The majority of repeats (88.24–88.40%) were within the 30–45 bp range, while 10.14–10.29% fell within the 46–60 bp range, and 1.45–1.50% exceeded 75 bp in length (Fig. 1D; Additional file 2: Table S4).

Fig. 1 Types and distributions of repeat sequences and simple sequence repeats (SSRs) in Baolia bracteata chloroplast genomes. A The number of SSR loci in different chloroplast genome regions. B Distribution of repeats classified by type. C Number and position of repeat sequences in four B. bracteata chloroplast genomes. D The length of the plastid repeat sequences in B. bracteata

Comparative genomic analysis

Using Baolia bracteata as a reference, we analyzed the junction sites between the IR and SC regions in comparison with eleven species from Chenopodioideae and Corispermoideae (Fig. 2). The sizes of the IR region ranged from 23,118 to 25,231 bp, encompassing the rpl2 and trnN genes, while the LSC region contained the rpl22 and trnH genes. In most species, the SSC/IRb boundary was situated in the coding regions of the ycf1 and ndhF genes. However, in B. bracteata, Corispermum chinganicum Iljin and C. declinatum, the SSC/IRb boundary was located exclusively in the ndhF gene. Similarly, for Dysphania ambrosioides (L.) Mosyakin & Clemants, this boundary was found only within the ycf1 gene. The junction of the LSC/IRb region contained the rps19 gene. The IRa/SSC boundary was identified within the ycf1 gene, with B. bracteata, C. chinganicum, and C. declinatum having the complete ycf1 gene within the SSC region at this boundary (Fig. 2).

Furthermore, we conducted a comprehensive sequence homology analysis of the chloroplast genomes of these 12 species with mVISTA (Additional file 1: Fig. S2). These genomes exhibited similarities in terms of length, structure and gene distribution. A high degree of homology was observed across all genomes, with a few regions displaying less than 90% homology. Notably, the IR region demonstrated greater conservation than the SC region, and coding regions exhibited higher conservation compared to non-coding regions. The multiple comparison analysis using Mauve revealed substantial interlocking blocks within the chloroplast genomes of all 12 species. However, a notable inversion of approximately 3,100 bp was observed at the LSC position in two Atriplex L. and four Chenopodium L. species, containing the genes rbcL-atpB-atpE-trnM-trnV (Additional file 1: Fig. S3).
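
An inversion of this kind shows up as a run of genes whose order is reversed relative to a reference. The sketch below illustrates the idea on gene-order lists; it is not how Mauve works internally. The flanking names `flankL` and `flankR` are hypothetical placeholders, which of the two orders counts as the reference is an assumption, and gene orientation (strand) is ignored.

```python
def find_inversion(ref, query):
    """Return (i, j) such that reversing ref[i:j] reproduces `query`,
    or None if the two orders do not differ by exactly one inversion."""
    if len(ref) != len(query):
        return None
    # Positions where the two gene orders disagree.
    diff = [k for k, (r, q) in enumerate(zip(ref, query)) if r != q]
    if not diff:
        return None  # identical order, nothing inverted
    i, j = diff[0], diff[-1] + 1
    # The differing window must be an exact reversal of the reference.
    return (i, j) if ref[i:j][::-1] == query[i:j] else None


# Gene block named in the text, with hypothetical flanking markers.
ref_order = ["flankL", "rbcL", "atpB", "atpE", "trnM", "trnV", "flankR"]
inv_order = ["flankL", "trnV", "trnM", "atpE", "atpB", "rbcL", "flankR"]

span = find_inversion(ref_order, inv_order)
print(span, ref_order[span[0]:span[1]])
```

Note that the middle gene of an odd-length block keeps its position after reversal, which is why the check compares the whole window rather than requiring every position to differ.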

Fig. 2 The borders of the large single copy (LSC), small single copy (SSC), and inverted repeat (IR) regions among 12 chloroplast genomes. The numbers above the gene features give the distance between the ends of genes and the border sites

Phylogenetic analysis

For the phylogenetic analysis, we utilized 33 complete chloroplast genome sequences from 18 species, comprising 18 newly sequenced chloroplast genomes from C. chinganicum , C. declinatum and B. bracteata , as well as sequences for 15 species downloaded from NCBI (Additional file 2: Table S5). The maximum likelihood (ML) and Bayesian inference (BI) methods were employed to generate phylogenetic trees, both of which yielded congruent topologies. Specifically, all B. bracteata samples formed a well-supported monophyletic clade (bootstrap support, bs = 100%; posterior probability, pP = 1). This clade was identified as the sister group to two Corispermum species (bs = 100%; pP = 1). Species from Chenopodioideae including Chenopodium , Atriplex , Oxybasis , and Dysphania collectively formed a well-supported clade (bs = 100%; pP = 1) (Fig.  3 ).

For marker sequences employed in the phylogenetic analysis (ITS, rbcL , and matK ), a dataset of 236 sequences, which included 18 newly obtained sequences, representing 91 species, was used (Additional file 2: Table S6). The concatenated data matrix encompassed 3,665 characters. The ML analysis conducted on the three genes resulted in an optimal single tree (-ln L  = 37562.0677). In particular, a monophyletic group comprising six representative B. bracteata samples was identified (bs = 100%; pP = 1). This monophyletic group emerged as the sister clade to Corispermeae (bs = 100%, pP = 1) (Additional file 1: Fig. S4). Additionally, Acroglochin was resolved as the sister taxon to the clade comprising Corispermoideae ( Agriophyllum , Anthochlamys , and Corispermum ) and the Baolia clade (bs = 87%, pP = 1).

Fig. 3 Phylogenetic tree of the 33 accessions inferred from Maximum Likelihood (ML) and Bayesian Inference (BI) analyses based on the complete plastomes. Bayesian posterior probabilities / ML bootstrap values are shown above branches. Branches with support rates of not 100% and 1 are not marked

Dated molecular phylogeny

For divergence time estimation, our analysis focused exclusively on Corispermoideae and Chenopodioideae along with the genera Acroglochin and Baolia. The rbcL and matK matrix comprised 2,879 characters and 43 species, and the ITS matrix comprised 684 characters and 45 species. The divergence tree resulting from the matK + rbcL dataset is presented in Additional file 1: Fig. S5. Within this tree, Corispermeae and Baolia formed a monophyletic clade that was sister to Acroglochin. The divergence tree resulting from the ITS dataset is shown in Additional file 1: Fig. S6. In this tree, Corispermeae and Acroglochin formed a weakly supported clade sister to Baolia. Based on the outcomes of molecular dating, a major split between Acroglochin, Baolia, and Corispermeae occurred during the Middle Eocene approximately 44.49 (59.40–27.33) mya. Additional dates for various lineages can be found in Table 1.

Fruit and seed anatomy of Baolia and Acroglochin

Baolia

The fruit is indehiscent and displays a distinctive foveolate surface, setting it apart from other members of Chenopodiaceae s.s. Our investigation has revealed that these foveolae are a result of the bursting or compression of the outer walls of the exocarp cells during the drying process that follows fruit ripening (Fig.  4 A). Upon soaking, many exocarp cells regain their original mamillate shape (Fig.  4 B). The mesocarp (Fig.  4 C) consists of brachysclereids, characterized by small lumens filled with brown tannin-like substances. This supportive tissue contributes to the fruit’s firmness. The lowermost layers of the mesocarp contain monoprismatic crystals. The endocarp is composed of a single layer with thickened cell walls. The seed coat is superficially smooth, closely attached to the pericarp but not fused with it. It is thin, comprising two compressed layers, with tannin-filled cells. Occasionally, one to several colorless intermediate layers can be observed between these layers. Perisperm is abundant, and the embryo is annular and positioned vertically.

Acroglochin

The fruit is one-seeded, dehiscent through a lid. The pericarp exhibits a greenish hue and consists of multiple parenchymatous layers. The seeds are dark-red, somewhat depressedly-roundish, or slightly elongated, with a shiny surface with marginal keeling and polygonal cell shape (Fig.  4 D, E). The seed-coat testa measures 25–30 μm in thickness and features stalactite-like formations in the outer cell walls (Fig.  4 F). The tegmen is significantly thinner, made up of 2–3 compressed cell layers. Perisperm is abundant, and the embryo is annular and positioned vertically.

Fig. 4 Fruit anatomy of Baolia bracteata (A–C) and Acroglochin persicarioides (D–F): (A) fruit of B. bracteata; (B) foveolate fruit surface of B. bracteata; (C) cross-section of pericarp and seed-coat of B. bracteata; (D) seed of A. persicarioides; (E) seed surface of A. persicarioides; (F) cross-section of A. persicarioides seed coat

Ancestral state reconstruction

The ancestral state reconstruction revealed that characters formerly employed for defining Baolia , Acroglochin , and Corispermoideae exhibit varying degrees of homoplasy (Table  2 ; Additional file 2: Tables S7-S17; Additional file 1: Figs. S7-S15). For instance, attributes such as fruit dehiscence, the presence of sclerenchymatous tissue in the pericarp, and the thickness of the seed coat testa display complex patterns of convergence (Fig.  5 ; Additional file 1: Fig. S15). Notably, the presence of acicular apices appears to be an apomorphic state shared between Acroglochin and Teloxys (Additional file 1: Fig. S8). A similar pattern emerges for the inflorescence structure featuring clusters of monochasium, which represents a derived state in Acroglochin , Ceratocarpus , and Teloxys (Additional file 1: Fig. S9).

Several traits within Baolia are symplesiomorphies shared with core Corispermoideae (Agriophyllum, Anthochlamys, and Corispermum), including the seed-coat testa and a pericarp with sclerenchymatous tissue (Fig. 4C; Additional file 1: Figs. S14-S15). A noteworthy apomorphy in Baolia involves papillate fruits with a honeycomb-like surface formed by ruptured outer walls of exocarp cells (Figs. 4A, 4C and 5). The current ancestral character reconstruction underscores the necessity for a meticulous reassessment of the morphological attributes that have traditionally been employed in delineating the boundaries of Baolia and Acroglochin.
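
For readers unfamiliar with ancestral state reconstruction, the sketch below shows the simplest version of the idea, the bottom-up pass of Fitch parsimony on a rooted binary tree. This section does not state which reconstruction method the study used, so this is a stand-in technique, not the authors' procedure; the tree shape, tip names, and character states are entirely hypothetical.

```python
def fitch(tree, states):
    """Bottom-up Fitch pass on a rooted binary tree.

    tree   : a tip name (str) or a 2-tuple of subtrees
    states : dict mapping tip name -> observed character state
    Returns (set of candidate states at this node, minimum number of changes).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one state change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical five-tip tree and hypothetical coded states.
toy_tree = (("tip1", "tip2"), ("tip3", ("tip4", "tip5")))
toy_states = {"tip1": 0, "tip2": 0, "tip3": 2, "tip4": 0, "tip5": 1}

root_set, changes = fitch(toy_tree, toy_states)
print(root_set, changes)
```

A state that appears in the root set and in most tips behaves like a plesiomorphy, while a state confined to one tip (here `tip3` or `tip5`) behaves like an apomorphy of that lineage.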

Fig. 5 Ancestral character reconstruction of pericarp surface characters in Corispermoideae. Pericarp surface: 0 - smooth; 1 - papillate or mamillate (sometimes with trichomes) with non-bursting outer walls of the exocarp cells; 2 - papillate with bursting outer walls of the exocarp cells, forming a honeycomb sculpture at fruiting; 3 - with bladder hairs; 4 - with stellate hairs

Repetitive sequence, comparative genomic analysis and phylogenetic inference

Repetitive sequences within chloroplast genomes offer valuable insights into genome rearrangements, sequence divergence, and can serve as useful molecular markers for phylogenetic and population studies [ 24 , 25 ]. An analysis of the chloroplast genomes of 16 B. bracteata samples from two populations revealed the presence of 85 to 87 SSRs (Fig.  1 A). These SSRs were predominantly located in the LSC region, and the majority of them consist of single-nucleotide A/T repeat sequences, a pattern consistent with the chloroplast genomes of most angiosperms [ 26 , 27 ], and they contributed significantly to the A/T abundance of the plastid genome. The abundance of long repeats and SSRs in the intergenic region (Additional file 2: Table S4) may result from variants, including indels and SNPs [ 28 , 29 ]. The variations in SSR types and numbers, as well as repetitive sequences, between the two populations of B. bracteata were distinct (Fig.  1 ), providing valuable insights for further studies on the level of population genetic diversity.

In most plants, the boundaries and junctions of the four structural parts of the chloroplast genome are conserved (e.g., [21, 26, 27, 28, 29, 30, 31, 32, 33, 34]). Our results based on complete chloroplast genome analysis indicate that B. bracteata and eleven other species from Chenopodioideae and Corispermoideae exhibit highly conserved structure, gene content and gene order, with little variation between species. One notable exception is an inversion in Atriplex and Chenopodium species (Additional file 1: Fig. S3). A previous study [35] detected sequence inversions in the rbcL-trnV region (~3.1 kb) of the chloroplast genomes of Chenopodium quinoa and C. album (Chenopodioideae). Large inversions have also been found in other taxa, such as Hevea brasiliensis [36], Annona cherimola [37], Viscum minimum [38], and Passiflora edulis [39]. These findings suggest that large inversions are relatively common in plant genomes. AT-rich regions are prone to inversion of large segments, but this pattern is not present in the three Cypripedium species with long inversions [40]; therefore, the relationship between inversions and AT-rich sequences remains uncertain.

In the previous study based on 48 species and three loci sequence data (ITS, atpB - rbcL , and rbcL ), Acroglochin was resolved as sister to Baolia  + Corispermeae and consequently considered part of the expanded Chenopodioideae [ 17 ]. In this study, the phylogenetic relationships of Baolia and Corispermoideae were resolved not only based on three loci sequence data (236 sequences from 91 species) but also on chloroplast genomes (33 accessions from 18 species). In both analyses, B. bracteata samples formed a well-supported clade, which was sister to core Corispermoideae (Fig.  3 ; Additional file 1: Fig. S4). Acroglochin was the sister taxon to the core Corispermoideae +  Baolia clade (bs = 87%, pP = 1) (Additional file 1: Fig. S4). Species from Chenopodioideae formed a well-supported clade (bs = 100%, pP = 1), and they were sister to the clade composed of Acroglochin , core Corispermoideae, and Baolia (bs = 100%, pP = 1) (Additional file 1: Fig. S4).

Geographical and spatial diversification of Acroglochin and Baolia

Acroglochin is a remarkable genus within the Chenopodiaceae family. It exhibits two rare synapomorphies shared with unrelated members of Chenopodiaceae s.s. These characteristics include acicular apices terminating short branches (Fig. 6E), a trait it shares with Dysphania tibetica and Teloxys aristata (both belonging to the Dysphanieae tribe in Chenopodioideae), as well as a circumscissile fruit type, a feature found in many Betoideae members. These shared traits led to the initial classification of Acroglochin within the Betoideae subfamily (e.g., [2, 5]).

However, recent phylogenetic data [ 17 ] have suggested that Acroglochin should be excluded from the Betoideae subfamily. As a result, the revised circumscription of Betoideae, which excludes Acroglochin , indicates that this subfamily is primarily found in regions such as the Mediterranean area, Macaronesia, West Europe, Asia Minor, and the Caucasus. The subfamily is also represented in the California floristic province of North America by the monotypic genus Aphanisma Nutt. With the exclusion of Acroglochin from Betoideae, the Himalaya and Tibet regions do not have any native representatives of Betoideae.

The number of species within Acroglochin has been a subject of taxonomic debate. Some earlier authorities [ 2 , 5 ,  19 ] accepted only one species, A. persicarioides (Poir.) Moq. However, Zhu & Sanderson [ 30 ] recognized four species within Acroglochin , all of which were described from the Sichuan province of China. Acroglochin is known to have a wide distribution, including Bhutan, South/Central China, North India, Nepal, and North Pakistan (Fig.  6 F). Records of Acroglochin may be found in northern Myanmar, Vietnam, and Laos. Typically, Acroglochin is found at elevations between 1700 and 3200 m above sea level, with its main distribution range in subtropical monsoon climates. There are a few records in the Tibetan Autonomous Region (Xizang province, China), primarily associated with higher altitudes that exceed the typical altitudinal range of Acroglochin .

In contrast, the distribution of the monotypic genus Baolia is confined to the vicinity of Diebu [Têwo] county in Gansu province, China (Fig. 6F), and for a long time only one collection, from the type locality in Diebu, was known. A new population (33˚56’47″N, 103˚44’15″E) was subsequently found by one of the authors (Sun Xuegang) 15 km east of the Baolia type locality. Baolia predominantly thrives on sunlit slopes in steppe habitats (Fig. 6A-D) at an elevation of approximately 1900 meters above sea level [16, 30]. These areas receive sufficient precipitation during the warm season. However, the population at the type locality of B. bracteata [16] declined significantly in the early 2000s due to escalating human activities, particularly new construction and changes in land use [41, 42]. Given its restricted range and habitat threats, Baolia should be classified as ‘Critically Endangered’ (CR) according to the IUCN Red List Categories [43].

In contrast, all members of the core Corispermoideae (including Agriophyllum , Anthochlamys , and Corispermum ) exhibit a wide distribution across temperate, mostly (semi-) arid regions of Eurasia, with a few Corispermum species extending into North America. Some Corispermoideae species ( Agriophyllum tibeticum Sukhor., Corispermum sp. div.) are also present in mountainous regions of Tibet, although they are typically found at much higher elevations ranging from 3000 to 5000 m above sea level [ 18 , 19 ]. It is worth noting that none of the core Corispermoideae species are adapted to monsoon climates. As demonstrated here, the segregation of Acroglochin , Baolia , and core Corispermoideae is primarily driven by geographic and ecological divergence.

Fig. 6 Geographical distribution, habitat and characteristics of Baolia bracteata and Acroglochin persicarioides. Baolia bracteata habitat and features: (A) general view of the habitat; (B) young plant; (C) mature plant with flowers and fruits; (D) close-up of the inflorescence. Photographs by Sun Xuegang (2021, Diebu [Têwo] County, Gansu Province, China). (E) A. persicarioides plant at fruiting stage. Photograph by Alexander Sukhorukov (September 2013, Mid-West Nepal). (F) Geographic distribution map of B. bracteata (labeled by box) and A. persicarioides (labeled by circles). Multi-year provincial administrative boundary data in China from Resource and Environmental Sciences Data Registry and Publishing System, 2023 (http://www.resdc.cn/DOI). The labeled distribution loci in the figure are plotted by Maria Kushunina based on the distribution information of specimens seen

Are there similarities between Baolia and Corispermoideae?

The morphological data do not provide strong evidence for close relationships between Baolia and all Corispermoideae. Several gross morphological characters shared by Baolia and Corispermoideae, such as the absence of acicular apices (character state 2:0) (Additional file 1: Fig. S8) and indehiscent fruits (6:0) (Additional file 1: Fig. S12), are common features found in nearly all members of the Chenopodiaceae. Among the micromorphological characters, carpological traits in Chenopodiaceae have been studied in great detail, revealing their taxonomic, evolutionary, and ecological implications [ 9 , 11 , 44 , 45 , 46 , 47 , 48 , 49 ]. The following fruit and seed characters appear to unite Baolia and Corispermoideae: (1) multi-layered pericarp with supporting tissue (character state 9:1) (Additional file 1: Fig. S14); (2) presence of tannin-like substances in the cell lumens; (3) thin seed coat consisting of two equal layers filled with tannins (character state 10:0) (Additional file 1: Fig. S15).

However, there are some distinctions between these characters in Baolia and Corispermoideae. For instance, in Corispermoideae, the supporting tissue in the pericarp is typically represented by fibers (brachysclereids are absent), and the presence of monocrystals in the pericarp has not been detected [ 18 , 44 ].

These subtle differences in carpological traits suggest that while Baolia and core Corispermoideae share some micromorphological features, they also exhibit distinct characteristics, further complicating their taxonomic relationships based solely on morphology.

Taxonomic treatment

We propose to consider the subfamily Corispermoideae Raf. in a broader sense, including the tribe Corispermeae ( Corispermum , Anthochlamys , and Agriophyllum ), and to describe two additional tribes, Acroglochineae and Baolieae. In the recent circumscription, the subfamily is very heterogeneous. An improved description of the tribe Corispermeae (or core Corispermoideae) was provided by Sukhorukov [ 18 , 44 ].

Acroglochineae Sukhor. & Z.-B.Wen, trib. nov.

Type: Acroglochin Schrad. in Roem. & Schult.

Annuals, glabrous or scarcely pubescent with simple hairs, branches terminating in acicular apices. Leaves alternate, long-petiolate, broadly ovate or ovoid, dentate or erose, teeth straight or incurved, tip mucronate. Inflorescences leafy, monochasial, falsely dichotomous. Flowers bisexual. Perianth of 5 free segments, keeled along midrib. Stamens 2, anthers small, without appendages. Stylodia 2, concrescent into a style in their lower half. Fruit dehiscent by a lid; pericarp smooth, white or greenish with several homocellular layers. Seeds dark-red, depressedly-globular, ~ 1.3 mm in diameter, smooth, with crustaceous testal layer; embryo horizontal.

One to four species in Himalaya, Central and South China.

Baolieae Sukhor. & Z.-B.Wen, trib. nov.

Type: Baolia H.W.Kung & G.L.Chu.

Annuals, shortly pubescent with simple hairs, branches not terminating in acicular apices. Leaves alternate, long-petiolate; leaf blade ovoid, entire, tip obtuse. Inflorescences leafy, axillary, glomerulate (clusters of 2 to 4 flowers). Flowers bisexual, supported by two bracteoles. Perianth green, of 5 almost free segments, keeled along midrib. Stamens 5, anthers small, without appendages. Stylodia 2. Fruit indehiscent, with foveolate surface after drying (mamillate when fresh), yellowish, crustaceous; pericarp tightly adjoining to the seed, mesocarp multicellular, composed of brachysclereids. Seeds yellowish, roundish, seed coat of two thin layers of equal thickness; embryo vertical.

A monotypic tribe consisting of Baolia bracteata H.W.Kung & G.L.Chu, a narrow endemic to Diebu [Têwo] county, Gansu province, China.

Baolia seemed first to be related to Chenopodium [ 16 ], but later it was transferred to the Polycnemeae tribe (Amaranthaceae s.s.) with possible relations with Polycnemum , Nitrophila and Hemichroa [ 50 ]. Sukhorukov [ 18 ] proposed that Baolia is rather a member of Amaranthaceae s.s. or Caryophyllaceae based on the reproductive characters studied. In light of the recent molecular results, the morphoanatomical similarities are convergences in Baolia , some Caryophyllaceae and Amaranthaceae s.s.

Materials and methods

Taxon sampling, DNA extraction and sequencing of chloroplast genome

For Baolia bracteata , sixteen samples from two populations including seven and nine individuals, respectively, were used. These collections were made by SXG in 2021 in Diebu [Têwo] County, Gansu Province, China. Additionally, two Corispermum species, C. chinganicum and C. declinatum , were also sampled. No specific permissions were required for sampling and collection from these localities. Voucher specimens are deposited in the Herbarium of the Xinjiang Institute of Ecology and Geography Chinese Academy of Sciences (XJBI) and Tree Specimen Room of Forestry College, Gansu Agricultural University (GAUF) (Table S5 and Table S6). Plant identifications were conducted by SXG and WZB.

Young and fresh leaves were harvested and promptly preserved in silica gel. Genomic DNA was subsequently extracted from approximately 100 mg of silica-dried leaves following the modified 2 × CTAB buffer method [ 51 ]. The quality of the DNA was assessed by electrophoresis in a 1% (w/v) agarose gel. To construct a library, tags were assigned to each sample, and Illumina MiSeq / HiSeq2500 sequencing was employed [ 52 ]. The library fragment size ranged between 500 bp and 700 bp, and bidirectional 150–250 bp sequencing was performed, ensuring a minimum of 2 GB of sequencing data per species [ 53 ]. Moreover, the extracted DNA underwent sequencing using the ABI 3730xl DNA sequencer.

Chloroplast genome assembly and annotation

GetOrganelle v1.7.5 was used with default parameters to assemble clean data [ 54 ]. Bandage v0.8.1 was utilized to confirm whether they were assembled into a ring structure [ 55 ]. The genomes of Chenopodium acuminatum Willd. (GenBank No. MW057780.1) and Salsola collina Pall. (GenBank No. OK189514.1) were selected as references. GeSeq v2.03 [ 56 ] and PGA ( https://github.com/quxiaojian/PGA ) [ 57 ] were employed for annotating the complete chloroplast genome and verifying sequencing accuracy. For sequence verification, BLAST v2.8.1 was employed [ 58 ].

Start and stop codons were manually adjusted, and pseudogenes were identified using Geneious v8.0.2 [ 59 ]. Genes with truncated, shortened, or deleted open reading frames, or with multiple stop codons, were classified as pseudogenes. The organelle genome drawing tool OGDRAW ( http://ogdraw.mpimp-golm.mpg.de/ ) was used to create and visualize the circular plastid diagram [ 60 , 61 ]. The complete chloroplast genome sequences have been deposited in GenBank (accession Nos. OR449093–OR449108).
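
As a rough illustration of the pseudogene criteria above, the following Python sketch flags a coding sequence whose length is not a multiple of three, that lacks a terminal stop codon, or that carries an in-frame stop codon before the final codon. It is not the Geneious workflow used in this study; the function names and the exact rules are assumptions made for illustration, and the sketch ignores RNA editing and non-canonical start codons.

```python
# Stop codons of the standard/plastid genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def inframe_stops(cds):
    """Return the codon indices of all in-frame stop codons in a CDS."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    return [i for i, codon in enumerate(codons) if codon in STOP_CODONS]


def looks_like_pseudogene(cds):
    """Hypothetical screen: truncated frame or premature stop codon."""
    stops = inframe_stops(cds)
    n_codons = len(cds) // 3
    # Frame is broken, or the sequence does not end on a stop codon.
    truncated = len(cds) % 3 != 0 or not stops or stops[-1] != n_codons - 1
    # At least one stop codon occurs before the final codon.
    premature = any(i < n_codons - 1 for i in stops)
    return truncated or premature


if __name__ == "__main__":
    for name, seq in [("intact", "ATGAAATAA"), ("premature", "ATGTAAAAATAA")]:
        print(name, looks_like_pseudogene(seq))
```

Any candidate flagged in this way would still need manual inspection against the reference annotations.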

Comparative analysis of chloroplast genomes

MAFFT v7 was used to align the chloroplast genomes [ 62 ]. The mVISTA program ( http://genome.lbl.gov/vista/mvista/submit.shtml ) [ 63 ] was employed to assess differences in chloroplast genomes among various species, with Baolia bracteata serving as the reference. IRscope ( https://irscope.shinyapps.io/irapp/ ) [ 64 ] was employed to compare chloroplast genome contractions and expansions between B. bracteata and other species. Rearrangements or inversions of fragments within the genome were identified using Mauve v2.4.0 with default settings [ 65 ]. Nucleotide polymorphism (Pi) values were evaluated using DnaSP v5 with the window length set to the whole length of each matrix [ 66 ].
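
For readers unfamiliar with the Pi statistic, the sketch below computes nucleotide diversity as the average number of pairwise differences per site over a whole alignment, which corresponds to using a single window spanning the matrix. It is a minimal illustration and not the DnaSP implementation; the function name and the decision to drop every column containing a gap or ambiguous base are assumptions.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site (pi) for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    seqs = [s.upper() for s in seqs]
    # Assumption: keep only columns where every sequence has A, C, G or T.
    valid = set("ACGT")
    cols = [i for i in range(length) if all(s[i] in valid for s in seqs)]
    if not cols:
        raise ValueError("no usable sites")
    pairs = list(combinations(seqs, 2))
    # Count mismatches over all sequence pairs and all retained columns.
    diffs = sum(sum(a[i] != b[i] for i in cols) for a, b in pairs)
    return diffs / (len(pairs) * len(cols))


if __name__ == "__main__":
    toy_alignment = ["ACGTAC", "ACGTAA", "ACCTAC"]  # made-up sequences
    print(round(nucleotide_diversity(toy_alignment), 4))
```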

Repetitive sequence analysis of chloroplast genomes

The REPuter program ( https://bibiserv.cebitec.uni-bielefeld.de/reputer ) [ 67 ] was employed to locate larger repetitive sequences, with the following parameters: a Hamming distance of 3, a minimum repeat size of 30 bp, and a maximum computed repeat of 5,000 bp [ 68 ]. This search aimed to identify forward (F), reverse (R), palindromic (P), and complementary (C) repeats within the LSC, IRb, IRa, and SSC regions. To identify SSRs in the sixteen chloroplast genomes of B. bracteata , the MISA tool ( https://webblast.ipk-gatersleben.de/misa/index.php ) [ 69 ] was employed with the following minimum repeat thresholds: 10 for mononucleotide (mono-) repeats, 5 for dinucleotide (di-) repeats, 4 for trinucleotide (tri-) repeats, and 3 for tetranucleotide (tetra-), pentanucleotide (penta-), and hexanucleotide (hexa-) repeats.
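
The SSR thresholds listed above can be made concrete with a short Python sketch that scans a sequence for perfect repeats of 1 to 6 bp motifs and reports those meeting the minimum repeat counts. This is an illustration only and not the MISA program itself; the function names are hypothetical, compound SSRs are not handled, and motifs that are themselves repeats of a shorter unit (for example ATAT) are skipped so that they are not counted twice.

```python
# Minimum repeat counts per motif length, taken from the thresholds in the text.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        i = 0
        while i + k <= len(seq):
            motif = seq[i:i + k]
            if set(motif) <= set("ACGT") and is_primitive(motif):
                # Extend the run while the next k bases repeat the motif.
                reps = 1
                while seq[i + reps * k:i + (reps + 1) * k] == motif:
                    reps += 1
                if reps >= min_rep:
                    hits.append((i, motif, reps))
                    i += reps * k  # continue after the reported repeat
                    continue
            i += 1
    return sorted(hits)


if __name__ == "__main__":
    toy = "GC" + "A" * 10 + "GC" + "AT" * 5 + "GG" + "AGC" * 4 + "T"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```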

Taxon sampling for targeted Sanger sequencing

The nrITS region and two cp. markers, rbcL and matK , were employed in this study. Sequences from B. bracteata were extracted from each chloroplast genome using Geneious v8.0.2 [ 59 ] for the ITS sequence and both Geneious and PhyloSuite v1.2.2 [ 59 , 70 ] for rbcL and matK sequences. Three representative samples from each population of B. bracteata were included. Ultimately, eighteen B. bracteata sequences were generated (Additional file 2: Table S6). A total of 236 published and new sequences, representing 91 species, were incorporated into the phylogenetic analyses. Among these, 80 species belong to Chenopodiaceae s.s., eight belong to Amaranthaceae, and three species were used as outgroups representing three different families: Phaulothamnus spinescens A.Gray (Achatocarpaceae), Rhabdodendron amazonicum (Spruce ex Benth.) Huber (Rhabdodendronaceae), and Simmondsia chinensis (Link) C.K.Schneid. (Simmondsiaceae).

For chloroplast genome data, we selected 33 chloroplast genomes from 18 species for analysis (Additional file 2: Table S5). We aligned all complete chloroplast genomes with MAFFT v7 and removed gaps with Gblocks v.0.91b [ 68 ]. Subsequently, the best-fit model, GTRGAMMA, was selected in jModelTest2 on XSEDE (2.1.6) with bootstrap iterations set to 1,000 [ 71 , 72 ]. Phylogenetic trees were constructed using the maximum likelihood (ML) method in RAxML-HPC2 on XSEDE (8.2.12) [ 73 ]. To generate the Bayesian inference (BI) tree, we used MrBayes on XSEDE (3.2.7a) with the TVM + I + G model (lset nst = 6 rates = invgamma) applied to the complete plastid sequences [ 72 , 73 , 74 ]. We ran two independent Markov chain Monte Carlo (MCMC) chains for 20 million generations, sampling every 1,000 generations. The initial 25% of the sampled trees were discarded as burn-in [ 75 ]. The resulting phylogenetic tree was visualized using FigTree v1.4.2 [ 76 ].

For the gene fragment sequence data, sequences were aligned using MAFFT v7 and subsequently adjusted manually. Gaps were introduced into the alignment to represent missing data. Initially, we analyzed the nuclear (nrITS) and two plastid ( matK and rbcL ) datasets separately to detect any conflicts. Since no conflicts were detected, we used the concatenated data of all three markers for this study. Phylogenetic analyses were conducted using both the Maximum Likelihood (ML) and Bayesian Inference (BI) methods.

The ML support values were estimated through 1,000 bootstrap replicates. For the BI analysis, four Markov chain Monte Carlo chains were run, commencing with a random tree, and trees were saved every 100 generations for a total of 2 million generations. Prior to the ML and BI analyses, the appropriate model of DNA substitution was estimated using jModelTest v2.1 [ 73 ]. For the combined dataset, the TIM1 + I + G model was selected, with the gamma distribution shape parameter set to 0.6320. The base frequencies were specified as follows: A = 0.2811, C = 0.2060, G = 0.2241, and T = 0.2915. Both the ML and BI analyses were conducted using the CIPRES Science Gateway v3.3 ( https://www.phylo.org ).

Divergence time estimation

Only species from the core Corispermoideae and Chenopodioideae as well as the genera Acroglochin and Baolia were included in the analyses. The sequences were aligned using MAFFT v7 and then manually adjusted. Gaps were introduced into the alignment as missing data. The two datasets, nuclear (nrITS) and plastid ( rbcL + matK ), were analyzed separately using BEAST v.1.8.2 [ 77 ]. BEAUti was first used to set priors and create the BEAST XML input files. For the analyses, Chenopodioideae representatives were defined as monophyletic in order to set the root at the split between Chenopodioideae and ( Acroglochin + Baolia + Corispermeae). Based on jModelTest2, the substitution model was set to HKY + I + G for the rbcL + matK dataset and GTR + G for the nrITS dataset. A relaxed Bayesian clock was implemented with rates for each branch drawn independently from a lognormal distribution [ 78 ]. A birth-death prior was set for branch lengths. The root age was set to 57–55 mya [ 1 , 6 ] using a normal prior. Because previous estimates of the crown age of Atripliceae [s.str.] differ between the rbcL + matK and ITS datasets, the crown age of Atripliceae was set to 31–16.4 mya for the rbcL + matK dataset and 29.4–19.2 mya for the ITS dataset [ 79 ]. The first runs were used to examine MCMC performance, and operators were adjusted as suggested by the output analysis. The final run was performed with 50,000,000 iterations for the ITS dataset and 100,000,000 iterations for the rbcL + matK dataset, with a burn-in of 10% and a sampling frequency of 1,000. Tracer v1.7.2 [ 80 ] was used to check the effective sample sizes (> 200), and the maximum clade credibility tree was then generated in TreeAnnotator v1.8.2 [ 77 ] with a posterior probability limit of 0.7 and mean node heights. Final trees were edited in FigTree v1.4.2.
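
The run settings above translate into the following numbers of logged and retained samples, shown here as a small arithmetic sketch. The function name is hypothetical, and the counts ignore the initial state, which the log files may contain as one additional sample.

```python
def retained_samples(iterations, sample_every, burnin_percent):
    """Return (logged, discarded, retained) sample counts for an MCMC run."""
    logged = iterations // sample_every
    discarded = logged * burnin_percent // 100  # integer burn-in count
    return logged, discarded, logged - discarded


if __name__ == "__main__":
    # Chain lengths, sampling frequency and burn-in as stated in the text.
    for label, n_iter in [("ITS", 50_000_000), ("rbcL + matK", 100_000_000)]:
        print(label, retained_samples(n_iter, 1_000, 10))
```

Under these settings the ITS run keeps 45,000 of 50,000 logged samples and the rbcL + matK run keeps 90,000 of 100,000.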

Ancestral character reconstruction

Ancestral characters of Baolia and related genera were reconstructed based on the pruned maximum clade credibility Bayesian tree generated above. Taxa with more than 70% missing data in the character matrix and duplicate samples were pruned using the drop.tip function in R [ 81 ]. The character matrix included ten coded discrete characters that are significant in the taxonomy of Amaranthaceae (Additional file 2: Table S7). Ancestral state reconstructions were carried out using MrBayes Ancestral States with R (MBASR) [ 82 ]. Like native MrBayes, MBASR applies continuous-time Markov modeling to a tree’s topology and branch lengths to statistically estimate character states at ancestral tree nodes for discrete traits [ 82 ]. All analyses were performed in R v.4.2.2.
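
The missing-data filter described above can be sketched as follows. The analysis itself used the drop.tip function in R; this Python fragment only illustrates how the list of tips to drop could be derived from a character matrix. The taxon labels, the use of "?" to code missing states, and the function name are assumptions for illustration.

```python
def taxa_to_prune(matrix, max_missing=0.70, missing="?"):
    """Return taxa whose share of missing character states exceeds max_missing."""
    pruned = []
    for taxon, states in matrix.items():
        if not states:
            continue  # skip taxa without any coded characters
        fraction = sum(state == missing for state in states) / len(states)
        if fraction > max_missing:
            pruned.append(taxon)
    return pruned


if __name__ == "__main__":
    # Hypothetical matrix with ten discrete characters per taxon.
    toy_matrix = {
        "taxon_A": list("0110100110"),
        "taxon_B": list("01????????"),  # 80% missing: pruned
        "taxon_C": list("011???????"),  # exactly 70% missing: kept
    }
    print(taxa_to_prune(toy_matrix))
```

The resulting list of labels is what would then be passed to the tree-pruning step.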

Morphoanatomical studies

The morphoanatomical data for Agriophyllum , Anthochlamys and Corispermum (Corispermoideae) were obtained from previous detailed studies [ 44 , 83 ]. Carpological features of Baolia and Acroglochin were examined by preparing cross-sections using a Microm HM 355 S rotary microtome (Thermo Fisher Scientific, USA). Prior to sectioning, the material was immersed in water: alcohol: glycerin (1: 1: 1) solution, dehydrated in a series of ethanol dilutions and embedded in Technovit 7100 resin (Heraeus Kulzer, Germany). The cross-sections were examined using a Nikon Eclipse Ci microscope and captured with a Nikon DS-Vi1 camera (Nikon Corporation, Japan). The fruit and seed surface was examined using a scanning electron microscope (SEM) JSM-6380 (JEOL Ltd., Japan) at 15 kV after sputter-coating with gold-palladium using an EIKO IB-3 Ion Coater (EIKO Engineering Ltd., Japan) at the Electron Microscopy laboratory, M.V. Lomonosov Moscow State University. Before SEM imaging, Baolia fruit underwent dehydration in aqueous ethyl alcohol solutions of increasing concentrations, followed by alcohol-acetone solutions, and pure acetone. Ten carpological characters and their states were coded in the present study for Acroglochin , Baolia , three Corispermeae ( Corispermum , Anthochlamys , and Agriophyllum ) and Chenopodioideae (see Additional file 2: Table S7).

Distribution mapping

Herbarium specimens of Acroglochin and Baolia stored at B, BM, BR, BSD, CAH, CDBI, DD, E, FJS, G, H, HUJ, IBSC, IMC, JIU, K, KATH, L (including U and WAG), KUN, LE, LY, M, MHA, MSB, MW, NAS, P, PE, PRA, SHI, TO, TUCH, W, WU, WUK, XIA, and XJBI were analyzed (herbarium abbreviations according to Thiers 2023+). The herbarium specimens of Acroglochin collected by APS are located in MW. Distribution maps are based on the specimens seen, and these were prepared using SimpleMappr online tool ( http://www.simplemappr.net ).

Availability of data and materials

All plastomes generated in this study are deposited in NCBI database ( https://www.ncbi.nlm.nih.gov/ ) (GenBank accession Nos. OP584480-OP584485, OP584905-OP584916, OR449093-OR449108, OR458831-OR458832, see Table S5 and Table S6). These data will remain private until the related manuscript has been accepted.

Abbreviations

BI: Bayesian inference

BS: Bootstrap support

cp: Chloroplast

CTAB: Cetyl trimethylammonium bromide

GC: Guanine-cytosine

GTR: General time reversible

IGS: Intergenic regions

ITS: Internal transcribed spacer of ribosomal DNA

IR: Inverted repeat

IRs: Inverted repeat regions

IRa and IRb: Two IR regions that are identical but in opposite orientations

LSC: Large single copy

MCMC: Markov chain Monte Carlo

ML: Maximum Likelihood

NCBI: National Center for Biotechnology Information

PGA: Plastid Genome Annotator

PP: Posterior probability

rRNA: Ribosomal RNA

SSC: Small single copy

SSR: Simple sequence repeat

tRNA: Transfer RNA

Kadereit G, Borsch T, Weising K, Freitag H. Phylogeny of Amaranthaceae and Chenopodiaceae and the evolution of C 4 photosynthesis. Int J Plant Sci. 2003;164(6):959–86.

Volkens G. Chenopodiaceae. In Engler A, Prantl K. Die natürlichen Pflanzenfamilien. 1st ed. Leipzig: Engelmann; 1892. p. 36–91.

Ulbrich E. Chenopodiaceae. In: Engler A, Harms H, editors. Die natürlichen Pflanzenfamilien. 2nd ed. Leipzig: Duncker & Humblot; 1934. p. 379–584.

Sukhorukov AP, Mavrodiev EV, Struwig M, Nilova MV, Dzhalilova KK, Balandin SA, Erst A, Krinitsyna AA. One-seeded fruits in the core Caryophyllales: their origin and structural diversity. PLoS One. 2015;10(2):e0117974.

Hohmann S, Kadereit JW, Kadereit G. Understanding Mediterranean-Californian disjunctions: molecular evidence from Chenopodiaceae-Betoideae. Taxon. 2006;55(1):67–78.

Kadereit G, Freitag H. Molecular phylogeny of Camphorosmeae (Camphorosmoideae, Chenopodiaceae): implications for biogeography, evolution of C 4 -photosynthesis and taxonomy. Taxon. 2011;60(1):51–78.

Fuentes-Bazan S, Mansion G, Borsch T. Towards a species level tree of the globally diverse genus Chenopodium (Chenopodiaceae). Mol Phylogenet Evol. 2012;62(1):359–74.

Fuentes-Bazan S, Uotila P, Borsch T. A novel phylogeny-based generic classification for Chenopodium Sensu Lato, and a tribal rearrangement of Chenopodioideae (Chenopodiaceae). Willdenowia. 2012;42(1):5–24.

Sukhorukov AP, Nilova MV, Krinitsina AA, Zaika MA, Erst AS, Shepherd KA. Molecular phylogenetic data and seed coat anatomy resolve the generic position of some critical Chenopodioideae (Chenopodiaceae–Amaranthaceae) with reduced perianth segments. PhytoKeys. 2018;109:103–28.

Uotila P, Sukhorukov AP, Bobon N, McDonald J, Krinitsina AA, Kadereit G. Phylogeny, biogeography and systematics of Dysphanieae (Amaranthaceae). Taxon. 2021;70(1):526–51.

Shepherd KA, Macfarlane TD, Waycott M. Phylogenetic analysis of the Australian salicornioideae (Chenopodiaceae) based on morphology and nuclear DNA. Aust Syst Bot. 2005;18(1):89–115.

Kadereit G, Mucina L, Freitag H. Phylogeny of Salicornioideae (Chenopodiaceae): diversification, biogeography and evolutionary trends in leaf and flower morphology. Taxon. 2006;55(3):617–42.

Akhani H, Edwards G, Roalson EH. Diversification of the Old World Salsoleae s.l. (Chenopodiaceae): molecular phylogenetic analysis of nuclear and chloroplast data sets and a revised classification. Int J Plant Sci. 2010;171(9):1059–71.

Wen ZB, Zhang ML, Zhu GL, Stewart CS. Phylogeny of Salsoleae s.l. (Chenopodiaceae) based on DNA sequence data from ITS, psbB - psbH , and rbcL , with emphasis on taxa of northwestern China. Plant Syst Evol. 2010;288(1):25–42.

Schütze P, Freitag H, Weising K. An integrated molecular and morphological study of the subfamily Suaedoideae Ulbr. (Chenopodiaceae). Plant Syst Evol. 2003;239:257–86.

Kung HW, Chu GL, Tsien CP, Li AJ, Ma CG. The Chenopodiaceae in China. Acta Phytotax Sin. 1978;16(1):99–123.

Li B, Feng H, Pan J. Phylogenetic study of the Chinese endemic genus Baolia . Acta Bot Boreal-Occident Sin. 2021;41(7):1137–47.

Sukhorukov AP. The carpology of the Chenopodiaceae with reference to the phylogeny, systematics and diagnostics of its representatives. Tula: Grif & Co.; 2014. p. 1–397.

Sukhorukov AP, Liu PL, Kushunina M. Taxonomic revision of Chenopodiaceae in Himalaya and Tibet. PhytoKeys. 2019;116:1–141.

Dong W, Liu H, Xu C, Zuo YJ, Chen ZJ, Zhou SL. A chloroplast genomic strategy for designing taxon specific DNA mini-barcodes: a case study on ginsengs. BMC Genet. 2014;15(1):1–8.

Ni L, Zhao Z, Xu H, Chen SL, Dorje G. The complete chloroplast genome of Gentiana straminea (Gentianaceae), an endemic species to the sino-himalayan subregion. Gene. 2016;577(2):281–8.

Müller K, Borsch T. Phylogenetics of Amaranthaceae based on matK / trnK sequence data: evidence from parsimony, likelihood, and bayesian analyses. Ann Mo Bot Gard. 2005;92(1):66–102.

Kadereit G, Hohmann S, Kadereit JW. A synopsis of the Chenopodiaceae subfam. Betoideae and notes on the taxonomy of Beta . Willdenowia. 2006;36(1):9–19.

Weng ML, Blazier JC, Govindu M, Jansen RK. Reconstruction of the ancestral plastid genome in Geraniaceae reveals a correlation between genome rearrangements, repeats, and nucleotide substitution rates. Mol Biol Evol. 2014;31(3):645–59.

Heale SM, Petes TD. The stabilization of repetitive tracts of DNA by variant repeats requires a functional DNA mismatch repair system. Cell. 1995;83(4):539–45.

Hu Y, Woeste KE, Zhao P. Completion of the chloroplast genomes of five Chinese Juglans and their contribution to chloroplast phylogeny. Front Plant Sci. 2017;7:231924.

Duan H, Guo JB, Xuan L, Wang ZY, Li MZ, Yin YL, et al. Comparative chloroplast genomics of the genus Taxodium . BMC Genomics. 2020;21:1–14.

She H, Liu Z, Xu Z, Zhang HL, Cheng F, Wu J, et al. Comparative chloroplast genome analyses of cultivated spinach and two wild progenitors shed light on the phylogenetic relationships and variation. Sci Rep. 2022;12(1):856.

Brotherus VF, Handel-Mazzetti H. Symbolae Sinicae: botanische Ergebnisse der Expedition der Akademie der Wissenschaften in Wien nach Südwest-China. 1914–1918 (Chenopodiaceae). Wien: J. Springer; 1929. p. 1324.

Zhu GL, Sanderson SC. Genera and a new evolutionary system of World Chenopodiaceae. Beijing: Science; 2017. p. 68.

Cho KS, Yun BK, Yoon YH, Hong SY, Mekapogu M, Kim KH, et al. Complete chloroplast genome sequence of tartary buckwheat ( Fagopyrum tataricum ) and comparative analysis with common buckwheat ( F. esculentum ). PLoS One. 2015;10(5):e0125332.

Fu PC, Zhang YZ, Geng HM, Chen SL. The complete chloroplast genome sequence of Gentiana lawrencei var. farreri (Gentianaceae) and comparative analysis with its congeneric species. PeerJ. 2016;4:e2540.

Choi KS, Chung MG, Park SJ. The complete chloroplast genome sequences of three veroniceae species (Plantaginaceae): comparative analysis and highly divergent regions. Front Plant Sci. 2016;7:355.

Khan A, Asaf S, Khan AL, Shehzad T, Rawahi AA, Harrasi AA. Comparative chloroplast genomics of endangered Euphorbia species: insights into hotspot divergence, repetitive sequence variation, and phylogeny. Plants (Basel). 2020;9(2):199.

Hong SY, Cheon KS, Yoo KO, Lee HO, Cho KS, Suh JT, et al. Complete chloroplast genome sequences and comparative analysis of Chenopodium quinoa and C . album . Front Plant Sci. 2017;8:1696.

Tangphatsornruang S, Uthaipaisanwong P, Sangsrakru D, Chanprasert J, Yoocha T, Jomchai N, et al. Characterization of the complete chloroplast genome of Hevea brasiliensis reveals genome rearrangement, RNA editing sites and phylogenetic relationships. Gene. 2011;475(2):104–12.

Bellot S, Renner SS. The plastomes of two species in the endoparasite genus Pilostyles (Apodanthaceae) each retain just five or six possibly functional genes. Genome Biol Evol. 2016;8(1):189–201.

Petersen G, Cuenca A, Seberg O. Plastome evolution in hemiparasitic mistletoes. Genome Biol Evol. 2015;7(9):2520–32.

Cauz-Santos LA, Munhoz CF, Rodde N, Cauet S, Santos AA, Penha HA, et al. The chloroplast genome of Passiflora edulis (Passifloraceae) assembled from long sequence reads: structural organization and phylogenomic studies in Malpighiales. Front Plant Sci. 2017;8:334.

Guo YY, Yang JX, Li HK, Zhao HS. Chloroplast genomes of two species of Cypripedium : expanded genome size and proliferation of AT-biased repeat sequences. Front Plant Sci. 2021;12:609729.

Wu Z, Raven PH. Ulmaceae through Basellaceae. In: Wu ZY, Raven PH, editors. Flora of China. Volume 5. Beijing: Science; Saint Louis: Missouri Botanical Garden Press; 2003. p. 367.

Shi XJ, Zhang ML. Phylogeographical structure inferred from cpDNA sequence variation of Zygophyllum xanthoxylon across north-west China. J Plant Res. 2015;28(2):269–82.

IUCN. The IUCN red list of threatened species, version 2022-2. Gland: IUCN; 2023. http://www.iucnredlist.org/documents/RedListGuidelines.pdf . Accessed 7 May 2018 .

Sukhorukov AP. Fruit anatomy and its significance in Corispermum (Corispermoideae, Chenopodiaceae). Willdenowia. 2007;37(1):63–87.

Shepherd KA, Macfarlane TD, Colmer TD. Morphology, anatomy and histochemistry of Salicornioideae (Chenopodiaceae) fruits and seeds. Ann Bot. 2005;95(6):917–33.

Sukhorukov AP. Karpologische Untersuchung der Axyris-Arten (Chenopodiaceae) im Zusammenhang mit ihrer Diagnostik und Taxonomie. Feddes Repert. 2005;116(3–4):168–76.

Sukhorukov AP. Fruit anatomy of the genus Anabasis (Salsoloideae, Chenopodiaceae). Aust Syst Bot. 2008;21(6):431–42.

Sukhorukov AP, Zhang M. Fruit and seed anatomy of Chenopodium and related genera (Chenopodioideae, Chenopodiaceae/Amaranthaceae): implications for evolution and taxonomy. PLoS One. 2013;8(4):e6190.

Sukhorukov AP, Shiposha VD, Kushunina M, Zaika MA. Biogeography and systematics of the genus Axyris (Amaranthaceae s.l). Plants (Basel). 2022;11(21):2873.

Chu GL. On systematic position of Baolia Kung et G.L.Chu in Chenopodiaceae. Acta Phytotax Sin. 1988;26(4):299–300.

Doyle JJ, Doyle JL. A rapid DNA isolation procedure for small quantities of fresh leaf tissue. Phytochem Bull. 1987;19:11–5.

Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20.

Meyer M, Kircher M. Illumina sequencing library preparation for highly multiplexed target capture and sequencing. Cold Spring Harb Protoc. 2010;2010(6):1–11.

Jin JJ, Yu WB, Yang JB, Song Y, Depamphilis CW, Yi TS, et al. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 2020;21(1):241.

Wick RR, Schultz MB, Zobel J, Holt KE. Bandage: interactive visualization of de novo genome assemblies. Bioinformatics. 2015;31(20):3350–2.

Tillich M, Lehwark P, Pellizzer T, Ulbricht-Jones ES, Fischer A, Bock R, et al. Geseq - versatile and accurate annotation of organelle genomes. Nucleic Acids Res. 2017;45(W1):W6–11.

Qu XJ, Moore MJ, Li DZ, Yi TS. PGA: a software package for rapid, accurate, and flexible batch annotation of plastomes. Plant Methods. 2019;15:1–12.

Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10.

Kearse M, Moir R, Wilson A, Havas SS, Cheung M, Sturrock S, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28(12):1647–9.

Zheng S, Poczai P, Hyvönen J, Tang J, Amiryousefi A. Chloroplot: an online program for the versatile plotting of organelle genomes. Front Genet. 2020;11:576124.

Lohse M, Drechsel O, Bock R. OrganellarGenomeDRAW (OGDRAW): a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Curr Genet. 2007;52:267–74.

Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.

Frazer KA, Pachter L, Poliakov A, Rubin ME, Dubchak I. VISTA: computational tools for comparative genomics. Nucleic Acids Res. 2004;32(W1):W273–9.

Amiryousefi A, Hyvönen J, Poczai P. IRscope: an online program to visualize the junction sites of chloroplast genomes. Bioinformatics. 2018;34(17):3030–1.

Darling AE, Mau B, Perna NT. Progressive Mauve: multiple genome alignment with gene gain, loss and rearrangement. PLoS One. 2010;5(6):e11147.

Librado P, Rozas J. DnaSP v5: a software for comprehensive analysis of DNA polymorphism data. Bioinformatics. 2009;25(11):1451–2.

Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001;29(22):4633–42.

Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17(4):540–52.

Suwazono S, Arao H. A newly developed free software tool set for averaging electroencephalogram implemented in the Perl programming language. Heliyon. 2020;6(11):1–12.

Zhang D, Gao FL, Jakovlić I, Zou H, Zhang J, Li WX, et al. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol Ecol Resour. 2020;20(1):348–55.

Posada D, Crandall KA. MODELTEST: testing the model of DNA substitution. Bioinformatics. 1998;14(9):817–8.

Darriba D, Taboada GL, Doallo R, Posada D. jModelTest 2: more models, new heuristics and parallel computing. Nat Methods. 2012;9(8):772.

Miller MA, Schwartz T, Pickett BE, He S, Klem EB, Scheuermann RH, et al. A RESTful API for access to phylogenetic tools via the CIPRES science gateway. Evol Bioinform. 2015;11:43–8.

Ronquist F, Huelsenbeck JP. MrBayes 3: bayesian phylogenetic inference under mixed models. Bioinformatics. 2003;19(12):1572–4.

Odago WO, Waswa EN, Nanjala C, Mutinda ES, Wanga VO, Mkala EM, et al. Analysis of the complete plastomes of 31 Species of Hoya group: insights into their comparative genomics and phylogenetic relationships. Front Plant Sci. 2021;12:814833.

Letunic I, Bork P. Interactive tree of life (iTOL) v5: an online tool for phylogenetic tree display and annotation. Nucleic Acids Res. 2021;49(W1):W293–6.

Drummond AJ, Suchard MA, Xie D, Rambaut A. Bayesian phylogenetics with BEAUti and the BEAST 1.7. Mol Biol Evol. 2012;29(8):1969–73.

Drummond AJ, Rambaut A. BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol Biol. 2007;7:214.

Kadereit G, Mavrodiev EV, Zacharias EH, Sukhorukov AP. Molecular phylogeny of Atripliceae (Chenopodioideae, Chenopodiaceae): implications for systematics, biogeography, flower and fruit evolution, and the origin of C 4 photosynthesis. Am J Bot. 2010;97(10):1664–87.

Rambaut A, Drummond AJ, Xie D, Baele G, Suchard MA. Posterior summarization in Bayesian phylogenetics using Tracer 1.7. Syst Biol. 2018;67(5):901–4.

Paradis E, Claude J, Strimmer K. APE: Analyses of phylogenetics and evolution in R language. Bioinformatics. 2004;20(2):289–90.

Heritage S. MBASR: workflow-simplified ancestral state reconstruction of discrete traits with MrBayes in the R environment. bioRxiv. 2021. https://doi.org/10.1101/2021.01.10.426107 .

Sukhorukov AP, Kushunina MA. Taxonomic revision of Chenopodiaceae in Nepal. Phytotaxa. 2014;191(1):10–44.

Author information

Authors and affiliations

College of Life Sciences, Xinjiang Agricultural University, Urumqi, 830052, China

Shuai Liu & Jannathan Mamut

Biodiversität und Evolution der Pflanzen, Prinzessin Therese von Bayern-Lehrstuhl für Systematik, Ludwig-Maximilians-Universität München, Menzinger Str. 67, 830052, München, Germany

Marie Claire Veranso-Libalah

Department of Higher Plants, Biological Faculty, Lomonosov Moscow State University, Moscow, 119234, Russian Federation

Alexander P. Sukhorukov & Maya V. Nilova

Laboratory Herbarium (TK), Tomsk State University, Tomsk, 634050, Russian Federation

Alexander P. Sukhorukov & Maria Kushunina

College of Forestry, Gansu Agricultural University, Lanzhou, 730070, China

Xuegang Sun

Department of Plant Physiology, Biological Faculty, Lomonosov Moscow State University, Moscow, 119234, Russian Federation

Maria Kushunina

State Key Laboratory of Desert and Oasis Ecology, Key Laboratory of Ecological Safety and Sustainable Development in Arid Lands, Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, 830011, China

Xinjiang Key Lab of Conservation and Utilization of Plant Gene Resources, Urumqi, 830011, China

Sino-Tajikistan Joint Laboratory for Conservation and Utilization of Biological Resources, Urumqi, 830011, China

The Specimen Museum of Xinjiang Institute of Ecology and Geography, Chinese Academy of Sciences, Urumqi, 830011, China

Contributions

ZBW, APS designed the research. SL, MCVL, XGS, ZBW conducted sample collection and data analysis, and drafted the manuscript. XGS, ZBW, APS provided guidance on taxonomy. MJ, MVN conducted some of the data processing. MVN conducted lab experiments. MCVL, APS, ZBW, MK revised the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Alexander P. Sukhorukov or Zhibin Wen.

Ethics declarations

Ethics approval and consent to participate

This study’s material collections and experimental research followed the relevant institutional, national, and international guidelines and legislation. No specific permissions or licenses were needed.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary material 1

Supplementary material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Liu, S., Veranso-Libalah, M.C., Sukhorukov, A.P. et al. Phylogenetic placement of the monotypic Baolia (Amaranthaceae s.l.) based on morphological and molecular evidence. BMC Plant Biol 24, 456 (2024). https://doi.org/10.1186/s12870-024-05164-8

Received: 06 November 2023

Accepted: 17 May 2024

Published: 25 May 2024

DOI: https://doi.org/10.1186/s12870-024-05164-8

  • Corispermoideae
  • Morphoanatomical character
  • Plastid genome
  • Chenopodiaceae

BMC Plant Biology

ISSN: 1471-2229

Bringing Mammoths Back from Extinction

Developing Scientific and Information Literacies

By Andrea M.-K. Bierema, Sara D. Miller, Claudia E. Vergara

The focus of this case study is on the development of scientific and information literacies. Working on activities that address audience, purpose, language, authority, and use of evidence, students examine how the ways in which information sources are constructed can impact readers’ perceptions of scientific content. The case study was created for a flipped classroom in which students learn basic information literacy concepts before class and then work in teams during class or online to apply those concepts. Students analyze two articles related to “bringing back” mammoths by cloning ancient mammoth DNA: one article is an original research paper and the other is a news article discussing the research. The case is the first of a two-case study sequence (the second case study is “The Stakeholders of Gorongosa National Park: Intersecting Scientific and Information Literacies”), and can be taught either as a stand-alone activity or as the first in this two-case sequence. Both case studies focus on information literacy rather than scientific content and can be used in a wide variety of science courses.

  • Describe different sources of information, considering the information needs of different users and the appropriate uses for different types of sources.
  • Describe the ways in which content (e.g., claims and supporting evidence) is used in different types of sources.
  • Explain that the ways in which information sources are constructed and the formats in which they are presented can impact readers’ perceptions of scientific content.
  • Explain the factors that guide their choice when matching information products (e.g., popular media, scientific sources) with information needs.
  • Explain the ways in which scientific information clarifies claims from popular media.

Article analysis; de-extinction; information literacy; mammoth; popular media; primary literature; scientific article; scientific literacy; secondary literature

  

Subject Headings

EDUCATIONAL LEVEL

High school, Undergraduate lower division

TOPICAL AREAS

Science and the media

TYPE/METHODS

Teaching Notes & Answer Key

Teaching notes

Case teaching notes are protected and access to them is limited to paid subscribed instructors. To become a paid subscriber, purchase a subscription here.

Teaching notes are intended to help teachers select and adopt a case. They typically include a summary of the case, teaching objectives, information about the intended audience, details about how the case may be taught, and a list of references and resources.

Answer Keys are protected and access to them is limited to paid subscribed instructors. To become a paid subscriber, purchase a subscription here.

Materials & Media

Supplemental materials

The file below is a Microsoft Word version of the case that uses a comparison table for learners to complete.

  • info_lit_mammoth_sup.docx (~ 27KB)

Study finds widespread ‘cell cannibalism,’ related phenomena across tree of life

Research has implications for human health, cancer treatment.

In addition to competing for resources, living cells actively kill and eat each other. New explorations of these "cell-in-cell" phenomena show they are not restricted to cancer cells but are a common facet of living organisms, across the tree of life. Graphic by Jason Drees

In a new review paper, Carlo Maley and Arizona State University colleagues describe cell-in-cell phenomena in which one cell engulfs and sometimes consumes another. The study shows that cases of this behavior, including cell cannibalism, are widespread across the tree of life. 

The findings challenge the common perception that cell-in-cell events are largely restricted to cancer cells. Rather, these events appear to be common across diverse organisms, from single-celled amoebas to complex multicellular animals.

The widespread occurrence of such interactions in non-cancer cells suggests that these events are not inherently "selfish" or "cancerous" behaviors. Rather, the researchers propose that cell-in-cell phenomena may play crucial roles in normal development, homeostasis and stress response across a wide range of organisms.

The study argues that targeting cell-in-cell events as an approach to treating cancer should be abandoned, as these phenomena are not unique to malignancy. 

By demonstrating that occurrences span a wide array of life forms and are deeply rooted in our genetic makeup, the research invites us to reconsider fundamental concepts of cellular cooperation, competition and the intricate nature of multicellularity. The study opens new avenues for research in evolutionary biology, oncology and regenerative medicine.

The new research, published in the journal Scientific Reports, is the first to systematically investigate cell-in-cell phenomena across the tree of life. The group’s findings could help redefine the understanding of cellular behavior and its implications for multicellularity, cancer and the evolutionary journey of life itself.

“We first got into this work because we learned that cells don’t just compete for resources — they actively kill and eat each other,” Maley says. “That’s a fascinating aspect of the ecology of cancer cells. But further exploration revealed that these phenomena happen in normal cells, and sometimes neither cell dies, resulting in an entirely new type of hybrid cell.”

Maley is a researcher with the Biodesign Center for Biocomputing, Security and Society; professor in the School of Life Sciences at ASU; and director of the Arizona Cancer Evolution Center.

The study was conducted in collaboration with first author Stefania E. Kapsetaki, formerly with ASU and now a researcher at Tufts University, and Luis Cisneros, formerly with ASU and currently a researcher at Mayo Clinic.

From selfish to cooperative cell interactions

Cell-in-cell events have long been observed but remain poorly understood, especially outside the context of immune responses or cancer. The earliest genes responsible for cell-in-cell behavior date back over 2 billion years, suggesting the phenomena play an important, though yet-to-be-determined, role in living organisms. Understanding the diverse functions of cell-in-cell events, both in normal physiology and disease, is important for developing more effective cancer therapies.

The review delves into the occurrence, genetic underpinnings and evolutionary history of cell-in-cell phenomena, shedding light on a behavior once thought to be an anomaly. The researchers reviewed more than 500 articles to catalog the various forms of cell-in-cell phenomena observed across the tree of life.

The study describes 16 different taxonomic groups in which cell-in-cell behavior is found to occur. The cell-in-cell events were classified into six distinct categories based on the degree of relatedness between the host and prey cells, as well as the outcome of the interaction (whether one or both cells survived).

The study highlights a spectrum of cell-in-cell behaviors, ranging from completely selfish acts, where one cell kills and consumes another, to more cooperative interactions, where both cells remain alive. For example, the researchers found evidence of "heterospecific killing," where a cell engulfs and kills a cell of a different species, across a wide range of unicellular, facultatively multicellular, and obligate multicellular organisms. In contrast, "conspecific killing," where a cell consumes another cell of the same species, was less common, observed in only three of the seven major taxonomic groups examined.

Obligate multicellular organisms are those that must exist in a multicellular form throughout their life cycle. They cannot survive or function as single cells. Examples include most animals and plants. Facultative multicellular organisms are organisms that can exist either as single cells or in a multicellular form depending on environmental conditions. For example, certain types of algae may live as single cells in some conditions but form multicellular colonies in others.

The team also documented cases of cell-in-cell phenomena where both the host and prey cells remained alive after the interaction, suggesting these events may serve important biological functions beyond just killing competitors.

“Our categorization of cell-in-cell phenomena across the tree of life is important for better understanding the evolution and mechanism of these phenomena,” Kapsetaki says. “Why and how exactly do they happen? This is a question that requires further investigation across millions of living organisms, including organisms where cell-in-cell phenomena may not yet have been searched for.”

Ancient genes

In addition to cataloging the diverse cell-in-cell behaviors, the researchers also investigated the evolutionary origins of the genes involved in these processes. Surprisingly, they found that many of the key cell-in-cell genes emerged long before the evolution of obligate multicellularity.

“When we look at genes associated with known cell-in-cell mechanisms in species that diverged from the human lineage a very long time ago, it turns out that the human orthologs (genes that evolved from a common ancestral gene) are typically associated with normal functions of multicellularity, like immune surveillance,” Cisneros says.    

In total, 38 genes associated with cell-in-cell phenomena were identified, and 14 of these originated over 2.2 billion years ago, predating the common ancestor of some facultatively multicellular organisms. This suggests that the molecular machinery for cell cannibalism evolved before the major transitions to complex multicellularity.

The ancient cell-in-cell genes identified in the study are involved in a variety of cellular processes, including cell–cell adhesion, phagocytosis (engulfment), intracellular killing of pathogens and regulation of energy metabolism. This diversity of functions indicates that cell-in-cell events likely served important roles even in single-celled and simple multicellular organisms well before the emergence of complex multicellular life.
