The accompanying discussion guide for this D4P event can be found here.
In this installment of D4P, Dr. Andrés Mansisidor presents a review article that synthesizes data from over 30 peer-reviewed publications examining the genetic lineage and sequence similarities of the SARS-CoV-2 virus in humans. This paper, “The Proximal Origin of SARS-CoV-2,” was published in Nature Medicine in April 2020 and has already been viewed nearly five million times.
What you need to know:
Since the beginning of the COVID-19 global pandemic, many conspiracy theories about the origins of SARS-CoV-2 have arisen. To help assuage these concerns, scientists from multiple institutions across the globe came together to synthesize all of the genomic data currently available for the new coronavirus and to compare it with genomic data from related coronaviruses (in humans and other animals). The authors clearly show that SARS-CoV-2 is a naturally occurring virus and was not made in a laboratory (i.e., this virus was not synthetically made by humans).
The following key terms are defined in this D4P presentation and are important for understanding the data presented.
Central Dogma
The phrase “central dogma” describes the flow of genetic information within a biological system, from DNA to RNA to protein (or, for SARS-CoV-2: RNA to mRNA to protein). By understanding how this process takes place in nature, scientists have developed a variety of molecular technologies that provide insights into biological processes.
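To make the flow of information concrete, here is a minimal Python sketch of the DNA to RNA to protein steps. It assumes the standard genetic code and a made-up 12-base toy gene. The function names `transcribe` and `translate` are illustrative choices, not part of the presentation or the paper.

```python
from itertools import product

# Standard genetic code, written in the conventional U, C, A, G codon order.
# "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(coding_dna):
    """Return the mRNA matching a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Read an mRNA three bases at a time and return the protein sequence."""
    protein = []
    for start in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[start:start + 3]]
        if amino_acid == "*":  # a stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


toy_gene = "ATGGCCAAGTAA"  # made-up sequence, not taken from any real genome
toy_mrna = transcribe(toy_gene)
print(toy_mrna)
print(translate(toy_mrna))
```

Each amino acid is written as a one-letter code, so the protein comes out as a short string of letters. Real genes are far longer, and this sketch ignores details such as finding the correct reading frame.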
Receptor Binding Domain
As described by the central dogma of molecular biology, DNA is transcribed into RNA, which is then translated into protein. We can think about proteins in terms of their amino acid sequence, which dictates the protein’s three-dimensional structure. Within this three-dimensional structure there are regions, or subdomains, that can serve a specific function. The receptor binding domain (RBD) is an example of a protein subdomain that is important for interacting with another type of protein (a receptor). In the case of SARS-CoV-2, the surface spike protein has a subdomain that is important for interacting with the human ACE2 receptor. This region of amino acids (including its 3D shape) is known as the RBD of the SARS-CoV-2 spike protein, and it has been a key player in the COVID-19 pandemic.
Sequence Alignment
Sequence alignment is a method that allows for the identification of similarities and differences between related molecules of DNA, RNA, or protein. For example, we can examine a specific enzyme (protein) from a variety of organisms and align the sequences to see which amino acids are conserved and which are changed. Sequence alignment is a powerful tool that provides clues into evolutionary relationships, as well as insights into structure-function relationships. In the paper discussed here, researchers used sequence alignment to compare the spike proteins from a variety of different coronaviruses (from bats, pangolins, and humans).
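As a rough illustration of how an alignment is computed, the sketch below implements a basic global alignment (the Needleman-Wunsch algorithm) with a simple scoring scheme, then lists which positions are conserved. The scores (+1 match, -1 mismatch, -1 gap), the function names, and the two toy peptides are assumptions made for this example. This is not the software or the scoring used in the paper.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment; returns the two gapped strings."""
    n, m = len(a), len(b)

    def pair_score(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + pair_score(i, j),  # align two residues
                score[i - 1][j] + gap,                   # gap in b
                score[i][j - 1] + gap,                   # gap in a
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair_score(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def conserved_positions(aligned_a, aligned_b):
    """Indices of alignment columns where both sequences share a residue."""
    return [
        k for k, (x, y) in enumerate(zip(aligned_a, aligned_b))
        if x == y and x != "-"
    ]


# Two made-up peptides; the second is missing one residue.
row_a, row_b = global_align("MKTLLVAG", "MKTLVAG")
print(row_a)
print(row_b)
print(conserved_positions(row_a, row_b))
```

Real protein comparisons typically score substitutions with a matrix such as BLOSUM62 and rely on dedicated alignment tools, but the idea is the same: line the sequences up, then read off which positions match and which differ.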
Molecular Clock
In molecular biology, the “molecular clock” refers to the rate at which mutations are incorporated into a genomic sequence. Comparing specific genomic sequences over time, or across species, can give a sense of timing (i.e., which version of the sequence came first) and help reveal evolutionary divergence. Here, scientists apply the molecular clock to the RBD of the spike protein to determine whether bats or pangolins might have been the source of the SARS-CoV-2 virus that can now infect humans.
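The arithmetic behind a simple molecular clock can be sketched in a few lines of Python. The example assumes a strict clock (a constant mutation rate), two sequences that are already aligned, and a hypothetical rate of 0.001 substitutions per site per year chosen only to keep the numbers easy. None of these values come from the paper. The divergence time is the fraction of differing sites divided by twice the rate, because both lineages have been accumulating changes since their common ancestor.

```python
def p_distance(aligned_a, aligned_b):
    """Fraction of aligned, ungapped sites at which two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped sites to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def divergence_time(distance, rate_per_site_per_year):
    """Years since two lineages split, under a strict molecular clock.

    The factor of 2 accounts for both lineages mutating independently
    since their common ancestor.
    """
    return distance / (2 * rate_per_site_per_year)


# Made-up 20-base sequences that differ at 2 sites.
toy_a = "ATGGCCAAGTTTGACCTGAA"
toy_b = "ATGGCTAAGTTTGACTTGAA"
hypothetical_rate = 1e-3  # substitutions per site per year (illustrative only)

distance = p_distance(toy_a, toy_b)
print(distance)
print(divergence_time(distance, hypothetical_rate))
```

This back-of-the-envelope estimate undercounts changes when the same site has mutated more than once, which is why published analyses use statistical models rather than a raw count of differences.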
About our D4P Fellow
Andrés Mansisidor, PhD (they/them)
Dr. Mansisidor is a postdoctoral researcher in the Laboratory of Genome Architecture and Dynamics at The Rockefeller University. Andrés hearts all things DNA. During PhD training, they studied how DNA breaks and repair help drive genome evolution. As a postdoc, they are continuing to peer into the genome and its organization. Andrés is a proponent of variation in its many contexts, and they adore the famous Dobzhansky quote, “Nothing in Biology Makes Sense Except in the Light of Evolution.”