Originally posted by Cerebrum123
I've got an undergrad in bioengineering, but I'm not any kind of biologist. This is hard.
I hit these papers, dig in for a bit, and then bounce. Then I hit them again, get a little deeper, and bounce again. I'm into double-digit attempts, over weeks now, at making sense of this one paper. I've finally made it all the way through, but there are big gaps in what I understand. I've been wicked busy this entire time, too, especially last week, finals week, which is always crazy even without the extra-special load from doing it all remotely. I went three days without sleeping last week. I've been doing this for the last decade and I pull all-nighters pretty regularly, but three days straight? No, never.
So here's the big picture, Figure 1 from Andersen.
Andersen 1.jpg
SARS-CoV-2 is RNA-based and 29,903 bases long, enough to code for about ten thousand amino acids.
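As a quick sanity check on that conversion, here's a minimal Python sketch. It assumes three bases per amino acid and treats every base as coding, which overcounts, so read the result as an upper bound. The variable names are mine, not from any of the papers.

```python
# Genome length as quoted in the post.
GENOME_LENGTH_BASES = 29_903

# Standard genetic code: one amino acid per three-base codon.
BASES_PER_CODON = 3

# Assumption: every base is coding. Real genomes have untranslated
# regions, so this is a ceiling, not an exact count.
max_amino_acids = GENOME_LENGTH_BASES // BASES_PER_CODON

print(max_amino_acids)  # 9967
```

So "about ten thousand" is a fair round number.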
According to Zheng-Li Shi, head of the bat lab at WIV, its closest match, at 96.2 percent, is RaTG13, captured in 2013 for WIV in Pu'er, Yunnan Province, about 1200 miles from Wuhan.
"Simplot analysis showed that 2019-nCoV was highly similar throughout the genome to RaTG13 (Fig. 1c), with an overall genome sequence identity of 96.2%."
RaTG13 was published to GenBank in late January. According to Susanna K.P. Lau out of Hong Kong, RaTG13 is 96.1 percent similar to SARS-CoV-2, with another close match at 89.7 percent: Pangolin-SARSr-CoV/Guangdong/1/2019, captured in 2019 in Guangzhou, Guangdong Province, about 600 miles from Wuhan.
Wuhan Yunnan Guangdong.jpg
The difference between SARS-CoV-2 and RaTG13 is just under four percent: roughly 1200 bases, or the equivalent of about 400 amino acids.
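Here's the arithmetic behind those round numbers, as a rough Python sketch. It takes the two identity figures quoted above (96.2 and 96.1 percent) and the 29,903-base genome length at face value. The function names are made up for illustration, and dividing by three only gives a codon-sized equivalent, not a count of amino acids that actually changed.

```python
GENOME_LENGTH_BASES = 29_903  # genome length quoted in the post


def differing_bases(identity_pct: float) -> int:
    """Approximate number of bases that differ at a given percent identity."""
    return round(GENOME_LENGTH_BASES * (100 - identity_pct) / 100)


def codon_equivalent(bases: int) -> int:
    """Bases divided by three; NOT a count of changed amino acids."""
    return bases // 3


for identity in (96.2, 96.1):
    bases = differing_bases(identity)
    print(identity, bases, codon_equivalent(bases))
# 96.2 1136 378
# 96.1 1166 388
```

Both land a little under the 1200 and 400 above, which is what you'd expect from rounding "just under four percent" up to four.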
In news reporting posted earlier in-thread, it was estimated that a difference of that size would take 20 to 50 years to evolve in the wild. WIV has had RaTG13 for seven years. In the same article a researcher speculated that, because most of the differences from RaTG13 are in the receptor binding domain and show selection for the human ACE2 receptor, the gap might have been bridged by growing the virus in human lung cell cultures. That's called a "gain of function" experiment. The goal is to discover mutations which could occur in nature and negatively impact human health.
Yes, that's as dangerous as it sounds.
So let's look at the receptor binding domain.
Andersen, Figure 1a.
Andersen 1a.jpg
The spike region codes for 1285 amino acids. The receptor binding domain spans 59 amino acids. Six of those residues target human ACE2. RaTG13 shares one of the six. The consensus pangolin coronavirus shares all six.
"The pangolin coronavirus sequences are a consensus generated from SRR10168377 and SRR10168378 (NCBI BioProject PRJNA573298)."
The pangolin consensus differs from SARS-CoV-2 in the RBD by only one amino acid, but it takes a consensus of two sequencing runs to get there. RaTG13 is off by eleven. Eleven is nowhere close to the entire difference, but still, that's nearly 20 percent off here vs. 4 percent off overall. If SARS-CoV-2 came from RaTG13, that points to selection pressure on the RBD.
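To put numbers on that comparison, here's a small Python sketch. The 59-residue window and the mismatch counts of eleven and one are my own reading of Figure 1a, and the 96.2 percent overall identity is the figure quoted earlier. The helper and the toy strings in the usage line are hypothetical, included only to show how a mismatch count over an alignment would be taken. They are not real sequence data.

```python
def count_mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


RBD_WINDOW = 59          # residues in the Figure 1a window, per the post
RATG13_MISMATCHES = 11   # RaTG13 vs SARS-CoV-2 in that window, per the post
PANGOLIN_MISMATCHES = 1  # consensus pangolin vs SARS-CoV-2, per the post

ratg13_rbd_pct = 100 * RATG13_MISMATCHES / RBD_WINDOW
pangolin_rbd_pct = 100 * PANGOLIN_MISMATCHES / RBD_WINDOW
overall_pct = 100 * (1 - 0.962)  # whole-genome difference at 96.2% identity

print(round(ratg13_rbd_pct, 1))                # 18.6
print(round(pangolin_rbd_pct, 1))              # 1.7
print(round(ratg13_rbd_pct / overall_pct, 1))  # 4.9

# Toy usage with made-up letters (not real residues):
print(count_mismatches("ABCDEF", "ABCXEF"))    # 1
```

One caveat on my own comparison: the 18.6 percent is over amino acids and the 3.8 percent is over bases, so the "about five times" ratio is apples to oranges and only a rough indicator.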
Now, about those O-linked glycans.
Figure 1b.
Andersen 1b.jpg
That's a blow-up of 27 amino acids between the S1 and S2 subunits. There is no difference between RaTG13 and SARS-CoV-2 outside the cleavage site. The consensus pangolin differs from both of them at two positions. All three match at the O-linked glycan residues, which takes an important issue off the table:
"Finally, the generation of the predicted O-linked glycans is also unlikely to have occurred due to cell-culture passage, as such features suggest the involvement of an immune system."
If SARS-CoV-2 came from RaTG13, that leaves only the polybasic cleavage site unaccounted for.