Examining the envelope protein of SARS-CoV-2
by Dr. Liji Thomas, MD
The novel coronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is taking an immense toll in terms of disease and death on much of the inhabited world, with over 5.8 million confirmed cases and more than 359,000 deaths in just under five months of viral transmission.
A new study by researchers at the University of Valencia, published on the preprint server bioRxiv* in May 2020, reports the topology of the envelope protein of the virus. The finding could contribute to a better understanding of how the virus interacts with other cell components and may help in fighting the disease.
The large genome of this RNA virus encodes up to 29 proteins, not all of which are expressed. The structural genes encode the proteins that envelope and package the viral RNA. These include the spike (S) surface protein, the membrane (M) protein, the nucleocapsid (N) protein, and the envelope (E) protein.
SARS-CoV-2 viruses binding to ACE-2 receptors on a human cell, the initial stage of COVID-19 infection. Conceptual 3D illustration. Image credit: Kateryna Kon / Shutterstock
The E Protein
The current study focuses on the E protein, which is the smallest of the structural proteins and has the lowest copy number in mature viruses. It has, however, been found to be essential for the disease-causing ability of other coronaviruses (CoVs). It is encoded by a subgenomic RNA (sgRNA) that is among the transcripts with the highest copy number.
The E protein is 75 residues long, and 27 of these residues are either leucine or valine, making it strongly hydrophobic. There is little similarity between the E proteins of the different CoVs known to infect humans. The E proteins of SARS-CoV and SARS-CoV-2, however, share 94.7% of their sequence.
Which End is Up?
The E protein was subjected to computer-aided analysis of the amino acid sequence by 7 different methods in widespread use. The results suggest that there is a single segment around residues 12-39 that extends through the membrane – a transmembrane (TM) segment. It is not a cleavable signal sequence according to the predictions.
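The idea behind this kind of prediction can be sketched with a simple sliding-window hydropathy scan. This is a minimal illustration, not one of the seven methods used in the study (the article does not name them). It uses the standard Kyte-Doolittle scale and a 19-residue window, a common choice for membrane-spanning helices. The `E_PROTEIN` sequence is an assumption, reproduced from the public reference entry, and should be checked against a sequence database; the function name is hypothetical.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumption: SARS-CoV-2 E protein sequence (75 residues) as recalled from
# the public reference entry; verify before relying on it.
E_PROTEIN = (
    "MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLC"
    "AYCCNIVNVSLVKPSFYVYSRVKNLNSSRVPDLLV"
)


def most_hydrophobic_window(seq, window=19):
    """Return (start, end, mean_score) of the top-scoring window.

    Positions are 1-based and inclusive.
    """
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    best = None
    for i in range(len(seq) - window + 1):
        # Average hydropathy over the residues in this window.
        mean = sum(KD[aa] for aa in seq[i:i + window]) / window
        if best is None or mean > best[2]:
            best = (i + 1, i + window, mean)
    return best


if __name__ == "__main__":
    start, end, score = most_hydrophobic_window(E_PROTEIN)
    print(f"Most hydrophobic window: residues {start}-{end} (mean {score:.2f})")
```

With this sequence, the top-scoring window should fall inside the 12-39 region described above. Dedicated topology predictors use considerably richer models, so this scan only illustrates why a single hydrophobic stretch stands out.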
The N-terminal end of this segment is predicted to be on the cytosolic side by 2 programs, while the remaining methods predict that it will be on the luminal side. To probe this, the researchers used N-linked glycosylation as a reporter, an approach that has been widely used for over 20 years to determine the topology of membrane proteins.
N-Linked Glycosylation Reporter
A eukaryotic cell can glycosylate a protein only on the luminal side of the endoplasmic reticulum (ER), because the oligosaccharyltransferase, the enzyme that performs this reaction, is located on that side. In the wildtype sequence, there are two sites on the C-terminal side of the predicted TM segment that could undergo N-linked glycosylation.
One of these positions, however, is not modifiable because the glycosylation acceptor site is too close to the membrane. Therefore, if the single remaining N66 site is glycosylated, it would indicate that C-terminal translocation had occurred. This would, in turn, indicate the ER luminal orientation of the C-terminal end.
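The search for candidate acceptor sites can be sketched as a scan for the N-X-S/T sequon (asparagine, then any residue except proline, then serine or threonine). This is a minimal illustration rather than the authors' procedure. The `E_PROTEIN` sequence is an assumption reproduced from the public reference entry, the function names are hypothetical, and the `min_distance` cutoff of 12 residues from the membrane is an illustrative value, not one stated in the article.

```python
import re

# Assumption: SARS-CoV-2 E protein sequence (75 residues) as recalled from
# the public reference entry; verify before relying on it.
E_PROTEIN = (
    "MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLC"
    "AYCCNIVNVSLVKPSFYVYSRVKNLNSSRVPDLLV"
)

# N-X-S/T sequon, where X is any residue except proline. The lookahead
# keeps matches from consuming residues, so adjacent sites are all found.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq):
    """Return 1-based positions of asparagines that start an N-X-S/T sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(seq)]


def usable_sites(seq, tm_end, min_distance=12):
    """Keep sequons at least min_distance residues past the TM segment end.

    min_distance is an illustrative cutoff (an assumption), not a value
    taken from the study.
    """
    return [p for p in find_sequons(seq) if p - tm_end >= min_distance]


if __name__ == "__main__":
    print("Sequons at:", find_sequons(E_PROTEIN))
    print("Far enough from the membrane:", usable_sites(E_PROTEIN, tm_end=39))
```

Taking residue 39 as the end of the TM segment, the scan finds two sequons on the C-terminal side, and only the one at N66 clears the illustrative distance cutoff, which is consistent with the description above.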
The investigators built a construct with a very efficient glycosylation acceptor site at the N-terminal end. They allowed the E protein to be translated in vitro with microsomes present.
Translation of a control that lacks a glycosylation acceptor site, or that does not possess the wildtype sequence, would be expected to yield minimal glycosylation. The experiment showed that significant glycosylation occurred only if the glycosylation site was present at the N-terminal end.
The researchers took into account the multiple topologies that have been found with other CoVs, which means the E protein of SARS-CoV-2 may be inserted into the microsomal membrane either with the C-terminal or with the N-terminal end facing the cytosol.
Given the glycosylation results, they concluded that the dominant orientation has the N-terminal end facing the lumen and the C-terminal end facing the cytosol.
Topology of the E Protein in a Mammalian Cell
Next, they transfected a mammalian cell culture with a set of E protein variants carrying a c-myc epitope tag at the C-terminal end. They found that the E protein underwent efficient glycosylation only if the acceptor site was at the N-terminal end. This indicates that the N-terminal end is localized to the ER lumen.
The Effect of Charge Distribution on Topology
The determinants of the topology of a membrane protein include the distribution of positively charged amino acids, which tend to be enriched on the cytosolic side. This is called the “positive-inside rule”, and it is the main parameter, as has been established both by experiments and by statistical analysis.
For the E protein, the rule works out as follows. The protein has a single TM segment with charged residues on both sides of the membrane. Of only 8 charged residues in total, 2 negatively charged residues lie before the TM segment and 1 lies at the C-terminal end. The C-terminal end, which is predicted to be oriented to the cytosol, also has 5 positively charged residues. This is in agreement with the “positive-inside rule.”
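The charge tally can be reproduced with a short count of acidic and basic residues on either side of the TM segment. This is a minimal sketch under stated assumptions: the `E_PROTEIN` sequence is reproduced from the public reference entry and should be verified, the TM boundaries (12-39) are the approximate ones given earlier in the article, and the function name is hypothetical. Histidine is ignored here, which is a simplification.

```python
# Assumption: SARS-CoV-2 E protein sequence (75 residues) as recalled from
# the public reference entry; verify before relying on it.
E_PROTEIN = (
    "MYSFVSEETGTLIVNSVLLFLAFVVFLLVTLAILTALRLC"
    "AYCCNIVNVSLVKPSFYVYSRVKNLNSSRVPDLLV"
)

POSITIVE = set("KR")  # lysine, arginine
NEGATIVE = set("DE")  # aspartate, glutamate


def charge_by_region(seq, tm_start, tm_end):
    """Count (positive, negative) residues in each region of the protein.

    tm_start and tm_end are the 1-based, inclusive TM segment boundaries.
    """
    regions = {
        "n_flank": seq[:tm_start - 1],
        "tm": seq[tm_start - 1:tm_end],
        "c_flank": seq[tm_end:],
    }
    return {
        name: (
            sum(aa in POSITIVE for aa in part),
            sum(aa in NEGATIVE for aa in part),
        )
        for name, part in regions.items()
    }


if __name__ == "__main__":
    for region, (pos, neg) in charge_by_region(E_PROTEIN, 12, 39).items():
        print(f"{region}: {pos} positive, {neg} negative")
```

With the approximate 12-39 boundaries, one of the five positive residues (R38) is counted inside the TM segment rather than in the C-terminal flank, so the exact split depends on where the boundary is drawn. Either way, the positive charges sit on the C-terminal side and the N-terminal flank carries only negative ones.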
However, negatively charged residues also affect topology. To test the above hypothesis, the researchers added a glycosylation tag and replaced the 2 negatively charged residues on the N-terminal side with 2 lysine residues. This mutant E protein continued to have its C-terminal end oriented to the cytosolic side of the microsomal membrane, without the occurrence of any glycosylation.
Topology of Viral Membrane Protein Depends on Host Protein
This observation led them to conclude that viral membrane protein topology is only slightly affected by these charge determinants. Rather, they hypothesize, the topology of a viral membrane protein is likely to change as the host cell’s protein environment changes, so that the viral protein will be in the right orientation relative to the host proteins.
The researchers conclude that the E protein of SARS-CoV-2 is a TM protein that has the N-terminal end oriented luminally and the C-terminal end oriented cytosolically. This is the same topology seen with the E protein of SARS-CoV when the virus is infecting a mammalian cell.
This also agrees with recent studies that show the E protein of SARS-CoV to be a pentameric structure when the protein is contained within micelles. Here the C-terminal end of the protein forms an extramembranous alpha-helix.
In addition, this proposed topology is compatible with an interaction between the cytosolically oriented C-terminal end of the E protein and the C-terminal ends of the M or S proteins of SARS-CoV-2. It also allows this end to interact with Golgi scaffold proteins. That interaction is important for virus budding and for collecting viral membrane proteins so that they can be assembled at the Golgi membranes, thus regulating the movement of vesicles through the Golgi complex. The researchers expect further studies to discover to what extent such functions require the E protein of the virus.
*Important Notice
bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.
Journal reference:
- Duart, G. et al. (2020). SARS-CoV-2 Envelope Protein Topology in Eukaryotic Membranes. bioRxiv preprint. doi: https://doi.org/10.1101/2020.05.27.118752. https://www.biorxiv.org/content/10.1101/2020.05.27.118752v1