Coronavirus SARS-CoV-2: scientifically accurate 3D model
Date:
12.05.2020
3D model of the SARS-CoV-2 virus at atomic resolution

Biomedical visualization studio Visual Science has created the most detailed and scientifically accurate 3D model of the SARS-CoV-2 virus at atomic resolution. The model is based on the latest scientific research into the structure of coronaviruses, as well as input from expert virologists involved in the research. This is the most accurate model of the SARS-CoV-2 viral particle currently available. To produce it, Visual Science employed the same techniques of structural bioinformatics used in basic research and drug development.

The SARS-CoV-2 virus model is a part of Visual Science’s non-commercial Viral Park project. Viral Park’s past successes include models of HIV, influenza A/H1N1, Ebola, papilloma, and Zika virions.

We use the same color scheme throughout the whole Viral Park project. Bright colors show the proteins encoded by the viral genome. Shades of gray correspond to the structures taken by the virus from the host cell. In this way we emphasize the parasitic and non-autonomous nature of viruses.

Models and visualizations created for the Viral Park have received prizes from Science magazine and the National Science Foundation, and have been featured in leading media outlets such as Science, Nature Medicine, The New York Times, The Washington Post, Scientific American, Wired UK, Der Spiegel, Stern, National Geographic, GEO, and more.

In their usual stunning style, the talented illustrators at Visual Science have created a model of the SARS-CoV-2 virus particle. It shows, in great detail, the intact particle with its spike glycoproteins embedded in the membrane. A cutaway view reveals the viral nucleocapsid inside the particle. These gorgeous images will enhance our understanding of the virus particle and for the non-scientist will make even more palpable the virus that is infecting millions of people.

— Vincent Racaniello, Ph.D. Higgins Professor Department of Microbiology and Immunology Columbia University, New York

Great work on the spike and the images! Really well done. The spikes are about as accurate as they can be given our current knowledge. The images are striking.

— Dr. Jason S. McLellan, Associate Professor, Molecular Biosciences Department at the University of Texas at Austin

Wow, that’s really neat — I love the model where it is open and you can see the RNP — new favorite SARS-CoV-2 model.

— Dr. Benjamin Neuman, Professor and Chair of Biological Sciences, Texas A&M University-Texarkana

The model was built based on assumptions derived from all available experimental data. It is a good model. Nonetheless, one should keep in mind that the original crystal structure of CTD was obtained in the absence of RNA. Binding of RNA is likely to compact the structure due to strong protein-RNA interaction.

— Prof. Dr. Tai-huang Huang, Distinguished Research Fellow, Division of Structural Biology, Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei

Structure of the virus

Coronaviral particles are pleiomorphic, which means that the virion’s shape can vary. However, most particles are oval or roughly spherical, with diameters ranging between 50 and 150 nanometers [23, 24]. The morphology of the particle in our 3D model is based on cryo-EM images of virions of SARS-CoV, a virus very closely related to the novel coronavirus SARS-CoV-2 [25, 26].

Surface proteins

Located on the virion’s surface, spike proteins form trimeric structures which fuse with the host cell, starting the infection. Each monomer of the spike glycoprotein consists of three parts: S1 at the top, which interacts with the host cell receptor, opposed by S2 and S2′, which mediate the virion’s fusion with cellular membranes.

  • Surface proteins of SARS-CoV-2

The cryo-EM structure of the SARS-CoV-2 spike glycoprotein trimeric complex (PDB entry 6VSB), published recently by Wrapp et al. from the McLellan Lab, only includes the top domains of the spike [27]. The full structure should also have a “stem”, with a transmembrane region and its endodomains located inside the particle. According to the authors of the PDB structure, when engaging a host cell receptor, the receptor-binding domain (RBD) of S1 undergoes hinge-like conformational movements that transiently hide or expose the determinants of receptor binding. These two states are referred to as the “down” conformation and the “up” conformation, with down corresponding to the receptor-inaccessible state, and up to the receptor-accessible state. Our model shows not only the “up” conformation from the PDB entry, but also intermediate positions of the RBD.

While the S-protein is heavily glycosylated, structural biologists continue to study — and debate — the glycans’ exact sites and composition [28].

Spike proteins include transmembrane domains, but are also anchored in the virion membrane by palmitic acid residues [29, 30].

The average SARS-CoV-2 particle has about 90 spike trimers [24].
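
For a sense of scale, the figures quoted so far can be turned into a rough spacing estimate. The sketch below assumes an idealized spherical particle with a 100 nm diameter (the midpoint of the 50 to 150 nm range given earlier) and spreads the 90 trimers evenly with a Fibonacci lattice. The even spacing, the chosen diameter, and the function names are illustrative assumptions; this is not the placement procedure used for the model, and real virions vary in shape.

```python
import math


def fibonacci_sphere(n, radius):
    """Return n roughly evenly spaced (x, y, z) points on a sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        z = 1.0 - (2.0 * i + 1.0) / n      # evenly spaced heights in (-1, 1)
        ring = math.sqrt(1.0 - z * z)      # radius of the circle at that height
        theta = golden_angle * i           # rotate by the golden angle each step
        points.append((radius * ring * math.cos(theta),
                       radius * ring * math.sin(theta),
                       radius * z))
    return points


def mean_nearest_neighbour(points):
    """Average distance from each point to its closest neighbour."""
    nearest = [
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ]
    return sum(nearest) / len(nearest)


N_TRIMERS = 90        # average number of spike trimers quoted in the text
DIAMETER_NM = 100.0   # assumption: midpoint of the quoted 50-150 nm range

anchors = fibonacci_sphere(N_TRIMERS, DIAMETER_NM / 2)
area_per_trimer = math.pi * DIAMETER_NM ** 2 / N_TRIMERS   # sphere area is pi * d^2
spacing = mean_nearest_neighbour(anchors)

print(f"membrane area per trimer: {area_per_trimer:.0f} nm^2")
print(f"mean nearest-neighbour spacing: {spacing:.1f} nm")
```

Under these assumptions each trimer has about 350 nm² of membrane to itself; for a smaller or larger particle in the quoted range, that figure scales with the square of the diameter.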

Matrix proteins

The membrane protein (M) facilitates viral assembly by interacting with other components of the particle, termed S, E and N. The M protein forms dimers of two spatial conformations: long and compact, with both embedded in the membrane [31].

Compact M dimers are believed to differ from long M dimers by having more amino acids integrated into the membrane’s inner layer. When these parts of the protein are pulled out, membrane phospholipids enter the free space and the membrane becomes curved [32, 33].

M protein dimers interact with each other, along with the lipid membrane taken from the host cell, to form the main layer of the viral envelope. The distance between the dimers is about 3-4 nm. M dimers interact with viral spikes. The long forms of M proteins also interact with nucleocapsid via their C-terminal domains, located inside the virion.

The outer part of the M protein is glycosylated. There are approximately 1100 M protein molecules in a virion, most of which take the long form [31].

Envelope proteins

The envelope protein (E) is a minor component of the viral particle, forming pentameric ion channels that disrupt host cell membranes during viral budding [34].

E proteins interact with M and N proteins during viral budding, as well as with the non-structural proteins 3a and 7a, which are in some cases found packed into the virion [24].

Nucleocapsid and genome structure

Nucleocapsid proteins package the viral genome RNA and play a fundamental role during virus assembly. Little is known so far about the exact structure of the nucleocapsid of coronaviruses. Our model shows one of the possibilities, but we have tried to take into account as many of the established facts as possible.

  • Cutaway of the virus’s membrane. Fragment of the nucleocapsid. Envelope protein.

The nucleocapsid protein of coronaviruses contains many regions with a high level of intrinsic disorder. Essentially, the protein consists of two globular domains, the C-terminal domain (CTD) and the N-terminal domain (NTD), together with a linker between the two domains and some flexible disordered tails [35].

The facts established by cryo-EM, biochemical, and crystallographic studies include the following:

  • CTDs form tightly connected dimers, but the dimerization of NTDs has not yet been substantiated [36];
  • Both domains can interact with the RNA;
  • A nucleocapsid helix or chain can be detected inside the virions. It is thought to be about 15 nm wide [37] and to occupy less than the particle’s entire volume, leaving the very center of the particle free [24];
  • CTD is thought to mediate oligomerization of the nucleocapsid;
  • Under the virus envelope, the nucleocapsid helix interacts with M-protein dimers via the flexible ends of the CTD domains [24];
  • Not all the RNA may be bound to the nucleocapsid.

Many CTD and NTD structures have been determined and are available in the Protein Data Bank (PDB). Some provide notions about the possible interactions between these domains.

Host proteins and non-structural proteins of the virus

Studies on SARS-CoV-2 have sometimes detected other host and virus-encoded proteins, in addition to the main structural proteins S, M, N, and E [24]. To illustrate this, we included several such structures in our model, using models already available in databases without additional modifications.

Non-structural protein 3

NSP3 is a large multidomain virus-encoded protein with RNA-binding and peptidase activities. It acts as a “hub” molecule that interacts with many viral proteins [38].

Cyclophilin A (Peptidyl-prolyl cis-trans isomerase A)

Proteins of this group, known as PPIases, accelerate protein folding. Cyclophilin A catalyzes the cis-trans isomerization of proline imidic peptide bonds in oligopeptides. Human cyclophilin A is reported to bind N proteins in coronaviruses [39].

Actin

Actin is an abundant protein in cellular cytoplasm. As such, it is frequently found in many viral particles, along with other structural proteins [38].

Heat shock protein HSP 90-alpha

Heat shock proteins are present in cellular cytoplasm in high copy numbers. HSPs promote maturation and structural maintenance of various proteins, and can therefore often be accidentally incorporated into budding viral particles [38].

Protein Kinase CK2

Coronavirus nucleoproteins are phosphorylated by host protein kinases. This interaction often leads kinases to become packaged into the virions [39].

Modeling process

Surface protein

We used molecular modeling approaches to add the transmembrane and intravirion sections, as well as the amino acid chains that connect them and form the spike’s “stem”. The stem is shown as a coil of three alpha-helices made of different monomers. We added small flexible loops in the upper part of the model, which were missing from the cryo-EM structure.

There are about 60 residues that should likely form a triple helix before the TM region starts at residue 1208.

— Dr. Jason S. McLellan, Associate Professor, Molecular Biosciences Department at the University of Texas at Austin
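
That residue count gives a quick estimate of how long the stem could be. The sketch below assumes an ideal, straight alpha-helix with the textbook rise of about 0.15 nm (1.5 Å) per residue; a real coiled coil is slightly shorter per residue, and the function and constant names are hypothetical.

```python
ALPHA_HELIX_RISE_NM = 0.15   # textbook rise per residue of an ideal alpha-helix


def helix_length_nm(n_residues, rise_nm=ALPHA_HELIX_RISE_NM):
    """Approximate axial length of a straight alpha-helix."""
    return n_residues * rise_nm


STEM_RESIDUES = 60    # "about 60 residues", from the quote above
TM_START = 1208       # first transmembrane residue, from the quote above

stem_length = helix_length_nm(STEM_RESIDUES)
helix_start = TM_START - STEM_RESIDUES   # approximate first residue of the helix

print(f"stem helix: residues ~{helix_start}-{TM_START - 1}, "
      f"about {stem_length:.0f} nm long")
```

With these inputs the helical stem comes out at roughly 9 nm, which is an order-of-magnitude guide rather than a measured length.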

We made different versions of the spikes: one with an RBD “up” and one with all RBDs “down”. In the model, most of the spikes are of the first type, in accordance with the published data.

At this point, based on cryo-EM, we just know that the majority of spikes have 1 RBD up, although the degree to which it is up seems to vary.

— Dr. Jason S. McLellan, Associate Professor, Molecular Biosciences Department at the University of Texas at Austin

Glycosylation was partly taken from the 6VSB structure and partly added using the Glyprot server. The cysteine at position 1241 of each chain was palmitoylated.
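
N-linked glycans attach to asparagine residues that sit in an N-X-S/T sequon, where X is any residue except proline, so a first pass at candidate sites can be made from sequence alone. The sketch below scans a sequence for that motif. It only illustrates the sequon rule and does not describe how Glyprot or the 6VSB authors assign glycans; the function name is hypothetical, and the peptide in the usage lines is a made-up placeholder rather than a database record.

```python
import re

# Lookahead so that overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of asparagines in N-X-S/T sequons (X != P)."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Placeholder peptide for illustration only; substitute a real sequence record.
toy_peptide = "MADSNGTITVEELKK"
print(find_sequons(toy_peptide))
```

A matching motif only marks a candidate site; whether it actually carries a glycan, and which one, has to come from experimental data.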

Matrix proteins

Modeling of the M proteins was performed de novo using the I-TASSER server. The probable dimeric forms were selected based on molecular docking simulations with truncated forms of M monomers lacking the N- and C-terminal regions. After docking, the missing parts were added to form the full structures of the proteins. The long and compact M proteins were modeled using different positions of the probable amphipathic helix. The N-terminal ends of the proteins were glycosylated at the asparagine in position 5 using Glyprot.

We know that the C-terminal makes contact with the N protein, and that seems to happen around 4-5 nanometers from the phosphates on the inner side of the membrane.

— Dr. Benjamin Neuman, Professor and Chair of Biological Sciences, Texas A&M University-Texarkana

In the model, the compact M proteins are positioned on the cylindrical part of the virion, where the membrane is less curved, and their long counterparts are clustered at the virion’s hemispherical ends.
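
The placement rule in this paragraph can be written down compactly. The sketch below assumes the particle is treated as a capsule (a cylinder capped by two hemispheres) aligned with the z axis and centred at the origin. The function name, the coordinate convention, and the 40 nm cylinder length in the usage lines are hypothetical and are not taken from the model.

```python
def m_dimer_form(z_nm, cylinder_length_nm):
    """Choose the M dimer form for a membrane position on a capsule-shaped virion.

    z_nm is the position along the long axis, measured from the particle centre.
    Positions on the cylindrical wall get compact dimers; positions on the
    hemispherical caps get long dimers, as described in the text.
    """
    on_cylinder = abs(z_nm) <= cylinder_length_nm / 2
    return "compact" if on_cylinder else "long"


# Hypothetical 40 nm cylindrical section, for illustration only.
for z in (0.0, 15.0, 35.0):
    print(z, m_dimer_form(z, cylinder_length_nm=40.0))
```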

Envelope protein

We used PDB structures available for related coronavirus E proteins as templates to model that of SARS-CoV-2, using I-TASSER 5.0, PyMOL, i3Drefine, and GROMACS.

Nucleocapsid

We used homology modeling based on structures available in the PDB to model the N- and C-terminal domains. We then ran docking simulations for the domains to identify how they might be oriented and positioned relative to one another. The orientations revealed possible sites for linkers, RNA, and N3 domains, which could interact with M-proteins in the membrane. Our model is based on the structures of the SARS-CoV CTD dimers and NTD domains available in the PDB. Using homology-based molecular modeling, de novo modeling, and molecular docking, we built a model of the full SARS-CoV nucleocapsid protein and assembled it into an elongated chain that includes the viral RNA.

In constructing the nucleocapsid assembly, we assumed that CTD dimers form the chain’s oligomerized inner core, and that the NTD domains secure the genome RNA from the exterior of the nucleocapsid. The RNA’s position in our model, inside the nucleocapsid chain, is based on the molecular docking simulations, in which the RNA oligonucleotide (13 nt) probe remained flexible. The simulations allowed us to choose positions on the N protein’s surface that would likely bind RNA fragments. The linear fragments from the simulation were subsequently joined into one RNA chain.

Host proteins and non-structural proteins of the virus

For these minor elements of the particle, we took existing models of the structures from ModBase, the PDB, and the Zhang Lab website [40].

Read more about other aspects of coronavirus biology in the section with the scientific animation of SARS-CoV-2.

  • Head of the project, Look Dev:
    Ivan Konstantinov
  • Project coordinator, scientific consultant:
    Yury Stefanov, Ph.D.
  • Literature review:
    Anna Zyrina, Ph.D., Yury Stefanov, Ph.D.
  • Molecular modeling:
    Dmitrii Shcherbinin, Ph.D., Anastasia Bakulina, Ph.D., Marina Pak
  • 3D-modeling and animation:
    Maxim Kulemza
  • Animation:
    Yury Stefanov
  • Animation supervisor:
    Sergey Ivanchuk
  • Design:
    Elizaveta Oreshkina
  • Sound:
    Bad Zu, Sounds like a plan Studio
  • Project managers:
    Ivan Konstantinov, Yury Stefanov

We would like to thank Dr. Benjamin Neuman, Professor and Chair of Biological Sciences, Texas A&M University-Texarkana; Dr. Tai-huang Huang, Distinguished Research Fellow, Division of Structural Biology, Institute of Biomedical Sciences, Academia Sinica, Nankang, Taipei; and Dr. Jason S. McLellan, Associate Professor, Molecular Biosciences Department at the University of Texas at Austin; authors of multiple research papers on the structure of coronaviruses or viral proteins, for their recommendations and helpful discussion of various aspects of the organization and genome packing of the SARS-CoV-2 virion.