Since the beginning of the coronavirus disease 2019 (COVID-19) pandemic, scientists have focussed on the immune response to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and vaccination. Understanding public antibody responses to particular antigens, meaning responses that recur across different individuals, is critical for uncovering the molecular features of recurring antibodies within the diverse antibody repertoire. It is also important for the development of effective vaccines to combat the pandemic.
Study: A large-scale systematic survey of SARS-CoV-2 antibodies reveals recurring molecular features. Image Credit: Andrii Vodolazhskyi/Shutterstock
In a new study, published on the bioRxiv * preprint server, scientists have mined information from several research publications and patents to create a dataset of approximately 8000 human antibodies to the SARS-CoV-2 spike from more than 200 donors.
Key findings
In total, the scientists collected 8048 human antibodies from 215 donors. The data were compiled from 88 research publications and 13 patents that characterized antibodies to SARS-CoV-2. About 99.4% of the antibodies were found to react with SARS-CoV-2, while the remainder reacted with SARS-CoV or seasonal coronaviruses.
Of the 7997 antibodies to SARS-CoV-2, 7923 bound to the spike (S) protein, 49 bound to the nucleocapsid (N) protein, and 25 bound to ORF8. Further, epitope information was available for most SARS-CoV-2 S antibodies. Where available, researchers also collected information on other features, such as germline gene usage, sequence, structure, and the bait used for isolation.
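The reported counts can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this article; the variable names are illustrative and not taken from the study.

```python
# Counts as quoted in the article; variable names are illustrative only.
total_antibodies = 8048
sars_cov_2_antibodies = 7997
by_target = {"S": 7923, "N": 49, "ORF8": 25}

# The per-protein counts should add up to the SARS-CoV-2 total.
targets_sum = sum(by_target.values())

# Share of the dataset that reacts with SARS-CoV-2, in percent.
share = 100 * sars_cov_2_antibodies / total_antibodies

# Antibodies that reacted with SARS-CoV or seasonal coronaviruses instead.
remainder = total_antibodies - sars_cov_2_antibodies

print(f"Sum over targets: {targets_sum}")
print(f"SARS-CoV-2 share: {share:.1f}%")
print(f"Remaining antibodies: {remainder}")
```

The three per-protein counts sum to 7997, which is consistent with the stated share of about 99.4% and leaves 51 antibodies in the remaining group.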
The vast dataset enabled scientists to study patterns of germline gene usage in antibodies targeting different domains of the S protein, such as the receptor-binding domain (RBD), the N-terminal domain (NTD), and the S2 subunit. The nature of the public antibody response was observed to vary across domains. The current study successfully demonstrated the diversity of sequence features that could constitute a public antibody response against a single antigen.
Researchers highlighted that a public antibody response may not always involve a defined pair of IGHV/IGK(L)V genes. This is especially true when either IGHV or IGK(L)V gene-encoded residues contribute only marginally to the paratope. For example, in IGHV1-69 antibodies that target the highly conserved stem region of influenza hemagglutinin, the paratope is attributed entirely to the heavy chain.
Structural analysis may be needed to fully confirm this, but IGHV3-30/IGHD1-26 antibodies to S2 in the current study could represent a similar IGK(L)V-independent public antibody response. The types of public antibody responses to the SARS-CoV-2 S protein are diverse. This implies that using the conventional strict definition of a public clonotype to study public antibody responses may not be adequate.
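To illustrate why a strict definition can miss a light-chain-independent response, here is a minimal sketch. It assumes the strict definition requires the same IGHV and IGK(L)V gene pair in at least two donors (CDR H3 similarity is left out for brevity), and it compares that with grouping by IGHV and IGHD only. The records are hypothetical toy data, and `public_groups` is an illustrative helper, not a function from the study.

```python
from collections import defaultdict

# Hypothetical toy records: (donor, IGHV gene, IGHD gene, light-chain V gene).
records = [
    ("donor1", "IGHV3-30", "IGHD1-26", "IGKV3-20"),
    ("donor2", "IGHV3-30", "IGHD1-26", "IGLV1-40"),
    ("donor3", "IGHV3-30", "IGHD1-26", "IGKV1-39"),
    ("donor1", "IGHV3-53", "IGHD3-10", "IGKV1-9"),
    ("donor2", "IGHV3-53", "IGHD2-2", "IGKV1-9"),
]

def public_groups(records, key):
    """Return the groups (keyed by `key`) that occur in at least two donors."""
    donors = defaultdict(set)
    for rec in records:
        donors[key(rec)].add(rec[0])
    return {k: sorted(d) for k, d in donors.items() if len(d) >= 2}

# Assumed strict definition: same heavy- and light-chain V gene pair.
strict = public_groups(records, key=lambda r: (r[1], r[3]))

# Relaxed, light-chain-independent grouping: same IGHV and IGHD genes.
heavy_only = public_groups(records, key=lambda r: (r[1], r[2]))

print("Strict (IGHV + IGK(L)V):", strict)
print("Heavy chain only (IGHV + IGHD):", heavy_only)
```

In this toy data, the strict pairing finds only the IGHV3-53/IGKV1-9 group, while the heavy-chain-only grouping finds the IGHV3-30/IGHD1-26 group shared by three donors with three different light chains.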
Research has shown that the public antibody response to different antigens can have very different sequence features. For example, IGHV6-1 and IGHD3-9 are indicators of a public antibody response to influenza virus, while IGHV3-23 is often used in antibodies to dengue and Zika viruses. However, these germline genes are rarely used in the antibody response to SARS-CoV-2.
The amino acid sequence determines the structure of an antibody, which in turn determines its binding capacity. For this reason, the antigen specificity of an antibody can theoretically be identified from its sequence. The current study provides a proof of concept by training a deep learning model to differentiate between SARS-CoV-2 S antibodies and influenza hemagglutinin (HA) antibodies, based mainly on primary sequence information.
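The article does not describe how the model represents sequences. As a general illustration only, one common way to turn an amino acid sequence into numerical input for a model is one-hot encoding. The sketch below is an assumption about a typical preprocessing step, not the study's method; the function name `one_hot`, the fixed length, and the example sequence are all hypothetical.

```python
# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}

def one_hot(seq, length):
    """Encode `seq` as a length x 20 matrix of 0s and 1s.

    Sequences shorter than `length` are zero-padded, longer ones are
    truncated, and non-standard residues are left as all-zero rows.
    """
    matrix = [[0] * len(AMINO_ACIDS) for _ in range(length)]
    for pos, aa in enumerate(seq[:length]):
        if aa in INDEX:
            matrix[pos][INDEX[aa]] = 1
    return matrix

# Hypothetical short sequence fragment, padded to 8 positions.
encoded = one_hot("ARDLDY", length=8)
print(len(encoded), "positions x", len(encoded[0]), "amino acids")
```

Each row has a single 1 marking the residue at that position, so sequences of different lengths can be stacked into arrays of the same shape before being passed to a classifier.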
Technological innovations and next steps
Technological advancements have significantly accelerated the speed of antibody discovery and characterization. The development of a single-cell high-throughput screen using the Berkeley Lights Beacon optofluidics device and advances in paired B-cell receptor sequencing have been instrumental in this regard. Researchers are optimistic that as more sequence information on antibodies to different antigens accumulates, they may be able to construct a generalized sequence-based model that accurately predicts the antigen specificity of any antibody.
Conclusion
The amount of publicly available information on SARS-CoV-2 antibodies has provided novel and invaluable insights. Such insights have not been readily available for other pathogens. One reason for this wide availability of SARS-CoV-2 antibody data is that the severity of the pandemic has led scientists from many fields and around the globe to work intensively on SARS-CoV-2. This parallel effort by many different research groups has advanced our collective knowledge at an unprecedented speed and scale.
Scientists hope that the knowledge of the molecular features of the antibody response to SARS-CoV-2 will increase as more antibodies are isolated and characterized. This will help in answering some of the fundamental questions about antigenicity and immunogenicity. It will also help us understand how the human immune repertoire has evolved to respond to particular viral pathogens that have coexisted with humans for several hundred years.
*Important notice
bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice or health-related behavior, or treated as established information.