The heterogeneous landscape and early evolution of pathogen-associated CpG dinucleotides in SARS-CoV-2

Open archive

Di Gioacchino, Andrea | Sulc, Petr | Komarova, Anastassia V | Greenbaum, Benjamin D | Monasson, Remi | Cocco, Simona

Published by CCSD; Oxford University Press (OUP)

International audience. SARS-CoV-2 infection can lead to acute respiratory syndrome, which may be due in part to dysregulated immune signalling. Here we analyze the occurrences of CpG dinucleotides, which are putative pathogen-associated molecular patterns, along the viral sequence. Through a comparative analysis with other ssRNA viruses and within the Coronaviridae family, we find that the CpG content of SARS-CoV-2, while low compared to other betacoronaviruses, fluctuates widely along its primary sequence. The CpG relative abundance and its associated CpG force parameter [1] are low for the spike protein (S) and comparable to those of circulating seasonal coronaviruses such as HKU1, whereas at the 3’ end of the viral genome they are much greater and comparable to those of SARS and MERS. In particular, the nucleocapsid protein (N), whose transcripts are relatively abundant in the cytoplasm of infected cells and present in the 3’ UTRs of all subgenomic RNAs, has high CpG content. We speculate that this dual nature of CpG content may give SARS-CoV-2 a high ability both to enter the host and to trigger pattern recognition receptors (PRRs) in different contexts. We then investigate the evolution of synonymous mutations since the outbreak of the COVID-19 pandemic. Using a new application of selective forces on dinucleotides to estimate context-driven mutational processes, we find that synonymous mutations seem to be driven both by the viral codon bias and by the high value of the CpG force in the N protein, leading to a loss in CpG content. Sequence motifs preceding these CpG-loss-associated loci match recently identified binding patterns of the zinc finger antiviral protein (ZAP).
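
As a rough illustration of the simpler of the two quantities mentioned in the abstract, the sketch below computes a CpG relative abundance as the common observed/expected ratio f(CG) / (f(C) f(G)) and scans it along a sequence in sliding windows. It is a minimal sketch under stated assumptions, not the authors' pipeline: the function names, the window size, and the step are hypothetical choices made for illustration, and the model-based CpG force of [1] is not computed here.

```python
import math
from collections import Counter


def cpg_relative_abundance(seq: str) -> float:
    """Observed/expected CpG ratio, f(CG) / (f(C) * f(G)).

    Assumption: this is the standard ratio definition; it is not the
    CpG force parameter of [1], which requires a statistical model.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    n = len(seq)
    if n < 2:
        return math.nan
    counts = Counter(seq)
    f_c = counts["C"] / n
    f_g = counts["G"] / n
    if f_c == 0 or f_g == 0:
        return math.nan  # ratio undefined without both C and G
    # Count CG dinucleotides over the n - 1 overlapping positions.
    n_cg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    f_cg = n_cg / (n - 1)
    return f_cg / (f_c * f_g)


def sliding_cpg(seq: str, window: int = 3000, step: int = 500):
    """Return (start, ratio) pairs for windows along the sequence.

    The window and step defaults are illustrative values only.
    """
    last_start = max(len(seq) - window, 0)
    return [
        (start, cpg_relative_abundance(seq[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]
```

Applied to a genome string, `sliding_cpg(genome)` gives a profile in which regions with a ratio well below 1 are CpG-depleted. Any comparison between regions such as S and N would additionally need the gene coordinates, which are not given here.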
