Improving sequence-based modeling of protein families using secondary structure quality assessment

Open archive

Malbranke, Cyril | Bikard, David | Cocco, Simona | Monasson, Rémi

Published by CCSD

Motivation: Modeling the sequence distribution of protein families from homologous sequence data has recently received considerable attention, in particular for structure and function prediction, as well as for protein design. Notably, Direct Coupling Analysis, a method to infer effective pairwise interactions between residues, was shown to capture important structural constraints and to successfully generate functional protein sequences. Building on this and other graphical models, we introduce a new framework to assess the quality of the secondary structures of the generated sequences with respect to reference structures for the family.

Results: We introduce two scoring functions, called Dot Product and Pattern Matching, that characterize the likelihood that the secondary structure of a protein sequence matches a reference structure. We test these scores on published experimental protein mutagenesis and design datasets, and show improvement in the detection of non-functional sequences. We also show that using these scores helps reject non-functional sequences generated by graphical models (Restricted Boltzmann Machines) learned from homologous sequence alignments.
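The abstract does not define the two scores, so the following is only a minimal sketch of what a dot-product style score could look like, not the authors' definition. It assumes a three-state alphabet (H, E, C), a per-residue probability table from some secondary structure predictor, and a reference structure given as a string; the names `CLASSES`, `one_hot` and `dot_product_score` are hypothetical.

```python
from typing import List, Sequence

# Assumed three-state secondary structure alphabet: helix, strand, coil.
CLASSES = "HEC"


def one_hot(reference: str) -> List[List[float]]:
    """Encode a reference structure string as one-hot rows over CLASSES."""
    for state in reference:
        if state not in CLASSES:
            raise ValueError(f"unknown secondary structure state: {state!r}")
    return [[1.0 if state == k else 0.0 for k in CLASSES] for state in reference]


def dot_product_score(pred: Sequence[Sequence[float]], reference: str) -> float:
    """Average probability that the prediction assigns to the reference state.

    `pred[i]` holds the predicted probabilities of H, E, C at position i.
    This is an illustrative choice of normalization, not the published one.
    """
    if not reference or len(pred) != len(reference):
        raise ValueError("prediction and reference must have equal, non-zero length")
    ref = one_hot(reference)
    # Dot product of the flattened probability table with the one-hot reference.
    total = sum(p * r for row_p, row_r in zip(pred, ref) for p, r in zip(row_p, row_r))
    return total / len(reference)


if __name__ == "__main__":
    predicted = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1]]
    print(dot_product_score(predicted, "HE"))
```

Under these assumptions, a sequence whose predicted structure agrees with the reference at every position scores close to 1, and a threshold on the score could be used to filter generated sequences.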

Suggestions

By the same author

Computational design of novel Cas9 PAM-interacting domains using evolution-based modelling and structural quality assessment

Open archive | Malbranke, Cyril | CCSD

International audience. We present here an approach to protein design that combines (i) scarce functional information such as experimental data, (ii) evolutionary information learned from natural sequence variants ...

Parameters and determinants of responses to selection in antibody libraries

Open archive | Schulz, Steven | CCSD

International audience. The sequences of antibodies from a given repertoire are highly diverse at few sites located on the surface of a genome-encoded larger scaffold. The scaffold is often considered to play a less...

Protein and RNA Structure Prediction by Integration of Co-Evolutionary Information into Molecular Simulation

Open archive | de Leonardis, Eleonora | CCSD

59th Annual Meeting of the Biophysical Society, Baltimore, MD, February 7-11, 2015. International audience. No abstract.
