The Pfam protein families database: towards a more sustainable future

Open archive

Finn, Robert D. | Coggill, Penelope | Eberhardt, Ruth Y. | Eddy, Sean R. | Mistry, Jaina | Mitchell, Alex L. | Potter, Simon C. | Punta, Marco | Qureshi, Matloob | Sangrador-Vegas, Amaia | Salazar, Gustavo A. | Tate, John | Bateman, Alex

Published by CCSD; Oxford University Press

International audience. Over the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteome sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available, and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) have no matches to reference proteomes; we are working with UniProt to see if sequences from these entries can be incorporated into reference proteomes. Pfam-B, the automatically generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. A new tool has been introduced that improves the facility to view the relationships between families within a clan.
