Parallel Position Weight Matrices Algorithms

Open archive

Giraud, Mathieu | Varré, Jean-Stéphane

Published by CCSD; Elsevier

International audience. Position Weight Matrices (PWMs) are widely used in computational biology. The basic problems, Scan and MultipleScan, aim to find all occurrences of a given PWM, or of a set of PWMs, in long sequences. Some other PWM tasks share a common NP-hard subproblem, ScoreDistribution. Existing algorithms rely on enumerating a large set of scores or words, and most are not suitable for parallelization. We propose a new algorithm, BucketScoreDistribution, that is both very efficient and suitable for parallelization, and we bound the error it induces. We implemented a GPU prototype of Scan, MultipleScan and BucketScoreDistribution with the CUDA libraries, and report speedups larger than 10× for the different problems on several Nvidia cards.
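
To make the two problems named in the abstract concrete, here is a minimal sequential sketch in Python. It is an illustration under stated assumptions, not the authors' GPU implementation: the toy matrix `PWM`, the function names, the uniform background model and the bucket width are all hypothetical. `scan` reports every window whose score reaches a threshold. `bucketed_score_distribution` is a generic dynamic program over rounded scores and is not claimed to be the paper's BucketScoreDistribution algorithm.

```python
from collections import defaultdict

# Hypothetical toy PWM of length 3: one dict of per-nucleotide scores per column.
PWM = [
    {"A": 2, "C": -1, "G": -1, "T": -2},
    {"A": -2, "C": 3, "G": -1, "T": -1},
    {"A": -1, "C": -1, "G": 2, "T": -1},
]


def score(pwm, word):
    """Sum the column scores of `word`, which must have the PWM's length."""
    return sum(col[ch] for col, ch in zip(pwm, word))


def scan(pwm, seq, threshold):
    """Return the start positions whose window scores at least `threshold`."""
    m = len(pwm)
    return [i for i in range(len(seq) - m + 1)
            if score(pwm, seq[i:i + m]) >= threshold]


def bucketed_score_distribution(pwm, background, width):
    """Map bucket index -> probability, rounding each column score to `width`.

    Assumption: positions are independent and follow `background`.
    Rounding keeps the number of distinct partial scores small, at the
    cost of an approximation error that grows with `width`.
    """
    dist = {0: 1.0}
    for col in pwm:
        nxt = defaultdict(float)
        for bucket, p in dist.items():
            for ch, q in background.items():
                nxt[bucket + round(col[ch] / width)] += p * q
        dist = nxt
    return dict(dist)


if __name__ == "__main__":
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    print(scan(PWM, "TTACGACG", 7))
    print(bucketed_score_distribution(PWM, uniform, 1))
```

Each window in `scan` is scored independently of the others, which is what makes this kind of problem a natural candidate for one-thread-per-position parallelization.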

Suggestions

By the same author

Manycore high-performance computing in bioinformatics

Open archive | Varré, Jean-Stéphane | CCSD

Mining the increasing amount of genomic data requires having very efficient tools. Increasing the efficiency can be obtained with better algorithms, but one could also take advantage of the hardware itself to reduce the applicatio...

Biomanycores, open-source parallel code for many-core bioinformatics

Open archive | Giraud, Mathieu | CCSD

International audience. Biomanycores is a collection of bioinformatics tools, designed to bridge the gap between research in OpenCL/CUDA high-performance computing on GPUs and other "manycore processors" and usual ...

Suivi de la leucémie résiduelle par séquençage haut-débit [Monitoring residual leukemia by high-throughput sequencing]

Open archive | Giraud, Mathieu | CCSD

National audience. High-throughput sequencing offers new perspectives for leukemia monitoring. We propose an algorithm that can process millions of sequences and can distinguish rearran...
