Parallel Position Weight Matrices Algorithms

Open archive

Giraud, Mathieu | Varré, Jean-Stéphane

Published by CCSD

International audience. Position Weight Matrices (PWMs) are broadly used in computational biology. The basic problem, Scan, aims to find the occurrences of a given PWM in large sequences. A number of other PWM tasks share a common subproblem, ScoreDistribution, which has been shown to be NP-hard. The existing algorithms rely on the enumeration of a large set of scores or words, and most of them are not suitable for parallelization. We propose a new algorithm, BucketScoreDistribution, that is both very efficient and suitable for parallelization. We bound the error induced by this algorithm. We implemented a GPU prototype for Scan and BucketScoreDistribution with the CUDA libraries, and report speedups of 21x and 77x for the different problems on an Nvidia GTX 280.
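
To make the two problems named in the abstract concrete, here is a minimal Python sketch. It is not the authors' BucketScoreDistribution algorithm or their CUDA prototype. It assumes a PWM stored as a list of per-column dictionaries with integer scores, and the function names `scan` and `score_distribution`, the toy matrix and the uniform background are all illustrative assumptions. The distribution is computed with a plain textbook dynamic program that enumerates every reachable score, which is the kind of enumeration the abstract describes as costly.

```python
from collections import defaultdict


def scan(pwm, sequence, threshold):
    """Return (position, score) for each window scoring at least threshold.

    pwm: list of dicts, one per motif column, mapping a letter to its score
    (hypothetical layout chosen for this sketch).
    """
    m = len(pwm)
    hits = []
    for i in range(len(sequence) - m + 1):
        # Score of the window starting at i: sum of per-column scores.
        score = sum(pwm[j][sequence[i + j]] for j in range(m))
        if score >= threshold:
            hits.append((i, score))
    return hits


def score_distribution(pwm, background):
    """Probability of each total score for a random word of length len(pwm).

    Letters are drawn independently from `background`. Assumes integer
    scores, so that equal totals merge into the same dictionary key.
    """
    dist = {0: 1.0}
    for column in pwm:
        nxt = defaultdict(float)
        for partial, prob in dist.items():
            for letter, weight in column.items():
                nxt[partial + weight] += prob * background[letter]
        dist = nxt
    return dict(dist)


# Toy 2-column matrix and uniform background (illustrative values only).
toy_pwm = [
    {"A": 2, "C": -1, "G": -1, "T": 0},
    {"A": -1, "C": 3, "G": 0, "T": -1},
]
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

hits = scan(toy_pwm, "ACGAC", threshold=4)
dist = score_distribution(toy_pwm, uniform)
```

With this toy matrix, only the word AC reaches the maximal score of 5, so `hits` contains the two positions where AC occurs and `dist[5]` is the probability of drawing AC under the uniform background.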

Suggestions

By the same author

Manycore high-performance computing in bioinformatics

Open archive | Varré, Jean-Stéphane | CCSD

Mining the increasing amount of genomic data requires very efficient tools. Efficiency can be improved with better algorithms, but one could also take advantage of the hardware itself to reduce the applicatio...

Biomanycores, open-source parallel code for many-core bioinformatics

Open archive | Giraud, Mathieu | CCSD

International audience. Biomanycores is a collection of bioinformatics tools, designed to bridge the gap between research in OpenCL/CUDA high-performance computing on GPU and other "manycore processors" and usual ...

Suivi de la leucémie résiduelle par séquençage haut-débit

Open archive | Giraud, Mathieu | CCSD

National audience. High-throughput sequencing offers new perspectives for monitoring leukemia. We propose an algorithm that can process millions of sequences and is able to distinguish the rearran...
