Search and clustering orders of magnitude faster than BLAST

Abstract:

MOTIVATION: Biological sequence data is accumulating rapidly, motivating the development of improved high-throughput methods for sequence classification.

RESULTS: UBLAST and USEARCH are new algorithms enabling sensitive local and global search of large sequence databases at exceptionally high speeds. They are often orders of magnitude faster than BLAST in practical applications, though sensitivity to distant protein relationships is lower. UCLUST is a new clustering method that exploits USEARCH to assign sequences to clusters. UCLUST offers several advantages over the widely used program CD-HIT, including higher speed, lower memory use, improved sensitivity, clustering at lower identities and classification of much larger datasets.

AVAILABILITY: Binaries are available at no charge for non-commercial use at http://www.drive5.com/usearch.
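The abstract says that UCLUST uses a search step to assign sequences to clusters but does not spell out the procedure. The sketch below illustrates one generic greedy, centroid-based scheme of that general kind: each sequence is compared against the current cluster representatives and either joins the best match above a threshold or founds a new cluster. It is an illustration under stated assumptions, not the author's implementation. The names `kmer_set`, `word_similarity` and `greedy_cluster` are hypothetical, and the k-mer Jaccard score is a crude stand-in for the alignment-based identity a real search tool would compute.

```python
from typing import Callable, Dict, List, Set


def kmer_set(seq: str, k: int = 3) -> Set[str]:
    """Return the set of overlapping k-mers (words) in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def word_similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of k-mer sets.

    Assumption: a simple placeholder score, NOT the identity measure
    used by USEARCH or UCLUST.
    """
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def greedy_cluster(
    seqs: List[str],
    threshold: float,
    similarity: Callable[[str, str], float] = word_similarity,
) -> Dict[str, List[str]]:
    """Assign each sequence to a centroid scoring >= threshold, else start a new cluster."""
    clusters: Dict[str, List[str]] = {}
    for seq in seqs:
        best, best_score = None, threshold
        # Search step: scan existing centroids for the highest-scoring hit.
        for centroid in clusters:
            score = similarity(seq, centroid)
            if score >= best_score:
                best, best_score = centroid, score
        if best is None:
            clusters[seq] = [seq]        # no hit: seq becomes a new centroid
        else:
            clusters[best].append(seq)   # hit: seq joins that cluster
    return clusters


example = ["ACGTACGTAC", "ACGTACGTAA", "TTTTGGGGCC"]
result = greedy_cluster(example, threshold=0.5)
```

In this toy run the first two sequences share most of their 3-mers and end up in one cluster, while the third founds its own. The linear scan over centroids is the part a fast search method would replace, and the outcome depends on input order, as with any greedy scheme of this kind.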

SEEK ID: https://fairdomhub.org/publications/264

PubMed ID: 20709691

Projects: GenoSysFat

Publication type: Not specified

Journal: Bioinformatics

Citation: Bioinformatics. 2010 Oct 1;26(19):2460-1. doi: 10.1093/bioinformatics/btq461. Epub 2010 Aug 12.

Date Published: 12th Aug 2010

Registered Mode: Not specified

Author: R. C. Edgar

Activity

Views: 5377

Created: 8th Jul 2016 at 07:42

Last updated: 8th Dec 2022 at 17:26

