Index parameters: information gathering

Author: 代号北极能 | Published: 2019-06-28 23:15

    Most USEARCH commands use a database index to enable fast searching. There are two types of index: one for finding matching seeds for the UBLAST algorithm, and another for fast calculation of common word counts for the USEARCH algorithm. Clustering uses a USEARCH-style index. Indexing parameters apply to both types of index.
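    To make the idea of "common word counts" concrete, here is a minimal Python sketch of a word index. It is a toy illustration only and is not how USEARCH implements its index; the function names (`words`, `build_index`, `common_word_counts`), the word length of 3, and the two database sequences are all assumptions made up for this example.

```python
from collections import defaultdict


def words(seq, k):
    """Return the set of distinct k-mers (words) in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(db, k):
    """Map each word to the names of the database sequences containing it."""
    index = defaultdict(set)
    for name, seq in db.items():
        for w in words(seq, k):
            index[w].add(name)
    return index


def common_word_counts(query, index, k):
    """Count, per database sequence, the distinct words shared with query."""
    counts = defaultdict(int)
    for w in words(query, k):
        # .get() avoids creating empty entries for words absent from the index
        for name in index.get(w, ()):
            counts[name] += 1
    return dict(counts)


# Hypothetical toy database and query
db = {"A": "ACGTACGT", "B": "TTTTGGGG"}
index = build_index(db, 3)
counts = common_word_counts("ACGTTT", index, 3)
print(counts)
```

    Because each query word is resolved with a single dictionary lookup, candidate database sequences can be ranked by shared words without aligning the query against every sequence.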

    During search and clustering, indexes are always accessed directly in memory rather than being retrieved from a disk file, in order to maximize speed. The amount of RAM required to store the index is approximately the same as the size of a UDB file created with the same sequences and options. The physical RAM in the computer should be larger than the index; otherwise, virtual memory paging will make execution much slower.

    Indexes are constructed in three different ways:

    (1) Loaded from a UDB file.

    (2) Built from a FASTA file.

    (3) Built dynamically during clustering. The index is initially empty, then grows as centroid sequences are added to the database.
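    The third case, an index that starts empty and grows as centroids are added, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the USEARCH clustering algorithm: the fraction of shared words stands in for a real alignment-based identity check, and the names `cluster`, `min_shared`, and the word length `K = 3` are hypothetical.

```python
from collections import defaultdict

K = 3  # word length, chosen only for this toy example


def words(seq):
    """Return the set of distinct K-mers in seq."""
    return {seq[i:i + K] for i in range(len(seq) - K + 1)}


def cluster(seqs, min_shared=0.5):
    """Greedy clustering with an index that grows as centroids are added.

    min_shared is a stand-in for an identity threshold (an assumption of
    this sketch): a sequence joins the centroid sharing the most words if
    that centroid covers at least this fraction of the sequence's words.
    """
    index = defaultdict(set)  # word -> centroid ids; initially empty
    centroids = []
    assignment = []
    for seq in seqs:
        w = words(seq)
        counts = defaultdict(int)
        for word in w:
            for cid in index.get(word, ()):
                counts[cid] += 1
        best = max(counts, key=counts.get, default=None)
        if best is not None and counts[best] / len(w) >= min_shared:
            assignment.append(best)
        else:
            # No good match: the sequence becomes a new centroid
            # and its words are added to the index.
            cid = len(centroids)
            centroids.append(seq)
            for word in w:
                index[word].add(cid)
            assignment.append(cid)
    return centroids, assignment, index


centroids, assignment, index = cluster(["ACGTACGT", "ACGTACGA", "TTTTGGGG"])
print(centroids, assignment, len(index))
```

    Only centroids are indexed, so the index size tracks the number of clusters rather than the number of input sequences.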

    Indexing options

    In the following table, "word" refers generically to the fixed-length segment of the database sequence that is indexed. It may be a k-mer or a pattern. The effective word length is the length of the k-mer or the number of 1s in the pattern.
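    As a small illustration of effective word length, the Python sketch below treats a pattern as a string of 1s and 0s, where 1 marks a position that is included in the word and 0 a position that is skipped. The helper names `effective_word_length` and `spaced_word` are hypothetical, and the example patterns are made up; an ordinary k-mer corresponds to a pattern consisting only of 1s.

```python
def effective_word_length(pattern):
    """Return the number of 1s in a pattern of '0' and '1' characters."""
    if not pattern or set(pattern) - {"0", "1"}:
        raise ValueError("pattern must be a non-empty string of 0s and 1s")
    return pattern.count("1")


def spaced_word(seq, pos, pattern):
    """Extract the word at seq[pos:], keeping only positions marked '1'."""
    window = seq[pos:pos + len(pattern)]
    if len(window) < len(pattern):
        raise ValueError("pattern extends past the end of the sequence")
    return "".join(c for c, p in zip(window, pattern) if p == "1")


print(effective_word_length("11111"))      # a plain 5-mer
print(effective_word_length("111010011"))  # a spaced pattern spanning 9 letters
print(spaced_word("ACGTACGTA", 0, "1101"))
```

    The pattern "111010011" spans nine letters of the sequence but contributes only six of them to the word, so its effective word length is 6.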
