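With the rise of third-generation sequencing, the need to align long reads keeps growing. Until now I have been using GMAP, but it has a weakness: a sequence carrying a polyA stretch may fail to align to the reference, yet it aligns once the polyA is removed. In other words, when the start of a sequence cannot be aligned, GMAP may report the whole sequence as unaligned. GMAP also sometimes fails to find all possible hits. When using GMAP, it is worth randomly picking some of the sequences that did not map and checking whether they really cannot be mapped.

Today I came across a new long-read aligner, alfalfa. It claims to handle short reads as well. In the paper's benchmarks it, unsurprisingly, comes out ahead of BWA-MEM, BWA-SW, Bowtie 2 and CUSHAW3. Below is a brief introduction to installing and using it.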
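The spot check on unmapped sequences suggested above can be sketched as follows. This is only a minimal sketch under assumptions: it assumes the aligner output is in SAM format (where bit 0x4 of the FLAG column marks a read as unmapped), that GNU `shuf` is available, and the function name `sample_unmapped` is made up for illustration.

```shell
# Print up to N randomly chosen names of unmapped reads from a SAM file.
# Assumption: SAM input; bit 0x4 of the FLAG field (column 2) means "unmapped".
sample_unmapped() {
    sam=$1
    n=${2:-20}   # number of read names to sample (default 20)
    # Skip header lines (starting with @), keep reads whose FLAG has bit 0x4 set,
    # de-duplicate the names, then draw a random subset.
    awk -F '\t' '!/^@/ && int($2 / 4) % 2 == 1 { print $1 }' "$sam" \
        | sort -u \
        | shuf -n "$n"
}
```

For example, `sample_unmapped aligned.sam 20` (with a hypothetical file `aligned.sam`) lists twenty read names that can then be re-checked by hand, for instance after trimming the polyA.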
Installation
```
$ git clone git://github.com/readmapping/alfalfa.git
$ cd alfalfa
$ make
```

That is all it takes. After the build, an alfalfa executable appears in the current directory; you can copy it to a directory on your PATH.

Usage
@PG ID:alfalfa VN:0.8.1
Usage: alfalfa <command> [<subcommand>] [option...]
Command should be index, align or evaluate
Subcommand is only required for the evaluate command
commands:
index is used to construct the data structures for indexing a given reference
genome.
align is used for mapping and aligning a read set onto a reference genome.
evaluate is used for evaluating the accuracy of simulated reads and summarizing
statistics from the SAM-formatted alignments reported by a read mapper.

Usage: alfalfa index [option...]
options
-r/--reference (file).
Specifies the location of a file that contains the reference
genome in multi-fasta format.
-s/--sparseness (int, 12).
Specifies the sparseness of the index structure as a way to
control part of the speed-memory trade-off.
-p/--prefix (string, filename passed to the -r option).
Specifies the prefix that will be used to name all generated
index files. The same prefix has to be passed to the -i option
of the align command to load the index structure when mapping
reads.
--no-child.
By default, a sparse child array is constructed and stored in an
index file with extension .child. The construction of this
sparse child array is skipped when the --no-child option is set.
This data structure speeds up seed-finding at the cost of (4/s)
bytes per base in the reference genome. As the data structure
provides a major speed-up, it is advised to have it constructed.
--suflink.
Suffix link support is disabled by default. Suffix link support
is enabled when the --suflink option is set, resulting in an
index file with extension .isa to be generated. This data
structure speeds up seed-finding at the cost of (4/s) bytes per
base. It is only useful when sparseness is less than four and
minimum seed length is very low (less than 10), because it
conflicts with skipping suffixes in matching the read. In
practice, this is rarely the case.
--no-kmer.
By default, a 10-mer lookup table is constructed that contains
the suffix array interval positions to depth 10 in the virtual
suffix tree. It is stored in
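
To make the index options above more concrete, here is a minimal sketch that uses only what the help text states: the -r, -s and -p options and the (4/s) bytes-per-base cost of the optional structures. The file name genome.fa, the prefix genome_s12, the 3,000,000,000-base genome size and both helper function names are assumptions for illustration; the script prints the index command instead of running it.

```shell
# Estimate the extra memory (in bytes) of one structure that costs (4/s) bytes
# per base, such as the sparse child array, for a reference of a given length.
aux_index_bytes() {
    bases=$1
    sparseness=$2
    awk -v n="$bases" -v s="$sparseness" 'BEGIN { printf "%.0f\n", 4 * n / s }'
}

# Assemble an index command from the options documented in the help text.
# Defaults follow the help text: sparseness 12, prefix = reference file name.
build_index_cmd() {
    ref=$1
    sparseness=${2:-12}
    prefix=${3:-$ref}
    printf 'alfalfa index -r %s -s %s -p %s\n' "$ref" "$sparseness" "$prefix"
}

# Hypothetical inputs: a 3,000,000,000-base reference stored in genome.fa.
aux_index_bytes 3000000000 12            # prints 1000000000 (about 1 GB)
build_index_cmd genome.fa 12 genome_s12  # prints the command line to run
```

According to the help text, whatever prefix is chosen here has to be passed again to the -i option of the align command.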
