megablast parameters

2021/07/02

megablast 2.2.25 arguments:

./megablast --help

  • -d Database [String] default = nr
  • -i Query file [File In]
  • -e Expectation value (E) [Real] default = 10.0
  • -m Alignment view options:
    (1)0 = pairwise,
MEGABLAST 2.2.25 [Feb-01-2011]


Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), 
"A greedy algorithm for aligning DNA sequences", 
J Comput Biol 2000; 7(1-2):203-14.

Database: /home/user/database/hg19.fa
           93 sequences; 3,137,161,264 total letters

Searching..................................................done

Query= seqname
         (32 letters)



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

5                                                                      64   3e-09

>5 
          Length = 180915260

 Score = 63.9 bits (32), Expect = 3e-09
 Identities = 32/32 (100%)
 Strand = Plus / Plus

                                              
Query: 1      aaaataatgcatttgaaatagagatctagcaa 32
              ||||||||||||||||||||||||||||||||
Sbjct: 233526 aaaataatgcatttgaaatagagatctagcaa 233557


  Database: /home/user/database/hg19.fa
    Posted date:  Jun 27, 2018  11:28 AM
  Number of letters in database: 3,137,161,264
  Number of sequences in database:  93
  
Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 


Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 0, Extension:  3.5
Number of Sequences: 93
Number of Hits to DB: 259,093
Number of extensions: 1
Number of successful extensions: 1
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 32
Length of database: 3,137,161,264
Length adjustment: 18
Effective length of query: 14
Effective length of database: 3,137,159,590
Effective search space: 43920234260
Effective search space used: 43920234260
X1: 11 (21.8 bits)
X2: 20 (39.6 bits)
X3: 51 (101.1 bits)
S1: 16 (32.2 bits)
S2: 16 (32.2 bits)
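
The statistics at the end of this report can be checked by hand. The sketch below assumes the standard Karlin-Altschul conversions (bit score = (Lambda * S - ln K) / ln 2, E = search space * 2^(-bit score)) and assumes the length adjustment is subtracted once from the query and once per database sequence. Lambda is taken as 1.374, the unrounded value shown in the XML output (-m 7); the text report prints it as 1.37. The variable names are illustrative only.

```python
import math

# Values copied from the report above.
LAMBDA = 1.374            # printed as 1.37 in the text report
K = 0.711
raw_score = 32            # "Score = 63.9 bits (32)"
query_len = 32
db_len = 3_137_161_264
num_seqs = 93
length_adjustment = 18

# Effective lengths: the adjustment is removed once from the query
# and once for every sequence in the database.
eff_query_len = query_len - length_adjustment
eff_db_len = db_len - num_seqs * length_adjustment
search_space = eff_query_len * eff_db_len

# Raw score -> bit score -> E-value.
bit_score = (LAMBDA * raw_score - math.log(K)) / math.log(2)
e_value = search_space * 2 ** (-bit_score)

print(eff_query_len, eff_db_len, search_space)
print(round(bit_score, 1), f"{e_value:.1e}")
```

Under these assumptions the numbers agree with the report: effective query length 14, effective database length 3,137,159,590, effective search space 43920234260, and a bit score of 63.9. The E-value comes out at about 2.5e-09, which the text report rounds to 3e-09.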

(2)1 = query-anchored showing identities,

MEGABLAST 2.2.25 [Feb-01-2011]


Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), 
"A greedy algorithm for aligning DNA sequences", 
J Comput Biol 2000; 7(1-2):203-14.

Database: /home/user/database/hg19.fa
           93 sequences; 3,137,161,264 total letters

Searching..................................................done

Query= seqname
         (32 letters)



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

5                                                                      64   3e-09

1_0 1      aaaataatgcatttgaaatagagatctagcaa 32
5   233526 ................................ 233557
  Database: /home/user/database/hg19.fa
    Posted date:  Jun 27, 2018  11:28 AM
  Number of letters in database: 3,137,161,264
  Number of sequences in database:  93
  
Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 


Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 0, Extension:  3.5
Number of Sequences: 93
Number of Hits to DB: 259,093
Number of extensions: 1
Number of successful extensions: 1
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 32
Length of database: 3,137,161,264
Length adjustment: 18
Effective length of query: 14
Effective length of database: 3,137,159,590
Effective search space: 43920234260
Effective search space used: 43920234260
X1: 11 (21.8 bits)
X2: 20 (39.6 bits)
X3: 51 (101.1 bits)
S1: 16 (32.2 bits)
S2: 16 (32.2 bits)

(3)2 = query-anchored no identities,

MEGABLAST 2.2.25 [Feb-01-2011]


Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), 
"A greedy algorithm for aligning DNA sequences", 
J Comput Biol 2000; 7(1-2):203-14.

Database: /home/user/database/hg19.fa
           93 sequences; 3,137,161,264 total letters

Searching..................................................done

Query= seqname
         (32 letters)



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

5                                                                      64   3e-09

1_0 1      aaaataatgcatttgaaatagagatctagcaa 32
5   233526 aaaataatgcatttgaaatagagatctagcaa 233557
  Database: /home/user/database/hg19.fa
    Posted date:  Jun 27, 2018  11:28 AM
  Number of letters in database: 3,137,161,264
  Number of sequences in database:  93
  
Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 


Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 0, Extension:  3.5
Number of Sequences: 93
Number of Hits to DB: 259,093
Number of extensions: 1
Number of successful extensions: 1
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 32
Length of database: 3,137,161,264
Length adjustment: 18
Effective length of query: 14
Effective length of database: 3,137,159,590
Effective search space: 43920234260
Effective search space used: 43920234260
X1: 11 (21.8 bits)
X2: 20 (39.6 bits)
X3: 51 (101.1 bits)
S1: 16 (32.2 bits)
S2: 16 (32.2 bits)

(4)3 = flat query-anchored, show identities,

MEGABLAST 2.2.25 [Feb-01-2011]


Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), 
"A greedy algorithm for aligning DNA sequences", 
J Comput Biol 2000; 7(1-2):203-14.

Database: /home/user/database/hg19.fa 
           93 sequences; 3,137,161,264 total letters

Searching..................................................done

Query= seqname
         (32 letters)



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

5                                                                      64   3e-09

1_0 1      aaaataatgcatttgaaatagagatctagcaa 32
5   233526 ................................ 233557
  Database: /home/user/database/hg19.fa
    Posted date:  Jun 27, 2018  11:28 AM
  Number of letters in database: 3,137,161,264
  Number of sequences in database:  93
  
Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 


Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 0, Extension:  3.5
Number of Sequences: 93
Number of Hits to DB: 259,093
Number of extensions: 1
Number of successful extensions: 1
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 32
Length of database: 3,137,161,264
Length adjustment: 18
Effective length of query: 14
Effective length of database: 3,137,159,590
Effective search space: 43920234260
Effective search space used: 43920234260
X1: 11 (21.8 bits)
X2: 20 (39.6 bits)
X3: 51 (101.1 bits)
S1: 16 (32.2 bits)
S2: 16 (32.2 bits)

(5)4 = flat query-anchored, no identities,

MEGABLAST 2.2.25 [Feb-01-2011]


Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), 
"A greedy algorithm for aligning DNA sequences", 
J Comput Biol 2000; 7(1-2):203-14.

Database: /home/user/database/hg19.fa
           93 sequences; 3,137,161,264 total letters

Searching..................................................done

Query= seqname
         (32 letters)



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

5                                                                      64   3e-09

1_0 1      aaaataatgcatttgaaatagagatctagcaa 32
5   233526 aaaataatgcatttgaaatagagatctagcaa 233557
  Database: /home/user/database/hg19.fa
    Posted date:  Jun 27, 2018  11:28 AM
  Number of letters in database: 3,137,161,264
  Number of sequences in database:  93
  
Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 


Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 0, Extension:  3.5
Number of Sequences: 93
Number of Hits to DB: 259,093
Number of extensions: 1
Number of successful extensions: 1
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 32
Length of database: 3,137,161,264
Length adjustment: 18
Effective length of query: 14
Effective length of database: 3,137,159,590
Effective search space: 43920234260
Effective search space used: 43920234260
X1: 11 (21.8 bits)
X2: 20 (39.6 bits)
X3: 51 (101.1 bits)
S1: 16 (32.2 bits)
S2: 16 (32.2 bits)

(6)5 = query-anchored no identities and blunt ends,

MEGABLAST 2.2.25 [Feb-01-2011]


Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), 
"A greedy algorithm for aligning DNA sequences", 
J Comput Biol 2000; 7(1-2):203-14.

Database: /home/user/database/hg19.fa
           93 sequences; 3,137,161,264 total letters

Searching..................................................done

Query= seqname
         (32 letters)



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

5                                                                      64   3e-09

1_0 1      aaaataatgcatttgaaatagagatctagcaa 32
5   233526 aaaataatgcatttgaaatagagatctagcaa 233557
  Database: /home/user/database/hg19.fa
    Posted date:  Jun 27, 2018  11:28 AM
  Number of letters in database: 3,137,161,264
  Number of sequences in database:  93
  
Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 


Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 0, Extension:  3.5
Number of Sequences: 93
Number of Hits to DB: 259,093
Number of extensions: 1
Number of successful extensions: 1
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 32
Length of database: 3,137,161,264
Length adjustment: 18
Effective length of query: 14
Effective length of database: 3,137,159,590
Effective search space: 43920234260
Effective search space used: 43920234260
X1: 11 (21.8 bits)
X2: 20 (39.6 bits)
X3: 51 (101.1 bits)
S1: 16 (32.2 bits)
S2: 16 (32.2 bits)

(7)6 = flat query-anchored, no identities and blunt ends,

MEGABLAST 2.2.25 [Feb-01-2011]


Reference: Zheng Zhang, Scott Schwartz, Lukas Wagner, and Webb Miller (2000), 
"A greedy algorithm for aligning DNA sequences", 
J Comput Biol 2000; 7(1-2):203-14.

Database: /home/user/database/hg19.fa 
           93 sequences; 3,137,161,264 total letters

Searching..................................................done

Query= seqname
         (32 letters)



                                                                 Score    E
Sequences producing significant alignments:                      (bits) Value

5                                                                      64   3e-09

1_0 1      aaaataatgcatttgaaatagagatctagcaa 32
5   233526 aaaataatgcatttgaaatagagatctagcaa 233557
  Database: /home/user/database/hg19.fa
    Posted date:  Jun 27, 2018  11:28 AM
  Number of letters in database: 3,137,161,264
  Number of sequences in database:  93
  
Lambda     K      H
    1.37    0.711     1.31 

Gapped
Lambda     K      H
    1.37    0.711     1.31 


Matrix: blastn matrix:1 -3
Gap Penalties: Existence: 0, Extension:  3.5
Number of Sequences: 93
Number of Hits to DB: 259,093
Number of extensions: 1
Number of successful extensions: 1
Number of sequences better than 10.0: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 32
Length of database: 3,137,161,264
Length adjustment: 18
Effective length of query: 14
Effective length of database: 3,137,159,590
Effective search space: 43920234260
Effective search space used: 43920234260
X1: 11 (21.8 bits)
X2: 20 (39.6 bits)
X3: 51 (101.1 bits)
S1: 16 (32.2 bits)
S2: 16 (32.2 bits)

(8)7 = XML Blast output,

<?xml version="1.0"?>
<!DOCTYPE BlastOutput PUBLIC "-//NCBI//NCBI BlastOutput/EN" "http://www.ncbi.nlm.nih.gov/dtd/NCBI_BlastOutput.dtd">
<BlastOutput>
  <BlastOutput_program>blastn</BlastOutput_program>
  <BlastOutput_version>blastn 2.2.25 [Feb-01-2011]</BlastOutput_version>
  <BlastOutput_reference>~Reference: Altschul, Stephen F., Thomas L. Madden, Alejandro A. Schaffer, ~Jinghui Zhang, Zheng Zhang, Webb Miller, and David J. Lipman (1997), ~&quot;Gapped BLAST and PSI-BLAST: a new generation of protein database search~programs&quot;,  Nucleic Acids Res. 25:3389-3402.</BlastOutput_reference>
  <BlastOutput_db>/home/user/database/hg19.fa</BlastOutput_db>
  <BlastOutput_query-ID>lcl|1_0</BlastOutput_query-ID>
  <BlastOutput_query-def>seqname</BlastOutput_query-def>
  <BlastOutput_query-len>32</BlastOutput_query-len>
  <BlastOutput_param>
    <Parameters>
      <Parameters_expect>10</Parameters_expect>
      <Parameters_sc-match>1</Parameters_sc-match>
      <Parameters_sc-mismatch>-3</Parameters_sc-mismatch>
      <Parameters_gap-open>0</Parameters_gap-open>
      <Parameters_gap-extend>0</Parameters_gap-extend>
      <Parameters_filter>F</Parameters_filter>
    </Parameters>
  </BlastOutput_param>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_iter-num>1</Iteration_iter-num>
      <Iteration_query-ID>lcl|1_0</Iteration_query-ID>
      <Iteration_query-def>seqname</Iteration_query-def>
      <Iteration_query-len>32</Iteration_query-len>
      <Iteration_hits>
        <Hit>
          <Hit_num>1</Hit_num>
          <Hit_id>lcl|5</Hit_id>
          <Hit_def>No definition line found</Hit_def>
          <Hit_accession>5</Hit_accession>
          <Hit_len>180915260</Hit_len>
          <Hit_hsps>
            <Hsp>
              <Hsp_num>1</Hsp_num>
              <Hsp_bit-score>63.9245</Hsp_bit-score>
              <Hsp_score>32</Hsp_score>
              <Hsp_evalue>2.50885e-09</Hsp_evalue>
              <Hsp_query-from>1</Hsp_query-from>
              <Hsp_query-to>32</Hsp_query-to>
              <Hsp_hit-from>233526</Hsp_hit-from>
              <Hsp_hit-to>233557</Hsp_hit-to>
              <Hsp_query-frame>1</Hsp_query-frame>
              <Hsp_hit-frame>1</Hsp_hit-frame>
              <Hsp_identity>32</Hsp_identity>
              <Hsp_positive>32</Hsp_positive>
              <Hsp_align-len>32</Hsp_align-len>
              <Hsp_qseq>AAAATAATGCATTTGAAATAGAGATCTAGCAA</Hsp_qseq>
              <Hsp_hseq>AAAATAATGCATTTGAAATAGAGATCTAGCAA</Hsp_hseq>
              <Hsp_midline>||||||||||||||||||||||||||||||||</Hsp_midline>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
      <Iteration_stat>
        <Statistics>
          <Statistics_db-num>93</Statistics_db-num>
          <Statistics_db-len>3137161264</Statistics_db-len>
          <Statistics_hsp-len>18</Statistics_hsp-len>
          <Statistics_eff-space>4.39202e+10</Statistics_eff-space>
          <Statistics_kappa>0.711</Statistics_kappa>
          <Statistics_lambda>1.374</Statistics_lambda>
          <Statistics_entropy>1.31</Statistics_entropy>
        </Statistics>
      </Iteration_stat>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
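
The XML report is the easiest of these formats to read from a script. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the embedded XML is a trimmed copy of the report above (only the elements that are read), and the helper name iter_hsps and the dictionary keys are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the -m 7 report shown above.
XML_TEXT = """<?xml version="1.0"?>
<BlastOutput>
  <BlastOutput_iterations>
    <Iteration>
      <Iteration_query-def>seqname</Iteration_query-def>
      <Iteration_hits>
        <Hit>
          <Hit_accession>5</Hit_accession>
          <Hit_hsps>
            <Hsp>
              <Hsp_bit-score>63.9245</Hsp_bit-score>
              <Hsp_evalue>2.50885e-09</Hsp_evalue>
              <Hsp_hit-from>233526</Hsp_hit-from>
              <Hsp_hit-to>233557</Hsp_hit-to>
              <Hsp_identity>32</Hsp_identity>
              <Hsp_align-len>32</Hsp_align-len>
            </Hsp>
          </Hit_hsps>
        </Hit>
      </Iteration_hits>
    </Iteration>
  </BlastOutput_iterations>
</BlastOutput>
"""


def iter_hsps(root):
    """Yield one dict per HSP (hypothetical helper, not part of BLAST)."""
    for iteration in root.iter("Iteration"):
        query = iteration.findtext("Iteration_query-def")
        for hit in iteration.iter("Hit"):
            subject = hit.findtext("Hit_accession")
            for hsp in hit.iter("Hsp"):
                identical = int(hsp.findtext("Hsp_identity"))
                aligned = int(hsp.findtext("Hsp_align-len"))
                yield {
                    "query": query,
                    "subject": subject,
                    "s_start": int(hsp.findtext("Hsp_hit-from")),
                    "s_end": int(hsp.findtext("Hsp_hit-to")),
                    "bit_score": float(hsp.findtext("Hsp_bit-score")),
                    "e_value": float(hsp.findtext("Hsp_evalue")),
                    "pct_identity": 100.0 * identical / aligned,
                }


hsps = list(iter_hsps(ET.fromstring(XML_TEXT)))
for hsp in hsps:
    print(hsp)
```

For a report saved to disk, ET.parse(path).getroot() can be passed to the same helper in place of ET.fromstring(XML_TEXT).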

(9)8 = tabular,

seqname 5   100.00  32  0   0   1   32  233526  233557  3e-09   63.9

(10)9 = tabular with comment lines,

# BLASTN 2.2.25 [Feb-01-2011]
# Query: seqname
# Database: /home/user/database/hg19.fa
# Fields: Query id, Subject id, % identity, alignment length, mismatches, gap openings, q. start, q. end, s. start, s. end, e-value, bit score
seqname 5   100.00  32  0   0   1   32  233526  233557  3e-09   63.9

(11)10 = ASN, text
(12)11 = ASN, binary

The -m value is an [Integer], default = 0, range from 0 to 11.
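
The tabular views (-m 8 and -m 9) are convenient for downstream scripts because each hit is a single line with the twelve fields named in the "# Fields:" comment. The following is a rough sketch of a reader for that layout. The names TabularHit and parse_tabular are invented here, and splitting on any whitespace is an assumption that holds only when sequence IDs contain no spaces.

```python
from typing import NamedTuple


class TabularHit(NamedTuple):
    # Field order follows the "# Fields:" line of the -m 9 output.
    query_id: str
    subject_id: str
    pct_identity: float
    alignment_length: int
    mismatches: int
    gap_openings: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    e_value: float
    bit_score: float


# One converter per column, in the same order as the fields above.
CASTS = (str, str, float, int, int, int, int, int, int, int, float, float)


def parse_tabular(lines):
    """Parse -m 8 / -m 9 lines, skipping blanks and '#' comment lines."""
    hits = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) != len(CASTS):
            raise ValueError(f"expected {len(CASTS)} fields, got {len(fields)}: {line!r}")
        hits.append(TabularHit(*(cast(value) for cast, value in zip(CASTS, fields))))
    return hits


SAMPLE = """\
# BLASTN 2.2.25 [Feb-01-2011]
# Query: seqname
# Database: /home/user/database/hg19.fa
seqname 5   100.00  32  0   0   1   32  233526  233557  3e-09   63.9
"""

hits = parse_tabular(SAMPLE.splitlines())
for hit in hits:
    print(hit.query_id, hit.subject_id, hit.s_start, hit.s_end, hit.e_value)
```

The same function accepts an open file handle, since it only iterates over lines.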

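  • -o BLAST report output file, default = stdout
  • -F Filter query sequence [String], T / F
  • -X X dropoff value for gapped alignment (in bits) [Integer], default = 20
  • -I Show GIs in deflines [T/F]
  • -q Penalty for a nucleotide mismatch, default = -3
  • -r Reward for a nucleotide match, default = 1
  • -v Number of database sequences to show one-line descriptions for (V), default = 500
  • -b Number of database sequences to show alignments for (B), default = 250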
  • -D Type of output:
    0 - alignment endpoints and score,
    1 - all ungapped segments endpoints,
    2 - traditional BLAST output,
    3 - tab-delimited one line format,
    4 - incremental text ASN.1,
    5 - incremental binary ASN.1 [Integer] default = 2
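  • -a Number of processors to use, default = 1
  • -O ASN.1 SeqAlign file; must be used in conjunction with the -D 2 option
  • -J Believe the query defline [T/F]
  • -M Maximal total length of queries for a single search, default = 5000000
  • -W Word size (length of best perfect match), default = 28
  • -z Effective length of the database, default = 0
  • -Y Effective length of the search space, default = 0
  • -P Maximal number of positions for a hash value (set to 0 to ignore) [Integer], default = 0
  • -S Query strands to search against the database: 3 is both, 1 is top, 2 is bottom, default = 3
  • -T Produce HTML output [T/F]
  • -l Restrict the database search to a list of GIs [String]
  • -G Cost to open a gap, default = -1
  • -E Cost to extend a gap, default = -1
  • -s Minimal hit score to report, default = 0
  • -Q Masked query output; must be used in conjunction with the -D 2 option
  • -f Show full IDs in the output (default: only GIs or accessions) [T/F]
  • -U Use lower-case filtering of the FASTA sequence [T/F]
  • -R Report the log information at the end of the output [T/F]
  • -p Identity percentage cut-off, default = 0
  • -L Location on the query sequence
  • -A Multiple-hits window size; the default is 0 (i.e. single-hit extensions) or 40 for a discontiguous template (a negative number overrides this) [Integer], default = 0
  • -y X dropoff value for ungapped extension [Integer], default = 10
  • -Z X dropoff value for dynamic programming gapped extension [Integer], default = 50
  • -t Length of a discontiguous word template (contiguous word if 0) [Integer], default = 0
  • -g Generate words for every base of the database (mandatory with the current BLAST engine) [T/F]
  • -n Use non-greedy (dynamic programming) extension for affine gap scores [T/F]
  • -N Type of a discontiguous word template: 0 - coding, 1 - optimal, 2 - two simultaneous
  • -H Maximal number of HSPs to save per database sequence, default = 0
  • -V Force use of the legacy BLAST engine [T/F]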
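
When megablast is driven from a script, it can help to assemble the options above as an argument list rather than a hand-typed string. The following is only a sketch: build_megablast_cmd is a hypothetical helper, query.fa and hits.tsv are placeholder file names, and the code builds and prints the command without running it. The flags used (-d, -i, -o, -e, -m, -W, -a, -p) and the ./megablast path are the ones listed in this note.

```python
import shlex


def build_megablast_cmd(database, query, output, *, evalue=10.0, view=8,
                        word_size=28, processors=1, min_identity=0.0,
                        binary="./megablast"):
    """Return an argv list for megablast (hypothetical helper)."""
    if not 0 <= view <= 11:
        # -m is documented above as an integer in the range 0 to 11.
        raise ValueError("-m must be between 0 and 11")
    return [
        binary,
        "-d", database,           # database
        "-i", query,              # query file
        "-o", output,             # report output file
        "-e", str(evalue),        # expectation value cut-off
        "-m", str(view),          # alignment view (8 = tabular)
        "-W", str(word_size),     # word size
        "-a", str(processors),    # number of processors
        "-p", str(min_identity),  # identity percentage cut-off
    ]


cmd = build_megablast_cmd("/home/user/database/hg19.fa", "query.fa",
                          "hits.tsv", evalue=1e-5, processors=4)
print(shlex.join(cmd))
```

The resulting list can be handed to subprocess.run(cmd, check=True) on a machine where the binary and database exist; passing a list avoids shell quoting problems with paths.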