Genome Annotation: Software Usage

Posted by djx571

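1. RepeatMasker

1.1 Input

Input must be in FASTA format; other formats such as GenBank and Staden are not accepted. RepeatMasker can process a single batch file (one file containing many sequences), or it can batch-process many files (each containing one sequence).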

RepeatMasker *.fasta

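This command masks every file in the current directory whose name ends in .fasta and produces a separate report for each file. A batch file is faster to process, but analyzing sequences as single files can be slightly more accurate.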

This command will mask all files that end with .fasta in the current directory and give separate reports for each file. Note that if you have multiple small sequences it is considerably faster to run RepeatMasker on one batch file than on many single sequence files. The summary file will be more informative as well. However, analysis on single files (when larger than 2 kb each) can be slightly more accurate, since GC levels for each sequence will be calculated and used to choose appropriate parameters.
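When there are many small single-sequence files, the batch approach described above can be sketched as follows. The helper names `merge_fasta` and `count_seqs` and the output name `batch.fa` are hypothetical, chosen only for illustration; this is a minimal sketch, not part of RepeatMasker itself.

```shell
# Hypothetical helper: concatenate single-sequence FASTA files into one
# batch file. Usage: merge_fasta OUTPUT INPUT...
merge_fasta() {
    mf_out=$1
    shift
    : > "$mf_out"                      # create or truncate the batch file
    for mf_in in "$@"; do
        # awk ends every printed line with a newline, so an input file that
        # lacks a final newline cannot fuse with the next file's header
        awk '{ print }' "$mf_in" >> "$mf_out"
    done
}

# Hypothetical helper: count FASTA records (header lines starting with ">")
count_seqs() {
    grep -c '^>' "$1"
}

# Example (assumes RepeatMasker is installed and on PATH):
#   merge_fasta batch.fa seq1.fasta seq2.fasta
#   count_seqs batch.fa
#   RepeatMasker batch.fa
```

The output name deliberately does not end in .fasta, so a later `*.fasta` glob in the same directory will not pick up the batch file as an input.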

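1.2 Output

RepeatMasker returns three files:

.masked file: the query sequence(s) with all identified repeats and low-complexity sequences masked, i.e. the masked genome.

.out file: lists the masked sequences together with their annotation. The masked sequences are printed in the same order as in the submitted file, whereas the annotation table lists sequences alphabetically.

.tbl file: summary statistics of the repeat content of the analyzed sequence.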

RepeatMasker returns a .masked file containing the query sequence(s) with all identified repeats and low complexity sequences masked. 
These masked sequences are listed and annotated in the .out file. The masked sequences are printed in the same order as they are in the
submitted file, whereas the sequences are presented alphabetically in the annotation table. The .tbl file is a summary of the repeat
content of the analyzed sequence.
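As a rough illustration of working with the .out annotation table, the sketch below totals annotated bases per repeat class. It assumes the usual .out layout (three header lines, then whitespace-separated columns with query begin and end in columns 6 and 7 and the repeat class/family in column 11); the function name `repeat_class_bp` is hypothetical. Check the column positions against your own file before relying on it.

```shell
# Hypothetical helper: sum annotated bases per repeat class/family
# from a RepeatMasker .out file. Usage: repeat_class_bp FILE.out
# Assumption: 3 header lines; $6/$7 = query begin/end; $11 = class/family.
repeat_class_bp() {
    awk 'NR > 3 && NF >= 14 {
             bp[$11] += $7 - $6 + 1    # inclusive length of the hit on the query
         }
         END {
             for (c in bp) print c "\t" bp[c]
         }' "$1" | sort
}

# Example:
#   repeat_class_bp genome.fasta.out
```

Overlapping annotations are counted once per row here, so the totals can exceed the number of masked bases; the .tbl file remains the reference for summary statistics.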

 
