Alignment notes: align, BWA, Bowtie, SOAP

Posted by YoungAaron

1. BWA installation and usage

Download and compile BWA
#tar -jxvf bwa-0.5.7.tar.bz2
#cd bwa-0.5.7
#make

BWA workflow
Index the database file in FASTA format
Find the suffix array (SA) coordinates of good hits for each individual read
Convert the SA coordinates to chromosomal coordinates and pair the reads

Input data
Reference genome data (*.fa)
NGS Short reads data (*.fastq)

Build the index
#bwa index reference.fa

Find the SA coordinates
#bwa aln reference.fa leftRead.fastq > leftRead.sai
#bwa aln reference.fa rightRead.fastq > rightRead.sai
To run the command with multiple threads:
#./bwa aln -c -t 3 -f leftreads.sai reference.fa leftreads.fastq
Options
* -f file: write output to file instead of stdout
* -c: input sequences are in color space (SOLiD reads); omit it for base-space reads
* -t num: number of threads (default: 1)

Convert the SA coordinates
Generate alignments in the SAM format given paired-end reads
#bwa sampe reference.fa leftRead.sai rightRead.sai leftRead.fastq rightRead.fastq > human.sam
Generate alignments in the SAM format given single-end reads
#./bwa samse -f leftreads.sam reference.fa leftreads.sai leftreads.fastq
#./bwa samse -f rightreads.sam reference.fa rightreads.sai rightreads.fastq
Options
* -f file: output file
* -n num: maximum number of alignments to output in the XA tag for reads paired properly (default: 3)
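
The paired-end steps above (index, aln for each mate, sampe) can be collected into one small script. This is a minimal sketch, not a definitive pipeline: the function name bwa_pe_commands and the "<fastq>.sai" naming are assumptions made for illustration, and the function only prints the commands so they can be checked before anything is run.

```shell
#!/bin/sh
# Sketch only: prints the four commands of the paired-end workflow
# (index, aln for each mate, sampe) so they can be reviewed before running.
# bwa_pe_commands and the "<fastq>.sai" naming are illustrative assumptions,
# not part of BWA. File names containing spaces are not handled.
bwa_pe_commands() {
    ref=$1; left=$2; right=$3; out=$4
    threads=${5:-1}    # passed to bwa aln -t; defaults to 1
    printf 'bwa index %s\n' "$ref"
    printf 'bwa aln -t %s %s %s > %s.sai\n' "$threads" "$ref" "$left" "$left"
    printf 'bwa aln -t %s %s %s > %s.sai\n' "$threads" "$ref" "$right" "$right"
    printf 'bwa sampe %s %s.sai %s.sai %s %s > %s\n' \
        "$ref" "$left" "$right" "$left" "$right" "$out"
}

# Print the plan; append "| sh" to execute it once it looks right.
bwa_pe_commands reference.fa leftRead.fastq rightRead.fastq human.sam 3
```

Printing the commands first keeps the sketch safe to try on a machine where bwa is not installed yet, and makes it obvious which intermediate .sai files will be written.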

SAM output (BWA alignment results)
Each line holds the alignment of one read: 11 mandatory fields followed by optional fields (listed here as field 12)
1 QNAME Query (pair) NAME
2 FLAG bitwise FLAG
3 RNAME Reference sequence NAME
4 POS 1-based leftmost POSition/coordinate of clipped sequence
5 MAPQ MAPping Quality (Phred-scaled)
6 CIGAR extended CIGAR string
7 MRNM Mate Reference sequence NaMe ('=' if same as RNAME)
8 MPOS 1-based Mate POSition
9 ISIZE Inferred insert SIZE
10 SEQ query SEQuence on the same strand as the reference
11 QUAL query QUALity (ASCII-33 gives the Phred base quality)
12 OPT variable OPTional fields in the format TAG:VTYPE:VALUE
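
Two of the fields above need decoding before they are readable: FLAG is a bit field and QUAL stores each Phred score as an ASCII character offset by 33. The sketch below shows one way to decode both in plain sh and awk. The function names and the short bit labels are hypothetical; the bit meanings follow the SAM format's FLAG definition.

```shell
#!/bin/sh
# Sketch only: decode_flag and phred_scores are illustrative helper names.

# Print the names of the bits set in a SAM FLAG value, comma-separated.
# Bits are listed from 0x1 upward in the order defined by the SAM format.
decode_flag() {
    flag=$1
    out=""
    bit=1
    for name in paired proper_pair unmapped mate_unmapped reverse mate_reverse \
                first_in_pair second_in_pair secondary qc_fail duplicate; do
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
        bit=$(( bit * 2 ))
    done
    printf '%s\n' "$out"
}

# Convert a QUAL string to Phred scores: ASCII code of each character minus 33.
phred_scores() {
    printf '%s\n' "$1" | awk '
        BEGIN { for (i = 33; i <= 126; i++) ord[sprintf("%c", i)] = i }
        {
            n = length($0)
            for (i = 1; i <= n; i++)
                printf "%s%d", (i > 1 ? " " : ""), ord[substr($0, i, 1)] - 33
            printf "\n"
        }'
}

decode_flag 99        # paired,proper_pair,mate_reverse,first_in_pair
phred_scores 'II5!'   # 40 40 20 0
```

A FLAG of 4 decodes to unmapped, which is a quick way to spot reads BWA could not place.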
Field 12 holds the detailed alignment information, using the following tags
NM Edit distance
MD Mismatching positions/bases
AS Alignment score
BC Barcode sequence
X0 Number of best hits
X1 Number of suboptimal hits found by BWA
XN Number of ambiguous bases in the reference
XM Number of mismatches in the alignment
XO Number of gap opens
XG Number of gap extensions
XT Type: Unique/Repeat/N/Mate-sw
XA Alternative hits; format: (chr,pos,CIGAR,NM;)*
XS Suboptimal alignment score
XF Support from forward/reverse alignment
XE Number of supporting seeds
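
Because the optional fields use the TAG:VTYPE:VALUE layout, a single tag can be pulled out of a SAM file with a few lines of awk. The following is a minimal sketch: sam_tag is a hypothetical helper name, and printing "." for a missing tag is an arbitrary choice made here.

```shell
#!/bin/sh
# Sketch only: sam_tag is an illustrative helper, not part of BWA or samtools.
# Usage: sam_tag TAG < file.sam
# Prints QNAME<TAB>VALUE for each alignment line, or "." when the tag is absent.
sam_tag() {
    awk -F'\t' -v tag="$1" '
        /^@/ { next }                        # skip SAM header lines
        {
            value = "."
            for (i = 12; i <= NF; i++) {     # optional fields start at column 12
                if (substr($i, 1, 3) == tag ":") {
                    value = substr($i, 6)    # drop the "TG:T:" prefix
                    break
                }
            }
            print $1 "\t" value
        }'
}
```

For example, `sam_tag XT < human.sam` lists the BWA type (Unique/Repeat/N/Mate-sw) for each read, and `sam_tag NM < human.sam` lists the edit distances.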

2. Bowtie installation and usage

 
