perl 语言一些常见生信处理脚本
Posted 01说
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了perl 语言一些常见生信处理脚本相关的知识,希望对你有一定的参考价值。
反向互补序列:
open IN,"c:/Users/11852/Desktop/seqs.fa";
my$seq="";
my$key;
open OUT,">c:/Users/11852/Desktop/seqs_.fa";
while (my $line =<IN>){
chomp $line;
if ($line=~ ">"){
if (length($seq)!= 0){
$seq =join("",reverse (split(//,$seq)));
$seq=~ tr/[ATGC]/[TACG]/;
print OUT $key."\n".$seq."\n";
$seq="";
}
$key=$line;
}else{$seq.=$line;}
}
$seq =join("",reverse (split(//,$seq)));
$seq=~ tr/[ATGC]/[TACG]/;
print OUT $key."\n".$seq."\n";
把cDNA序列转录为RNA,再翻译为蛋白质:
use strict;
open IN,"c:/Users/11852/Desktop/human_TP53_cDNA.fasta";
my $cdna;
while (my $line=<IN>){
chomp $line;
if ($line=~ ">"){
next;
}
$cdna.=$line;
}
print "the length of cdna\t".length($cdna)."\n".$cdna."\n";
$cdna=~ tr/T/U/;
close IN;
open IN,"c:/Users/11852/Desktop/codon_table.txt";
my %codon_dic;
while (my$line =<IN>){
chomp $line;
my @li=split(/\t/,$line);
my$codon=$li[0];
$codon=~tr/T/U/;
$codon_dic{$codon}=$li[1];
}
my $protein;
for (my$i=0;$i<length($cdna);$i+=3){
my$j=substr($cdna,$i,3);
my$a= $codon_dic{$j};
if ($a eq "Stop"){
next;
}
$protein .= $a;
}
print $protein;
计算GC含量:
open IN,"c:/Users/11852/Desktop/Ecoli.fa";
my $sequence_length;
my $gc_count;
while (my $line=<IN>){
chomp $line;
if ($line =~ />/){
next
}
$gc_count+=$line=~ tr/[GC]/#/;
$sequence_length+=length($line);
}
my$percent=$gc_count/$sequence_length;
print "GC content is $percent";
根据位置提取序列
my $header;
my $seq = "";
my %fasta;
open IN, "c:/Users/11852/Desktop/grape38.fa" or die "$!";
while(my $line=<IN>){
chomp($line);
if ($line=~/^>/){
if (length $seq > 0){
$fasta{$header} = $seq;
}
$header = substr($line,1,1);
$seq = "";
}else{
$seq.=$line;
}
}
$fasta{$header} = $seq;
close IN;
open IN, "c:/Users/11852/Desktop/位置.txt" or die "$!";
open OUT, ">c:/Users/11852/Desktop/result_file" or die "$!";
while (my $line = <IN>){
chomp ($line);
my ($chr,$start,$end)=split (/\t/,$line);
my $seq = $fasta{$chr};
my $part = substr($seq,$start, $end-$start);
print OUT ">$line\n$part\n";
}
close IN;
close OUT;
利用hash计数
open(IN,"c:/Users/11852/Desktop/exp.txt") || die "$!";
open(OUT,">c:/Users/11852/Desktop/test.txt") ||die "$!";
my%stat=(
"1-3"=>0,
"3-5"=>0
);
while (my$line=<IN>){
chomp $line;
my@tmp=split(/\t/,$line);
if ($tmp[1]>=1 and $tmp[1]<=3){
$stat{"1-3"}++;
}
if ($tmp[1]>3 and $tmp[1]<=5){
$stat{"3-5"}++;
}
}
close(IN);
for my$k(keys %stat){
print OUT "$k\t$stat{$k}\n";
}
close(OUT);
根据id提取表达量
open(IN,"c:/Users/11852/Desktop/exp.txt") || die "$!";
open(INID,"c:/Users/11852/Desktop/ID.txt") || die "$!";
open(OUT,">c:/Users/11852/Desktop/test.txt") ||die "$!";
my %keep_id=();
while (my$line=<INID>){
chomp $line;
$keep_id{$line}=1;
}
close INID;
while (my$line=<IN>){
chomp $line;
my @tmp=split(/\t/,$line);
if (exists $keep_id{$tmp[0]}){
print OUT $line."\n"
}
}
close(OUT);
close(IN);
以上是关于perl 语言一些常见生信处理脚本的主要内容,如果未能解决你的问题,请参考以下文章