r query_tblastn.R
library(dplyr)
library(readr)
library(tidyr)
library(stringr)
library(reutils)
library(XML)
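# Run a remote tblastn search for the protein sequences in `fasta` and return
# the hits as a data frame annotated with sequence titles and chromosome labels.
tblastn <- function(fasta, db = "refseq_genomic") {
  # Tabular BLAST output is written next to the query file
  query_name <- paste0(fasta, ".blast.txt")
  comm <- paste("export BLASTDB=/usr/local/share/blast; tblastn -query", fasta,
                "-db", db,
                "-task tblastn",
                "-max_target_seqs 20000",
                "-outfmt '6 sscinames scomnames staxids qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore'",
                "-remote",
                # Quoted so the shell does not treat the brackets as a glob pattern
                "-entrez_query 'txid6237[ORGN]'",
                ">",
                query_name)
  print(comm)
  system(comm)
  # Column names follow the order of the fields requested in -outfmt
  hits <- read_tsv(query_name, col_names = c("Species",
                                             "Name",
                                             "TaxID",
                                             "QueryID",
                                             "SubjectID",
                                             "Percent_Identity",
                                             "Alignment_Length",
                                             "Mismatches",
                                             "Gap_Openings",
                                             "Q.Start",
                                             "Q.End",
                                             "S.Start",
                                             "S.End",
                                             "E",
                                             "Bits")) %>%
    # SubjectID looks like "gi|<number>|ref|<accession>|"; keep gi and accession
    separate(SubjectID, into = c("name_drop", "gi", "ref_drop", "accession"), sep = "\\|", extra = "drop", convert = TRUE) %>%
    dplyr::select(-name_drop, -ref_drop) %>%
    # Replace Name with each hit's sequence title from the nuccore document summaries
    dplyr::mutate(Name = sapply(unlist(efetch(accession, db = "nuccore", "docsum")['//Item[@Name="Title"]/text()']), xmlValue)) %>%
    dplyr::rename(POS_Start = S.Start, POS_End = S.End) %>%
    # Pull the whole chromosome label (e.g. "II" or "X") out of the title
    dplyr::mutate(CHROM = str_match(Name, "chromosome ([A-Za-z0-9]+)")[, 2]) %>%
    dplyr::select(Species, Name, TaxID, QueryID, CHROM, POS_Start, POS_End, accession, everything())
  hits
}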
results <- tblastn("~/Desktop/Sap.fasta")
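
The two parsing steps inside `tblastn()` can be checked offline, without calling BLAST or NCBI. The sketch below is only an illustration: the `gi` number, accession, and title in `hits` are made-up placeholder values, and the chromosome pattern uses `+` so that multi-character labels such as `II` are captured whole.

```r
library(tidyr)
library(stringr)

# Hypothetical single hit; the gi number, accession and title are placeholders
hits <- data.frame(
  SubjectID = "gi|12345|ref|NC_000000.0|",
  Name = "Caenorhabditis elegans chromosome II",
  stringsAsFactors = FALSE
)

# Split the pipe-delimited subject ID into its parts; the empty piece after
# the trailing "|" is discarded by extra = "drop"
parsed <- separate(hits, SubjectID,
                   into = c("name_drop", "gi", "ref_drop", "accession"),
                   sep = "\\|", extra = "drop", convert = TRUE)

# Column 2 of the str_match() result holds the first capture group
parsed$CHROM <- str_match(parsed$Name, "chromosome ([A-Za-z0-9]+)")[, 2]

print(parsed[, c("gi", "accession", "CHROM")])
```

Running this on a handful of real `sseqid` values from the `.blast.txt` file is a quick way to confirm the subject IDs have the `gi|...|ref|...|` layout that `separate()` assumes.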